DESCRIPTION: The overall purpose of this study is to examine the utility and validity of linking data from three claims-based sources to the Virginia Cancer Registry (VCR) for cancer surveillance. The study will focus on the five leading cancers in Virginia: breast, cervical, colorectal, lung, and prostate. The three claims-based files to be linked to the VCR are Medicare, Medicaid, and the statewide hospital discharge summary files. Because of cancer's high incidence, devastating impact, and potential preventability, monitoring its epidemiology is essential. An effective cancer surveillance program could help track groups at high risk for the disease and assess the value of interventions, such as screening. Surveillance mechanisms must produce information in a timely fashion to be useful to policy makers who are deciding about allocation of limited health care resources. Claims files offer an important potential source of routinely available, population-based, computer-readable information that could supplement the cancer surveillance activities of statewide registries. These databases, however, have the limitations of minimal clinical content and association with billing activities. Accuracy of diagnosis coding is a particular concern, although one study indicated good accuracy for cancer diagnoses. Despite these limitations, linking claims files to cancer registries could capture more incident cancer cases and add to understanding of cancer care. Four databases will be linked in this study: 1. The Virginia Cancer Registry (VCR). Until 1990, reporting to the VCR was voluntary and included half the hospitals in Virginia; starting in 1990, reporting of incident cancer cases became mandatory. About 85% of the cases are reported by hospitals that include complete staging data; 2. Medicare files for Parts A and B, including all institutional and noninstitutional bills; 3. Virginia Medicaid files, including inpatient, outpatient, and pharmacy claims. 
The Medicaid files contain a large number of minority (46% black) and high-risk patients; and 4. Virginia Health Information (VHI) files. Since 1993, VHI has maintained inpatient claims for all admissions to Virginia hospitals. To address the first Specific Aim, the latter three claims-based files will be linked to the VCR at the person level using AUTOMATCH, a probabilistic record linkage product of Match Ware Technologies. The second Specific Aim involves validating methods for identifying cancer incidence from the three claims files and assessing the reliability and validity of incidence and treatment data in the four data sources. Data will be abstracted from inpatient medical records, collected from outpatient health care providers, and obtained by reviewing laboratory, radiation treatment, and outpatient chemotherapy logs. A stratified random sample of 2,750 patients will be drawn, with the sampling protocol considering the likelihood that patients will be flagged as a cancer case by more than one of the data sources. Strata were created based on the data source of the sampled case (Figure 1, p. 52); sampled cases will be clustered within hospitals. Abstraction of inpatient records will be performed by the Virginia Health Quality Center, the state peer review organization. Collection of outpatient information will use four approaches: 1. The VCR will mail a survey to physicians asking about the treatment given to a specified patient; 2. When physicians fail to respond to the mail survey, the hospital tumor registrar will contact them to ask that they respond by mail or agree to a telephone interview; 3. Outpatient medical records will be abstracted for a random 10% sample of the 2,750 patients. The purpose of this validation is to judge the accuracy of the physicians' mailed responses. A VCR representative (a trained RN abstractor) will conduct these reviews; and 4. 
For each hospital in the cluster sample, logs from pathology, radiotherapy, and chemotherapy units will be reviewed. Information from the primary data collection will be compared to that of the four data sources to assess their accuracy. The primary data will serve as the "gold standard" in determining the sensitivity and the positive and negative predictive values of the four data files. The areas that will be examined include the definition of incident cases and initial surgical and nonsurgical treatment. An analysis will be performed to assess the representativeness of the different data sources in identifying cancer cases, using a Venn diagram of potential overlap and discordance as a framework. The completeness of a surveillance approach combining all data sources will also be assessed by estimating the frequency of missed cases using capture-recapture techniques developed by naturalists. To further assess false negatives and the value of the capture-recapture method, a pilot study will be performed at Medical College of Virginia and three associated rural hospitals, combing all primary data sources and billing data to identify all cancer cases. The outcome of this project will be a clear sense, at least for Virginia, of the utility of linking administrative data files to a cancer registry for examining the incidence and initial treatment of cancer.
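
The validity measures and the capture-recapture completeness estimate described above can be illustrated with a minimal Python sketch. It is not the study's analysis plan. It assumes a simple 2x2 comparison of one data file against the gold standard, and it uses the two-source Chapman estimator as a stand-in for whatever capture-recapture method the investigators would apply to four sources. That estimator assumes the two sources identify cases independently, which may not hold for registry and claims data. All function names and counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Validity:
    sensitivity: float  # share of gold-standard cases that the file flags
    ppv: float          # share of flagged patients who are true cases
    npv: float          # share of unflagged patients who are true non-cases


def validity_measures(tp: int, fp: int, fn: int, tn: int) -> Validity:
    """Compare one data file against the gold standard (2x2 table counts)."""
    return Validity(
        sensitivity=tp / (tp + fn),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


def chapman_estimate(n1: int, n2: int, m: int) -> float:
    """Two-source capture-recapture estimate of the total number of cases.

    n1, n2: cases found by source 1 and source 2; m: cases found by both.
    Chapman's form of the Lincoln-Petersen estimator, assuming the two
    sources capture cases independently (an assumption of this sketch).
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def estimated_missed(n1: int, n2: int, m: int) -> float:
    """Estimated number of cases missed by both sources."""
    observed = n1 + n2 - m  # distinct cases seen by at least one source
    return chapman_estimate(n1, n2, m) - observed


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(validity_measures(tp=80, fp=20, fn=10, tn=90))
    print(estimated_missed(n1=9, n2=9, m=4))
```

With more than two sources, dependence between sources is usually handled with log-linear models rather than a pairwise estimator; the sketch only shows the basic idea of estimating missed cases from the overlap between sources.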