Multiple myeloma (MM), an incurable malignancy of clonal plasma cells, is preceded by the largely asymptomatic monoclonal gammopathy of undetermined significance (MGUS). An estimated 3% of all US adults aged ≥50 years have this precursor condition, yet few biological or clinical markers predicting progression to MM have been identified. As a result, MGUS patients face periodic clinical follow-up and the related medical costs, as well as heightened anxiety. Currently, population-based studies of MGUS rely on stored blood specimens or extensive registries for definitive diagnoses, an inefficient process that makes large-scale studies of MGUS difficult to initiate. The objective of this proposal is to develop new methods to efficiently conduct population-based research on MGUS using automated healthcare claims and electronic health record data. The specific aims of this project are to develop novel algorithms that maximize positive predictive value (PPV) in identifying patients diagnosed with MGUS in community-based healthcare settings, using longitudinally collected automated healthcare claims and electronic health record data available from 1999 to 2013. The proposed algorithms will be developed and piloted at the Meyers Primary Care Institute among patients aged ≥50 years seeking care at Reliant Medical Group, a community-based practice that provides care to a large population in central Massachusetts and is a member of the NCI-funded, nationwide Cancer Research Network. The first algorithm will be derived from data available in healthcare claims alone, including diagnosis and procedure codes. Subsequent algorithms will incorporate results from relevant laboratory tests available in electronic health records, including serum and urine protein electrophoresis and immunofixation, serum free light chain testing, and bone marrow biopsy. All candidate data elements will then be refined using classification and regression tree analysis.
For calculation of PPV, a gold standard will be developed through comprehensive review of electronic health records by two independent physician reviewers. The demographic, clinical, and health service utilization characteristics of the MGUS patients identified by the algorithms will be described and compared with those of an age- and gender-matched patient population without MGUS. The next step beyond the present proposal will be to validate the algorithm with the highest PPV at other healthcare systems participating in the Cancer Research Network and to develop a nationally representative cohort of MGUS patients for etiological and health services research on this prevalent but highly understudied premalignant condition.
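As an illustration of how PPV could be used to compare candidate algorithms against the physician-reviewed gold standard, the following is a minimal sketch only. The function names, algorithm labels, and patient identifiers are hypothetical and are not drawn from the proposal. The sketch assumes each algorithm yields a set of flagged patient identifiers and that chart review yields a set of confirmed MGUS cases.

```python
from typing import Dict, Set


def positive_predictive_value(flagged: Set[str], confirmed: Set[str]) -> float:
    """PPV = confirmed cases among flagged patients / all flagged patients."""
    if not flagged:
        raise ValueError("algorithm flagged no patients; PPV is undefined")
    true_positives = len(flagged & confirmed)
    return true_positives / len(flagged)


def best_algorithm(flags_by_algorithm: Dict[str, Set[str]],
                   confirmed: Set[str]) -> str:
    """Return the name of the algorithm with the highest PPV."""
    return max(
        flags_by_algorithm,
        key=lambda name: positive_predictive_value(
            flags_by_algorithm[name], confirmed
        ),
    )


# Hypothetical data: identifiers and algorithm labels are illustrative only.
gold_standard = {"p1", "p2", "p3", "p9"}  # confirmed by chart review
candidates = {
    "claims_only": {"p1", "p2", "p3", "p4"},  # 3 of 4 flagged are confirmed
    "claims_plus_labs": {"p1", "p2"},         # 2 of 2 flagged are confirmed
}

for name, flagged in candidates.items():
    print(name, positive_predictive_value(flagged, gold_standard))
print("highest PPV:", best_algorithm(candidates, gold_standard))
```

Note that PPV reflects only how many flagged patients are true cases. An algorithm that flags very few patients can score well while missing many cases, so the size of the identified population is worth reporting alongside it.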