Harmful substance use (alcohol, tobacco, and/or prescription opioids) is common, and twin studies suggest a substantial genetic role. Further, combined use of alcohol with tobacco, and of tobacco with opioids, commonly occurs, suggesting that environmental and genetic risks for these behaviors overlap. However, identified genetic variation explains only a small proportion of the phenotypic variation in individual or combined substance use, and studies aiming to identify shared genetic pathways across substances (pleiotropy) have yielded inconsistent results. Among the major challenges to gene finding for these traits are phenotypic ambiguity, measurement bias, and inadequate statistical power to detect the small genetic effects associated with complex disorders. Individual clinical assessments often do not capture all substances of interest or relevant clinical factors (e.g., chronic pain) and are subject to substantial variation and bias depending upon the patient's health state, the clinical setting in which the assessment occurs, and the clinician making the assessment. Administrative International Classification of Diseases (ICD) codes derived from these assessments are frequently used because they are readily available for large numbers of subjects, but they can add another layer of inaccuracy and bias. The unique, rich, longitudinal clinical data available within the Veterans Health Administration (VA), combined with data available from the Million Veteran Program (MVP), are enabling us to overcome these limitations. We began with widely available, repeatedly collected electronic health record (EHR)-based metrics: the Alcohol Use Disorders Identification Test-Consumption (AUDIT-C) for hazardous alcohol use; current/past/never smoking status for tobacco; and morphine equivalent daily dose (MEDD) from pharmacy fill/refill records for prescription opioids.
Longitudinal summary metrics derived from these measures were initially validated in the Veterans Aging Cohort Study (VACS) and then extended to MVP, where we validated them against additional criterion standards in a much larger, more generalizable sample. Importantly, MVP also allowed us to validate against genetic criterion standards, namely previously identified single nucleotide polymorphisms (SNPs). This yielded Electronic Health Record (EHR)-based, CritErion-validated Longitudinal (ExCEL) phenotypes that were substantially more strongly associated with criterion and content standards for alcohol (1, 2), tobacco (3), and prescription opioids [Becker, in preparation] than alternative phenotypes. Genome-wide association studies (GWASs) of alcohol, tobacco, and opioids using ExCEL phenotypes are underway and have both reproduced prior findings and yielded many novel associations of SNPs and genes with these conditions. We have shared ExCEL phenotypes with Alpha and Beta project groups via the MVP wiki and the MVP Phenotype Workgroup. We are currently conducting joint GWASs of ExCEL phenotypes for tobacco and alcohol (Zhao and Dao) and will soon initiate joint GWASs of ExCEL phenotypes for tobacco and opioids. Because chronic pain is strongly associated with substance use, we propose to develop, validate, and apply an ExCEL phenotype of chronic pain. We will derive it from repeated measures of the Numeric Pain Rating Scale (NRS) and validate it against functional impairment due to pain from the MVP survey and against a genetic risk score based on previously identified SNPs. Over the next four years, we will use ExCEL phenotypes to conduct GWASs of substance use (alcohol, tobacco, and prescription opioids) and chronic pain, treating chronic pain as a confounder, as a necessary exposure, and as a unifying genetic link.
We expect that our analyses will reveal the extent to which genetic factors are shared between chronic pain and substance use and shed light on how pain may influence the expression of genetic risk factors for substance-related traits.