One-third or more of individuals treated for major depressive disorder (MDD) do not experience remission of symptoms despite at least two adequate antidepressant trials. Such treatment-resistant depression (TRD) contributes disproportionately to the tremendous burden of MDD, in terms of health care costs, functional impairment, and diminished quality of life. The promise of personalized medicine for individuals at high risk for TRD is apparent. If these individuals could be recognized early in their disease course, they could be triaged to more intensive or targeted interventions to improve their likelihood of remission. For example, they might receive earlier addition of cognitive-behavioral therapy, earlier use of combination medication treatments, or earlier referral for electroconvulsive therapy. With the proliferation of treatment options in MDD, individuals can spend months or years in and out of treatment before receiving these next-step treatments. Moreover, the ability to identify these individuals would facilitate the development of new personalized interventions: rather than requiring multiple failed prospective trials, high-risk individuals could immediately be offered study participation. At present, there are two primary obstacles to translating personalized medicine into clinical practice. First, no large and generalizable cohorts have been collected in which to build risk models. Second, no validation cohorts exist to demonstrate that such models perform well in clinical settings. The present study proposes to address these two obstacles directly. Previous investigations, including work in the large multicenter Systematic Treatment Alternatives to Relieve Depression (STAR*D) study, have identified putative clinical or genetic predictors of treatment response. However, in the absence of replication, such associations are hypothesis-generating at best.
An ongoing study will collect data from 1,000 individuals treated in a New England health system for whom prospective treatment outcomes are available (the Dep1 cohort), including 500 individuals with TRD and 500 with SSRI-responsive MDD, with completion of a genome-wide association study expected by spring 2009. The proposed study will first use cutting-edge modeling techniques to construct and cross-validate models of TRD using sociodemographic, clinical, and genetic predictors in the existing Dep1 cohort. In parallel, it will collect an additional 1,000 MDD subjects with 6-month treatment outcomes from the same health system. This second cohort (Dep2) will be used to validate the TRD risk stratification model. To identify these patient cohorts, this study will take advantage of computerized administrative data systems, data mining, and natural language processing techniques that have been successfully applied to support population-based research. This approach allows identification of clinical features, such as comorbidities and medication treatments, as well as longitudinal outcomes, based on claims, pharmacy data, and medical records. The resulting patient data is far more representative of clinical populations, and far less expensive to generate, than that which could be obtained using more traditional approaches. Therefore, beyond facilitating personalized treatment of MDD, the proposed study would establish the methodology for using large clinical populations to personalize treatment in psychiatry as a whole.
Public Health Relevance: A third or more of people with major depression do not get well despite two or more different treatments, and identifying these people early in treatment might allow more personalized approaches with greater chances of success. This study will use statistical techniques to try to predict who is at risk for this treatment-resistant depression, based on clinical differences and genetic variations.
Then, it will examine a second group of patients to see how well this technique might work if it is applied in a large health system.