Since receiving the K24 award, I have established a successful clinical research and training program in patient-oriented research (POR) in rheumatic disease with a focus on genetic, biomarker, and environmental risk factors for rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE). I continue to have independent grant support and have mentored 15 new clinical investigators who have published 25 peer-reviewed papers during the K24 period. Three mentees have received NIH K awards and one has received an NIH R01. With renewal of the K24, I would continue to have protected time to devote to this program, which offers a unique training environment and an array of important POR projects. Genetic and environmental epidemiology studies have produced convincing evidence for multiple alleles and exposures as RA risk factors. A strong gene-environment (GXE) interaction between HLA-DRB1 alleles and smoking has been demonstrated for risk of the immune phenotype of CCP-positive RA but not CCP-negative RA. As CCP antibodies occur years before RA onset, I hypothesize that this interaction induces anti-citrulline immunity, a critical step in RA pathogenesis. These findings emphasize the need to study genes, environment, and immunity with careful phenotyping. Family history encompasses unmeasured genetic and environmental risk, yet is not measured accurately in other studies, including those in my research portfolio.
Specific aims are to: 1) Maintain and expand my clinical research training program by mentoring new clinical investigators in POR in rheumatic diseases; 2) Enrich my comprehensive POR program to study family history, genetic, and environmental predictors in the etiology of RA using a new collection of RA cases and controls from Partners HealthCare; 2a) Collect family history data and environmental exposure data concerning smoking and reproductive factors on 1,500 RA patients and 4,500 age- and gender-matched controls by utilizing natural language processing (NLP) queries of electronic medical records; 2b) Examine genetic risk factors, environmental risk factors, and GXE in predicting immune phenotypes of RA: RA with and without CCP antibodies (CCP±), and RA with and without rheumatoid factor antibodies (RF±); and 2c) Apply this comprehensive risk model to RA cases and controls, to subsets of RA stratified by specific immune phenotypes, and to subsets stratified by family history of RA and other autoimmune diseases. The proposed study will leverage the NIH-funded Informatics for Integrating Biology and the Bedside study, which used an advanced informatics infrastructure to extract clinical data on RA diagnostic features through database mining and NLP. In that study, a highly specific algorithm was used to identify RA cases and collect samples from cases and controls. The same NLP techniques will be used to extract risk factor data from clinical notes. This proposal builds on my strong track record of POR, extending the work to add family history from a new case-control collection, validate data by patient interview, and develop predictive models for risk of RA that can be used to select high-risk individuals for future RA prevention trials.