Osteoarthritis (OA) affects 14 million individuals in the US and over 300 million adults worldwide. The disease is characterized by joint pain and functional limitations and is associated with poor health-related quality of life and increased healthcare utilization. OA of the hips and knees ranks as the 11th highest contributor to global disability. Despite the clinical and economic impact of knee OA, no disease-modifying agents are currently available; current treatments are limited to symptom control and are only modestly efficacious. While several promising treatments are in the pipeline, developing and testing treatments for OA is complicated by disease heterogeneity. We urgently need to identify the right patient for the right treatment to ensure that new therapies are tested in the appropriate population. This proposal aims to use machine learning methods to address gaps in our understanding of disease heterogeneity in knee OA. We will use publicly available data from the FNIH OA Biomarkers Consortium project. This study of 600 subjects with knee OA includes over 200 parameters that describe joint structure and disease severity, including measures of cartilage, bone, ligaments, menisci, and inflammation. First, an unsupervised learning approach using model-based clustering will be used to distinguish disease phenotypes. To implement phenotyping in practice, a minimal set of biomarkers must be identified that balances predictive accuracy and feasibility. Thus, the second aim will investigate variable selection methods in model-based clustering to identify important variables and develop a prediction model to determine phenotype. Finally, a supervised machine learning approach via super learning will investigate algorithms to predict disease progression. The applicant, Dr. Jamie Collins, is a biostatistician at the Orthopedic and Arthritis Center for Outcomes Research at Brigham and Women's Hospital. Dr.
Collins is a committed investigator in rheumatology research with eight first-author publications in the field. She holds a career development award from the Rheumatology Research Foundation and pilot funding from the Brigham Research Institute. This proposal will provide protected time and rigorous training so that the applicant can expand her current biostatistical skill set to encompass the burgeoning fields of data science and machine learning. She will take coursework at the Harvard T.H. Chan School of Public Health and will have access to courses, seminars, and training provided by the Brigham Research Institute and the Harvard Catalyst Program. The applicant will be supported by mentorship from Drs. Elena Losina and Tuhina Neogi, and by input from an advisory committee of Drs. Tianxi Cai, Jeffrey Duryea, Ali Guermazi, Tina Kapur, Virginia Kraus, Katherine Liao, and Kurt Spindler. The research and training proposed in this award will address critical research gaps in our understanding of OA heterogeneity and progression. Together, they will set Dr. Collins on the path toward her long-term career objective of becoming an independent investigator focused on applying advanced analytic methods in OA research.