PROJECT SUMMARY Stroke is a leading cause of serious long-term adult disability worldwide. Despite intensive therapy, an estimated two-thirds of stroke survivors do not fully recover and are left unable to care for themselves independently. Growing research suggests that rehabilitation is not "one-size-fits-all": variability among stroke survivors in lesion location, age, gender, time since stroke, and other factors may affect a person's likelihood of recovery and response to different types of treatment. Personalized rehabilitation medicine that maximizes each individual's recovery potential is thus desperately needed. However, developing accurate, robust, and specific predictive models of an individual's recovery potential and response to different treatments requires large, heterogeneous datasets. The current best predictors of stroke outcomes are neuroimaging (MRI) and behavioral biomarkers that capture brain structure/function and motor performance at baseline. Generating a large enough dataset of MRI and behavioral data is extremely difficult and expensive for any one site to do on its own. This proposal addresses this problem by generating a large, diverse dataset using a novel meta-analytic approach that harmonizes post-stroke data collected worldwide. In partnership with an international consortium of over 500 researchers who produce the largest-known neuroimaging and genetic studies of over 18 different diseases (ENIGMA Center for Worldwide Medicine, Imaging, and Genomics), I propose to apply ENIGMA's powerful approach to answer critical questions in stroke recovery. Under this K01 career development award, I will develop skills in big data neuroimaging analytics, clinical research, and consortium building through my ENIGMA Stroke Recovery working group in order to study stroke recovery using a large-dataset approach (goal n > 3,000).
This project has four specific aims. Aim 1 will leverage ENIGMA's existing methodology to develop the infrastructure, optimal methods, and analysis techniques for harmonizing a large dataset of post-stroke MRI and behavioral data. Aim 2 will use this large dataset to identify neural and behavioral biomarkers predicting recovery of motor impairment (e.g., actual arm movement ability) and recovery of function (e.g., ability to perform tasks, such as picking up objects with the affected arm). Aim 3 will use supervised machine learning to generate and fine-tune highly accurate predictive models of the relationship between these biomarkers and recovery of impairment versus function. Lastly, Aim 4 will use unsupervised machine learning techniques to examine shared properties of outliers from the predictive models and to identify additional neurobiological mechanisms that may prevent individuals from recovering. This approach has the potential to revolutionize the way that rehabilitation research is validated, helping to ensure robust, reliable, and reproducible results. The methods developed here could be extended to other domains of recovery (language, gait), to other predictors of recovery (functional brain activity, genomics), and to other clinical populations to improve rehabilitation overall.