ABSTRACT The ?preclinical? phase of Alzheimer?s disease (AD) is characterized by abnormal levels of brain amyloid accumulation in the absence of major symptoms, can last decades, and potentially holds the key to successful therapeutic strategies. Today there is an urgent need for quantitative biomarkers and genetic tests that can predict clinical progression at the individual level. This project will develop cutting edge machine learning algorithms that will mine high dimensional, multi-modal, and longitudinal data to derive models that yield individual-level clinical predictions in the context of dementia. The developed prognostic models will specifically utilize ubiquitous and affordable data types: structural brain MRI scans, saliva or blood-derived genome-wide sequence data, and demographic variables (age, education, and sex). Prior research has demonstrated that all these variables are strongly associated with clinical decline to dementia, however to date we have no model that can harvest all the predictive information embedded in these high dimensional data. Machine learning (ML) algorithms are increasingly used to compute clinical predictions from high- dimensional biomedical data such as clinical scans. Yet, most prior ML methods were developed for applications where the ``prediction?? task was about concurrent condition (e.g., discriminate cases and controls); and established risk factors (e.g., age), multiple modalities (e.g., genotype and images) and longitudinal data were not fully exploited. This application?s core innovation will be to develop rigorous, flexible, and practical ML methods that can fully exploit multi-modal, longitudinal, and high- dimensional biomedical data to compute prognostic clinical predictions. The proposed project will build on the PI?s strong background in computational modeling and analysis of large-scale biomedical data. We will employ an innovative Bayesian ML framework that offers the flexibility to handle and exploit real-life longitudinal and multi-modal data. We hypothesize that the developed models will be more useful than alternative benchmarks for identifying preclinical individuals who are at heightened risk of imminent clinical decline. We will use a statistically rigorous approach for discovery, cross-validation, and benchmarking the developed tools. This project will yield freely distributed, documented, and validated software and models for predicting future clinical progression based on whole-genome, longitudinal structural MRI and demographic data. We believe the algorithms and software we develop will yield invaluable tools for stratifying preclinical AD subjects in drug trials, optimizing future therapies, and minimizing the risk of adverse effects.