The process of Alzheimer's disease (AD), the most common form of dementia, is thought to begin years before symptoms appear. This preclinical phase, characterized by abnormal levels of brain amyloid accumulation consistent with AD, holds the key to identifying causes and developing therapeutic strategies. In the absence of sensitive and specific behavioral/cognitive tests, quantitative biomarkers and genetic tests will be critical for stratified medicine in preclinical AD. This project will examine two high-dimensional data modalities, namely structural brain MRI scans and genome-wide SNP data, in order to derive tools that compute individual-level predictions of future dementia onset. AD imprints a unique atrophy signature on the brain that is discernible in structural MRI scans. Converging data suggest that AD-associated atrophy is detectable years before clinical symptoms. The machine learning (or pattern analysis) approach, which our laboratory has advocated in neuroimage analysis, offers highly sensitive and specific atrophy detectors. We hypothesize that these tools will be invaluable for identifying preclinical AD subjects who are at increased risk of dementia onset. Late-onset AD (LOAD), which represents >95% of all AD cases, is up to 70% heritable. In addition to APOE4, the major genetic risk factor, recent genome-wide association studies (GWAS) have identified a growing list of other common genetic variants associated with LOAD. The complexity of LOAD's genetic underpinnings suggests that sophisticated models that aggregate data across the genome might help explain some of the variability in disease progression. Developing such models using state-of-the-art machine learning technology and leveraging already-collected large-scale datasets is one of the main aims of this proposal.
The proposed project will build on the principal investigator's (Sabuncu) strong background in computational modeling and machine learning to conduct analyses using cutting-edge methods on large-scale data. We will use multi-modal data, including neuroimaging and GWAS data, to develop and validate models that predict future decline in preclinical LOAD. Our method of choice will be a novel Bayesian machine learning algorithm specifically designed for longitudinal data. We hypothesize that the developed models will be more useful than alternative models (constructed by discriminating cases from controls) for identifying amyloid-positive individuals who are at heightened risk of imminent clinical decline. We will use a multi-level approach for discovery and validation and a multi-modal strategy to test our hypothesis.