Ischemic stroke is the fourth leading cause of death in the U.S. and a major cause of disability. The etiology of stroke is multifactorial and poorly understood. Genetics is a potentially powerful tool for better understanding disease etiology, as it can highlight the biological mechanisms underlying disease and point the way to improved prevention, treatment, and outcome. Large genome-wide association studies (GWAS) of ischemic stroke (IS) populations have successfully identified stroke-risk-associated loci with small effect sizes; however, the role of copy number variation (CNV) in stroke susceptibility has yet to be explored and is the focus of our proposal. Studies of CNV have yielded important insights into numerous other complex diseases. Further, we recently demonstrated that a higher genome-wide CNV burden is associated with poorer stroke outcome at 3 months. We therefore hypothesize that CNV analyses of existing GWAS and exome data will be a highly effective and cost-efficient way to identify novel associations that illuminate stroke mechanisms, treatment targets, and outcome drivers. We further speculate that these analyses will identify CNVs of large effect size in ischemic stroke, as suggested by the existence of numerous monogenic, syndromic, and complex diseases associated with CNV, and that CNV may help explain the "missing heritability" known to exist in stroke. For this application, we have already assembled over 24,500 well-phenotyped IS cases, including IS subtypes, and over 43,500 controls, all with readily available genotyping on GWAS and exome arrays, along with measures of stroke outcome for the cases.
To evaluate CNV-associated stroke risk and stroke outcome, we will: 1) perform Risk Discovery using several analytic approaches to identify CNVs that are associated with the risk of IS and its subtypes across the age, sex, and ethnicity spectrums; 2) perform Risk Replication and Extension to determine whether the identified stroke-associated CNVs replicate in the ethnically diverse TOPMed Consortia and then, using existing TOPMed and GeneStroke Consortium biomarker data (e.g., methylation, proteomic, RNA, and miRNA data), evaluate how the identified CNVs exert their effects on stroke risk; and 3) perform outcome-based Replication and Extension analyses of our recent findings demonstrating an inverse relationship between CNV burden and stroke outcome at 3 months (modified Rankin Scale, mRS) in these additional datasets, and then determine the key CNV drivers responsible for these associations using existing biomarker data. Our study will leverage the numerous advantages of existing case-control datasets to explore the relationships between CNV and IS, its subtypes, and outcome at 3 months across the sex, age, and ethnicity spectrums. The proposed study will create a new training network for junior investigators and establish a unique resource for the continued study of the genetic basis of IS. The successful identification of novel genes, pathways, and drug targets has the potential to transform our understanding of stroke pathophysiology, leading to more effective prevention, treatment, and outcome strategies.