This application proposes large-scale sequencing and genotyping in type 2 diabetes (T2D) case-control samples, addressing one of the major questions in human genetics: how, and to what extent, can insights into disease etiology be advanced by studying low-frequency variants using next-generation sequencing platforms? First, building on our successful deployment of genome-wide association (GWA) analyses to identify novel common T2D-susceptibility variants, and on our leading roles in the 1000 Genomes Project, we will address the strategic issues relevant to the design of the next wave of large-scale human genetics projects. Specifically, despite great progress in mapping common variants for common diseases including T2D, the vast majority of heritability remains unexplained. Our project compares three strategies that represent near-term approaches to discovering and more fully characterizing genes for T2D and other diseases, in particular by querying lower-frequency causal alleles (such as those found in IL23R, NOD2, IFIH1, and PCSK9). The three strategies are: (1) imputation and in silico association analysis using existing GWA data and data from the 1000 Genomes Project; (2) design and deployment of a next-generation high-density SNP array (~5M SNPs); and (3) low-pass (~4x) whole-genome sequencing. Each strategy will be implemented in 3,000 T2D case-control samples from the DGI, FUSION, and WTCCC GWA sets, with extension (through imputation) to ~54,000 samples (T2D cases and controls) available from the DIAGRAM Consortium. We will evaluate each strategy with regard to completeness of variant discovery, genotype accuracy, and cost-effectiveness, providing guidance to other researchers in the field. The reference genotype data generated will, via imputation, empower GWA analysis of less common variants in a wide variety of diseases and traits.
Second, we will use the data generated to identify rare variants influencing T2D and related quantitative traits (QTs), both genome-wide (to find novel loci) and in established regions (to fine-map causal variants and identify new susceptibility alleles). By sequencing cases and controls enriched for extreme phenotypes, we will increase power to discover low-frequency alleles that were poorly captured in prior GWA studies, as well as alleles that are rare in the general population but common in cases. We will analyze related QTs in collaboration with the relevant international consortia, providing a broad set of insights into metabolic diseases. This project will leverage information from the 1000 Genomes Project to provide critical tools (genotyping, resequencing, and imputation) for next-generation genetic studies of human traits, and will facilitate identification of disease mutations. Application to T2D and related QTs will provide new insights into the pathophysiology of T2D, suggest new targets for therapy, and improve predictive genetic testing to identify individuals at risk.
PUBLIC HEALTH RELEVANCE: This proposal is relevant to several key objectives of the Grand Opportunity call. The work described is groundbreaking, innovative, high-impact research with the potential to accelerate genetic research by a wide range of investigators. The work is multi-disciplinary and integrates the activities of outstanding researchers at the Broad Institute, the University of Michigan, and (through a proposed joint-funding mechanism with the Wellcome Trust) the University of Oxford and the Wellcome Trust Case Control Consortium. The Aims have the potential to uncover a significant fraction of the as-yet-unexplained heritability in T2D, by identifying less common alleles of larger effect as well as indels and copy number variants that were not well captured by previous GWA studies.
The genes and mutations identified as influencing T2D and metabolic diseases have the potential to inform breakthrough strategies for developing drugs to treat T2D, for designing genetic tests to stratify risk, and for enabling more targeted approaches to prevention and treatment in the population.