PROJECT SUMMARY
Large-scale cancer sequencing efforts provide a unique opportunity to discover germline and somatic driver alterations that influence cancer susceptibility, initiation, progression, and clinical response. Detecting such alterations is fundamentally and technically challenging for several reasons, including 1) the combinatorially enormous number of ways that a genome can be altered, 2) the presence of repeats of various sizes, highly homologous gene families, and other contextual influences on alignment and detection accuracy, 3) systematic errors inherent in current sequencing technologies and tumor preservation techniques, and 4) intratumoral and intertumoral heterogeneity, including clonality, purity, and lymphocyte infiltration. As a result, the full complement of driver events for the typical tumor still defies identification and, in many cases, no drivers can be found. Our recent work has also demonstrated that some types of indels/SVs, such as complex indels, ITD/PTD (internal/partial tandem duplications), and homopolymer indels, are often missed by existing approaches. Beyond detection challenges, functional interpretation of the impact of genomic alterations requires strategies that integrate WGS/exome, RNA-seq, and protein data to reveal translational, splicing, and protein structural effects. In addition, cooperative dynamics between germline and somatic alterations are usually missed because these events have been analyzed independently. As cancer sequencing projects expand to include well-curated clinical phenotypes, methods for understanding the pathogenicity and druggability of the driver alterations that underlie phenotypes such as drug resistance or exceptional response are also urgently needed. To fully harness the power of large-scale cancer genomics and to facilitate advances in personalized medicine, our group proposes to focus on two core competencies outlined in the RFA: coding and non-coding mutations.
In collaboration with other GDACs, the GDC, and AWGs, we will extend the computational approaches that we have successfully established and applied in the TCGA and ICGC projects to detect germline and somatic drivers, and to interpret them functionally and clinically, using sequencing data from GCC along with curated clinical data.