PROJECT SUMMARY/ABSTRACT
Studies of germline genetic variation in cancer cases and controls, as well as studies of somatic mutation, have transformed our understanding of cancer etiology and led to the development of life-saving cancer interventions. However, even though tumor progression, evolution, and treatment response are influenced by both somatic and germline variation, these data have largely been examined in isolation. In this work, we propose to integrate extensive data collection, novel statistical methods, and cutting-edge functional validation to discover and characterize somatic-germline interactions in a pan-cancer study. Results from our work will significantly benefit both cancer researchers and multiple medical research disciplines more broadly. Within the cancer genetics field, identifying somatic-germline interactions will help (i) identify new classes of drug targets causally upstream of those identified through somatic driver mutations, (ii) precisely treat patients by selecting interventions on the basis of germline and somatic genetics as well as tumor RNA-sequencing, (iii) improve risk profiling, especially for tumor recurrence and outcomes, and (iv) develop hypotheses about the mechanisms of germline risk variants, especially non-coding variants. To accomplish these goals, we will leverage tumor sequencing from the DFCI Profile Project together with recent innovations in variant imputation to assemble the largest (N>25,000) pan-cancer germline-somatic cohort to date. We will develop novel statistical and computational methods to maximize the value of these data. Because over 90% of germline genetic variation associated with cancer risk and outcomes is in non-coding regions of the genome, we especially focus on integration of functional genomic sequencing from both tumor and normal tissues.
Our methods will be capable of modelling proximal germline-somatic interactions as well as distal effects of germline variation on trans and global somatic changes. Furthermore, by focusing largely on RNA-sequencing, we investigate a gene-centric model that provides specific mechanistic hypotheses for non-coding variation that is otherwise difficult to interpret, and these hypotheses can be readily validated through our experimental follow-up.