Regulatory sequences determine the level, location and timing of gene expression. These sequences are important in nearly all biological processes and many disease conditions. In many cases, the onset of cancer is likely related to changes in these regulatory sequences. This might involve single nucleotide changes that destroy or create motifs for transcription factor binding. In other cases, structural variants might translocate a gene from one location to another, placing it under the control of the wrong regulatory region entirely. In still other cases, integration of viral regulatory sequences into the promoter region of genes might drive expression of oncogenes. The proposed project will develop new tools to identify genes whose regulatory sequences have undergone changes leading to cancer. Specifically, we will develop new software for the identification and prioritization of non-coding mutations from whole genome sequence data. We will also develop an experimental reagent, a hybridization-based targeted-capture reagent, to allow sequencing of prioritized regulatory regions when whole genome sequencing is either too expensive or lacks coverage of the regions of interest. Genes found to have recurrently mutated regulatory regions could make suitable targets for therapeutic intervention and could also have prognostic and diagnostic value. In the long term, a better understanding of regulatory elements and gene expression patterns could help in the development of gene-based therapies that reduce the undesired side effects of conventional cancer therapies.