Despite the important role of distant-acting transcriptional enhancers in human biology and disease, identifying their locations in the genome and determining their in vivo regulatory activities remain major challenges. There is compelling evidence from a large number of recent genome-wide association studies that variation in noncoding genomic intervals contributes on a substantial scale to a wide range of traits and disorders. However, the paucity of comprehensive enhancer catalogues has largely precluded systematic studies of the underlying regulatory and etiological mechanisms. Accordingly, the central goal of the parent grant and this renewal application is to identify and define the in vivo activities of a sizeable set of enhancers in the human genome to serve as a broadly accessible community resource. In funded years 1 through 3 (2006-2009), we established the power of extreme comparative genomics to identify a large collection of putative enhancers and unambiguously assigned specific in vivo enhancer function to hundreds of conserved human sequences. However, this comparative approach fails to predict a priori which conserved noncoding fragments are indeed enhancers and, if so, where they will be active in vivo, so massive-scale transgenesis is required to resolve these questions. More recently, we demonstrated the power of chromatin immunoprecipitation targeting an enhancer-associated transcriptional coactivator (p300) coupled with massively parallel next-generation sequencing (ChIP-Seq) to accurately identify enhancers active directly in mouse tissues. This scalable experimental approach has assigned putative enhancer function to thousands of noncoding regions, thereby dramatically expanding access to genome-wide sets of enhancers active in particular cell types or tissues.
However, while p300 represents a remarkable epigenomic marker for enhancer identification in vivo, it is just one member of a larger class of transcriptional coactivators and marks only subsets of known enhancers. Based on these findings, we propose to extend these ChIP-Seq studies to 11 prioritized transcriptional coactivators and then to validate the enhancer predictivity of each epigenomic mark through a series of high-throughput transgenic mouse assays. It is anticipated that public access to these data sets will substantially fill the void in gene regulatory annotation of the human genome and help decipher how variation in these sequences causes human disease. PUBLIC HEALTH RELEVANCE: The complete human genome sequence serves as a routine starting point for a large community of investigators and has aided in defining the majority of genes in our genome. However, our understanding of the sequences that regulate these genes is meager, despite their presumed alteration in human disease. Here, we propose to identify epigenomic marks on DNA and to test the marked sequences for their ability to act as gene regulatory sequences in transgenic mice. The identification of positive signatures of enhancers in vivo is expected to substantially fill the void in gene regulatory annotation of the human genome and to help decipher how mutation of these sequences causes human disease.