Project Summary: eQTL Mega-analysis for Functional Assessment of Multi-enhancer Gene Regulation

This proposal is in response to RFA HG-13-013, Interpreting Variation in Human Non-Coding Genomic Regions Using Computational Approaches and Experimental Assessment (R01). It uses statistical modeling to identify multiple regulatory variants per transcript genome-wide, validates their function by genome engineering, and establishes their relevance in the context of inflammation. We propose to combine two parallel approaches to the identification of regulatory polymorphisms in a unique resource of 10,000 peripheral blood transcriptome profiles linked to whole genome genotypes. Multivariate regression will then be used to fine map the highest probability common variants, focusing on those that play a critical role in transcriptional regulation specifically in the context of inflammatory autoimmune diseases. CRISPR/Cas9-mediated site-specific genome engineering will be used to experimentally confirm the predictions on a moderate-throughput basis for autoimmune loci in a lymphoid cell line. The computational approach will apply hierarchical sparse learning (structured SL) models, informed by empirical measures of linkage disequilibrium and incorporating evolutionary probabilities and ENCODE functional annotations, to predict which variants are most likely to influence transcript abundance. Extensive simulations will be used to define parameters influencing the sensitivity and specificity of multivariate regulatory polymorphism detection, while also reducing the regulatory target for each transcript to just a dozen variants. Since a major objective of the RFA is not just to prioritize regulatory variants, but also to establish their influence on organismal phenotypes, we will profile their association with transcript abundance in T-lymphocytes isolated from peripheral blood samples exposed for 24 hours to lipopolysaccharide (LPS) or the inflammatory cytokine TNFα.
Peripheral blood contains most of the relevant immune cell types, and our expectation is that genetic effects are modified in disease by the inflammatory agents, with some variants losing their effect and other, novel variants arising. Furthermore, direct demonstration of regulatory function will be obtained, using genome engineering, for a set of up to 150 inflammatory autoimmune disease genes already identified by GWAS. Non-homologous end joining will be used to disrupt each candidate site in a screening step, with droplet digital PCR used to measure the impact of mutations on gene expression. Homology-directed repair will then be used for allele-specific replacement, in a handful of cases generating all possible haplotypes to experimentally confirm the predicted joint effects in a common genetic background. The computational and experimental approaches are expected to be extensible to many common diseases, and all code will be made publicly available in conjunction with the MEGA suite of software for evolutionary genome analysis.