PROJECT SUMMARY ABSTRACT

We propose to develop a two-step computational strategy that integrates functional genomic data to improve the power and resolution of identifying non-coding variants causal for autoimmune rheumatic disease. The computational methods developed here address an important problem in disease biology: pinpointing the precise disease-causing mutations implicated by genome-wide association studies (GWAS) and understanding the biological mechanisms by which they act. We will develop our program using activated CD4+ T cells as a model system because of their relevance to autoimmune rheumatic disease, the availability of functional genomic data, and the ability to experimentally manipulate primary T cells and related cell lines.

The three overlapping aims are:

1. Leverage allele-specific reads to increase the power of detecting functional genomic quantitative trait loci (fgQTLs). We will (i) develop an approach to accurately quantify allele-specific reads from functional genomic sequencing data while accounting for sequencing and mapping biases, (ii) develop a linear mixed model (LMM) method to perform phase-aware association tests for functional genomic traits, and (iii) apply the method to identify expression and chromatin accessibility QTLs in activated CD4+ T cells from ~100 individuals.

2. Nominate causal non-coding variants in autoimmune rheumatic disease-associated loci. We will (i) develop a method that leverages functional genomic QTLs to fine-map disease-causing variants in a locus, (ii) apply the method to integrate expression and chromatin accessibility QTLs from Aim 1 with three autoimmune rheumatic disease GWAS datasets to identify disease-causing variants most likely associated with CD4+ T cell activation, and (iii) computationally refine and annotate causal variants using orthogonal functional genomic data in CD4+ T cells.

3. Validate predictions using synthetic biology and genome engineering. We will (i) use massively parallel reporter assays (MPRAs) to test, in activated Jurkat cells, ~500 synthetic constructs harboring predicted causal variants from Aims 1 and 2, prioritized for GWAS loci. We will also use CRISPR/Cas9 to (ii) knock out 25 enhancers harboring causal variants (a subset of the MPRA hits) in Jurkat cells and primary CD4+ T cells and (iii) knock in 10 predicted causal variants in primary CD4+ T cells. We will observe the endogenous effects of these genome edits by profiling molecular and cellular phenotypes during CD4+ T cell activation and differentiation.
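
As an illustration of the kind of allele-specific signal referred to in Aim 1, the sketch below tests a single heterozygous site for allelic imbalance with an exact two-sided binomial test. This is not the method proposed here, which the summary does not specify; it is a generic baseline under stated assumptions. The function name `allelic_imbalance_pvalue` and the parameter `null_ref_fraction` are hypothetical, and the latter only stands in for whatever sequencing and mapping bias correction the proposed approach would estimate.

```python
from math import exp, lgamma, log


def allelic_imbalance_pvalue(ref_reads: int, alt_reads: int,
                             null_ref_fraction: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    null_ref_fraction is the expected share of reference-allele reads under
    no imbalance (hypothetical parameter; 0.5 assumes no mapping bias).
    It must lie strictly between 0 and 1.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    p = null_ref_fraction

    def pmf(k: int) -> float:
        # Binomial probability of k reference reads, computed in log space
        # so that deep coverage does not overflow.
        log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

    observed = pmf(ref_reads)
    # Sum the probability of every outcome at least as unlikely as the
    # observed count; the small tolerance absorbs floating-point ties.
    total = sum(q for q in (pmf(k) for k in range(n + 1))
                if q <= observed * (1 + 1e-7))
    return min(1.0, total)
```

For example, `allelic_imbalance_pvalue(20, 0)` gives a very small p-value, while `allelic_imbalance_pvalue(10, 10)` gives a p-value of 1. A plain binomial test ignores overdispersion and phase, both of which the LMM-based approach in Aim 1 is intended to address.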