Systemic lupus erythematosus (SLE) is an autoimmune disease that causes chronic inflammation and may affect any part of the body, especially the skin, joints, kidneys, brain, and blood. More than 1.5 million Americans are estimated to have SLE, and over 16,000 new cases are reported annually across the country, calling for the development of therapies to prevent or manage long-term manifestations of the disease. Although survival has improved greatly owing to advances in research on disease mechanisms and to aggressive therapy, there is no cure for SLE. Among autoimmune disorders, SLE is one of the most difficult to understand and treat, with great heterogeneity in pathogenesis, manifestations, and responses to therapy. Genetic predisposition is a major factor in SLE, and genetic variation is perhaps an important contributor to this heterogeneity. Recent studies have suggested that expression quantitative trait loci (eQTL) mapping, an effective tool for discovering the genetic footprints of transcriptional variation, may increase the chance of detecting polymorphisms related to susceptibility and therapeutic response in SLE patients. The goal of this project is to develop statistical and computational approaches to eQTL mapping with next-generation sequencing data, in order to improve the detection of important genetic variants, provide insight into gene regulation, and lead to a deeper understanding of the genetics of SLE. To understand the molecular mechanisms and genetic factors of SLE, my primary mentor, Dr. Wakeland, and his collaborators have been generating RNA sequencing data, coupled with targeted DNA sequencing data, for samples from hundreds of SLE patients and controls.
These large-scale, multi-dimensional sequencing data, with unprecedented resolution and accuracy, offer great opportunities to generate significant scientific findings, while posing great challenges for data management, data analysis, and interpretation of results. We have three specific aims: (1) Aim 1: To develop and implement computational and statistical algorithms to build a pipeline for processing RNA and DNA sequencing data. We will build a pipeline to perform quality checks on raw sequencing reads, align reads to the reference genome, call SNPs for DNA-seq samples, and, for RNA-seq samples, identify splicing events, reconstruct isoforms, and quantify isoform-level and overall gene expression. (2) Aim 2: To develop statistical models to map isoform-specific eQTLs targeting common alleles. We will first map gene-level eQTLs in immune-related genetic regions using conventional statistical methods. We will then develop a novel statistical model to identify SNPs that influence the regulation of isoform expression. (3) Aim 3: To develop statistical approaches to map multiple loci for improved detection of rare and weak-effect variants. We will develop multi-locus methods that use prior biological information to group SNPs into annotated genetic regions. First, we seek to identify local eQTL intervals located within and near a gene region. Second, we will develop methods that aggregate SNPs belonging to the same functional set, such as a pathway or network, and then identify gene sets that may regulate gene expression. This multi-locus strategy may identify distant genetic regulators. The proposed research is for a K25 Mentored Quantitative Research Development Award that will prepare the candidate for a successful career in quantitative biomedical research.
The candidate's primary career goal is to become an independent investigator and an expert in developing statistical and computational methodologies for high-throughput genetic data, in order to improve the understanding of genetic profiles, to discover genetic markers for diagnosis and prognosis, and to promote prevention and treatment of SLE and other genetic disorders.
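As an illustration of the first step of Aim 2, the following is a minimal sketch of a single-SNP, gene-level eQTL test. It assumes that the "conventional statistical methods" include simple linear regression of expression on additive allele dosage (0, 1, or 2 copies of the minor allele), which is one common choice and not necessarily the method the project will use. The function name `eqtl_linear_test` and the toy data are hypothetical and serve only to show the shape of the calculation.

```python
from math import sqrt


def eqtl_linear_test(dosages, expression):
    """Regress expression on allele dosage; return (slope, t_statistic).

    dosages:    minor-allele counts per sample (0, 1, or 2), additive coding
    expression: expression level of one gene in the same samples
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired samples")

    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))

    # Ordinary least-squares estimates.
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g

    # Residual variance with n - 2 degrees of freedom gives the slope's SE.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(dosages, expression))
    se = sqrt(rss / (n - 2) / sxx)

    # The t-statistic is undefined when the fit is perfect (zero residuals).
    t_stat = slope / se if se > 0 else float("nan")
    return slope, t_stat


# Toy data, invented for illustration only: six samples, one SNP, one gene.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, t_stat = eqtl_linear_test(toy_dosages, toy_expression)
print(f"slope per allele copy: {slope:.3f}, t-statistic: {t_stat:.2f}")
```

In a real analysis the t-statistic would be compared against a t distribution with n - 2 degrees of freedom, covariates would typically be added to the model, and the test would be repeated across many SNP-gene pairs with correction for multiple testing. None of these steps is shown here.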