Understanding both normal and pathogenic patterns of human gene expression can help shed light on the biology of human disease. Thousands of studies have now measured gene expression in different tissues and diseases. By aggregating and analyzing all available human RNA-sequencing data using a high-powered computational and statistical framework, we will provide a transformative resource for characterizing human gene expression patterns, including rare transcriptional events, cellular networks, and genetic variation. In Aim 1 we propose to uniformly process all publicly available human transcriptome sequencing data and collect it into a publicly available resource called the Transcriptome Aggregation Resource (TAR); at least 150,000 samples will be processed using cloud computing. This resource will contain single-base resolution maps of expression, de novo mapped exon-exon splice junctions, and allele-specific expression across a set of common variants. We will supplement the expression data with cleaned and predicted metadata. In Aim 2 we will develop the statistical and computational methods necessary to fully realize the potential of this resource. Specifically, we will remove unwanted variation at scale and develop mixture models to summarize the large data resource at the gene, junction, and single-base levels. In Aim 3 we will analyze this resource to address fundamental questions in expression biology, including a systematic study of expression outliers and allele-specific expression at gene, junction, and single-base resolution. We will infer well-powered co-expression networks over both expressed genes and splicing patterns. This work will contribute significantly to our understanding of gene expression by analyzing genomics data at a massive scale.