Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease, and this heterogeneity is not captured by the degree of airflow limitation measured by spirometry. One of the goals of the Genetic Epidemiology of COPD Study (COPDGene) is to define meaningful subgroups of COPD, leading to a new disease classification. The clinical and statistical approaches used to define these subgroups draw on the extensive phenotype data collected in COPDGene, including chest CT scans. Unlike the main COPDGene study, the Integrative Genomics of Clinical Subtypes in COPDGene study will use RNA sequencing and miRNA profiling for subtyping. The hypothesis is that different COPD subtypes have distinct pathophysiology, which can be identified through gene expression signatures, miRNA profiling and integrative genomics studies. Genomewide genotyping data can be used to test for associations with the subtypes; however, traditional genomewide association studies may be underpowered to detect subtype effects. Gene expression is an important intermediate phenotype between genotypes and complex traits, and can be readily assayed in peripheral blood. In this proposal, we will use expression quantitative trait locus (eQTL) analysis to identify functional single nucleotide polymorphisms (SNPs) affecting expression of differentially expressed genes and miRNAs. We will address the following Specific Aims: (1) Gene expression profiling in COPD subtypes: We will collect peripheral blood RNA samples from subjects in COPDGene, perform RNA sequencing and miRNA profiling, and test for differentially expressed transcripts and miRNAs in two clinical subtype comparisons: (A) emphysema-predominant vs. airway-predominant COPD and (B) frequent vs. infrequent acute exacerbations. We will validate the blood associations by RNA sequencing in COPD lung tissue samples. (2) Molecular subtypes of COPD: We will use statistical and machine learning methods to define molecular subtypes based on the RNA sequencing and miRNA data. 
We will validate the molecular subtypes against the clinical, imaging and longitudinal follow-up data. (3) Integrative genomics of COPD subtypes: We will integrate the gene and miRNA expression data with genomewide SNP data to identify eQTL SNPs associated with transcript levels of the differentially expressed and subtype-defining genes and miRNAs from Aims 1 and 2. The eQTL SNPs will then be tested for association with the clinical subtypes in the full COPDGene Study population. This proposal will be distinct from, yet complementary to, the parent COPDGene Study: it will use mRNA and miRNA expression data to identify genetic influences on COPD subtypes, which could reveal biomarkers of disease subtypes or novel pathways and targets, moving towards the goal of precision medicine in COPD. The gene expression, miRNA and eQTL datasets will serve as resources for COPDGene and the community of COPD investigators.
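As an illustration of the eQTL idea in Aim 3, the following is a minimal sketch of a single SNP-transcript test: an ordinary least-squares regression of a transcript's expression level on allele dosage (0, 1 or 2 copies of the alternate allele). It is not the proposal's analysis pipeline. The function name `eqtl_slope_test` and the toy data are hypothetical, and the sketch omits the covariate adjustment and multiple-testing correction that a genomewide analysis would need.

```python
import math


def eqtl_slope_test(dosages, expression):
    """Regress expression on allele dosage with simple OLS.

    Hypothetical helper for illustration only.
    Returns (slope, t_statistic) for the dosage effect.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n

    # Sum of squares of the genotype and its cross-product with expression.
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))

    slope = sxy / sxx
    intercept = mean_e - slope * mean_g

    # Residual variance with n - 2 degrees of freedom (slope and intercept).
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(dosages, expression))
    se = math.sqrt(rss / (n - 2) / sxx)

    if se == 0:
        # Perfect fit: the t-statistic is undefined for a zero slope.
        t_stat = 0.0 if slope == 0 else math.copysign(math.inf, slope)
    else:
        t_stat = slope / se
    return slope, t_stat


# Toy data (made up): expression rises by about one unit per allele copy.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, t_stat = eqtl_slope_test(toy_dosages, toy_expression)
```

A large absolute t-statistic suggests that the SNP is associated with the transcript level; in this framing, SNPs that pass such a test are the candidates that would then be tested against the clinical subtypes.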