Coronary artery disease (CAD) is a leading cause of death worldwide and in the US. Although the genetics of this disease are intrinsically complex, large research investments over the last 5-10 years, particularly in genome-wide association studies (GWAS), have produced a more unbiased, data-driven and realistic view of CAD. As part of this achievement, ~160 common risk loci for CAD/myocardial infarction (MI) have been identified. An important task now is to understand the molecular mechanisms and pathways by which these loci confer risk for CAD/MI, so that the initial findings can be translated into new therapies and diagnostics. However, since the loci identified thus far explain only ~10% of the variation in CAD/MI risk, it is also essential to define additional CAD pathways operating in parallel with GWA loci. In recent years, clinical studies that consider intermediate phenotypes (between DNA and disease) have greatly enhanced the interpretation of risk loci identified in GWA datasets. In addition, disease networks identified from intermediate molecular phenotypes provide an essential framework for identifying novel CAD pathways and targets for new CAD therapies. Over the last 6 years, we have performed a clinical study considering many intermediate phenotypes in CAD patients (the STARNET study). In this proposal, we intend to use newly generated DNA genotype and RNA sequence data from the STARNET study to identify atherosclerosis and metabolic networks underlying CAD. We then propose a new prospective study of CAD (the NGS-PREDICT study) with the main purpose of validating findings from the STARNET study. We hypothesize that the extent and stability of coronary lesions, and thus clinical outcomes, can be accurately assessed by defining the status of key atherosclerosis gene networks. In turn, metabolic networks active in liver, abdominal fat, and skeletal muscle influence the status of the atherosclerosis gene networks.
In addition, molecular data obtained from easily accessible tissues (e.g., blood, subcutaneous fat and plasma) can be used to identify biomarkers that predict the risk of clinical events caused by CAD. To test these hypotheses, we propose the following specific aims. Aim 1: To identify regulatory Bayesian gene networks causally linked to CAD and/or CAD sub-phenotypes using the STARNET datasets and the CARDIoGRAM meta-analysis GWA datasets. Aim 2: To identify biomarkers predicting clinical events of CAD (reflected in SYNTAX score) by applying machine learning to DNA genotype, RNA sequence and CAD plasma protein data from easily accessible tissues of the STARNET cases. Aim 3: To validate the identified causal CAD eQTLs/networks and the biomarkers using the NGS-PREDICT study performed at Mt. Sinai Hospital, the Swedish Twin study, and CAD cell and animal models. We believe the proposed studies can lead to a significantly better molecular understanding of CAD and thus serve the longer-term goal of preventive and personalized therapies for CAD patients diagnosed within well-defined molecular subcategories.