Infrastructure for the ClinSeq Project. A phase I cohort of 1,000 individuals has been evaluated at the NIH Clinical Research Center for a set of cardiovascular phenotypic features, including, but not limited to, coronary artery calcification, lipid profiles, and blood pressure. Participants were selected to fall within a spectrum of coronary artery calcification from normal to disease phenotype. Participants underwent a clinical evaluation, targeted clinical tests, and blood sample collection for genomic analysis, and they provided baseline information about pertinent health behaviors and a family history. Exome sequencing of peripheral blood DNA has been performed on >950 subjects. Importantly, ClinSeq(c) subjects are consented for return of results (both research and clinical results) and for re-contact for iterative phenotyping. ClinSeq(c) was designed to provide the long-term potential for pursuing many different clinical projects. We propose to select subsets of subjects from the ClinSeq(c) dataset, identified by their genomic attributes, and to explore their phenotypic manifestations as a new path to understanding genotype-phenotype relationships. Furthermore, we have opened a new phase of recruitment, ClinSeq Phase II, which targets African Americans, a group that is underrepresented in clinical genomics research. By committing to focused recruitment in the community, we have so far recruited about 250 individuals toward a goal of 500.
Piloting large-scale medical sequencing (LSMS) in a clinical research setting. We have made excellent progress in this area over the last year and continue to push forward. We previously demonstrated the ability to identify unselected patients at risk for cancer susceptibility (Johnston et al, 2012), cardiomyopathies and malignant dysrhythmias (Ng et al, 2013), and malignant hyperthermia (Gonsalves et al, in press).
These studies required the development of the capacity to screen exomes for hundreds to thousands of variants, filter them for analytic and clinical validity, and integrate them with clinical data. To continue and expand this work, we have broadened our focus and undertaken a screen of all null variants in all genes known to cause human disease in an autosomal dominant pattern, both to demonstrate the generalizability of our approach and to determine the frequency of occult dominant disease in an adult cohort. This project is in draft manuscript stage and has been submitted for publication. The second project under development dovetails with the clinical and behavioral ClinSeq project by screening for all variants that cause disease in an autosomal recessive pattern and for variants associated with pharmacogenetic traits. This enormous project (thousands of variants) will likely take 3 years to complete. We have also developed the capacity to analyze mitochondrial and copy number variation from exome data. These results show robust detection of mitochondrial variants, including variants associated with human disease.
Using genomic technologies to discover genes and variants involved in common disease. We have developed the infrastructure and capability to perform RNAseq transcriptome analysis using both peripheral blood and lymphoblast-sourced mRNA. To pilot this technology for disease gene discovery, we applied a tails approach to our cohort, selecting the 8 patients with the highest coronary calcium scores and 8 age- and sex-matched controls. Peripheral blood mRNA was isolated, reverse-transcribed, sequenced, and quantified to identify differentially expressed transcripts. Twenty candidate transcripts were validated using an orthogonal expression technique. These candidates were then tested for replication in an expression array dataset from the Framingham cohort. One candidate transcript survived these analyses.
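As an illustration only, the tails selection described above (the highest coronary calcium scores plus age- and sex-matched controls) could be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure actually used: the `Participant` record, the `select_tails` function, the rule that controls have a calcium score of zero, the 5-year age tolerance, and the greedy nearest-age matching are all hypothetical choices made for the example, and the toy cohort is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """Hypothetical minimal record; field names are illustrative only."""
    pid: str
    age: int
    sex: str          # "F" or "M"
    cac_score: float  # coronary artery calcium score


def select_tails(cohort, n_cases=8, max_age_gap=5):
    """Return (case, control) pairs: top-n scorers, each with one matched control.

    Assumptions (not stated in the report): controls are drawn from
    participants with a score of zero, matched on sex and on age within
    max_age_gap years, using greedy nearest-age matching.
    """
    ranked = sorted(cohort, key=lambda p: p.cac_score, reverse=True)
    cases = ranked[:n_cases]
    pool = [p for p in ranked[n_cases:] if p.cac_score == 0]
    pairs = []
    for case in cases:
        eligible = [
            c for c in pool
            if c.sex == case.sex and abs(c.age - case.age) <= max_age_gap
        ]
        if not eligible:
            pairs.append((case, None))  # no acceptable match; flag for review
            continue
        control = min(eligible, key=lambda c: abs(c.age - case.age))
        pool.remove(control)  # each control is used at most once
        pairs.append((case, control))
    return pairs


# Toy cohort with invented values, for illustration only.
toy_cohort = [
    Participant("P1", 62, "M", 850.0),
    Participant("P2", 58, "F", 640.0),
    Participant("P3", 60, "M", 0.0),
    Participant("P4", 59, "F", 0.0),
    Participant("P5", 45, "M", 0.0),
    Participant("P6", 61, "F", 12.0),
]
toy_pairs = select_tails(toy_cohort, n_cases=2)
```

Greedy matching is simple but order-dependent; a real analysis might prefer an optimal matching method or additional matching variables.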
We then identified cis SNPs for this gene and determined that one of them functions as a cis-eQTL for its expression and that this SNP is directly associated with coronary calcium levels. We have validated the structure of this poorly understood gene, analyzed the cDNA, characterized its expression in blood cells, and shown that it accumulates in the foamy necrotic plaques of vessels in atherosclerosis. This manuscript has been accepted for publication in Am J Hum Genet (Sen et al, 2014).
Evaluating the pathogenicity of variants identified above. A key challenge is to validate variants identified by sequencing and informatics approaches to determine whether they are actually pathogenic. This can be done in a number of ways, including clinical phenotyping and pedigree analysis (as performed in HG300387-01) and laboratory experiments. For the latter, we propose to model putative mutations using in vitro assays. For the malignant hyperthermia project, we are collaborating with Prof Sheila Muldoon of USUHS to use an in vitro calcium flux assay to test the pathogenicity of the variants identified in our publication (Gonsalves et al, 2013). For the cardiomyopathy variants, we are developing a collaboration with a stem cell researcher to test the functional consequences of the mutations in differentiated human ES cells.
Supporting other research groups by providing control data and phenotyping of controls. We have made ClinSeq variant data widely and openly accessible via dbSNP. This has allowed a number of investigators to use our data as a comparison to measure the background frequency of variants in particular genes, and it has led to the publication of a number of papers (Belot et al, 2013; Girirajan et al, 2013; Landour et al, 2013; Vester et al, 2013; Wassiff et al, in press; Cross et al, in press). ClinSeq control data are also heavily used by the NIH Undiagnosed Diseases Program and other investigators.
ClinSeq data have also been uploaded into the NIH Clinical Center BTRIS research database, accessible to all intramural investigators. We have nearly completed deposition of our data into dbGaP. Finally, we measure the success of ClinSeq for the intramural program by its impact on the wider research community. To that end, ClinSeq has served as the foundation for the development of the Intramural Research Program Clinical Center Genomics Opportunity (CCGO), which has provided, through competitive applications, 1,000 exomes to be used by researchers outside of NHGRI to explore their phenotypes of interest. We are also using the ClinSeq incidental findings analysis experience to develop a clinical incidental findings service for the CCGO program.