Infrastructure for the ClinSeq(c) Project.
A Phase I cohort of 1,000 individuals will be evaluated at the NIH Clinical Research Center for a set of cardiovascular phenotypic features, including, but not limited to, coronary artery calcification, lipid profiles, and blood pressure. Participants will be selected to fall within a spectrum of coronary artery calcification from normal to disease phenotype. Participants undergo a clinical evaluation, targeted clinical tests, and blood sample collection for genomic analysis, and they provide baseline information about pertinent health behaviors and a family history. Exome sequencing of peripheral blood DNA will be performed on all samples and has been completed in >950 subjects to date. Importantly, ClinSeq(c) subjects are consented for return of results (both research and clinical) and for re-contact for iterative phenotyping. ClinSeq(c) was designed to provide the long-term potential for pursuing many different clinical projects. We propose to select subsets of subjects from the ClinSeq(c) dataset, identified by their genomic attributes, and to explore their phenotypic manifestations as a new path to understanding genotype-phenotype relationships. Furthermore, we have opened a new phase of recruitment, ClinSeq(c) Phase II, which targets African Americans, a group that is underrepresented in clinical genomics research. Through focused recruitment in the community, we have so far recruited about 150 individuals toward a goal of 500.

Piloting large-scale medical sequencing (LSMS) in a clinical research setting.
We have made excellent progress in this area over the last year and are continuing to push forward.
Following on our success in demonstrating the ability to identify unselected patients at risk for cancer susceptibility (Johnston et al, 2012), we have extended that approach to cardiovascular diseases by identifying patients at risk for cardiomyopathies and malignant dysrhythmias (Ng et al, 2013). We have also demonstrated the feasibility of doing this for malignant hyperthermia (Gonsalves et al, in press). These studies required developing the capacity to screen exomes for hundreds to thousands of variants, filter them for analytic and clinical validity, and integrate them with clinical data. To continue and expand this work, we have broadened our focus and are now undertaking a screen of all null variants in all genes known to cause human disease in an autosomal dominant pattern, both to demonstrate the generalizability of our approach and to determine the frequency of occult dominant disease in an adult cohort. The second project under development dovetails with the clinical and behavioral ClinSeq(c) project by screening for all variants that cause disease in an autosomal recessive pattern and for variants associated with pharmacogenetic traits. In addition, new goals for 2013 include developing a capacity for analysis of mitochondrial and copy number variation from exome data.

Using genomic technologies to discover genes and variants involved in common disease.
We have developed the infrastructure and capability to perform RNAseq transcriptome analysis using both peripheral blood and lymphoblast-sourced mRNA. To pilot this technology for disease gene discovery, we applied a tails approach to our cohort, selecting the 8 patients with the highest coronary calcium scores and 8 age- and sex-matched controls. Peripheral blood mRNA was isolated, reverse-transcribed, sequenced, and quantified to identify differentially expressed transcripts. Twenty candidate transcripts were validated using an orthogonal expression technique.
These candidates were then replicated in an expression array dataset from the Framingham cohort. One candidate transcript survived these analyses. We then identified cis SNPs from this gene and determined that one of them functions as a cis-eQTL for expression and that this SNP is directly associated with coronary calcium levels. We have validated the structure of this poorly understood gene, analyzed the cDNA, characterized its expression in blood cells, and shown that it accumulates in the foamy necrotic plaques of vessels in atherosclerosis. We are working with collaborators at USUHS to validate the malignant hyperthermia variants using a functional assay in lymphoblast cells. We are also working with collaborators in NHLBI to investigate the relationship of variants to cardiomyopathy traits.

Evaluating the pathogenicity of variants identified above.
A key challenge is to validate variants identified by sequencing and informatics approaches to determine whether they are actually pathogenic. This can be done in a number of ways, including clinical phenotyping and pedigree analysis (as performed in HG300387-01) and laboratory experiments. For the latter, we propose to model putative mutations using in vitro assays. For the malignant hyperthermia project, we are collaborating with Prof Sheila Muldoon of USUHS to use an in vitro calcium flux assay to test the pathogenicity of the variants identified in our publication (Gonsalves et al, in press). For the cardiomyopathy variants, we are developing a collaboration with a stem cell researcher to test the functional consequences of the mutations in differentiated human ES cells.

Supporting other research groups by providing control data and phenotyping of controls.
We have made ClinSeq(c) variant data widely and openly accessible via dbSNP. This has allowed a number of investigators to use our data as a comparison set to measure the background frequency of variants in particular genes.
This has led to the publication of a number of papers (Belot et al, 2013; Girirajan et al, 2013; Landour et al, 2013; McLaughlin et al, 2012; Rinaldi et al, 2012; Vester et al, 2013). ClinSeq(c) control data are also heavily used by the NIH Undiagnosed Diseases Program and other investigators. ClinSeq(c) data have also been uploaded into the NIH Clinical Center BTRIS research database, which is accessible to all intramural investigators.

Exploring and developing novel tools and approaches for analyzing genomic data.
ClinSeq(c) supported the development of VarSifter, a shareware/open source genome analysis program. We are expanding it to support analysis of mitochondrial variation and copy number variation. We collaborate with the Inherited Diseases Branch to use genetic epidemiology tools for the analysis of our clinical and genomic data.
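
As an illustration of the kind of variant filtering discussed in this report, the screen for null variants in autosomal dominant disease genes described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the VarSifter implementation or the ClinSeq(c) pipeline: the `Variant` record, the consequence labels, the gene names, and the quality and depth thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical consequence labels treated as "null" (loss-of-function) variants.
NULL_CONSEQUENCES = {"nonsense", "frameshift", "splice_site"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified variant call; real exome records carry far more fields."""
    gene: str
    consequence: str
    genotype_quality: float  # assumed per-call quality score
    depth: int               # read depth at the site


def screen_null_variants(variants, dominant_genes, min_quality=20.0, min_depth=10):
    """Keep calls that pass the quality filters, are null, and lie in a listed gene.

    The thresholds are illustrative defaults, not values taken from the report.
    """
    return [
        v for v in variants
        if v.genotype_quality >= min_quality
        and v.depth >= min_depth
        and v.consequence in NULL_CONSEQUENCES
        and v.gene in dominant_genes
    ]


# Illustrative input only; these are not ClinSeq(c) data.
example_variants = [
    Variant("GENE_A", "frameshift", 45.0, 30),  # null, good quality, listed gene
    Variant("GENE_A", "missense", 50.0, 40),    # not a null variant
    Variant("GENE_B", "nonsense", 12.0, 25),    # fails the quality threshold
    Variant("GENE_C", "nonsense", 60.0, 35),    # gene not on the dominant list
]
example_hits = screen_null_variants(
    example_variants, dominant_genes={"GENE_A", "GENE_B"}
)
```

In practice, each surviving call would still need the assessment of clinical validity and the integration with phenotype data that the report describes; the sketch covers only the mechanical filtering step.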