Project Summary/Abstract

It is currently feasible for small research groups to sequence individual genomes and for larger groups to sequence tens of thousands of individuals. Unfortunately, our ability to identify variants that impact phenotype has not kept pace with our sequencing capacity. This is particularly true of non-coding variants. This proposal presents a pilot screen of 4,972 disease alleles, which revealed that 10% of exonic mutations affect splicing. The pilot study also revealed that splicing mutations are not uniformly distributed across disease genes. Preliminary results identify 64 diseases that are significantly more likely to be caused by a splicing mutation. This proposal will use a reporter system to test the effect of substitutions and indels on splicing and to annotate splicing elements in medically relevant genes. The data set created by this approach will be used to train an online splicing mutation prediction tool. This project will also screen all variants that fall within 75 nucleotides of a splice site in the set of 130 "actionable genes". The study will use a variety of cell lines reflecting distinct tissues of origin and will determine which stage of spliceosome assembly is disrupted, to better characterize these variants. Finally, Geisinger HealthCare System (GHS), in partnership with Regeneron (RGN) Pharmaceuticals, has created a unique dataset of paired genotypic and phenotypic data. The GHS MyCode project has enrolled over 160,000 patients and completed whole exome sequencing (WES) on over 60,000 of those patient samples. This set will be used to identify (and verify) carriers of variants predicted to alter splicing. Deep re-phenotyping of these patients will assess the contribution of splicing defects to EHR QTLs, age of onset, severity, penetrance, and differential engagement across multiple organ systems.