Expansions of tandemly repeated (TR) sequences are known to cause many genetic disorders, including Huntington's disease (HD), fragile X syndrome, and multiple forms of spinocerebellar ataxia (SCA). However, testing for known TR expansions in patients with clinical symptoms of SCA usually fails to identify the underlying mutation, even in cases with a strong family history. In part, this is because short-read sequencing technologies cannot resolve repetitive genomic regions that are longer than the sequencing reads. We hypothesize that novel large expansions of pathogenic tandem repeats can be identified using third-generation, long-read sequencing technologies. To test this hypothesis, we will develop background TR models from a control cohort and search for candidate disease TR loci in a cohort of ataxia patients. We will use algorithms that we have developed (MsPac and PacMonSTR), which assemble long reads into distinct haplotypes and accurately genotype tandem repeats on both the maternal and paternal haplotypes. In Aim 1, we will genotype tandem repeats in a cohort of 600 healthy individuals sequenced with Illumina using HipSTR, and further genotype 26 healthy individuals sequenced with PacBio using MsPac and PacMonSTR. Although our preliminary data show that short reads are insufficient to detect large TRs, the majority of TRs in a normal genome are short enough to be detected with short reads, which makes short-read data sufficient to build a portion of our control cohort. In Aim 2, we turn to our cohort of individuals diagnosed with ataxia. The pedigrees of these individuals show anticipation and autosomal dominant inheritance. However, these individuals have already been screened for known ataxia mutations, and many have undergone whole-genome or whole-exome sequencing with Illumina short reads without identification of a causal mutation. From our preliminary analysis, we have identified four highly polymorphic loci that might underlie a repeat expansion disease. 
The first step will be to screen an ataxia cohort of 96 selected samples for expansions at these loci using targeted approaches. For 25 individuals in whom no expansion is detected, we intend to perform whole-genome sequencing with PacBio and detect expanded TRs using our algorithms from Aim 1. The results of this proposal will lead to the identification of novel mutations in a set of ataxia patients, and will also provide a general experimental and computational framework for identifying such mutations in any patient.
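As a rough illustration of how a control-derived background model could be used to flag candidate expansions in a patient, the sketch below summarizes control allele lengths at each locus by their median and median absolute deviation (MAD) and reports patient alleles that lie far above that background. This is only a minimal sketch under stated assumptions: the function names, the input layout (repeat-unit counts per locus), the MAD-based score, and the cutoff values are all hypothetical and are not taken from MsPac, PacMonSTR, or HipSTR.

```python
from statistics import median


def background_model(control_lengths):
    """Summarize control allele lengths at one locus as (median, MAD)."""
    med = median(control_lengths)
    mad = median(abs(x - med) for x in control_lengths)
    return med, mad


def flag_expansions(controls, patient, z_cutoff=5.0, min_controls=10):
    """Return loci where the patient's longest allele is an outlier.

    controls: dict mapping locus -> list of allele lengths (repeat units)
    patient:  dict mapping locus -> tuple of allele lengths, one per haplotype
    z_cutoff and min_controls are illustrative defaults (assumptions).
    """
    flagged = []
    for locus, lengths in controls.items():
        # Skip loci with no patient genotype or too few control alleles.
        if locus not in patient or len(lengths) < min_controls:
            continue
        med, mad = background_model(lengths)
        # 1.4826 * MAD approximates a standard deviation for normal data;
        # fall back to one repeat unit when the controls show no variation.
        scale = 1.4826 * mad if mad > 0 else 1.0
        longest = max(patient[locus])
        score = (longest - med) / scale
        if score > z_cutoff:
            flagged.append((locus, longest, round(score, 1)))
    # Most extreme candidates first.
    return sorted(flagged, key=lambda item: -item[2])


# Hypothetical example data: locus names and lengths are made up.
controls = {
    "locusA": [10, 11, 10, 12, 10, 11, 10, 11, 12, 10, 11, 10],
    "locusB": [20] * 12,
}
patient = {"locusA": (10, 85), "locusB": (20, 21)}
candidates = flag_expansions(controls, patient)
```

A robust summary such as the median and MAD is used here so that a few unusually long control alleles at a polymorphic locus do not inflate the background. A real analysis would also need to account for genotyping error and differences between sequencing platforms.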