Project Description: Genomewide association studies (GWAS) have identified >4000 genetic loci for a wide range of human traits, yet a large proportion of heritability remains unexplained. In the post-GWAS era, geneticists are exploiting massively parallel sequencing technologies to study less common (minor allele frequency [MAF] 0.5-5%) and rare (MAF <0.5%) variants, hereafter referred to together as rare variants for brevity. Meanwhile, multiethnic GWAS, recognized as potentially more powerful for gene discovery and fine mapping, are receiving increasing attention from the genetics community. Among multiethnic populations, admixed populations such as African Americans and Hispanic Americans are particularly attractive because they comprise more than 20% of the US population. These admixed populations offer a unique opportunity for gene mapping because one can use admixture linkage disequilibrium (LD) to search for genes underlying diseases that differ strikingly in prevalence across populations. However, few methods for admixed populations can accommodate post-GWAS data. Methodological work lags in at least three major areas. First, there are few, if any, genotype imputation methods that are tailored to admixed samples and can accommodate both the ever-increasing public resources and the typical mixture of genotyping and sequencing data among study samples. Imputation will continue to play an essential role because sequencing will remain cost-prohibitive for large GWAS sample collections. Second, there has been no published work on practical issues regarding rare variant imputation in admixed populations. Third, despite the rich recent literature on statistical methods for rare variant association analysis in relatively homogeneous populations, the field needs methods that can efficiently analyze rare variants in admixed samples, particularly with imputed or partially imputed data.
In this application, we propose the following aims to fill the above gaps: (1) develop efficient hidden Markov model and singular value decomposition (SVD) based methods for haplotype-to-haplotype imputation in admixed populations; (2) assess the quality of rare variant imputation in admixed populations and provide practical guidelines for it; (3) develop a robust statistical test for the analysis of rare variants in admixed populations; and (4) develop, distribute, and support freely available software packages for the methods developed in this project.
PUBLIC HEALTH RELEVANCE: In this application, we will fill methodological and practical gaps in the genetic analysis of rare variants in admixed populations such as African Americans and Hispanic Americans, who comprise more than 20% of the US population.