The long-term objective of this project is to develop statistical and computational methods for the analysis of haplotypes in population genetics. With the availability of large numbers of genetic markers in the human genome and advances in genotyping technology, it is becoming feasible in population genetic studies to genotype thousands of markers in a large number of individuals from multiple populations. The analysis of such data poses challenging statistical and computational issues, and both theoretical and empirical studies are needed to develop and evaluate statistical methods that extract the most relevant information for inference on parameters of interest. The specific aims of this project are: (1) Develop statistical and computational methods to infer haplotype frequencies from the observed unphased marker data in multiple populations; (2) Develop general guidelines for marker selection to identify disease susceptibility variants through haplotypes; (3) Use haplotypes consisting of single nucleotide polymorphisms as well as microsatellites from multiple populations for inference on population parameters as well as local recombination rates; (4) Investigate the power of statistical methods to identify chromosomal regions that have been subject to natural selection; (5) Implement and validate the developed methodologies in computer programs that will be distributed to the scientific community; and (6) Collaborate with other investigators to apply the methods and knowledge gained from this project to the analysis of data from other projects. Our methods will exploit two unique features of the data to be collected: the large number of populations around the world and the exhaustive cataloguing of haplotypes in extended chromosomal regions.
The development of these novel statistical methods and user-friendly computer programs will provide useful tools for population genetic studies, and the analysis of data collected from other projects will lead to a better understanding of the relationships among populations and of the different forces that shape linkage disequilibrium patterns in the human genome.