Rapid progress in genotyping has made it possible to identify large numbers of DNA variants, such as single nucleotide polymorphisms (SNPs). SNP-based association analysis is potentially a powerful way to map genes for complex diseases. Emerging evidence that SNPs fall into DNA blocks of limited diversity raises the possibility that association studies can achieve much greater efficiency. However, this approach relies critically on statistical procedures that summarize the complex patterns of genomic variation. Before it can be used, a number of questions must be addressed: How can SNP variation be organized into a reasonable number of units? Within a gene, how many blocks are there likely to be, and how should they be defined? How densely should marker SNPs be spaced in association studies? Once the blocks are defined, can SNPs that tag them be selected in a systematic way? What is the relationship between coding SNPs and common haplotypes within blocks? Do these relationships vary across populations? To address these questions, the primary aims of this research are to: (1) develop a robust method to define haplotype blocks and select haplotype-tagging SNPs; (2) apply this method to real and simulated data and compare it with existing methods; (3) determine the precision and power of haplotype mapping to detect SNPs responsible for variation in complex phenotypes; and (4) compare the statistical power obtained using haplotype block information with that obtained using random SNPs. In a separate aim, we will also develop association methods that control for population stratification and evaluate the potential contribution of admixture mapping as a tool to localize disease-associated gene variants. For each of these aims, we have identified large existing data sets that provide a robust empirical setting in which to examine the statistical questions of this project.
By addressing this set of key methodological questions, we will attempt to build on large genotyping efforts (e.g., the "Programs in Genomic Applications" and the "HapMap") and apply the analyses to population surveys. The results of this project would therefore provide a bridge between high-throughput genotyping and efforts by genetic epidemiologists to localize informative SNPs.