This project will develop new statistical methodology for population genetic data. It focuses on three main areas, each concerned with dependencies among sets of alleles: the characterization of population structure, the characterization of association patterns within and between genetic markers and along haplotypes, and the characterization of relatedness and inbreeding for individuals. Theory will be developed at least in part in response to the needs of current large-scale SNP surveys for humans and in anticipation of whole-genome sequence data sets. The work is proposed by a group of investigators in the Department of Biostatistics at the University of Washington. They propose to continue their collaboration with W.G. Hill at the University of Edinburgh and P.M. Visscher at the Queensland Institute of Medical Research. This extended group has interacted successfully over the previous award period, as evidenced by a set of 40 publications. The Beagle approach of S.R. and B.L. Browning will be applied to the detection of tracts of identity by descent. The resulting measures of relationship will be used to refine tests for marker-disease association and to estimate the heritability of complex human traits. The population-specific measures of population structure described by B.S. Weir and W.G. Hill will be applied to recently published whole-genome SNP data sets and whole-genome sequence data sets. Improved methods of drawing inferences about these quantities will be sought. Measures of identity by descent and of population structure have the potential to identify regions of the human genome that have been subject to natural selection, and these analyses will be conducted with attention to the large variation and skewness imposed by the evolutionary process. The work of C.C. Laurie and B.S. Weir on detecting chromosomal features, such as inversions, by examining correlations of individual SNPs with principal components derived from large sets of SNPs will be extended. The partial regression approach introduced for QTL mapping will be applied to this problem. Measures of linkage disequilibrium that do not depend on genotypic phase were introduced, and have been used previously, by these investigators. They will now be extended to the situation of disequilibrium between pairs of loci when several SNPs are typed for each gene. Association mapping continues to be of considerable interest to human geneticists, and the problem of accounting for (even low-level) relatedness will be addressed. Excluding individuals who have at least one relative in a case-control study, for example, can lead to a loss of power. Previous work of Y. Choi and B.S. Weir that modified simple allelic association tests will be extended to the more appropriate logistic regression methods.

PUBLIC HEALTH RELEVANCE: As population genetic data sets grow, there is both the need and the opportunity to quantify the dependencies among alleles within and between individuals, or within and between populations. Individual-level dependencies address inbreeding and relatedness and can lead to estimates of the heritability of complex human traits. Relatedness estimates can be used to modify tests of association between genetic markers and human diseases. Allelic dependencies at the population level characterize population structure and can be used to infer the presence of natural selection in the history of the populations. Work is proposed to strengthen ways of estimating allelic dependencies, with attention paid to the variation imposed by the evolutionary process as well as the variation from sampling individuals from current populations.