DESCRIPTION (Proposal abstract): One of the main goals of human genetics is to identify the genetic variants that affect susceptibility to complex, non-Mendelian diseases. A common approach is association mapping, in which researchers genotype many markers to find those correlated with the phenotype of interest. These markers may not affect disease susceptibility themselves, but are likely to be in strong linkage disequilibrium (LD) with causative variants. Computer simulation is an essential tool in the planning and analysis of association studies. Simulations help researchers compare competing experimental designs and aid in the interpretation of any associations that are found. Despite their importance, there is a lack of proven simulation methods appropriate for the genome-wide data sets now being produced. For those methods that do exist, no attempt has been made to test whether the simulated data accurately reflect the properties of observed data, or whether their use in power studies biases the resulting power estimates. In this proposal, we focus on developing methods for simulating whole-chromosome genetic data and for analyzing whole-genome association study data. We will test the accuracy of these methods on publicly available data as well as on genotype data collected by our collaborators at the University of Southern California. We will concentrate on how to analyze data from admixed populations such as Latinos, where population stratification makes most existing analytical methods inappropriate. This work will also help us determine the marker density and sample size needed for future association studies.