Recently, genome-wide association studies using single nucleotide polymorphisms (SNPs) have had some success in detecting genetic variants associated with diseases. Copy number variation (CNV) is another widespread characteristic of the human genome that has been shown to be related to various human phenotypes. The ongoing HapMap project, which is constructing a database of validated CNVs, will provide valuable information for studying associations of CNVs with disease risk, the effects of CNVs on response to drug treatment, and the role of structural variation in human evolution. However, limited by the available statistical methods, current practice in studies of associations between human diseases and genetic variants is to call SNP genotypes and CNVs separately and then analyze them separately. Two studies published in Nature Genetics last year (Korn et al., 2008; McCarroll et al., 2008) have suggested that combining SNP allele and copy number information can lead to accurate inference of both copy numbers and genotypes and thus affect the results of association studies. New methods are greatly needed for simultaneous inference of SNPs and CNVs and for testing their joint influence on complex diseases. We therefore propose to develop novel statistical and computational methods and software for whole-genome association studies using integrated CNV and SNP information. The specific aims of this project are (1) to develop calling algorithms for allele-specific copy numbers that integrate copy number and SNP allele information, (2) to develop single-locus and multi-locus methods for joint genotype and copy number association testing, (3) to develop haplotype association methods incorporating copy number information, and (4) to release a user-friendly software package in R.
The proposed methods will be evaluated through simulations as well as with real data, which will include (but will not be limited to) the publicly available HapMap data and human data sets from our collaborators studying genetic effects on left ventricular hypertrophy, triglycerides, and blood pressure. The proposed methods will greatly facilitate the study of human genetic variation and its association with complex diseases. PUBLIC HEALTH RELEVANCE: The proposed methods will aid in the discovery of genetic variants responsible for complex human diseases, will help us to better understand these diseases, and ultimately will enhance our ability to prevent, diagnose, and treat them.