A major project of this section is the development of new statistical genetics methodology, as prompted by the needs of our applied studies, and the testing and comparison of novel and existing statistical methods.

The project to develop propensity scores as a method for including covariate effects in linkage analyses has continued in conjunction with Dr. Betty Doan. This method appears promising: it is generally more powerful than including the covariates directly in the model and does not have strongly inflated Type I error rates. We are currently examining factors that affect the performance of this method and are applying it to Dr. Bailey-Wilson's lung cancer data.

We are currently working to establish a p-value threshold for genome-wide association studies (GWAS) using the number of independent SNPs and blocks within the HapMap database, as well as in the Affymetrix and Illumina GWAS panels. Because SNPs on dense panels are correlated through linkage disequilibrium, the number of independent tests is smaller than the number of SNPs genotyped, and corrections such as Bonferroni that count every SNP as a separate test are not accurate. Instead, we propose to identify the true number of independent SNPs across the genome. A manuscript is under review, and work is ongoing to refine and extend this method.

We also developed a Perl program to count and visualize extended tracts of homozygosity in dense SNP data. Excess homozygosity could reflect inbreeding or possible regions of deletion, and visualizing these regions by case status will allow us to determine whether they harbor potential deletion sites. This program is currently being tested by members of our Branch.

We developed a new approach to error detection and correction in microarray gene expression studies of families. A paper was published this year describing this method and its effect on the power of linkage studies that use the resulting phenotypic data.
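
To illustrate the idea of counting extended tracts of homozygosity in dense SNP data, the following is a minimal Python sketch and not the Perl program developed in our Branch. The function name `find_homozygous_tracts`, the two-letter genotype encoding, the `min_snps` cutoff, and the choice to let a missing call (`None`) end a tract are all assumptions made for illustration.

```python
def find_homozygous_tracts(genotypes, min_snps=3):
    """Return (start, end) index pairs (inclusive) for runs of consecutive
    homozygous genotype calls that span at least min_snps markers.

    genotypes: calls in map order, e.g. "AA", "AG", or None if missing.
    Assumption: heterozygous and missing calls both end the current run.
    """
    tracts = []
    run_start = None
    for i, call in enumerate(genotypes):
        is_hom = call is not None and len(call) == 2 and call[0] == call[1]
        if is_hom:
            if run_start is None:
                run_start = i  # a new run of homozygous calls begins here
        else:
            # The run (if any) ended at the previous marker.
            if run_start is not None and i - run_start >= min_snps:
                tracts.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last marker.
    if run_start is not None and len(genotypes) - run_start >= min_snps:
        tracts.append((run_start, len(genotypes) - 1))
    return tracts


# Hypothetical genotype calls for one individual along one chromosome.
calls = ["AA", "GG", "CC", "AG", "TT", "TT", None, "CC", "CC", "CC", "GG"]
tracts = find_homozygous_tracts(calls, min_snps=3)
print(len(tracts), tracts)
```

Running a function like this separately on cases and controls would give per-group tract counts and positions that could then be plotted by case status. A production tool would more likely define tracts by physical length and tolerate occasional genotyping errors.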
We also examined the effects of intermarker linkage disequilibrium (LD) on linkage Type I error and power in varying types of family structures. We found that even small amounts of LD can inflate Type I error, that multigenerational pedigrees are less affected than nuclear families, and that missing parental genotypes exacerbate this effect. A paper presenting these results was published this year, and we are developing optimal methods for removing intermarker LD while maximizing power and controlling Type I error.

We are exploring the utility of various machine learning methods in genome-wide association studies, particularly with respect to power and detection of gene-gene and gene-environment interactions.

Many of these projects are ongoing.
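
As a small worked illustration of the genome-wide p-value threshold project, the sketch below shows how a per-test significance threshold changes when the correction uses an estimated number of independent SNPs instead of the number of SNPs genotyped. It is not the method in our manuscript. The function names and the SNP counts are hypothetical, and the sketch assumes the number of independent tests has already been estimated by some other procedure.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha
    under a Bonferroni correction for n_tests tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def sidak_threshold(alpha, n_tests):
    """Per-test threshold under a Sidak correction, which assumes the
    n_tests tests are independent: 1 - (1 - alpha) ** (1 / n_tests).
    Written with log1p/expm1 for numerical accuracy when n_tests is large."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return -math.expm1(math.log1p(-alpha) / n_tests)


# Hypothetical counts, chosen only to show the arithmetic.
n_genotyped = 1_000_000    # SNPs on the panel
n_independent = 500_000    # assumed estimate of independent SNPs

print("Bonferroni, all SNPs:        ", bonferroni_threshold(0.05, n_genotyped))
print("Bonferroni, independent SNPs:", bonferroni_threshold(0.05, n_independent))
print("Sidak, independent SNPs:     ", sidak_threshold(0.05, n_independent))
```

Because the estimated number of independent SNPs is smaller than the number genotyped, the resulting threshold is less stringent than a naive Bonferroni correction over every SNP on the panel.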