The main themes of this research are haplotype, multilocus, and general genetic association methods, together with statistical issues that arise in large-scale data analysis, such as genome-wide association scans (GWAS).

Some of our research focuses on developing methods to combine genetic association signals across different samples of the same disease, or across multiple, etiologically similar diseases. These methods will help to identify genetic loci involved in several diseases with shared pathogenesis. For example, a single genetic variant can be involved in susceptibility to several autoimmune diseases. Association signals can be correlated. One design that produces correlated signals is the shared-controls design for GWAS, where reusing the same control group to test for association with different diseases may create strong correlation between the association signals. The methods we have developed are general (Zaykin and Kozbur, 2010), and they are being applied to diverse problems in collaboration with NIH and extramural scientists (Costigan et al., 2010; Reimann et al., 2010; ongoing work with Dr. Raja Jothi).

Ongoing research includes development of statistical approaches to address multiplicity issues in whole-genome scans. This includes investigation of novel approaches aimed at improving the ranks of true positives in whole-genome scans (in collaboration with Dr. Jack Taylor). We have been developing methods for evaluating the chance that a true association will rank among the best results in a genome scan. A standard calculation in the design of a GWAS is the determination of the sample size needed to achieve adequate power at the genome-wide level of significance. We are taking an alternative approach: calculating the probability that a true positive will rank among a specified number of best results when the results are sorted by an association statistic.
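As an illustration of what a ranking probability is, the quantity can be estimated by simulation in a deliberately simplified setting. The sketch below is not the method developed in this research: it assumes that all markers are independent (no linkage disequilibrium), that test statistics are normal with unit variance, and that there is a single true association whose statistic has a hypothetical mean `noncentrality`. All function and parameter names are illustrative assumptions.

```python
import random


def simulate_ranks(noncentrality, n_null, n_sim=1000, seed=1):
    """Simulate the rank of one true association among n_null null markers.

    Assumptions (illustrative only): markers are independent, null
    statistics are N(0, 1), the true statistic is N(noncentrality, 1),
    and results are sorted by the absolute value of the statistic.
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sim):
        true_stat = abs(rng.gauss(noncentrality, 1.0))
        # Count null markers whose statistic beats the true one.
        n_better = sum(
            abs(rng.gauss(0.0, 1.0)) > true_stat for _ in range(n_null)
        )
        ranks.append(n_better + 1)  # rank 1 means the top result
    return ranks


def prob_in_top_k(ranks, k):
    """Estimated probability that the true association ranks in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def smallest_k(ranks, target):
    """Smallest k whose estimated top-k probability reaches the target."""
    for k in sorted(set(ranks)):
        if prob_in_top_k(ranks, k) >= target:
            return k
    return max(ranks)


if __name__ == "__main__":
    ranks = simulate_ranks(noncentrality=3.0, n_null=1000, n_sim=500)
    print("P(true association in top 10):", prob_in_top_k(ranks, 10))
    print("Results to follow up for 90% capture:", smallest_k(ranks, 0.90))
```

In this toy setting no significance threshold is specified; the only inputs are the assumed effect size, the number of markers, and the desired capture probability. Realistic scans have correlated markers, which is exactly what makes the calculation hard in practice.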
The rank-based approach allows one to find the number of most significant results to follow up on, as determined by the desired probability of capturing a true association. This is appealing because it provides guidance on the number of SNPs needed in a replication study. Unlike the power-based approach, it does not require specification of a particular significance level. The difficulty is that evaluating ranking probabilities can be very hard: it requires non-standard numerical methods and simulations that model realistic patterns of linkage disequilibrium. Linkage disequilibrium may be specific to a particular scan, so one would have to perform a customized analysis that requires access to the individual genotype data for that scan. At GWAS densities, such an analysis can take many weeks to run. We have therefore been concerned with developing practical methods for evaluating ranking probabilities. The method we have been developing is completely general, in that the same simple approach applies regardless of the extent and structure of linkage disequilibrium.

Other statistical genetics research has included investigation of methods for estimating relative risk from family data and from imprecisely scored genotypes (in collaboration with Drs. Weinberg, Shi, Umbach, London and Hancock).