The ultimate goal is to develop and apply computational and statistical techniques for large-scale analysis of recently emerged genomic data in order to extract optimal and meaningful biological information. The specific aims are threefold. (1) To thoroughly understand the technological and biological aspects of gene mapping and microarray-based experiments and to identify the statistical problems involved in functional genomic studies. This learning phase will make the PI, a mathematical statistician, familiar with the biology of human genetics. (2) To develop methods for efficient association analysis between single nucleotide polymorphisms (SNPs) and specific heritable diseases. For many complex traits, the number of families or affected individuals in a study is smaller (or not much larger) than the number of SNP markers used in a genomic screen. This precludes meaningful multivariate analyses on a genome-wide basis. To select appropriate subsets of markers for further study and to combine information over multiple markers in multivariate analyses, novel statistical bootstrap (resampling-based) methods will be developed. The resulting subset of SNP markers will be further evaluated by logistic or other multiple regression models for risk assessment. (3) To develop statistical analysis methods for gene expression data obtained through microarray-based technologies. Issues such as reproducibility across multiple experiments and signal function frequently confound the analysis of microarray data. A multi-step procedure based on raw data from oligonucleotide expression arrays is proposed, and computer programs will be developed. The approaches developed in the training period will ultimately allow improved analysis of both genomic and gene expression array data.
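
As a rough illustration of the resampling idea in aim (2), the sketch below ranks SNP markers by how often they appear among the top-scoring markers across bootstrap resamples of the study subjects. The abstract does not specify the proposed procedure, so everything here is an assumption made for illustration: the per-marker score (a simple case/control difference in mean allele count), the function names, the `top_k` and `n_boot` parameters, and the simulated data, which mimic a study with fewer individuals than markers.

```python
import random


def association_score(genotypes, phenotypes, snp):
    """Hypothetical per-marker score: absolute difference in mean
    allele count (0, 1, or 2) between cases and controls."""
    cases = [g[snp] for g, y in zip(genotypes, phenotypes) if y == 1]
    controls = [g[snp] for g, y in zip(genotypes, phenotypes) if y == 0]
    if not cases or not controls:
        return 0.0  # a resample with only one group carries no signal
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def bootstrap_selection_frequency(genotypes, phenotypes, top_k,
                                  n_boot=100, seed=0):
    """Fraction of bootstrap resamples in which each SNP ranks in the top_k."""
    rng = random.Random(seed)
    n = len(genotypes)
    n_snps = len(genotypes[0])
    counts = [0] * n_snps
    for _ in range(n_boot):
        # Resample individuals with replacement, keeping genotype and
        # phenotype paired.
        idx = [rng.randrange(n) for _ in range(n)]
        g = [genotypes[i] for i in idx]
        y = [phenotypes[i] for i in idx]
        scores = [association_score(g, y, j) for j in range(n_snps)]
        ranked = sorted(range(n_snps), key=lambda j: scores[j], reverse=True)
        for j in ranked[:top_k]:
            counts[j] += 1
    return [c / n_boot for c in counts]


def simulate(n_individuals=60, n_snps=100, causal=(3, 7), seed=1):
    """Simulated data (purely illustrative): fewer individuals than SNPs,
    with a higher allele frequency in cases at the 'causal' markers."""
    rng = random.Random(seed)
    genotypes, phenotypes = [], []
    for _ in range(n_individuals):
        y = rng.randint(0, 1)
        row = []
        for j in range(n_snps):
            p = 0.8 if (j in causal and y == 1) else 0.3
            # Allele count is the sum of two independent allele draws.
            row.append(sum(rng.random() < p for _ in range(2)))
        genotypes.append(row)
        phenotypes.append(y)
    return genotypes, phenotypes


if __name__ == "__main__":
    genotypes, phenotypes = simulate()
    freq = bootstrap_selection_frequency(genotypes, phenotypes, top_k=5)
    # Keep markers selected in at least half of the resamples.
    stable = [j for j, f in enumerate(freq) if f >= 0.5]
    print("Stably selected SNP indices:", stable)
```

Markers with a high selection frequency would form the reduced subset passed on to a logistic or other multiple regression model. A single-marker score is used here only for brevity; a method that combines information over multiple markers, as the aim describes, would replace `association_score`.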