Single nucleotide polymorphisms (SNPs) are powerful genetic markers for detecting susceptibility genes through association analysis. SNPs occur frequently, on average once every 1000 base pairs of genomic sequence. This makes them especially useful, because several will be present within or very near any given gene and association analysis is most powerful over short distances. The large amount of sequence data available in public databases will greatly facilitate efficient SNP detection. We will use several innovative methods for SNP discovery and for applying SNPs to the study of complex disease. First, we will develop and apply the Ensemble gene annotation system to track gene annotation and SNP information. We will use sequences from genes or ESTs to recover the corresponding genomic sequence from public databases. These sequences will be annotated to help identify regions to search for SNPs. Some SNPs will be identified directly by querying NCBI's dbSNP database. Other potential SNPs will be identified by aligning redundant EST sequences already in the public domain. Suspected SNPs will be confirmed by denaturing high-performance liquid chromatography (HPLC) and sequence analysis. For candidate genes represented by fewer than 10 sequences, we will use the available sequence information to design primers for amplifying multiple PCR products from pools of individuals. These pools will be constructed so that SNPs with allele frequencies as low as 1-3% can be identified. Denaturing HPLC will be used to detect possible SNPs, which will then be confirmed by sequence analysis. Over five years, these methods will enable us to discover an average of 10 SNPs within or near each of 500 candidate genes.
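
The step of identifying potential SNPs from redundant EST sequences can be illustrated with a minimal sketch. It assumes the reads have already been aligned to a common coordinate system by some external tool, and it flags any column in which at least two different bases are each seen more than once. The function name `candidate_snps`, the `min_minor_count` threshold, and the toy reads are hypothetical choices made for illustration, not part of the project's actual pipeline; a real screen would also need to weigh base quality and paralogous sequences.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Flag alignment columns where two or more bases are each well supported.

    aligned_reads: equal-length strings over A/C/G/T, with '-' for gaps and
        'N' for ambiguous calls (both are ignored when counting).
    min_minor_count: hypothetical threshold; a base must appear at least this
        many times in a column to count as an allele, which filters out
        isolated sequencing errors.
    Returns a list of (position, {base: count}) tuples, 0-based positions.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    hits = []
    for pos in range(length):
        # Count the bases observed in this column across all reads.
        counts = Counter(read[pos].upper() for read in aligned_reads)
        supported = {
            base: n
            for base, n in counts.items()
            if base in "ACGT" and n >= min_minor_count
        }
        # Two or more supported bases suggest a polymorphic site.
        if len(supported) >= 2:
            hits.append((pos, supported))
    return hits


# Toy, made-up reads for illustration only.
reads = [
    "ACGTTGCA",
    "ACGTTGCA",
    "ACATTGCA",
    "ACATTGCA",
    "ACGTTGCT",  # lone mismatch at the last position: treated as a likely error
]
print(candidate_snps(reads))
```

In this toy input, position 2 is reported because both G and A are seen at least twice, while the single T at the final position falls below the threshold and is ignored. Any site flagged this way would remain only a suspected SNP until confirmed experimentally, as described above.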