There is increasing agreement that association studies using a dense set of single nucleotide polymorphism (SNP) markers, either evenly distributed across the genome or associated with candidate genes, would provide the power necessary to detect small genetic effects in complex disease traits. The public and private sectors are currently engaged in a concerted effort to identify millions of SNPs in the human genome. Through this effort, at least 500,000 candidate SNPs will be deposited in the NCBI public database, dbSNP. However, the usefulness of these candidate SNPs will be severely limited unless they are further characterized, because sequence-tagged sites (STSs) have not been developed for most of them and most are uninformative in any given population. It would therefore be extremely costly if every group performing association studies had to characterize all the candidate SNPs before finding the ones suitable for its particular study. In this proposal, we plan to build on our expertise in STS development and allele frequency estimation to characterize 100,000 candidate SNPs for the human genetics community. The end product of this effort will be 100,000 mapped STSs containing SNPs characterized for their allele frequencies in 3 major world populations. The specific aims of this project are to identify 100,000 candidate SNPs in dbSNP that have not been characterized and to estimate the allele frequencies of these SNPs in 3 populations using a comparative pooled DNA sequencing approach. We will deposit updated SNP information in dbSNP on a weekly basis. Even at our initial pace of characterization, 500 SNP markers in the public database will be updated each week. As soon as our project gets under way, the community will have all the information necessary for choosing the SNPs most suitable for their particular studies.