The primary aim of the proposed project is to perform a large-scale survey to identify coding-region single-nucleotide polymorphisms (cSNPs) in 5,000 human genes. The coding region of each gene will be amplified by RT-PCR from 40 individuals, so the survey will cover approximately 8.5 Mb of sequence on each of 80 chromosomes (about 680 Mb in total). The resulting products will be screened for polymorphisms by two independent methods: (a) high-density oligonucleotide arrays (DNA chips) and (b) denaturing high-pressure liquid chromatography (dHPLC). All genes will be screened by both methods. Using two screening methods provides the sensitivity needed to identify the vast majority of cSNPs. It also yields a continual cross-check on accuracy, allowing the weaknesses of each method to be identified and improvements to be made. Based on polymorphism frequencies observed in our preliminary studies, the survey is expected to identify at least 20,000 cSNPs, approximately 45 percent of which will encode an alteration in amino acid sequence. The cSNPs should provide a valuable resource for studying disease association. To facilitate this, all information (including sequence change, effect on protein sequence, and observed allele frequencies) will be promptly posted on our web site and deposited in national databases, and all cSNPs will be made freely available for research.
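
The scale figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the proposal itself: the variable names are illustrative, the step from 40 individuals to 80 chromosomes assumes two chromosome copies per diploid individual, and the mean coding length per gene is a derived figure that the abstract does not state.

```python
# Figures quoted in the abstract.
genes = 5_000
individuals = 40
sequence_per_chromosome_mb = 8.5
min_csnps = 20_000
amino_acid_changing_fraction = 0.45

# Assumption: each individual is diploid and contributes two chromosome copies.
chromosomes = individuals * 2

# Total sequence screened across all chromosome copies, in Mb.
total_surveyed_mb = sequence_per_chromosome_mb * chromosomes

# Derived (not stated in the abstract): mean coding sequence per gene, in bp.
mean_coding_bp_per_gene = sequence_per_chromosome_mb * 1_000_000 / genes

# Lower bound on cSNPs expected to alter the amino acid sequence.
expected_amino_acid_changing = round(min_csnps * amino_acid_changing_fraction)

print(chromosomes, total_surveyed_mb)
print(mean_coding_bp_per_gene, expected_amino_acid_changing)
```

Under these assumptions the totals agree with the abstract's 80 chromosomes and roughly 680 Mb, and the 45 percent figure corresponds to about 9,000 of the 20,000 expected cSNPs.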