This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.
Genome projects worldwide are rapidly pouring a wealth of DNA sequence data into databases at the National Institutes of Health (NIH) and many other repositories. Within this vast quantity of data lie the still largely uncharacterized blueprints that the individual cells of an organism use to build the array of proteins that serve as the molecular machines executing the wide variety of biological processes necessary to sustain life. These ever-growing genomic databases serve as a fundamental resource for accelerating research that uses mass spectrometry to identify proteins.
Mass spectrometry (MS) techniques produce two types of information from a single sample in a matter of minutes. The first is a set of peptide mass values. A so-called peptide-mass fingerprint is obtained by using an enzyme to digest a target protein into a mixture of smaller pieces called peptides. The molecular mass of each peptide in the mixture is measured with a mass spectrometer, and the resulting set of masses constitutes the fingerprint. The second is peptide sequence. In a tandem MS experiment, individual peptide components in either an unseparated or a separated mixture can be selectively dissociated to yield a spectrum of product ions (a collision-induced dissociation, or CID, spectrum). Measuring the mass values of these product ions gives the mass differences between adjacent product ions, which can be assigned to specific amino acid residues and thus to peptide sequence.
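The peptide-mass fingerprint described above can be illustrated with a minimal sketch. It assumes trypsin as the digesting enzyme (the text says only "an enzyme"), uses the common simplified rule of cleaving after K or R unless the next residue is P, and ignores missed cleavages and modifications. The function names are hypothetical and are not taken from the project's software.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water is added per free peptide (H on N-term, OH on C-term)


def tryptic_digest(protein):
    """Split a protein after K or R, except when the next residue is P.

    Assumption: trypsin-like specificity with no missed cleavages.
    """
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint(protein):
    """Sorted list of peptide masses, i.e. a theoretical mass fingerprint."""
    return sorted(round(peptide_mass(p), 4) for p in tryptic_digest(protein))


if __name__ == "__main__":
    for pep in tryptic_digest("GASKVRAGKPRSTK"):
        print(pep, round(peptide_mass(pep), 4))
```

In a database search, a theoretical fingerprint of this kind would be computed for each candidate protein sequence and compared with the measured masses within an instrument-dependent tolerance.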
Because of the complexity of the data produced by these types of experiments and the tremendous sample throughput made possible by automating MS instruments, we are continuing to develop software for automatic peptide sequence assignment from these CID spectra. When a particular protein sequence is not in these databases, de novo sequence can be deduced by manual interpretation of the CID spectra and then used to initiate gene-cloning efforts. Alternatively, strategies to detect remote homologies to existing database entries can be employed; these are being developed as well. (Additional effort and instrument time are reported under Collaborative projects and other Technical Research and Development projects.)
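The idea of reading sequence from mass differences between adjacent product ions can be sketched as follows. This is a toy illustration, not the software under development: it assumes a clean, complete ladder of singly charged ions from a single ion series, and the function name and the 0.01 Da tolerance are hypothetical choices. Real CID spectra contain several ion series, gaps, and noise, which is why manual interpretation or more elaborate algorithms are needed.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(ion_masses, tol=0.01):
    """Assign residues to the gaps between adjacent ions of one series.

    Returns one entry per gap: a residue letter, several letters joined
    by "/" when the masses cannot be told apart (I and L are isobaric),
    or "?" when no residue mass fits within the tolerance.
    """
    ions = sorted(ion_masses)
    calls = []
    for lower, upper in zip(ions, ions[1:]):
        diff = upper - lower
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - diff) <= tol]
        calls.append("/".join(matches) if matches else "?")
    return calls


if __name__ == "__main__":
    # Hypothetical ladder: three consecutive ions whose spacings
    # correspond to alanine and then serine.
    print(read_ladder([58.02874, 129.06585, 216.09788]))
```

The first residue of the ladder is not recovered here, because only differences between adjacent ions are used; ambiguous gaps are reported rather than guessed.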