This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. Robust genome sequencing technology has resulted in over 180 completed genomes, with sequencing projects for an additional 700+ organisms in progress. The difficult and important problem of experimentally determining the proteins encoded by these genomes lags far behind. We propose to complement existing messenger RNA-based approaches with high-throughput mass spectrometry of the entire protein complement of a complex animal, the nematode Caenorhabditis elegans. Our approach combines open reading frame (ORF) analysis of the fully sequenced C. elegans genome with high-throughput mass spectrometry, using multidimensional protein identification technology (MudPIT). Our long-term goal is development of these methods to the point that at least 80% of all proteins in a newly sequenced organism can be identified in a few months of concerted effort by a small group of investigators. This goal requires development of the following tools: 1) efficient evolutionary analysis of genomic ORFs to identify a computationally manageable set of candidate peptides for mass spectrum matching; 2) a robust method for biochemical fractionation of intact proteins from whole organisms or tissues; and 3) analytical approaches to assessing the significance of MudPIT matches to specific candidate peptides. Peptide cleavage, fractionation, and 2-dimensional mass spectrometry methods are established in our labs and are currently sufficient to achieve our goal with the addition of these tools.
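As an illustration of the first tool, the step from genomic ORFs to candidate peptides could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the method proposed here: it scans only the forward strand, ignores introns and the evolutionary filtering described above, and assumes a trypsin-like cleavage rule (cut after K or R unless the next residue is P), which the abstract does not specify. All function names and the toy sequence are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons in reading order; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna, min_aa=50):
    """Return proteins from ATG-to-stop ORFs in the three forward frames.

    Assumption: forward strand only, no splicing.
    """
    orfs = []
    for frame in range(3):
        segments = translate(dna[frame:]).split("*")
        # The last segment is not terminated by a stop codon, so skip it.
        for segment in segments[:-1]:
            start = segment.find("M")
            if start != -1 and len(segment) - start >= min_aa:
                orfs.append(segment[start:])
    return orfs


def tryptic_digest(protein, min_len=6):
    """Cleave after K or R unless followed by P (assumed trypsin-like rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (aa in "KR" and protein[i + 1] != "P"):
            peptide = protein[start:i + 1]
            if len(peptide) >= min_len:
                peptides.append(peptide)
            start = i + 1
    return peptides


def candidate_peptides(dna, min_aa=50, min_len=6):
    """Collect the distinct peptides predicted from all ORFs in a sequence."""
    return {
        peptide
        for protein in find_orfs(dna, min_aa)
        for peptide in tryptic_digest(protein, min_len)
    }


if __name__ == "__main__":
    toy = "CCATGGCTAAACGTCCGTAAGG"  # hypothetical toy sequence
    print(find_orfs(toy, min_aa=3))
    print(candidate_peptides(toy, min_aa=3, min_len=3))
```

The length thresholds (`min_aa`, `min_len`) are one simple way to keep the candidate set computationally manageable; a real pipeline for C. elegans would also need gene models that account for introns and both strands.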