We conduct research on the mass spectrometric characterization of proteins, with collaboration with groups in NICHD as our first priority, and we also conduct independent investigations in this area. A major aspect of this work is the identification of proteins isolated in the biochemical investigations of other PIs. To identify unknown proteins, the MS data are used to query genomic databases with the general question, "Do any of the protein sequences present in the database have expected proteolytic cleavage products with theoretical masses that match the empirically determined masses of the peptides generated from the unknown?" Three mass spectrometric approaches are available for this effort: Matrix Assisted Laser Desorption Ionization (MALDI) with Time-of-Flight (TOF) mass analysis; liquid chromatography (LC) followed by electrospray ionization with mass analysis in an instrument capable of using fragmentation reactions to generate peptide sequences, i.e. LC-MS/MS; and MALDI followed by tandem TOF analysis for the determination of peptide sequences from fragment ion spectra. With this combination of instrumentation, we are confident that, given enough material in a gel band to allow as much as 100 fmol to be available for analysis, a positive identification can be made for a protein that is described in a database.

There are several areas of development that we are pursuing in order to improve protein characterization capabilities. First, we have developed a novel approach to providing sequence information for proteins that are not described in databases because of database errors, incompleteness, splice variants, or SNPs; such incompleteness is associated most frequently with organisms having unknown or partially characterized genomes, e.g. X. laevis. We have termed this approach "De Novo Peptide Sequencing through Exhaustive Enumeration of Peptide Compositions" (EEPC).
This approach differs from the other widely used methods, in which a so-called "sequence tag" for a peptide is found within peptide fragmentation spectra, typically employing mass accuracies on the order of 0.5-1 Da. Our approach requires measurement of fragment ion masses to 0.05 Da or better and uses only ions arising from the decomposition of energetic ions, without the use of collision-induced dissociation. To carry out this work, we have calculated a database that consists of an exhaustive listing of all amino acid compositions up to a maximum mass of 2 kDa. This database, a Length-Indexed Peptide Composition lookUp Table (LIPCUT), is indexed by both peptide length and mass to facilitate access during execution of the extension or matching de novo algorithms. We have recently improved the sequencing algorithms by adding the capability of predicting internal fragment ions of the highest scoring peptides found by the de novo algorithms. With this addition we have been able to eliminate potential ambiguities in a sequence that arise from incomplete fragmentation in the observed data.

In an area related to protein identification and sequencing, we have made major strides in characterizing the C-terminal post-translational modifications of tubulins. Specific progress has centered on improvements in the cyanogen bromide digestion protocol and on the development of software for the assignment of glycylation, glutamylation and de-tyrosination mass spectral peaks within the families of both alpha- and beta-tubulins in samples containing multiple isotypes. These improvements have been conclusively demonstrated with the assignment of more than 60 peaks in spectra of rat brain tubulins, the most complex of all tissue tubulin types with the exception of that found in testes.
We have recently made a substantial technical improvement in our methodology and are now able to employ in-gel digestion methods rather than relying on solution digestions. This has been accomplished through the use of agarose rather than acrylamide gels. Recent analytical results have shown that bovine and rat brain tubulins appear to have indistinguishable compositions and, perhaps more interestingly, that the tubulins associated with clathrin-coated vesicles have the same tubulin composition as a homogenate of whole brain.

Another study has investigated the effect of a particular protein post-translational modification. Asparagine deamidation is an important post-translational modification that increases as a protein ages. While deamidation can occur at all asparaginyl residues, the reaction rates within a protein can vary greatly depending on primary sequence and conformation. Because deamidation can change the biological activity of a protein, it is important to determine both the extent of deamidation and the specific residues where deamidation has occurred. We have used a combination of MALDI TOF and MALDI tandem TOF mass spectrometry for the quantitative determination of deamidation and the mapping of specific deamidation sites. This method has been applied to mapping deamidation sites in the recombinant proteins Protective Antigen and Lethal Factor from Bacillus anthracis. Since Protective Antigen is the basis for all current anthrax vaccines, a fundamental understanding of the extent of deamidation with aging of this protein is an important public health issue.

The project for characterizing the protein mass fingerprints of amniotic fluid from patients who have undergone premature labor has progressed to the stage of clearly differentiating patients who delivered at term from those who delivered prematurely, under conditions where both groups experienced premature labor.
This is the first time that this differentiation has been made, and it represents an important milestone in this project. The methodology that has been developed employs comparisons of MALDI mass spectra in the range of 2-10 kDa obtained from diluted amniotic fluid samples that have been desalted and then applied directly to the MALDI sample stage. Our experimental design characterizes the variance of spectra arising from a variety of methodological factors. We have developed a mathematical/statistical approach in MATLAB to automate both ANOVA and Principal Component Analysis and reliably differentiate between classes of samples. The same mass spectrometric and mathematical approaches are being used to characterize cells isolated by an improved form of laser capture microdissection, as well as in analyses of data collected in studies of the origins and contaminants of foods.
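
As an illustration of the ANOVA step mentioned above, the following is a minimal sketch in Python rather than the MATLAB code used in this project, which is not described here. It assumes each spectrum has already been reduced to a list of intensities on a common set of m/z bins, computes a one-way ANOVA F statistic for each bin, and ranks the bins by how strongly they separate the sample classes. The function names, class labels, and intensity values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values.

    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


def rank_bins_by_f(spectra_by_class):
    """Rank m/z bins by F statistic, largest first.

    spectra_by_class maps a class label to a list of spectra, where each
    spectrum is a list of intensities on the same bins.
    Returns a list of (F, bin_index) tuples.
    """
    classes = list(spectra_by_class.values())
    n_bins = len(classes[0][0])
    scores = []
    for b in range(n_bins):
        groups = [[spectrum[b] for spectrum in spectra] for spectra in classes]
        scores.append((one_way_anova_f(groups), b))
    return sorted(scores, reverse=True)


# Hypothetical toy data: three spectra per class, three m/z bins each.
toy_spectra = {
    "term": [[1.0, 5.0, 2.0], [1.2, 5.1, 2.1], [0.9, 4.9, 1.9]],
    "preterm": [[1.1, 8.0, 2.0], [1.0, 8.2, 2.2], [1.1, 7.9, 1.8]],
}
ranked = rank_bins_by_f(toy_spectra)
best_f, best_bin = ranked[0]
print(f"most discriminating bin: {best_bin} (F = {best_f:.1f})")
```

In this toy data only the middle bin differs between the two classes, so it receives by far the largest F value. A real analysis would also need peak alignment, normalization, and correction for testing many bins at once, none of which is shown here.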