The standard annotations of yeast genes suggest that most mRNAs give rise to a single protein product defined by a single long open reading frame (ORF). However, recent transcriptome-scale ribosome profiling data, as well as bioinformatics results, suggest that translation can initiate at sites on mRNAs that are not currently annotated as start sites. This potential added complexity of translation initiation raises the exciting possibility that gene regulation at the level of translation initiation plays a greater role than previously realized. The primary goal of this project is to assess the repertoire of protein products that different translation initiation events contribute to each gene, using budding yeast as a model system. The first aim is to characterize the complexity of protein N-termini through an N-terminal peptide selection method. The sequences of tryptic peptides originating from the N-termini of proteins will be analyzed using tandem mass spectrometry. SEQUEST interpretation of spectra will incorporate predicted alternative translation initiation sites as well as predicted post-translational cleavage events. This analysis will assess the prevalence of translation initiation at AUG codons and at non-canonical (non-AUG) codons, both upstream and downstream of the annotated start codon. The second aim is to assess alternative initiation sites of selected annotated ORFs using carboxy-tagged proteins. Carboxy-tagged primary protein products of genes will be partially purified, and their N-termini will be determined by mass spectrometry. Individual genes with potentially misannotated translation initiation sites will be tested. Mutational analysis using epitope-tagged transgenes on centromeric plasmids will be performed to determine whether mutation of annotated or newly implicated translation initiation sites interferes with translation of the tagged primary ORF product.
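As an illustration of how predicted alternative initiation sites could be turned into candidate N-terminal tryptic peptides for a database search, the following is a minimal sketch and not the project's actual pipeline. It rests on several assumptions that are not stated above: candidates are restricted to in-frame AUG codons and codons differing from AUG by one nucleotide, upstream extension stops at the first in-frame stop codon, every initiating codon is decoded as methionine, trypsin cleaves after K or R unless the next residue is P, and N-terminal methionine removal and other processing are ignored. The function and type names are hypothetical.

```python
from typing import List, NamedTuple

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOPS = {"TAA", "TAG", "TGA"}
# Assumption: non-AUG candidates are the nine single-nucleotide variants of ATG.
NEAR_COGNATE = {"CTG", "GTG", "TTG", "ACG", "AAG", "AGG", "ATA", "ATC", "ATT"}


class Candidate(NamedTuple):
    position: int        # 0-based index of the first nucleotide of the codon
    codon: str           # initiating codon (DNA alphabet)
    relation: str        # "upstream", "annotated", or "downstream"
    n_term_peptide: str  # predicted N-terminal tryptic peptide


def n_terminal_tryptic_peptide(seq: str, start: int) -> str:
    """Translate from `start` and return the first tryptic peptide."""
    residues = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        # Assumption: the initiating codon is always decoded as Met.
        residues.append("M" if i == start else aa)
    protein = "".join(residues)
    for idx, aa in enumerate(protein):
        # Simplified trypsin rule: cleave after K/R unless followed by P.
        if aa in "KR" and protein[idx + 1:idx + 2] != "P":
            return protein[:idx + 1]
    return protein


def candidate_starts(mrna: str, annotated_start: int) -> List[Candidate]:
    """List in-frame AUG and near-cognate codons around the annotated start."""
    seq = mrna.upper().replace("U", "T")
    # Walk upstream in frame until an in-frame stop codon or the 5' end.
    first = annotated_start
    while first - 3 >= 0 and seq[first - 3:first] not in STOPS:
        first -= 3
    found = []
    for i in range(first, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOPS:
            break  # end of the reading frame
        if codon == "ATG" or codon in NEAR_COGNATE:
            if i == annotated_start:
                relation = "annotated"
            elif i < annotated_start:
                relation = "upstream"
            else:
                relation = "downstream"
            found.append(
                Candidate(i, codon, relation, n_terminal_tryptic_peptide(seq, i))
            )
    return found
```

Calling `candidate_starts(sequence, annotated_start)` on a transcript sequence returns one record per candidate site, and the `n_term_peptide` values could be appended to a search database alongside the annotated N-terminus. The sketch expects sequences containing only A, C, G, and T or U.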
Improving our understanding of the repertoire of translation initiation events on mRNAs will lead to more accurate algorithms for predicting and annotating the protein products of genes in yeast and multicellular organisms. With large datasets of mRNA sequences expected from future deep sequencing experiments, improved prediction of translation initiation will make it easier to identify sequence variants and mutations that are expected to affect translation initiation and change protein products, including cases where altered translation initiation gives rise to disease phenotypes.

PUBLIC HEALTH RELEVANCE: Recent studies, including transcriptome-scale ribosome profiling, indicate that the regulation of gene expression at the level of translation initiation is likely more important than previously realized. We plan to investigate this in the yeast model system through a combination of mass spectrometry and molecular genetic approaches. Improved understanding of the repertoire of translation initiation events of genes in yeast and higher organisms will provide fuller annotations of proteomes, and will be particularly useful for the functional interpretation of future large datasets of mRNA sequences, especially in analyses of mRNA sequence variants and mutations associated with disease states.