Mycobacterial disease, primarily tuberculosis, kills nearly two million people annually. Ineffective vaccines, together with multidrug-resistant and extensively drug-resistant strains of Mycobacterium tuberculosis, exacerbate this chronic global crisis. The design of new, efficacious drugs requires a fundamental understanding of the biology of the bacterium and its genetic make-up. In turn, the successful application of genomic sequence information requires accurate gene annotation and a comprehensive knowledge of gene architecture and expression profiles. We have empirically determined transcription and translation initiation sites on a genome scale, using RNA-seq and ribosome profiling (ribo-seq). This work has revealed two novel characteristics of mycobacterial gene architecture. First, one-third of transcription start sites give rise to transcripts that lack a 5' UTR (leaderless genes, which lack a Shine-Dalgarno sequence). Second, there are many hundreds of small open reading frames encoding small proteins of fewer than 50 amino acids (sproteins). To date, sproteins have been largely overlooked in bacteria, as they are hard to detect with traditional methodologies or annotation pipelines. Moreover, sproteins represent a completely unexplored class of proteins in mycobacteria and are likely to have a significant impact on cell physiology. Importantly, our preliminary studies indicate that sproteins are present in large numbers in both the fast-growing model organism, Mycobacterium smegmatis, and the slow-growing pathogen, M. tuberculosis. Here, we propose to generate the first comprehensive, experimentally validated small proteome for both slow- and fast-growing mycobacteria. This will be achieved by combining state-of-the-art sproteome mass spectrometry approaches with ribo-seq and transcription start site mapping. Together, these approaches will provide an empirically defined, high-confidence data resource for the mycobacterial community.
In addition, we will define the mechanism of action for a subset of the sproteins that we hypothesize act in cis to regulate downstream genes. This proposal is highly innovative, as it focuses on the discovery of an entirely unexpected class of abundant proteins that has largely escaped scientific scrutiny in bacteria. The application of high-throughput, cutting-edge tools to this analysis will provide a comprehensive overview of the mycobacterial sproteome, as well as mechanistic insight into potential functions of sproteins. Thus, we anticipate both an immediate and a long-term impact on the mycobacterial field, providing new biological insights that will seed multiple emerging fields of study while expanding our knowledge of gene architecture and regulation for all bacteria.