This application describes a novel approach for the discovery of non-annotated short open reading frame-encoded peptides and small proteins (SEPs), a unique class of understudied peptides in the human genome. Application of this approach to a human leukemia cell line revealed the existence of 32 novel human SEPs, the largest number ever reported. Since SEPs are produced from short open reading frames (sORFs) in the genome, this discovery also represents the characterization of 32 new human genes. Analysis of the SEP-producing sORFs revealed a number of interesting features of mammalian genes, such as the existence of polycistronic genes, the use of non-ATG start codons to produce protein, and the finding that some RNAs annotated as non-coding have been misassigned because they actually encode peptides. Likewise, some of the SEPs have features typically found in proteins, such as the ability to localize to specific subcellular compartments and to take part in protein-protein interactions, which indicates that they may serve functional roles in the cell. One of these newly discovered SEPs, for instance, engages in a specific protein-protein interaction with a known regulator of cancer cell proliferation, suggesting a potential function for this SEP in cell growth. The discovery of these SEPs is significant because it indicates that the genome and proteome are larger than previously anticipated and demonstrates the need for additional investigation of these unique human genes. The goals of this application are to discover, characterize, and explore the biology of SEPs, including any role in disease.
PUBLIC HEALTH RELEVANCE: This application details the analysis of a leukemia cell line using a novel approach that led to the discovery of a new group of human genes that encode peptides. This is a significant finding because it indicates that the human genome and proteome are larger than previously appreciated and may contain non-annotated genes that have important functions.
In this application, we endeavor to discover, validate, and functionally characterize these novel human genes, including their roles in disease.