Project Summary

The advent of genomic sequencing revolutionized the life sciences, revealing the human genome to be ~3 billion base pairs long. This effort identified ~20,000 protein-coding genes, which comprise only ~1% of the genome. While ~10% of the genome is considered "silent", the remaining ~89% is transcribed but thought to be untranslated. Earlier annotation methods excluded open reading frames under 300 bp (100 aa) because of the high rate of gene misidentification at that length. Recent studies have identified small open reading frames (sORFs) that encode peptides under 100 aa with distinct biological functions. We hypothesize that numerous peptides of unknown biological function play prominent roles in cardiac physiology and disease. To identify novel sORFs, we combined in vivo and in silico approaches, including the statistical prediction algorithm sORF finder, to compile a database of putative sORFs. Given the prominent role of mitochondrial dysfunction in heart failure (HF), we included mitochondrial-targeting prediction for all sORFs, based on N-terminal protein sequence analysis with the computational programs MitoFates and MitoProt. In addition, we incorporated mRNA deep sequencing of left ventricular samples from mice subjected to transverse aortic constriction (pressure-overload HF) or permanent ligation of the left coronary artery (myocardial infarction), collected at multiple stages of disease progression, to identify sORFs that are differentially expressed in HF. To prioritize our search, we are ranking candidates across these components for unbiased target selection and experimental confirmation. We envision this novel database as having great importance within and beyond the cardiovascular field for identifying novel genes with therapeutic potential.
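To make the 100 aa cutoff concrete, the following is a minimal sketch of a naive forward-strand ORF scan that keeps only ORFs at or below that length. It is not the sORF finder algorithm, which is statistical, and it is not the pipeline used in this project. The function name `find_sorfs`, the `min_aa` lower bound of 10, and the ATG-to-stop definition of an ORF are all assumptions made for illustration.

```python
# Hypothetical illustration only: a naive ATG-to-stop scan, not sORF finder.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, min_aa=10, max_aa=100):
    """Return (start, end, frame) for forward-strand ORFs within the size window.

    start/end are 0-based, end-exclusive nucleotide coordinates; end includes
    the stop codon. Length is counted in codons from ATG up to, but not
    including, the stop codon. min_aa is an assumed lower bound.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = []
    for frame in range(3):
        start = None
        # Walk codon by codon; stop before a trailing partial codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa <= max_aa:
                    hits.append((start, i + 3, frame))
                start = None  # resume searching after the stop codon
    return hits
```

A real annotation effort would also need to scan the reverse strand, consider non-ATG start codons, and score coding potential, which is where a statistical tool becomes necessary.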