We have developed a suite of four transcript prediction algorithms, collectively called "FEAST" (Fast Empirical Algorithms Suggesting Transcripts), which are conceptually independent of the two established classes of gene discovery algorithms, namely "ab initio" and database search methods. The main goals of this proposal are (1) to further develop this independent third class of gene prediction algorithms, (2) to apply them to the identification of novel genes in the genome, and (3) to test the hypothesis that non-coding transcripts are prevalent in the genome and are the medium for the expression of small RNA genes and other functional genomic elements. We will extend the statistical model and develop the software into a fully integrated gene prediction tool capable of discovering genes in the genomic sequence of a single species, or in multiple species simultaneously for higher precision. We will use the new tool to produce a comprehensive catalog of predicted genes. This catalog is the genetic "parts list" required for the construction of metabolic and regulatory models of cell function. We will correlate the transcript predictions with expression data from hybridization array technology, and validate novel genes experimentally by RT-PCR and sequencing. We have identified an unusual class of genes (which we call "stencil" genes) in which the exons play no role other than the production of introns as precursor material for deriving one or more functional RNA molecules, such as miRNAs and snoRNAs. We will place special emphasis on obtaining a comprehensive catalog of such "stencil" genes and will study computationally their prevalence, their modes of regulation, and how they evolve. We expect many of the novel transcripts to be central to the genetic regulation of development, and therefore of direct importance to cancer research.