To date, post-genomic analysis has focused on identification and annotation of protein-coding genes and their products. In comparison, the identification of DNA sequences encoding functional RNA has been largely neglected. Functional RNAs can act in trans, interacting with other RNAs or proteins, as do tRNAs, rRNAs, RNase P RNA, small nucleolar RNAs and microRNAs; they can also function in cis as untranslated regions that regulate post-transcriptional expression of mRNAs. Recent experimental and computational studies suggest that there may be hundreds of unknown functional RNAs in prokaryotes and thousands in eukaryotes.

The primary goal of this proposal is to develop a computational approach to the identification of novel RNA genes and functional RNA elements in complete DNA genomes. Machine learning techniques will be used to recognize hallmarks of functional RNA coding sequences by comparison with sequences that do not encode RNAs. Several types of signal are useful in discriminating functional RNA, including 1) differences in global sequence composition, 2) calculated RNA secondary structure features, i.e., the free energy of folding, and 3) specific sequence elements common to RNA structure. These and other parameters can be rapidly tried and tested using machine learning methods. The method will be optimized for individual genomes with respect to input databases, parameterization, and machine learning method and architecture. Results will be evaluated computationally by cross-validation testing, comparative genomics, and calculated secondary structure and free energy of folding. Experimental studies of RNA expression and function will be conducted in conjunction with collaborators.

Preliminary studies have demonstrated the power of this approach to predict novel functional RNAs in E.
coli, other bacteria and archaea, as supported by computational cross-validation and experimental confirmation. We will test and apply this approach to discover new functional RNAs in eukaryotic genomes including S. cerevisiae, C. elegans, and humans. RNA prediction for these larger and more complex genomes will require the optimization of computational parameters, as well as the development of appropriate input datasets and training algorithms.

The prediction of novel functional RNAs in the human genome presents an opportunity to understand new regulatory and developmental processes. Known RNAs implicated in human disease, such as telomerase RNA (cancer, aging), XIST RNA (X-chromosome inactivation) and BIC (a proto-oncogene), underscore the importance of developing a method to identify the full complement of human RNA genes.
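
As an illustration only, the first class of signal named in the approach (global sequence composition) and the cross-validation step could be sketched as follows. This is a minimal sketch under stated assumptions, not the proposal's actual implementation: the function names `composition_features` and `k_fold_indices`, the choice of mono- and dinucleotide frequencies as the composition measure, and the interleaved fold assignment are all hypothetical. The resulting feature vectors would be passed to whichever machine learning method is selected, together with structural features such as free energy of folding, which are not computed here.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"
# All 16 dinucleotides in a fixed order: AA, AC, AG, AU, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product(ALPHABET, repeat=2)]


def composition_features(seq):
    """Return 20 values: 4 mononucleotide then 16 dinucleotide frequencies.

    Hypothetical feature set chosen for illustration; DNA input is
    read as RNA (T -> U) and ambiguous bases are ignored.
    """
    seq = seq.upper().replace("T", "U")
    mono = Counter(b for b in seq if b in ALPHABET)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Normalise by the number of valid symbols; guard against empty input.
    n_mono = sum(mono.values()) or 1
    n_di = sum(di[d] for d in DINUCLEOTIDES) or 1
    feats = [mono[b] / n_mono for b in ALPHABET]
    feats += [di[d] / n_di for d in DINUCLEOTIDES]
    return feats


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k-fold cross-validation.

    Items are assigned to folds by interleaving (item i goes to fold i % k).
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]
```

In use, each known RNA-coding sequence and each non-coding control sequence would be converted with `composition_features`, and the `(train, test)` splits from `k_fold_indices` would be used to estimate how well a trained classifier separates the two classes on held-out sequences.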