The Human Genome Project and related efforts have generated enormous amounts of raw biological sequence data. However, understanding how biological sequences encode structural and functional information remains a fundamental scientific challenge. In particular, the information encoded in RNA viral genomes extends well beyond their protein-coding role to the role of intra-sequence base pairing in viral packaging, replication, and gene expression. Thus, deciphering the different levels of information encoded in these sequences is essential for a full understanding of structure-function relationships in RNA viruses. Our goal is to understand how secondary structure information, expressed as the selective formation of base pairs, is encoded in large RNA viral genomes. Since current prediction methods cannot reliably and efficiently treat these lengthy sequences, we are developing novel combinatorial and computational approaches to the analysis, prediction, and design of viral RNA secondary structures. The outcomes of our research will be a discrete mathematical model of RNA folding and high-performance combinatorial algorithms for predicting secondary structures of large RNA molecules. The success of our methods for unenveloped icosahedral RNA viruses would extend to other large RNA molecules and have important implications for the prevention and treatment of numerous RNA-related diseases. Our research addresses three specific aims.
(1) We will identify and evaluate characteristics of RNA secondary structures that differentiate base pairings encoding significant structural and functional information from those that are not well-determined. By refining our combinatorial model of RNA folding, we will distinguish configurations whose folding follows natural energy minima from base pairings that encode well-determined, and likely functionally significant, substructures.
(2) We will predict new structures by developing the mathematical framework and computational techniques needed to construct a low-energy RNA secondary structure from minimal free energy substructures. By exploiting parallel and multicore processors, our novel approach will predict important functional motifs in the secondary structures of large RNA molecules with greater accuracy.
(3) We will assess the compatibility of our predicted secondary structures with experimental information on RNA viruses using three-dimensional molecular modeling methods. These complementary approaches will be used iteratively to arrive at a final model and to design experimentally testable hypotheses.