Project Summary: This proposal responds to PAR-09-218 (Innovations in Biomedical Computational Science and Technology) and is designed to develop new computational methods for predicting the conserved RNA secondary structure of multiple homologs. The objective of these methods is to improve automated RNA secondary structure prediction until it matches the performance of manual comparative analysis, which, although highly accurate, requires specialized skill and intensive manual effort. We will then apply these methods to determine conserved structures in HIV, SIV, and related immunodeficiency viruses and biochemically test hypotheses about structures required for viral replication. Computational tools for accurate prediction of RNA secondary structure have widespread applications in biology and medicine, contributing to a better understanding of RNA function, the discovery of new RNA genes, and methods for targeted drug design. The specific aims of the research proposal are to: 1) Automate comparative sequence analysis, 2) Improve the models that define structure alignment by adding flexibility, and 3) Build structural alignments of HIV, SIV, and related immunodeficiency viruses and, via biochemical experiments, test the role of putative structures in HIV replication. To achieve Aim 1, we propose an innovative iterative computational framework that computes probabilistic representations of RNA folding and inter-sequence alignment and then iterates, updating alignments using folding information and vice versa. The novel aspect of this framework is that the complexity of the alignment and folding tasks in each iteration remains manageable, as though these tasks were performed independently, whereas the accuracy is significantly improved, as though folding and alignment were performed jointly. Aim 2 addresses a common shortcoming of computational models for RNA folding and alignment. 
These models define common secondary structure in a rigid manner, which does not accommodate the variation in structural homology seen in natural RNAs, where insertions or deletions of entire domains are common. Our proposal addresses this limitation, within the computational framework of Aim 1, by improving probabilistic alignment models to better accommodate domain insertions and by introducing scoring modifications that allow insertions and deletions of entire domains with a modest penalty rather than forbidding them outright. To accomplish Aim 3, we will deploy our computational algorithms to analyze the RNA genomes of HIV and SIV, formulate hypotheses for the roles of structures in RNA replication based on common features of the predicted structures, and then test these hypotheses via biochemical experiments. PUBLIC HEALTH RELEVANCE: This proposal has direct public health relevance. We are developing computational tools to predict and understand RNA structure. These tools can be applied to understanding the biology of infectious diseases because some viruses, including influenza and HIV, are RNA viruses. Furthermore, the tools can be used to design novel therapeutics, such as antisense oligonucleotides or small interfering RNAs, both of which target RNA, for diseases such as cancer or inherited diseases. In this proposal, we are specifically applying the tools to the study of HIV, which could lead to the discovery of new replication mechanisms that could be targeted by therapeutics.