Applications

RNA Structure Prediction and Analysis

HDV

MPGAfold and StructureLab were applied in a study of the hepatitis delta virus (HDV), a virus associated with the hepatitis B virus (HBV). Infection with HDV together with HBV increases the severity of liver disease and the likelihood of developing liver cancer. HDV produces one protein, the hepatitis delta antigen, which has a short form and a long form. Using MPGAfold, we showed that the Ecuadorian strain (ES) attains two secondary structures that are crucial for function. The HDV RNA is edited when it attains a branched conformation, changing a stop codon into a tryptophan codon. Later, the RNA adopts a linear form, which is necessary for replication, and the edited message leads to the translation of a longer peptide that inhibits viral synthesis. At times the RNA bypasses the branched form and attains the linear replication form directly, avoiding editing and resulting in the shorter peptide required for HDV replication. More recently, MPGAfold indicated that the Peruvian strain (PS) has different folding characteristics than ES: ES attained the editing structure more readily. Our collaborator John Casey verified this experimentally and showed that ES binds to its editing protein, adenosine deaminase, less efficiently than PS. These results showed that HDV strains maintain a delicate balance between the formation of the editing and replication states.

Discovery and Characterization of a New Kind of Translational Enhancer

The 3' UTRs of cellular and viral mRNAs harbor elements that function in gene expression by enhancing translation through unknown mechanisms. To determine the function of these elements we used a simple model, the Turnip crinkle virus (TCV). TCV is translated in a cap-independent fashion and contains a 3' region that, together with the 5' UTR, synergistically enhances translation. In collaboration with Professor Anne Simon of the University of Maryland, we are deciphering the function of this 3' element.
We used MPGAfold and StructureLab to identify a series of hairpins and two pseudoknots that were confirmed genetically. Using this structural information with our 3D molecular modeling software, we predicted a structure that resembled a tRNA, the first internal tRNA-like structure found in nature. We then proposed that translational enhancement by the element might involve ribosome binding. The element was found to bind the 60S ribosomal subunit, the first such interaction with the large subunit to be discovered. It was determined biochemically that this tRNA-like element is a major part of a switch that converts the template from one that is translated to one that is replicated. In collaboration with Yun-Xing Wang, NCI-Frederick, we further investigated the formation of this unique translational enhancer using a newly developed technique that combines Small Angle X-ray Scattering (SAXS) and Residual Dipolar Coupling (RDC) (see below). The results verified the basic model that had been predicted computationally, demonstrated the efficacy of the technique for large RNAs, and further characterized this newly discovered translational enhancer element. This may open the door to the discovery of similar mechanisms in other genes.

eIF4E

We determined the properties associated with specific mRNAs that are translationally enhanced due to the overexpression of oncogenic eIF4E in cancer cells. We showed that structuredness in the 5' UTR was not the sole determinant of up-regulation. Up-regulated mRNAs have, on average, shorter 3' UTRs, higher G+C content, and slightly more RNA secondary structure before the start codon and around the stop codon. Another feature of mRNAs that are highly responsive to increased eIF4E concentration is an apparent reduction in binding sites for microRNAs known to be tumor suppressors. A machine classifier that distinguishes between these cases was also tested.
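Two of the sequence properties named in the eIF4E paragraph, UTR length and G+C content, are simple to compute and could serve as inputs to a classifier of this kind. The following is a minimal sketch under stated assumptions, not the feature pipeline used in the study: the Transcript record, its field names, and the feature names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    """Hypothetical record for one mRNA (field names are assumptions)."""
    sequence: str   # full mRNA sequence, 5' to 3'
    cds_start: int  # 0-based index of the first base of the start codon
    cds_end: int    # 0-based index one past the last base of the stop codon


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C; 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def sequence_features(t: Transcript) -> dict:
    """Compute illustrative per-transcript features from sequence alone."""
    utr5 = t.sequence[:t.cds_start]
    utr3 = t.sequence[t.cds_end:]
    return {
        "utr5_length": len(utr5),
        "utr3_length": len(utr3),
        "gc_content": gc_content(t.sequence),
        "utr5_gc": gc_content(utr5),
    }


# Toy example: 4-nt 5' UTR, 9-nt coding region, 4-nt 3' UTR.
example = Transcript(sequence="GGCCAUGAAAUAAUUUU", cds_start=4, cds_end=13)
print(sequence_features(example))
```

Structure-based properties, such as the amount of secondary structure near the start and stop codons, would need a folding program and are left out of this sketch.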
Characteristics that Determine Abundance of Proteins in a Human Cell Line

Transcription, mRNA decay, translation, and protein degradation all contribute to steady-state protein concentrations in multicellular eukaryotes. In collaboration with Luiz Penalva of the University of Texas, experimental measurements and computational studies were carried out to determine the absolute protein and mRNA abundances in cellular lysates from the human Daoy medulloblastoma cell line, and the properties that contribute to these abundances. Sequence features related to translation and protein degradation explained two-thirds of the variation in protein abundance. mRNA sequence lengths, amino acid properties, upstream open reading frames, and secondary structures in the 5' untranslated region (UTR) showed the strongest individual correlations with protein concentrations. In a combined model, characteristics of the coding region and the 3' UTR explained a larger proportion of protein abundance variation than characteristics of the 5' UTR.

Musashi

Musashi1 (Msi1) is a highly conserved RNA-binding protein with pivotal functions in stem cell maintenance and development of the nervous system. There is evidence linking Msi1 to tumor formation; its expression has been observed in a variety of tumor types, and high levels of expression have been correlated with poor prognosis in glioblastomas and astrocytomas. Our collaborator, Luiz Penalva at the University of Texas, used a high-throughput approach to identify a group of target mRNAs in order to elucidate their participation in stem cell maintenance, cell differentiation, and tumorigenesis. We applied a computational data mining approach to find the regulatory signal and structural motif in the 3' UTRs of these Msi1-targeted genes.
Results from experimental and computational data indicated that the binding ability of Msi1 depends on multiple factors, including closely correlated conserved binding sequences and an associated RNA structural motif detected in the 3' UTRs.

RNA Structure Prediction and Analysis Software:

CyloFold

CyloFold is a new algorithm, accessible via our webserver, that predicts RNA secondary structure with pseudoknots. Pseudoknot prediction is unrestricted, permitting the formation of a multitude of pseudoknots with high degrees of complexity. A unique aspect of the algorithm is a coarse-grained mechanism that checks the steric feasibility of the chosen set of helices representing the structure. Helical combinations that produce steric conflicts are eliminated from consideration in the predicted structure.

Pseudo energy minimization

Simulation algorithms that are based on thermodynamic processes often minimize the free energy of folding of single RNA sequences to predict their secondary structures. The additional use of covariation scores derived from multiple sequence alignments can improve the accuracy of these predictions. With Jason Wang at the New Jersey Institute of Technology, we developed RSpredict, an algorithm that predicts the consensus secondary structure of a set of aligned sequences by combining the principles of dynamic programming with covariation scores.

Combining NMR and SAXS

The determination of large 3D RNA structures by NMR, X-ray crystallography, or other experimental techniques has been a very difficult problem. Our group, together with Yun-Xing Wang's group in CCR, has developed a methodology that combines techniques from NMR and SAXS to determine the global architecture of large RNAs consisting mostly of A-form helices. The structure can be determined from the orientation of the helices, their rotation around their helical axes, and their relative global positions.
SAXS is used to determine the envelope of the shape of the molecule, and RDC is used to determine the relative orientations and rotational phases of the helices.
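The reason RDC data constrain orientation rather than position can be seen from the standard textbook expression for a residual dipolar coupling in the principal axis frame of the alignment tensor, D = Da[(3cos^2(theta) - 1) + (3/2) R sin^2(theta) cos(2 phi)]. The sketch below evaluates that expression. It is an illustration only and is not the fitting procedure used in the work described here; the function names and the parameters d_a (axial magnitude, in Hz) and rhombicity are assumptions made for the example.

```python
import math


def polar_angles(vector):
    """Polar angles (theta, phi) in radians of a nonzero 3D vector that is
    already expressed in the principal axis frame of the alignment tensor."""
    x, y, z = vector
    norm = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / norm)  # angle from the z axis
    phi = math.atan2(y, x)       # azimuth in the x-y plane
    return theta, phi


def rdc(theta, phi, d_a, rhombicity=0.0):
    """Residual dipolar coupling for an internuclear vector with polar
    angles (theta, phi), given the axial magnitude d_a and the rhombicity
    of the alignment tensor (both are illustrative parameters)."""
    axial = 3.0 * math.cos(theta) ** 2 - 1.0
    rhombic = 1.5 * rhombicity * math.sin(theta) ** 2 * math.cos(2.0 * phi)
    return d_a * (axial + rhombic)


# A vector along the z axis and one in the x-y plane give different
# couplings, although neither value depends on where the vector sits.
theta_z, phi_z = polar_angles((0.0, 0.0, 1.0))
theta_x, phi_x = polar_angles((1.0, 0.0, 0.0))
print(rdc(theta_z, phi_z, d_a=10.0))
print(rdc(theta_x, phi_x, d_a=10.0))
```

Because the expression depends only on angles, translating a helix leaves its couplings unchanged, which is consistent with SAXS being used for the overall shape.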