Gene regulatory networks are defined by highly specific interactions between thousands of unique molecules. Transcription factors (TFs) play a central role in these networks, but much remains unknown regarding the structural basis of their sequence specificity and the connectivity between signaling pathways and TFs. We will develop novel computational methods to address these fundamental questions. We will also analyze post-transcriptional regulation of transcript stability by RNA-binding proteins. Most of our research effort will focus on yeast, but our methods will be applicable in all eukaryotes. For data access and experimental validation of our results, we will work with excellent high-throughput experimental collaborators. We will also perform more traditional follow-up experiments within our own laboratory. Our first specific aim is to infer a structure-based protein-DNA recognition code from high-throughput binding data. By performing a simultaneous fit to in vitro binding data for a wide range of TFs, we will estimate free energy potentials for base-pair/amino-acid recognition. These will allow us to predict sequence specificity from the amino-acid sequence of the TF alone and design TFs with prescribed sequence specificity. Our second aim is to identify modulators of TF activity using network-level genetic linkage analysis. We will develop a method that combines the power of genetic linkage analysis with prior information about transcriptional network connectivity, and identify quantitative trait loci whose allelic status affects TF activity. Using this approach, we will perform a comprehensive analysis of the connectivity between the signaling and the transcriptional networks in yeast. Our third aim is to functionally dissect post-transcriptional regulation of mRNA stability.
We previously demonstrated that steady-state mRNA expression data contain detailed information about the condition-specific control of mRNA half-life by RNA-binding proteins (RBPs). By integrating a novel high-throughput immunoprecipitation dataset for >40 RBPs with genome-wide mRNA expression data for a large number of physiological conditions, we will predict the conditions in which specific RBPs are active. We will analyze combinatorial cis-regulatory interactions with co-factors and use linkage analysis to map connectivity between signaling pathways and post-transcriptional networks. Aberrant regulation of gene expression is often associated with disease. Furthermore, genetic differences between individuals affect responsiveness to drugs as well as disease prognosis. Our work will lead to theoretical and biological insights, as well as practical software tools and databases that will help basic and applied researchers understand and predict the behavior of gene regulatory networks. PUBLIC HEALTH RELEVANCE: This project aims to further develop computational algorithms and software that can be used to predict how DNA- and RNA-binding proteins "read" the genome sequence in order to control gene expression in a gene- and cell type-specific manner. These tools will allow researchers to understand how the behavior of gene regulatory networks is shaped by the genome sequence and affected by genetic differences between individuals. Aberrant regulation of gene expression is often associated with disease.