Abstract/Project Summary

The goal of this proposal is to discover and interpret the code by which cis-regulatory DNA controls gene expression. This regulatory DNA controls the specification of cell fates with exquisite precision in multicellular organisms, including humans, and its dysregulation underlies both developmental diseases and cancer. The manner in which this control is coded into the genome remains poorly understood. Moreover, the recent discovery that metazoan genes are transcribed in random bursts raises the problem of understanding how this random process is controlled to give rise to the highly precise distribution of mature transcripts observed. Both problems are roadblocks to further progress in basic science and translational medicine, and the work proposed here is designed to remove them. The key supporting tool is an established model of transcriptional control that takes DNA sequence and the concentrations of transcription factors as inputs and gives RNA synthesis rate as output. This model is not limited to enhancers; it can also treat an entire genetic locus. We previously used this model to understand how enhancer function can be conserved across phylogenetically distant species in the absence of DNA sequence conservation. We found that the conserved entities were small clusters of binding sites in which the exact positions of binding sites and the identity of bound transcription factors can vary, but only within certain limits. These clusters, which we call "soft codons," may have a role as essential as the structural genetic code. To test this, we propose to

Aim 1: (a) Discover and model soft codons in the entire eve locus of D. melanogaster, D. virilis, and D. erecta in their native context, and in selected enhancers from distant dipterans in the genera Megaselia, Clogmia, and Chironomus expressed in D. melanogaster.

The random bursts of transcription observed in vivo are also under the control of transcription factors. We propose to extend our transcription model to treat control of these bursts through a program of parallel experimentation and modeling. All experiments will be conducted in the context of a native intact locus, in which we will analyze the effects of a series of carefully selected perturbations. Specifically, we propose to

Aim 2: Perform an in vivo regulatory dissection of the Drosophila eve locus in which we will monitor bursting in (a) the whole locus; (b) a series of key stripe two enhancer constructs designed to vary the strength and variability of transcription; (c) rearrangements of enhancers within the whole locus; and (d) pure transvective constructs in which all interactions between the enhancer and basal promoter are in trans. We will use the resulting data, together with our preexisting quantitative atlas of gene expression at cellular resolution, to

Aim 3: Construct a new stochastic model of transcriptional control by coupling our current model of transcription to a simple model of stochastic transcription initiation.
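
As an illustration of the kind of coupling Aim 3 describes, the sketch below simulates a two-state ("telegraph") promoter with the Gillespie algorithm. It is a generic textbook model chosen as an assumption, not the model proposed here. The function names, the parameters `k_on`, `k_off`, `r`, and `d`, and the idea of matching the long-run mean initiation rate to the deterministic model's RNA synthesis rate are all hypothetical.

```python
import random


def k_on_for_mean_rate(synthesis_rate, k_off, r):
    """Hypothetical coupling: pick k_on so that the long-run mean initiation
    rate of the telegraph model, r * k_on / (k_on + k_off), equals the RNA
    synthesis rate predicted by a deterministic model."""
    if not 0.0 <= synthesis_rate < r:
        raise ValueError("synthesis_rate must lie in [0, r)")
    return k_off * synthesis_rate / (r - synthesis_rate)


def simulate_telegraph(k_on, k_off, r, d, t_end, seed=0):
    """Gillespie simulation of a promoter that switches OFF <-> ON.

    Returns a list of (time, promoter_on, mrna_count), one entry per event,
    starting from an OFF promoter with no mRNA.
    """
    rng = random.Random(seed)
    t, on, m = 0.0, False, 0
    trajectory = [(t, on, m)]
    while True:
        # Propensities of the three reactions available in the current state.
        a_switch = k_off if on else k_on   # promoter toggles state
        a_make = r if on else 0.0          # initiation only while ON
        a_decay = d * m                    # first-order mRNA loss
        a_total = a_switch + a_make + a_decay
        if a_total == 0.0:
            break                          # no reaction can fire any more
        t += rng.expovariate(a_total)      # exponential waiting time
        if t > t_end:
            break
        u = rng.random() * a_total         # choose which reaction fires
        if u < a_switch:
            on = not on
        elif u < a_switch + a_make:
            m += 1
        else:
            m -= 1
        trajectory.append((t, on, m))
    return trajectory


if __name__ == "__main__":
    # Assumed example values, in arbitrary time units.
    k_on = k_on_for_mean_rate(synthesis_rate=5.0, k_off=0.5, r=20.0)
    traj = simulate_telegraph(k_on, k_off=0.5, r=20.0, d=1.0, t_end=100.0)
    print("events:", len(traj), "final mRNA count:", traj[-1][2])
```

In this sketch the transcription factors would act only through the deterministic rate passed to `k_on_for_mean_rate`. Whether they modulate burst frequency, burst size, or both is left open; that is the question the experiments of Aim 2 are meant to address.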