Eukaryotic gene expression is regulated by numerous mechanisms, including the identities, precise sequences, and architectural arrangements of key transcription factor binding sites (TFBSs) within a promoter, as well as the promoter's genomic environment. For example, once retroviruses integrate their genomes into a semi-random location of the host cell genome, they utilize a highly diverse promoter sequence to integrate the genetic and epigenetic inputs at a particular integration site to initiate viral gene expression and replication. Given the diversity in regulatory sequences and genomic environments in the human genome, however, it is highly challenging to understand how such a promoter transforms a number of regulatory inputs into a temporal pattern of mRNA expression. We thus propose to apply a systems biology approach to investigate the properties of an important and highly variable promoter, the Human Immunodeficiency Virus (HIV-1) long terminal repeat (LTR), to elucidate principles by which broad diversity in promoter sequence and genomic environment regulates gene expression dynamics and replication. This work will yield new quantitative insights into transcriptional regulation and may aid in the future design of enhanced therapeutics. Like many viruses, HIV is difficult to treat because of two fundamental features: its evolution and its ability to establish a latent, inactive population. In the first phase of this work, we developed experimental and computational models of subtype B HIV gene expression and latency, linked through quantitative measurements at the single-cell and population levels. In particular, we found that stochastic effects in gene expression at a subset of integration positions could lead to highly "noisy" gene expression dynamics that may influence viral replication and latency.
However, although laboratory strains of subtype B HIV are the most broadly studied, HIV's very rapid rate of evolution generates highly variable sequences within an individual patient, and this variation has accumulated over years at a global scale to yield diverse HIV subtypes with stereotypical differences in architecture, including in the LTR. It is clear that changes in LTR sequence impact numerous aspects of the viral life cycle - including gene expression, replication, and likely virulence - and we now propose to develop deeper insights into the sequence-function relationships of this highly important mammalian promoter. In particular, we hypothesize that different architectures of host TFBSs and the chromatin environment interact in predictable ways to control gene expression dynamics of the viral LTR, and that models can be formulated to predict the gene expression behavior of viruses containing both synthetic and natural/clinically isolated promoters. The proposed work will thus yield unique, quantitative insights into mechanisms of mammalian gene regulation, in a system that is of fundamental importance to human disease.
PUBLIC HEALTH RELEVANCE: The central goal of this proposal is to apply an integrated experimental and computational approach to gain deeper insights into the basic relationship between the sequence and architecture of an important human promoter, the Human Immunodeficiency Virus Long Terminal Repeat, and its gene expression properties and functions. This work has implications for basic mechanisms of gene regulation, as well as potential downstream biomedical applications.