Project Summary Abstract

Transcription factors (TFs) control gene expression by binding to DNA and either activating or repressing their target genes. While our understanding of how TFs bind DNA has grown rapidly, our understanding of how TFs activate transcription once they are bound has not kept pace. As a result, when we identify a patient mutation in a TF, we can sometimes predict whether it will disrupt function if it falls in the DNA-binding domain, but we have no ability to predict its effects if it falls in the activation domain. To address this gap, I propose to study the amino acid composition of acidic activation domains using high-throughput assays and modern computational analyses. I have recently developed a high-throughput assay for measuring thousands of designed activation domain mutants in yeast. Here, I propose to develop a similar method in mammalian cell culture. In Aim 1, I propose an in-depth study of a few activation domains as a model for how genetic variation impacts TF function. In Aim 2, I propose a broad survey of human transcription factors to search for new activation domains. In the independent phase of this grant (R00), I propose two more aims to investigate the mechanisms by which TFs activate target genes. In Aim 3, I will functionally classify different types of activation domains and search for common features of each type. In Aim 4, I will link activation domains to cofactors and look for the features that predict activation. This proposal will create a new layer of annotation for the human genome: all regions that are sufficient to serve as activation domains. It will also create a new genome-scale method and deliver computational models for predicting activation domains from amino acid sequence. This award will support training in mammalian experimental systems and advanced machine learning analysis. Together, these aims will support the long-term goal of reading the regulatory genome by predicting gene expression from DNA sequence.