Our project seeks to complete the catalog of the regulatory elements recognized by the full set of transcription factors (TFs) in the fruit fly Drosophila melanogaster and the nematode Caenorhabditis elegans. In the initial modENCODE project, an experimental pipeline was developed and applied to ~100 TFs in each organism, leaving approximately 600 TFs still to be studied in each of fly and worm. To achieve this scale-up, the project builds on the advances made by the groups in the initial phase and combines their production pipelines to increase efficiency and realize economies of scale. For both organisms, the overall strategy is to tag transcription factor genes by fusion with an enhanced Green Fluorescent Protein (eGFP) sequence through recombineering of large-insert clones, and to introduce the tagged genes into the genome by transgenesis. ChIP-seq using a high-quality anti-GFP antibody is performed on the developmental stage(s) with maximal GFP expression, with the choice of stage further guided by available RNA-seq expression data. The aligned sequence reads are analyzed by PeakSeq and other software to identify candidate binding sites and likely target genes. We will prioritize TFs with human homologs to maximize the broader utility of the data. For 40 TFs in each organism, we will also investigate TF expression in specific subsets of tissues or cells to estimate the specificity and sensitivity of whole-animal ChIP-seq assays. We will also perform RNAi of 100 TFs in each organism, followed by RNA-seq, to validate called peaks and their assigned target genes. Finally, we will integrate the information from the different data sets to construct the regulatory networks implied by the TF binding site data. We will coordinate with ENCODE projects on human TFs, and our data will provide key in vivo and developmental regulatory information that will be essential to delineate both fundamentally conserved and human-specific properties of TFs.