How cell-type specific gene expression programs are established and maintained is a fundamental question in molecular biology. In mammalian cells, hundreds of sequence-specific transcription factors have been catalogued, and they bind the regulatory regions of their target genes in cell-type specific and combinatorial occupancy patterns. Moreover, the developmental programs that generate different cell lineages are accompanied by complex chromatin remodeling. Increasing evidence suggests that the regulatory regions of cell-type specific genes may often be established, and sometimes poised by chromatin marks, at earlier stages in development. However, the detailed characterization of gene regulatory regions (including their initial establishment in earlier progenitor cells, the dynamics of their chromatin state, and the combinatorial control of gene transcriptional output by multiple transcription factors) has been carried out for only a handful of developmentally important genes. The goal of this project is to develop new integrative computational methods that exploit massive next-generation sequencing data sets to fundamentally advance our understanding of cell-type specific transcriptional programs. We will develop integrative computational analysis methods for (1) learning the sequence and chromatin determinants of transcription factor binding from ChIP-seq and DNase-seq; (2) mapping the landscape of chromatin accessibility of all regulatory regions in the human and mouse genomes using DNase-seq across all available cell types, dissecting the poising of their chromatin state in earlier progenitor cells, and extracting the sequence code governing their gain and loss in differentiation; and (3) modeling cell-type specific gene expression programs as a function of chromatin state, transcription factor binding, and regulatory sequence.
We will couple our computational methods development with targeted experimental validation, including both locus-specific and genome-wide assays.