(I) IDENTIFICATION OF GENES REQUIRED FOR MOUSE EMBRYONIC STEM CELL MAINTENANCE. Embryonic stem cells (ESCs) maintain an epigenetic state that enables both self-renewal and differentiation into all embryonic lineages. Because ESCs can self-renew and differentiate into any of the 220 cell types in the adult body, ESC-based therapies have been proposed for regenerative medicine and tissue replacement after injury or disease. The development of such therapies, however, depends largely on a complete understanding of the genes essential for ESC self-renewal and pluripotency. Focused functional studies and high-throughput analyses have revealed that, in ESCs, Oct4, Sox2, Nanog, and Klf4 form the core transcriptional circuitry that stably maintains the expression of pluripotency genes and represses lineage-determinant genes. RNAi screens of nearly 20,000 genes in mouse ESCs, performed by three different groups, have collectively revealed 400 other genes that maintain ESC identity. Unfortunately, although these three studies screened more or less the same set of genes, there is almost no agreement among them on which genes are essential for ESC maintenance. Despite these large-scale efforts, our understanding of self-renewal remains largely incomplete. To identify the complete set of genes essential for ESC identity, we undertook a large-scale integration of previously published gene expression microarray datasets from mouse ESCs and differentiated cells (DCs) across various developmental stages, drawn from over 30 studies. A robust meta-analysis framework was used to analyze the expression data from each study separately and to generate a ranked list of genes, ordered from those over-expressed in ESCs to those under-expressed in ESCs. We are currently performing RNAi screens and other experiments to validate the candidate ESC markers identified.
Our findings will help clarify the role and importance of novel regulators in the core pluripotency transcriptional network.

(II) EMBRYONIC STEM CELLS AND GENE REGULATION. The establishment and maintenance of pluripotency require signaling through Leukemia Inhibitory Factor (LIF) and Signal Transducer and Activator of Transcription 3 (STAT3), which oppose Extracellular Signal-regulated Kinase (ERK)-induced differentiation. How STAT3 produces an ESC-specific response and coordinates with pluripotency chromatin regulators is unknown. To understand the interplay between signaling pathways and chromatin regulators in ESCs, we mapped genome-wide STAT3 binding and found that over 80 percent of its binding depends on Brg, the ATPase of a specialized SWI/SNF-like chromatin remodeler known as esBAF, which we had previously found to be an essential component of the core pluripotency transcriptional network. Increased STAT3 levels partially rescued Brg deficiency, indicating a functional interaction between Brg and STAT3. Surprisingly, Brg deletion increased Polycomb Repressive Complex 2 (PRC2) binding and H3K27 trimethylation at Brg-dependent STAT3 target genes, resulting in their silencing. Depletion of PRC2 in Brg-knockout ESCs rescued STAT3 binding and transcription of STAT3 targets, indicating that esBAF restrains PRC2-mediated repression of LIF signaling. STAT3 action is therefore tightly regulated by the opposing actions of esBAF and Polycomb to facilitate pluripotency. These results have led us to conclude that esBAF conditions the pluripotent genome for LIF/STAT3 signaling by opposing Polycomb.

(III) CHARACTERIZATION OF CTCF'S ROLE IN GLOBAL ORGANIZATION OF CHROMATIN ARCHITECTURE. This project pursues genome-wide characterization of CTCF's insulator function and of CTCF's role in the global organization of chromatin architecture.
Insulators are DNA elements that regulate gene expression by preventing the spread of heterochromatin and/or blocking inappropriate interactions between transcriptional enhancers and unrelated promoters (enhancer blocking). CCCTC-binding factor (CTCF), a highly conserved zinc finger protein, is the only known insulator-binding protein in vertebrates. CTCF has been implicated in diverse roles in gene regulation, including transcriptional activation/repression, enhancer blocking and/or barrier insulation, genomic imprinting, X chromosome inactivation, and long-range chromatin interactions. Recent genome-wide studies by us and others have provided evidence for CTCF-mediated intra- and inter-chromosomal contacts at several developmentally regulated genomic loci. These functions support a primary role for CTCF in the global organization of chromatin architecture, and suggest that CTCF may be a heritable component of an epigenetic system regulating the interplay between DNA methylation, higher-order chromatin structure, and lineage-specific gene expression. We have been conducting experiments and bioinformatics analyses to define CTCF's role in the establishment and maintenance of chromatin structure during cellular development and differentiation.

(IV) INFERRING PROTEIN DOMAIN INTERACTIONS FROM INCOMPLETE PROTEIN-PROTEIN INTERACTION NETWORKS. Protein-protein interactions, though extremely valuable for understanding protein functions and cellular processes, do not provide any direct information about the regions or domains within the proteins that mediate the interaction. Most often, only a fraction of a protein directly interacts with its biological partners. We have combined genetic and functional data to infer, as precisely as possible, the interactions between functional domains of proteins in, and on the surface of, living cells.
The significance of this work can hardly be overstated, as it lies at the heart of making further advances in our identification and understanding of metabolic and signaling pathways that respond to environmental factors. We maintain a publicly available online database of protein domain interactions, a critically important resource for experimental biologists who seek to test for new protein and domain interactions.