PROJECT SUMMARY
Chromatin plays an essential role in transcriptional regulation, and chromatin-related genes are frequently mutated in cancers. Dissecting the functions of chromatin in gene regulation is therefore important for understanding the molecular mechanisms of oncogenesis and tumor progression. As an experienced computational biologist with expertise in ChIP-seq bioinformatics and epigenetics, I have focused my research on developing computational methodologies for high-throughput genomic data analysis and on computational modeling of chromatin regulation of gene expression. With further independent research training in cancer biology, I will build a research program in computational cancer epigenetics and establish an independent academic career. Recent studies have demonstrated the feasibility of targeting chromatin regulators as novel therapeutics for cancer treatment. However, the context-specific substrates of chromatin regulators and the mechanisms by which chromatin regulates gene expression remain largely unclear. With the advent of high-throughput genomic techniques based on next-generation sequencing, including ChIP-seq, DNase-seq, and ATAC-seq, a large amount of genomic profiling data has become available, making it possible to systematically decipher gene regulatory mechanisms with an integrative computational approach. The objective of this project is to develop novel quantitative and computational methodologies for studying epigenetic gene regulation and the functions of chromatin regulators in cancer. Specifically, we propose to develop integrative computational methods that leverage the abundant public ChIP-seq, DNase-seq, and ATAC-seq data to predict functional regulatory elements and transcription factors (TFs). First (Aim 1), we will develop a method that, given any gene set, predicts the functional enhancer elements and associated TFs using public histone mark ChIP-seq data across multiple cell types.
Second (Aim 2), we will develop a quantitative model to identify chromatin accessibility dynamics at nucleotide resolution from paired-end DNase-seq or ATAC-seq data, with correction for intrinsic biases in the data. Finally (Aim 3), we will integrate publicly available DNase-seq, ATAC-seq, and ChIP-seq data in a comprehensive database and systematically characterize the functions of chromatin regulators, with a focus on EZH2, in a few cancer systems, including castration-resistant prostate cancer (CRPC) cells and malignant peripheral nerve sheath tumors (MPNSTs). These computational methods complement existing bioinformatics methodologies and will have broad applications in the study of cancer epigenetics and gene regulation. The proposed research will fill the knowledge gap between oncogenic drivers and downstream gene expression programs, and could provide mechanistic support for the development of novel targeted therapeutics for cancer precision medicine.