Epigenetic patterns provide an extra layer of gene regulation beyond the genomic sequence and play a critical role in maintaining cell-type-specific gene expression programs. Disruption of epigenetic regulation can cause severe diseases, including cancer. During the past few years, a large amount of epigenomic data has been generated, revealing significant changes in epigenetic patterns across cell types and in diseased tissues. However, the targeting mechanisms for epigenetic factors remain poorly understood. We propose integrated computational and experimental approaches to tackle this challenge, organized into three specific aims. In Aim 1, a wavelet analysis-based computational approach will be developed to predict genome-wide epigenetic patterns from DNA sequence information. In Aim 2, gene expression data will be integrated to predict tissue-specific epigenetic changes. In Aim 3, controlled experiments will be carried out in mouse embryonic stem cells to validate the computational predictions. If successful, the proposed research will provide mechanistic insights into epigenetic targeting and could be used to develop tools that reverse specific disease-causing, aberrant epigenetic changes, a potential novel therapeutic approach for cancer and other diseases. Our preliminary studies suggest that a wavelet-based computational approach, previously developed by the principal investigator and colleagues, is effective for predicting a number of epigenetic modifications. In Aim 1, this approach will be optimized for de novo detection of sequence features associated with various epigenetic modifications by incorporating more effective wavelet tools such as thresholding and wavelet packet analysis. Its predictive power will be further enhanced by incorporating previously annotated sequence features and by using a more effective classification method.
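As an illustration of the wavelet thresholding mentioned above, the following is a minimal sketch and not the investigators' actual method. It assumes a Haar wavelet, a simple GC-indicator encoding of the DNA sequence, and soft thresholding of the detail coefficients; the function names (`gc_signal`, `haar_forward`, `haar_inverse`, `denoise`) and the threshold value are hypothetical choices made only for this example.

```python
import math

def gc_signal(seq):
    """Encode a DNA sequence as 1.0 for G/C and 0.0 otherwise (assumed encoding)."""
    return [1.0 if base in "GCgc" else 0.0 for base in seq]

def haar_forward(signal):
    """Full orthonormal Haar decomposition of a power-of-two-length signal.

    Returns (approximation, details), with details ordered coarsest to finest.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    s = math.sqrt(2.0)
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # The finest level is computed first, so prepend to keep coarsest first.
        details.insert(0, [(a - b) / s for a, b in pairs])
        approx = [(a + b) / s for a, b in pairs]
    return approx, details

def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding the signal from coarsest to finest level."""
    s = math.sqrt(2.0)
    current = list(approx)
    for level in details:
        rebuilt = []
        for a, d in zip(current, level):
            rebuilt.extend([(a + d) / s, (a - d) / s])
        current = rebuilt
    return current

def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become exactly 0."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def denoise(signal, t):
    """Soft-threshold all detail coefficients, then reconstruct the signal."""
    approx, details = haar_forward(signal)
    shrunk = [[soft_threshold(d, t) for d in level] for level in details]
    return haar_inverse(approx, shrunk)

if __name__ == "__main__":
    signal = gc_signal("GGCCATAT")   # toy sequence, not real data
    print(denoise(signal, t=0.2))    # hypothetical threshold
```

In this sketch, a threshold of zero reproduces the input exactly, while a very large threshold removes all detail and leaves only the mean GC content. A practical analysis would more likely rely on an established wavelet library and a data-driven threshold.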
For Aim 2, gene expression data will be combined with sequence analysis to predict tissue-specific epigenetic changes. A sparse principal component regression method will be used to identify the tissue-specific, context-dependent regulatory effects of modules, each characterized by a combinatorial pattern of multiple regulators. For Aim 3, we will select DNA sequences and regulatory modules based on the computational predictions and experimentally test their roles in epigenetic targeting using several assays, including introduction of genetic mutations, forced or inhibited gene expression, and chromatin immunoprecipitation.
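The idea behind sparse principal component regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the method the investigators will use: it extracts a single sparse component with a soft-thresholded power iteration (a simple heuristic for sparse PCA) and then regresses the response on that component's scores. The function names, the penalty `lam`, and the toy layout (rows as genomic regions, columns as regulator expression levels) are all hypothetical.

```python
import math

def center_columns(rows):
    """Subtract each column's mean from its entries."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values with |x| <= lam become exactly 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def sparse_first_component(rows, lam, n_iter=100):
    """Sparse loading vector from a soft-thresholded power iteration.

    Starts from a uniform vector, which is adequate for a sketch but could
    fail if that vector happened to be orthogonal to the leading component.
    """
    X = center_columns(rows)
    n, p = len(X), len(X[0])
    cov = [[sum(r[i] * r[j] for r in X) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [soft_threshold(sum(cov[i][j] * v[j] for j in range(p)), lam)
             for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("lam is too large: every loading was shrunk to zero")
        v = [x / norm for x in w]
    return v

def sparse_pcr_fit(rows, y, lam):
    """Regress centered y on the scores of the first sparse component.

    Returns (loadings, slope, scores).
    """
    v = sparse_first_component(rows, lam)
    X = center_columns(rows)
    y_mean = sum(y) / len(y)
    z = [sum(x * l for x, l in zip(row, v)) for row in X]
    beta = sum(zi * (yi - y_mean) for zi, yi in zip(z, y)) / sum(zi * zi for zi in z)
    return v, beta, z

if __name__ == "__main__":
    # Toy data: regulators 1 and 2 vary together; regulator 3 is small noise.
    t = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
    noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
    rows = [[a, a, b] for a, b in zip(t, noise)]
    y = [3.0 * a for a in t]
    loadings, slope, _ = sparse_pcr_fit(rows, y, lam=0.5)
    print(loadings, slope)
```

The nonzero loadings identify which regulators form the module, and the slope summarizes the module's estimated effect on the response. Fitting such a model separately per tissue is one way tissue-specific effects could be compared, although that design choice is an assumption of this sketch.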