DNA methylation plays a critical role in regulating lineage specification and restriction of potency during mammalian development; aberrant patterns of DNA methylation are generally observed in cancers. Second-generation sequencing of bisulfite-treated DNA is enabling DNA methylation to be examined in greater detail, and recently demonstrated array-based capture techniques allow ultra-deep bisulfite sequencing in selected genomic regions. This project develops the algorithmic and statistical methods required to formulate and test specific hypotheses about DNA methylation based on data from these novel experimental technologies. A family of statistical models will be designed to characterize features of DNA methylation in a cell or sample, and algorithms will be designed for the associated computational tasks of model fitting, probability calculation, and feature identification. The methods will be validated through application to novel, ultra-deep bisulfite sequencing data; specific hypotheses about the regulation of DNA methylation and how methylation regulates gene expression will be tested simultaneously. Specific methylation datasets related to development and cancer will be produced. Computational methods will be developed for identifying clonal features of methylation profiles in cells, for predicting developmental relationships between cells, and for resolving information about the complexity and histology of tumor samples. Efficient and robust implementations of the methods will be developed and released for public use. The proposed research will provide increased analytical capability to complement emerging experimental technology for investigating DNA methylation. This will enable researchers, particularly those studying human development and cancers, to ask and answer more precise questions about the functions of DNA methylation.
Moreover, this technology will assist in identifying the most effective methylation-based markers for clinical outcomes associated with cancers.