5-Hydroxymethylcytosine (5-hmC) is a newly identified base modification in mammalian genomic DNA. Because current sequencing methods cannot differentiate 5-methylcytosine (5-mC) from 5-hmC, the immediate challenge is to develop robust methods to ascertain the positions of 5-hmC within the mammalian genome, a problem best addressed by adapting a new chemical labeling technology that we have developed. We show that the hydroxymethyl group of 5-hmC can be selectively labeled with chemically modified glucoses using β-glucosyltransferase (β-GT). This glycosylation offers a strategy for installing functional groups such as biotin onto 5-hmC. In this way, we can affinity-capture DNA fragments containing the modified 5-hmC and develop sequencing methods to determine the precise locations of 5-hmC. Using this approach, we have mapped the genome-wide distribution of 5-hmC in human ES cells. Building on these early successes, we propose to develop single-base-resolution detection and sequencing methods to reveal the distribution of 5-hmC in human ES cells. We propose two different approaches, both utilizing the selective chemical labeling strategy we have developed. In one approach, we will label 5-hmC with bulky groups to hinder ligation and linear PCR reactions, thus achieving single-base-resolution detection of 5-hmC. The linear PCR approach can be adapted to Illumina sequencing to perform high-throughput determination of 5-hmC in human ES cells. In an alternative approach, we will develop an exonuclease digestion blockage method that uses Illumina sequencing to detect modified 5-hmC at the 3' end of undigested DNA fragments; we have already shown that the chemically modified 5-hmC blocks exonuclease III digestion at the 3' end of the modified 5-hmC. The new technologies proposed in this R21 application will not only enable us to map 5-hmC at single-base resolution in human ES cells, but could also be applied to other samples to map 5-hmC systematically.
PUBLIC HEALTH RELEVANCE: 5-Hydroxymethylcytosine (5-hmC) is a newly discovered base modification that is surprisingly abundant in the genomic DNA of embryonic stem cells. The proposed work will develop efficient chemical labeling methods to perform single-base-resolution detection and sequencing of 5-hmC in human ES cells. The success of the proposed work will reveal the fundamental role(s) of 5-hmC in stem cells.