5-Hydroxymethylcytosine (5-hmC) is a newly identified base modification in mammalian genomic DNA. In certain tissues or cells it can accumulate to relatively high levels. Because current sequencing methods cannot differentiate 5-mC from 5-hmC, the immediate challenge is to develop robust methods to ascertain the positions of 5-hmC within the mammalian genome, a problem best addressed by adapting a new chemical labeling technology that we have invented. We show that the hydroxymethyl group of 5-hmC can be selectively labeled with chemically modified glucoses using β-glucosyltransferase (βGT). This glycosylation offers a strategy for installing functional groups such as biotin onto 5-hmC. In this way, we can affinity-capture DNA fragments containing the modified 5-hmC and develop sequencing methods to determine the precise locations of 5-hmC. Using this new method, we have obtained the first genome-wide distribution map of 5-hmC in the mammalian genome. Building on our early successes, we propose to develop single-base resolution detection and sequencing methods to reveal the exact locations of 5-hmC in mammalian genomes. We propose different but complementary approaches to ensure that effective methods will be available to the biological community in the near future. The genome-wide information obtained can be used to help probe the functional roles of 5-hmC, for instance by identifying potential proteins and/or transcription factors that may recognize 5-hmC in specific sequence contexts. We have also shown that 5-hmC can be further oxidized to 5-fC and 5-caC by the TET family enzymes. Significantly, when paired with a normal G, the 5-fC and 5-caC modifications can be excised by the human thymine DNA glycosylase (TDG) without the need for base deamination. The base excision repair (BER) process effectively converts these base modifications back to C in an active demethylation process.
We plan to develop selective 5-fC and 5-caC labeling and sequencing methods to obtain a genome-wide distribution map for each of these intriguing base modifications. In TDG-deficient cells, we believe the genome-wide distribution of 5-fC/5-caC may reveal sites of active demethylation at specific cell stages. The proposed research will develop urgently needed tools for the PI's group and the broad biology community to study one of the most cutting-edge frontiers of life sciences research: the potential functional roles of these newly discovered DNA base modifications in epigenetics, development, and various human diseases. PUBLIC HEALTH RELEVANCE: 5-Hydroxymethylcytosine (5-hmC), 5-formylcytosine (5-fC), and 5-carboxylcytosine (5-caC) are newly discovered base modifications in the genomic DNA of certain mammalian tissues and cells. The proposed research will develop efficient labeling methods to reveal and map the genome-wide distributions of these base modifications, which may play critical roles in epigenetics.