We propose to develop further assays to identify the loci at which RNA:DNA hybrids form in the genome, and to gather information that gives insight into the function of these loci. The foundation for this proposal is a reasonably substantial body of preliminary data showing that an antibody detecting RNA:DNA hybrids can be used to immunoprecipitate such loci in a human cell line, allowing sequencing to map these loci back to the genome. We find that these loci have characteristics consistent with those predicted by in vitro and other prior studies, with polypurine skewing and enrichment in rDNA, telomeres and loci such as the mtDNA origin of replication. We have some early insights into the function of these loci from their physical association with genes that are completely silenced, and from mass spectrometry analysis of proteins immunoprecipitated from chromatin. Our funding proposal is based on the further development of the genome-wide mapping assay, including some orthogonal approaches using other nucleases, affinity reagents and chemical mutagens. We propose to obtain much more detailed mass spectrometry data from different human and mouse cell types, and to perform rigorous validation of the candidates identified. Finally, we will perform bioinformatic studies of the sequences at which we see the RNA:DNA hybrids forming, extending our current analyses, which show intriguing patterns of purine:pyrimidine skewing, and correlating hybrid formation with functional outcomes such as gene expression and DNA replication timing. Many models of genome regulation assume double-stranded DNA as the default, but if RNA:DNA hybrids form at a given locus, we would need to revise our assumptions about the ability of that locus to bind transcription factors, undergo DNA methylation or organize into nucleosomes. At the very least, we will be identifying a variable that has the potential to confound some of these assumptions.
We hope to take these insights to a higher level, identifying innate properties of these loci that will allow us to add a layer of information about how the genome functions. If this exploratory project is successful, we will have a clear idea of how to expand the study into a more comprehensive project in the future.
PUBLIC HEALTH RELEVANCE: The genome is inherently complex and is regulated by numerous mechanisms that we are just beginning to understand. One underexplored area is the role that unusual DNA structures may play, something that has been relatively difficult to study. DNA usually exists in the classical double helix described by Watson and Crick, in which two strands of DNA pair to form double-stranded DNA. As a field, we have performed many experiments to study how genes are regulated, built upon the assumption that this is how DNA organizes itself in living cells. There are, however, exceptions to this rule, one being the presence of RNA:DNA hybrids in the genome. RNA is usually associated with DNA only during the act of transcription, following which it leaves the parent DNA molecule and allows the DNA to return to its usual double-stranded conformation. An interesting prior observation is that some loci do not appear to relinquish the RNA, leaving it tightly associated with one DNA strand to form an RNA:DNA hybrid, while the remaining unpaired DNA strand stays in a single-stranded conformation, a so-called R-loop. Most of these structures were identified not in living cells but under artificial conditions in which they were recreated biochemically. Our interest was to see whether we could identify where they form in living human cells. We show that we have been able to develop such an assay, and that we have begun to identify the proteins that bind to these sequences and to understand their function using computational biological approaches.
The funding proposal describes how we plan to develop these studies further, having made significant progress in the pre-funding period. The proposed project does not include human studies, as these would be premature until we have a well-established system, but we anticipate applying the analytical system to human disease research in a number of areas. The first is ageing: single-stranded DNA is more prone to age-related oxidative damage, and the telomeres of chromosomes, which are very influential in ageing, are well established as RNA:DNA hybrid-forming loci. Cancer is another area of interest: we would like to see whether these loci have roles in translocations, as has been proposed, and whether they act as specific genomic targets of certain chemotherapeutic drugs. We are eager to proceed to the stage of the project that will allow us to perform these clinically applicable studies, but we recognize the need to have a carefully designed and robust system in place beforehand, which prompts the current exploratory grant funding proposal.