We plan to develop a high-throughput, high-resolution in vivo footprinting method based on next-generation, single-molecule DNA sequencing technology. Single-molecule sequencing will be used in place of conventional methods to detect cleavage induced by three modifying agents: DMS (dimethyl sulfate), DNase I and UV light. With a simple modification to standard ligation-mediated PCR (LM-PCR), the precise sequence of protein binding sites can be determined by sequencing short cleavage signature tags and counting the number of times a tag terminates at each nucleotide position within a designated 'footprint window'. This counting method will convert the cumbersome 'band intensity' analysis of classical footprinting methods into an absolute, frequency-based approach that can be readily processed by an automated analysis pipeline. The project comprises the development and demonstration of the technology, followed by its large-scale use. First, optimal conditions will be developed and applied to generate cleavage maps of previously footprinted 'control' segments (DMS, DNase I and UV) in a multiplexed format. With optimized parameters established, the assay will be used for the de novo footprinting of putative cis-regulatory elements, as indicated by DNase I hypersensitivity. Within this aim we will also determine the sensitivity and specificity of the HD (High Definition) tag footprinting assay and assess the biological and technical reproducibility of the system. More than 100 tag-generated footprints will be validated each year by classical footprinting techniques to ensure the accuracy of the data. Finally, we will determine the scalability of the HD tag footprinting system and apply it on a large scale to the analysis of putative functional elements within the ENCODE regions. DNase I hypersensitive sites identified from four cell types will be footprinted and the resulting data compared with existing ENCODE data types.
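The tag-counting step described above can be illustrated with a minimal sketch. It assumes that each sequenced tag has already been mapped to a single genomic coordinate marking where it terminates, and that the footprint window is given as an inclusive coordinate range. The function names, the input format, and the normalization to within-window frequencies are illustrative assumptions, not part of the proposed pipeline.

```python
from collections import Counter


def count_terminations(tag_ends, window_start, window_end):
    """Count tags terminating at each position of an inclusive window.

    tag_ends: iterable of integer coordinates, one per sequenced tag
              (hypothetical input format, assumed for illustration).
    Returns a list with one count per position, in coordinate order.
    """
    counts = Counter(p for p in tag_ends if window_start <= p <= window_end)
    return [counts.get(pos, 0) for pos in range(window_start, window_end + 1)]


def termination_frequencies(counts):
    """Convert raw counts to fractions of all tags ending inside the window."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Toy example: eight tags, six of which end inside the window 100..105.
tag_ends = [101, 103, 103, 104, 104, 104, 110, 99]
counts = count_terminations(tag_ends, 100, 105)
freqs = termination_frequencies(counts)
```

In this toy input, the tags ending at 99 and 110 fall outside the window and are ignored. A real analysis would presumably compare such profiles between treated and control samples before calling a footprint; that comparison is not sketched here.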
A high-throughput method capable of generating high-resolution DNase I, DMS and UV footprints will provide a powerful tool for identifying the protein-interacting sequences within novel cis-regulatory elements of any eukaryotic genome. High-resolution tag footprinting will contribute to the goals of the ENCODE project by complementing existing assays and creating a large volume of precise sequence-based data that can be used to further annotate functional elements in the genome.