Project Summary: We have worked to extend the range of analyzable samples by developing protocols suitable for sample types including formalin-fixed, paraffin-embedded samples, flow-sorted primary cells, and fine-needle aspirates. We have established that useful nucleic acid preparations can be obtained from these fixed tissues and are continuing to extend the analysis of this material to a wider range of genomic technologies, especially sequencing. Indeed, current efforts have been directed primarily at the implementation of next-generation sequencing technologies. These methods primarily depend on producing an array of DNA molecules that are sequentially imaged during the sequencing reaction. We are investigating the use of these methods for gene expression profiling of large and small RNAs, for the detection of genome rearrangements and mutations, and for the measurement of chromatin modifications, DNase I hypersensitive sites, and transcription factor localization. In addition, we are exploring new methods for large-scale genome mapping using microfluidic technologies. A major part of this effort is the development and implementation of software tools that can be used to analyze the massive amount of sequence data generated by this work. Although this is a challenging process, it will ultimately yield a streamlined analysis pipeline in which multiple sequence-based assays are easy to integrate and free of array-platform-specific artifacts. Specific goals of our computational efforts include the optimization of pipelines to process sequence data for chromatin analysis, chromosome rearrangements, gene expression, and mutation detection.
We are currently pursuing new approaches to target sequencing efforts to regions of interest, such as the small proportion of the genome composed of genes, or subsets of genes of particular interest. The aim is to be able to sequence thousands of genes in individual samples or, in a complementary fashion, to sequence a few key genes in hundreds of samples. We are also developing efficient methods to target intergenic and intronic regions, which are often the sites of important structural rearrangements in cancer. We make use of informative models as needed to test our novel approaches to genome characterization. This work has extensive potential applications in basic, translational, and clinical research.