The goal of Computational Analysis Core B is to manage, analyze, and extract maximum scientific value from the data generated by the investigators of this Program Project. The primary emphasis of Core B will be to assist Program investigators in the analysis of RNA-Seq data and to connect the results of these measurements with prior knowledge to unveil new biology. Importantly, these data will provide a rich source for discovering previously unappreciated shifts in mRNA isoform usage, novel microRNAs, and biologically relevant noncoding transcripts. The Core reflects a novel collaboration between researchers at two institutions that collectively possess extensive expertise and resources in computation and data analysis: the University of Washington (UW) Department of Computer Science & Engineering and Sage Bionetworks, a non-profit biomedical research organization created to revolutionize how researchers approach the complexity of human biological information and the treatment of disease. The vast data sets generated by next-generation sequencing technologies create enormous opportunities but also significant challenges. Core B will ensure optimal management and interpretation of data obtained from the RNA-Seq studies proposed by Projects 1, 3 and 4 and Core A. These studies will generate RNA-Seq data from both short (e.g., microRNA) and long (e.g., messenger RNA and long noncoding RNA) protocols. As described below, Core Director Dr. Ruzzo (UW) and co-investigators Dr. Brig Mecham and Dr. Adam Margolin (Sage) are exceptionally well qualified to carry out the proposed work, and an existing collaboration with them has already generated interesting findings, as described in Project 1 (Blau). In support of the Program, Core B's activities will vary in accordance with the needs of the individual Projects. Core B will provide an essential resource to P01 investigators who have not previously had access to formalized bioinformatics support.
For investigators with existing bioinformatics collaborations, Core B will assist in bringing uniform best practices to quality assessment, analysis and interpretation of RNA-Seq data while minimizing duplication of effort. This includes developing procedures and simple automated workflows that integrate existing specialized tools into a unified framework for storage, comparison, analysis and visualization of these data sets. Interaction with Sage Bionetworks will be particularly valuable, exploiting their tools for integration and visualization of diverse biological data and their ongoing effort to standardize and distribute all publicly available microarray and sequencing data. That effort required developing automated workflows that reliably processed 15,000 distinct microarray data sets, yielding standardized information in a usable format for the community. Similar work is underway for RNA-Seq data. These workflows will serve as a template for the proposed analyses. Additionally, Dr. Ruzzo's experience with next-generation sequencing technologies (both RNA-Seq and ChIP-Seq analyses) and expertise in prediction of conserved noncoding RNAs offer the prospect of identifying important, novel players in the unique biological systems addressed by this P01.