It is now clear that response to targeted therapies in cancer is dictated by the molecular characteristics of an individual patient's tumor. The drug development industry is increasingly looking to genomics-based analyses to link drug response phenotypes to underlying genetic mutations and multiplex molecular signatures. Such analyses are hindered by three fundamental challenges. First, cancer comprises a diverse collection of potentially hundreds of distinct molecular diseases, and most preclinical experiments are not of the appropriate scale to represent this diversity. Second, most analyses are limited in their scope of genomic associations, focusing on only one of several possible characterizations (e.g., mutations, gene expression, or DNA copy number). Finally, most analyses omit the last crucial step, which is to map multiplex genetic biomarkers of drug response to clinical tumor populations. Compendia Bioscience seeks to address these fundamental challenges and provide an oncogenomics drug profiling solution that: assesses a broad panel of cancer cell lines for response to a given compound; defines sensitive and refractory cell line populations; performs detailed genomics correlation analysis spanning gene expression, DNA copy number, and mutations; develops and refines multiplex genetic biomarker(s) of response; and analyzes response biomarker(s) across nearly 30,000 highly curated clinical cancer specimens to identify patient populations and subpopulations likely to respond. The solution builds upon the functionality and resources previously established in Oncomine, a comprehensive gene expression database that collects, standardizes, and analyzes publicly available gene expression data.
This will be accomplished by extending the flexibility of Oncomine to capture new metadata on existing studies; by overcoming technical obstacles associated with incorporating new data types; and by utilizing the full range of available data in logical ways to support drug development workflows. The Specific Aims of this Phase I proposal are to: 1. Catalog and annotate publicly available gene mutation data for a panel of 300+ cancer cell lines. 2. Process and integrate DNA copy number data for the same panel of 300+ cancer cell lines. 3. Develop a method to call amplifications and deletions from DNA copy number data and integrate these calls with mutation data. Together, the aims of this proposal contribute significantly to proving the feasibility of the overall approach by assembling critical data elements for a large cell line panel and then directly assessing the concordance between cell line data and tumor data. PUBLIC HEALTH RELEVANCE: Despite enormous investments in genomics technology aimed at improving the drug development pipeline, cancer remains a leading cause of mortality in the United States. This proposal seeks to advance preclinical drug development efforts by maximizing the value of experimentally robust, well-characterized cell lines and by using the results of those studies to inform patient selection in clinical trials of novel therapeutic compounds. This will improve public health by making it easier for drug development companies to identify patient populations likely to benefit from novel compounds and to successfully advance those compounds through clinical trials and to the marketplace.
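As an illustration of Specific Aim 3 only, the following minimal sketch shows one simple way amplification and deletion calls could be derived from copy number data and combined with mutation calls. It is not the method the proposal will develop. The input format (per-gene log2 copy number ratios), the thresholds of +1.0 and -1.0, the function names, and the placeholder gene labels are all assumptions introduced here for illustration.

```python
from typing import Dict, Set

# Hypothetical thresholds on the log2(sample / reference) copy number ratio.
# +1.0 corresponds to a doubling of copy number relative to the reference;
# -1.0 corresponds to a halving. Real pipelines tune these per platform.
AMP_THRESHOLD = 1.0
DEL_THRESHOLD = -1.0


def call_copy_number(log2_ratio: float) -> str:
    """Classify one log2 ratio as 'amplified', 'deleted', or 'neutral'."""
    if log2_ratio >= AMP_THRESHOLD:
        return "amplified"
    if log2_ratio <= DEL_THRESHOLD:
        return "deleted"
    return "neutral"


def integrate_alterations(
    log2_ratios: Dict[str, float], mutated_genes: Set[str]
) -> Dict[str, str]:
    """Merge copy number calls and mutation calls into one status per gene.

    Genes seen in either input are reported. A gene with no mutation and
    a neutral (or missing) copy number call is labeled 'wild-type'.
    """
    status: Dict[str, str] = {}
    for gene in sorted(set(log2_ratios) | mutated_genes):
        labels = []
        if gene in mutated_genes:
            labels.append("mutated")
        if gene in log2_ratios:
            call = call_copy_number(log2_ratios[gene])
            if call != "neutral":
                labels.append(call)
        status[gene] = "+".join(labels) if labels else "wild-type"
    return status


if __name__ == "__main__":
    # Toy values for a single hypothetical cell line; not real measurements.
    toy_ratios = {"GENE_A": 1.5, "GENE_B": -2.0, "GENE_C": 0.0}
    toy_mutations = {"GENE_B", "GENE_D"}
    for gene, label in integrate_alterations(toy_ratios, toy_mutations).items():
        print(gene, label)
```

Fixed thresholds are used here only because they are easy to read. A production method would likely need segmentation of the raw copy number signal and platform-specific calibration before any calls are made.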