This application seeks funds to develop LIFE, a novel information system to seamlessly integrate the diverse data sets produced by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) and other screening programs. The LINCS program aims to generate an extensive reference set of cellular response signatures to a variety of small molecule and genetic perturbations. The goal is to create a sustainable and widely accessible knowledge resource to advance our understanding of the highly orchestrated interplay of molecular biological components in maintaining healthy development and of how their perturbation causes disease. Data produced at LINCS span a variety of assay formats and technologies, including biochemical and single-cell phenotypic responses and genome-wide transcriptional profiling. The success of this initiative depends critically on an effective informatics solution that integrates the various (current and future) data types into coherent data sets and makes them accessible, interpretable, and actionable for scientists of different backgrounds and with different objectives. We propose to develop LIFE, a novel knowledge-based information system, to address this challenge. Tremendous progress has been made during the last decade in developing Semantic Web technologies with the goals of formalizing knowledge, linking information across different domains, and integrating heterogeneous data from diverse sources. LIFE will leverage these technologies and extend them further. LIFE will incorporate biomedical domain-level ontologies, including our recently developed BioAssay Ontology, to associate related data types and to provide a knowledge context for the LINCS assays and their outcomes. LIFE will be scalable with respect to information volume and complexity.
A key feature of LIFE will be its potential to derive new implicit knowledge through various inference mechanisms, similar to how humans obtain insights by (mentally) connecting different pieces of information. The overarching goal of the LIFE system is to help scientists use the data and results produced in the LINCS and other NIH screening programs in their own research and to support the translation of those results into novel therapeutics.

PUBLIC HEALTH RELEVANCE: Public and private organizations are generating huge data sets as they attempt to develop new drugs for human diseases. One reason drug development is slow is the difficulty of generating new knowledge from the full collection of available drug development data. We propose to develop a novel information system that will help scientists access and use information generated by the NIH LINCS project and other public data sources.