BiologicalNetworks: Integrated environment for systems biology

This proposal is a collaboration between the San Diego Supercomputer Center (SDSC) of UCSD and the Keck Graduate Institute to continue the development of BiologicalNetworks and PathSys (www.biologicalnetworks.org), a biomedical analysis environment we have developed over the past three years. Originally developed with NSF funding, BiologicalNetworks, together with its backend data management platform, PathSys, was designed for warehousing, querying, managing and analyzing molecular interaction networks such as protein-protein, protein-DNA and genetic interactions, and has primarily been applied to analyzing the regulatory mechanisms of model organisms. The architecture, however, was deliberately made as general as possible so that extension to a variety of biomedical applications would be natural. Here we propose to extend the scope of the environment to 1) accommodate the workload of a growing user base, and 2) enable the wider variety of data and analyses that our biomedical users are seeking. The proposed extension will turn BiologicalNetworks into a modular, extensible software platform in which a wider variety of complex biological data can be efficiently managed and integrated, new analysis modules can be added, and data sharing and interoperation are enabled. The work will be carried out by a multi-disciplinary team, together with a number of driving biomedical users who actively use this system to solve multiscale problems related to human diseases and model organisms. Our driving biological projects address several needs recognized in the NIH roadmap (for 2008): 1) Microbiome initiatives that would focus on developing a deeper understanding of communities of microbes (fungi, viruses, etc.)
in order to determine how they affect human health; 2) the 'Genetic Connectivity Map' effort to discover and demonstrate the linkages between diseases, drug candidates, and genetic manipulation; 3) proteome analysis tools; and others. The work proposed in this application will be accomplished through three specific aims:

1. Rearchitecting BiologicalNetworks: To create an extensible system in which users can contribute their own analysis modules, share data and analyses, and interoperate with external data and software resources, and which nonetheless behaves as a single synergistic analytical environment, we need to rearchitect the current BiologicalNetworks software. In the new architecture, every type of data will be associated with a module, and users will be able to write new analysis plug-ins that either fit into an existing module or act as workflow scripts that use multiple modules. While the modules and plug-ins can be registered independently, they will be semantically related through ontologies. Any analytical workflow that involves multiple modules will exchange information using a semantically rich exchange protocol.

2. Extension of the current PathSys database system: Here the first goal is to use our driving biological problems to extend our storage and warehousing capabilities to the variety of data mentioned above. Second, the data integration system will be able to ontologically index all stored or imported data objects, using known data-ontology mapping techniques. Third, the system should be able to model multi-scale spatial and temporal events occurring within and between cells, including the biological players participating in them and the event groups that constitute recursively higher-order phenomena. To enable the above, we will provide a more expressive query language for the system and develop query-processing software to support operations on and across these types of data.
The front-end system will provide biological users with an expressive yet intuitive query interface over any combination of objects, properties, paths, graphs, and state transition information.

3. Development of Analysis, Interoperation and Collaboration Modules: To make the system useful, we will develop a number of modules, partitioned into Analytical Modules and Service Modules. Analytical Modules are those commonly used or requested by our current user base for particular kinds of analyses; they will interoperate in the manner of Google 'mash-ups' and Yahoo! Pipes. For example, our users have requested a graph module that finds the most likely pathways mediating a dosage-suppression phenomenon in a model organism; it can interoperate with the microarray module to show the simultaneous expression changes of genes in dosage-suppression pathways. A Service Module is an infrastructural module that lets users perform certain tasks transparently. These include a module for data sharing across multiple instances of BiologicalNetworks; a module to facilitate interoperation with other databases, data management software and external analytical software; a module to facilitate workflow sharing among collaborating users; and a 'reproducible research' module. We will also build a Photoshop-style visualization environment in which layers display different types of heterogeneous biological data.

This proposal is a collaboration between the San Diego Supercomputer Center (SDSC) of UCSD and the Keck Graduate Institute to continue the development of BiologicalNetworks and PathSys (www.biologicalnetworks.org), a biomedical analysis environment we have developed over the past three years, and to extend the scope of the environment to 1) accommodate the workload of a growing user base, and 2) enable the wider variety of data and analyses that our biomedical users are seeking.
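
As an illustration only, the independent registration of modules and plug-ins described under Aim 1, and their semantic relation through ontologies, might be sketched as follows. This is a minimal sketch, not the BiologicalNetworks design: the class names (`Module`, `Registry`), the method names, and the ontology labels are all hypothetical, and "semantically related" is simplified here to "shares at least one ontology term".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Module:
    """One module per data type, tagged with ontology terms (hypothetical labels)."""
    name: str
    ontology_terms: Set[str]
    plugins: Dict[str, Callable] = field(default_factory=dict)


class Registry:
    """Holds independently registered modules and their plug-ins."""

    def __init__(self) -> None:
        self.modules: Dict[str, Module] = {}

    def register_module(self, name: str, ontology_terms: Set[str]) -> Module:
        module = Module(name, set(ontology_terms))
        self.modules[name] = module
        return module

    def register_plugin(self, module_name: str, plugin_name: str,
                        func: Callable) -> None:
        # A plug-in attaches to an existing module; KeyError if it is unknown.
        self.modules[module_name].plugins[plugin_name] = func

    def related_modules(self, name: str) -> List[str]:
        """Names of other modules sharing at least one ontology term (assumed rule)."""
        terms = self.modules[name].ontology_terms
        return sorted(
            other.name
            for other in self.modules.values()
            if other.name != name and terms & other.ontology_terms
        )


registry = Registry()
registry.register_module("graph", {"gene", "interaction"})
registry.register_module("microarray", {"gene", "expression"})
registry.register_module("imaging", {"cell"})

# Hypothetical plug-in: count the interactions in an edge list.
registry.register_plugin("graph", "edge_count", lambda edges: len(edges))

print(registry.related_modules("graph"))
print(registry.modules["graph"].plugins["edge_count"]([("A", "B"), ("B", "C")]))
```

In this sketch the graph and microarray modules are related because both are tagged with the term "gene", which mirrors the example of a graph module interoperating with the microarray module; a real system would presumably reason over ontology structure rather than compare flat term sets.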