This proposal aims to test the use of open source software technologies to meet the critical need of academic and small- to medium-sized private research laboratories for a robust yet cost-effective solution to their ever-growing data management, distribution, integration and analysis problems. In collaboration with researchers at the University of California, San Francisco (UCSF) Comprehensive Cancer Center, the PI would analyze workflows, develop a prototype specification for an Integrated Cancer Data Management System (ICDMS), design and develop the prototype, and have collaborating users test the prototype to determine how well it achieves program objectives. A successful ICDMS would address the growing need of researchers to integrate distributed cross-disciplinary data sources into coherent knowledge bases for biomedical research. Researchers' prevalent practice of manually referencing related information in federated data sources is error-prone, time-consuming and tedious, and the resulting references must be repeatedly updated. In addition, researchers' source data comes from multiple runs on multiple instruments performed by multiple people. The ICDMS would address these issues with a graphical web-based workspace that allows researchers to organize and manipulate data from different sources, apply visualization and other analytics, query analytical results across distributed data sources for related valuable information, and easily access and share that workspace. Researchers at UCSF will test the ICDMS in a real-life environment. The PI will demonstrate the viability of the ICDMS by focusing on genomic and cytometric data used in breast cancer research, in a representative active research program that generates high-volume data sets and requires related queries across federated databases for effective research results.
Establishing the viability of an online system such as the ICDMS, and its subsequent use, would have profound implications as an enabling technology, with the following impacts: 1. Increasing the rate of progress of a research study or patient analysis, by allowing researchers to focus their efforts more effectively on the most promising data sets; 2. Saving researchers time and increasing accuracy by automatically relating significant data sets to other large data sources; 3. Increasing the continuity of research efforts as laboratory personnel change; 4. Increasing the ability of researchers at different sites or in different disciplines to collaborate.