Collaborative Drug Discovery, Inc. (CDD) proposes to create a novel web-based software platform that enables scientists to work together effectively to discover and improve new drug leads, while retaining the option not to reveal chemical structures to each other. It will create the first practical system of biocomputational analysis across distributed datasets with different owners, while respecting data privacy. By lowering this key barrier to collaboration, the platform will accelerate the pre-clinical drug discovery pipeline. Research aimed at neglected diseases and orphan indications will especially benefit, because it often relies on the loosely affiliated efforts of academic investigators, non-profit foundations, government laboratories, and small biotechnology firms ("extra-pharma" entities). Such efforts typically lack not only the resources but also the integrated workflows of discovery projects conducted at large pharmaceutical companies (within which data can be shared freely across departments). For the first time, the project will enable researchers focused on neglected diseases and orphan indications to exploit biocomputational tools such as virtual screening and ADME/Tox predictions, which are now considered standard and indispensable components of early discovery workflows within large pharma. It will also make it easier for these extra-pharma researchers to collaborate with large pharma and benefit from large pharma's significant investment in accumulating large, high-quality datasets. In Phase 1 of the proposed SBIR, CDD will leverage ongoing collaborations to prove the feasibility and value of the approach with prospective potency predictions in advance of experimental confirmation. Key collaborators include Prof. Carl Nathan at Weill Cornell Medical College, Dr. Clifton Barry, III, at NIAID, and Allen Casey at the Infectious Disease Research Institute (IDRI). Their groups will serve as an experimental test bed for the project.
They all have ongoing screening programs to discover compounds active against tuberculosis (TB). Specific aims for Phase 1 include:
1. Demonstrate the value to the collaborating screening centers of creating computational TB screening models derived from distributed, heterogeneous collections of data and exploiting the models prospectively to filter and prioritize the molecules scheduled to be screened. Validate the hypothesis that by selecting subsets enriched with active compounds, the centers can efficiently explore more of chemical space than would otherwise be possible with limited resources.
2. Develop initial standards for specifying models (including purpose, inputs, outputs, algorithms, descriptor types, domain of applicability, and other parameters necessary for presentation, interpretation, and exchange) that will form the outline for more comprehensive software prototypes that CDD will iteratively develop, deploy, test, and validate in Phase 2.

PUBLIC HEALTH RELEVANCE: The proposed project will create novel computational tools that will help researchers accelerate the discovery of new and improved drugs against a wide range of diseases. These tools will particularly benefit researchers working on diseases that leading pharmaceutical companies have largely ignored because they are not perceived as highly profitable opportunities, even though in many cases these diseases afflict millions of people.