Abstract
Virtual screening is the most practical method to leverage ligand and protein structures for lead discovery. Unfortunately, both ligand-based and docking techniques are inaccessible to most investigators. A key result from the first period, the ZINC database, has lowered the barrier to entry for docking by providing publicly accessible 3D screening libraries. The Similarity Ensemble Approach (SEA), also developed in the first period, has shown early promise for ligand-based target identification. Still, virtual screening remains difficult for most investigators to use. To lower these barriers further, we will develop databases and automated tools for use by the general community, and investigate their usefulness in proof-of-concept studies.
The specific aims are:
1. To develop databases that derive from and enable virtual screening.
A. We will develop a database of pre-calculated docking hits for about 1,000 protein targets that can simply be looked up and purchased. This will rely on automated tools for docking, hit evaluation, and comparisons among targets (Aim 2). We will also improve the databases for virtual screening developed in the first period. These include:
B. Expanding ZINC by adding more commercially available compounds and improving the structures represented in it.
C. Improving the robustness of DUD, a general benchmarking set for virtual screening.
D. Expanding the database of high energy intermediates (HEI) developed in the first period for protein function prediction.
2. To create simple web-based tools for ligand-based and protein-based virtual screening. We will develop and refine two web-based tools to enable non-specialists to discover ligands for their targets.
A. For structure-based docking, we will build a simple web interface that guides the user, selects parameters, calibrates the model, and manages the calculation on our cluster. We will also develop automated tools to evaluate the reliability of docking results.
B.
The second virtual screening tool is ligand-based, for use when the structure of the target is unknown but many ligands are available, or when one wants to explore alternate targets for a known drug or reagent. We will further develop SEA, a novel cheminformatic method introduced in the last period, to predict target relationships and off-target effects. This approach has had precocious success in identifying interesting polypharmacology. We will also use it ourselves to predict and test clinically relevant off-target effects of 50 to 100 FDA-approved drugs, and to identify the targets of the ~10% of FDA-approved drugs for which no target is known.
PUBLIC HEALTH RELEVANCE: Virtual screening is widely used to discover new molecular leads for drug discovery and reagents to understand biological processes. Unfortunately, the technique remains difficult to use and has thus been restricted to a few expert laboratories, limiting its usefulness. In this proposal, we will create databases and tools to bring virtual screening to a wide biological audience, greatly expanding its impact and usefulness.