Project Summary

The proposed work will accelerate the pace of drug discovery by developing, validating, and testing new methods, tools, and resources for structure-based drug design. Two fundamental challenges of structure-based drug design are the accurate scoring and ranking of protein-ligand structures, which identifies active compounds, and the ability to efficiently search a large number of ligands, which ensures that active compounds are sampled. This proposal will address these challenges by developing a novel approach for protein-ligand scoring and expanding the size of the chemical space that can be efficiently searched during lead optimization. The methods will be validated by their prospective application toward the discovery of new anti-cancer molecules and will be made readily accessible through online resources and open-source tools.

The proposal leverages recent and significant advances in deep learning and image recognition to develop scoring functions that accurately recognize high-affinity protein-ligand interactions. This is achieved by designing and training convolutional neural nets on three-dimensional representations of protein-ligand structures to discriminate between binders and non-binders. Convolutional neural net training will exploit large datasets of affinity and structural data to automatically extract the relevant features necessary to accurately prioritize compounds. Additionally, the proposal develops the first means of fully integrating a convolutional neural net scoring function directly into an energy minimization and docking workflow.

Interactive virtual screening enables the search of millions of compounds in a few seconds so that queries can be interactively optimized. Interactivity enables the synergistic unification of human expert knowledge and efficient computational algorithms. The proposed work will dramatically expand the size of chemical space accessible through interactive virtual screening.
Algorithms for efficiently searching the chemical space of billions or trillions of compounds implicitly defined by a set of reaction schemas and fragments will be created as part of a lead optimization workflow. Fragment-oriented search will be accelerated by a new data structure that combines pharmacophore and molecular shape information into a single sub-linear time index.

The scoring and lead optimization methods developed in this proposal will be released as open-source software and made immediately available through open-access online resources. As part of the prospective validation of the proposed methods, these resources will be used to identify hit compounds and optimize leads for two targets related to cancer metabolism: serine hydroxymethyltransferase and kidney glutaminase isoform C.

Successful completion of the objectives of this proposal will positively impact public health by reducing the cost and time-to-market of developing new drugs, particularly with respect to novel protein targets.
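
To illustrate why a chemical space defined implicitly by reaction schemas and fragments can reach billions or trillions of compounds, the following Python sketch counts and lazily enumerates the combinations implied by a toy library. It is only an illustration under stated assumptions, not the search algorithm proposed here: the schema names, fragment names, and both function names are hypothetical, and each schema is assumed to be a simple list of interchangeable fragments per reactant slot, with no chemical compatibility checks.

```python
from itertools import islice, product
from math import prod

# Hypothetical toy library: each schema maps to one fragment list per reactant slot.
SCHEMAS = {
    "schema_1": [["frag_a1", "frag_a2", "frag_a3"], ["frag_b1", "frag_b2"]],
    "schema_2": [["frag_c1", "frag_c2"], ["frag_d1", "frag_d2"], ["frag_e1"]],
}


def implied_space_size(schemas):
    """Count the products implied by the schemas without enumerating them.

    Each schema contributes the product of its slot sizes; schemas are summed.
    """
    return sum(prod(len(slot) for slot in slots) for slots in schemas.values())


def enumerate_products(schemas):
    """Lazily yield (schema name, fragment tuple) pairs, one per implied product."""
    for name, slots in schemas.items():
        for combo in product(*slots):
            yield name, combo


if __name__ == "__main__":
    # 3 * 2 combinations from schema_1 plus 2 * 2 * 1 from schema_2.
    print(implied_space_size(SCHEMAS))
    # Look at the first few products without materializing the whole space.
    for name, combo in islice(enumerate_products(SCHEMAS), 3):
        print(name, combo)
    # A single three-slot schema with 10,000 fragments per slot implies 10**12 products.
    print(implied_space_size({"big_schema": [range(10_000)] * 3}))
```

Because the count is a product of slot sizes, the library itself stays small while the implied space grows multiplicatively, which is why the space is best left implicit and searched at the fragment level rather than enumerated.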