This project began as part of a joint effort between the CADD Group and several groups at the Department of Defense (DoD), titled "Computational platforms for transforming small molecules into investigational new drugs." The project's lead PI on the DoD side was Dr. S. Anders Wallqvist, Tri-Service Biotechnology High-Performance Computing Software Applications Institute for Force Health Protection (BHSAI), Telemedicine and Advanced Technology Research Center (TATRC), U.S. Army Medical Research and Materiel Command (USAMRMC), 2405 Whittier Drive, Suite 200, Frederick, MD 21702. Other participating groups were at the Department of Biochemistry, Walter Reed Army Institute of Research (WRAIR), and the Department of Cell Biology and Biochemistry, U.S. Army Medical Research Institute of Infectious Diseases (USAMRIID). The aim of the overall project was to integrate three fundamental aspects of the preclinical drug development phase: structure-based drug design, analysis and prediction of pharmacological data, and prediction of adverse and off-target effects from chemical structures, in particular those related to drug metabolism. The work effectively started in early 2010. Former CADD Group member Dr. Pugliese worked on implementing a resource for predicting the metabolism and metabolites of drug-like small molecules as part of our computer-aided drug design capabilities, until his departure from NCI for a permanent position in June 2011. The most important aspect of his work concerned metabolism and metabolites.
While the initial tests and applications of these resources were done in the context of pathogens of interest to DoD, the general capability of predicting the metabolic stability, metabolization profile, and specific metabolites of a small molecule is applicable to all types of drug development. It is therefore very useful in the development of anti-cancer therapeutics aimed at molecular targets of high interest to NCI, as well as in, e.g., NCI's anti-HIV drug design projects. For this reason, the project has been continued after the formal collaboration with the DoD groups ended in summer 2011. The first phase of the project has been successfully completed. It consisted of canvassing the field for predictive computer tools and for data sets that can be used to test these tools and to develop better predictive models. Both commercial and free resources have been compiled or acquired. A comparison and benchmark study was completed, and the resulting paper has since been published. In this part of the project, we focused on the prediction of metabolic stability data such as half-life values in human liver microsome or human hepatocyte assays. The paper also includes a small benchmark study of predictions of cytochrome P450 interactions (substrates, inhibitors, and inducers). In the second, more applied phase of the project, we developed QSAR models for the metabolic stability of compounds, based on in vitro half-life assay data measured in human liver microsomes. A variety of QSAR models were generated using different statistical methods and descriptor sets implemented in both open-source and commercial programs (KNIME, GUSAR, StarDrop). The resulting models were compared using four different external validation sets from public and commercial data sources, including two smaller sets of in vivo half-life data in humans.
The most predictive models were used to predict the metabolic stability of compounds from the Open NCI Database, and the results have been made publicly available on the NCI/CADD Group web server (http://cactus.nci.nih.gov). Both this study and the paper mentioned above have been published in the journal Future Medicinal Chemistry. Current efforts focus on broadening our predictive capabilities to models of all types of properties in the area of absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) of small molecules. Recently, Dr. Alexey Zakharov has made available a suite of predictive models for physicochemical properties, toxicities, and some biological activities in the form of the Chemical Activity Predictor (CAP) web service on the NCI/CADD Group web server. In the context of this topic, CADD Group members have also developed and improved general QSAR-related approaches and algorithms. They have also analyzed how (Q)SAR models depend on the ability to mix and match assay data coming both from specific projects and from large public databases such as PubChem and ChEMBL. Further analyses of (Q)SAR data and approaches have been performed with our Russian colleagues. In addition, large ADME/Tox computations are being performed for molecules from the SAVI project (Project 6).