ChemBench: the Integrated Web Portal to Accelerate Cheminformatics and Chemical Genomics Research.
Project Summary/Abstract
With the growing proliferation of high-throughput technologies in modern chemical and biological research, experimental scientists who generate large volumes of data are no longer equipped with adequate tools and approaches to manage, let alone analyze, their own data. Modern chemical and biological research requires sophisticated public-domain tools for computational and statistical analysis of large experimental datasets, so that the resulting data models can guide and prioritize further experiments. Cheminformatics has emerged in the last decade as a burgeoning research discipline that combines computational, statistical, and informational methodologies with key concepts in chemistry and biology. Modern cheminformatics is defined broadly as a chemocentric scientific discipline encompassing the creation, retrieval, management, visualization, modeling, discovery, and dissemination of chemical knowledge. Further progress in cheminformatics research is hampered by the lack of publicly available computational tools and software for analyzing chemical genomics data. This project endeavors to fill this gap through an advanced cheminformatics web portal called Chembench. The current system was built in our group over the last four years with limited initial support from the NIH RoadMap planning grant. This project will advance Chembench into an easily extensible and maintainable system that addresses the needs of both computational and experimental chemical genomics scientists by providing a freely available model-building tool for cheminformaticians and freely available compound activity prediction tools for biologists and medicinal chemists.
The computational workflow that incorporates these key procedures and affords their transparent use must be based on rigorous software design approaches and procedures, which constitute the focus of Specific Aim 1 of this project. Building upon this rigorous computational infrastructure, the key functional elements of the workflow include modules for (i) chemical data preparation, including curation, descriptor calculation, and chemical space visualization (Aim 2); (ii) QSAR model development and statistical validation (Aim 3); (iii) prediction of computational hits in external compound libraries (Aim 4); and (iv) analysis and exploration of computational hits (Aim 5). The portal is designed as an aggregate of modules that can be used individually or as part of an automated, integrated workflow, depending on the research interests and specific expertise of users. Much as bioinformatics has transformed modern biomedical research, cheminformatics is poised to revolutionize all areas of research in chemistry and chemical biology. Chembench provides experimental and computational probe/drug discovery researchers with knowledge discovery tools and infrastructure that enable them to explore and exploit chemical genomics databases, build rigorous data models, and reduce the experimental effort required to identify novel biologically active compounds. The publicly available Chembench portal will help translate large-scale chemical genomics research into the discovery of new medicines to improve public health.
PUBLIC HEALTH RELEVANCE: This project intends to develop Chembench, a publicly available cheminformatics portal that provides experimental and computational probe/drug discovery researchers with knowledge discovery tools and infrastructure to explore and exploit chemical genomics databases, build rigorous data models, and reduce the experimental effort required to identify novel biologically active compounds such as chemical probes and drug candidates.