This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

ABSTRACT

The BRITE Center has been established by North Carolina as the premier Center of Excellence in biotechnology and drug discovery within NC Central University, an HBCU (Historically Black College/University), to advance education and research in this promising field. As a component of this Center, we are engaged in the research and application of cheminformatics and computational drug discovery tools. Specifically, we aim to conduct the following work using the computing resources of DAC-TeraGrid if we are granted the award.

1. Virtual Computational Drug Screening

Background: Current pharmacophore-based computational virtual screening methods often ignore the intricate details of the binding site shape and focus only on the key pharmacophore elements. They therefore miss critically important information during the virtual screening process, which results in many false positives. For example, large molecules with multiple side chains attached to a central scaffold may be selected as false positives simply because their core structures contain the required pharmacophore elements. This situation may be alleviated if binding site excluded volumes are considered. However, the efficiency of such a process is dramatically reduced by the frequent checks for clashes with the excluded volumes. Thus, there is a great need to increase the efficiency of such methods for database searching while taking into account the volume restraints that help reduce the false positive rate.
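To make the cost of excluded-volume checking concrete, the following is a minimal illustrative sketch and not the implementation used in this work. It assumes that excluded volumes and ligand atoms are both given as spheres `(x, y, z, r)` in ångströms, and it uses a uniform grid so that each atom is tested only against nearby spheres rather than against all of them. The function names, the grid-binning strategy, and the `tolerance` parameter are hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def build_grid(spheres, cell):
    """Bin excluded-volume spheres (x, y, z, r) by the grid cell of their centre."""
    grid = defaultdict(list)
    for s in spheres:
        key = tuple(math.floor(c / cell) for c in s[:3])
        grid[key].append(s)
    return grid


def has_clash(atoms, grid, cell, tolerance=0.0):
    """Return True if any atom (x, y, z, r) overlaps an excluded-volume sphere.

    Only the 27 cells around each atom are inspected. This is valid as long as
    cell >= (largest atom radius + largest excluded-sphere radius).
    `tolerance` (hypothetical parameter) permits a small soft overlap.
    """
    for ax, ay, az, ar in atoms:
        kx, ky, kz = (math.floor(c / cell) for c in (ax, ay, az))
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for sx, sy, sz, sr in grid.get((kx + dx, ky + dy, kz + dz), ()):
                        limit = ar + sr - tolerance
                        dist2 = (ax - sx) ** 2 + (ay - sy) ** 2 + (az - sz) ** 2
                        if limit > 0 and dist2 < limit ** 2:
                            return True
    return False


# Toy data (made up for illustration): two excluded-volume spheres and two ligands.
excluded = [(0.0, 0.0, 0.0, 1.5), (10.0, 0.0, 0.0, 1.5)]
cell = 4.0  # >= 1.7 + 1.5, the largest atom radius plus the largest sphere radius
grid = build_grid(excluded, cell)

compact_ligand = [(3.5, 0.0, 0.0, 1.7)]
bulky_ligand = [(3.5, 0.0, 0.0, 1.7), (2.0, 0.0, 0.0, 1.7)]  # extra side-chain atom

print(has_clash(compact_ligand, grid, cell))
print(has_clash(bulky_ligand, grid, cell))
```

In this toy case the bulky ligand carries the same core atom as the compact one, but its additional atom intrudes into an excluded volume, so it would be rejected even though a pharmacophore-only match would accept both.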
The Computational Method: To address the shortcomings of current pharmacophore methods, we are studying a novel structure-based shape pharmacophore method for virtual screening. It uses a computational geometry algorithm (Delaunay tessellation / alpha-shape analysis) to detect the binding site atoms and generate a negative image of the binding site, which complements the binding site shape. This negative image is represented by a set of spheres of different sizes. There are multiple ways to represent the overall shape of this set of spheres. Currently, we represent it with the OEChem Shape library functions because of the well-known efficiency of their shape matching algorithm. Other computer vision methods are also being explored to help improve the accuracy and efficiency of the shape matching process. The innovation of our method lies in three elements: a rigorous computational geometry algorithm to detect the binding site atoms, a deterministic process to generate the matching negative image, and the representation of this image with OEChem Shape functions. Additional development will include adding more advanced computer vision techniques for shape matching and recognition.

The Computing Plan: The above method will be applied to several selected targets: PDE (phosphodiesterases), HIV reverse transcriptase, nuclear hormone receptors and a few other targets. The method will be validated against the known inhibitors/ligands in the WOMBAT database, which contains over 50,000 molecular structures. The compute-intensive *conformational analysis* of these 50,000 molecules, as well as their *shape matching* against each of the above targets, will be conducted under this proposal. The retrieval rate of known active compounds will be compared with that of ligand-based shape matching (ROCS) and of the FRED docking program (both of which are computationally intensive as well).

2.
Biologically Relevant Molecular Diversity Measure

Background: In diversity analysis of compound libraries, most methods consider only the self-dissimilarity among the compound structures and neglect known information about the biological space revealed by structural genomics projects. This may lead to a highly diverse set of compounds that has no biological effect on most targets. We use the structure-based shape pharmacophore method (see section 1) to evaluate the relevance of a given compound by comparing its shape with a PANEL of shapes derived from a selected set of protein structures. The RATIONALE behind this is that the shapes of functional pockets on protein surfaces or binding sites are often the determinant of a molecule's biological functions. By using a PANEL of shapes derived from biologically relevant protein pockets, we can evaluate whether a molecule might be biologically active. Such a method is especially useful in the context of the NIH Roadmap initiative, where finding chemical probes for biological pathways is the main task.

The Computing Plan: We will compute a HitMap in which the fitness of each molecule in a collection against each of the PANEL shapes is evaluated. Such a HitMap across a collection of protein structures (>100) will essentially build tentative links between protein structures and small molecule collections. For example, the HitMap for the PubChem molecules will be a useful resource for Chemical Genomics investigations, giving researchers a holistic view of what a molecule might do to other proteins in addition to the target of interest. This grant would greatly enhance our computational effort to obtain such a chemical genomics tree (CGTree) that links chemical structures to their potential biological targets, and ultimately help advance the goal of the NIH Roadmap on Chemical Genomics Research.
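The structure of such a HitMap can be illustrated with a small sketch. This is a hedged, simplified stand-in and not the OEChem Shape-based scoring described above: it assumes that every molecule and every PANEL shape is given as a list of spheres `(x, y, z, r)` already aligned in a common coordinate frame, and it scores fitness as a volume Tanimoto coefficient over occupied grid points. A real shape-matching procedure would also optimize the overlay and sample conformers. All function names and the toy data are hypothetical.

```python
import math


def voxelize(spheres, spacing=1.0):
    """Return the set of grid-point indices lying inside any sphere (x, y, z, r)."""
    cells = set()
    for x, y, z, r in spheres:
        lo = [math.floor((c - r) / spacing) for c in (x, y, z)]
        hi = [math.ceil((c + r) / spacing) for c in (x, y, z)]
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                for k in range(lo[2], hi[2] + 1):
                    d2 = ((i * spacing - x) ** 2
                          + (j * spacing - y) ** 2
                          + (k * spacing - z) ** 2)
                    if d2 <= r * r:
                        cells.add((i, j, k))
    return cells


def shape_tanimoto(a, b):
    """Tanimoto coefficient of two voxel sets: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_hitmap(molecules, panel, spacing=1.0):
    """Score every molecule against every panel shape.

    molecules, panel: dicts mapping a name to a list of spheres, assumed
    pre-aligned in the same frame (a simplification made for this sketch).
    Returns a nested dict: hitmap[molecule_name][panel_name] -> score in [0, 1].
    """
    panel_voxels = {name: voxelize(s, spacing) for name, s in panel.items()}
    hitmap = {}
    for mol_name, spheres in molecules.items():
        mol_voxels = voxelize(spheres, spacing)
        hitmap[mol_name] = {
            p_name: shape_tanimoto(mol_voxels, p_voxels)
            for p_name, p_voxels in panel_voxels.items()
        }
    return hitmap


# Toy data (made up for illustration): two pocket shapes and two molecules.
panel = {"pocket_A": [(0.0, 0.0, 0.0, 2.0)], "pocket_B": [(10.0, 0.0, 0.0, 2.0)]}
molecules = {"mol_1": [(0.0, 0.0, 0.0, 2.0)], "mol_2": [(10.0, 0.0, 0.0, 1.0)]}

hitmap = build_hitmap(molecules, panel)
for mol_name, row in hitmap.items():
    print(mol_name, {p: round(score, 3) for p, score in row.items()})
```

Each row of the resulting map is a fitness profile of one molecule across the PANEL, and each column ranks the collection against one protein pocket. The rows are independent of one another, so a computation of this kind can be distributed across many processors.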