The main objective of this project is to develop, validate, and deliver efficient computational tools for rapid and reliable prediction of biological activity and/or related pharmaceutical properties of drug-like molecules. We plan to develop statistically significant and robust Quantitative Structure-Activity Relationship (QSAR) methodologies, which incorporate rigorous validation procedures and lead to models with high predictive power and practical utility. The methodologies are built upon the similarity principle, i.e., similarity or diversity of chemical structures determines similarity or diversity of their biological action. We argue that chemical similarity should be evaluated in the context of the target property, and we employ objective similarity and diversity functions to achieve biologically meaningful clustering of compounds in the descriptor space. Consequently, our methodologies employ variable selection procedures aimed at identifying the descriptors most relevant to the target property. Paramount to our approach is rigorous model validation with external datasets, which ensures the highest hit rates when predictive QSAR models are ultimately applied to screening chemical databases or virtual libraries for biologically active compounds. The four major areas of concentration in this proposal, corresponding to its Specific Aims, are:

1. Development of novel, mainly non-linear QSAR methods, such as k-nearest-neighbor (kNN) and Support Vector Machine (SVM) approaches. The emphasis will be on the development of novel descriptors of chemical structure, and on the efficiency, automation, and statistical robustness of the underlying methodologies, so that they can be applied to large commercial datasets.

2. Development of efficient and objective QSAR model validation methodologies, which define the domain of model applicability and maximize the predictive ability of the models. These studies should lead to establishing widely accepted "good practices" for building reliable and extensively validated QSAR models.

3. Application of validated QSAR modeling methods to various datasets of pharmacological or pharmaceutical significance in collaboration with experimental investigators. This part also includes the development of new approaches for effective data analysis and model interpretation, based on advanced algorithms for data compression and mapping from high- to low-dimensional descriptor space.

4. Implementation of all modeling methods developed and validated in the course of this work in the publicly accessible UNC QSAR web server.

Successful implementation of this proposal is expected to afford highly automated, predictive, and accessible QSAR modeling tools, which will benefit a broad research community working in the area of drug design and discovery.
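
To make the similarity principle and the notion of an applicability domain concrete, the following is a minimal illustrative sketch, not the methodology proposed here. It assumes compounds are already encoded as numeric descriptor vectors, predicts activity as the unweighted mean over the k nearest training compounds in Euclidean descriptor space, and flags a query as outside the applicability domain when its mean distance to those neighbors exceeds a threshold derived from the training set. The function name `knn_qsar_predict`, the parameter `z_cutoff`, the specific threshold rule, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def knn_qsar_predict(X_train, y_train, X_query, k=3, z_cutoff=0.5):
    """Hypothetical kNN activity prediction with a distance-based applicability domain.

    Returns (predictions, in_domain) for each query compound.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)

    # Euclidean distances from every query compound to every training compound.
    d_qt = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nn_idx = np.argsort(d_qt, axis=1)[:, :k]
    nn_dist = np.take_along_axis(d_qt, nn_idx, axis=1)

    # Similarity principle: predicted activity is the mean activity of the k neighbors.
    predictions = y_train[nn_idx].mean(axis=1)

    # Assumed applicability-domain rule: compare each query's mean neighbor distance
    # with (mean + z_cutoff * std) of k-nearest-neighbor distances within the training set.
    d_tt = np.linalg.norm(X_train[:, None, :] - X_train[None, :, :], axis=2)
    np.fill_diagonal(d_tt, np.inf)  # exclude each compound's distance to itself
    train_nn = np.sort(d_tt, axis=1)[:, :k]
    threshold = train_nn.mean() + z_cutoff * train_nn.std()
    in_domain = nn_dist.mean(axis=1) <= threshold

    return predictions, in_domain

# Toy data (hypothetical): two clusters of compounds in a 2-D descriptor space.
X_train = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
y_train = [1, 1, 1, 1, 5, 5, 5, 5]
X_query = [[0.5, 0.5], [5.5, 5.5], [20, 20]]

predictions, in_domain = knn_qsar_predict(X_train, y_train, X_query)
for x, p, ok in zip(X_query, predictions, in_domain):
    print(x, float(p), "in domain" if ok else "outside domain")
```

In this toy case the third query still receives a numerical prediction from its nearest neighbors, but it is flagged as lying outside the domain, which is the situation in which a prediction should not be trusted when screening a database.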