Combinatorial chemistry, a means of rapidly synthesizing large libraries of molecules, is an active area of research that promises to dramatically accelerate and improve the process of drug discovery. Screening diverse libraries is expected to help medicinal chemists find drug candidates quickly, but exactly what "diversity" should mean for this purpose remains controversial and somewhat vague. The principal objective of this project is to place diversity on a mathematically rigorous footing by constructing a space of chemical descriptors such that biological activity in pharmacologically relevant assays correlates with regions of that space. Overall goals include: 1) the development of a metric for ranking libraries of molecules; 2) the precise assessment of the relative information provided by binary and multi-valued biological measurements; and 3) the development of an improved statistical inference procedure for structure-activity hypothesis generation.

Specific goals for Phase I of the project are the following:
1) The construction of a metric space of pharmacologically relevant descriptors.
2) The assessment of the descriptors using a novel, information-theoretic characterization of the statistical significance of pharmacophores generated using these descriptors.
3) A statistical characterization of the relationship between regions in descriptor space and pharmacological activity.

PROPOSED COMMERCIAL APPLICATION:
Drug discovery is extremely expensive and typically takes 3-7 years. Many of the world's pharmaceutical and biotechnology companies are pursuing the development and use of combinatorial libraries as a more efficient means of finding pharmaceutical leads. The results of the research proposed here would help these companies design their libraries, screen them effectively for new information, and use that information to discover novel therapeutic agents both inside and outside their combinatorial libraries.