An efficient drug research and development process requires an in-depth understanding of the molecular, cellular, and toxicological properties and mechanisms of a drug candidate. The relevant information is scattered among thousands of scientific publications, and bringing it together is becoming a bottleneck in modern biomedical research and drug discovery. Data search, access, and analysis can be made far more efficient by the development of domain-specific knowledge bases. A knowledge base is a comprehensive database of information, structured to optimize data access and to make the data amenable to sophisticated logical queries, computational analyses, and data mining. However, existing data resources in pharmacology and toxicology only partially satisfy these criteria. Because they depend on human experts, these resources usually store only a very limited amount of information about certain aspects of the mechanisms and toxicity of well-studied drugs. The bulk of biological, pharmacological, and clinical information remains available only in free-text form, and its volume continues to grow exponentially. The latest advances in automated information extraction techniques offer a unique opportunity to bridge the gap between free text and structured knowledge in the drug development field. We have developed MedScan, a flexible protein information extraction technology. We now propose to extend its scope into the metabolic, toxicological, and clinical areas and to build a prototype of a knowledge base supporting drug discovery and development research. The knowledge base will bring together molecular, cellular, and clinical aspects of drug action mechanisms, metabolism, and toxicity. It will be populated, and automatically updated, with information extracted from publicly available literature resources (such as PubMed, ToxLine, and full-text journals) using the extended MedScan technology.
The value of the proposed system is two-fold. First, the knowledge base can serve as a highly integrated cross-domain resource providing fast and efficient access to the latest information about all aspects of drug mechanisms, metabolism, and toxicity. Second, the collected knowledge can be used as a training set for building computational models that predict toxicity. We believe that such a knowledge base can facilitate research in the pharmaceutical and drug discovery fields and will be highly useful to many researchers in these areas.

An efficient drug research and development process requires an in-depth understanding of the molecular, cellular, and toxicological properties and mechanisms of a drug candidate. Collecting this information is a daunting task, since it is scattered among thousands of scientific publications and reports. In this proposal we outline how to apply the most recent advances in information extraction techniques to create and automatically maintain a knowledge base that brings together molecular, cellular, and clinical aspects of drug action mechanisms, metabolism, and toxicity. The knowledge base can serve as a highly integrated cross-domain resource providing fast and efficient access to the latest information in this field. The collected knowledge can also be used as a training set for building computational models that predict toxicity. We believe that such a knowledge base can facilitate research in the pharmaceutical and drug discovery fields and will be highly useful to many researchers in these areas.