We propose a program of research with two interlocking, foundational goals: (1) to develop and evaluate software for information extraction from clinical text corpora using existing Open Biomedical Ontologies (OBO) and (2) to develop and evaluate software for enrichment of existing biomedical ontologies from clinical text corpora. As a result of our work we will deliver the Ontology Development and Information Extraction Toolkit (ODIE), a set of software components integrated with GATE, Protégé, and LexGrid that will assist researchers and ontology developers in performing these tasks. As a testbed for our work, we will focus mainly on the National Cancer Institute Thesaurus, an existing OBO ontology, but will develop many of our components to be generalizable to other OBO ontologies. We have chosen the domain of hematopathology as a test case because of its rich and varied source of clinical documents and the potential for our software to advance translational biomedical research in this area. However, the majority of the components that we develop will be domain-neutral and will generalize to other areas within and outside of oncology. The proposed work makes three significant contributions. First, we will develop novel methods or modify existing methods for accomplishing information extraction and ontology enrichment, and we will evaluate the performance of these alternatives. Second, we will develop and disseminate generic software resources for performing these tasks, which leverage tools supported by the National Center for Biomedical Ontology. Third, we will contribute to the development of existing OBO ontologies. This work will use OBO ontologies in fundamental ways to advance biomedicine. This grant proposes to develop a set of computer tools to assist researchers in (1) extracting meaning from and codifying medical documents, and (2) building formal representations of knowledge from those documents.
This work would benefit the general public by making it faster and more efficient to determine what information a particular medical document contains and by allowing automated processing of large numbers of documents. Additionally, by helping researchers build more comprehensive ontologies, the project would support the development of other software applications. The results of this work may benefit both medical research and patient care.