DESCRIPTION (Taken from application abstract): The long-term aim of this project is to use natural language processing methods to enhance the functionality of the electronic medical record, which is a source of abundant clinical data. However, most of this data is in textual form and is therefore unusable for automated clinical applications such as decision support, research, quality assurance, and outcomes assessment. By using a natural language processor to map the clinical information in the reports into structured, codified clinical data, we will make the data readily accessible to subsequent automated clinical applications. We have already shown that it is possible to build an effective text processor that accurately codifies textual reports within the specialized domain of radiology. In this project we intend to build on that experience by extending the processor to a second limited domain that differs from radiology, and then to a broad domain, in order to study the feasibility of transferring the processor to all of medicine. More specifically, we will broaden the processor so that it codifies clinical information first in the physical examination section of the discharge summary and then in the entire discharge summary, where we will focus on coding diagnoses. Our work will emphasize not only extending the language processor but also its scalability, the evaluation of its performance, the effort required, and its portability. In addition, because discharge summaries are so complex and comprehensive, we will have to extend the formal representational model of the clinical information and develop new natural language processing techniques and new vocabulary development tools. This work will continue to be performed within an operational clinical setting.