Electronic medical records and exchanges offer new opportunities for the analysis of population health data;however, new methods in natural language processing (NLP) must first be developed to structure and codify these records, since most medical data is in the form of free text which cannot be stored and manipulated by computers. Once this is accomplished, population health data can be analyzed which will lead to better treatment guidelines, targeted drug therapy, and more cost effective care. Logical Semantics, Inc. (LSI) proposes to develop new statistical NLP methods for analyzing large scale medical domains. These methods will leverage LSI's semantic annotation technology, which has created the largest semantically annotated clinical corpus in the world. LSI's goal is to semantically index large medical record repositories accurately against propositions arranged in knowledge ontologies and make these indices available for text mining applications. The phase one research is focused on three specific aims that will lead to breakthroughs in the science of NLP: (1) Develop new statistical NLP algorithms employing a large semantically annotated medical corpus, (2) Semi-automate knowledge ontology generation, and (3) Develop and combine rule based with statistical NLP algorithms to create a superior hybrid NLP system. The achievement of these aims will result in computer systems that can extract the meaning from free text medical records so researchers, policy makers, and clinicians can use health analytics to improve healthcare. PUBLIC HEALTH RELEVANCE: Natural language processing (NLP) has been successful in extracting specific findings and diagnoses from free text medical records. However, for NLP to be useful in health analytics, methods must be devised to capture most of the findings in a medical record. Logical Semantics, Inc. (LSI) proposes to build new statistical algorithms that can scale against the numerous complex findings in medical reports. LSI will leverage its advanced semantic annotation technology which employs corpus linguistics and sentential logic to build these new algorithms. The goal is to abstract over 80% of a free text records into computer readable form so that researchers can develop new treatment guidelines, improve decision support, and deliver more cost effective care.