Early detection of disease outbreaks can decrease patient morbidity and mortality and minimize the spread of diseases. Early detection requires accurate classification of patient symptoms early in the course of their illness. One approach is biosurveillance, in which electronic symptom data are captured early in the course of illness, and analyzed for signals that might indicate an outbreak requiring investigation and response by the public health system. Emergency department (ED) patient records are particularly useful for biosurveillance, given their timely, electronic availability. ED data elements used in surveillance systems include the chief complaint (a brief description of the patient's primary symptom(s)), and triage nurses'note (also known as history of present illness).The chief complaint is the most widely used ED data element, because it is recorded electronically by the majority of EDs. One study showed that adding triage notes increased the sensitivity of biosurveillance case detection. The increased sensitivity is because the triage note increases the amount of data available: instead of one symptom in a chief complaint (e.g., fever), triage notes may contain multiple symptoms (e.g., "fever, cough &shortness of breath for 12 hours"). Surveillance efforts are hampered, however, by the wide variability of free text data in ED chief complaints and triage notes. They often include misspellings, abbreviations, acronyms and other lexical and semantic variants that are difficult to group into symptom clusters (e.g., fever, temp 104, fvr, febrile). Tools are needed to address the lexical and semantic variation in symptom terms in ED data in order to improve biosurveillance. Natural language processing tools have been shown to facilitate concept extraction from more structured clinical data such as radiology reports, but there has been limited application of these techniques to free text ED triage notes. The project team developed the Emergency Medical Text Processor (EMT-P) to preprocess the chief complaint. EMT-P cleans and normalizes brief chief complaint entries and then extracts standardized concepts, but it is not sufficient in its current state to preprocess longer, more complex text passages such as triage notes. This proposed project will further strengthen biosurveillance by adapting EMT-P and other statistical and classical natural language processing tools to develop a system that extracts concepts from triage notes for biosurveillance.