Abstract

Allergies affect one in five Americans and are the fifth leading chronic disease in the U.S. Each year, allergies account for more than 17 million outpatient office visits. Although documenting and exchanging allergy information in electronic health records (EHRs) is becoming increasingly important, multiple challenges remain: a lack of well-adopted standard terminologies for representing allergies, frequent entry of allergy information as free text, and the absence of an established process for reconciling allergy information. In this study, we will address these challenges through the following specific aims: 1) analyze standard terminologies and a large allergy repository to build a comprehensive knowledge base for representing allergy information; 2) design, develop, and evaluate a natural language processing (NLP) module for extracting and encoding free-text allergy information, and integrate it with an existing NLP system; 3) measure the feasibility and efficiency of the proposed NLP system for the new process of allergy reconciliation; and 4) distribute our methods and tool so that they are widely available to other researchers and healthcare institutions for non-commercial use.