This application addresses broad Challenge Area (10) Information Technology for Processing Health Care Data and specific Challenge Topic 10-LM-101: Informatics for post-marketing surveillance. The overall goal of this study is to develop a generalizable framework for studying medication side effects recorded in narrative medical documents. We will implement and test this system on the example of epidemiologic characterization of side effects of HMG-CoA reductase inhibitors (a.k.a. statins). Statins are the most commonly used class of medications for treatment of hypercholesterolemia in the U.S. In randomized clinical trials statins are associated only with a slight increase in adverse reactions and no increase in discontinuation of treatment compared to placebo. However, in clinical practice the rates of side effects and discontinuation appear significantly higher and represent a major barrier to a critical, potentially lifesaving therapy. For example, myalgias are reported to be relatively rare in clinical trials but are thought to be more common in clinical practice. Additionally, a number of other statin-associated complaints reported anecdotally but not well elucidated in clinical trials include depression, irritability, and memory loss among others. Most of these have been poorly epidemiologically characterized and their prevalence and risk factors remain unknown. Structured electronic medical record (EMR) and administrative data have been used to study medication side effects. However, structured data have important limitations. They may not contain temporal or causative information necessary to link particular problems to medications and may not be sufficiently granular to identify specific adverse reactions. Narrative EMR data, such as provider notes, can provide documentation of causative links between medication and adverse events at high levels of granularity. Natural language processing (NLP) is an emerging technology that enables computational abstraction of information from narrative medical documents. In prior work we have successfully applied natural language processing to abstract medication information from narrative provider notes, including medication intensification, medication non-adherence and medication discontinuation. We will leverage these tools and the extensive EMR infrastructure at Partners HealthCare to develop and test a natural language processing system to study medication side effects. We will validate this system on the example of studying epidemiology of adverse reactions to statins. The findings of this project will lay the foundation for an open-source system that can be used for post-marketing surveillance of medication side effects using narrative EMR data. PUBLIC HEALTH RELEVANCE (provided by applicant): Frequency and risk factors for side effects of statins (medications used to treat high cholesterol) in everyday medical practice (as opposed to research studies) are not known. In this project we will design a system for analyzing the information about statin side effects in the electronic medical records. If successful, this approach can be subsequently generalized to study side effects of many other medications.