Adverse drug reactions (ADRs) are a major burden for patients and the healthcare system, causing preventable hospitalizations and deaths and incurring substantial costs. The long-term objective of this proposal is to advance patient safety and reduce costs by discovering novel serious ADRs with automated methods that combine information from large and varied patient populations with information from the literature. Pharmacovigilance has advanced considerably, but more work is needed. For example, Vioxx, a commonly used drug, was recently found to have caused at least 88,000 occurrences of myocardial infarction, highlighting the insufficiency of current methods. To date, methods have depended mainly on single sources of data, primarily the U.S. Food and Drug Administration Adverse Event Reporting System (FAERS) and electronic health records (EHRs). Each source is important but has its own limitations and advantages; combining data across them should therefore lead to more effective drug safety surveillance, both by increasing statistical power and by allowing each source to complement the others. We have already developed methods for each of the single sources, so this is an excellent opportunity to build on our research accomplishments and advance the state of the art in pharmacovigilance.
More specifically, we will a) acquire and combine comprehensive clinical data from the EHRs of two healthcare sites serving diverse populations, using natural language processing (NLP) to obtain vast quantities of fine-grained data, and develop data mining methodologies to detect novel ADR signals in those clinical data; b) analyze differences in therapy-related risk factors between the two EHR populations, such as racial and ethnic differences; c) detect ADR signals in the FAERS database using an established methodology; d) develop improved methods to acquire ADR signals from information in the literature; and e) develop methods that combine the results from the above sources to maximize detection effectiveness. We will focus on eight serious ADRs and assemble a high-quality reference standard for them, so that we can evaluate and compare the performance of the individual detection methods as well as the methods that combine the sources. This proposal is well positioned to overcome the problems of existing automated methods, which rely primarily on individual sources of data. We are confident the methods will be effective because a strong infrastructure is already in place for us to build upon. Most importantly, the methodology developed in this proposal presents an excellent chance to leverage heterogeneous data sources to dramatically improve patient safety and reduce costs.