To successfully use large linked clinical databases for comparative effectiveness research (CER) requires addressing some key informatics challenges associated with distributed, heterogeneous clinical data. Electronic networks of researchers are part of the solution because they can bridge the physical and organizational divides created by distinct health systems'individual electronic medical records (EMRs). In addition, informatics research has demonstrated the feasibility of automatically coding clinical text, enhancing the capacity to integrate both unstructured and non-standardized clinical data from EMRs. With this study, we propose to develop CER infrastructure, make broadly available the proven MediClass technology for automated classification of EMRs containing both coded data and text clinical notes, and demonstrate the potential of this infrastructure for addressing CER questions within the asthma and tobacco-using patient populations of 6 diverse health systems. Asthma and smoking each impose huge and modifiable burdens on the healthcare system, and multiple morbidities related to asthma and smoking have been targeted by the IOM and AHRQ as priority areas in efforts to improve the healthcare system through comparative effectiveness research. We propose to develop, deploy, operate and evaluate the CER HUB, an Internet-based platform for conducting CER, and to demonstrate its utility in studying clinical interventions in asthma and smoking. Researchers who register to use the HUB, beginning with the research team from the 6 participating study sites, will be able to use a secure website to configure and download MediClass applications addressing CER questions within their respective healthcare organizations, to contribute these IRB-approved, processed datasets back to a centralized data coordinating center to be pooled with data similarly processed from other healthcare organizations, and to use the pooled database to answer diverse comparative effectiveness questions of large, real-world populations. A central function of the CER HUB will be facilitating (through online, interactive tools) development of a shared library of MediClass knowledge modules that afford uniform, standardized coding of EMR data. This shared library of knowledge modules could permit researchers to assess effectiveness in multiple areas of healthcare and gain access to data otherwise locked away in text clinical notes. A goal of the CER HUB is to accelerate creation of standardized knowledge used to normalize heterogeneous EMR data as representations of clinical events for CER. During the project period we will conduct 2 studies using this infrastructure to address the effectiveness of interventions for asthmatics and tobacco users across the 6 participating health systems. As an ongoing resource, the HUB will provide a collaborative development platform for enhancing comparative effectiveness research in potentially any health care domain. CER researchers can build software applications that will process their EMRs, creating standardized datasets permitting CER using a secure website to configure and download MediClass applications addressing CER questions within their respective healthcare organizations, to contribute these IRB-approved, processed datasets back to a centralized data coordinating center to be pooled with data similarly processed from other healthcare organizations, and to use the pooled database to answer diverse comparative effectiveness questions of large, real-world populations PUBLIC HEALTH RELEVANCE: Comparative effectiveness research (CER) requires that clinical data be in standard forms allowing multiple, large databases to be efficiently combined, and requires that all of the data be coded so that automated summarization of the data is possible. However, much of the clinical data necessary for CER is in the text clinical notes written by clinicians when caring for patients. We will build a centralized website where CER researchers can build software applications that will process their electronic medical records, including both the text and coded data, creating standardized datasets permitting comparative effectiveness research. We will demonstrate the utility of this infrastructure by conducting CER studies investigating the effectiveness of interventions in asthma and smoking, across the 6 participating health systems.