Project Summary/Abstract
The COVID-19 pandemic is a significant public health problem that will require novel approaches for management and intervention. Knowledge of the disease's transmission, symptomatology, clinical course, treatment, and outcomes is rapidly evolving based on many sources. An important source for advancing this knowledge is data from electronic health records (EHRs) and health information exchanges (HIEs), because they can provide a real-time, unvarnished view of the disease. However, the initially "invisible" nature of the disease makes clear that clinicians and public health personnel were at a significant disadvantage in discovering and quantifying the pandemic. There is an urgent need to learn rapidly from EHR and other data to improve discovery and monitoring of patients infected by the coronavirus. The evolving dynamics and understanding of the incidence and course of COVID-19 require that we develop new methods for discovery from data. The long-term goal of our research is to develop collaborative filtering algorithms to facilitate access to and analysis of clinical data. The goal of this application is to characterize COVID-19 patients through data in a community HIE, specifically the Indiana Network for Patient Care (INPC) within Indiana's HIE (IHIE), and to understand how that characterization differs from the one obtained from the EHRs of individual health systems. Understanding how COVID-19 patients are represented in HIEs and EHRs will build an important foundation for downstream computational activities, such as real-time discovery, public health surveillance, intervention management, and contact tracing. The two specific aims of this project are to (1) extract a cohort of patients suffering from COVID-19 and similar diseases from IHIE and (2) characterize patients along several dimensions, such as demographics, signs and symptoms, and disease course, using both the INPC and separate EHR data sets.
The data, going back to January 1, 2015, will be extracted from the INPC and from the clinical data warehouses at IU Health and Eskenazi Health, two of our major health system partners. As of this writing, 230,749 individuals in Indiana (3.4 percent of the population of 6.73 million) have been tested for the coronavirus, of whom 32,078 (13.9 percent) have tested positive. We will apply computational phenotyping approaches to both HIE and individual EHR data to evaluate the degree to which data from individual EHRs can approximate characterizations based on HIE data. This proposal is significant because it will help us understand how HIE and EHR data can be used to characterize both COVID-19 and non-COVID-19 patients. It is innovative because it applies multiple computational phenotyping methods to both individual organizations' EHR data and HIE data to generate a comprehensive characterization of COVID-19 and non-COVID-19 patients.