Project Summary (Abstract)
With little change in incidence for over 50 years, pneumonia remains the leading cause of morbid hospitalization¹ in the USA and is associated with healthcare costs exceeding $10 billion annually². This study proposes to mine big data captured in an integrated medical/dental electronic health record (iEHR) and enterprise data warehouse (EDW) of a large Midwestern medical-dental integrated healthcare system to test the hypothesis that poor oral health is an independent risk factor for subtypes of community-acquired and hospital-acquired pneumonia. The specific aims are: 1) electronic identification and characterization of pneumonia types and 2) evaluation of the association of oral health status with risk of pneumonia. Tasks to achieve the study aims are to: a) develop electronic, phenotype-based algorithm(s) to classify and characterize pneumonia by subtype and relative frequency of events; b) characterize the impact of immediate and longitudinal oral health status on emergent pneumonia, stratified by subtype; and c) evaluate the relative risk contributed by medical and dental factors. The project proposes innovative application of natural language processing (NLP) to support evaluation of unstructured data and of machine learning (ML) to identify as-yet unknown potential risk factors. These aims will be accomplished by a team of established investigators: dentists and researchers with extensive research track records in oral and systemic health, including pneumonia; a clinical pulmonologist/intensivist who will provide clinical expertise to inform data mining; biomedical informaticians with expertise in data mining, ML, NLP, and data modeling; and experienced biostatisticians who will apply appropriate statistical approaches and traditional data modeling to big data.
This team will collaboratively create and deliver a unique, well-defined, pneumonia-specific oral health data registry resource and a validated phenotype-based algorithm to classify pneumonia, stratified by subtype, which will support future interrogation of additional permutations of medical and dental factors. Study outcomes are expected to deliver immediate translational value within the health system, with high potential for relevance and portability to other settings. The project is expected to define risk factors that may represent actionable targets for reducing pneumonia risk across various settings.