PROJECT SUMMARY
Individuals with substance use disorders disproportionately experience homelessness, poverty, and chronic medical conditions (e.g., diabetes and hypertension), which are emerging risk factors for contracting SARS-CoV-2, the virus that causes COVID-19. Different types of substance use have been associated with the development of respiratory infections and progression to severe respiratory failure, also known as Acute Respiratory Distress Syndrome (ARDS). However, complex syndromes like ARDS and behavioral conditions like substance misuse are difficult to identify from the electronic health record. Clinical notes and radiology reports are routinely recorded during hospital care and provide a rich source of information that may be used to identify cases of substance misuse and ARDS. Automated, data-driven solutions based on natural language processing (NLP) can extract semantics and important risk factors from this unstructured data. NLP methods derive meaning from clinical notes, from which machine learning (ML) can identify risk factors for patients leaving against medical advice (AMA) or progressing to respiratory failure. Our team developed tools with >80% sensitivity/specificity to identify individual types of substance misuse using NLP with ML. Our single-center models delineated risk factors embedded in the notes (e.g., mental health conditions, socioeconomic indicators). Further, we have developed and externally validated an ML tool to identify cases of ARDS with high accuracy for early treatment. We aim to expand this work by pooling data across health systems and building a generalizable and comprehensive classifier that captures multiple types of substance misuse for use in risk stratification and prognostication during the COVID-19 pandemic.
We hypothesize that a single-model NLP substance misuse classifier will provide a standardized, interoperable, and accurate approach for universal analysis of hospitalized patients, and that such information can be used to identify those at risk for disrupted care and those at risk for respiratory failure. We aim to train and test our substance misuse classifiers at Rush on a retrospective dataset of over 60,000 hospitalizations that have been manually screened with the universal screen, AUDIT, and DAST. This Administrative Supplement will allow us to examine the correlations between substances of misuse and the risk for COVID-19, as well as the development of ARDS in the context of these phenomena.