Acute respiratory distress syndrome (ARDS) represents a major global public health problem, with an ICU incidence of 10.4% and an associated mortality rate of 35-45%. Early detection of ARDS is essential for the appropriate application of lung-protective ventilation, the only therapeutic intervention demonstrated to improve mortality in mechanically ventilated patients with established ARDS. In response to this need, tools that electronically monitor arterial blood gas (ABG) results and interpret radiology reports of chest radiograph (CXR) findings for evidence of ARDS criteria in mechanically ventilated patients have been developed and deployed in critical care settings. Although these tools are sufficiently accurate to detect established ARDS, they fail to detect the early stages of disease progression in at-risk patients, before full-blown ARDS develops. Recent initiatives focus on tools that identify patients in the very early stages of ARDS, before widespread alveolar damage has occurred, enabling strategies aimed at preventing ARDS development. Among efforts to use electronic medical record (EMR) data to reliably identify patients at high risk of developing ARDS, the most notable are the Lung Injury Prediction Score (LIPS) and the "early acute lung injury" (EALI) score. Validation studies show that these promising tools discriminate with an area under the ROC curve (AUC) in the 0.7-0.8 range and detect risk early in the course of illness, well in advance of frank respiratory failure. Our vision is to build on these efforts and derive a highly automated early identification algorithm, based on routinely captured medical record data, that reliably and accurately identifies the clinical trajectories that follow exposure to specific classes of risk factors and predicts which patients are at risk of developing ARDS, with an AUC in excess of 0.95.
We believe the predictive performance of existing ARDS screening tools can be significantly improved by: 1) combining highly specific and sensitive cognitive/explanatory models with machine learning, using the models to guide learning by reducing "label noise" and identifying the specific risk-induced trajectories associated with progression to ARDS; 2) exploring the full constellation of predictive risk factors not yet addressed by earlier efforts; 3) using advanced natural language processing technology to extract ARDS risk information from free-text clinical reports; and 4) using enhanced machine learning algorithms that are tolerant of label noise and combine the predictive power of nonparametric algorithms with the feature interpretability associated with parametric techniques.