Medical errors have been shown to be the third leading cause of death in the United States. The Institute of Medicine and several state legislatures have recommended the use of patient safety event reporting systems (PSRS) to better understand and mitigate safety hazards. Numerous healthcare providers have adopted these systems, which provide a framework for healthcare provider staff to report patient safety events. Public databases like MAUDE and VAERS have also been created to collect safety events and track their trends across healthcare systems. A patient safety event (PSE) report generally consists of both structured and unstructured data elements. Structured data are pre-defined, fixed fields that solicit specific information about the event. The unstructured data generally include a free-text field where the reporter can enter a text description of the event. The text descriptions are often a rich data source in that the reporter is not constrained to limited categories or selection options and is able to freely describe the details of the event. The goal of this project is to develop novel statistical methods to analyze unstructured text, such as the patient safety event reports arising in healthcare, which can lead to significant improvements in patient safety and enable timely intervention strategies. We address three problems: (a) building realistic and meaningful baseline models for near misses, and detecting systematic deterioration of adverse outcomes relative to such baselines; (b) understanding critical factors that lead to near misses and quantifying severity of outcomes; and (c) identifying document groups of interest. We will use novel statistical approaches that combine Natural Language Processing with Statistical Process Monitoring, Statistical Network Analysis, and Spatio-temporal Modeling to build a generalizable toolbox that can address these issues in healthcare.
An important advantage of our research team is the involvement of healthcare domain experts and access to frontline staff, and we will leverage this strength to develop our algorithms. A key feature of our work is the generalizability of our methods, which will be applicable to biomedical documents arising across a wide variety of areas, such as patient safety and equipment malfunction reports, electronic health records, and adverse drug or vaccine reports. We will also release open source software via R packages and GitHub, which will enable healthcare staff and researchers to execute our methods on their own datasets.