Errors in medical documents represent a critical issue that can adversely affect healthcare quality and safety. Physician use of speech recognition (SR) technology has risen in recent years due to its ease of use and efficiency at the point of care. However, high error rates, upwards of 10-23%, have been observed in SR-generated medical documents. Error correction and content editing can be time consuming for clinicians. A solution to this problem is to improve accuracy through automated error detection using natural language processing (NLP). In this study, we will provide solutions to these challenges by addressing the following specific aims: 1) build a large corpus of clinical documents dictated via SR across different healthcare institutions and clinical settings; 2) conduct error analysis to estimate the prevalence and severity of SR errors; 3) develop innovative methods based on NLP for automated error detection and correction and create a comprehensive knowledge base that contains confusion sets, error frequencies and other error patterns; 4) evaluate the performance of the proposed methods and tool; and 5) distribute our methods and findings to make them available to other researchers. We believe this application aligns with AHRQ's HIT and Patient Safety portfolios as well as AHRQ's Special Emphasis Notice to support projects to generate new evidence on health IT system safety (NOT- HS-15-005).