Project Summary

With newly available electronic health data and a massive increase in processing power, data-driven personalized medicine is just now becoming possible.1 However, efforts to improve health care are inherently limited by data quality. One of the most widely used sources of data, the patient problem list, is also the greatest source of data inaccuracy. According to recent studies, the patient problem list is often less than 50% accurate in documenting the most critical conditions.2 3 4 5 These errors exacerbate inefficiencies throughout the American health care system, from care delivery to quality improvement. Primary care physicians rely on problem lists to develop transitional treatment plans for the 68 million Americans who change providers every year. Errors related to care transitions harm more than 1.5 million people each year in the United States, costing the nation an estimated $3.5 billion annually.6 Population health efforts, a cornerstone of value-based healthcare, rely on problem lists to determine risk levels and the deployment of resources. These efforts cannot succeed if the source data produce faulty results. This application seeks to enable better individual patient care, enhanced population health management, and effective downstream analytics by building an automated problem list builder that provides an accurate and granular account of the patient's medical conditions. If the program is successful, one of the greatest technical risks in value-based healthcare will be addressed. Phase I exceeded its success criteria in proving the feasibility of core modules in natural language processing (NLP) and artificial intelligence. Based on the success of Phase I, implementation pathways are demonstrated through pilots with one of the largest US healthcare systems and one of the largest global biotechnology firms. The team comprises commercial and academic leaders in the field of NLP-based products applied to value-based healthcare.