Substantial evidence gathered over the last 50 years shows that non-adherence to treatment poses a crucial barrier to effective care and survival for patients with cancer and other chronic diseases. At least one in five cancer patients does not adhere to the prescribed treatment regimen, and rates are much higher for some specific diseases. This non-adherence, or deviation from the recommended and expected clinical path, can dramatically increase costs of care, hospitalizations, adverse outcomes, and the chance of preventable death. The causes of non-adherence to treatment regimens are not yet rigorously understood. Current adherence research methods rely largely on survey instruments that have limited scale and scope, provide lagging information that inhibits timely intervention, and offer little actionable information to help patients adhere to their care regimens. As cancer treatment continues to change, newer, proactive approaches to surveillance of patient adherence and to targeted intervention are needed. In this Phase 2 SBIR project we will examine the validity of a novel approach, based on the completed Phase 1 project, that uses a computational algorithm to glean fine-grained attributes of cancer patients from standard electronic medical records. Our preliminary work has shown that most electronic medical records contain free-form text describing patient health progress, sentiment, vitals, medical condition, side effects, and social history, written by physicians, nurses, medical assistants, and other staff during every visit. With the steady adoption of electronic medical records by clinicians across the US (currently 29% and rising at 12% per year), clinical notes found in electronic records offer a tantalizing source of insight into patient adherence and behavior.
In this Phase 2 SBIR project we aim to commercialize a novel, scalable prototype that can glean a rich set of risk factors for patient non-adherence from 1.5 million patient encounter records from a community cancer clinic, corresponding to 30,050 patients and spanning a 10-year time horizon. Our objectives are to estimate the risk that a patient will not adhere to a prescribed regimen and to enable targeted and timely interventions through computational analysis of unstructured and structured fields in standard clinical documentation. We also aim to monitor and measure important patient treatment and adherence metrics (e.g., as defined by the American Society of Clinical Oncology) that can play a significant role in tracking high-risk patients to improve treatment outcomes, adherence, quality, and safety. Our approach represents a significant, actionable advance over the lagging indicators offered by the survey-based methods prevalent in adherence research. It has deep implications for improved quality of care, proactive management of chronic diseases, retention of patients in clinical practice and clinical trials, patient safety, improved patient follow-up and risk assessment, drug and disease surveillance, enablement of new care models, targeted intervention, and improved outcomes by helping patients better adhere to their regimens.