During routine medical care, enormous amounts of data are collected in the form of blood counts, blood chemistries, and other biomarkers. Despite this huge investment, remarkably little effort is applied to the interpretation of these data. Outside of medicine, a revolution in the analysis of large datasets has been driven by machine learning techniques in diverse applications ranging from identifying credit card fraud to making recommendations for book purchases. Despite the prominence of bioinformatics in the NIH Roadmap Initiative, these remarkable advances have had little impact on medical care. The broad, long-term objective of this proposal is to optimize, implement, test, and nationally distribute machine learning algorithms that use patterns in large datasets to improve diagnostic and prognostic accuracy in medicine. We propose to use the monitoring of immune suppression during thiopurine therapy for inflammatory bowel disease as a demonstration case. The low therapeutic index of thiopurines makes it important to optimize their dosage in inflammatory bowel disease, and assays for serum metabolites are of limited benefit. Our preliminary data show that machine learning algorithms can be used to substantially improve prognostic accuracy and reduce costs in monitoring thiopurine use. The central hypothesis of this proposal is that there are patterns in the blood counts and blood chemistries associated with effective immune suppression by thiopurine medications, and that these patterns can be used to guide medication dosing. The rationale for this hypothesis is based on two observations. First, our preliminary data demonstrate that machine learning can identify significant changes in immune system activation through analysis of laboratory data. Second, published data suggest that even the less accurate thiopurine metabolite tests are reasonably effective in guiding dose adjustment of thiopurines.
This application proposes the optimization, implementation, and testing of an improved set of thiopurine monitoring algorithms, and the nationwide delivery of the optimized algorithms through the National Cancer Institute-supported LIDDEx (Laboratory Information Digital Data Exchange) architecture. The specific aims of this proposal are to: (1) use longitudinal clinical data and novel mathematical methods to improve the existing algorithm for clinical response to thiopurine therapy, using objective evidence of bowel inflammation as the gold standard; (2) prospectively test whether the thiopurine monitoring algorithms can accurately classify IBD patients who are immunosuppressed and patients who are non-adherent to thiopurine medications, and whether these algorithms can prospectively guide dosing of thiopurines in patients; and (3) implement these revised algorithms on a web server using the LIDDEx grid architecture to enable nationwide clinical use, and field test this implementation in the Ann Arbor VA IBD clinic. The proposed studies will directly impact patient care throughout the United States and, by demonstrating the effectiveness of this informatics architecture, spur further innovation and application of bioinformatics to clinical care.
PUBLIC HEALTH RELEVANCE: This project addresses the NIH Roadmap Initiative goal of applying innovations in bioinformatics to bedside clinical practice. The proposed studies will directly impact the care of patients with inflammatory bowel disease (IBD) throughout the United States. We expect that a successful demonstration of the effectiveness of the LIDDEx clinical informatics architecture will have a broad impact on clinical care beyond IBD by spurring further innovation and application of bioinformatics in a range of medical fields.