Project Lead: Murphy, Susan
Principal Investigator: Kumar, Santosh
TR&D2: Dynamic Optimization of Continuously Adapting mHealth Interventions via Prudent, Statistically Efficient, and Coherent Reinforcement Learning
Lead: Dr. Susan Murphy, Harvard University; 10% effort (1.2CM)

Abstract: The mHealth Center for Discovery, Optimization & Translation of Temporally-Precise Interventions (the mDOT Center) will enable a new paradigm of temporally-precise medicine to maintain health and manage the growing burden of chronic diseases. The mDOT Center will develop and disseminate the methods, tools, and infrastructure necessary for researchers to pursue the discovery, optimization, and translation of temporally-precise mHealth interventions. Such interventions, when dynamically personalized to the moment-to-moment biopsychosocial-environmental context of each individual, will precipitate a much-needed transformation in healthcare by enabling patients to initiate and sustain the healthy lifestyle choices necessary for directly managing, treating, and in some cases even preventing the development of medical conditions. Organized around three Technology Research & Development (TR&D) projects, mDOT represents a unique national resource that will develop multiple methodological and technological innovations and support their translation into research and practice by the mHealth community in the form of easily deployable wearables, apps for wearables and smartphones, and a companion mHealth cloud system, all open-source.

Technology Research and Development project 2 (TR&D2) will address three key limitations of current online reinforcement learning (RL) when applied to personalize mobile interventions to individuals. Two of these limitations relate to the need to increase efficacy and to reduce the negative delayed effects of intervention burden that lead to disengagement.
The third looks to future needs involving the personalization of multiple intervention components, each operating at a different time scale. First, we will address the ever-present mobile health challenge of user disengagement by developing a continuum of approaches between RL algorithms that ignore delayed intervention effects and RL algorithms that attempt to capture noisy delayed intervention effects over a more distant future. Second, we will increase the rate at which personalization occurs by optimally leveraging data across time and across users to personalize the interventions to each user more quickly. Third, we will develop the first RL approaches to personalize multiple intervention components coherently and holistically. In addition, to enhance impact and dissemination, the methods will be developed in close collaboration with three collaborative projects, with an emphasis on model interpretability. We will provide the two service projects and the broader research community with open-source software tools and systems consisting of smartphone and cloud computing components for online personalization. TR&D2 will work in partnership with the other TR&D projects, the Training and Dissemination Core, and the Administration Core to maximize the societal impact of TR&D2 technologies.