Project Summary/Abstract
Midbrain dopamine neurons are thought to drive associative learning by signaling reward prediction error (RPE), defined as actual minus expected reward. Building on dopamine RPE signaling, computational and empirical studies have produced detailed models of how reinforcement learning could be implemented in the brain. In particular, the temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning assigns value to features that serially track the passage of elapsed time relative to observable stimuli. In the real world, however, sensory stimuli provide ambiguous information about the hidden state of the environment, leading to the proposal that TD learning might instead operate over an inferred distribution of hidden states (a "belief state"). Although this hypothesis has gained traction in theories of reinforcement learning, empirical evidence is lacking. To test this hypothesis in Aim 1, dopamine neurons will be recorded while mice perform either of two novel classical conditioning tasks. In both tasks, the timing of reward delivery relative to the conditioned stimulus is varied across trials. In the first task, reward is always given. In the second task, reward is occasionally omitted. Preliminary data show a striking difference in dopamine signaling between these two tasks, which is well explained by a model that incorporates the animal's intra-trial inference that reward may be omitted in the second task. These preliminary results provide evidence in favor of an associative learning rule that combines cached values with hidden state inference. Aim 2 then seeks to understand which cortical regions shape hidden state inference in the dopamine system. This Aim will consist of cortical electrophysiology (Aim 2a) and chemogenetic cortical inactivation (Aim 2b) as mice perform the classical conditioning tasks described above.
The results of this proposal will provide critical experimental data towards understanding how reinforcement learning is actually implemented in the brain. This has broad relevance to both basic and translational science. In the healthy brain, robust reinforcement learning ensures that animals can maximize rewards within their environments. In the diseased brain, reinforcement learning may also play an important role. For instance, addiction has been cast as an example of maladaptive and destructive reinforcement learning. Similarly, aberrant dopamine signaling in schizophrenia is thought to underlie the reinforcement of "positive" symptoms such as auditory hallucinations. Therefore, examining the regulation of dopamine signaling and constructing a more accurate model of reinforcement learning are of great importance in understanding both the healthy and diseased brain.