How do humans and animals use past experience to guide future decisions? Computational models from reinforcement learning provide a useful framework for answering this question. A prominent theory has used model-free reinforcement learning to describe how midbrain dopamine neurons incrementally, over many experiences, compute the average reward received for performing an action. However, this theory cannot explain the ability of animals and humans to flexibly change their preference for an action after learning that the hedonic value of its outcome has changed. This ability, referred to as goal-directed action, requires anticipating the consequences of an action and evaluating how those consequences align with one's goals. Determining the neural mechanisms underlying goal-directed action is of critical importance to public health, as a wide range of psychiatric and neurological disorders, including obsessive-compulsive disorder, Parkinson's disease, and schizophrenia, have been associated with deficits both in this ability and in the neural circuitry believed to carry it out. Model-based reinforcement learning, a framework for choosing actions by forecasting and evaluating their long-term consequences, offers a promising theoretical account of goal-directed action. However, the computations prescribed by model-based reinforcement learning are so laborious that any physical system with limited computational resources must take steps to simplify them. This proposal describes a biologically plausible mechanism by which the brain simplifies the computations required by model-based reinforcement learning in order to perform goal-directed action. Our core hypothesis is that the brain simplifies model-based computations by storing and reusing aggregate multi-step predictions about action consequences.
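The contrast above can be made concrete with a minimal sketch. The code below is an illustration only, not the proposal's task or model: the update rule, learning rate, and outcome probabilities are assumed for demonstration. It shows why a model-free incremental average adjusts slowly, whereas a cached multi-step prediction of an action's outcomes can be revalued immediately when an outcome's value changes, as in goal-directed action.

```python
def model_free_update(q, reward, alpha=0.1):
    """Incrementally nudge the value estimate toward the received reward,
    so that q converges to the average reward only over many experiences."""
    return q + alpha * (reward - q)

# Model-free learning: many repetitions are needed for q to approach
# the true average reward (here, 1.0).
q = 0.0
for _ in range(50):
    q = model_free_update(q, reward=1.0)

# Cached aggregate prediction: the expected long-run occupancy of each
# outcome given an action (numbers assumed for illustration). The action's
# value is recomputed on demand from current outcome values.
outcome_occupancy = {"outcome_A": 0.9, "outcome_B": 0.1}

def goal_directed_value(occupancy, outcome_values):
    """Combine the stored multi-step prediction with current outcome values."""
    return sum(p * outcome_values[o] for o, p in occupancy.items())

# Before devaluation, outcome_A is worth 1.0; after devaluation it is
# worth 0.0, and the cached prediction revalues the action in one step.
v_before = goal_directed_value(outcome_occupancy,
                               {"outcome_A": 1.0, "outcome_B": 0.0})
v_after = goal_directed_value(outcome_occupancy,
                              {"outcome_A": 0.0, "outcome_B": 0.0})
```

The design point is that the expensive part of model-based evaluation (forecasting where an action leads) is stored once, while the cheap part (weighting outcomes by their current value) is redone at decision time.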
We have designed a multi-step reinforcement-learning task in which this proposed mechanism generates a recognizable behavioral signature, and we present preliminary data suggesting that humans display this signature. In Aim 1, we propose using multivariate analysis of functional magnetic resonance imaging (fMRI) data to evaluate whether neural activity at the time of decision-making supports the proposed mechanism. In Aim 2, we will analyze the behavior of patients with temporal lobe lesions in order to causally link the proposed mechanism to a neural substrate. Through the proposed work, the Principal Investigator, who is already experienced in computational modeling and behavioral analysis, will acquire expertise in fMRI and neuropsychological methods. This work will form the basis of the remainder of the Principal Investigator's graduate dissertation and will be carried out over the next two years. The co-sponsors will provide guidance throughout the process, particularly with the analysis of neuroimaging data. The results will be presented at scientific meetings and, when the work is complete, published and made available to the public.