Humans and other animals can choose their actions using multiple learning algorithms and decision-making strategies. For example, habitual behaviors adapted to a stable environment can be selected using so-called model-free reinforcement learning algorithms, in which the value of each action is incrementally updated according to the amount of unexpected reward (the reward prediction error). The underlying neural mechanisms for this type of reinforcement learning have been intensively studied. By contrast, how the brain uses the animal's knowledge of its environment to plan sequential actions using a model-based reinforcement learning algorithm remains unexplored. In this application, PIs with complementary expertise will investigate how different subdivisions of the primate prefrontal cortex contribute to the evaluation and arbitration of different learning algorithms during strategic planning, using a sequential game referred to as 4-in-a-row. Previous studies have revealed that, with training, humans improve their competence in this game by gradually switching from model-free reinforcement learning towards model-based reinforcement learning in the form of a tree search. In the first set of experiments, we will train non-human primates to play the 4-in-a-row game against a computer opponent. We predict that the complexity of strategic planning, and opponent moves that violate the animal's expectations, will be reflected in the speed of the animal's actions and in its pupil diameter. Next, we will test how the medial and lateral aspects of the prefrontal cortex contribute to the evaluation and selection of different learning algorithms during strategic interaction between the animal and the computer opponent.
We hypothesize that the lateral prefrontal cortex is involved in computing the integrated values of alternative actions originating from multiple sources and in guiding the animal's choice, whereas the medial prefrontal cortex might be more involved in monitoring and resolving discrepancies between the actions favored by different learning algorithms. The results from these experiments will expand our knowledge of the neural mechanisms for complex strategic planning and unify various approaches to studying naturalistic behaviors. By taking advantage of recent advances in machine learning and decision neuroscience, the proposed studies will elucidate how multiple learning algorithms are simultaneously implemented and coordinated via specific patterns of activity in the prefrontal cortex. The results from these studies will transform the behavioral and analytical paradigms used to study high-order planning and its neural underpinnings in humans and animals.