The Actor/Critic model has been proposed as a computational solution to the problem of optimizing long-term gains. In this model, the Critic updates the decisions made by the Actor when outcomes deviate from what is expected. Past research shows that midbrain dopamine (DA) neurons signal errors in reward prediction; however, it is unknown how these signals are generated or how they impact decision policies in downstream brain areas. According to the Actor/Critic model, midbrain DA neurons compute prediction errors by comparing the predicted value of reward, signaled by ventral striatum (VS), to the actual value of reward received, but this has not been directly tested. These prediction errors are then thought to modify behavior by updating the action policies of the Actor, the dorsal striatum (DS). Neural correlates in DS include policies related to stimuli, responses and outcomes, but how these correlates are modulated by the DA system during learning remains unknown. Here, these issues will be addressed by recording from single neurons in DS after DA inactivation and from midbrain DA neurons after VS inactivation. The importance of these interactions will be verified by inactivation techniques. Importantly, this circuit has been shown to be abnormal in addiction, which is consistent with the impaired ability of addicts to optimize choice behavior in the face of changing consequences. A final experiment will examine neural correlates of reward predictions, prediction errors and decision policies in rats that have chronically self-administered cocaine; the results will help determine how these neural representations are disrupted after long-term drug exposure.
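
The computation described above can be illustrated with a minimal sketch, assuming a single-state choice task with probabilistic rewards. It is not the analysis or model used in this project. The function name `run_actor_critic`, the learning rates, the softmax policy and the reward probabilities are all hypothetical choices made for illustration. The Critic holds a predicted reward value (the role proposed for VS), the prediction error is the received reward minus that prediction (the signal attributed to DA neurons), and the same error adjusts the Actor's action preferences (the role proposed for DS).

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into choice probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(reward_probs, n_trials=2000,
                     alpha_critic=0.1, alpha_actor=0.1, seed=0):
    """Simulate a one-state Actor/Critic learner (illustrative assumptions only).

    reward_probs[i] is the assumed probability that action i yields reward 1.
    Returns the Critic's final reward prediction and the Actor's final policy.
    """
    rng = random.Random(seed)
    value = 0.0                        # Critic: predicted reward
    prefs = [0.0] * len(reward_probs)  # Actor: one preference per action
    for _ in range(n_trials):
        policy = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=policy)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        delta = reward - value                # prediction error: actual minus predicted
        value += alpha_critic * delta         # Critic updates its prediction
        prefs[action] += alpha_actor * delta  # Actor updates the chosen action's preference
    return value, softmax(prefs)


if __name__ == "__main__":
    value, policy = run_actor_critic([0.2, 0.8])
    print("predicted reward:", round(value, 3))
    print("policy:", [round(p, 3) for p in policy])
```

In this sketch, removing the Critic's prediction (fixing `value` at zero) or blocking the error signal (fixing `delta` at zero) changes how the policy develops, which loosely parallels the logic of the inactivation experiments, although the actual neural effects are what the proposed recordings are designed to measure.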