Splitting the Difference

How Does the Brain Code Reward Episodes?

Authors


Address for correspondence: Brian Knutson, Department of Psychology, Bldg. 420, Jordan Hall, Stanford, CA 94305. Voice: +650-724-2965; fax: +650-725-5699; e-mail: knutson@psych.stanford.edu

Abstract

So, nat'ralists observe, a flea
Hath smaller fleas that on him prey;
And these have smaller still to bite 'em;
And so proceed ad infinitum.
—Jonathan Swift, On Poetry: A Rhapsody, 1733

Animal research and human brain imaging findings suggest that reward processing involves distinct anticipation and outcome phases. Error terms in popular models of reward learning (such as the temporal difference [TD] model) do not distinguish between the updating of expectations in response to reward cues and the updating of expectations in response to outcomes. Thus, correlating a single error term with neural activation assumes that similar neural substrates are recruited at each update. Here, we split the error term to separately model reward prediction and prediction errors, and compare the fit of single versus split error terms to functional magnetic resonance imaging (FMRI) data acquired during a monetary incentive delay task. We hypothesize and find that while the nucleus accumbens computes gain prediction in response to cues, the mesial prefrontal cortex (MPFC) computes gain prediction errors in response to outcomes. In addition to offering a more comprehensive and anatomically situated view of reward processing, split error terms generate novel predictions about psychiatric symptoms and lesion-induced deficits.
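To make the idea of splitting the error term concrete, the following is a minimal sketch and not the authors' actual model. It assumes the textbook TD error, delta_t = r_t + gamma * V(t+1) - V(t), applied to a hypothetical three-step trial (baseline, cue, delay followed by outcome). The state values, reward sizes, step indices, and function names are all illustrative assumptions. A single error term keeps every update in one series; the split version routes the cue-driven update into a "prediction" series and the outcome-driven update into a "prediction error" series, so that each could in principle be correlated with activation separately.

```python
def td_errors(rewards, values, gamma=1.0):
    """Textbook TD error per step: delta_t = r_t + gamma * V(t+1) - V(t).

    The value after the final step is taken to be 0 (end of trial).
    """
    errors = []
    for t, reward in enumerate(rewards):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        errors.append(reward + gamma * next_value - values[t])
    return errors


def split_errors(errors, cue_step, outcome_step):
    """Route the cue-driven and outcome-driven updates into separate series."""
    prediction = [e if t == cue_step else 0.0 for t, e in enumerate(errors)]
    prediction_error = [e if t == outcome_step else 0.0 for t, e in enumerate(errors)]
    return prediction, prediction_error


# Hypothetical trial: step 0 = baseline (cue appears at its end),
# step 1 = cue, step 2 = delay that ends with the outcome.
# Assumed learned values: the cue predicts an expected gain of 0.5.
values = [0.0, 0.5, 0.5]

hit = td_errors([0.0, 0.0, 1.0], values)   # gain of 1.0 delivered
miss = td_errors([0.0, 0.0, 0.0], values)  # gain omitted

hit_prediction, hit_prediction_error = split_errors(hit, cue_step=0, outcome_step=2)
miss_prediction, miss_prediction_error = split_errors(miss, cue_step=0, outcome_step=2)
```

In this toy trial the cue-driven update is identical on hit and miss trials, while the outcome-driven update changes sign, which is the distinction a single combined error series would blur.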
