Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action

Bernard W Balleine, John P O'Doherty

Abstract

Recent behavioral studies in both humans and rodents have found evidence that performance in decision-making tasks depends on two different learning processes: one encoding the relationship between actions and their consequences and a second involving the formation of stimulus-response associations. These learning processes are thought to govern goal-directed and habitual actions, respectively, and have been found to depend on homologous corticostriatal networks in these species. Thus, recent research using comparable behavioral tasks in both humans and rats has implicated homologous regions of cortex (medial prefrontal cortex/medial orbital cortex in humans and prelimbic cortex in rats) and of dorsal striatum (anterior caudate in humans and dorsomedial striatum in rats) in goal-directed action, and a distinct region of dorsal striatum (posterior lateral putamen in humans and dorsolateral striatum in rats) in the control of habitual actions. These learning processes have been argued to be antagonistic or competing because their control over performance appears to be all or none. Nevertheless, evidence has started to accumulate suggesting that they may at times compete and at others cooperate in the selection and subsequent evaluation of actions necessary for normal choice performance. It appears likely that cooperation or competition between these sources of action control depends not only on local interactions in dorsal striatum but also on the cortico-basal ganglia network within which the striatum is embedded and that mediates the integration of learning with basic motivational and emotional processes. The neural basis of the integration of learning and motivation in choice and decision-making is still controversial, and we review some recent hypotheses relating to this issue.

Figures

Figure 1
(a) Photomicrograph of an NMDA-induced cell body lesion of prelimbic prefrontal cortex (right hemisphere) and approximate region of lesion-induced damage (orange oval; left hemisphere) found to abolish the acquisition of goal-directed action in rats (cf. Balleine and Dickinson, 1998a, 1998b; Corbit and Balleine, 2003a, 2003b; Ostlund and Balleine, 2005). (b) Region of human vmPFC (here medial OFC) exhibiting a response profile consistent with the goal-directed system. Activity in this region during action selection for a liquid food reward was sensitive to the current incentive value of the outcome, decreasing in activity during the selection of an action leading to a food reward devalued through selective satiation compared to an action leading to a non-devalued food reward. From Valentin et al (2007). (c) Regions of human vmPFC (medial prefrontal cortex and medial OFC) exhibiting sensitivity to instrumental contingency and thereby exhibiting response properties consistent with the goal-directed system. Activation plots show areas with increased activity during sessions with a high contingency between responses and rewards compared with sessions with low contingency. From Tanaka et al (2008). (d) Photomicrographs of NMDA-induced cell-body lesions of dorsomedial and dorsolateral striatum (right hemisphere) with the approximate region of lesion-induced damage illustrated using red and purple circles, respectively (left hemisphere). This lesion of dorsomedial striatum has been found to abolish acquisition and retention of goal-directed learning (cf. Yin et al, 2005), whereas this lesion of dorsolateral striatum was found to abolish the acquisition of habit learning (Yin et al, 2004). (e) Region of human anterior dorsomedial striatum exhibiting sensitivity to instrumental contingency from the same study described in panel c. (f) Region of posterior lateral striatum (posterior putamen) exhibiting a response profile consistent with the behavioral development of habits in humans. From Tricomi et al (2009).
Figure 2
(a) Evidence reviewed in the text suggests that distinct neural networks mediate the acquisition of goal-directed actions and habits and the role of goal values and of Pavlovian values in the motivation of performance. On this view, habits are encoded in a network involving sensory-motor (SM) cortical inputs to dorsolateral striatum (DL), with feedback to cortex through substantia nigra pars reticulata/internal segment of the globus pallidus (SNr/GPi) and posterior thalamus (PO) and are motivated by midbrain dopaminergic inputs from substantia nigra pars compacta (SNc). A parallel circuit linking medial prefrontal cortex (MPC), dorsomedial striatum (DM), SNr, and mediodorsal thalamus (MD) mediates goal-directed actions that may speculatively involve a dopamine-mediated reward process. Finally, choice between actions can be facilitated both by the value of the goal or outcome associated with an action, likely involving amygdala inputs to ventral striatum, MPC and DM, and by Pavlovian values mediated by a parallel ventral circuit involving orbitofrontal cortex (OFC) and ventral striatal (VS) inputs into the habit and goal-directed loops. (b) Various theories have been advanced, based on rat and primate data, regarding how limbic, cortical, and midbrain structures interact with the striatum to control performance (see text). Here, dopaminergic (DA) feedforward and feedback processes are illustrated involving VTA-accumbens shell and core and SNc-dorsal striatal networks. The involvement of the BLA in reward processes is illustrated, as is the hypothesized involvement of the infralimbic cortex (IL) and central nucleus of the amygdala in the reinforcement signal derived from SNc afferents on dorsolateral striatum.
Figure 3
Associations that current evidence suggests are formed between various stimuli (S), actions (R), and outcomes (O) or goals (G) and goal values (V) during the course of acquisition or performance of goal-directed action. (i) Current research provides evidence for both R–OG and OS–R associations in the control of performance. Whereas the OS–R association does not directly engage or influence changes in outcome value, the R–OG association directly activates an evaluative process (V). Performance (RE) relies on both R–O and O–R processes. The necessity of the evaluative pathway in response initiation can be overcome by increasing the contribution of the selection process either by presenting a stimulus separately trained with the outcome (ie, SCS–OS as in Pavlovian-instrumental transfer experiments) or by strengthening the OS–R association itself by selective reinforcement through overtraining.

Source: PubMed
