An Integrative Perspective on the Role of Dopamine in Schizophrenia

Tiago V Maia, Michael J Frank

Abstract

We propose that schizophrenia involves a combination of decreased phasic dopamine responses for relevant stimuli and increased spontaneous phasic dopamine release. Using insights from computational reinforcement-learning models and basic-science studies of the dopamine system, we show that each of these two disturbances contributes to a specific symptom domain and explains a large set of experimental findings associated with that domain. Reduced phasic responses for relevant stimuli help to explain negative symptoms and provide a unified explanation for the following experimental findings in schizophrenia, most of which have been shown to correlate with negative symptoms: reduced learning from rewards; blunted activation of the ventral striatum, midbrain, and other limbic regions for rewards and positive prediction errors; blunted activation of the ventral striatum during reward anticipation; blunted autonomic responding for relevant stimuli; blunted neural activation for aversive outcomes and aversive prediction errors; reduced willingness to expend effort for rewards; and psychomotor slowing. Increased spontaneous phasic dopamine release helps to explain positive symptoms and provides a unified explanation for the following experimental findings in schizophrenia, most of which have been shown to correlate with positive symptoms: aberrant learning for neutral cues (assessed with behavioral and autonomic responses), and aberrant, increased activation of the ventral striatum, midbrain, and other limbic regions for neutral cues, neutral outcomes, and neutral prediction errors. Taken together, then, these two disturbances explain many findings in schizophrenia. We review evidence supporting their co-occurrence and consider their differential implications for the treatment of positive and negative symptoms.

Keywords: Computational psychiatry; Dopamine; Negative symptoms; Prediction error; Psychosis; Reinforcement learning; Schizophrenia.

Conflict of interest statement

DISCLOSURES

TVM has no biomedical financial interests or potential conflicts of interest. MJF is a consultant for F. Hoffmann-La Roche Pharmaceuticals.

Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

Figures

Figure 1
Amphetamine, at high doses, increases spontaneous dopamine (DA) transients while simultaneously blunting adaptive transients for relevant stimuli, as measured by fast-scan cyclic voltammetry in the striatum. (A) A high dose of amphetamine (right) markedly increases the number of spontaneous transients (red asterisks) relative to the unmedicated state (left). (B) A high dose of amphetamine markedly increases the frequency, amplitude, and duration of spontaneous transients. Values are shown as percent increases over the predrug state. (C) A reward-predicting cue (presented at time 0) elicits a cue-locked transient in the unmedicated state and under saline (left two panels). A moderate dose of amphetamine increases this transient (third panel), but a large dose of amphetamine virtually abolishes it (right panel). (D) Even though a high dose of amphetamine virtually abolishes the adaptive transient for the reward-predicting cue, it markedly increases spontaneous transients in the same task (measured in the 10 seconds before cue presentation). Adapted, with permission, from Daberkow et al. (13).
Figure 2
Effects of dopamine in the striatum, and mechanisms of action selection in the basal ganglia. (A) Effects of dopamine (DA) on plasticity and excitability (gain) of striatal medium spiny neurons (MSNs) of the direct (Go) and indirect (NoGo) basal ganglia pathways. The current state or stimulus, s, is represented in cortex. Corticostriatal synapses onto D1-containing MSNs represent the positive value of learned associations between states or stimuli s and actions a [G(s,a); Box 1]; corticostriatal synapses onto D2-containing MSNs represent the negative value of learned associations between states or stimuli and actions [N(s,a); Box 1]. Phasic dopamine bursts following an action strengthen corticostriatal synapses to Go MSNs through D1-mediated long-term potentiation and weaken corticostriatal projections to NoGo MSNs through D2-mediated long-term depression (indicated by the circles with a plus and a minus sign, respectively) (Equations 3–4 in Box 1; Figure 3A, B). Phasic dopamine dips following an action may have the opposite effects (Figure 3C). Dopamine during choice amplifies the gain of Go MSNs (βG) by increasing their excitability through D1 receptors and reduces the gain of NoGo MSNs (βN) by decreasing their excitability through D2 receptors (indicated respectively by the circle with a multiplication sign and the tandem circles with a minus sign and a multiplication sign) (Equations 6–9 in Box 1; Figure 3D). The output of Go MSNs reflects learned Go values [G(s,a)], modulated by the gain of the Go pathway (βG), which can be represented mathematically as βG × G(s,a). Similarly, the output of NoGo MSNs reflects learned NoGo values [N(s,a)], modulated by the gain of the NoGo pathway (βN), which can be represented mathematically as βN × N(s,a). (B) Action-selection mechanisms in the basal ganglia. Go and NoGo values [G(s,a) and N(s,a), respectively] are specific for each state-action [(s,a)] pair.
Illustrated are three possible actions (labeled 1, 2, and 3) for a given state s. Each action has its own G(s,a) and N(s,a) values, which are determined by the strength of the corticostriatal synapses from the cortical representation of state s to Go and NoGo MSNs, respectively, for that state-action pair [(s,a)]. The output of Go and NoGo MSNs is determined by these learned values [G(s,a) and N(s,a), respectively] modulated by the gain of the respective pathway (βG and βN, respectively), yielding the same products as in panel (A) [βG × G(s,a) and βN × N(s,a), respectively]. The projections from all basal ganglia nuclei—striatum, globus pallidus external segment (GPe), globus pallidus internal segment (GPi), and substantia nigra pars reticulata (SNr)—are inhibitory. In simplified terms, if the projection neurons in an area receive afferent inhibitory projections, that area can be seen as flipping the sign of the information in those afferent projections. This process is represented in the graph by circles with a minus sign inside. Under this simplified conceptualization, the GPe can be seen as flipping the sign of βN × N(s,a), yielding −βN × N(s,a). The GPi then combines (sums) its two incoming inputs [βG × G(s,a) and −βN × N(s,a)], but since its incoming projections are inhibitory, it flips the sign of those inputs, yielding −βG × G(s,a) + βN × N(s,a). Finally, given that the projections from the GPi to the thalamus are also inhibitory, the thalamus flips the sign again, yielding βG × G(s,a) − βN × N(s,a). The cortex therefore receives information about the difference βG × G(s,a) − βN × N(s,a) for each action a available in the current state s. (Note that these differences are the values of the exponents in Equation 5 in Box 1.) Lateral inhibition in cortex then implements a competitive dynamics that performs action selection using these differences (approximated in Equation 5 in Box 1 using a softmax). 
In short, the best action in a given state s is determined on the basis of the differences βG × G(s,a) − βN × N(s,a) for all actions a available in s (Equations 5 and 9 in Box 1). This account is, of course, greatly simplified—for example, it does not take into account the full complexity of the basal-ganglia anatomy, it assumes that competition via lateral inhibition occurs only in cortex, and it assumes that all processing other than the competition approximated by the softmax is linear. It has the advantage, however, of clearly linking each structure and processing step in the basal ganglia to a simple, well-defined mathematical operation, and of showing how all of those operations work together to implement a sensible action-selection algorithm (Box 1).
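The chain of sign flips described in this caption can be traced step by step in a few lines of code. The following is only an illustrative sketch under the caption's own simplifications (linear processing, competition approximated by a softmax); Box 1 is not reproduced in this excerpt, and the function names, default gains, and example values are hypothetical.

```python
import math

def thalamic_drive(beta_g, g, beta_n, n):
    """Trace the sign flips from the caption for a single action."""
    go_out = beta_g * g            # striatal Go output: betaG x G(s,a)
    nogo_out = beta_n * n          # striatal NoGo output: betaN x N(s,a)
    gpe_out = -nogo_out            # GPe flips the sign of the NoGo signal
    gpi_out = -(go_out + gpe_out)  # GPi sums its two inputs and flips the sign
    return -gpi_out                # thalamus flips the sign once more

def softmax_choice_probs(go_values, nogo_values, beta_g=1.0, beta_n=1.0):
    """Softmax over betaG x G(s,a) - betaN x N(s,a) for all actions in a state."""
    drives = [thalamic_drive(beta_g, g, beta_n, n)
              for g, n in zip(go_values, nogo_values)]
    peak = max(drives)  # subtract the maximum for numerical stability
    exps = [math.exp(d - peak) for d in drives]
    total = sum(exps)
    return [e / total for e in exps]

# Three actions available in one state (hypothetical learned values).
probs = softmax_choice_probs([1.0, 2.0, 0.5], [0.5, 0.2, 1.0])
```

Writing each nucleus as one line makes it easy to check that the three inversions leave the cortex with the difference βG × G(s,a) − βN × N(s,a), as the caption states.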
Figure 3
Effects of dopamine (DA) on plasticity and excitability (gain) of striatal direct (Go) and indirect (NoGo) medium spiny neurons (MSNs). (A) Reference scenario against which the figures in the remaining panels should be compared. In this scenario, we assume that the Go and NoGo corticostriatal synapses [G(s,a) and N(s,a), respectively] for the state-action pair under scrutiny have the same initial weights. (B) If the person (or animal) executes action a in state s, and that is followed by a phasic dopamine burst (corresponding to a positive prediction error; Box 1), the Go weight for that state-action pair [G(s,a)] is increased, and the NoGo weight for that state-action pair [N(s,a)] is decreased [compare the thickness of the arrows depicting the corticostriatal synapses with each other and with those in panel (A)] (Equations 3 and 4 in Box 1; Figure 2A). Thus, the next time the person (or animal) is in state s, it will have a greater tendency to choose that action [compare the size of the Go and NoGo MSNs, which are intended to depict activation levels, with each other and with those in panel (A), or compare the size of the arrows departing from Go and NoGo MSNs, which convey the same information]. (C) If the person (or animal) executes action a in state s, and that is followed by a phasic dopamine dip (corresponding to a negative prediction error), the Go weight for that state-action pair [G(s,a)] is decreased, and the NoGo weight for that state-action pair [N(s,a)] is increased [compare the thickness of the arrows depicting the corticostriatal synapses with each other and with those in panel (A)] (Equations 3 and 4 in Box 1; Figure 2A). Thus, the next time the person (or animal) is in state s, it will have less tendency to choose that action [compare the size of the Go and NoGo MSNs (or of the arrows that depart from them) with each other and with those in panel (A)]. 
(D) If dopamine during choice is increased, either because tonic dopamine is increased or because the cues presented themselves elicit a dopamine burst (positive prediction error), the activity of Go MSNs is increased, and the activity of NoGo MSNs is decreased [compare the size of Go and NoGo MSNs (or of the arrows that depart from them) with each other and with those in panel (A)], resulting in greater weighting of positive relative to negative values and therefore a greater tendency to select the action (Equations 6–9 in Box 1; Figure 2A). This effect is due to gain modulation of corticostriatal synapses rather than to changes in their strength [note that the arrows depicting the weights of corticostriatal synapses are unchanged relative to panel (A)]. Thus, this effect during choice is separate from the effects on learning. However, the two effects interact because the gain modulation acts on the learned synaptic weights (Equations 5 and 9 in Box 1; Figure 2A).
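The distinction between the learning effects in panels (B) and (C) and the gain effect in panel (D) can be made concrete with a small sketch. The exact forms of Equations 3–4 and 6–9 are not shown in this excerpt, so the rules below are assumptions: a weight-scaled update driven by the prediction error `delta`, and gains set by a hypothetical dopamine-level parameter `rho` between −1 and 1. They are chosen only to reproduce the directions of change described in the caption.

```python
def update_go_nogo(g, n, delta, alpha=0.1):
    """Assumed learning rule: a burst (delta > 0) raises G and lowers N;
    a dip (delta < 0) does the opposite."""
    new_g = g + alpha * g * delta
    new_n = n + alpha * n * (-delta)
    return new_g, new_n

def pathway_gains(beta, rho):
    """Assumed gain rule: more dopamine during choice (rho > 0)
    raises the Go gain and lowers the NoGo gain."""
    return beta * (1 + rho), beta * (1 - rho)

def net_drive(g, n, beta=1.0, rho=0.0):
    """Gain-modulated difference betaG x G(s,a) - betaN x N(s,a)."""
    beta_g, beta_n = pathway_gains(beta, rho)
    return beta_g * g - beta_n * n

# Panels (B) and (C): the weights change after a burst or a dip.
g_burst, n_burst = update_go_nogo(1.0, 1.0, delta=1.0)
g_dip, n_dip = update_go_nogo(1.0, 1.0, delta=-1.0)

# Panel (D): the weights are unchanged, only the gains differ.
baseline = net_drive(1.0, 1.0, rho=0.0)
boosted = net_drive(1.0, 1.0, rho=0.5)
```

Because `net_drive` multiplies the gains by the learned weights, the two effects interact in the way the caption notes: the same change in `rho` has a larger effect on actions with larger learned values.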
Figure 4
Increased spontaneous dopamine transients in the striatum explain several neural and behavioral laboratory findings in schizophrenia that correlate with positive symptoms and help to explain positive symptoms themselves. Increased spontaneous dopamine transients (green) have specific effects on computational variables (orange-brown) that, in turn, cause specific neural and behavioral disturbances that have been found in the laboratory in schizophrenia (blue, with numbers in parentheses referring to relevant citations). In real life, the same alterations in the computational variables may cause specific neurocognitive disturbances (blue-red gradient) that, in turn, cause positive symptoms (red). The same computational alterations can also explain dyskinesia associated with schizophrenia (dotted red). In more detail, increased spontaneous dopamine transients that follow neutral stimuli function as positive prediction errors (PEs) for those stimuli, causing increased midbrain activity for “neutral PEs,” as has been observed in schizophrenia (53). According to Equation 1 (Box 1), these inappropriate positive PEs cause increased, inappropriate value learning for neutral stimuli, which in turn causes increased activation of value regions, such as the ventral striatum (VS), for neutral stimuli, as has been observed in schizophrenia (51,52). This activation, particularly for the midbrain (52), may also reflect the increased PEs that occur when the stimulus is presented. The inappropriate value learning for neutral stimuli may also cause increased autonomic activation for those stimuli, as has also been observed in schizophrenia (51). In real life, the inappropriate value learning may lead to aberrant valuation of stimuli, thoughts, percepts, etc., possibly contributing to positive symptoms.
In addition, according to Equation 3 (Box 1), the inappropriate positive PEs cause inappropriate direct-pathway (Go) learning for neutral stimuli-action pairs, leading to inappropriate behavioral responding to neutral stimuli, as has also been observed in schizophrenia (55,56). When applied to the cognitive domain, this inappropriate Go learning may lead to learned gating of aberrant thoughts and percepts, possibly contributing to positive symptoms. When applied to the motor domain, this inappropriate Go learning may lead to dyskinesia, which is associated with schizophrenia even in antipsychotic-naive patients (148). The fact that increased spontaneous dopamine transients may be the common cause of all of the depicted laboratory-based deficits (blue boxes) and also contribute to positive symptoms (red box) explains the correlations between these laboratory deficits and positive symptoms (,–57). Gray boxes identify relations between concepts. ↑ means increased; ↓ means decreased. BOLD, blood oxygen level–dependent.
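The claim that spontaneous transients act as positive PEs for neutral stimuli can be illustrated with a toy value-learning loop. This is a sketch under assumptions, not the model from Box 1 (which is not reproduced in this excerpt): the value update is a standard delta rule, and the schedule and size of the spontaneous transients are invented for illustration.

```python
def learn_value(rewards, spontaneous, alpha=0.2):
    """Delta-rule value learning for one stimulus.

    rewards:     outcome on each trial (all zeros for a neutral stimulus)
    spontaneous: extra dopamine on each trial that is unrelated to the outcome
                 (hypothetical stand-in for spontaneous transients)
    """
    v = 0.0
    for r, extra in zip(rewards, spontaneous):
        delta = (r - v) + extra  # the transient adds to the true prediction error
        v += alpha * delta
    return v

n_trials = 30
neutral = [0.0] * n_trials

# No spontaneous transients: the neutral stimulus keeps a value of zero.
v_control = learn_value(neutral, [0.0] * n_trials)

# A transient on every third trial (arbitrary schedule): the stimulus acquires value.
transients = [0.5 if t % 3 == 0 else 0.0 for t in range(n_trials)]
v_aberrant = learn_value(neutral, transients)
```

In this sketch the neutral stimulus ends with a positive learned value only when spontaneous transients are present, which is the direction of the effect the caption attributes to inappropriate positive PEs.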
Figure 5
Blunted adaptive dopamine transients in the striatum explain several neural and behavioral laboratory findings in schizophrenia that correlate with negative symptoms and help to explain negative symptoms themselves. Blunted adaptive dopamine transients (green) have specific effects on computational variables (orange-brown) that, in turn, cause specific neural and behavioral disturbances that have been found in the laboratory in schizophrenia (blue, with numbers in parentheses referring to relevant citations). In real life, the same alterations in the computational variables likely cause decreased valuation of stimuli and events (blue-red gradient), which, in turn, causes at least some forms of primary negative symptoms (red). The same disturbances can also explain Parkinsonism associated with schizophrenia (dotted red). In more detail, blunted adaptive dopamine transients (i.e., blunted transients for relevant stimuli and outcomes) cause blunted prediction error (PE) signaling, which has been observed in schizophrenia in many studies (52,53,60,68,69,82). According to Equation 1 (Box 1), reduced PE signaling causes reduced value learning, which, given that the ventral striatum (VS) represents value (32), in turn causes reduced VS activation during reward anticipation, as has also been observed in schizophrenia in many studies (–92). Some of the findings of reduced VS activation during reward anticipation could also be due to the blunted PE signaling (dashed arrow) because, with learning, PEs move from outcomes to the cues that predict them (12), and the blood oxygen level–dependent (BOLD) response to the cue could extend into the reward-anticipation period. In real life, reduced value learning could lead to reduced valuation of stimuli, events, and situations, possibly contributing to negative symptoms.
According to Equation 3 (Box 1), reduced PE signaling also causes reduced direct-pathway (Go) learning, thereby leading to reduced learning from rewards, as has also been observed in multiple studies in schizophrenia (,–67,82). In real life, the impaired Go learning may lead to reduced learning to perform actions that lead to positive outcomes, which, especially in the face of preserved indirect-pathway (NoGo) learning, may contribute to negative symptoms. Decreased Go learning may also lead to Parkinsonism, which, despite being commonly associated with antipsychotics, is associated with schizophrenia even in antipsychotic-naive patients (148). According to Equations 6–8 (Box 1), adaptive transients that occur when reward-predicting cues are presented amplify Go signals (i.e., increase βG) and reduce NoGo signals (i.e., reduce βN). As a result, positive values are given more weight than negative values, facilitating (a) choice of rewarding options, (b) effortful responses for reward, and (c) fast, invigorated responding (30). Blunted adaptive transients cause a reduction of these effects, leading to (a) difficulties choosing rewarding options, which may contribute to the observed deficits in choice after Go learning, (b) decreased tendency to make effortful responses for reward, and (c) longer reaction times, all of which have been found in schizophrenia (,–67,82,100,107,109). Indeed, in animals, inhibiting dopamine-neuron firing during choice decreases choice of rewarding actions (149), and dopamine-neuron firing for a reward-predicting cue correlates negatively with reaction times (150). Increased reaction times may also be caused by reduced Go learning. The decrease in adaptive transients for reward-predicting cues may be further compounded by the decreased value learning, which will make those cues have lower value and therefore elicit smaller PEs (whose signaling will then itself be reduced even further because of the blunted PE signaling). 
The fact that blunted adaptive dopamine transients may be the common cause of all of the depicted laboratory-based deficits (blue boxes) and at least some forms of primary negative symptoms (red box) explains the widely replicated correlations between these laboratory deficits and negative symptoms (,,,,,–88,100). Gray boxes identify relations between concepts. ↑ means increased; ↓ means decreased.
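The link between blunted PE signaling and reduced value learning can likewise be sketched with a toy simulation. The code below is an assumption-laden illustration rather than the Box 1 model: blunting is represented by a hypothetical multiplicative factor `pe_gain` applied to the prediction error, and the learning rate and trial count are arbitrary.

```python
def learn_reward_value(n_trials, reward=1.0, pe_gain=1.0, alpha=0.2):
    """Delta-rule value learning for a consistently rewarded cue.

    pe_gain < 1 is a hypothetical stand-in for blunted adaptive transients:
    the prediction error is computed normally but signaled at reduced size.
    """
    v = 0.0
    for _ in range(n_trials):
        delta = pe_gain * (reward - v)  # blunted prediction-error signal
        v += alpha * delta
    return v

v_intact = learn_reward_value(10, pe_gain=1.0)
v_blunted = learn_reward_value(10, pe_gain=0.3)
```

After the same number of rewarded trials, the cue's learned value is lower with the blunted signal than with the intact one. A lower cue value would in turn elicit a smaller cue-evoked PE, consistent with the compounding effect described in the caption.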

Source: PubMed
