Learning from sensory and reward prediction errors during motor adaptation
Jun Izawa, Reza Shadmehr
Abstract
Voluntary motor commands produce two kinds of consequences. Initially, a sensory consequence is observed in terms of activity in our primary sensory organs (e.g., vision, proprioception). Subsequently, the brain evaluates the sensory feedback and produces a subjective measure of utility or usefulness of the motor commands (e.g., reward). As a result, comparisons between predicted and observed consequences of motor commands produce two forms of prediction error. How do these errors contribute to changes in motor commands? Here, we considered a reach adaptation protocol and found that when high-quality sensory feedback was available, adaptation of motor commands was driven almost exclusively by sensory prediction errors. This form of learning had a distinct signature: as motor commands adapted, the subjects altered their predictions regarding sensory consequences of motor commands, and generalized this learning broadly to neighboring motor commands. In contrast, as the quality of the sensory feedback degraded, adaptation of motor commands became more dependent on reward prediction errors. Reward prediction errors produced comparable changes in the motor commands, but produced no change in the predicted sensory consequences of motor commands, and generalized only locally. Because we found a within-subject correlation between generalization patterns and sensory remapping, it is plausible that an individual's relative reliance on sensory vs. reward prediction errors during adaptation could be inferred. We suggest that while motor commands change because of sensory and reward prediction errors, only sensory prediction errors produce a change in the neural system that predicts sensory consequences of motor commands.
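The distinction drawn in the abstract can be made concrete with a toy simulation. The sketch below is not the authors' model, which the abstract does not describe. It is a minimal illustration in which every name and value is an assumption: the 8-degree perturbation, the learning rates, the squared-error reward, and the alternating exploration step. One learner updates a sensory prediction from the sensory prediction error. The other sees only a scalar reward and shifts its command using the reward prediction error, here taking the previous trial's reward as the prediction.

```python
# Toy illustration only; all names and parameter values are assumptions,
# not the model or task parameters used in the paper.

TARGET = 0.0        # desired cursor direction (deg)
PERTURBATION = 8.0  # hypothetical visuomotor rotation (deg)


def adapt_with_sensory_error(trials=100, rate=0.2):
    """Adapt using sensory prediction errors (cursor feedback available)."""
    predicted_shift = 0.0  # learner's estimate of how the cursor is shifted
    for _ in range(trials):
        command = TARGET - predicted_shift            # aim so the predicted cursor lands on target
        predicted_cursor = command + predicted_shift
        observed_cursor = command + PERTURBATION
        sensory_error = observed_cursor - predicted_cursor
        predicted_shift += rate * sensory_error       # the sensory prediction itself changes
    return TARGET - predicted_shift, predicted_shift


def adapt_with_reward_error(trials=400, rate=0.01, explore=1.0):
    """Adapt using reward prediction errors only (cursor position hidden)."""
    mean_command = TARGET
    predicted_shift = 0.0   # never updated: no sensory error is available
    predicted_reward = 0.0  # assumption: prediction = reward on the previous trial
    for k in range(trials):
        # Deterministic stand-in for exploration noise: alternate +explore / -explore.
        step = explore if k % 2 == 0 else -explore
        cursor = mean_command + step + PERTURBATION   # not shown to the learner
        reward = -(cursor - TARGET) ** 2              # assumed reward: negative squared error
        reward_error = reward - predicted_reward
        mean_command += rate * reward_error * step    # reinforce steps that beat the prediction
        predicted_reward = reward
    return mean_command, predicted_shift


if __name__ == "__main__":
    print("sensory-error learner:", adapt_with_sensory_error())
    print("reward-error learner: ", adapt_with_reward_error())
```

In this toy, both learners end with a command that compensates for the perturbation, but only the sensory-error learner changes its estimate of where the cursor will appear. This mirrors the qualitative dissociation in the abstract, not its quantitative results.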
Conflict of interest statement
The authors have declared that no competing interests exist.
Source: PubMed