Learning Predictive Statistics: Strategies and Brain Mechanisms

Rui Wang, Yuan Shen, Peter Tino, Andrew E Welchman, Zoe Kourtzi

Abstract

When immersed in a new environment, we are challenged to decipher initially incomprehensible streams of sensory information. However, quite rapidly, the brain finds structure and meaning in these incoming signals, helping us to predict and prepare ourselves for future actions. This skill relies on extracting the statistics of event streams in the environment that contain regularities of variable complexity from simple repetitive patterns to complex probabilistic combinations. Here, we test the brain mechanisms that mediate our ability to adapt to the environment's statistics and predict upcoming events. By combining behavioral training and multisession fMRI in human participants (male and female), we track the corticostriatal mechanisms that mediate learning of temporal sequences as they change in structure complexity. We show that learning of predictive structures relates to individual decision strategy; that is, selecting the most probable outcome in a given context (maximizing) versus matching the exact sequence statistics. These strategies engage distinct human brain regions: maximizing engages dorsolateral prefrontal, cingulate, sensory-motor regions, and basal ganglia (dorsal caudate, putamen), whereas matching engages occipitotemporal regions (including the hippocampus) and basal ganglia (ventral caudate). Our findings provide evidence for distinct corticostriatal mechanisms that facilitate our ability to extract behaviorally relevant statistics to make predictions.

Significance Statement

Making predictions about future events relies on interpreting streams of information that may initially appear incomprehensible. Past work has studied how humans identify repetitive patterns and associative pairings. However, the natural environment contains regularities that vary in complexity from simple repetition to complex probabilistic combinations. Here, we combine behavior and multisession fMRI to track the brain mechanisms that mediate our ability to adapt to changes in the environment's statistics. We provide evidence for an alternate route for learning complex temporal statistics: extracting the most probable outcome in a given context is implemented by interactions between executive and motor corticostriatal mechanisms compared with visual corticostriatal circuits (including hippocampal cortex) that support learning of the exact temporal statistics.

Keywords: fMRI; learning; prediction; vision.

Copyright © 2017 Wang et al.

Figures

Figure 1.
Trial and sequence design. a, Trial design. 8–14 stimuli were presented sequentially, followed by a cue and the test display. b, Sequence design. Markov models comprised two levels of complexity. For the zero-order model (Level 0), different states (A–D) are assigned to four symbols with different frequencies. For the first-order model (Level 1), a diagram indicates states (circles) and transitional probabilities (black arrow: high probability, e.g., 80%; gray arrow: low probability, e.g., 20%). Transitional probabilities are shown in a four-by-four conditional probability matrix, with rows indicating temporal context and columns indicating the corresponding target. c, Experimental protocol. Observers underwent multiple days of behavioral training, first with zero-order sequences and then with first-order sequences. For each level, observers completed three to five training sessions (an average of four sessions is shown for illustration purposes). Three fMRI scanning sessions were conducted: one before training (Pre) and one immediately after training at each level (Post0, Post1).
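The two sequence levels described in this caption can be illustrated with a minimal Python sketch. It is not the authors' stimulus code. The names `LEVEL0_PROBS` and `LEVEL1_TRANSITIONS`, the specific symbol frequencies, and the choice of which transitions are high or low probability are all hypothetical; only the four states, the 80%/20% example values, and the 8–14 items per trial come from the caption.

```python
import random

SYMBOLS = ["A", "B", "C", "D"]

# Zero-order model (Level 0): each symbol has its own frequency.
# These frequencies are assumed for illustration, not taken from the paper.
LEVEL0_PROBS = {"A": 0.2, "B": 0.5, "C": 0.1, "D": 0.2}

# First-order model (Level 1): rows are the context (previous symbol),
# entries are the possible targets. The 0.8/0.2 values follow the caption's
# example; the particular context-to-target pairing is assumed.
LEVEL1_TRANSITIONS = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"D": 0.8, "A": 0.2},
    "C": {"A": 0.8, "D": 0.2},
    "D": {"C": 0.8, "B": 0.2},
}


def sample(dist, rng):
    """Draw one symbol from a {symbol: probability} mapping."""
    symbols = list(dist)
    weights = [dist[s] for s in symbols]
    return rng.choices(symbols, weights=weights, k=1)[0]


def zero_order_sequence(n, rng):
    """Each item is drawn independently of the items before it."""
    return [sample(LEVEL0_PROBS, rng) for _ in range(n)]


def first_order_sequence(n, rng):
    """Each item depends only on the immediately preceding item."""
    seq = [rng.choice(SYMBOLS)]
    while len(seq) < n:
        seq.append(sample(LEVEL1_TRANSITIONS[seq[-1]], rng))
    return seq


rng = random.Random(0)
n_items = rng.randint(8, 14)  # 8-14 stimuli per trial, as in the caption
print(zero_order_sequence(n_items, rng))
print(first_order_sequence(n_items, rng))
```

In the zero-order case the best prediction is always the most frequent symbol, whereas in the first-order case the best prediction changes with the preceding symbol.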
Figure 2.
Behavioral performance. a, Mean PI across participants for test (open symbols) and training (solid symbols) blocks for Level 0 and Level 1. Data are fitted (least-squares nonlinear fit) across training blocks. Random guess baseline is indicated by dotted lines. b, Normalized PI during scanning. Data are shown before (gray bars) and after (black bars) training for each level. Error bars indicate SEM.
Figure 3.
fMRI results. a, GLM maps for the two-way interaction between scanning session (Pre, Post0, Post1) and sequence (structured vs random), at p < 0.005 (cluster threshold corrected). Only the first five volumes, which correspond to the sequence presentation, the participants' prediction, and the test display presentation, were included in the analysis to avoid confounding the results with the participants' response. Similar results were observed at a more conservative threshold (p < 0.001), but small volume correction was necessary for small structures (i.e., putamen) at this threshold. b, PSC index (percentage signal change for structured sequences compared with random sequences) before and after training for Level 0 and Level 1. Data are shown for ROIs that showed a significant interaction between session (pretraining vs posttraining) and sequence (structured vs random). Error bars indicate SEM. Note that different numbers of runs were scanned before and after training (i.e., the pretraining scan comprised three runs per level, whereas posttraining scans comprised nine runs per level). To compare equal amounts of data before and after training, we selected three of the nine runs from each posttraining scan; that is, we divided each session into two time periods and randomly selected one run per time period to match the order in which data were collected during the pretraining scan. Whole-brain voxelwise GLM analysis showed significant interactions for sequence (structured vs random) and scanning session (Pre, Post0, Post1) in the frontal, parietal, and subcortical regions, which is consistent with our main result.
Figure 4.
fMRI results controlled for differences in sequence entropy across levels. a, GLM maps (p < 0.001, cluster threshold corrected) for the two-way interaction between scanning session (Pre, Post0, Post1) and sequence (structured vs random), including entropy rate as a regressor. b, PSC index before and after training for Level 0 and Level 1. Error bars indicate SEM. Data are shown for ROIs that showed a significant interaction between session (pretraining vs posttraining) and sequence (structured vs random).
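The caption does not state how the entropy rate was computed. A minimal sketch under the standard definition for a first-order Markov chain, H = -Σ_i π_i Σ_j P_ij log2 P_ij with π the stationary distribution, is shown below. The transition values in `LEVEL1` are hypothetical (80%/20%, as in the Figure 1 example), and the function names are illustrative only.

```python
import math

# Hypothetical first-order transition probabilities (context -> target).
LEVEL1 = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"D": 0.8, "A": 0.2},
    "C": {"A": 0.8, "D": 0.2},
    "D": {"C": 0.8, "B": 0.2},
}

# Random sequences: four equiprobable symbols with no temporal structure.
RANDOM = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}


def entropy_bits(dist):
    """Shannon entropy (in bits) of a {symbol: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def stationary_distribution(transitions, iterations=2000):
    """Approximate the stationary distribution by repeated averaging.

    Each step averages the current estimate with its one-step update,
    so the iteration also settles for periodic chains.
    """
    states = list(transitions)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        stepped = {s: 0.0 for s in states}
        for src, row in transitions.items():
            for dst, p in row.items():
                stepped[dst] += pi[src] * p
        pi = {s: 0.5 * pi[s] + 0.5 * stepped[s] for s in states}
    return pi


def entropy_rate_bits(transitions):
    """Entropy rate: per-context entropy weighted by how often each context occurs."""
    pi = stationary_distribution(transitions)
    return sum(pi[s] * entropy_bits(transitions[s]) for s in transitions)


print(round(entropy_rate_bits(LEVEL1), 3))
print(round(entropy_bits(RANDOM), 3))
```

With these assumed values the structured chain has an entropy rate of about 0.72 bits per item, compared with 2 bits for the equiprobable random sequence. That gap illustrates why entropy rate is a natural candidate for a nuisance regressor when comparing sequence types.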
Figure 5.
fMRI results for sequence presentation and participants' prediction. a, PSC index for sequence presentation (volumes 1–2) and participant prediction (volumes 4–5) before and after training for Level 0 and Level 1. Data are shown for the representative ROIs from Figure 4b. Error bars indicate SEM. b, GLM maps for the two-way interaction between scanning session and sequence at p < 0.005 (cluster threshold corrected) using only the volumes that correspond to sequence presentation.
Figure 6.
Strategy choice. Strategy choice is shown at the beginning (first two runs) and end (last two runs) of training for Level 0 (squares) and Level 1 (circles). Open symbols indicate individual participant data; closed symbols indicate mean data per level. Strategy choice was measured by comparing participant responses to two possible strategies: matching (i.e., predicting the presented target distribution) versus maximization (i.e., predicting the high probability targets per context). Negative values indicate a strategy closer to matching, whereas positive values indicate a strategy closer to maximization. Error bars indicate SEM.
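The caption gives the sign convention of the strategy measure but not its formula. The sketch below is one hypothetical way to build an index with the same sign convention: it compares a participant's response distribution in a given context with a matching model and a maximization model, using total variation distance as an assumed stand-in for whatever distance measure the study used. All function names are illustrative.

```python
def total_variation(p, q):
    """Total variation distance between two {symbol: probability} mappings."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def maximization_model(target_dist):
    """Always predict the single most probable target in this context."""
    best = max(target_dist, key=target_dist.get)
    return {s: (1.0 if s == best else 0.0) for s in target_dist}


def strategy_index(response_dist, target_dist):
    """Negative: responses closer to matching. Positive: closer to maximization.

    The matching model is the presented target distribution itself.
    """
    d_match = total_variation(response_dist, target_dist)
    d_max = total_variation(response_dist, maximization_model(target_dist))
    return d_match - d_max


# Hypothetical context in which target B follows with 80% probability.
targets = {"B": 0.8, "C": 0.2}
matcher = {"B": 0.8, "C": 0.2}    # reproduces the presented statistics
maximizer = {"B": 1.0, "C": 0.0}  # always picks the most probable target

print(strategy_index(matcher, targets))
print(strategy_index(maximizer, targets))
```

A participant who reproduces the 80/20 statistics receives a negative index, and one who always chooses the most probable target receives a positive index, mirroring the convention in the figure.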
Figure 7.
Brain activations correlating with matching. Covariance analysis shows significant (p < 0.05, cluster threshold corrected) negative correlations (R correlation coefficient) between individual strategy index and learning-dependent fMRI change (i.e., after vs before training) for Level 0 (a) and Level 1 (b). Whole-brain maps and plots show negative correlations between strategy index and PSC index change (posttraining vs pretraining) for representative ROIs, as derived from the covariance analysis (note that these correlation plots are only presented for demonstration purposes; no additional statistical analysis was performed in these ROIs after the covariance analysis to avoid circularity). Cd, Caudate: b, body; t, tail; Th, thalamus; PHG, parahippocampal gyrus; Hipp, hippocampus.
Figure 8.
Brain activations correlating with maximization. Covariance analysis shows significant (p < 0.05, cluster threshold corrected) positive correlations (R correlation coefficient) between individual strategy index and learning-dependent fMRI change (i.e., posttraining vs pretraining) for Level 0 (a) and Level 1 (b). Whole-brain maps and plots show positive correlations between strategy index and PSC index change (posttraining vs pretraining) for representative ROIs, as derived from the covariance analysis (note that these correlation plots are only presented for demonstration purposes; no additional statistical analysis was performed in these ROIs after the covariance analysis to avoid circularity).

Source: PubMed
