Visual object recognition is facilitated by temporal community structure

Ehsan Kakaei, Stepan Aleshin, Jochen Braun

Abstract

Humans and other primates are highly attuned to temporal consistencies and regularities in their sensory environment and learn to predict such statistical structure. Moreover, in several instances, the presence of temporal structure has been found to facilitate procedural learning and to improve task performance. Here we extend these findings to visual object recognition and to presentation sequences in which mutually predictive objects form distinct clusters or "communities." Our results show that temporal community structure accelerates recognition learning and affects the order in which objects are learned ("onset of familiarity").

© 2021 Kakaei et al.; Published by Cold Spring Harbor Laboratory Press.

Figures

Figure 1.
Presentation sequence and trial structure. (A) Presentation sequences were generated as (nearly) random walks on three types of graphs, with each node representing a distinct object and each edge representing a possible transition (in both directions). A sparsely connected, modular graph generated “strongly structured” sequences with distinct community structure (left), a sparsely connected, nonmodular graph generated “weakly structured” sequences (middle), and a fully connected graph generated “unstructured” or “random” sequences (right). (B) Presentation sequences consisted of 180 complex, three-dimensional objects (shown rotating for 2 sec about a randomly oriented axis in the frontal plane). Of these, 170 ± 0.04 (mean ± SEM) objects were recurring, and 9.2 ± 0.04 objects were nonrecurring. Observers categorized each object as “familiar” or “unfamiliar.” Over the four sessions of 1 wk, observers performed 24 runs and viewed 4320 presentations, with every recurring object appearing at least 250 times.
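To make the sequence construction in panel A concrete, here is a minimal sketch of a random walk on a modular graph and on a fully connected graph. The graph size (15 nodes in three clusters of five), the helper names, and the purely random walk are illustrative assumptions; the caption does not specify the exact graphs, and it describes the actual walks only as "(nearly) random."

```python
import random

def modular_graph(n_clusters=3, size=5):
    """Hypothetical modular graph: a ring of clusters, each a clique
    minus the edge between its two boundary nodes (all degrees equal)."""
    adj = {i: set() for i in range(n_clusters * size)}
    for c in range(n_clusters):
        nodes = list(range(c * size, (c + 1) * size))
        first, last = nodes[0], nodes[-1]
        for a in nodes:
            for b in nodes:
                if a < b and (a, b) != (first, last):
                    adj[a].add(b)
                    adj[b].add(a)
        # Connect this cluster's last node to the next cluster's first node.
        nxt = ((c + 1) % n_clusters) * size
        adj[last].add(nxt)
        adj[nxt].add(last)
    return adj

def full_graph(n):
    """Fully connected graph: every node is linked to every other node."""
    return {i: set(range(n)) - {i} for i in range(n)}

def random_walk(adj, length, seed=0):
    """Uniform random walk: each step moves to a random neighbor."""
    rng = random.Random(seed)
    node = rng.choice(sorted(adj))
    walk = [node]
    for _ in range(length - 1):
        node = rng.choice(sorted(adj[node]))
        walk.append(node)
    return walk

def within_cluster_fraction(walk, size=5):
    """Share of transitions that stay inside one cluster (node // size)."""
    pairs = list(zip(walk, walk[1:]))
    return sum(a // size == b // size for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    structured = random_walk(modular_graph(), 180)
    unstructured = random_walk(full_graph(15), 180)
    print(within_cluster_fraction(structured))
    print(within_cluster_fraction(unstructured))
```

In this sketch, walks on the modular graph stay within a cluster for most transitions, whereas walks on the fully connected graph do not; that difference is the temporal community structure the caption refers to.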
Figure 2.
Time course of recognition learning. (A) Average hit rate (recurring categorized as familiar, per window) increases with the number of presentations of a given object. (B) Average false alarm rate (nonrecurring not categorized as unfamiliar, per session) decreases with the number of presentations. (C) Average corrected performance ρ increases nearly monotonically with presentation number. It was consistently larger for strongly structured sequences (with temporal community structure) than for unstructured sequences. (D) Average criterion bias b, as a function of presentation number. Green regions indicate the transition between sessions (20%–80% of objects in previous session).
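As an illustration of the quantities in panels A and B, the sketch below computes hit and false alarm rates from familiar/unfamiliar responses. The caption does not define the corrected performance ρ or the criterion bias b, so the sketch substitutes the textbook signal detection measures d′ and c as stand-ins; these, the log-linear correction, and the function name are assumptions, not the authors' definitions.

```python
from statistics import NormalDist

def sdt_measures(is_recurring, said_familiar):
    """Hit rate, false alarm rate, d-prime, and criterion c for one
    block of trials (standard definitions, not the paper's rho and b)."""
    trials = list(zip(is_recurring, said_familiar))
    n_rec = sum(1 for r, _ in trials if r)
    n_non = len(trials) - n_rec
    hits = sum(1 for r, f in trials if r and f)        # recurring, "familiar"
    fas = sum(1 for r, f in trials if not r and f)     # nonrecurring, "familiar"
    # Log-linear correction keeps both rates strictly inside (0, 1).
    hit_rate = (hits + 0.5) / (n_rec + 1)
    fa_rate = (fas + 0.5) / (n_non + 1)
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return hit_rate, fa_rate, d_prime, criterion

if __name__ == "__main__":
    # Hypothetical block: 4 recurring and 4 nonrecurring objects.
    recurring = [True, True, True, True, False, False, False, False]
    familiar = [True, True, True, False, True, False, False, False]
    print(sdt_measures(recurring, familiar))
```

A sensitivity measure of this kind rises when hits increase or false alarms decrease, while the criterion term separates a general tendency to answer "familiar" from genuine recognition.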
Figure 3.
Analysis of the onset of familiarity with individual objects. (A) Successive onsets of familiarity (Δn = 1) are far more likely ([**] P < 0.005) for objects within the same cluster than would be expected by chance (dashed line). For nearly successive onsets (Δn = 2) this effect was not observed. (B) Frequency of successive onsets, relative to chance level, for object pairs either in the same cluster (outlined blue and cyan) or in different clusters (green and red), which are either adjacent (blue and green) or nonadjacent on the graph (cyan and red). Frequency is significantly elevated ([*] P < 0.05 FDR corrected) for adjacent objects in the same cluster (blue) and suppressed for nonadjacent objects in different clusters (red).
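The FDR correction mentioned in panel B can be sketched as follows, assuming the Benjamini-Hochberg step-up procedure; the caption says only "FDR corrected," so the choice of procedure, the function name, and the p-values in the usage example are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per test: True where the null hypothesis is
    rejected at false discovery rate alpha (Benjamini-Hochberg step-up)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p <= (k / m) * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            reject[i] = True
    return reject

if __name__ == "__main__":
    # Hypothetical p-values for four pair types (made up for illustration).
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Because the procedure is step-up, a p-value that misses its own rank threshold can still be rejected when a larger p-value further down the sorted list meets its threshold.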

Source: PubMed
