Transferring structural knowledge across cognitive maps in humans and models

Shirley Mark, Rani Moran, Thomas Parr, Steve W Kennerley, Timothy E J Behrens

Abstract

Relations between task elements often follow hidden underlying structural forms such as periodicities or hierarchies, whose inference fosters performance. However, transferring structural knowledge to novel environments requires flexible representations that generalize over the particularities of the current environment, such as its stimuli and size. We suggest that humans represent structural forms as abstract basis sets and that, in novel tasks, the structural form is inferred and the relevant basis set is transferred. Using a computational model, we show that such a representation allows inference of the underlying structural form, important task states, effective behavioural policies and the existence of unobserved state-trajectories. In two experiments, participants learned three abstract graphs over two successive days. We tested how structural knowledge acquired on Day 1 affected Day 2 performance. In line with our model, participants who had a correct structural prior were able to infer the existence of unobserved state-trajectories and appropriate behavioural policies.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Transfer of structural knowledge: graph structures and experimental design.
a Experimental design. Agents and participants learned graphs with an underlying Hexagonal (left) or Community (right) structure. Each grey dot is a node on a graph and corresponds to a picture that was viewed by the participant (for example, a picture of a bee). The lines are edges between nodes. Pictures of nodes that are connected by an edge can appear one after the other. The degree of all nodes in both graphs is six (a connecting node connects to one fewer node within a community to keep the degree equal to six). Participants learned the graphs over two successive days. In both experiments, participants were segregated into two groups: one group learned graphs with the same underlying structure on both days, while the other group learned graphs with different underlying structures on the two days. Two graphs were learnt on Day 1 and an additional graph on Day 2. b One block of the task. Participants never observed the underlying graph structure but had to learn (or infer) it by performing a task. In each block, participants learned the associations between pairs of pictures, where each pair of pictures is emitted by neighbouring states on the graph. Following the learning phase, participants had to answer different types of questions: (1) report which of two picture sequences could be extended with a target picture; (2) indicate whether the picture in the middle (sun) can appear between the two other pictures in a sequence (left and right, respectively, under 'Testing associations knowledge'); (3) navigate on the graph: starting from a certain picture, for example, the building, participants had to choose between (or skip) two pictures that are connected to the current picture (the building) on the graph, for example, basket and boots (empty squares above the arrows indicate the minimum number of steps to the target). The chosen picture then replaced the 'starting picture', and participants repeated these steps until they reached the target picture (for instance, the running man). (4) Which picture is closer to the target picture (the bag), the ice-cream or the basket? (Distance estimation). Question type (3) was excluded from Day 2 of Experiment 1.
Fig. 2. Associative and abstract representation of transition structure.
Learning the underlying graph structure from observations of pictures. A graph can be represented by learning the associations between the stimuli (associative representation). Such a representation is conjunctive, as the relations between the stimuli are encoded through associations between the representations of the stimuli themselves. This type of representation does not allow generalization and knowledge transfer. Representing the graph using two separate matrices, the transition and emission matrices, allows generalization over graph structure.
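The factorization described in this caption can be illustrated in a few lines. The sketch below is our own toy illustration, not the paper's code: the abstract transition matrix A is kept separate from a stimulus-to-node emission matrix B, so stimulus-level (conjunctive) associations are B.T @ A @ B, and swapping in new stimuli (a new B) reuses the same abstract A.

```python
import numpy as np

# Toy graph: a ring of 6 nodes, each node transitions to its two neighbours.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 0.5

# Emission matrix B: node i emits stimulus perm[i] (a random assignment of
# pictures to nodes, standing in for the bee/building/basket pictures).
rng = np.random.default_rng(0)
perm = rng.permutation(n)
B = np.eye(n)[perm]

# Conjunctive stimulus-stimulus associations, as in the associative account:
assoc = B.T @ A @ B

# Because B is a permutation, the abstract structure is recovered exactly,
# regardless of which stimuli were attached to which nodes:
assert np.allclose(B @ assoc @ B.T, A)
```

The point of the factorization is the last line: the stimulus-level associations carry no structure beyond A once the emission mapping is known, so A alone can be transferred to a new environment with entirely new stimuli.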
Fig. 3. Inferring graph structure rather than learning it using a basis sets representation for structural knowledge.
a We present a generative model for graphs. Each graph belongs to a structural form (Sf). Given a structural form, the graph size (θ) is sampled from a prior distribution (p(θ|Sf)) and the transition matrix is approximated. Given a transition matrix (A_Sf,θ, which is determined by the form and the dimensions), an emission matrix (B) is sampled. From these two matrices, the sequence of observations (O) can be generated. b A few example basis sets for a Hexagonal grid. c Basis sets for a community structure. Basis sets can allow direct inference of important graph states without the need for further computation. In a graph with an underlying community structure, the connecting nodes (blue circles) are important: knowing them allows fast transitions between communities. A basis set that contains explicit connecting-node assignment vectors allows direct inference of their identity by learning the emission matrix. d The transition matrices can be approximated using basis sets for structural knowledge. Upper panels: correct and approximated transition structure for Hexagonal grids with 36 nodes. Lower panels: real and approximated transition matrices for a graph with an underlying community structure (35 nodes).
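The approximation in panel d can be sketched with a generic spectral basis. This is an assumption on our part (the paper's basis sets are form-specific; we use the leading eigenvectors of the transition matrix itself, and a 1-D ring as a stand-in for the hexagonal grid), but it shows the core idea: a transition matrix can be reconstructed from a small number of basis vectors, and the same basis transfers across graphs of the same structural form.

```python
import numpy as np

# Ring graph with 36 nodes as a simplified periodic structure.
n = 36
A = np.zeros((n, n))
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 0.5

# A is symmetric, so its eigenvectors form an orthonormal basis (eigh).
vals, vecs = np.linalg.eigh(A)
order = np.argsort(-np.abs(vals))   # rank basis vectors by |eigenvalue|

def approx(k):
    """Reconstruct A from its k leading basis vectors."""
    idx = order[:k]
    return vecs[:, idx] @ np.diag(vals[idx]) @ vecs[:, idx].T

# Reconstruction error shrinks as the basis grows; the full basis is exact.
err = [np.linalg.norm(A - approx(k)) for k in (4, 12, n)]
assert err[0] > err[1] and np.isclose(err[2], 0.0, atol=1e-8)
```

A small basis already captures the coarse periodic structure, which is what makes transferring the basis set, rather than the full conjunctive association matrix, economical.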
Fig. 4. Inference of unobserved links (Hexagonal graph).
a Inferring the existence of unobserved edges (links). Left, the task: the agents had to indicate which of two nodes (pictures) has a smaller number of links to the target. With only observed links, the number of links to the target was identical. Right: red edges indicate missing links on the graph. For example, the two nodes marked with light blue have the same number of observed links to the target node (marked with a dark blue circle), while the number of links connecting these two nodes to the target differs on the complete graph. b When learning from pairs that were sampled randomly (not in succession) while some of the links (pairs) were never observed, simple associative models, such as learning the transition matrix (DA) or a simple SR (SR-online: learning using TD-SR; SR-A: calculating the SR from the learnt transition matrix), could not infer the existence of the unobserved links and solve the task (in fact, they solved it worse than chance). Agents that use a filtered SR representation (SRreg) could answer these questions better than chance. Shadows are the standard errors of the mean (SEM); the centre is the mean. c Under the same learning conditions, the basis set agent, which transfers abstract structural knowledge, was able to infer the structural form (Supplementary Fig. 2) and graph size correctly. d Further, the agent was able to infer the existence of links that were never observed and determined correctly which of two pictures is closer to a target picture, according to the complete graph (green). The agent could do so even though the number of observed links between the two pictures and the target was identical (p(cor) corresponds to the average fraction of correct answers out of 40 questions in each block). When the agent was forced to infer a community structure (red), it answered these questions worse than chance. Shadows are the standard errors of the mean; the centre is the mean.
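The SR agents in panel b use the standard successor representation (Dayan 1993). As a minimal sketch, with our own variable names and a toy ring graph: the SR is M = Σ_t γ^t T^t = (I − γT)^(−1), the discounted expected future occupancy of each state, and it can rank which of two states lies closer to a target. If unobserved links are simply missing from T, a purely associative SR inherits the gap, which is why the associative agents in panel b fail.

```python
import numpy as np

def successor_representation(T, gamma=0.9):
    """Closed-form SR for transition matrix T: (I - gamma*T)^(-1)."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

# Toy ring graph of 8 states.
n = 8
T = np.zeros((n, n))
for i in range(n):
    T[i, (i - 1) % n] = T[i, (i + 1) % n] = 0.5

M = successor_representation(T)

# Distance judgements from the SR: state 1 is one step from target 0,
# state 3 is three steps away, and the discounted occupancy reflects it.
assert M[1, 0] > M[3, 0]
```

Filtering or regularising the SR (as in the SRreg agent) amounts to smoothing M so that occupancy generalizes across nearby states even when a particular link was never sampled; the exact filtering used in the paper is described in its Methods.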
Fig. 5. Transfer of structural knowledge allows inference of unobserved links (Hexagonal graph).
Participants had to indicate which of two pictures is closer to a target picture. Participants who reached the second day of our task with the correct prior expectation over the structural forms performed significantly better on this task than participants with the wrong structural prior (left panel; 30 participants in each group). They were able to answer these questions significantly above chance even when there were links that were never observed and they had to choose between two pictures with an identical number of observed links to the target (right panel). One-tailed t test. **p < 0.01, *p < 0.05. Error bar: SEM. Colour code: Log10(p value).
Fig. 6. Policy transfer: learning graphs with underlying community structure.
a Our agent was able to infer the correct number of communities (middle panel, inferred number of communities averaged over 20 simulations). It was also able to infer the identities of the connecting nodes (lower panel, inferred number of connecting nodes divided by the number of connecting nodes according to the inferred graph size, see 'Methods'). Shadows are the SEM; the centre is the mean. b Participants with the correct structural prior spent less time learning the associations between the pictures (RT = response time for changing to the next pair; upper panel, left). The number of steps to the target (nsteps) was significantly lower for participants with the correct structural prior (upper panels; Dt=0 is the initial number of links between the current picture and the target). During navigation, participants with the correct prior over the structural forms chose connecting nodes more frequently (lower panel, left); they did so even when this choice took them further from the target (lower panel, right). Error bars are the SEM; the centre is the mean. *p < 0.05, **p < 0.01 (20 participants in each group).


Source: PubMed
