Revealing the multidimensional mental representations of natural objects underlying human similarity judgements

Martin N Hebart, Charles Y Zheng, Francisco Pereira, Chris I Baker

Abstract

Objects can be characterized according to a vast number of possible criteria (such as animacy, shape, colour and function), but some dimensions are more useful than others for making sense of the objects around us. To identify these core dimensions of object representations, we developed a data-driven computational model of similarity judgements for real-world images of 1,854 objects. The model captured most explainable variance in similarity judgements and produced 49 highly reproducible and meaningful object dimensions that reflect various conceptual and perceptual properties of those objects. These dimensions predicted external categorization behaviour and reflected typicality judgements of those categories. Furthermore, humans can accurately rate objects along these dimensions, highlighting their interpretability and opening up a way to generate similarity estimates from object dimensions alone. Collectively, these results demonstrate that human similarity judgements can be captured by a fairly low-dimensional, interpretable embedding that generalizes to external behaviour.

Trial registration: ClinicalTrials.gov NCT00001360.

Conflict of interest statement

Competing interests

The authors declare no competing interests.

Figures

Extended Data Fig. 1. Reproducibility of dimensions
Reproducibility of dimensions in the chosen 49-dimensional embedding across 20 random initializations (see Extended Data Figure 2 for a list of all dimension labels). Shaded areas reflect 95% confidence intervals.
Extended Data Fig. 2. Labels and word clouds for all 49 model dimensions
Labels for all 49 dimensions, with respective word clouds reflecting the naming frequency across 20 participants. The dimensions appear to reflect both perceptual and conceptual properties of objects. A visual comparison between labels and word clouds indicates a generally good agreement between participant naming and the labels we provided for the dimensions.
Extended Data Fig. 3. Category-typicality correlations
Detailed results of inferential statistical analyses correlating category-related dimensions with typicality of their category. One-sided p-values were generated using randomization tests and were controlled for false discovery rate (FDR) across multiple tests. 90% confidence intervals were used to complement one-sided tests.
Extended Data Fig. 4. Model performance and dimensionality as a function of training data size
Model performance and dimensionality varied as a function of the amount of data used for training the model. Models were trained in steps of 100,000 trials. At each step, six models with random initializations and random subsets of data were trained, and all models were applied to the same test data as in the main text. At some steps, up to two models did not complete on the compute server for technical reasons, leaving between four and six models per step and 78 trained models in total. Results for each individual model and the average for each step are shown in the figure. a. Model performance was already high with 100,000 trials of training data but increased with more data, saturating around the final model performance. b. Dimensionality increased steadily with the amount of training data.
Fig. 1 |
Task and modeling procedure for large-scale identification of mental object representations. For this figure, all images were replaced by images with similar appearance from the public domain. a We applied a triplet odd-one-out similarity task to images of the 1,854 objects in the THINGS database and collected a large number of ratings (1.46 million) using online crowdsourcing. The triplet odd-one-out task measures object similarity as the probability of choosing two objects together. The task evokes different minimal contexts as a basis for grouping objects together, which in turn emphasizes the relevant dimensions. b The goal of the modeling procedure was to learn an interpretable representational embedding that captures choice behavior in the odd-one-out task and predicts object similarity across all pairs of objects. Since only a subset of all possible triplets had been sampled (0.14% of 1.06 billion possible combinations), this model additionally served to complete the sparsely sampled similarity matrix. c The model reflects the assumed cognitive process underlying the odd-one-out task. The embedding was initialized with random weights and predicted which object pair was the most similar, based on the dot product of the embedding vectors. Predicting the most similar pair is equivalent to predicting the remaining object as the odd-one-out. Model predictions were initially at chance (see example for a prediction that deviates from the choice) but gradually came to predict behavioral choices. To allow error backpropagation to the weights, the model was implemented as a shallow neural network.
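The choice process described in this caption (pair similarity as a dot product, with the odd-one-out being the object left over from the most similar pair) can be sketched as follows. This is a minimal NumPy sketch with hypothetical toy data, not the authors' actual implementation, which was a shallow neural network trained with TensorFlow on 1,854 objects:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy embedding: 5 objects x 3 non-negative dimensions
# (the actual model learned 49 dimensions for 1,854 objects).
X = rng.random((5, 3))

def odd_one_out(i, j, k, X):
    """Predict the odd-one-out in a triplet: pair similarity is the dot
    product of the two object embeddings, and the predicted odd-one-out
    is the object left over once the most similar pair is chosen."""
    sims = {k: X[i] @ X[j],   # if (i, j) is the most similar pair, k is odd
            j: X[i] @ X[k],
            i: X[j] @ X[k]}
    return max(sims, key=sims.get)

def choice_probabilities(i, j, k, X):
    """Softmax over the three pairwise similarities, giving the model's
    probability that each object is chosen as the odd-one-out."""
    s = np.array([X[j] @ X[k],   # i is odd when (j, k) is the chosen pair
                  X[i] @ X[k],   # j is odd
                  X[i] @ X[j]])  # k is odd
    e = np.exp(s - s.max())
    return e / e.sum()
```

During training, the softmax probabilities would be compared against participants' actual choices with a cross-entropy loss, and the embedding weights updated by backpropagation.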
Fig. 2 |
Predictiveness of the computational model for single trial behavioral judgments and similarity. a Model performance was evaluated by predicting choice behavior at the individual trial level. The noise ceiling denotes the maximal performance any model could achieve given the noise in the data and is determined by the consistency of participants’ responses to the same triplet. The performance of the model in predicting independent test data approached the noise ceiling, demonstrating excellent predictive performance. Error bars and shaded areas denote 95% confidence intervals. b To estimate how well the model predicted behavioral similarity, a model-generated similarity matrix was compared to a fully sampled behavioral similarity matrix for 48 diverse objects. Results reveal a close fit (Pearson r = 0.90, p < 0.001, randomization test, 95% CI: 0.88–0.91), demonstrating that most explainable variance was captured by the model.
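One way a model-generated similarity matrix can be derived from an embedding, as compared in panel b, is to average the probability of choosing a pair together over all possible third objects. The sketch below uses a hypothetical toy embedding and is only one plausible construction consistent with the task description, not the authors' exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy non-negative embedding standing in for the learned one (hypothetical).
X = rng.random((6, 3))

def model_similarity(X):
    """Model-generated similarity matrix: for each pair (i, j), the
    softmax probability of choosing i and j together in a triplet,
    averaged over all possible third objects k."""
    n = len(X)
    S = X @ X.T  # dot-product similarities
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            probs = []
            for k in range(n):
                if k in (i, j):
                    continue
                s = np.array([S[i, j], S[i, k], S[j, k]])
                e = np.exp(s - s.max())
                probs.append(e[0] / e.sum())  # P(i and j chosen together)
            P[i, j] = P[j, i] = np.mean(probs)
    return P

P = model_similarity(X)
```

The resulting matrix could then be correlated with a behavioral similarity matrix (for example via `np.corrcoef` of the off-diagonal entries), analogous to the Pearson r reported in the figure.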
Fig. 3 |
Example object dimensions illustrating their interpretability. The images reflect the objects with the highest weights along those dimensions. Word clouds illustrate the labels provided by 20 participants after visual exposure to those dimensions, weighted by naming frequency. While responses tended to focus on more extreme examples, they generally exhibited a close correspondence to the dimension labels we generated, which are shown above each set of images (see Extended Data Figure 2 for labels and word clouds of all dimensions). For this figure, all images were replaced by images with similar appearance from the public domain.
Fig. 4 |
Illustration of example objects with their respective dimensions, using circular bar plots (“rose plots”). The length of each petal reflects the degree to which an object dimension is expressed for the image of a given object. For display purposes, dimensions with small weights are not labeled. For this figure, all images were replaced by images with similar appearance from the public domain.
Fig. 5 |
Two-dimensional visualization of the similarity embedding, combining dimensionality reduction (MDS-initialized t-SNE, dual perplexity: 5 and 30, 1,000 iterations) with rose plots for each object (see Fig. 4). At the level of the global structure, the results confirm the well-known distinctions between “animate” and “inanimate” or “man-made” and “natural” objects, with some exceptions (see main text). In addition, the different clusters seem to reflect broader object categories that emerge naturally from object similarity judgments. However, dimensions are not restricted to those clusters but are expressed to different degrees throughout this representational space. For this figure, all images were replaced by images with similar appearance from the public domain.
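The MDS-initialized t-SNE described here can be sketched with scikit-learn, using toy data in place of the 1,854 × 49 embedding. Note that scikit-learn's `TSNE` accepts a single perplexity value, whereas the figure used a dual-perplexity scheme, so this is an approximation of the procedure rather than a reproduction:

```python
import numpy as np
from sklearn.manifold import MDS, TSNE

rng = np.random.default_rng(2)
# Toy stand-in for the 1,854 x 49 object embedding (hypothetical data).
X = rng.random((50, 10))

# Step 1: classical metric MDS provides a 2-D layout that tends to
# preserve the global structure of the embedding.
mds_layout = MDS(n_components=2, random_state=0).fit_transform(X)

# Step 2: t-SNE refines local neighborhood structure, initialized from
# the MDS layout; the scikit-learn default of 1,000 optimization
# iterations matches the figure caption.
emb = TSNE(n_components=2, perplexity=5.0, init=mds_layout,
           random_state=0).fit_transform(X)
```

Initializing t-SNE from an MDS (or PCA) layout is a common way to keep global structure interpretable, since plain t-SNE with random initialization preserves mainly local neighborhoods.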
Fig. 6 |
How many dimensions are required to capture behavioral judgments and object similarity? By iteratively setting the dimensions with the smallest numeric value to 0, we estimated the effect of eliminating those dimensions from judgments. A drop in model performance indicates behavioral relevance of those dimensions. For explaining 95 to 99% of the predictive performance in behavior, between 6 and 11 dimensions are required, while for explaining 95 to 99% of the variance in similarity, between 9 and 15 dimensions are required.
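The dimension-elimination analysis can be sketched as follows, under the assumption that the smallest-valued dimensions are zeroed per object and that "variance in similarity explained" is the squared correlation between the full and pruned dot-product similarity matrices. All data and sizes here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy non-negative embedding (hypothetical; the real one is 1,854 x 49).
X = rng.random((20, 8))

def prune_smallest(X, n_keep):
    """Keep only the n_keep largest dimensions per object and set the
    rest to 0, mimicking the iterative dimension-elimination analysis."""
    Xp = X.copy()
    for row in Xp:
        row[np.argsort(row)[:-n_keep]] = 0.0
    return Xp

def variance_explained(X, Xp):
    """Squared Pearson correlation between the full and pruned
    dot-product similarity matrices (off-diagonal entries only)."""
    S, Sp = X @ X.T, Xp @ Xp.T
    iu = np.triu_indices(len(X), k=1)
    return np.corrcoef(S[iu], Sp[iu])[0, 1] ** 2
```

Sweeping `n_keep` from 1 up to the full dimensionality and recording `variance_explained` traces out a curve like the one in the figure, from which one can read off how many dimensions reach 95% or 99%.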
Fig. 7 |
The relationship between seemingly categorical dimensions and typicality ratings of those categories. Many of these dimensions exhibited a positive relationship between the numeric value of objects along that dimension and the typicality of category membership. This demonstrates that even seemingly categorical dimensions reflect the graded nature of the underlying dimensions and that typicality may be an emergent property of those dimensions. All results were min-max scaled for better comparability. Significant relationships between both variables are displayed in bold typeface (p < 0.05 one-sided, FDR-corrected for multiple comparisons). See Extended Data Figure 3 for individual inferential statistical results.
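The one-sided randomization tests with FDR correction used for these dimension-typicality correlations can be sketched generically as below. The permutation count, data, and variable names are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(4)

def randomization_pvalue(x, y, n_perm=10_000, rng=rng):
    """One-sided randomization test for a positive Pearson correlation
    between dimension values (x) and typicality ratings (y): the p-value
    is the fraction of label permutations whose correlation is at least
    as large as the observed one (with the standard +1 correction)."""
    obs = np.corrcoef(x, y)[0, 1]
    exceed = sum(np.corrcoef(x, rng.permutation(y))[0, 1] >= obs
                 for _ in range(n_perm))
    return (exceed + 1) / (n_perm + 1)

def fdr_bh(pvals, alpha=0.05):
    """Benjamini-Hochberg procedure: boolean mask of tests that survive
    false-discovery-rate correction across multiple comparisons."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        kmax = int(np.max(np.nonzero(below)[0]))
        reject[order[:kmax + 1]] = True
    return reject
```

Dimensions whose FDR-corrected p-values fall below 0.05 would be the ones displayed in bold typeface in the figure.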
Fig. 8 |
Task and results of direct ratings of dimensions. a 20 participants were asked to indicate with a mouse click where they believed objects would fall along all 49 model dimensions. Rather than providing participants with dimension labels, the rating scale was spanned by example images along the currently rated dimension (in this example, dimension 1, “artificial/hard”). b Results for the 20 tested objects revealed a good reconstruction of object similarity by dimension ratings when comparing it to the similarity predicted from the embedding that served as a reference (Pearson r = 0.85, p < 0.001, randomization test, 95% CI: 0.80–0.89). These results further support the idea that dimensions are interpretable and that they can be used to directly generate object similarities. For this figure, all images were replaced by images with similar appearance from the public domain.


Source: PubMed
