Bottom-up processing of curvilinear visual features is sufficient for animate/inanimate object categorization

Valentinos Zachariou, Amanda C Del Giacco, Leslie G Ungerleider, Xiaomin Yue

Abstract

Animate and inanimate objects differ in their intermediate visual features. For instance, animate objects tend to be more curvilinear compared to inanimate objects (e.g., Levin, Takarae, Miner, & Keil, 2001). Recently, it has been demonstrated that these differences in intermediate visual features are sufficient for categorization: Human participants viewing synthesized images of animate and inanimate objects that differ largely in the amount of these visual features classify objects as animate/inanimate significantly above chance (Long, Störmer, & Alvarez, 2017). A remaining question, however, is whether the observed categorization is a consequence of top-down cognitive strategies (e.g., rectangular shapes are less likely to be animals) or of bottom-up processing of the intermediate visual features per se, in the absence of such strategies. To address this issue, we repeated the classification experiment of Long et al. (2017) but, unlike that study, matched the synthesized images, on average, in the amount of image-based and perceived curvilinear and rectilinear information. Additionally, in our synthesized images, global shape information was not preserved, and the images appeared as texture patterns. These changes prevented participants from using top-down cognitive strategies to perform the task. During the experiment, participants were presented with these synthesized, texture-like animate and inanimate images and, on each trial, were required to classify them as either animate or inanimate, with no feedback given. Participants were told that these synthesized images depicted abstract art patterns. We found that participants still classified the synthesized stimuli significantly above chance, even though they were unaware of their classification performance. For both object categories, participants relied more on the curvilinear and less on the rectilinear image-based information present in the stimuli for classification. Surprisingly, the stimuli most consistently classified as animate corresponded to the most dangerous animals in our sample of images. We conclude that bottom-up processing of intermediate features present in the visual input is sufficient for animate/inanimate object categorization and that these features may convey information associated with the affective content of the visual stimuli.
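As context for the image-based curvilinear and rectilinear measures mentioned above, the following is a minimal sketch of one generic way such measures can be computed: comparing an image's rectified responses to a bank of curved, Gabor-like filters against its responses to straight Gabor filters. The filter construction, parameter values, and the resulting index are illustrative assumptions only and are not the measure used in the study.

```python
# Illustrative sketch (not the authors' pipeline): score an image for
# "curvilinear" vs. "rectilinear" content by comparing its energy in a bank of
# curved Gabor-like filters against a bank of straight Gabor filters.
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size=31, wavelength=8.0, sigma=6.0, theta=0.0, bend=0.0):
    """Gabor-like kernel; bend > 0 curves the carrier stripes parabolically."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)   # along-carrier axis
    yr = -x * np.sin(theta) + y * np.cos(theta)  # across-carrier axis
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * (xr + bend * yr**2) / wavelength)
    kernel = envelope * carrier
    return kernel - kernel.mean()                # zero-mean, ignores luminance

def band_energy(image, bends, n_orientations=8):
    """Mean rectified filter-response energy over a bank of orientations/bends."""
    total = 0.0
    for bend in bends:
        for theta in np.linspace(0.0, np.pi, n_orientations, endpoint=False):
            k = gabor_kernel(theta=theta, bend=bend)
            total += np.abs(fftconvolve(image, k, mode="same")).mean()
    return total / (len(bends) * n_orientations)

def curvilinearity_index(image):
    """1.0 = purely curved-filter energy, 0.0 = purely straight-filter energy."""
    straight = band_energy(image, bends=[0.0])
    curved = band_energy(image, bends=[0.02, 0.05])   # bend values are arbitrary
    return curved / (curved + straight)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.standard_normal((128, 128))        # stand-in for a stimulus image
    print(f"curvilinearity index: {curvilinearity_index(img):.3f}")
```

Stimulus sets matched "on average" in such measures would have approximately equal mean indices for the animate and inanimate categories, so the index itself cannot distinguish the two groups.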

Trial registration: ClinicalTrials.gov NCT00001360.

Figures

Figure 1
Three example inanimate objects used in Long et al. (2017), together with (a) their corresponding texform images, created using the Freeman and Simoncelli (2011) algorithm, and (b) their corresponding synthesized images, created using the Portilla and Simoncelli (2000) algorithm. The images under the “original,” “controlled,” and “texform” columns were extracted directly from figure 1 of Long et al. (2017). The images under the “texform” column were created using the algorithm described in Freeman and Simoncelli (2011), with the slight modifications outlined in Long et al. (2016). The images under the “synthesized” column were created using the algorithm described in Portilla and Simoncelli (2000), which we used in this study. Both the “texform” and “synthesized” algorithms used the images under the “controlled” column as inputs.
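The Portilla and Simoncelli (2000) algorithm synthesizes a new image whose joint wavelet statistics match those of the input while global shape is scrambled. The toy sketch below shows only the general statistics-matching-by-optimization structure of such a synthesis; the two statistics used here are deliberately minimal stand-ins and bear no resemblance to the published model's steerable-pyramid statistic set.

```python
# Toy sketch of statistics-matching texture synthesis in the spirit of
# Portilla & Simoncelli (2000): start from noise and optimize it so that a
# small set of image statistics matches those of the source image.
import torch

def toy_statistics(img: torch.Tensor) -> torch.Tensor:
    """A tiny stand-in statistic vector: marginal moments plus local correlations."""
    mean = img.mean()
    var = img.var()
    skew = ((img - mean) ** 3).mean() / (var ** 1.5 + 1e-8)
    # horizontal and vertical neighbor correlations (crude spatial structure)
    h_corr = (img[:, :-1] * img[:, 1:]).mean()
    v_corr = (img[:-1, :] * img[1:, :]).mean()
    return torch.stack([mean, var, skew, h_corr, v_corr])

def synthesize(source: torch.Tensor, steps: int = 500, lr: float = 0.05) -> torch.Tensor:
    """Gradient-descend a noise image until its statistics match the source's."""
    target = toy_statistics(source).detach()
    synth = torch.randn_like(source, requires_grad=True)
    opt = torch.optim.Adam([synth], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(toy_statistics(synth), target)
        loss.backward()
        opt.step()
    return synth.detach()

if __name__ == "__main__":
    source = torch.rand(64, 64)       # stand-in for a "controlled" object image
    result = synthesize(source)
    print(toy_statistics(source), toy_statistics(result), sep="\n")
```

Because the optimization only constrains summary statistics, the synthesized output preserves texture-like structure but not the object's global contour, which is the property that makes such stimuli appear as abstract texture patterns.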
Figure 2
Example stimuli and sample trial displays from the rating and classification sessions of the experiment. (A) Examples of animate images with their corresponding synthesized stimuli. (B) Examples of inanimate/man-made images with their corresponding synthesized stimuli. (C) A sample trial display from the rating session of the experiment. This example depicts a trial in which a participant rated a synthesized stimulus on its degree of “boxiness,” that is, how rectilinear the image appeared to him or her. The black bar corresponds to the amount of “boxiness” this participant attributed to the stimulus image. (D) A sample trial display from the classification session of the experiment. This example depicts a trial in which a participant classified the synthesized stimulus as “animate.” The black bar represents the participant's confidence in his or her classification choice.
Figure 3
The figure depicts how the classification accuracy of the animate (X) and inanimate (O) images varied as a function of the amount of calculated curvilinear information present in the images. The best-fit line for the animate category is represented by the solid black line, and the best-fit line for the inanimate category is represented by the dotted gray line. The average classification accuracy was 54.62%, which is significantly above chance.
Figure 4
Rank-ordered classification accuracy for the synthesized animate images, in descending order. A sample of the original images, corresponding to the rank-ordered image IDs of the synthesized images, is presented on top of the bars. The horizontal dashed line represents chance performance (50% accuracy). The error bars denote ±1 SEM.

References

    1. Bar M, Neta M. Humans prefer curved visual objects. Psychological Science. (2006);17(8):645–648.
    2. Bell A. H, Hadj-Bouziane F, Frihauf J. B, Tootell R. B, Ungerleider L. G. Object representations in the temporal cortex of monkeys and humans as revealed by functional magnetic resonance imaging. Journal of Neurophysiology. (2009);101(2):688–700.
    3. Bi Y, Wang X, Caramazza A. Object domain and modality in the ventral visual pathway. Trends in Cognitive Sciences. (2016);20(4):282–290.
    4. Cauchoix M, Crouzet S. M, Fize D, Serre T. Fast ventral stream neural activity enables rapid visual categorization. NeuroImage. (2016);125:280–290.
    5. Cheung O. S, Gauthier I. Visual appearance interacts with conceptual knowledge in object recognition. Frontiers in Psychology. (2014);5:793.
    6. Freeman J, Simoncelli E. P. Metamers of the ventral stream. Nature Neuroscience. (2011);14(9):1195–1201.
    7. Gallant J. L, Braun J, Van Essen D. C. Selectivity for polar, hyperbolic, and Cartesian gratings in macaque visual cortex. Science. (1993 Jan 1);259:100–103.
    8. Grill-Spector K, Kanwisher N. Visual recognition: As soon as you know it is there, you know what it is. Psychological Science. (2005);16(2):152–160.
    9. Haxby J. V, Gobbini M. I, Furey M. L, Ishai A, Schouten J. L, Pietrini P. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science. (2001 Sep 28);293(5539):2425–2430.
    10. Hubel D. H, Wiesel T. N. Receptive fields of single neurones in the cat's striate cortex. The Journal of Physiology. (1959);148(3):574–591.
    11. Hung C. P, Kreiman G, Poggio T, DiCarlo J. J. Fast readout of object identity from macaque inferior temporal cortex. Science. (2005 Nov 4);310(5749):863–866.
    12. Kanwisher N. Functional specificity in the human brain: A window into the functional architecture of the mind. Proceedings of the National Academy of Sciences, USA. (2010);107(25):11163–11170.
    13. Kriegeskorte N, Mur M, Ruff D. A, Kiani R, Bodurka J, Esteky H, Bandettini P. A. Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron. (2008);60(6):1126–1141.
    14. Lang P. J, Bradley M. M, Cuthbert B. N. International affective picture system (IAPS): Affective ratings of pictures and instruction manual. Technical Report A-8. Gainesville, FL: University of Florida; (2008).
    15. Levin D. T, Takarae Y, Miner A. G, Keil F. Efficient visual search by category: Specifying the features that mark the difference between artifacts and animals in preattentive vision. Attention, Perception, & Psychophysics. (2001);63(4):676–697.
    16. Long B, Konkle T, Cohen M. A, Alvarez G. A. Mid-level perceptual features distinguish objects of different real-world sizes. Journal of Experimental Psychology: General. (2016);145(1):95–109.
    17. Long B, Störmer V. S, Alvarez G. A. Mid-level perceptual features contain early cues to animacy. Journal of Vision. (2017);17(6):20, 1–20. doi:10.1167/17.6.20.
    18. Perrett D. I, Hietanen J. K, Oram M. W, Benson P. J, Rolls E. T. Organization and functions of cells responsive to faces in the temporal cortex. Philosophical Transactions of the Royal Society of London B. (1992);355(1273):23–30.
    19. Perrinet L. U, Bednar J. A. Edge co-occurrences can account for rapid categorization of natural versus animal images. Scientific Reports. (2015);5:11400.
    20. Portilla J, Simoncelli E. P. A parametric texture model based on joint statistics of complex wavelet coefficients. International Journal of Computer Vision. (2000);40(1):49–70.
    21. Riesenhuber M, Poggio T. Hierarchical models of object recognition in cortex. Nature Neuroscience. (1999);2(11):1019–1025.
    22. Rosch E, Mervis C. B, Gray W. D, Johnson D. M, Boyes-Braem P. Basic objects in natural categories. Cognitive Psychology. (1976);8(3):382–439.
    23. Schmidt F, Hegele M, Fleming R. W. Perceiving animacy from shape. Journal of Vision. (2017);17(11):10, 1–15. doi:10.1167/17.11.10.
    24. Serre T, Oliva A, Poggio T. A feedforward architecture accounts for rapid categorization. Proceedings of the National Academy of Sciences, USA. (2007);104(15):6424–6429.
    25. Tanaka K. Inferotemporal cortex and object vision. Annual Review of Neuroscience. (1996);19:109–139.
    26. Tavakol M, Dennick R. Making sense of Cronbach's alpha. International Journal of Medical Education. (2011);2:53–55.
    27. Thorpe S, Fize D, Marlot C. Speed of processing in the human visual system. Nature. (1996 Jun 6);381(6582):520–522.
    28. Tsao D. Y, Freiwald W. A, Tootell R. B, Livingstone M. S. A cortical region consisting entirely of face-selective cells. Science. (2006 Feb 3);311(5761):670–674.
    29. Yue X, Pourladian I. S, Tootell R. B, Ungerleider L. G. Curvature-processing network in macaque visual cortex. Proceedings of the National Academy of Sciences, USA. (2014);111(33):E3467–E3475.

Source: PubMed
