Hearing lips and seeing voices: how cortical areas supporting speech production mediate audiovisual speech perception

Jeremy I Skipper, Virginie van Wassenhove, Howard C Nusbaum, Steven L Small

Abstract

Observing a speaker's mouth profoundly influences speech perception. For example, listeners perceive an "illusory" "ta" when the video of a face producing /ka/ is dubbed onto an audio /pa/. Here, we show how cortical areas supporting speech production mediate this illusory percept and audiovisual (AV) speech perception more generally. Specifically, cortical activity during AV speech perception occurs in many of the same areas that are active during speech production. We find that different perceptions of the same syllable, and the perception of different syllables, are associated with different distributions of activity in frontal motor areas involved in speech production. Activity patterns in these frontal motor areas resulting from the illusory "ta" percept are more similar to the activity patterns evoked by AV(/ta/) than to those evoked by AV(/pa/) or AV(/ka/). In contrast to the activity in frontal motor areas, stimulus-evoked activity for the illusory "ta" in auditory and somatosensory areas and in visual areas initially resembles activity evoked by AV(/pa/) and AV(/ka/), respectively. Ultimately, though, activity in these regions comes to resemble activity evoked by AV(/ta/). Together, these results suggest that AV speech elicits in the listener a motor plan for the production of the phoneme that the speaker might have been attempting to produce, and that feedback in the form of efference copy from the motor system ultimately influences the phonetic interpretation.
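To make the pattern-similarity logic of the abstract concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the simulated beta estimates, the region-of-interest size, the variable names, and the use of Pearson correlation across voxels are all illustrative assumptions.

```python
# Sketch (assumed data, not the authors' pipeline): compare the voxelwise
# activity pattern evoked by the McGurk stimulus (audio /pa/ + visual /ka/,
# perceived "ta") against the patterns evoked by congruent AV/pa/, AV/ka/,
# and AV/ta/ within a hypothetical frontal motor ROI.
import numpy as np

rng = np.random.default_rng(0)
n_voxels = 500  # hypothetical number of voxels in the ROI

# Hypothetical beta estimates (one value per voxel) for each condition.
beta_mcgurk = rng.standard_normal(n_voxels)
betas_congruent = {
    "AV/pa/": rng.standard_normal(n_voxels),
    "AV/ka/": rng.standard_normal(n_voxels),
    # Simulated so that AV/ta/ shares structure with the McGurk pattern,
    # mirroring the result described in the abstract.
    "AV/ta/": 0.8 * beta_mcgurk + 0.2 * rng.standard_normal(n_voxels),
}

for label, beta in betas_congruent.items():
    r = np.corrcoef(beta_mcgurk, beta)[0, 1]
    print(f"r(McGurk, {label}) = {r:+.2f}")
# On the account above, r(McGurk, AV/ta/) exceeds the other two correlations
# in frontal motor areas.
```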

Conflict of interest statement

Conflict of Interest: None declared.

Figures

Figure 1
Neurally specified model of AV speech perception as presented in the text. A multisensory description, in the form of a hypothesis about the observed talker’s mouth movements and speech sounds (in STp areas), results in the specification (solid lines) of the motor goals of that hypothesis (in the POp, the suggested human homologue of macaque area F5, where mirror neurons have been found). These motor goals are mapped to a motor plan that can be used to reach that goal (in PMv and primary motor cortices [M1]). This results in the prediction, through efference copy (dashed lines), of the auditory and somatosensory states associated with executing those motor commands. Auditory (in STp areas) and somatosensory (in the SMG and primary and secondary somatosensory cortices [SI/SII]) predictions are compared with the current description of the sensory state of the listener. The result is an improvement in speech perception in AV contexts due to a reduction in ambiguity of the intended message of the observed talker.
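The predict-and-compare loop in this model can be expressed as a toy computation. The sketch below is purely conceptual and comes with heavy assumptions: a fixed linear map stands in for the efference-copy forward model, and the motor plans and sensory vectors are fabricated; none of this is from the paper's methods.

```python
# Conceptual sketch (an assumption, not the authors' implementation) of
# Figure 1: each candidate motor plan generates, via efference copy, a
# prediction of its sensory consequences; the plan whose prediction best
# matches the observed AV input disambiguates the percept.
import numpy as np

def forward_model(motor_plan: np.ndarray) -> np.ndarray:
    """Map a motor plan to predicted sensory consequences.

    Placeholder for the efference-copy pathway; a fixed random linear
    map is an illustrative assumption.
    """
    rng = np.random.default_rng(42)  # fixed seed -> same mapping each call
    W = rng.standard_normal((8, 8))
    return W @ motor_plan

# Hypothetical motor plans for three candidate syllables.
rng = np.random.default_rng(1)
plans = {s: rng.standard_normal(8) for s in ("/pa/", "/ta/", "/ka/")}

# Observed input: closest to the consequences of producing /ta/, plus noise.
observed = forward_model(plans["/ta/"]) + 0.1 * rng.standard_normal(8)

# Choose the hypothesis with the smallest sensory prediction error.
errors = {s: np.linalg.norm(observed - forward_model(p))
          for s, p in plans.items()}
print(min(errors, key=errors.get))  # -> "/ta/"
```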
Figure 2
Logical conjunction analyses. Orange indicates regions where activation associated with speaking syllables overlaps with activation associated with passively (A) listening to and watching the same congruent AV syllables; (B) watching only the video of these syllables without the accompanying audio track (V); and (C) listening to the syllables without the accompanying video track (A). Overlap images were created from images each thresholded at P < 0.05 corrected and then logically conjoined. Blue indicates additional regions activated by passive perception alone and not by speech production (P < 0.05 corrected).
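A logical conjunction of this kind reduces to a voxelwise AND of independently thresholded maps. The following Python is illustrative only; the image grid, the simulated statistics, and the threshold value are assumptions standing in for the corrected statistical images.

```python
# Sketch of a logical conjunction as in Figure 2: threshold the production
# map and the perception map separately, then keep voxels surviving in both.
import numpy as np

rng = np.random.default_rng(2)
shape = (64, 64, 32)  # hypothetical image grid

# Hypothetical voxelwise statistics for the two conditions.
t_production = rng.standard_normal(shape)
t_perception = rng.standard_normal(shape)

t_crit = 2.0  # stand-in for the P < 0.05 corrected threshold
production_mask = t_production > t_crit
perception_mask = t_perception > t_crit

overlap = production_mask & perception_mask           # "orange" voxels
perception_only = perception_mask & ~production_mask  # "blue" voxels
print(overlap.sum(), perception_only.sum())
```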
Figure 3
Correlation analyses. Correlation of the distribution of activation associated with passively listening to and watching the incongruent AV syllable made from an audio /pa/ and a visual /ka/ (denoted as ApVk) with the distributions of activation for AV/pa/ (i.e., “ApVk = AV/pa/” in gray), AV/ka/ (i.e., “ApVk = AV/ka/” in blue), or AV/ta/ (i.e., “ApVk = AV/ta/” in orange) in regions that overlap with speech production. The ApVk stimulus elicited the McGurk-MacDonald effect, perceived as “ta” in this group of participants. (A) Correlation analysis collapsed over the entire time course of activation in all frontal, auditory and somatosensory, and occipital regions that overlap with speech production (Friedman test on pairwise correlations, P values < 0.004; Nemenyi post hoc tests on resulting ranks, *P values < 0.002). This analysis was also conducted at each time point following stimulus onset in the frontal and the auditory and somatosensory regions that overlap with speech production (see Experimental Procedures). The entire time course of activation is shown for an example (B) motor region, PMv cortex in the right hemisphere; (C) auditory and somatosensory region, the SMG in the left hemisphere; and (D) visual region, the middle occipital gyrus in the right hemisphere (P values < 0.05).
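The group statistics named here (a Friedman test on pairwise correlations followed by Nemenyi post hoc tests on the ranks) can be sketched as below. The per-subject correlation values and group size are fabricated for illustration, and the Nemenyi step uses the third-party scikit-posthocs package rather than anything specified in the paper.

```python
# Sketch (assumed data) of the Figure 3 group statistics: Friedman test over
# per-participant pattern correlations, then a Nemenyi post hoc on the ranks.
import numpy as np
from scipy import stats
import scikit_posthocs as sp  # pip install scikit-posthocs

rng = np.random.default_rng(3)
n_subjects = 12  # hypothetical group size

# Hypothetical per-subject correlations of ApVk with each congruent syllable.
r_pa = rng.normal(0.1, 0.1, n_subjects)
r_ka = rng.normal(0.1, 0.1, n_subjects)
r_ta = rng.normal(0.4, 0.1, n_subjects)

stat, p = stats.friedmanchisquare(r_pa, r_ka, r_ta)
print(f"Friedman chi2 = {stat:.2f}, P = {p:.4f}")

# Nemenyi post hoc: rows = subjects (blocks), columns = conditions.
data = np.column_stack([r_pa, r_ka, r_ta])
print(sp.posthoc_nemenyi_friedman(data))
```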
Figure 4
Analysis of the classification condition (i.e., run 5). Contrast (P < 0.05 corrected) of activation resulting from hearing the syllable made from an audio /pa/ and a visual /ka/ (denoted as ApVk) when it was classified in one of two ways. Blue and orange indicate regions showing differential activation when participants classified ApVk as “ka” or “ta,” respectively, in a 3AFC task. Activation when ApVk was classified as “ka” is seen in the middle and inferior frontal gyri and the insula. Activation when ApVk was classified as “ta” or as “ka” occupies spatially adjacent but distinct areas in the right inferior and superior parietal lobules, left somatosensory cortices, left PMv cortex, and left primary motor cortex.
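A contrast between the two classification outcomes can be sketched as a voxelwise paired comparison. The paired t-test below is an assumption, not necessarily the authors' statistical model; the data are simulated, and the uncorrected alpha is a stand-in for the corrected threshold the paper reports.

```python
# Sketch of the Figure 4 contrast: for each voxel, compare activity when
# ApVk was classified "ka" against when it was classified "ta", and
# threshold in both directions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_subjects, n_voxels = 12, 1000  # hypothetical

beta_ka = rng.standard_normal((n_subjects, n_voxels))
beta_ta = rng.standard_normal((n_subjects, n_voxels))

t, p = stats.ttest_rel(beta_ka, beta_ta, axis=0)
alpha = 0.05  # stand-in; the paper uses a corrected threshold

ka_gt_ta = (p < alpha) & (t > 0)  # "blue" voxels: classified-"ka" > "ta"
ta_gt_ka = (p < alpha) & (t < 0)  # "orange" voxels: classified-"ta" > "ka"
print(ka_gt_ta.sum(), ta_gt_ka.sum())
```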

Source: PubMed
