Action planning and predictive coding when speaking

Jun Wang, Daniel H Mathalon, Brian J Roach, James Reilly, Sarah K Keedy, John A Sweeney, Judith M Ford

Abstract

Across the animal kingdom, sensations resulting from an animal's own actions are processed differently from sensations resulting from external sources, with self-generated sensations being suppressed. A forward model has been proposed to explain this process across sensorimotor domains. During vocalization, reduced processing of one's own speech is believed to result from a comparison of speech sounds to corollary discharges of intended speech production generated from efference copies of commands to speak. Until now, anatomical and functional evidence validating this model in humans has been indirect. Using EEG with anatomical MRI to facilitate source localization, we demonstrate that inferior frontal gyrus activity during the 300ms before speaking was associated with suppressed processing of speech sounds in auditory cortex around 100ms after speech onset (N1). These findings indicate that an efference copy from speech areas in prefrontal cortex is transmitted to auditory cortex, where it is used to suppress processing of anticipated speech sounds. About 100ms after N1, a subsequent auditory cortical component (P2) was not suppressed during talking. The combined N1 and P2 effects suggest that although sensory processing is suppressed as reflected in N1, perceptual gaps may be filled as reflected in the lack of P2 suppression, explaining the discrepancy between sensory suppression and preserved sensory experiences. These findings, coupled with the coherence between relevant brain regions before and during speech, provide new mechanistic understanding of the complex interactions between action planning and sensory processing that provide for differentiated tagging and monitoring of one's own speech, processes disrupted in neuropsychiatric disorders.
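The forward-model account summarized above can be captured in a minimal numerical sketch (purely illustrative; the function, waveform, and prediction gain are assumptions, not taken from the study): an efference copy of the speech command generates a corollary discharge, a prediction of the upcoming sound, and the auditory response reflects the residual after that prediction is subtracted from the incoming sensation, so a self-generated sound that matches its prediction evokes a smaller response than the identical sound heard passively.

```python
import numpy as np

def auditory_response(sensation, corollary_discharge=None):
    """Toy forward model: the evoked response scales with the residual
    between the incoming sensation and the predicted sensation."""
    if corollary_discharge is None:
        corollary_discharge = np.zeros_like(sensation)
    return np.abs(sensation - corollary_discharge).sum()

# Illustrative "ah" speech sound, represented as a short waveform.
ah = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 200))

# Talking: the efference copy predicts most of the upcoming speech sound.
talking = auditory_response(ah, corollary_discharge=0.9 * ah)

# Listening: the same sound arrives with no corollary discharge.
listening = auditory_response(ah)

print(talking < listening)  # True: the self-generated sound is suppressed
```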

Keywords: Corollary discharge; Efference copy; IFG; STG.

Published by Elsevier Inc.

Figures

Figure 1
Illustration of the behavioral tasks. The left panel shows a cartoon profile of a healthy subject talking (saying “ah”), and the right panel shows the subject listening to a playback of “ah” through headphones. The audio system records the speech sounds during Talking and plays them back during Listening. The intention to say “ah” is indicated as an orange “thought bubble” over the left hemisphere IFG area. The orange curved arrow pointing from the IFG area to auditory cortex indicates the transmission of the efference copy of the motor plan, which produces a corollary discharge (blue burst) of the expected sensation in auditory cortex. When the expected sensation (corollary discharge) matches the actual sensation in auditory cortex (green burst), perception is suppressed.
Figure 2
Mean voltage scalp maps and ERPs in the Talking and Listening tasks. The left voltage scalp plots show spatial distributions of the pre-speech, N1, and P2 ERP responses. Sensors showing maximum activity are marked by a red circle surrounding FCz. Note that stronger pre-speech responses were observed in the Talking task, while stronger N1 and P2 responses were observed in the Listening task. On the right are ERPs recorded from FCz, time-locked to the onset of the speech sound (dotted vertical line), during both the Listening (red lines) and Talking (blue lines) tasks. During Talking, N1 to the speech sound is suppressed relative to N1 to the same sound during Listening. In addition, there is a slow pre-speech negative activity spanning from -300 to 0 milliseconds. Amplitude (microvolts) is on the y-axis and time (milliseconds) is on the x-axis. Vertex negativity is plotted down.
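For readers less familiar with ERP quantification, the sketch below shows one common way to measure the components plotted here: mean amplitude within fixed windows at a single channel, time-locked to speech onset. The sampling rate, window limits, and placeholder data are assumptions chosen to match the approximate N1 (~100 ms) and P2 (~200 ms) latencies described above, not the authors' exact analysis parameters.

```python
import numpy as np

def mean_amplitude(erp, times, t_start, t_end):
    """Mean ERP amplitude (in microvolts) within a time window (in seconds)."""
    window = (times >= t_start) & (times <= t_end)
    return erp[window].mean()

# Hypothetical speech-onset-locked ERP at FCz, sampled at 500 Hz from -0.3 to 0.5 s.
times = np.arange(-0.3, 0.5, 1 / 500)
erp_fcz = np.random.default_rng(1).normal(size=times.size)  # placeholder data

pre_speech = mean_amplitude(erp_fcz, times, -0.3, 0.0)  # slow pre-speech negativity
n1 = mean_amplitude(erp_fcz, times, 0.08, 0.12)         # window around 100 ms for N1
p2 = mean_amplitude(erp_fcz, times, 0.18, 0.22)         # window around 200 ms for P2
```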
Figure 3
(a) Source localization maps parsing the Task x Period interactions, showing the Task effect for each time period separately. Red represents greater activity during Talking than Listening, and green represents greater activity during Listening than Talking. Note greater activity in IFG (white circles) and mouth sensorimotor area (yellow circles) during Talking than Listening and greater activity in STG during Listening than Talking. (b) Source localization maps for P2 compared to baseline, averaging across Talking and Listening tasks.
Figure 4
Frontal-Temporal source coherence for Talking and Listening tasks. The bar graph shows the magnitudes of source coherence during Talking (black bar) and Listening (gray bar) between IFG areas and auditory cortex. Error bars indicate standard error, *significance at p
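As a rough illustration of the coherence measure summarized in this bar graph, the snippet below estimates magnitude-squared coherence between two source time series with scipy.signal.coherence. The sampling rate, coupling strength, and averaging across frequencies are illustrative assumptions, not the authors' pipeline, which computed coherence between IFG and auditory-cortex source activity.

```python
import numpy as np
from scipy.signal import coherence

fs = 500  # assumed sampling rate in Hz
rng = np.random.default_rng(2)

# Placeholder source waveforms for IFG and auditory cortex (2 s of data each).
ifg = rng.normal(size=2 * fs)
auditory = 0.6 * ifg + rng.normal(size=2 * fs)  # partially coupled with the IFG signal

# Magnitude-squared coherence as a function of frequency, then averaged.
freqs, coh = coherence(ifg, auditory, fs=fs, nperseg=256)
mean_coherence = coh.mean()  # larger values indicate stronger frontal-temporal coupling
```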

Figure 5
Bivariate scatter plots depict the relationship between (a) N1 suppression and source activity differences between Talking and Listening in IFG areas, and (b) N1 suppression and source coherence between the IFG area and primary auditory cortex. N1 suppression = N1 (Talking) – N1 (Listening). Source coherence difference = Source coherence (Talking) – Source coherence (Listening).
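The difference scores on the axes of these scatter plots follow directly from the definitions in the caption. The sketch below makes them explicit using hypothetical per-subject values; the numbers and the Pearson correlation are illustrative only and do not reproduce the paper's statistics.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-subject values (N1 amplitudes in microvolts, coherence unitless).
n1_talking = np.array([-2.1, -1.5, -3.0, -2.4, -1.8])
n1_listening = np.array([-4.0, -3.2, -4.5, -3.8, -3.0])
coherence_talking = np.array([0.45, 0.38, 0.52, 0.47, 0.41])
coherence_listening = np.array([0.30, 0.29, 0.33, 0.35, 0.28])

# N1 suppression = N1 (Talking) - N1 (Listening)
n1_suppression = n1_talking - n1_listening

# Source coherence difference = coherence (Talking) - coherence (Listening)
coherence_difference = coherence_talking - coherence_listening

r, p = pearsonr(n1_suppression, coherence_difference)  # relationship shown in panel (b)
```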

Source: PubMed
