Phonatory and articulatory representations of speech production in cortical and subcortical fMRI responses

Joao M Correia, César Caballero-Gaudes, Sara Guediche, Manuel Carreiras

Abstract

Speaking involves the coordination of multiple neuromotor systems, including respiration, phonation and articulation. Developing non-invasive imaging methods to study how the brain controls these systems is critical for understanding the neurobiology of speech production. Recent models and animal research suggest that cortical and subcortical regions beyond the primary motor cortex (M1) help orchestrate the neuromotor control needed for speaking. Using contrasts between speech conditions with controlled respiratory behavior, this fMRI study investigates articulatory gestures involving the tongue, lips and velum (i.e., alveolars versus bilabials, and nasals versus orals) and phonatory gestures (i.e., voiced versus whispered speech). Multivariate pattern analysis (MVPA) was used to decode articulatory gestures in M1, the cerebellum and the basal ganglia. Furthermore, apart from confirming the role of a mid-M1 region in phonation, we found that a dorsal M1 region linked to respiratory control showed significant differences between voiced and whispered speech despite matched lung volume observations. This region was also functionally connected to the tongue and lip M1 seed regions, underlining its importance in the coordination of speech. Our study confirms and extends current knowledge of the neural mechanisms underlying neuromotor speech control, which holds promise for the non-invasive study of neural dysfunctions in motor-speech disorders.
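For readers interested in how such a decoding analysis is typically set up, the following is a minimal sketch of MVPA decoding of two speech conditions from trial-wise fMRI response patterns, written in Python with scikit-learn. The data are simulated and all names and parameters are illustrative assumptions, not the authors' pipeline.

    # Minimal MVPA sketch: linear-SVM classification of two speech conditions
    # (e.g., bilabial vs. alveolar) from trial-by-voxel response patterns.
    # Simulated data; variable names and sizes are illustrative assumptions.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import StratifiedKFold, cross_val_score

    rng = np.random.default_rng(0)
    n_trials, n_voxels = 80, 500                     # hypothetical counts
    X = rng.normal(size=(n_trials, n_voxels))        # trial-wise beta patterns
    y = np.repeat([0, 1], n_trials // 2)             # condition labels

    clf = SVC(kernel="linear", C=1.0)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    accuracy = cross_val_score(clf, X, y, cv=cv).mean()
    print(f"cross-validated decoding accuracy: {accuracy:.2f}")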

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Description of the task. (A) Overview of the task: the MRI session comprised 4 functional runs divided into trials separated by an inter-trial interval of 16 s. In each trial, participants produced a given item 3 times. Items were disyllabic non-words (e.g., bәbә). (B) Stimuli and laryngeal control: stimuli were balanced for place of articulation (bilabial and alveolar) and manner of articulation (oral and nasal), and used the controlled vowel schwa (ә); for the voiced condition, the IA (interarytenoid) and LCA (lateral cricoarytenoid) laryngeal muscles are recruited whereas the PCA (posterior cricoarytenoid) is not, and the reverse holds for the whispered condition. (C) Detail of the task for a given trial: a 0.9-s silent gap was introduced between consecutive TRs to allow speech production without MRI noise; top: sound recording in black and low-pass-filtered signal envelope in red; below left: spectrogram of an utterance example; below right: scatter plot of F1 and F2 formants for a given participant (each dot represents an utterance), red for voiced and blue for whispered speech.
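The low-pass-filtered signal envelope shown in Figure 1C can be approximated with a simple rectify-and-smooth procedure. Below is a minimal sketch assuming a 16 kHz recording and a 10 Hz cutoff; both values and the toy signal are illustrative, not taken from the paper.

    # Rectify the recording and low-pass filter it to obtain an amplitude envelope.
    # Sampling rate, cutoff and the toy signal are assumptions for illustration.
    import numpy as np
    from scipy.signal import butter, filtfilt

    fs = 16000                                                  # assumed sampling rate (Hz)
    t = np.arange(0, 1.0, 1 / fs)
    recording = np.sin(2 * np.pi * 150 * t) * np.exp(-3 * t)    # toy utterance

    b, a = butter(4, 10 / (fs / 2), btype="low")                # 10 Hz low-pass filter
    envelope = filtfilt(b, a, np.abs(recording))                # zero-phase smoothing
    print(f"peak envelope amplitude: {envelope.max():.2f}")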
Figure 2
Behavioral results. (A) Top: loudness per task across all participants; red is voiced and blue is whispered speech. Bottom: loudness per item repetition (3 items produced per trial) across all participants for the voiced speech task. (B) Respiratory impulse response function (resp-IRF) estimated from the average BOLD fluctuation within all cortical voxels. (C) Group results of the respiratory and articulatory recordings; red is voiced and blue is whispered speech. Top: voiced and whispered respiratory fluctuations (the standard error of the mean is shaded); gray horizontal bars indicate significant t-test differences (p < 0.05); ‘in’ and ‘ex’ mark inhale and exhale periods, respectively. Middle: voiced and whispered articulatory fluctuations. Bottom: combined respiratory and articulatory fluctuations.
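The gray significance bars in Figure 2C reflect point-wise comparisons of the voiced and whispered respiratory traces. A minimal sketch of such a comparison, using paired t-tests at each time point on simulated group data, is shown below; the data and threshold are illustrative only.

    # Paired t-test at each time point between two condition-specific traces.
    # Simulated subject-by-time matrices stand in for the respiratory recordings.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_subjects, n_timepoints = 20, 200
    voiced = rng.normal(size=(n_subjects, n_timepoints))
    whispered = rng.normal(size=(n_subjects, n_timepoints))

    t_vals, p_vals = stats.ttest_rel(voiced, whispered, axis=0)
    significant = p_vals < 0.05                      # candidate "gray bar" samples
    print(f"{significant.sum()} of {n_timepoints} time points differ at p < 0.05")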
Figure 3
Univariate fMRI results (uncorrected statistics). (A) Speech versus baseline. The central sulcus (CS), inferior parietal sulcus (IPS), supramarginal gyrus (SMG), Heschl’s gyrus (HG) and inferior frontal gyrus (IFG) are outlined as landmark references. Bottom: flat cerebellum map; black lines represent borders between cerebellar lobules. (B) Voiced versus whispered speech. Top arrows indicate the trunk motor area (TMA) and bottom arrows indicate the dorsal laryngeal motor area (dLMA) found by this contrast. (C) Bilabial versus alveolar conditions. Top arrows indicate the lip motor regions and bottom arrows the tongue motor regions. The bottom cerebellum representation includes labels of the parcellated cerebellar lobules according to the SUIT atlas.
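The univariate contrasts in Figure 3 rest on voxel-wise general linear model (GLM) fits followed by a subtraction of condition betas. A minimal sketch of that logic on simulated data is given below; a real analysis would convolve the regressors with a hemodynamic response function and include nuisance regressors, and none of these values come from the paper.

    # Voxel-wise GLM fit by ordinary least squares, then a condition contrast
    # (e.g., voiced minus whispered). Design and data are simulated toys.
    import numpy as np

    rng = np.random.default_rng(5)
    n_scans, n_voxels = 300, 1000
    design = np.column_stack([
        rng.binomial(1, 0.2, n_scans),    # toy "voiced" regressor
        rng.binomial(1, 0.2, n_scans),    # toy "whispered" regressor
        np.ones(n_scans),                 # intercept
    ])
    Y = rng.normal(size=(n_scans, n_voxels))          # BOLD time series

    betas, *_ = np.linalg.lstsq(design, Y, rcond=None)
    contrast = betas[0] - betas[1]                    # voiced minus whispered
    print(f"largest contrast estimate: {contrast.max():.2f}")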
Figure 4
Multivariate fMRI group results. (A) MVPA ROI + RFE results for the three main contrasts: voiced versus whispered speech; bilabial versus alveolar; oral versus nasal. Classification accuracy is depicted by red bars and the permutation chance level by green bars. Black asterisks (*) indicate significant two-sided paired t-tests of classification accuracy against the permutation chance level (p < 0.05); red asterisks indicate FDR-corrected statistics (q < 0.05) across the multiple ROI tests. (B) Classification importance of the voxels within each ROI (using the RFE algorithm) projected onto the cortical and cerebellar maps. Multiple ROIs are projected simultaneously onto the maps for simplicity; the boundaries of the ROIs are indicated by colored lines and labelled in the top-left map. The sign of a voxel’s importance represents its preference for the first condition (positive values, warm colors) or the second condition (negative values, cold colors). Bottom: maps of voxel importance in the cerebellum.
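The ROI + RFE analysis in Figure 4 combines recursive feature elimination with a classifier and compares the resulting accuracy to a permutation-derived chance level. The sketch below illustrates that combination with scikit-learn on simulated data; the feature counts, fold number and number of permutations are assumptions, not the authors' settings.

    # RFE-based voxel selection nested inside the classifier pipeline, evaluated
    # against an empirical chance level obtained by shuffling the labels.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.feature_selection import RFE
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import permutation_test_score

    rng = np.random.default_rng(2)
    X = rng.normal(size=(80, 200))        # trials x voxels within one ROI (toy)
    y = np.repeat([0, 1], 40)             # e.g., bilabial vs. alveolar labels

    pipe = make_pipeline(
        RFE(SVC(kernel="linear"), n_features_to_select=50, step=0.1),
        SVC(kernel="linear"),
    )
    score, perm_scores, p_value = permutation_test_score(
        pipe, X, y, cv=5, n_permutations=100, random_state=0)
    print(f"accuracy {score:.2f}, chance {perm_scores.mean():.2f}, p = {p_value:.3f}")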
Figure 5
Summary of MVPA ROI + RFE results. The matrix reports the accuracy difference between classification and the permutation chance level. Asterisks (*) indicate significance after FDR correction (q < 0.05).
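The FDR correction referenced in Figures 4 and 5 is commonly the Benjamini-Hochberg procedure applied across the family of ROI tests. A minimal sketch with made-up p-values follows; it uses statsmodels, which may differ from whatever toolbox the authors used.

    # Benjamini-Hochberg FDR correction across several ROI-level p-values (toy values).
    import numpy as np
    from statsmodels.stats.multitest import multipletests

    p_values = np.array([0.001, 0.004, 0.03, 0.2, 0.6])     # one per ROI test
    reject, q_values, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
    print(list(zip(q_values.round(3), reject)))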

Figure 6
Cortical searchlight results. (A–C) The main contrasts (voiced versus whispered, bilabial versus alveolar, oral versus nasal). (D–F) Task-based contrasts (voiced versus whispered speech) performed separately for each stimulus type (schwa, bilabial and alveolar). Top arrows indicate the consistency of the TMA location; bottom arrows indicate the consistency of the LMA location. Arrows are placed identically on every map. (G–J) Contrasts of each place-of-articulation condition (bilabial and alveolar) versus the schwa condition, separately for each task (voiced and whispered speech).
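A searchlight analysis like the one in Figure 6 repeats the decoding within a small neighborhood around every voxel (or surface vertex) and maps the cross-validated accuracy back to that location. The sketch below uses a volumetric Euclidean neighborhood on simulated data; the study itself used a cortical-surface searchlight, and the radius and data sizes here are illustrative assumptions.

    # Searchlight decoding: classify conditions from each voxel's local neighborhood
    # and store the cross-validated accuracy at that voxel. Simulated data.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(3)
    n_trials, n_voxels, radius = 60, 200, 6.0
    coords = rng.uniform(0, 50, size=(n_voxels, 3))    # toy voxel coordinates (mm)
    X = rng.normal(size=(n_trials, n_voxels))          # trial-by-voxel patterns
    y = np.repeat([0, 1], n_trials // 2)

    accuracy_map = np.zeros(n_voxels)
    for v in range(n_voxels):
        neighbors = np.linalg.norm(coords - coords[v], axis=1) <= radius
        accuracy_map[v] = cross_val_score(
            SVC(kernel="linear"), X[:, neighbors], y, cv=5).mean()
    print(f"peak searchlight accuracy: {accuracy_map.max():.2f}")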

Figure 7
Functional connectivity results using beta-time-series correlations based on individually localized lip and tongue motor regions. Circles indicate the approximate location of the seed region on the group-inflated surface (seed regions were identified at the individual-subject level using a p-value threshold). (A) Seed in the left lip motor region. (B) Seed in the left tongue motor region.
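Beta-time-series functional connectivity, as used in Figure 7, correlates the trial-wise response amplitudes (betas) of a seed region with those of every other voxel. The sketch below shows that computation on simulated betas; the seed definition is a placeholder for the individually localized lip or tongue M1 region, not the authors' localizer.

    # Beta-series connectivity: Pearson correlation of a seed's trial-wise betas
    # with every voxel's trial-wise betas. Simulated data; the "seed" is a toy.
    import numpy as np

    rng = np.random.default_rng(4)
    n_trials, n_voxels = 80, 1000
    betas = rng.normal(size=(n_trials, n_voxels))     # trial-wise beta estimates
    seed = betas[:, :20].mean(axis=1)                 # placeholder seed average

    seed_z = (seed - seed.mean()) / seed.std()
    betas_z = (betas - betas.mean(axis=0)) / betas.std(axis=0)
    connectivity = betas_z.T @ seed_z / n_trials      # Pearson r per voxel
    print(f"connectivity ranges from {connectivity.min():.2f} to {connectivity.max():.2f}")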

Figure 8
Summary diagram of the cooperation required for voiced speech and possible experimental conditions used to isolate the sub-components of speech production (Respiration, Phonation and Articulation).

