Perception of Rhythmic Speech Is Modulated by Focal Bilateral Transcranial Alternating Current Stimulation

Benedikt Zoefel, Isobella Allard, Megha Anil, Matthew H Davis

Abstract

Several recent studies have used transcranial alternating current stimulation (tACS) to demonstrate a causal role of neural oscillatory activity in speech processing. In particular, it has been shown that the ability to understand speech in a multi-speaker scenario or background noise depends on the timing of speech presentation relative to simultaneously applied tACS. However, it is possible that tACS did not change actual speech perception but rather auditory stream segregation. In this study, we tested whether the phase relation between tACS and the rhythm of degraded words, presented in silence, modulates word report accuracy. We found strong evidence for a tACS-induced modulation of speech perception, but only if the stimulation was applied bilaterally using ring electrodes (not for unilateral left hemisphere stimulation with square electrodes). These results were only obtained when data were analyzed using a statistical approach that was identified as optimal in a previous simulation study. The effect was driven by a phasic disruption of word report scores. Our results suggest a causal role of neural entrainment for speech perception and emphasize the importance of optimizing stimulation protocols and statistical approaches for brain stimulation research.

Conflict of interest statement

Conflict of Interest: The authors declare no competing financial interests.

Figures

Figure 1. Experimental paradigm.
A. Rhythmic noise-vocoded speech sounds, consisting of five one-syllable words, were presented so that the p-center of all syllables (dashed lines) fell at one of eight different phases of simultaneously applied tACS. After each trial, participants selected images corresponding to the second, third and fourth words they thought they had heard, with eight possible options for each word. B. Electrode configuration in Experiments 1 (unilateral) and 2 (bilateral). For both configurations, connector positions are shown in black. For the bilateral configuration, parts of the outer electrodes covered by insulating tape are colored black. C. Blocked design used in both experiments. Participants completed 4 x 5 runs, with two possible run orders as shown here and described in detail in Section Experimental Design. D. Proportion of 16-channel vocoded speech used to construct vocoded stimuli (for which 16- and 1-channel speech were mixed; see Section Stimuli), separately for each participant and the four adaptive runs in the two experiments.
Figure 2. Stimulus construction.
A. For each of the clear speech sentences (spectrogram for one example sentence is color-coded and shown in both panels), amplitude envelopes (blue lines) were extracted for 16 frequency bands (top) as well as the broadband signal (bottom). B. Each of the 16 narrowband envelopes was mixed with the broadband envelope in proportion p (0.5 for the example shown in B-D). C. Each of the resulting envelopes (shown in B) was used to modulate noise in the respective frequency band. D. The resulting signals were re-combined to yield a 16-/1-channel vocoded speech mix. For this form of vocoded speech, high values of the mixing proportion (p = 1) result in 16-channel vocoded speech, which is highly intelligible, whereas low mixing proportions (p = 0) result in 1-channel vocoded speech, which is entirely unintelligible. Intermediate proportions result in intermediate intelligibility.
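The envelope mixing in panel B can be illustrated with a minimal sketch. It assumes a simple linear mix, p times the narrowband envelope plus (1 - p) times the broadband envelope; the caption does not state the exact formula, so this weighting and the function name `mix_envelopes` are assumptions chosen to match the stated endpoints (p = 1 gives the 16-channel envelopes, p = 0 gives the broadband envelope in every band).

```python
def mix_envelopes(narrowband, broadband, p):
    """Mix each narrowband envelope with the broadband envelope.

    narrowband: list of envelopes, one list of samples per frequency band
    broadband:  one envelope (list of samples) of the full signal
    p:          mixing proportion in [0, 1]

    Assumption: a linear weighting p * narrowband + (1 - p) * broadband.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return [
        [p * nb + (1.0 - p) * bb for nb, bb in zip(band, broadband)]
        for band in narrowband
    ]


# Toy example with two bands and three samples per envelope.
bands = [[0.2, 0.8, 0.4], [0.6, 0.1, 0.3]]
broad = [0.5, 0.5, 0.5]
mixed = mix_envelopes(bands, broad, 0.5)
```

Under this assumption, each mixed envelope would then modulate noise in its own frequency band, as described for panel C.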
Figure 3. Statistical Protocols.
Both analyses (A: pre-registered analysis; B: optimal analysis, identified in Zoefel et al., 2019) are illustrated based on two (simulated) example subjects, and the average across 20 (simulated) subjects. Open circles show phase bins used for alignment. In all panels, the pi/-pi bin is plotted twice for visualization purposes.
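Aligning individual data at a chosen phase bin can be sketched as a circular shift. This is a generic illustration, not necessarily the procedure used in either analysis: the function names, the choice of target position, and the example values are hypothetical.

```python
def align_to_bin(values, source_index, target_index):
    """Circularly shift values so that values[source_index]
    ends up at position target_index."""
    n = len(values)
    shift = (target_index - source_index) % n
    return [values[(i - shift) % n] for i in range(n)]


def align_to_max(values, target_index):
    """Circularly shift values so that the maximal bin
    ends up at position target_index."""
    source_index = max(range(len(values)), key=values.__getitem__)
    return align_to_bin(values, source_index, target_index)


# Hypothetical accuracy values for eight phase bins of one participant.
subject = [0.42, 0.47, 0.55, 0.61, 0.52, 0.44, 0.40, 0.39]
aligned = align_to_max(subject, 4)  # the peak (0.61) moves to index 4
```

Because the bin used for alignment is maximal (or minimal) by construction, it is typically excluded from the subsequent statistical comparison.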
Figure 4. Main results.
A-C. Average word report accuracy (as log odds ratio, LOR) as a function of the phase relation between tACS and speech rhythm. As required by the two applied statistical analyses (see Section Data Analyses and Fig. 3), individual data were aligned in different ways before being averaged across participants, shown in rows A-C. Bins used for alignment, by definition maximal or minimal, are shown as open circles. Shaded areas show standard error of mean (SEM) after between-participant variation has been removed as appropriate for repeated-measures comparisons of different phase bins (Cousineau, 2005). The pi/-pi bin is plotted twice for visualization purposes. D-I. Distribution of values (relevant phase bins are color-coded in A-C) which are compared to 0 to test for a phasic modulation (D,E), enhancement (F,G), or disruption (H,I) of speech perception, respectively. Dots show data from single participants; mean and confidence interval (1.96*SEM) are shown by red lines and red-colored areas, respectively. In all panels, right y-axes show LOR converted into changes in word report accuracy, given, on average, 50% correctly identified target words. Note that these changes are expressed relative to the Sham condition (A-C, F-I) or relate two phase bins in the Stim condition (D,E). In panel E, LOR difference values (d1-d2) and corresponding changes in accuracy were divided by two, to take into account the fact that this difference involves two comparisons of phase bins (d1, cf. panel B, and d2, cf. panel C). This was not necessary for the corresponding statistical test (see Section Data Analyses), which is unaffected by such scaling factors. Figure panels shaded grey correspond to those relevant for the pre-registered analysis.
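The conversion from LOR to a change in word report accuracy shown on the right y-axes can be sketched as follows. The sketch assumes that the LOR is the difference between the log odds of a correct report in the condition of interest and in the reference condition, and that the reference accuracy is 50%, as stated in the caption. The function name is hypothetical.

```python
import math


def lor_to_accuracy_change(lor, baseline=0.5):
    """Convert a log odds ratio into a change in proportion correct.

    Assumption: lor = logit(accuracy) - logit(baseline), so the
    accuracy implied by the LOR is the inverse logit of
    logit(baseline) + lor.
    """
    baseline_log_odds = math.log(baseline / (1.0 - baseline))
    accuracy = 1.0 / (1.0 + math.exp(-(baseline_log_odds + lor)))
    return accuracy - baseline


# An LOR of 0 leaves accuracy at baseline; negative LORs lower it.
no_change = lor_to_accuracy_change(0.0)
disruption = lor_to_accuracy_change(-0.2)
```

With a 50% baseline the conversion is symmetric, so an LOR of a given size maps onto the same absolute change in accuracy whether it reflects enhancement or disruption.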

Source: PubMed
