Influence of musical training on understanding voiced and whispered speech in noise

Dorea R Ruggles, Richard L Freyman, Andrew J Oxenham

Abstract

This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.
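As an illustration of the masker conditions described above, the following is a minimal sketch of how a continuous noise might be turned into a gated one with a 16-Hz square wave and set to a target signal-to-noise ratio. It is not the authors' stimulus code. The function names, the 16-kHz sample rate, the 50% duty cycle, the -6 dB SNR, the sine-tone stand-in for speech, and the choice to set the SNR before gating are all assumptions made for illustration. The noise here is white and is not spectrally shaped to the long-term speech spectrum.

```python
import math
import random


def square_wave_gate(n_samples, sample_rate, gate_hz=16.0, duty=0.5):
    """Return a 0/1 gate that is 1 during the 'on' part of each cycle.

    The 50% duty cycle is an assumption; the abstract only states the rate.
    """
    period = sample_rate / gate_hz  # samples per gating cycle
    return [1.0 if (i % period) < duty * period else 0.0
            for i in range(n_samples)]


def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale noise so that 20*log10(rms(speech)/rms(noise)) equals snr_db."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
    return [gain * v for v in noise]


sample_rate = 16000              # hypothetical sample rate in Hz
n = sample_rate                  # one second of signal
rng = random.Random(0)           # fixed seed so the sketch is repeatable

# Stand-in for a speech token: a 200-Hz tone (assumption, not real speech).
speech = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(n)]

# White Gaussian noise (assumption: no speech-spectrum shaping applied).
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Continuous masker at the target SNR, then the gated version of it.
continuous = scale_noise_to_snr(speech, noise, snr_db=-6.0)
gate = square_wave_gate(n, sample_rate)
gated = [g * v for g, v in zip(gate, continuous)]
```

With these assumed settings the gated masker is silent for half of every 62.5-ms cycle, which leaves gaps in which the target can be heard without the masker. Because the SNR is set before gating in this sketch, the noise level during the "on" portions matches the continuous condition.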

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Mean proportion of correctly identified words as a function of signal-to-noise ratio (SNR).
Musician (M; circles) and non-musician (NM; squares) data from continuous noise (solid lines) and gated noise (dashed lines) trials are included for each of the three speech types, shown in the three panels. No significant differences between the groups were observed. Error bars indicate +/−1 standard error (s.e.) of the mean.
Figure 2. Average results for QuickSIN and Adaptive HINT measures of speech perception in noise.
Black bars denote the average performance of musicians, and grey bars denote the average performance of non-musicians. The QuickSIN measure (left group) indicates dB of SNR loss, relative to an ideal level of speech understanding in 4-speaker background babble. The Adaptive HINT measure (right group) indicates the threshold SNR for speech understanding in a continuous speech-shaped noise. In both cases, lower scores denote better performance. Error bars represent +/−1 s.e. of the mean.
Figure 3. Scatter plots of individual musician speech-in-noise data as a function of years of musical training and age of onset of musical training.
Individual non-musician data are shown in the far left column of each row. Distributions of musician and non-musician scores overlap, despite the significant within-group correlations with duration of musical training (but not age of onset).
Figure 4. Fundamental-frequency difference limens.
Musicians (open circles) demonstrate significantly better F0 discrimination limens than non-musicians (grey filled squares) for lowpass-filtered harmonic complexes with F0s centered around 110 Hz and 210 Hz. Error bars represent +/−1 s.e. of the mean.

Source: PubMed
