Automated vocal analysis of naturalistic recordings from children with autism, language delay, and typical development

D K Oller, P Niyogi, S Gray, J A Richards, J Gilkerson, D Xu, U Yapanel, S F Warren

Abstract

For generations the study of vocal development and its role in language has been conducted laboriously, with human transcribers and analysts coding and taking measurements from small recorded samples. Our research illustrates a method to obtain measures of early speech development through automated analysis of massive quantities of day-long audio recordings collected naturalistically in children's homes. A primary goal is to provide insights into the development of infant control over infrastructural characteristics of speech through large-scale statistical analysis of strategically selected acoustic parameters. In pursuit of this goal we have discovered that the first automated approach we implemented is not only able to track children's development on acoustic parameters known to play key roles in speech, but also is able to differentiate vocalizations from typically developing children and children with autism or language delay. The method is totally automated, with no human intervention, allowing efficient sampling and analysis at unprecedented scales. The work shows the potential to fundamentally enhance research in vocal development and to add a fully objective measure to the battery used to detect speech-related disorders in early childhood. Thus, automated analysis should soon be able to contribute to screening and diagnosis procedures for early disorders, and more generally, the findings suggest fundamental methods for the study of language in natural environments.

Conflict of interest statement

The recordings and hardware/software development were funded by Terrance and Judi Paul, owners of the previous for-profit company Infoture. Dissolution of the company was announced February 10, 2009, and it was reconstituted as the not-for-profit LENA Foundation. All assets of Infoture were given to the LENA Foundation. Before dissolution of the company, D.K.O., P.N., and S.F.W. had received consultation fees for their roles on the Scientific Advisory Board of Infoture. J.A.R., J.G., and D.X. are current employees of the LENA Foundation. S.G. and U.Y. are affiliates and previous employees of Infoture/LENA Foundation. None of the authors has or has had any ownership in Infoture or the LENA Foundation.

Figures

Fig. 1.
Demographics. (A) Characteristics of the child groups and recordings. (B) Demographic parameters indicating that the groups differed significantly in gender, mother's education level (a strong indicator of socioeconomic status), and general developmental age as determined by the Child Development Inventory (CDI) (40), an extensive parent-questionnaire assessment obtained for 86% of the children, including all those in the matched samples. This CDI age-equivalent score is based on 70 items distributed across all of the subscales of the CDI, including language, motor development, and social skills. The groups also differed in age and in scores on the LENA Developmental Snapshot, a 52-item parent-questionnaire measure of communication and language obtained for all participants. (C) Demographic parameters for subsamples matched on gender, mother's education level, and developmental age.
Fig. 2.
Results of the correlational and MLR analyses. (A) Correlations of acoustic parameter SVI/SCU ratio scores with age across the 1,486 recordings. In 10 of 12 cases, both typically developing and language-delayed children showed higher absolute correlations with age than children with autism (SI Appendix, Table S9 A–F). All 12 correlations for the typically developing sample and seven of 12 for the language-delayed sample were statistically significant after Bonferroni correction (P < 0.004). The autism sample, by contrast, showed little evidence of development on these parameters; all of its correlations of acoustic parameters with age were below 0.2 in absolute value. (B) MLR for the typically developing and autism samples. Blue dots represent real and predicted age (i.e., "predicted vocal development age") for 802 recordings of typically developing children based on SVI/SCU ratios for the 12 acoustic parameters (r = 0.762, R² = 0.581). Red dots represent 351 recordings of children with autism, for which predicted vocal development ages (r = 0.175, R² = 0.031) were determined from the typically developing MLR model. Each red diamond represents the mean predicted vocal development level across recordings for one of the 77 children with autism. (C) MLR for the typically developing (blue) and language-delayed samples (333 recordings; gold squares for 49 individual child averages; r = 0.594, R² = 0.353).
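The following sketch illustrates, under stated assumptions, the two analyses summarized in Fig. 2: per-parameter correlations with age under a Bonferroni correction, and an MLR model fit on typically developing recordings and then applied to another group to yield a "predicted vocal development age." It uses Python with NumPy/SciPy/scikit-learn and randomly generated placeholder data; the feature values, sample sizes, and resulting effect sizes are illustrative assumptions, not the study's data or code.

```python
# Illustrative sketch only (placeholder data, not the authors' pipeline).
# (i) per-parameter Pearson correlations with age, Bonferroni-corrected;
# (ii) multiple linear regression (MLR) fit on typically developing
#      recordings, then applied to another group to obtain a
#      "predicted vocal development age" per recording.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Placeholder feature matrices: rows are day-long recordings, columns are the
# 12 acoustic parameters (SVI/SCU ratio scores); ages are in months.
age_typ = rng.uniform(8, 48, size=802)
X_typ = 0.02 * age_typ[:, None] + rng.normal(size=(802, 12))
age_aut = rng.uniform(16, 48, size=351)
X_aut = rng.normal(size=(351, 12))

# (i) Correlation of each acoustic parameter with age; Bonferroni threshold
#     across 12 tests is 0.05 / 12, roughly the P < 0.004 cited in the caption.
alpha = 0.05 / X_typ.shape[1]
for j in range(X_typ.shape[1]):
    r, p = pearsonr(X_typ[:, j], age_typ)
    print(f"parameter {j + 1}: r = {r:+.3f}, significant = {p < alpha}")

# (ii) MLR trained on the typically developing recordings only.
mlr = LinearRegression().fit(X_typ, age_typ)
pred_typ = mlr.predict(X_typ)   # predicted vocal development age (typical)
pred_aut = mlr.predict(X_aut)   # same model applied to the autism sample

for name, age, pred in [("typical", age_typ, pred_typ),
                        ("autism", age_aut, pred_aut)]:
    r = np.corrcoef(age, pred)[0, 1]
    print(f"{name}: r = {r:.3f}, R^2 = {r ** 2:.3f}")
```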
Fig. 3.
LDA with LOOCV indicating differentiation of child groups. Estimated classification probabilities were based on the 12 acoustic parameters. Results are displayed in bubble plots, with each bubble sized in proportion to the number of children at each x axis location (the number of children represented by the largest bubble in each line is labeled). The plots indicate classification probabilities (Left) and proportion-correct classification (Right) for LDA in (A and B) children with autism (red) versus typically developing children (blue) and (C and D) a combined group of children with autism (red) or language delay (gold) versus typically developing children (blue).
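As an illustration of the classification approach described in Fig. 3, the sketch below runs linear discriminant analysis with leave-one-out cross-validation and records, for each held-out child, the estimated posterior probability of autism classification and the resulting proportion-correct. The placeholder data, and the assumption that each child is represented by a single 12-parameter feature vector, are mine rather than the study's.

```python
# Minimal sketch (assumed input: one 12-parameter feature vector per child;
# placeholder data, not the authors' code): LDA with leave-one-out
# cross-validation, yielding a posterior probability of autism for each child.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)

# Placeholder data: 106 typically developing children (label 0) and 77
# children with autism (label 1), 12 acoustic parameters per child.
X = np.vstack([rng.normal(0.0, 1.0, size=(106, 12)),
               rng.normal(0.6, 1.0, size=(77, 12))])
y = np.array([0] * 106 + [1] * 77)

pp_autism = np.empty(len(y))
for train_idx, test_idx in LeaveOneOut().split(X):
    lda = LinearDiscriminantAnalysis().fit(X[train_idx], y[train_idx])
    # Posterior probability that the held-out child is in the autism group.
    pp_autism[test_idx] = lda.predict_proba(X[test_idx])[:, 1]

# Proportion-correct classification at a 0.5 threshold on the posterior.
accuracy = np.mean((pp_autism > 0.5).astype(int) == y)
print(f"LOOCV proportion correct: {accuracy:.3f}")
```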
Fig. 4.
LDA showing group discrimination for various configurations and subsamples. (A) LDA data on the six binary configurations of the three groups for the entire sample (N = 232; LOOCV); subsamples matched on gender, mother's education level, and developmental level (n = 113; LOOCV); a different holdout method in which, instead of LOOCV, training was conducted on phase I (2006–2008) data (n = 138) and testing on phase II (2009) data (n = 94); and a testing sample based on the first recording only for each child (N = 232; LOOCV). (B) Bar graph comparisons for subsamples illustrating robustness of group differentiation in the autism versus typical development configuration only (based on LOOCV modeling). Means were calculated over logit-transformed posterior probabilities (PPs) of autism classification and then converted back to PPs. All comparisons showed robust group differentiation, including (from left to right) the entire sample (N = 232), boys (typical, n = 48; autism, n = 64), girls (typical, n = 58; autism, n = 13), children of higher socioeconomic status (SES) as indicated by mother's education level ≥6 on an eight-point scale (typical, n = 42; autism, n = 49), and children of lower SES (typical, n = 64; autism, n = 28). To assess the possibility that "language level" may have played a critical role in automated group differentiation, we compared 35 child pairs matched for developmental age on the Snapshot, a language/communication measure (SI Appendix, "Participant Groups and Recording Procedures") (typical development group, mean age of 22.6 mo; autism group, mean age of 22.7 mo), and 46 child pairs matched on the raw score from a single subscale of the CDI (40), namely the expressive language subscale (typical development group, mean score of 21.6; autism group, mean score of 21.5), and found robust group differentiation on PPs in both cases. Similar results were obtained for 48 children in the autism sample whose parents reported they were using spoken words meaningfully and for typical and autism samples split at their medians into subgroups of high or low language level (High Lang, Low Lang) on the Snapshot developmental quotient. A subsample of 29 children with autism whose diagnosis had been based on the Autism Diagnostic Observation Schedule (ADOS) (41) and another of 24 children with autism diagnosed with the Childhood Autism Rating Scale (CARS) (42) also showed robust group discrimination for PPs. Finally, in phase II (typical, n = 30; autism, n = 77), administration of both the Child Behavior Checklist (CBCL) (43) and the Modified Checklist for Autism in Toddlers (MCHAT) (44) had supported group assignment based on diagnoses, and group differentiation of PPs for these children using the automated system was unambiguous.
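The averaging step described in Fig. 4B can be sketched as follows: group means are computed on the logit scale and then back-transformed to probabilities. The clipping constant is an added assumption (to keep the logit finite), and the per-child PP values shown are hypothetical.

```python
# Minimal sketch of averaging posterior probabilities (PPs) on the logit scale.
import numpy as np

def logit_mean(pp, eps=1e-6):
    """Mean of posterior probabilities computed on the logit scale."""
    # Clip away from 0 and 1 so the logit stays finite (added assumption).
    p = np.clip(np.asarray(pp, dtype=float), eps, 1 - eps)
    logits = np.log(p / (1 - p))                  # logit transform
    return 1.0 / (1.0 + np.exp(-logits.mean()))   # back-transform of the mean

# Hypothetical per-child PPs of autism classification for two subsamples.
print(logit_mean([0.12, 0.05, 0.30, 0.22]))  # typical development subsample
print(logit_mean([0.81, 0.64, 0.93, 0.58]))  # autism subsample
```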

Source: PubMed
