Automated assessment of psychiatric disorders using speech: A systematic review

Daniel M Low, Kate H Bentley, Satrajit S Ghosh

Abstract

Objective: There are many barriers to accessing mental health assessments, including cost and stigma. Even when individuals receive professional care, assessments are intermittent and may be limited, partly due to the episodic nature of psychiatric symptoms. Machine-learning technology using speech samples obtained in the clinic or remotely could therefore one day provide a biomarker to improve diagnosis and treatment. To date, reviews have focused only on the use of acoustic features from speech to detect depression and schizophrenia. Here, we present the first systematic review of studies using speech for automated assessment across a broader range of psychiatric disorders.

Methods: We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We included studies from the last 10 years using speech to identify the presence or severity of disorders within the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). For each study, we describe sample size, clinical evaluation method, speech-eliciting tasks, machine learning methodology, performance, and other relevant findings.

Results: Of 1395 studies screened, 127 met the inclusion criteria. The majority of studies addressed depression, schizophrenia, and bipolar disorder; the remainder covered post-traumatic stress disorder, anxiety disorders, and eating disorders. Of the included studies, 63% built machine learning predictive models, and the remaining 37% performed null-hypothesis testing only. We provide an online database with our search results and synthesize how acoustic features present in each disorder.

Conclusion: Speech processing technology could aid mental health assessments, but there are many obstacles to overcome, especially the need for comprehensive transdiagnostic and longitudinal studies. Given the diverse types of data sets, feature extraction, computational methodologies, and evaluation criteria, we provide guidelines for both acquiring data and building machine learning models with a focus on testing hypotheses, open science, reproducibility, and generalizability.

Level of evidence: 3a.

Keywords: machine learning; mental health; psychiatry; speech; voice.

Conflict of interest statement

The authors declare no potential conflict of interest.

© 2020 The Authors. Laryngoscope Investigative Otolaryngology published by Wiley Periodicals, Inc. on behalf of The Triological Society.

Figures

Figure 1
How machine learning works
Figure 2
PRISMA flow diagram of study inclusion and exclusion criteria for the systematic review
Figure 3
Synthesis of null‐hypothesis testing studies across psychiatric disorders. Acoustic features are color‐coded on the y‐axis into source features from the vocal folds (blue), filter features from the vocal tract (red), spectral features (purple), and prosodic or melodic features (black).56 Features that are significantly higher in a psychiatric population than healthy controls or that correlate positively with the severity of a disorder receive a score of 1 (red), features that are lower or correlate negatively receive a score of −1 (blue), and nonsignificant or contradictory findings receive a score of 0 (gray). The mean is computed for features with multiple results. The cell size is weighted by the number of studies. Features not studied in a disorder are blank. Anxiety, social or general anxiety disorder; OCD, obsessive‐compulsive disorder; PTSD, post‐traumatic stress disorder
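The scoring scheme described in the Figure 3 caption can be sketched in a few lines of Python. This is not the authors' code, and the findings below are hypothetical entries for illustration only: each result is coded +1 (feature higher in the clinical group or positively correlated with severity), −1 (lower or negatively correlated), or 0 (nonsignificant), and the mean is taken when a feature–disorder pair has multiple results.

```python
# Sketch of the Figure 3 synthesis: mean score and study count per
# feature-disorder cell. Entries are hypothetical, for illustration only.
import pandas as pd

findings = pd.DataFrame(
    [
        ("jitter", "depression", 1),       # higher in clinical group
        ("jitter", "depression", 1),
        ("jitter", "depression", 0),       # nonsignificant result
        ("F0 mean", "schizophrenia", -1),  # lower / negative correlation
    ],
    columns=["feature", "disorder", "score"],
)

# Mean score colors the cell; the count weights the cell size
synthesis = findings.groupby(["feature", "disorder"])["score"].agg(["mean", "count"])
print(synthesis)
```

The mean for the hypothetical jitter–depression cell is (1 + 1 + 0) / 3 ≈ 0.67, drawn from three studies.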
Figure 4
Glossary of acoustic features. Classification based on References 29 and 56. For further discussion, see the Geneva Minimalistic Acoustic Parameter Set (GeMAPS)57 and Section 4.3.3
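As a concrete illustration of one "source" feature from the glossary, the sketch below estimates fundamental frequency (F0) with a simple autocorrelation method on a synthetic 120 Hz tone. This is an illustrative toy, not the review's methodology; studies in this literature typically extract such features with toolkits like openSMILE or Praat.

```python
# Toy F0 (fundamental frequency) estimator via autocorrelation,
# demonstrated on a synthetic 120 Hz tone (not real speech).
import numpy as np

sr = 16000                            # sample rate in Hz
t = np.arange(int(0.5 * sr)) / sr     # 0.5 seconds of signal
f0_true = 120.0                       # a typical adult speaking F0
signal = np.sin(2 * np.pi * f0_true * t)

def estimate_f0(x, sr, fmin=60.0, fmax=400.0):
    """Return an autocorrelation-based F0 estimate in Hz."""
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + np.argmax(ac[lo:hi])   # strongest periodicity in the F0 range
    return sr / lag

print(round(estimate_f0(signal, sr), 1))  # prints an estimate close to 120
```

Real extractors add voicing detection, windowing, and interpolation; the point here is only that source features quantify vocal-fold periodicity directly from the waveform.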
Figure 5
Nested bootstrapping for more robust performance estimation and hyperparameter tuning on small datasets. The example uses RMSE as the performance metric with 60 bootstrap samples and 5‐fold cross‐validation. K‐fold cross‐validation assumes large sample sizes and, on small datasets, may return a biased estimate of the underlying performance distribution. RMSE, root mean squared error
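The nested bootstrapping scheme in the Figure 5 caption can be sketched as follows, assuming scikit-learn. The dataset, model, and hyperparameter grid are placeholders, not from any reviewed study: an outer loop draws 60 bootstrap samples for performance estimation, and an inner 5-fold cross-validation tunes hyperparameters within each sample, with RMSE computed on the out-of-bag rows.

```python
# Hedged sketch of nested bootstrapping (Figure 5): outer bootstrap for
# performance estimation, inner 5-fold CV for hyperparameter tuning.
# Synthetic data and a Ridge model stand in for a real speech dataset.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=80, n_features=10, noise=10.0, random_state=0)

n_bootstraps = 60  # as in the figure caption
rmses = []
for _ in range(n_bootstraps):
    # Resample row indices with replacement; evaluate on out-of-bag rows
    train_idx = rng.integers(0, len(X), size=len(X))
    oob_mask = np.ones(len(X), dtype=bool)
    oob_mask[train_idx] = False
    if not oob_mask.any():
        continue
    # Inner 5-fold CV tunes the regularization strength on this sample
    search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=5,
                          scoring="neg_root_mean_squared_error")
    search.fit(X[train_idx], y[train_idx])
    pred = search.predict(X[oob_mask])
    rmses.append(np.sqrt(mean_squared_error(y[oob_mask], pred)))

# The out-of-bag RMSEs approximate the underlying performance distribution
print(f"RMSE: {np.mean(rmses):.2f} ± {np.std(rmses):.2f}")
```

Reporting the spread of the bootstrap RMSEs, rather than a single k-fold average, is what makes the estimate more robust on small samples.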

References

    1. Merikangas KR, He J‐P, Burstein M, et al. Lifetime prevalence of mental disorders in U.S. adolescents: results from the National Comorbidity Survey Replication‐Adolescent Supplement (NCS‐A). J Am Acad Child Adolesc Psychiatry. 2010;49(10):980‐989.
    2. Substance Abuse and Mental Health Services Administration. Key Substance Use and Mental Health Indicators in the United States: Results from the 2017 National Survey on Drug Use and Health (HHS Publication No. SMA 18‐5068, NSDUH Series H‐53). Rockville, MD: Center for Behavioral Health Statistics and Quality, Substance Abuse and Mental Health Services Administration; 2018.
    3. Trautmann S, Rehm J, Wittchen H. The economic costs of mental disorders. EMBO Rep. 2016;17(9):1245‐1249.
    4. Substance Abuse and Mental Health Services Administration. Results from the 2014 National Survey on Drug Use and Health: Mental Health Findings, NSDUH Series H‐50, HHS Publication No. (SMA) 15‐4927. Rockville, MD: Substance Abuse and Mental Health Services Administration; 2015.
    5. Goessl VC, Curtiss JE, Hofmann SG. The effect of heart rate variability biofeedback training on stress and anxiety: a meta‐analysis. Psychol Med. 2017;47(15):2578‐2586.
    6. Miranda D, Calderón M, Favela J. Anxiety detection using wearable monitoring. In: Proceedings of the 5th Mexican Conference on Human‐Computer Interaction. Oaxaca, Mexico; 2014.
    7. Williamson JR, Godoy E, Cha M, et al. Detecting depression using vocal, facial and semantic communication cues. Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge (AVEC '16). New York, NY: ACM; 2016:11‐18.
    8. Ringeval F, Schuller B, Valstar M, et al. AVEC 2019 workshop and challenge: state‐of‐mind, depression with AI, and cross‐cultural affect recognition. Proceedings of the 2019 on Audio/Visual Emotion Challenge and Workshop. Nice, France: ACM; 2019.
    9. Yang L, Li Y, Chen H, Jiang D, Oveneke MC, Sahli H. Bipolar disorder recognition with histogram features of arousal and body gestures. Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop (AVEC '18). ACM; 2018:15‐21.
    10. Syed ZS, Sidorov K, Marshall D. Automated screening for bipolar disorder from audio/visual modalities. Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop (AVEC '18). ACM; 2018:39‐45.
    11. Scherer S, Morency LP, Rizzo A. Multisense and SimSensei—a multimodal research platform for real‐time assessment of distress indicators. In: 2012 Conference, Arlington, VA, October 19.
    12. Gravenhorst F, Muaremi A, Bardram J, et al. Mobile phones as medical devices in mental disorder treatment: an overview. Pers Ubiquit Comput. 2015;19(2):335‐353.
    13. Maxhuni A, Muñoz‐Meléndez A, Osmani V, Perez H, Mayora O, Morales EF. Classification of bipolar disorder episodes based on analysis of voice and motor activity of patients. Pervasive Mob Comput. 2016;31:50‐66.
    14. Likforman‐Sulem L, Esposito A, Faundez‐Zanuy M, Clémençon S, Cordasco G. EMOTHAW: a novel database for emotional state recognition from handwriting and drawing. IEEE Trans Hum Machine Syst. 2017;47(2):273‐284.
    15. Ghosh SS, Baker JT. Will neuroimaging produce a clinical tool for psychiatry? Psychiatr Ann. 2019;49(5):209‐214.
    16. Patel MJ, Khalaf A, Aizenstein HJ. Studying depression using imaging and machine learning methods. Neuroimage Clin. 2016;10:115‐123.
    17. Librenza‐Garcia D, Kotzian BJ, Yang J, et al. The impact of machine learning techniques in the study of bipolar disorder: a systematic review. Neurosci Biobehav Rev. 2017;80:538‐554.
    18. Guntuku SC, Yaden DB, Kern ML, Ungar LH, Eichstaedt JC. Detecting depression and mental illness on social media: an integrative review. Curr Opin Behav Sci. 2017;18:43‐49.
    19. Calvo RA, Milne DN, Hussain MS, Christensen H. Natural language processing in mental health applications using non‐clinical texts. Nat Lang Eng. 2017;23(5):649‐685.
    20. Abbe A, Grouin C, Zweigenbaum P, Falissard B. Text mining applications in psychiatry: a systematic literature review. Int J Methods Psychiatr Res. 2016;25(2):86‐100.
    21. Mohr DC, Ho J, Duffecy J, et al. Perceived barriers to psychological treatments and their relationship to depression. J Clin Psychol. 2010;66(4):394‐409.
    22. Turner RJ, Jay Turner R, Lloyd DA, Taylor J. Physical disability and mental health: an epidemiology of psychiatric and substance disorders. Rehabil Psychol. 2006;51(3):214‐223.
    23. Shalev A, Liberzon I, Marmar C. Post‐traumatic stress disorder. N Engl J Med. 2017;376(25):2459‐2469.
    24. Rathbone AL, Clarry L, Prescott J. Assessing the efficacy of mobile health apps using the basic principles of cognitive behavioral therapy: systematic review. J Med Internet Res. 2017;19(11):e399.
    25. Chekroud AM, Zotti RJ, Shehzad Z, et al. Cross‐trial prediction of treatment outcome in depression: a machine learning approach. Lancet Psychiatry. 2016;3(3):243‐250.
    26. Freedman R, Lewis DA, Michels R, et al. The initial field trials of DSM‐5: new blooms and old thorns. Am J Psychiatry. 2013;170(1):1‐5.
    27. Regier DA, Narrow WE, Clarke DE, et al. DSM‐5 field trials in the United States and Canada, part II: test‐retest reliability of selected categorical diagnoses. Am J Psychiatry. 2013;170(1):59‐70.
    28. Gideon J, Schatten HT, McInnis MG, Provost EM. Emotion recognition from natural phone conversations in individuals with and without recent suicidal ideation. In: The 20th Annual Conference of the International Speech Communication Association (INTERSPEECH). Graz, Austria; September 15‐19, 2019.
    29. Cummins N, Scherer S, Krajewski J, Schnieder S, Epps J, Quatieri TF. A review of depression and suicide risk assessment using speech analysis. Speech Commun. 2015;71:10‐49.
    30. Insel TR. The NIMH research domain criteria (RDoC) project: precision medicine for psychiatry. Am J Psychiatry. 2014;171(4):395‐397.
    31. Huang K, Wu C, Su M, Kuo Y. Detecting unipolar and bipolar depressive disorders from elicited speech responses using latent affective structure model. IEEE Trans Affect Comput. 2018;9:563‐577.
    32. Bedi G, Carrillo F, Cecchi GA, et al. Automated analysis of free speech predicts psychosis onset in high‐risk youths. NPJ Schizophr. 2015;1:15030.
    33. Faurholt‐Jepsen M, Busk J, Frost M, et al. Voice analysis as an objective state marker in bipolar disorder. Transl Psychiatry. 2016;6:e856.
    34. Bzdok D, Meyer‐Lindenberg A. Machine learning for precision psychiatry: opportunities and challenges. Biol Psychiatry Cogn Neurosci Neuroimaging. 2018;3(3):223‐230.
    35. Nahum‐Shani I, Smith SN, Spring BJ, et al. Just‐in‐time adaptive interventions (JITAIs) in mobile health: key components and design principles for ongoing health behavior support. Ann Behav Med. 2018;52(6):446‐462.
    36. Koh PW, Liang P. Understanding black‐box predictions via influence functions. Proceedings of the 34th International Conference on Machine Learning (ICML'17). Vol 70. Sydney, Australia; 2017:1885‐1894.
    37. Kleinberg J, Mullainathan S. Simplicity creates inequity: implications for fairness, stereotypes, and interpretability. 2019.
    38. Foster KR, Koprowski R, Skufca JD. Machine learning, medical diagnosis, and biomedical engineering research ‐ commentary. Biomed Eng Online. 2014;13:94.
    39. Char DS, Shah NH, Magnus D. Implementing machine learning in health care ‐ addressing ethical challenges. N Engl J Med. 2018;378(11):981‐983.
    40. Akhtar N, Mian A. Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access. 2018;6:14410‐14430.
    41. European Parliament and Council of the European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council (General Data Protection Regulation); 2016.
    42. Goodman B, Flaxman S. European Union regulations on algorithmic decision‐making and a "right to explanation". AI Mag. 2017;38(3):50‐57.
    43. Gunning D. Explainable Artificial Intelligence (XAI). Defense Advanced Research Projects Agency (DARPA); 2017. Accessed December 25, 2019.
    44. Denes PB, Pinson EN. The Speech Chain: The Physics and Biology of Spoken Language. Murray Hill, NJ: Bell Telephone Laboratories; 1963.
    45. Kraepelin E. Manic depressive insanity and paranoia. J Nerv Ment Dis. 1921;53(4):350.
    46. Snowden LR. Bias in mental health assessment and intervention: theory and evidence. Am J Public Health. 2003;93(2):239‐243.
    47. Ely JW, Graber ML, Croskerry P. Checklists to reduce diagnostic errors. Acad Med. 2011;86(3):307‐313.
    48. Cohen AS, Elvevåg B. Automated computerized analysis of speech in psychiatric disorders. Curr Opin Psychiatry. 2014;27(3):203‐209.
    49. Bzdok D, Ioannidis JPA. Exploration, inference, and prediction in neuroscience and biomedicine. Trends Neurosci. 2019;42(4):251‐262.
    50. Morales M, Scherer S, Levitan R. A cross‐modal review of indicators for depression detection systems. Proceedings of the Fourth Workshop on Computational Linguistics and Clinical Psychology—From Linguistic Signal to Clinical Reality; 2017:1‐12.
    51. Tokuno S. Pathophysiological voice analysis for diagnosis and monitoring of depression. In: Kim Y‐K, ed. Understanding Depression: Clinical Manifestations, Diagnosis and Treatment. Vol 2. Singapore: Springer; 2018:83‐95.
    52. Cohen AS, Mitchell KR, Elvevåg B. What do we really know about blunted vocal affect and alogia? A meta‐analysis of objective assessments. Schizophr Res. 2014;159(2‐3):533‐538.
    53. Parola A, Simonsen A, Bliksted V, Fusaroli R. Voice patterns in schizophrenia: a systematic review and Bayesian meta‐analysis. Schizophr Res. In press.
    54. Moher D, Liberati A, Tetzlaff J, Altman DG, for the PRISMA Group. Preferred reporting items for systematic reviews and meta‐analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.
    55. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders (DSM‐5®). Washington, DC: American Psychiatric Publishing; 2013.
    56. Horwitz R, Quatieri TF, Helfer BS, Yu B, Williamson JR, Mundt J. On the relative importance of vocal source, system, and prosody in human depression. In: 2013 IEEE International Conference on Body Sensor Networks; 2013:1‐6.
    57. Eyben F, Scherer KR, Schuller BW, et al. The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for voice research and affective computing. IEEE Trans Affect Comput. 2016;7(2):190‐202.
    58. Gilboa‐Schechtman E, Galili L, Sahar Y, Amir O. Being "in" or "out" of the game: subjective and acoustic reactions to exclusion and popularity in social anxiety. Front Hum Neurosci. 2014;8:147.
    59. Kiss G, Vicsi K. Mono‐ and multi‐lingual depression prediction based on speech processing. Int J Speech Technol. 2017;20(4):919‐935.
    60. Valstar M, Schuller B, Smith K, et al. Proceedings of the 3rd ACM International Workshop on Audio/Visual Emotion Challenge (AVEC '13). Barcelona, Spain; 2013.
    61. Valstar M, Schuller B, Smith K, et al. Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge (AVEC '14). Orlando, FL; 2014.
    62. Valstar M, Pantic M, Gratch J, et al. Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge (AVEC '16). Amsterdam, The Netherlands; 2016.
    63. Ringeval F, Schuller B, Valstar M, et al. Real‐life depression, and affect recognition workshop and challenge. Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge (AVEC '17). Mountain View, CA: ACM; 2017:3‐9.
    64. Ringeval F, Schuller B, Valstar M, et al. AVEC 2018 workshop and challenge: bipolar disorder and cross‐cultural affect recognition. Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop (AVEC '18); 2018:3‐13.
    65. Gratch J, Artstein R, Lucas GM, et al. The distress analysis interview corpus of human and computer interviews. In: LREC. Citeseer; 2014:3123‐3128.
    66. Gideon J, Provost EM, McInnis M. Mood state prediction from speech of varying acoustic quality for individuals with bipolar disorder. Proc IEEE Int Conf Acoust Speech Signal Process. Shanghai, China; March 20‐25, 2016:2359‐2363.
    67. Xing X, Cai B, Zhao Y, Li S, He Z, Fan W. Multi‐modality hierarchical recall based on GBDTs for bipolar disorder classification. Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop (AVEC '18); 2018:31‐37.
    68. Afshan A, Guo J, Park SJ, Ravi V, Flint J, Alwan A. Effectiveness of voice quality features in detecting depression. Interspeech. Hyderabad, India; 2018:1676‐1680.
    69. Kächele M, Schels M, Schwenker F. Inferring depression and affect from application dependent meta knowledge. Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge (AVEC '14). Orlando, FL; 2014:41‐48.
    70. Williamson JR, Quatieri TF, Helfer BS. Vocal and facial biomarkers of depression based on motor incoordination and timing. Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge. Orlando, FL; 2014.
    71. Ho YC, Pepyne DL. Simple explanation of the no‐free‐lunch theorem and its implications. J Optimiz Theory Appl. 2002;115(3):549‐570.
    72. Quatieri TF, Malyska N. Vocal‐source biomarkers for depression: a link to psychomotor activity. In: Thirteenth Annual Conference of the International Speech Communication Association. Portland, OR; September 9‐13, 2012.
    73. Marmar CR, Brown AD, Qian M, et al. Speech‐based markers for posttraumatic stress disorder in US veterans. Depress Anxiety. 2019;36(7):607‐616.
    74. Scherer S, Lucas GM, Gratch J, Skip Rizzo A, Morency L. Self‐reported symptoms of depression and PTSD are associated with reduced vowel space in screening interviews. IEEE Trans Affect Comput. 2016;7(1):59‐73.
    75. Xu R, Mei G, Zhang G, et al. A voice‐based automated system for PTSD screening and monitoring. Stud Health Technol Inform. 2012;173:552‐558.
    76. Kliper R, Vaizman Y, Weinshall D, Portuguese S. Evidence for depression and schizophrenia in speech prosody. In: Third ISCA Workshop on Experimental Linguistics. Saint‐Malo, France; June 19‐23, 2010.
    77. Kliper R, Portuguese S, Weinshall D. Prosodic analysis of speech and the underlying mental state. In: Serino S, Matic A, Giakoumis D, Lopez G, Cipresso P, eds. Pervasive Computing Paradigms for Mental Health. New York, NY: Springer International Publishing; 2016:52‐62.
    78. Perlini C, Marini A, Garzitto M, et al. Linguistic production and syntactic comprehension in schizophrenia and bipolar disorder. Acta Psychiatr Scand. 2012;126(5):363‐376.
    79. Tahir Y, Yang Z, Chakraborty D, et al. Non‐verbal speech cues as objective measures for negative symptoms in patients with schizophrenia. PLoS One. 2019;14(4):e0214314.
    80. Rapcan V, D'Arcy S, Yeap S, Afzal N, Thakore J, Reilly RB. Acoustic and temporal analysis of speech: a potential biomarker for schizophrenia. Med Eng Phys. 2010;32(9):1074‐1079.
    81. Guidi A, Schoentgen J, Bertschy G, Gentili C, Scilingo EP, Vanello N. Features of vocal frequency contour and speech rhythm in bipolar disorder. Biomed Signal Process Control. 2017;37:23‐31.
    82. Guidi A, Scilingo EP, Gentili C, Bertschy G, Landini L, Vanello N. Analysis of running speech for the characterization of mood state in bipolar patients. 2015 AEIT International Annual Conference (AEIT). Naples, Italy; 2015.
    83. Zhang J, Pan Z, Gui C, et al. Analysis on speech signal features of manic patients. J Psychiatr Res. 2018;98:59‐63.
    84. Hamilton M. Development of a rating scale for primary depressive illness. Br J Soc Clin Psychol. 1967;6(4):278‐296.
    85. Young RC, Biggs JT, Ziegler VE, Meyer DA. A rating scale for mania: reliability, validity and sensitivity. Br J Psychiatry. 1978;133:429‐435.
    86. Weeks JW, Lee C‐Y, Reilly AR, et al. "The sound of fear": assessing vocal fundamental frequency as a physiological indicator of social anxiety disorder. J Anxiety Disord. 2012;26(8):811‐822.
    87. Galili L, Amir O, Gilboa‐Schechtman E. Acoustic properties of dominance and request utterances in social anxiety. J Soc Clin Psychol. 2013;32(6):651‐673.
    88. Weeks JW, Srivastav A, Howell AN, Menatti AR. "Speaking more than words": classifying men with social anxiety disorder via vocal acoustic analyses of diagnostic interviews. J Psychopathol Behav Assess. 2016;38(1):30‐41.
    89. Özseven T, Düğenci M, Doruk A, Kahraman Hİ. Voice traces of anxiety: acoustic parameters affected by anxiety disorder. Arch Acoust. 2018;43(4):625‐636.
    90. Silber‐Varod V, Kreiner H, Lovett R, Levi‐Belz Y, Amir N. Do social anxiety individuals hesitate more? The prosodic profile of hesitation disfluencies in social anxiety disorder individuals. Proceedings of Speech Prosody. Boston, MA; 2016:1211‐1215.
    91. Cassol M, Reppold CT, Ferrão Y, Gurgel LG, Almada CP. Análise de características vocais e de aspectos psicológicos em indivíduos com transtorno obsessivo‐compulsivo [Analysis of vocal characteristics and psychological aspects in individuals with obsessive‐compulsive disorder]. Rev Soc Bras Fonoaudiol. 2010;15(4):491‐496.
    92. Cielo CA, Didoné DD, Torres EMO, de Moraes JP. Laryngopharyngeal reflux and bulimia nervosa: laryngeal and voice disorders. Rev CEFAC. 2011;13(2):352‐361.
    93. Rajiah K, Mathew EM, Veettil SK, Kumar S. Bulimia nervosa and its relation to voice changes in young adults: a simple review of epidemiology, complications, diagnostic criteria and management. J Res Med Sci. 2012;17(7):689‐693.
    94. Balata P, Colares V, Petribu K, Leal M de C. Bulimia nervosa as a risk factor for voice disorders: literature review. Braz J Otorhinolaryngol. 2008;74(3):447‐451.
    95. Rothstein SG, Rothstein JM. Bulimia: the otolaryngology head and neck perspective. Ear Nose Throat J. 1992;71(2):78‐80.
    96. Rothstein SG. Reflux and vocal disorders in singers with bulimia. J Voice. 1998;12(1):89‐90.
    97. Maciejewska B, Rajewska‐Rager A, Maciejewska‐Szaniec Z, Michalak M, Rajewski A, Wiskirska‐Woźnica B. The assessment of the impact of anorexia nervosa on the vocal apparatus in adolescent girls—a preliminary report. Int J Pediatr Otorhinolaryngol. 2016;85:141‐147.
    98. Garcia‐Santana C, Capilla P, Blanco A. Alterations in tone of voice in patients with restrictive anorexia nervosa: a pilot study. Clin Salud. 2016;27(2):71‐87.
    99. Kächele M, Schels M, Schwenker F. The influence of annotation, corpus design, and evaluation on the outcome of automatic classification of human emotions. Front ICT. 2016;3:17.
    100. Yang T‐H, Wu C‐H, Huang K‐Y, Su M‐H. Coupled HMM‐based multimodal fusion for mood disorder detection through elicited audio–visual signals. J Amb Intel Hum Comput. 2017;8(6):895‐906.
    101. Wang J, Sui X, Zhu T, Flint J. Identifying comorbidities from depressed people via voice analysis. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Kansas City, MO; November 13‐16, 2017:986‐991.
    102. Bernardini F, Lunden A, Covington M, et al. Associations of acoustically measured tongue/jaw movements and portion of time speaking with negative symptom severity in patients with schizophrenia in Italy and the United States. Psychiatry Res. 2016;239:253‐258.
    103. Arseniev‐Koehler A, Mozgai S, Scherer S. What type of happiness are you looking for? A closer look at detecting mental health from language. Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic; 2018:1‐12.
    104. Crawford AA, Lewis S, Nutt D, et al. Adverse effects from antidepressant treatment: randomised controlled trial of 601 depressed individuals. Psychopharmacology. 2014;231(15):2921‐2931.
    105. De Hert M, Detraux J, van Winkel R, Yu W, Correll CU. Metabolic and cardiovascular adverse effects associated with antipsychotic drugs. Nat Rev Endocrinol. 2011;8(2):114‐126.
    106. Pan W, Flint J, Shenhav L, et al. Re‐examining the robustness of voice features in predicting depression: compared with baseline of confounders. PLoS One. 2019;14(6):e0218172.
    107. Abu‐Mostafa YS, Magdon‐Ismail M, Lin HT. Learning from Data. Vol 4. New York, NY: AML Book; 2012.
    108. Ferreira CP, Gama ACC, Santos MAR, Maia MO. Laryngeal and vocal analysis in bulimic patients. Braz J Otorhinolaryngol. 2010;76(4):469‐477.
    109. Ziegler W. Task‐related factors in oral motor control: speech and oral diadochokinesis in dysarthria and apraxia of speech. Brain Lang. 2002;80(3):556‐575.
    110. Kiss G, Vicsi K. Comparison of read and spontaneous speech in case of automatic detection of depression. 2017 8th IEEE International Conference on Cognitive Infocommunications (CogInfoCom). Debrecen, Hungary; 2017:213‐218.
    111. Hashim NW, Wilkes M, Salomon R, Meggs J, France DJ. Evaluation of voice acoustics as predictors of clinical depression scores. J Voice. 2017;31(2):256.e1‐256.e6.
    112. Mundt JC, Snyder PJ, Cannizzaro MS, Chappie K, Geralts DS. Voice acoustic measures of depression severity and treatment response collected via interactive voice response (IVR) technology. J Neurolinguist. 2007;20(1):50‐64.
    113. Karam ZN, Baveja SS, McInnis M, Provost EM. Mood monitoring of bipolar disorder using speech analysis. US Patent; June 2017. Accessed July 30, 2019.
    114. Kane J, Aylett M, Yanushevskaya I, Gobl C. Phonetic feature extraction for context‐sensitive glottal source processing. Speech Commun. 2014;59:10‐21.
    115. van den Broek EL, van der Sluis F, Dijkstra T. Telling the story and re‐living the past: how speech analysis can reveal emotions in post‐traumatic stress disorder (PTSD) patients. In: Westerink J, Krans M, Ouwerkerk M, eds. Sensing Emotions: The Impact of Context on Experience Measurements. Dordrecht, The Netherlands: Springer; 2011:153‐180.
    116. Scherer S, Stratou G, Gratch J, Morency L‐P. Investigating voice quality as a speaker‐independent indicator of depression and PTSD. Interspeech. Lyon, France; August 25‐29, 2013:847‐851.
    117. Bylsma LM, Morris BH, Rottenberg J. A meta‐analysis of emotional reactivity in major depressive disorder. Clin Psychol Rev. 2008;28(4):676‐691.
    118. Alghowinem S, Goecke R, Wagner M, Epps J, Breakspear M, Parker G. Detecting depression: a comparison between spontaneous and read speech. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Vancouver, Canada; 2013:7547‐7551.
    119. DeVault D, Artstein R, Benn G, et al. SimSensei kiosk: a virtual human interviewer for healthcare decision support. Proceedings of the 2014 International Conference on Autonomous Agents and Multi‐Agent Systems (AAMAS '14). Paris, France; 2014:1061‐1068.
    120. Hartholt A, Traum D, Marsella SC, et al. All together now. Intelligent Virtual Agents. Berlin, Germany: Springer; 2013:368‐381.
    121. Burton C, Tatar AS, McKinstry B, et al. Pilot randomised controlled trial of Help4Mood, an embodied virtual agent‐based system to support treatment of depression. J Telemed Telecare. 2016;22(6):348‐355.
    122. Cummins N, Epps J, Sethu V, Krajewski J. Variability compensation in small data: oversampled extraction of i‐vectors for the classification of depressed speech. In: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Florence, Italy; 2014:970‐974.
    123. Eyben F, Wöllmer M, Schuller B. Opensmile: the Munich versatile and fast open‐source audio feature extractor. Proceedings of the 18th ACM International Conference on Multimedia (MM '10). Indianapolis, IN; 2010:1459‐1462.
    124. Nemes V, Nikolic D, Barney A, Garrard P. A feasibility study of speech recording using a contact microphone in patients with possible or probable Alzheimer's disease to detect and quantify repetitions in a natural setting. Alzheimers Dement. 2012;8(4):P490‐P491.
    125. McClure P, Zheng CY, Kaczmarzyk J. Distributed weight consolidation: a brain segmentation case study. Adv Neural Inf Process Syst 2018. Montreal, Canada; 2018.
    126. Smilkov D, Thorat N, Assogba Y, et al. TensorFlow.js: machine learning for the web and beyond. arXiv [csLG]. January 2019.
    127. Yarkoni T, Westfall J. Choosing prediction over explanation in psychology: lessons from machine learning. Perspect Psychol Sci. 2017;12(6):1100‐1122.
    128. Nosek BA, Ebersole CR, DeHaven AC, Mellor DT. The preregistration revolution. Proc Natl Acad Sci U S A. 2018;115(11):2600‐2606.
    129. Schuller B, Steidl S, Batliner A, et al. The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism. Proceedings INTERSPEECH 2013, 14th Annual Conference of the International Speech Communication Association. Lyon, France; 2013.
    130. Tausczik YR, Pennebaker JW. The psychological meaning of words: LIWC and computerized text analysis methods. J Lang Soc Psychol. 2010;29(1):24‐54.
    131. Tacchetti M. User Guide for ELAN Linguistic Annotator; 2017. Accessed December 25, 2019.
    132. Saito T, Rehmsmeier M. The precision‐recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015;10(3):e0118432.
    133. Lipton ZC. The mythos of model interpretability. arXiv [csLG]. June 2016.
    134. Doshi‐Velez F, Kim B. Towards a rigorous science of interpretable machine learning. arXiv [statML]. February 2017.
    135. Molnar C. Interpretable Machine Learning; 2019. Accessed December 25, 2019.
    136. Nori H, Jenkins S, Koch P, Caruana R. InterpretML: a unified framework for machine learning interpretability. arXiv [csLG]. September 2019.
    137. Lundberg SM, Lee S‐I. A unified approach to interpreting model predictions. In: Guyon I, Luxburg UV, Bengio S, et al., eds. Advances in Neural Information Processing Systems. Vol 30. Red Hook, NY: Curran Associates; 2017:4765‐4774.
    138. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell. 2019;1(5):206‐215.
    139. Boettiger C. An introduction to docker for reproducible research. Oper Syst Rev. 2015;49(1):71‐79.
    140. Kurtzer GM, Sochat V, Bauer MW. Singularity: scientific containers for mobility of compute. PLoS One. 2017;12(5):e0177459.
    141. Oakden‐Rayner L. AI Competitions Don't Produce Useful Models. Published September 19, 2019. Accessed December 25, 2019.
    142. Mount J. A Deeper Theory of Testing. Win‐Vector Blog. Published September 26, 2015. Accessed December 25, 2019.
    143. Blum A, Hardt M. The ladder: a reliable leaderboard for machine learning competitions. arXiv [csLG]. February 2015.
    144. Alghowinem S, Goecke R, Epps R, Wagner M, Cohn J. Cross‐cultural depression recognition from vocal biomarkers. Interspeech. San Francisco, CA; September 8‐12, 2016:1943‐1947.
    145. Mitra V, Shriberg E, Vergyri D, Knoth B, Salomon RM. Cross‐corpus depression prediction from speech. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Brisbane, Australia; 2015:4769‐4773.
    146. Cummins N, Sethu V, Epps J, Schnieder S, Krajewski J. Analysis of acoustic space variability in speech affected by depression. Speech Commun. 2015;75:27‐49.
    147. Stasak B, Epps J. Differential performance of automatic speech‐based depression classification across smartphones. In: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW). San Antonio, TX; 2017:171‐175.
    148. Mitra V, Tsiartas A, Shriberg E. Noise and reverberation effects on depression detection from speech. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Shanghai, China; 2016:5795‐5799.
    149. Karam ZN, Provost EM, Singh S, et al. Ecologically valid long‐term mood monitoring of individuals with bipolar disorder using speech. Proc IEEE Int Conf Acoust Speech Signal Process. Florence, Italy; 2014:4858‐4862.
    150. Muaremi A, Gravenhorst F, Grünerbl A, Arnrich B, Tröster G. Assessing bipolar episodes using speech cues derived from phone calls. In: Serino S, Matic A, Giakoumis D, Lopez G, Cipresso P, eds. Pervasive Computing Paradigms for Mental Health. New York, NY: Springer International Publishing; 2014:103‐114.
    151. Yang Y, Fairbairn C, Cohn JF. Detecting depression severity from vocal prosody. IEEE Trans Affect Comput. 2013;4(2):142‐150.
    152. He L, Jiang D, Sahli H. Multimodal depression recognition with dynamic visual and audio cues. In: 2015 International Conference on Affective Computing and Intelligent Interaction (ACII). Xian, China; 2015:260‐266.
    153. Acharya UR, Oh SL, et al. Automated EEG‐based screening of depression using deep convolutional neural network. Comput Methods Programs Biomed. 2018;161:103‐113.
    154. Friston KJ, Redish AD, Gordon JA. Computational nosology and precision psychiatry. Comput Psychiatr. 2017;1:2‐23.
    155. Allsopp K, Read J, Corcoran R, Kinderman P. Heterogeneity in psychiatric diagnostic classification. Psychiatry Res. 2019;279:15‐22.
    156. Custers B. Click here to consent forever: expiry dates for informed consent. Big Data Soc. 2016;3(1):2053951715624935.
    157. Amodei D, Olah C, Steinhardt J, Christiano P, Schulman J, Mané D. Concrete problems in AI safety. arXiv [csAI]. June 2016.
    158. Baker MG, Kale R, Menken M. The wall between neurology and psychiatry. BMJ. 2002;324(7352):1468‐1469.
    159. Ibáñez A, García AM, Esteves S, et al. Social neuroscience: undoing the schism between neurology and psychiatry. Soc Neurosci. 2018;13(1):1‐39.
    160. Iverach L, O'Brian S, Jones M, et al. Prevalence of anxiety disorders among adults seeking speech therapy for stuttering. J Anxiety Disord. 2009;23(7):928‐934.
    161. Paolini AG. Trait anxiety affects the development of tinnitus following acoustic trauma. Neuropsychopharmacology. 2012;37(2):350‐363.
    162. Gomaa MAM, Elmagd MHA, Elbadry MM, Kader RMA. Depression, anxiety and stress scale in patients with tinnitus and hearing loss. Eur Arch Otorhinolaryngol. 2014;271(8):2177‐2184.
    163. Marmor S, Horvath KJ, Lim KO, Misono S. Voice problems and depression among adults in the United States. Laryngoscope. 2016;126(8):1859‐1864.
    164. Martinez CC, Cassol M. Measurement of voice quality, anxiety and depression symptoms after speech therapy. J Voice. 2015;29(4):446‐449.
    165. Lannin DG, Guyll M, Vogel DL, Madon S. Reducing the stigma associated with seeking psychotherapy through self‐affirmation. J Couns Psychol. 2013;60(4):508‐519.

Source: PubMed
