COVID-19 Artificial Intelligence Diagnosis Using Only Cough Recordings

Jordi Laguarta, Ferran Hueto, Brian Subirana

Abstract

Goal: We hypothesized that COVID-19 subjects, especially asymptomatic ones, could be accurately discriminated using only a forced-cough cell phone recording and Artificial Intelligence. To train our MIT Open Voice model, we built a data collection pipeline of COVID-19 cough recordings through our website (opensigma.mit.edu) between April and May 2020 and created the largest balanced audio COVID-19 cough dataset reported to date, with 5,320 subjects.

Methods: We developed an AI speech processing framework that leverages acoustic biomarker feature extractors to pre-screen for COVID-19 from cough recordings and provides a personalized patient saliency map to longitudinally monitor patients in real time, non-invasively, and at essentially zero variable cost. Cough recordings are transformed with Mel Frequency Cepstral Coefficients (MFCC) and fed into a Convolutional Neural Network (CNN) based architecture made up of one Poisson biomarker layer and three pre-trained ResNet50s in parallel, outputting a binary pre-screening diagnostic. Our CNN-based models were trained on 4,256 subjects and tested on the remaining 1,064 subjects of our dataset. Transfer learning was used to learn biomarker features on larger datasets, an approach previously tested successfully in our lab on Alzheimer's, which significantly improves the COVID-19 discrimination accuracy of our architecture.

Results: When validated with subjects diagnosed using an official test, the model achieves a COVID-19 sensitivity of 98.5% with a specificity of 94.2% (AUC: 0.97). For asymptomatic subjects, it achieves a sensitivity of 100% with a specificity of 83.2%.

Conclusions: AI techniques can produce a free, non-invasive, real-time, any-time, instantly distributable, large-scale COVID-19 asymptomatic screening tool to augment current approaches to containing the spread of COVID-19. Practical use cases include daily screening of students, workers, and the public as schools, jobs, and transport reopen, and pool testing to quickly alert of outbreaks in groups. General speech biomarkers may exist that cover several disease categories, as we demonstrated by using the same ones for COVID-19 and Alzheimer's.
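To illustrate how the sensitivity and specificity figures quoted in the Results are defined, the following minimal Python sketch computes both from binary labels and predictions. The function name and the toy data are hypothetical and are not taken from the study; 1 denotes a COVID-19 positive subject and 0 a negative one.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions.

    sensitivity = TP / (TP + FN): fraction of positives that are detected.
    specificity = TN / (TN + FP): fraction of negatives that are cleared.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative subject")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 3 positive and 5 negative subjects.
labels      = [1, 1, 1, 0, 0, 0, 0, 0]
predictions = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(labels, predictions)
# Here 2 of 3 positives are detected and 4 of 5 negatives are cleared.
```

As a side note on the Methods, the quoted training and test counts (4,256 and 1,064) correspond to an 80/20 split of the 5,320 subjects.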

Keywords: AI diagnostics; COVID-19 screening; convolutional neural networks; deep learning; speech recognition.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

Figures

Fig. 1.
Overview architecture of the COVID-19 discriminator with cough recordings as input, and COVID-19 diagnosis and longitudinal saliency map as output. A similar architecture was used for Alzheimer's.
Fig. 2.
The top orange line with a square shows the ROC curve for the set of subjects diagnosed with an official test (AUC: 0.97), while the bottom blue curve with a circle shows the ROC curve for all subjects in the validation set. The square marks the chosen threshold with 98.5% sensitivity and 94.2% specificity on officially tested subjects, and the black circle marks the chosen threshold for high sensitivity (94.0%) on the whole validation set, although any point on the curve could be chosen depending on the use case.
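The caption notes that any point on the ROC curve could be chosen depending on the use case. A minimal sketch of that choice, assuming the model outputs a score per subject and a subject is flagged when the score is at or above a threshold, is given below. The function names, the toy scores, and the selection rule (the highest threshold that meets a target sensitivity) are illustrative assumptions, not the authors' procedure.

```python
def roc_points(scores, labels):
    """List (threshold, sensitivity, specificity) for each distinct score.

    A subject is predicted positive when score >= threshold. Thresholds are
    visited from high to low, so sensitivity never decreases along the list
    and specificity never increases.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 0)
        points.append((thr, tp / n_pos, 1 - fp / n_neg))
    return points


def pick_threshold(points, min_sensitivity):
    """Return the first (highest-threshold) point meeting the target sensitivity.

    Because specificity is non-increasing along the list, this is also the
    qualifying point with the highest specificity.
    """
    for thr, sens, spec in points:
        if sens >= min_sensitivity:
            return thr, sens, spec
    raise ValueError("no threshold reaches the requested sensitivity")


# Toy data (hypothetical): 4 positive and 4 negative subjects.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
points = roc_points(scores, labels)
screening_choice = pick_threshold(points, min_sensitivity=1.0)
```

A screening use case would typically ask for a high minimum sensitivity and accept the specificity that results, which is the trade-off the two marked points in the figure represent.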
Fig. 3.
A. The numbers on the x-axis give the number of layers in the biomarker models that were fine-tuned to COVID-19. The fewer layers required to beat the baseline (the same architecture trained on COVID-19 discrimination without the pre-trained biomarker models), the more relevant that biomarker is for COVID-19. “Complete” shows the final COVID-19 discriminator with all the biomarkers integrated. B. The white dotted part of each bar shows the performance gained when the Cough biomarker model is incorporated, while “pre-trained” denotes individually training the biomarker models for COVID-19 before integrating them into the multi-modal architecture in Fig. 1. C. The explainable saliency map derived from biomarker model predictions to longitudinally track patient progression; it is analogous to the saliency map derived for Alzheimer's. OVBM denotes the final model diagnostic. The BrainOS section shows the model's aggregated prediction for 1-4 coughs of a subject. The COVID-19 progress factor estimates, from the 1-4 cough predictions, a possible degree of severity based on the quantity of acoustic information required for a confident diagnostic. The voting confidence and salient factor indicate, based on the composite predictions of individual biomarker models, the aggregate confidence and salient discrimination for each subject.
Fig. 4.
In cases where there are very few infected individuals, a group pre-screening tool can be derived from the COVID-19 OVBM model to accurately alert infected groups while avoiding false positives, as illustrated in the graph. With the current accuracy, shown in blue, a threshold of 3 positives in a group of 25 is required so that only 1% of groups of 25 with no cases are falsely labelled and therefore unnecessarily tested via expensive biological tests. In other words, on a campus with 2,500 as yet uninfected students, only 25 will have to be tested with biological methods until 3 people in a class of 25 catch the virus, in which case the screening will alert of the outbreak. The x-axis shows how the required number of positives in a group, 3 in this example, drops if the COVID-19 model accuracy improves. Each line shows the percentage of groups of 25 people falsely tagged with COVID-19 for a given minimum number of COVID-19 positives in the group. As a second example, assume a country like New Zealand, with very few COVID-19 cases, wanted to screen for new early outbreaks and to do so tested 50M inhabitants using a PCR or serology test with 99% specificity. The country would purchase 50M tests and obtain 500,000 false positives. Meanwhile, assume a group test yielding 99.9% accuracy was used, i.e. requiring 5 positives instead of 3 in the example above. Of the 2M groups of 25, only 2,000 groups, or 50,000 people, would be falsely tagged. Hence, 0.1% of the cost and 0.1% of the false positives otherwise. The value of this group testing tool is that it enables organizations and countries to pre-screen their whole population daily and rapidly locate incipiently infected groups, without the need to use an expensive PCR or serology test on each inhabitant.
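The group threshold idea in this caption can be sketched with a binomial tail probability: the chance that an uninfected group of a given size produces at least the required number of positives. The sketch below assumes that false positives are independent across people and share a single per-person false-positive rate; both are simplifying assumptions of this example, and the function name and the rate used in the example call are hypothetical. It is not meant to reproduce the curves in the figure. The second part only re-derives the counts stated in the country-wide example.

```python
from math import comb


def prob_group_falsely_flagged(group_size, min_positives, false_positive_rate):
    """P(at least min_positives false positives in an uninfected group).

    Assumes independent errors with the same per-person false-positive rate
    (an assumption of this sketch, not a claim about the study's model).
    """
    p = false_positive_rate
    below_threshold = sum(
        comb(group_size, k) * p**k * (1 - p) ** (group_size - k)
        for k in range(min_positives)
    )
    return 1 - below_threshold


# Raising the required number of positives lowers the false-alert probability.
# The 2% per-person rate is a hypothetical value chosen for illustration.
for k in (1, 3, 5):
    print(k, prob_group_falsely_flagged(25, k, 0.02))

# Counts from the country-wide example in the caption.
population = 50_000_000
individual_false_positives = population // 100   # 1% of 50M people (99% specificity)
groups = population // 25                        # groups of 25 people
falsely_tagged_groups = groups // 1000           # 0.1% of groups (99.9% group accuracy)
falsely_tagged_people = falsely_tagged_groups * 25
```

With these inputs the counts come out as 500,000 individual false positives, 2,000,000 groups, 2,000 falsely tagged groups, and 50,000 people in those groups, matching the figures in the caption.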

Source: PubMed
