Cough event classification by pretrained deep neural network

Jia-Ming Liu, Mingyu You, Zheng Wang, Guo-Zheng Li, Xianghuai Xu, Zhongmin Qiu

Abstract

Background: Cough is an essential symptom of respiratory diseases. To measure cough severity, the respiratory disease community needs an accurate and objective cough monitor. This paper introduces a better-performing algorithm, the pretrained deep neural network (DNN), to the cough classification problem, which is a key step in a cough monitor.

Method: The deep neural network models are built in two steps, pretraining and fine-tuning, followed by a Hidden Markov Model (HMM) decoder that captures the temporal information of the audio signals. Unsupervised pretraining of a deep belief network provides a good initialization for the deep neural network. Fine-tuning then uses backpropagation to tune the network so that it predicts the observation probability associated with each HMM state, where the HMM state labels are originally obtained by forced alignment with a Gaussian Mixture Model Hidden Markov Model (GMM-HMM) on the training samples. Three cough HMMs and one noncough HMM are employed to model coughs and noncoughs, respectively. The final decision is based on the Viterbi decoding algorithm, which generates the most likely HMM sequence for each sample. A sample is labeled as cough if a cough HMM is found in the sequence.
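The decoding and decision step can be illustrated with a minimal sketch. It is not the authors' implementation: each model is collapsed to a single state, the function names (`viterbi`, `is_cough`), the state labels, and all probabilities are hypothetical, and the per-frame observation scores are hard-coded where a real system would take them from the fine-tuned DNN.

```python
import math

def viterbi(log_init, log_trans, log_obs):
    """Return the most likely state path under a log-domain HMM.

    log_init[s]     : log prior of starting in state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_obs[t][s]   : log observation score of frame t under state s
                      (in a hybrid system this would come from the DNN)
    """
    n_states = len(log_init)
    score = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(log_obs)):
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_obs[t][s])
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def is_cough(path, state_to_model):
    """Label a sample as cough if any decoded state belongs to a cough model."""
    return any(state_to_model[s].startswith("cough") for s in path)

# Toy example (all numbers are made up): state 0 = noncough, state 1 = one cough model.
STATE_TO_MODEL = ["noncough", "cough_1"]
LOG_INIT = [math.log(0.9), math.log(0.1)]
LOG_TRANS = [[math.log(0.8), math.log(0.2)],
             [math.log(0.3), math.log(0.7)]]
# Four frames; the two middle frames look cough-like.
LOG_OBS = [[math.log(0.9), math.log(0.1)],
           [math.log(0.1), math.log(0.9)],
           [math.log(0.1), math.log(0.9)],
           [math.log(0.9), math.log(0.1)]]

decoded = viterbi(LOG_INIT, LOG_TRANS, LOG_OBS)
label = is_cough(decoded, STATE_TO_MODEL)
```

With three cough HMMs, as in the method above, `STATE_TO_MODEL` would simply list more cough entries; the decision rule stays the same because it only checks whether any cough model appears in the decoded sequence.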

Results: The experiments were conducted on a dataset collected from 22 patients with respiratory diseases. Patient-dependent (PD) and patient-independent (PI) experimental settings were used to evaluate the models. Five criteria (sensitivity, specificity, F1, macro average, and micro average) are reported to depict different aspects of the models. On the overall evaluation criteria, the DNN-based methods are superior to the traditional GMM-HMM-based method on F1 and micro average, with maximal error reductions of 14% and 11% in PD and 7% and 10% in PI, while keeping similar performance on macro average. They also surpass the GMM-HMM model on specificity, with a maximal 14% error reduction in both PD and PI.
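As a rough illustration of how such figures are computed, the sketch below derives sensitivity, specificity, and F1 from confusion counts, and a relative error reduction between two scores. The counts and scores are hypothetical and are not taken from the paper. Reading "error reduction" as the relative drop in (1 - score) is an assumption, and the macro and micro averages are left out because the abstract does not define them.

```python
def sensitivity(tp, fn):
    """Fraction of true coughs that were detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of true noncoughs that were rejected."""
    return tn / (tn + fp)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and sensitivity for the cough class."""
    return 2 * tp / (2 * tp + fp + fn)

def relative_error_reduction(baseline_score, new_score):
    """Relative drop in error, where error = 1 - score (assumed definition)."""
    baseline_error = 1.0 - baseline_score
    new_error = 1.0 - new_score
    return (baseline_error - new_error) / baseline_error

# Hypothetical confusion counts for one classifier.
TP, FN, FP, TN = 80, 20, 10, 90
sens = sensitivity(TP, FN)
spec = specificity(TN, FP)
f1 = f1_score(TP, FP, FN)

# Hypothetical scores: a baseline at 0.800 and a new model at 0.828.
reduction = relative_error_reduction(0.800, 0.828)
```

Under this reading, a 14% error reduction means that the remaining error shrinks by about one seventh, not that the score itself rises by 14 points.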

Conclusions: In this paper, we applied a pretrained deep neural network to the cough classification problem. Our results show that, compared with the conventional GMM-HMM framework, the HMM-DNN achieves better overall performance on the cough classification task.

Figures

Figure 1
A simple example of an RBM with 3 visible units and 4 hidden units.
Figure 2
The training process for the combination of DNN and HMM.
Figure 3
Performance on the patient-dependent test set. The results are shown as a function of the number of layers and the number of hidden units in each layer. The baseline performance was generated by a conventional GMM-HMM model.
Figure 4
Performance on the patient-independent test set. The setting is the same as in Figure 3, except that these results are generated from a patient-independent test set.

Source: PubMed
