Automated Cough Assessment on a Mobile Platform

Mark Sterling, Hyekyun Rhee, Mark Bocko

Abstract

The development of an Automated Device for Asthma Monitoring (ADAM) is described. The system consists of a consumer electronics mobile platform running a custom application. The application acquires an audio signal from an external user-worn microphone connected to the device's analog-to-digital converter (microphone input). This signal is processed to determine the presence or absence of cough sounds. Symptom tallies and raw audio waveforms are recorded and made easily accessible for later review by a healthcare provider. The symptom detection algorithm is based upon standard speech recognition and machine learning paradigms and consists of an audio feature extraction step followed by a hidden Markov model (HMM) based Viterbi decoder that has been trained on a large database of audio examples from a variety of subjects. Multiple HMM topologies and model orders are studied. Performance of the recognizer is reported in terms of sensitivity and false alarm rate as determined in a cross-validation test.
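As a concrete illustration of the processing chain summarized above, the following is a minimal Python/NumPy sketch of how captured audio could be framed and passed through the three stages (feature extraction, composite-HMM Viterbi decoding, symptom tallying). The frame length, hop size, and function names are illustrative assumptions rather than the application's actual implementation; the sketches accompanying the figures below fill in the individual stages.

```python
import numpy as np

# Assumed analysis parameters (not taken from the paper): ~23 ms frames with
# 50% overlap at the 11025 Hz sampling rate of the recordings.
FRAME_LEN = 256
FRAME_HOP = 128

def frame_signal(x, frame_len=FRAME_LEN, hop=FRAME_HOP):
    """Split a 1-D audio signal into overlapping frames (one frame per row)."""
    n_frames = (len(x) - frame_len) // hop + 1
    if n_frames < 1:
        return np.empty((0, frame_len))
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def detect_coughs(audio, extract_features, viterbi_decode, count_coughs):
    """Plumbing only: the three callables stand in for the stages described
    in the abstract; sketches of each are given with Figures 3, 7, and 8."""
    frames = frame_signal(np.asarray(audio, dtype=float))
    feats = np.array([extract_features(f) for f in frames])
    state_path = viterbi_decode(feats)
    return count_coughs(state_path)
```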

Figures

Figure 1
Representative audio waveforms from the database. The top trace shows a long segment of silence followed by a cough. The bottom trace shows a segment of background noise. The recordings are 16-bit PCM .wav files at a sampling rate of 11025 Hz.
Figure 2
32-point Mel-filterbank used in the feature extraction step.
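For readers who want to reproduce a filterbank like the one in Figure 2, the sketch below builds 32 triangular filters spaced uniformly on the Mel scale for the 11025 Hz sampling rate of the recordings. The FFT size is an assumed value and is not tied to the paper's exact analysis parameters.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters=32, n_fft=512, fs=11025, f_lo=0.0, f_hi=None):
    """Triangular filters with centers spaced uniformly on the Mel scale.
    Returns an (n_filters, n_fft // 2 + 1) weight matrix; n_fft is an
    assumed value."""
    f_hi = fs / 2.0 if f_hi is None else f_hi
    mel_pts = np.linspace(hz_to_mel(f_lo), hz_to_mel(f_hi), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, ctr, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, ctr):
            fb[i - 1, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fb[i - 1, k] = (hi - k) / max(hi - ctr, 1)
    return fb
```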
Figure 3
Illustrative example of the audio features (MFCCs and log-energy) seen by the classifier.
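A per-frame feature vector of the kind shown in Figure 3 (MFCCs plus log-energy) can be computed as in the sketch below; the number of cepstral coefficients, the Hamming window, and the FFT size are illustrative assumptions, and mel_filterbank is the function sketched after Figure 2.

```python
import numpy as np
from scipy.fftpack import dct  # type-II DCT for the cepstral step

def frame_features(frame, mel_fb, n_ceps=12, n_fft=512, eps=1e-10):
    """Return [MFCC_1 .. MFCC_n_ceps, log-energy] for one audio frame.
    n_ceps, the window, and n_fft are assumed choices."""
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2        # power spectrum
    log_mel = np.log(mel_fb @ power + eps)                   # 32 log Mel-band energies
    mfcc = dct(log_mel, type=2, norm='ortho')[1:n_ceps + 1]  # drop the 0th coefficient
    log_energy = np.log(np.sum(frame ** 2) + eps)
    return np.concatenate([mfcc, [log_energy]])
```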
Figure 4
Image of the transition map of the composite HMM. The matrix has a block structure, with each diagonal block corresponding to one of the sound classes (background, cough, silence). The intensity of each pixel in the image corresponds to the log-probability p_ij for a token to pass from the row state to the column state. The off-diagonal blocks correspond to transitions between the sound classes.
Figure 5
The ergodic/connected HMM topology (a) and the left-to-right HMM topology (b). The ergodic (fully connected) topology allows a transition from any state to any other state. The left-to-right topology only allows a transition from a state to itself or to the next state in the sequence.
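The two topologies in Figure 5 amount to different constraints on the state-transition matrix. The sketch below builds uniformly initialized transition matrices for both; the self-loop probability is an arbitrary starting value that training would re-estimate.

```python
import numpy as np

def ergodic_transitions(n_states):
    """Ergodic/fully connected topology: any state may follow any state."""
    return np.full((n_states, n_states), 1.0 / n_states)

def left_to_right_transitions(n_states, p_stay=0.6):
    """Left-to-right topology: each state may only loop on itself or advance
    to the next state; the final state only loops.  p_stay is an assumed
    initial value, to be re-estimated during training."""
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = p_stay
        A[i, i + 1] = 1.0 - p_stay
    A[-1, -1] = 1.0
    return A
```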
Figure 6
Grammar network used for constructing a composite HMM from the silence, cough, and background HMMs. This parallel structure is the simplest and least constrained way to combine the individual HMMs. The gray boxes represent dummy (non-emitting) states that route the model outputs back to the model inputs and, importantly, do not consume a time frame.
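One plausible way to realize the parallel grammar of Figure 6 in matrix form, and thereby obtain the block structure shown in Figure 4, is sketched below: the per-class transition matrices are stacked on the diagonal and each model's final state is allowed to exit and re-enter any model. The exit probability and the uniform routing weights are assumptions standing in for the grammar's non-emitting dummy states.

```python
import numpy as np

def compose_parallel(models, p_exit=0.1):
    """Combine a list of per-class transition matrices into one composite
    matrix with the block structure of Figure 4.  The last state of each
    model exits with probability p_exit (an assumed value); the exit mass
    is routed uniformly to the entry state of every model, playing the role
    of the grammar's non-emitting dummy states."""
    sizes = [A.shape[0] for A in models]
    offsets = np.cumsum([0] + sizes[:-1])
    entries = offsets                        # first (entry) state of each model
    C = np.zeros((sum(sizes), sum(sizes)))
    for A, off in zip(models, offsets):
        k = A.shape[0]
        C[off:off + k, off:off + k] = A
        C[off + k - 1, :] *= (1.0 - p_exit)              # free mass for exiting
        C[off + k - 1, entries] += p_exit / len(models)  # route to all entries
    return C
```

With, for example, left_to_right_transitions(5) for the cough model and ergodic_transitions(6) for the background model, compose_parallel produces a composite matrix matching the model sizes quoted in the Figure 7 caption.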
Figure 7
Example Viterbi decoding pass for audio containing coughs. In this case the background model had 6 states with 60 mixtures per state, while the cough model had 5 states with 60 mixtures per state and a left-to-right topology. The horizontal axis represents time (at the frame rate) and the vertical axis is the integer state index. Gray marks indicate states associated with the silence and background models. Black marks indicate states associated with the cough model.
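The state trace in Figure 7 is the output of a Viterbi pass over the composite model. A compact log-domain Viterbi decoder is sketched below; how the per-frame state log-likelihoods are obtained (in the paper, from Gaussian mixtures) is deliberately left outside the sketch.

```python
import numpy as np

def viterbi(log_A, log_pi, log_b):
    """Most likely state path through an HMM.
    log_A : (S, S) log transition matrix (e.g. np.log of the composite matrix)
    log_pi: (S,)   log initial-state probabilities
    log_b : (T, S) per-frame state log-likelihoods (e.g. from GMMs)
    Returns a length-T array of decoded state indices."""
    T, S = log_b.shape
    delta = log_pi + log_b[0]              # best score ending in each state
    psi = np.zeros((T, S), dtype=int)      # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_A    # previous state x next state
        psi[t] = np.argmax(scores, axis=0)
        delta = scores[psi[t], np.arange(S)] + log_b[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):
        path[t - 1] = psi[t, path[t]]
    return path
```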
Figure 8
Screenshots of the ADAM application showing live capture and processing of audio data. The top traces are time-domain audio waveforms and the bottom traces indicate the presence of coughs. The left screenshot shows a sequence of coughs being recognized, while the right screenshot shows example speech being rejected. This is a demonstration mode used for debugging and is normally inaccessible to patients; hence the interface is somewhat rough. The total length of audio represented in each screenshot is 6 seconds. The middle number at the upper left is the actual cough count, which is determined by finding the instances where the last cough state is followed by the first state of any model in the grammar.
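The cough-counting rule described in the Figure 8 caption (an event is registered whenever the last cough state is followed by the first state of any model in the grammar) reduces to a short scan of the decoded path. The composite state numbering in the usage comment is an assumption for illustration.

```python
def count_coughs(path, cough_last_state, entry_states):
    """Count cough events in a decoded state path: one event per occurrence
    of the cough model's final state immediately followed by the entry state
    of any model in the grammar (composite-HMM state indices)."""
    entries = set(entry_states)
    return sum(
        1
        for prev, nxt in zip(path[:-1], path[1:])
        if prev == cough_last_state and nxt in entries
    )

# Assumed composite numbering for illustration: silence = state 0,
# cough = states 1..5 (last cough state 5), background = states 6..11.
# path = viterbi(log_A, log_pi, log_b)
# n_coughs = count_coughs(path, cough_last_state=5, entry_states=[0, 1, 6])
```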
Algorithm 1
