Automatic Posture and Movement Tracking of Infants with Wearable Movement Sensors

Manu Airaksinen, Okko Räsänen, Elina Ilén, Taru Häyrinen, Anna Kivi, Viviana Marchi, Anastasia Gallen, Sonja Blom, Anni Varhe, Nico Kaartinen, Leena Haataja, Sampsa Vanhatalo

Abstract

Infants' spontaneous and voluntary movements mirror the developmental integrity of brain networks, since they require coordinated activation of multiple sites in the central nervous system. Accordingly, early detection of infants with atypical motor development holds promise for recognizing those infants who are at risk for a wide range of neurodevelopmental disorders (e.g., cerebral palsy, autism spectrum disorders). Wearable technology has previously shown promise for offering efficient, scalable, and automated methods for movement assessment in adults. Here, we describe the development of an infant wearable, a multi-sensor smart jumpsuit that allows mobile accelerometer and gyroscope data collection during movement. Using this suit, we first recorded play sessions of 22 typically developing infants of approximately 7 months of age. These data were manually annotated for infant posture and movement based on video recordings of the sessions, using a novel annotation scheme specifically designed to assess the overall movement pattern of infants in this age group. A machine learning algorithm based on deep convolutional neural networks (CNNs) was then trained for automatic detection of posture and movement classes using the data and annotations. Our experiments show that the setup can be used for quantitative tracking of infant movement activities with human-equivalent accuracy, i.e., it meets human inter-rater agreement levels in infant posture and movement classification. We also quantify the ambiguity of human observers in analyzing infant movements, and propose a method for exploiting this uncertainty to improve the training of the automated classifier. A comparison of different sensor configurations further shows that four-limb recording yields the best performance in posture and movement classification.
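As an illustration of the data format involved, the following is a minimal sketch (not code from the study) of framing synchronized multi-sensor accelerometer and gyroscope streams into fixed-length windows for frame-level classification; the sampling rate, window length, and channel layout are illustrative assumptions.

```python
# Minimal sketch (not from the paper): framing synchronized multi-sensor
# IMU streams into fixed-length analysis windows for classification.
import numpy as np

def frame_imu_stream(data, win_len, hop_len):
    """Slice a (n_samples, n_channels) recording into overlapping frames of
    shape (n_frames, win_len, n_channels)."""
    starts = range(0, data.shape[0] - win_len + 1, hop_len)
    return np.stack([data[s:s + win_len] for s in starts])

# Example: 4 sensors x (3-axis accelerometer + 3-axis gyroscope) = 24 channels,
# a hypothetical 50 Hz sampling rate, 2 s windows with 50% overlap.
fs = 50
recording = np.random.randn(fs * 60, 24)              # one minute of synthetic data
frames = frame_imu_stream(recording, win_len=2 * fs, hop_len=fs)
print(frames.shape)                                   # (59, 100, 24)
```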

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Experimental design. (a) Photograph of the smart jumpsuit with four proximally placed movement sensors. (b) The annotation setup displaying annotations with synchronized video and movement data.
Figure 2
Design of the automatic classification pipeline. (a) The convolutional neural network (CNN) architecture used in the study as the main classifier. The role of the sensor module is to perform sensor-specific feature extraction, the sensor fusion module fuses sensor-level features into frame-level features, and the time series modeling module captures the temporal dependencies across frame-level features. (b) Block diagram for the iterative annotation refinement (IAR) procedure used to improve CNN performance through classifier-assisted resolution of inter-annotator inconsistencies on the training data.
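The caption above outlines the three processing stages of the classifier. Below is a minimal PyTorch sketch of an analogous per-sensor feature extraction, sensor fusion, and temporal modeling pipeline; all layer sizes, kernel widths, and the use of an LSTM for the time-series stage are assumptions for illustration, not the architecture reported in the paper.

```python
# Sketch of a three-stage architecture analogous to Fig. 2a: per-sensor
# convolutional feature extraction, cross-sensor fusion, and temporal
# modeling over frame-level features. Hyperparameters are illustrative only.
import torch
import torch.nn as nn

class SensorModule(nn.Module):
    """Sensor-specific feature extractor applied to one IMU's channels."""
    def __init__(self, in_channels=6, feat_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, feat_dim, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),       # one feature vector per frame
        )

    def forward(self, x):                  # x: (batch, in_channels, samples)
        return self.net(x).squeeze(-1)     # (batch, feat_dim)

class PostureMovementNet(nn.Module):
    def __init__(self, n_sensors=4, feat_dim=32, n_classes=5):
        super().__init__()
        self.sensor_modules = nn.ModuleList(
            [SensorModule(feat_dim=feat_dim) for _ in range(n_sensors)])
        self.fusion = nn.Linear(n_sensors * feat_dim, feat_dim)           # sensor fusion
        self.temporal = nn.LSTM(feat_dim, feat_dim, batch_first=True)     # time-series model
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        # x: (batch, n_frames, n_sensors, channels, samples)
        b, t, s, c, n = x.shape
        frame_feats = []
        for i, module in enumerate(self.sensor_modules):
            xi = x[:, :, i].reshape(b * t, c, n)
            frame_feats.append(module(xi))                  # sensor-level features
        fused = torch.relu(self.fusion(torch.cat(frame_feats, dim=-1)))
        out, _ = self.temporal(fused.view(b, t, -1))        # temporal dependencies
        return self.classifier(out)                         # per-frame class logits

# Toy forward pass: 2 clips, 10 frames, 4 sensors, 6 channels, 100 samples per frame.
logits = PostureMovementNet()(torch.randn(2, 10, 4, 6, 100))
print(logits.shape)                                         # torch.Size([2, 10, 5])
```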
Figure 3
(a) Total cumulative confusion matrices across all possible annotator pairs for the Posture (top) and Movement (bottom) tracks. The percentages in each column sum to one, and the absolute values denote the number of frames corresponding to each cell. (b) t-SNE visualization of the entire dataset based on SVM input features. Color coding is based on the annotations of the Posture (top left) and Movement categories (the rest). The visualization of the Movement track has been broken down into 3/3 (top right), 2/3 (bottom right), and 1/3 (bottom left) annotator agreement levels. The difference in annotation ambiguity between the two tracks is clearly visible.
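A minimal sketch of producing a t-SNE projection of frame-level feature vectors coloured by annotated class, in the spirit of panel (b); the features and labels below are synthetic placeholders, and the embedding settings are not those behind the published figure.

```python
# Sketch: 2-D t-SNE embedding of frame-level features, coloured by class.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 40))      # placeholder frame-level features
labels = rng.integers(0, 5, size=500)      # placeholder posture/movement labels

embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=0).fit_transform(features)

plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, s=5, cmap="tab10")
plt.title("t-SNE of frame-level features (synthetic illustration)")
plt.show()
```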
Figure 4
Performance of classifiers. Class-specific F-score box plots for individual recordings for the Posture and Movement tracks using the CNN (blue) and SVM (red) classifiers. Statistically significant recording-level differences (p …)
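A minimal sketch of computing class-specific F-scores per recording so that two classifiers can be compared as in this figure; the label sequences are synthetic, and the paired Wilcoxon signed-rank test used here is only one plausible choice for the recording-level comparison, not necessarily the test used in the study.

```python
# Sketch: per-recording, class-specific F1 for two classifiers, then a paired
# non-parametric comparison across recordings (all data below is synthetic).
import numpy as np
from sklearn.metrics import f1_score
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
n_recordings, n_frames, n_classes = 22, 300, 5

def noisy(y, keep_prob):
    """Corrupt a label sequence to mimic an imperfect classifier."""
    flip = rng.random(y.size) >= keep_prob
    return np.where(flip, rng.integers(0, n_classes, y.size), y)

f1_a, f1_b = [], []
for _ in range(n_recordings):
    y_true = rng.integers(0, n_classes, n_frames)       # one recording's annotation
    f1_a.append(f1_score(y_true, noisy(y_true, 0.85),
                         average=None, labels=list(range(n_classes))))
    f1_b.append(f1_score(y_true, noisy(y_true, 0.70),
                         average=None, labels=list(range(n_classes))))
f1_a, f1_b = np.array(f1_a), np.array(f1_b)

for c in range(n_classes):
    stat, p = wilcoxon(f1_a[:, c], f1_b[:, c])           # paired over recordings
    print(f"class {c}: median F1 A={np.median(f1_a[:, c]):.2f}, "
          f"B={np.median(f1_b[:, c]):.2f}, p={p:.3f}")
```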

Figure 5
The effect of sensor setup on classification performance (UAR) using the SVM classifier. Any individual sensor configuration is inferior to the four-sensor setup. However, classification of data from a combination of one arm and one leg leads to almost comparable results.
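The UAR metric in this figure is the unweighted average recall, i.e., the mean of per-class recalls, so that rare posture and movement classes weigh as much as frequent ones. A minimal sketch using scikit-learn's macro-averaged recall (equivalently, balanced accuracy), on toy labels:

```python
# Sketch: UAR = mean of class-specific recalls, computed on toy label data.
from sklearn.metrics import recall_score, balanced_accuracy_score

y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2, 2, 2, 2, 1, 2]

uar = recall_score(y_true, y_pred, average="macro")
print(uar)                                       # (3/4 + 1/2 + 5/6) / 3
print(balanced_accuracy_score(y_true, y_pred))   # identical value
```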

Figure 6
Differentiation of high and low motor performance infants with the smart jumpsuit. The plots show individual category distributions (as log-probability) of the given posture and movement category, presented for the entire dataset. Results from both the human annotation (x) and the classifier output (triangle) are shown for comparison, and hairlines connect individuals to allow assessment of individual-level reliability. The highlighted recordings correspond to a sample of high-performing (red; High perf.) and low-performing infants (blue; Low perf.). The rest of the infant cohort is plotted with light gray lines. No statistically significant differences were found between the movement distributions from the human annotations and from the classifier outputs (Mann-Whitney U-test, N = 22).
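A minimal sketch of the type of comparison reported here: per-infant time shares of a category derived from human annotation versus classifier output, compared with a Mann-Whitney U-test; the proportions below are synthetic placeholders (N = 22, as in the study).

```python
# Sketch: compare per-infant category log-probabilities from annotation vs.
# classifier output with a Mann-Whitney U-test (synthetic data).
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(2)
p_human = rng.uniform(0.05, 0.4, size=22)                 # time share per infant (annotation)
p_model = p_human * np.exp(rng.normal(0.0, 0.1, size=22)) # slightly perturbed classifier estimates

stat, p_value = mannwhitneyu(np.log(p_human), np.log(p_model))
print(f"U = {stat:.1f}, p = {p_value:.3f}")               # non-significant -> distributions agree
```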

Figure 7
Total CNN classifier confusion matrices of the posture (a,b) and movement (c,d) tracks obtained from LOSO cross-validation of the (a,c) full annotation agreement subset and (b,d) complete data set. The percentage values inside the cells indicate the class-specific recall values, and the absolute values denote the number of frames. Average metrics are presented in Table 1.
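A minimal sketch of leave-one-subject-out (LOSO) cross-validation with a cumulative confusion matrix; the features, labels, and the simple SVM stand-in below are synthetic placeholders rather than the CNN pipeline evaluated in the paper.

```python
# Sketch: LOSO cross-validation over subjects, accumulating one confusion
# matrix across all held-out folds, with class-specific recall per row.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(3)
n_classes = 4
X = rng.normal(size=(660, 20))                   # placeholder frame features
y = rng.integers(0, n_classes, size=660)         # placeholder frame labels
groups = np.repeat(np.arange(22), 30)            # 22 infants, 30 frames each

cm = np.zeros((n_classes, n_classes), dtype=int)
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf = SVC().fit(X[train_idx], y[train_idx])  # train on all other infants
    y_pred = clf.predict(X[test_idx])            # evaluate on the held-out infant
    cm += confusion_matrix(y[test_idx], y_pred, labels=list(range(n_classes)))

recall = cm.diagonal() / cm.sum(axis=1)          # class-specific recall per true class
print(cm)
print(recall)
```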

Figure 8
The effect of IAR-based label refinement on the agreement (κ) between the CNN classifier output and the three original human annotation tracks (gray lines) as a function of IAR iterations on held-out test data. The mean pairwise agreement between the classifier and the three human annotations is shown with the blue line, and Fleiss' κ agreement across all human annotators is shown with the dotted black line. A clear improvement in classifier performance is observed due to label refinement on the training data, reaching the human-to-human agreement level.
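A minimal sketch of the agreement metrics used in this figure: pairwise Cohen's κ between the classifier output and each human annotation track, and Fleiss' κ across the human annotators; the label sequences below are synthetic.

```python
# Sketch: pairwise Cohen's kappa (classifier vs. each annotator) and Fleiss'
# kappa across three human annotators, on synthetic frame-level labels.
import numpy as np
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(4)
n_frames, n_classes = 1000, 5
base = rng.integers(0, n_classes, n_frames)

def perturb(y, keep_prob):
    """Randomly relabel a fraction of frames to mimic annotator disagreement."""
    flip = rng.random(y.size) >= keep_prob
    return np.where(flip, rng.integers(0, n_classes, y.size), y)

annotators = [perturb(base, 0.9) for _ in range(3)]     # three human tracks
classifier = perturb(base, 0.85)                        # classifier output

pairwise = [cohen_kappa_score(classifier, a) for a in annotators]
print("mean classifier-vs-human kappa:", np.mean(pairwise))

table, _ = aggregate_raters(np.column_stack(annotators))  # per-frame category counts
print("Fleiss' kappa across human annotators:", fleiss_kappa(table))
```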

Source: PubMed
