Evaluations of Commercial Sleep Technologies for Objective Monitoring During Routine Sleeping Conditions

Jason D Stone, Lauren E Rentz, Jillian Forsey, Jad Ramadan, Rachel R Markwald, Victor S Finomore, Scott M Galster, Ali Rezai, Joshua A Hagen

Abstract

Purpose: The commercial market is saturated with technologies that claim to measure free-living sleep accurately, despite a severe lack of independent third-party evaluations. The present study therefore evaluated the accuracy of several commercial sleep technologies during in-home sleeping conditions.

Materials and methods: Data collection spanned 98 separate nights of ad libitum sleep from five healthy adults. Prior to bedtime, participants set up nine popular commercial sleep devices while concurrently wearing a previously validated electroencephalography (EEG)-based device. Data from the commercial devices were extracted and compared against the EEG-derived measures to determine accuracy. Sleep and wake summary outcomes, as well as sleep staging metrics, were evaluated for each device where available.

Results: Total sleep time (TST), total wake time (TWT), and sleep efficiency (SE) were measured with greater accuracy (lower percent errors) and limited bias by the Fitbit Ionic [mean absolute percent error, bias (95% confidence interval); TST: 9.90%, 0.25 (-0.11, 0.61); TWT: 25.64%, -0.17 (-0.28, -0.06); SE: 3.49%, 0.65 (-0.82, 2.12)] and the Oura smart ring [TST: 7.39%, 0.19 (0.04, 0.35); TWT: 36.29%, -0.18 (-0.31, -0.04); SE: 5.42%, 1.66 (0.17, 3.15)], whereas all other devices tended to over- or underestimate at least one, if not all, of these sleep metrics. No commercial sleep technology appeared to quantify sleep stages accurately.
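For readers unfamiliar with the two agreement statistics reported above, the following is a minimal Python sketch of how a mean absolute percent error and a bias with a 95% confidence interval can be computed from paired nightly values. It is not the authors' analysis code. The function names, the sign convention (device minus EEG, so a positive bias means overestimation), the normal-approximation interval, and the sample values are all assumptions made for illustration.

```python
import math
import statistics


def mean_absolute_percent_error(device, reference):
    """Mean of |device - reference| / reference, in percent, over paired nights."""
    errors = [abs(d - r) / r * 100 for d, r in zip(device, reference)]
    return statistics.mean(errors)


def bias_with_ci(device, reference, z=1.96):
    """Mean paired difference (device - reference) and an approximate 95% CI.

    Assumption: the interval uses a normal approximation (z = 1.96);
    a t-based interval would be slightly wider for small samples.
    """
    diffs = [d - r for d, r in zip(device, reference)]
    bias = statistics.mean(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return bias, (bias - z * sem, bias + z * sem)


# Hypothetical TST values in hours for four nights (not study data).
eeg_tst = [7.0, 6.5, 8.0, 7.5]
device_tst = [7.7, 6.5, 7.6, 8.25]

mape = mean_absolute_percent_error(device_tst, eeg_tst)
bias, (ci_low, ci_high) = bias_with_ci(device_tst, eeg_tst)
print(f"MAPE: {mape:.2f}%  bias: {bias:.2f} h  95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

Under this convention, a confidence interval that spans zero is consistent with no systematic over- or underestimation, while the percent error captures the typical size of the nightly discrepancy regardless of its direction.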

Conclusion: In general, commercial sleep technologies displayed lower error and bias when quantifying sleep/wake states than when quantifying sleep stage durations. Still, the findings revealed remarkably high variability in the accuracy of commercial sleep technologies, which underscores the need for continuous evaluation of newly developed sleep technologies. Such evaluations would help end-users determine more accurately which sleep device is best suited to their intended application(s).

Keywords: consumer sleep technologies; sleep duration; sleep efficiency; sleep staging; wearables.

Conflict of interest statement

Financial competing interests: none. Non-financial competing interests: none. The authors report no conflicts of interest for this work.

© 2020 Stone et al.

Figures

Figure 1. TST boxplots: absolute percent error by device.
Figure 2. (A–I) TST Bland–Altman plots for all devices.
Figure 3. TWT boxplots: absolute percent error by device.
Figure 4. (A–G) TWT Bland–Altman plots for all devices.
Figure 5. SE boxplots: absolute percent error by device.
Figure 6. (A–G) SE Bland–Altman plots for all devices.
Figure 7. Light sleep time boxplots: absolute percent error by device.
Figure 8. Deep sleep time boxplots: absolute percent error by device.
Figure 9. REM sleep time boxplots: absolute percent error by device.

Source: PubMed
