Accuracy of Wristband Fitbit Models in Assessing Sleep: Systematic Review and Meta-Analysis

Shahab Haghayegh, Sepideh Khoshnevis, Michael H Smolensky, Kenneth R Diller, Richard J Castriotta

Abstract

Background: Wearable sleep monitors are of high interest to consumers and researchers because they can estimate sleep patterns in free-living conditions in a cost-efficient way.

Objective: We conducted a systematic review of publications reporting on the performance of wristband Fitbit models in assessing sleep parameters and stages.

Methods: In adherence with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement, we comprehensively searched the Cumulative Index to Nursing and Allied Health Literature (CINAHL), Cochrane, Embase, MEDLINE, PubMed, PsycINFO, and Web of Science databases using the keyword Fitbit to identify relevant publications meeting predefined inclusion and exclusion criteria.

Results: The search yielded 3085 candidate articles. After eliminating duplicates and applying the inclusion and exclusion criteria, 22 articles qualified for systematic review, with 8 providing quantitative data for meta-analysis. In reference to polysomnography (PSG), nonsleep-staging Fitbit models tended to overestimate total sleep time (TST; range from approximately 7 to 67 min; effect size=-0.51, P<.001; heterogeneity: I²=8.8%, P=.36) and sleep efficiency (SE; range from approximately 2% to 15%; effect size=-0.74, P<.001; heterogeneity: I²=24.0%, P=.25) and to underestimate wake after sleep onset (WASO; range from approximately 6 to 44 min; effect size=0.60, P<.001; heterogeneity: I²=0%, P=.92); there was no significant difference in sleep onset latency (SOL; P=.37; heterogeneity: I²=0%, P=.92). In reference to PSG, nonsleep-staging Fitbit models correctly identified sleep epochs with accuracy values between 0.81 and 0.91, sensitivity values between 0.87 and 0.99, and specificity values between 0.10 and 0.52. Recent-generation Fitbit models that use heart rate variability together with body movement to assess sleep stages performed better than early-generation nonsleep-staging models that use only body movement. Sleep-staging Fitbit models, in comparison to PSG, showed no significant difference in measured values of WASO (P=.25; heterogeneity: I²=0%, P=.92), TST (P=.29; heterogeneity: I²=0%, P=.98), and SE (P=.19), but they underestimated SOL (P=.03; heterogeneity: I²=0%, P=.66). Sleep-staging Fitbit models showed higher sensitivity (0.95-0.96) and specificity (0.58-0.69) values in detecting sleep epochs than nonsleep-staging models and than values reported in the literature for regular wrist actigraphy.

Conclusions: Sleep-staging Fitbit models showed promising performance, especially in differentiating wake from sleep. However, although these models are a convenient and economical means for consumers to obtain gross estimates of sleep parameters and time spent in sleep stages, they are of limited specificity and are not a substitute for PSG.

Keywords: Fitbit; accuracy; actigraphy; comparison of performance; polysomnography; sleep diary; sleep stages; sleep tracker; validation; wearable.
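
The accuracy, sensitivity, and specificity values quoted in the Results come from epoch-by-epoch comparison of the device against PSG. The following is a minimal sketch of that arithmetic, assuming each epoch is coded 1 for sleep and 0 for wake and that PSG is the reference; the function name `epoch_metrics` and the ten-epoch example data are hypothetical and are not taken from any reviewed study.

```python
from typing import Dict, Sequence


def epoch_metrics(device: Sequence[int], psg: Sequence[int]) -> Dict[str, float]:
    """Epoch-by-epoch agreement with PSG as reference (1 = sleep, 0 = wake)."""
    if len(device) != len(psg):
        raise ValueError("device and PSG must score the same number of epochs")
    if len(psg) == 0:
        raise ValueError("at least one epoch is required")

    pairs = list(zip(device, psg))
    tp = sum(1 for d, p in pairs if d == 1 and p == 1)  # sleep scored as sleep
    tn = sum(1 for d, p in pairs if d == 0 and p == 0)  # wake scored as wake
    fp = sum(1 for d, p in pairs if d == 1 and p == 0)  # wake scored as sleep
    fn = sum(1 for d, p in pairs if d == 0 and p == 1)  # sleep scored as wake

    def ratio(num: int, den: int) -> float:
        # Undefined when the reference contains no epochs of that class.
        return num / den if den else float("nan")

    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": ratio(tp, tp + fn),  # share of PSG sleep epochs detected
        "specificity": ratio(tn, tn + fp),  # share of PSG wake epochs detected
    }


# Hypothetical night: 8 PSG sleep epochs, 2 PSG wake epochs.
psg_epochs = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
device_epochs = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]  # one wake epoch missed
print(epoch_metrics(device_epochs, psg_epochs))
# {'accuracy': 0.9, 'sensitivity': 1.0, 'specificity': 0.5}
```

The toy example mirrors the pattern described in the abstract: because most epochs of a night are sleep, a device can reach high accuracy and sensitivity while still missing a large share of wake epochs, which is what low specificity means here.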

Conflict of interest statement

Conflicts of Interest: None declared.

©Shahab Haghayegh, Sepideh Khoshnevis, Michael H Smolensky, Kenneth R Diller, Richard J Castriotta. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 28.11.2019.

Figures

Figure 1
Flow diagram adapted from Moher et al [14] describing the search strategy of databases to retrieve and qualify publications of relevance for review.
Figure 2
Forest plot of the standardized mean difference (Hedges g) between Fitbit and polysomnography for the variable of sleep onset latency (SOL). Results are shown as effect size (ES) and 95% CI. The difference in symbol size indicates the difference in weight of the respective studies. The diamond symbol shows the 95% CI of the overall effect and the tails show the 95% prediction interval of the overall effect. PLMS: periodic limb movement in sleep.
Figure 3
Forest plot of the standardized mean difference (Hedges g) between Fitbit and polysomnography for the variable of wake after sleep onset (WASO). Results are shown as effect size (ES) and 95% CI. The difference in symbol size indicates the difference in the weight of the respective studies. The diamond symbol shows the 95% CI of the overall effect and the tails show the 95% prediction interval of the overall effect. PLMS: periodic limb movement in sleep.
Figure 4
Forest plot of the standardized mean difference (Hedges g) between Fitbit and polysomnography for the variable of total sleep time (TST). Results are shown as effect size (ES) and 95% CI. The difference in symbol size indicates the difference in weight of the respective studies. The diamond symbol shows the 95% CI of the overall effect and the tails show the 95% prediction interval of the overall effect. PLMS: periodic limb movement in sleep.
Figure 5
Forest plot of the standardized mean difference (Hedges g) between Fitbit and polysomnography for the variable of sleep efficiency (SE). Results are shown as effect size (ES) and 95% CI. The difference in symbol size indicates the difference in weight of the respective studies. The diamond symbol shows the 95% CI of the overall effect and the tails show the 95% prediction interval of the overall effect.
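
The quantities named in the forest plot captions (Hedges g, study weights, the 95% CI of the overall effect) and the I² statistic in the abstract can be illustrated with a short calculation. This is a sketch under stated assumptions, not the authors' analysis: it uses the textbook independent-groups formula for Hedges g, inverse-variance weights, and DerSimonian-Laird random-effects pooling, none of which the abstract confirms. The function names and the three study summaries are hypothetical, and the prediction interval shown in the figures is omitted because it requires a t-distribution quantile.

```python
import math
from typing import Dict, List, Tuple


def hedges_g(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> Tuple[float, float]:
    """Return (g, variance of g) for group 1 minus group 2."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd                 # Cohen's d
    j = 1 - 3 / (4 * df - 1)                  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def pool_random_effects(effects: List[float],
                        variances: List[float]) -> Dict[str, float]:
    """DerSimonian-Laird pooling with Cochran's Q and I^2 (in percent)."""
    w = [1 / v for v in variances]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sw
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * gi for wi, gi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"pooled": pooled, "ci_low": pooled - 1.96 * se,
            "ci_high": pooled + 1.96 * se, "Q": q, "I2": i2, "tau2": tau2}


# Hypothetical TST summaries in minutes: (PSG mean, SD, n, device mean, SD, n).
studies = [(390.0, 60.0, 30, 420.0, 55.0, 30),
           (400.0, 50.0, 25, 415.0, 52.0, 25),
           (380.0, 65.0, 40, 425.0, 60.0, 40)]
results = [hedges_g(*s) for s in studies]
summary = pool_random_effects([g for g, _ in results], [v for _, v in results])
print(summary)
```

With PSG entered as group 1, a device that overestimates a parameter yields a negative g, which matches the sign pattern of the TST and SE effect sizes in the abstract; whether the authors used this exact convention is an assumption.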

References

    1. Hirshkowitz M. SlidePlayer. [2019-06-14]. Assessing sleep wearables and in-bedroom devices: CTA standards work.
    2. Haghayegh S, Khoshnevis S, Smolensky MH, Diller KR, Castriotta RJ. Performance comparison of different interpretative algorithms utilized to derive sleep parameters from wrist actigraphy data. Chronobiol Int. 2019 Dec;36(12):1752–1760. doi: 10.1080/07420528.2019.1679826.
    3. Van de Water AT, Holmes A, Hurley DA. Objective measurements of sleep for non-laboratory settings as alternatives to polysomnography: A systematic review. J Sleep Res. 2011 Mar;20(1 Pt 2):183–200. doi: 10.1111/j.1365-2869.2009.00814.x.
    4. Fitbit. [2018-10-03]. Who we are.
    5. Fitbit. [2019-03-29]. What should I know about sleep stages?
    6. Haghayegh S, Khoshnevis S, Smolensky MH, Diller KR. Accuracy of PurePulse photoplethysmography technology of Fitbit Charge 2 for assessment of heart rate during sleep. Chronobiol Int. 2019 Jul;36(7):927–933. doi: 10.1080/07420528.2019.1596947.
    7. Thompson WR. Worldwide Survey of Fitness Trends for 2019. ACSMs Health Fit J. 2018;22(6):10–17. doi: 10.1249/fit.0000000000000438.
    8. Smith C. DMR. 2019. Sep 06, [2019-09-05]. 60 interesting Fitbit statistics and facts (December 2018).
    9. Wright SP, Collier SR, Brown TS, Sandberg K. An analysis of how consumer physical activity monitors are used in biomedical research. FASEB J. 2017 Apr;31(1 (supplement)):1.
    10. All of Us Research Program Investigators. Denny JC, Rutter JL, Goldstein DB, Philippakis A, Smoller JW, Jenkins G, Dishman E. The "All of Us" Research Program. N Engl J Med. 2019 Aug 15;381(7):668–676. doi: 10.1056/NEJMsr1809937.
    11. National Institutes of Health, All of Us Research Program. 2019. Jan 16, [2019-09-06]. All of Us Research Program expands data collection efforts with Fitbit.
    12. Sperlich B, Holmberg H. Wearable, yes, but able…?: It is time for evidence-based marketing claims! Br J Sports Med. 2017 Aug;51(16):1240. doi: 10.1136/bjsports-2016-097295.
    13. American National Standards Institute. 2016. Sep, [2019-10-22]. CTA/NSF 2052.1-2016 (ANSI): Definitions and characteristics for wearable sleep monitors.
    14. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Med. 2009 Jul 21;6(7):e1000097. doi: 10.1371/journal.pmed.1000097.
    15. Downs SH, Black N. The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. J Epidemiol Community Health. 1998 Jul;52(6):377–384. doi: 10.1136/jech.52.6.377.
    16. Coe R. It's the effect size, stupid: What effect size is and why it is important. Proceedings of the Annual Conference of the British Educational Research Association; Annual Conference of the British Educational Research Association; September 12-14, 2002; University of Exeter, UK. 2002.
    17. Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. Introduction to Meta-Analysis. Chichester, UK: John Wiley & Sons; 2009.
    18. IntHout J, Ioannidis JPA, Rovers MM, Goeman JJ. Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open. 2016 Jul 12;6(7):e010247. doi: 10.1136/bmjopen-2015-010247.
    19. Sullivan GM, Feinn R. Using effect size: Or why the P value is not enough. J Grad Med Educ. 2012 Oct;4(3):279–282. doi: 10.4300/JGME-D-12-00156.1.
    20. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003 Oct 06;327(7414):557–560. doi: 10.1136/bmj.327.7414.557.
    21. Richardson M, Garner P, Donegan S. Interpretation of subgroup analyses in systematic reviews: A tutorial. Clin Epidemiol Glob Health. 2019 Jun;7(2):192–198. doi: 10.1016/j.cegh.2018.05.005.
    22. Beattie Z, Oyang Y, Statan A, Ghoreyshi A, Pantelopoulos A, Russell A, Heneghan C. Estimation of sleep stages in a healthy adult population from optical plethysmography and accelerometer signals. Physiol Meas. 2017 Oct 31;38(11):1968–1979. doi: 10.1088/1361-6579/aa9047.
    23. Brazendale K, Beets MW, Weaver RG, Perry MW, Tyler EB, Hunt ET, Decker L, Chaput J. Comparing measures of free-living sleep in school-aged children. Sleep Med. 2019 Aug;60:197–201. doi: 10.1016/j.sleep.2019.04.006.
    24. Brooke SM, An H, Kang S, Noble JM, Berg KE, Lee J. Concurrent validity of wearable activity trackers under free-living conditions. J Strength Cond Res. 2017 May;31(4):1097–1106. doi: 10.1519/JSC.0000000000001571.
    25. Cook JD, Prairie ML, Plante DT. Utility of the Fitbit Flex to evaluate sleep in major depressive disorder: A comparison against polysomnography and wrist-worn actigraphy. J Affect Disord. 2017 Aug 01;217:299–305. doi: 10.1016/j.jad.2017.04.030.
    26. Cook JD, Eftekari SC, Dallmann E, Sippy M, Plante DT. Ability of the Fitbit Alta HR to quantify and classify sleep in patients with suspected central disorders of hypersomnolence: A comparison against polysomnography. J Sleep Res. 2019 Aug;28(4):e12789. doi: 10.1111/jsr.12789.
    27. de Zambotti M, Baker FC, Willoughby AR, Godino JG, Wing D, Patrick K, Colrain IM. Measures of sleep and cardiac functioning during sleep using a multi-sensory commercially-available wristband in adolescents. Physiol Behav. 2016 May 01;158:143–149. doi: 10.1016/j.physbeh.2016.03.006.
    28. de Zambotti M, Goldstone A, Claudatos S, Colrain IM, Baker FC. A validation study of Fitbit Charge 2™ compared with polysomnography in adults. Chronobiol Int. 2018 Apr;35(4):465–476. doi: 10.1080/07420528.2017.1413578.
    29. Dickinson DL, Cazier J, Cech T. A practical validation study of a commercial accelerometer using good and poor sleepers. Health Psychol Open. 2016 Jul;3(2):2055102916679012. doi: 10.1177/2055102916679012.
    30. Hakim M, Miller R, Hakim M, Tumin D, Tobias JD, Jatana KR, Raman VT. Comparison of the Fitbit® Charge and polysomnography for measuring sleep quality in children with sleep disordered breathing. Minerva Pediatr. 2018 Dec 07;:1. doi: 10.23736/S0026-4946.18.05333-1.
    31. Kang S, Kang JM, Ko K, Park S, Mariani S, Weng J. Validity of a commercial wearable sleep tracker in adult insomnia disorder patients and good sleepers. J Psychosom Res. 2017 Jun;97:38–44. doi: 10.1016/j.jpsychores.2017.03.009.
    32. Kubala AG, Barone Gibbs B, Buysse DJ, Patel SR, Hall MH, Kline CE. Field-based measurement of sleep: Agreement between six commercial activity monitors and a validated accelerometer. Behav Sleep Med. 2019 Aug 27;:1–16. doi: 10.1080/15402002.2019.1651316.
    33. Lee H, Lee H, Moon J, Lee T, Kim M, In H, Cho C, Kim L. Comparison of wearable activity tracker with actigraphy for sleep evaluation and circadian rest: Activity rhythm measurement in healthy young adults. Psychiatry Investig. 2017 Mar;14(2):179–185. doi: 10.4306/pi.2017.14.2.179.
    34. Lee J, Byun W, Keill A, Dinkel D, Seo Y. Comparison of wearable trackers' ability to estimate sleep. Int J Environ Res Public Health. 2018 Jun 15;15(6):1265. doi: 10.3390/ijerph15061265.
    35. Liang Z, Chapa Martell MA. Validity of consumer activity wristbands and wearable EEG for measuring overall sleep parameters and sleep structure in free-living conditions. J Healthc Inform Res. 2018 Apr 20;2(1-2):152–178. doi: 10.1007/s41666-018-0013-1.
    36. Liu J, Wong WT, Zwetsloot IM, Hsu YC, Tsui KL. Preliminary agreement on tracking sleep between a wrist-worn device Fitbit Alta and consensus sleep diary. Telemed J E Health. 2019 Jan 02;:1. doi: 10.1089/tmj.2018.0202.
    37. Mantua J, Gravel N, Spencer RM. Reliability of sleep measures from four personal health monitoring devices compared to research-based actigraphy and polysomnography. Sensors (Basel). 2016 May 05;16(5):646. doi: 10.3390/s16050646.
    38. Maskevich S, Jumabhoy R, Dao PDM, Stout JC, Drummond SPA. Pilot validation of ambulatory activity monitors for sleep measurement in Huntington's disease gene carriers. J Huntingtons Dis. 2017;6(3):249–253. doi: 10.3233/JHD-170251.
    39. Meltzer LJ, Hiruma LS, Avis K, Montgomery-Downs H, Valentin J. Comparison of a commercial accelerometer with polysomnography and actigraphy in children and adolescents. Sleep. 2015 Aug 01;38(8):1323–1330. doi: 10.5665/sleep.4918.
    40. Montgomery-Downs HE, Insana SP, Bond JA. Movement toward a novel activity monitoring device. Sleep Breath. 2012 Oct;16(3):913–917. doi: 10.1007/s11325-011-0585-y.
    41. Osterbauer B, Koempel JA, Davidson Ward SL, Fisher LM, Don DM. A comparison study of the Fitbit activity monitor and PSG for assessing sleep patterns and movement in children. J Otolaryngol Adv. 2016 Mar 2;1(3):24–35. doi: 10.14302/issn.2379-8572.joa-15-891.
    42. Sargent C, Lastella M, Romyn G, Versey N, Miller DJ, Roach GD. How well does a commercially available wearable device measure sleep in young athletes? Chronobiol Int. 2018 Jun;35(6):754–758. doi: 10.1080/07420528.2018.1466800.
    43. Svensson T, Chung U, Tokuno S, Nakamura M, Svensson AK. A validation study of a consumer wearable sleep tracker compared to a portable EEG system in naturalistic conditions. J Psychosom Res. 2019 Aug 30;126:109822. doi: 10.1016/j.jpsychores.2019.109822.
    44. Morgenthaler T, Alessi C, Friedman L, Owens J, Kapur V, Boehlecke B, Brown T, Chesson A, Coleman J, Lee-Chiong T, Pancer J, Swick TJ, Standards of Practice Committee, American Academy of Sleep Medicine. Practice parameters for the use of actigraphy in the assessment of sleep and sleep disorders: An update for 2007. Sleep. 2007 May;30(4):519–529. doi: 10.1093/sleep/30.4.519.
    45. Marino M, Li Y, Rueschman MN, Winkelman JW, Ellenbogen JM, Solet JM, Dulin H, Berkman LF, Buxton OM. Measuring sleep: Accuracy, sensitivity, and specificity of wrist actigraphy compared to polysomnography. Sleep. 2013 Dec 01;36(11):1747–1755. doi: 10.5665/sleep.3142.
    46. Fitbit. [2018-10-15]. How do I track my sleep with my Fitbit device?
    47. Sadeh A. The role and validity of actigraphy in sleep medicine: An update. Sleep Med Rev. 2011 Aug;15(4):259–267. doi: 10.1016/j.smrv.2010.10.001.
    48. Wolfson AR, Carskadon MA, Acebo C, Seifer R, Fallone G, Labyak SE, Martin JL. Evidence for the validity of a sleep habits survey for adolescents. Sleep. 2003 Mar 15;26(2):213–216. doi: 10.1093/sleep/26.2.213.

Source: PubMed
