A catalog of validity indices for step counting wearable technologies during treadmill walking: the CADENCE-adults study

Jose Mora-Gonzalez, Zachary R Gould, Christopher C Moore, Elroy J Aguiar, Scott W Ducharme, John M Schuna Jr, Tiago V Barreira, John Staudenmayer, Cayla R McAvoy, Mariya Boikova, Taavy A Miller, Catrine Tudor-Locke

Abstract

Background: Standardized validation indices (i.e., accuracy, bias, and precision) provide a comprehensive comparison of step counting wearable technologies.

Purpose: To expand a previously published child/youth catalog of validity indices to include adults (21-40, 41-60 and 61-85 years of age) assessed across a range of treadmill speeds (slow [0.8-3.2 km/h], normal [4.0-6.4 km/h], fast [7.2-8.0 km/h]) and device wear locations (ankle, thigh, waist, and wrist).
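The speed categories named above can be made concrete with a minimal sketch. It assumes the treadmill protocol described in the figure captions: bouts start at 0.8 km/h (nominally 0.5 mph) and increase in 0.8 km/h steps up to 8.0 km/h, giving ten bouts. The function names and the 1-based bout index are illustrative and do not come from the study.

```python
def bout_speed_kmh(bout):
    """Nominal treadmill speed in km/h for a 1-based bout index (1..10)."""
    if not 1 <= bout <= 10:
        raise ValueError("bout index must be between 1 and 10")
    return round(0.8 * bout, 1)


def bout_speed_mph(bout):
    """Nominal speed in mph as reported in the captions (0.5 mph per step)."""
    if not 1 <= bout <= 10:
        raise ValueError("bout index must be between 1 and 10")
    return 0.5 * bout


def speed_category(bout):
    """Map a bout index onto the slow / normal / fast categories.

    Bouts 1-4  -> slow   (0.8-3.2 km/h)
    Bouts 5-8  -> normal (4.0-6.4 km/h)
    Bouts 9-10 -> fast   (7.2-8.0 km/h)
    """
    speed = bout_speed_kmh(bout)  # also validates the index
    if speed <= 3.2:
        return "slow"
    if speed <= 6.4:
        return "normal"
    return "fast"
```

Indexing by bout number rather than comparing raw speeds avoids floating-point edge cases at the category boundaries.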

Methods: Two hundred fifty-eight adults (52.5 ± 18.7 years, 49.6% female) participated in this laboratory-based study and performed a series of 5-min treadmill bouts while wearing multiple devices; 21 devices in total were evaluated over the course of this multi-year cross-sectional study (2015-2019). The criterion measure was directly observed steps. Computed validity indices included accuracy (mean absolute percentage error, MAPE), bias (mean percentage error, MPE), and precision (correlation coefficient, r; standard deviation, SD; coefficient of variation, CoV).
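As an illustration of the indices listed above, the following minimal sketch computes them from paired step counts. The percentage error follows the definition in the figure captions: device steps minus directly observed steps, divided by directly observed steps. The abstract does not state which quantity SD and CoV are computed on, so this sketch assumes SD of the signed percentage errors and CoV of the device step counts. The function name and dictionary keys are hypothetical, and this is not the authors' analysis code.

```python
import math
from statistics import mean, stdev


def validity_indices(observed, device):
    """Illustrative validity indices for paired step counts.

    observed: directly observed (criterion) steps, one value per person-bout
    device:   device-derived steps for the same person-bouts
    """
    if len(observed) != len(device) or len(observed) < 2:
        raise ValueError("need at least two paired person-bouts")
    if any(o <= 0 for o in observed):
        raise ValueError("criterion steps must be positive")

    # Signed percentage error per person-bout.
    pct_err = [100.0 * (d - o) / o for o, d in zip(observed, device)]

    mape = mean([abs(e) for e in pct_err])  # accuracy
    mpe = mean(pct_err)                     # bias; sign shows over/undercounting

    # ASSUMPTION: SD over signed percentage errors, CoV over device counts.
    sd = stdev(pct_err)
    dev_mean = mean(device)
    cov = 100.0 * stdev(device) / dev_mean if dev_mean != 0 else float("nan")

    # Pearson correlation between criterion and device counts.
    mo = mean(observed)
    sxy = sum((o - mo) * (d - dev_mean) for o, d in zip(observed, device))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((d - dev_mean) ** 2 for d in device)
    r = sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else float("nan")

    return {"MAPE": mape, "MPE": mpe, "SD": sd, "CoV": cov, "r": r}
```

For example, with observed counts of 100, 200, and 400 steps and device counts of 90, 210, and 400, the percentage errors are -10%, 5%, and 0%, giving a MAPE of 5% and an MPE of about -1.7%. The two differ because MAPE ignores the direction of the error, whereas over- and undercounts partly cancel in MPE.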

Results: Over the range of normal speeds, 15 devices (Actical, waist-worn ActiGraph GT9X, activPAL, Apple Watch Series 1, Fitbit Ionic, Fitbit One, Fitbit Zip, Garmin vivoactive 3, Garmin vivofit 3, waist-worn GENEActiv, NL-1000, PiezoRx, Samsung Gear Fit2, Samsung Gear Fit2 Pro, and StepWatch) performed at < 5% MAPE. The wrist-worn ActiGraph GT9X displayed the worst accuracy across normal speeds (MAPE = 52%). On average, accuracy was compromised across slow walking speeds for all wearable technologies (MAPE = 40%) while all performed best across normal speeds (MAPE = 7%). When analyzing the data by wear locations, the ankle and thigh demonstrated the best accuracy (both MAPE = 1%), followed by the waist (3%) and the wrist (15%) across normal speeds. There were significant effects of speed, wear location, and age group on accuracy and bias (both p < 0.001) and precision (p ≤ 0.045).

Conclusions: Standardized validation indices cataloged by speed, wear location, and age group across the adult lifespan facilitate selecting, evaluating, or comparing performance of step counting wearable technologies. Speed, wear location, and age displayed a significant effect on accuracy, bias, and precision. Overall, reduced performance was associated with very slow walking speeds (0.8 to 3.2 km/h). Ankle- and thigh-located devices logged the highest accuracy, while those located at the wrist reported the worst accuracy.

Trial registration: Clinicaltrials.gov NCT02650258. Registered 24 December 2015.

Keywords: Accelerometer; Accuracy; Bias; Measurement; Pedometer; Physical activity.

Conflict of interest statement

The authors declare they have no conflicts of interest. The results of the present study do not constitute endorsement by the American College of Sports Medicine. The results of the study are presented clearly, honestly, and without fabrication, falsification, or inappropriate data manipulation.

© 2022. The Author(s).

Figures

Fig. 1
Mean absolute percentage error (MAPE) of each wearable technology across walking speeds. Participants walked on a treadmill for 5-min bouts beginning at 0.8 km/h (0.5 mph) and increasing in 0.8 km/h (0.5 mph) increments. MAPE (%) was computed for each person-bout by subtracting the directly observed steps (criterion measurement) from the wearable technology-derived steps and dividing the absolute value of that difference by the directly observed steps. Black dots represent the mean MAPE across the whole sample at a given speed. Bars represent the standard deviation of MAPE. The standard deviation bars were not drawn when they were shorter than the height of the symbol. Lower MAPE values indicate higher accuracy of the wearable technology of interest. See Additional file 2 for a graphical classification of wearable technologies by age groups.
Fig. 2
Mean absolute percentage error (MAPE) across walking speeds presented by wear location. Participants walked on a treadmill for 5-min bouts beginning at 0.8 km/h (0.5 mph) and increasing in 0.8 km/h (0.5 mph) increments. MAPE (%) was computed for each person-bout by subtracting the directly observed steps (criterion measurement) from the wearable technology-derived steps and dividing the absolute value of that difference by the directly observed steps. Black dots represent the mean MAPE for a specific wear location at a given speed. Bars represent the standard deviation of MAPE. The standard deviation bars were not drawn when they were shorter than the height of the symbol. Lower MAPE values indicate higher accuracy at that wear location. Ankle-worn wearable: StepWatch (N = 253). Thigh-worn wearable: activPAL (N = 249). Waist-worn wearables: Actical (N = 250), ActiGraph GT9X (N = 254), Digi-Walker SW-200 (N = 258), Fitbit One (N = 160), Fitbit Zip (N = 98), GENEActiv (N = 224), NL-1000 (N = 258), PiezoRx (N = 98). Wrist-worn wearables: ActiGraph GT9X (N = 254), Apple Watch Series 1 (N = 174), Fitbit Ionic (N = 98), Garmin vivoactive 3 (N = 96), Garmin vivoactive HR (N = 77), Garmin vivofit 2 (N = 80), Garmin vivofit 3 (N = 77), GENEActiv (N = 217), Polar M600 (N = 97), Samsung Gear Fit2 (N = 80), Samsung Gear Fit2 Pro (N = 98). See Additional file 2 for a graphical classification of wearable technologies by age groups and Additional file 8: Suppl Table 1 for a tabular description of validity indices by wear locations.
Fig. 3
Mean absolute percentage error (MAPE) across walking speeds presented by age group. Participants walked on a treadmill for 5-min bouts beginning at 0.8 km/h (0.5 mph) and increasing in 0.8 km/h (0.5 mph) increments. MAPE (%) was computed for each person-bout by subtracting the directly observed steps (criterion measurement) from the wearable technology-derived steps and dividing the absolute value of that difference by the directly observed steps. Black dots represent the mean MAPE for a specific age group at a given speed. Bars represent the standard deviation of MAPE. The standard deviation bars were not drawn when they were shorter than the height of the symbol. Lower MAPE values indicate higher accuracy for that age group. All age groups (21–85 years) wore the Actical (N = 250), ActiGraph GT9X (Waist) (N = 254), ActiGraph GT9X (Wrist) (N = 254), activPAL (N = 249), Digi-Walker SW-200 (N = 258), GENEActiv (Waist) (N = 224), GENEActiv (Wrist) (N = 217), NL-1000 (N = 258), and the StepWatch (N = 253). Young Adults (21–40 years) also wore the Fitbit One (N = 80) and Garmin vivofit 2 (N = 80). Middle-Age Adults (41–60 years) also wore the Apple Watch Series 1 (N = 76), Fitbit One (N = 80), Garmin vivoactive HR (N = 77), Garmin vivofit 3 (N = 77), and the Samsung Gear Fit2 (N = 80). Older Adults (61–85 years) also wore the Apple Watch Series 1 (N = 98), Fitbit Ionic (N = 98), Fitbit Zip (N = 98), Garmin vivoactive 3 (N = 96), PiezoRx (N = 98), Polar M600 (N = 97), and the Samsung Gear Fit2 Pro (N = 98). See Additional file 2 for a graphical classification of wearable technologies by age groups. See Additional file 8: Suppl Table 2 for a tabular description of validity indices by age groups.
Fig. 4
Effect of speed on overall accuracy (mean absolute percentage error, MAPE) of wearable technology’s step counting ability. MAPE and corresponding 95% confidence intervals (CIs) for each technology are plotted across speed bouts. Slow speed bouts: 0.8, 1.6, 2.4, 3.2 km/h (0.5, 1.0, 1.5, 2.0 mph); normal speed bouts: 4.0, 4.8, 5.6, 6.4 km/h (2.5, 3.0, 3.5, 4.0 mph); fast speed bouts: 7.2, 8.0 km/h (4.5, 5.0 mph). Each black dot represents the grouped average of MAPE values, with 95% CIs estimated using mixed effect models and extending above and below that point estimate. The 95% CI bars were not drawn when they were shorter than the height of the symbol. MAPE values closer to 0 (indicated by a dashed line) indicate greater accuracy. Speeds whose 95% CI bars do not overlap are interpreted as significantly different from one another. The likelihood ratio test P value is reported for the effect of all speeds on MAPE for each specific device. See Additional file 2 for a graphical classification of wearable technologies’ location by age groups.
Fig. 5
Effect of wear location on overall accuracy (mean absolute percentage error, MAPE) of wearable technologies’ step counting ability. MAPE and corresponding 95% confidence intervals (CIs; estimated using mixed effect models) of each wear location are presented at slow, normal, and fast walking speeds. Slow speed bouts: 0.8, 1.6, 2.4, 3.2 km/h (0.5, 1.0, 1.5, 2.0 mph); normal speed bouts: 4.0, 4.8, 5.6, 6.4 km/h (2.5, 3.0, 3.5, 4.0 mph); fast speed bouts: 7.2, 8.0 km/h (4.5, 5.0 mph). MAPE values were averaged across the devices at each wear location for slow, normal, and fast walking speeds. MAPE values closer to 0 indicate greater accuracy. The 95% CI bars were not drawn when they were shorter than the height of the symbol. Where 95% CIs do not overlap, wear locations are interpreted as significantly different. The likelihood ratio test P value is reported for the effect of wear location on MAPE at each specific speed level. Ankle-worn wearable: StepWatch (N = 253). Thigh-worn wearable: activPAL (N = 249). Waist-worn wearables: Actical (N = 250), ActiGraph GT9X (N = 254), Digi-Walker SW-200 (N = 258), Fitbit One (N = 160), Fitbit Zip (N = 98), GENEActiv (N = 224), NL-1000 (N = 258), PiezoRx (N = 98). Wrist-worn wearables: ActiGraph GT9X (N = 254), Apple Watch Series 1 (N = 174), Fitbit Ionic (N = 98), Garmin vivoactive 3 (N = 96), Garmin vivoactive HR (N = 77), Garmin vivofit 2 (N = 80), Garmin vivofit 3 (N = 77), GENEActiv (N = 217), Polar M600 (N = 97), Samsung Gear Fit2 (N = 80), Samsung Gear Fit2 Pro (N = 98). See Additional file 2 for a graphical classification of wearable technologies by age groups.
Fig. 6
Effect of age on overall accuracy (mean absolute percentage error, MAPE) of wearable technologies’ step counting ability. MAPE and corresponding 95% confidence intervals (CIs; estimated using mixed effect models) of each age group are presented at slow, normal, and fast walking speeds. MAPE values were averaged across the devices worn by each age group for slow, normal, and fast walking speeds. Slow speed bouts: 0.8, 1.6, 2.4, 3.2 km/h (0.5, 1.0, 1.5, 2.0 mph); normal speed bouts: 4.0, 4.8, 5.6, 6.4 km/h (2.5, 3.0, 3.5, 4.0 mph); fast speed bouts: 7.2, 8.0 km/h (4.5, 5.0 mph). MAPE values closer to 0 represent greater accuracy. The 95% CI bars were not drawn when they were shorter than the height of the symbol. Where 95% CIs do not overlap, age groups are interpreted as significantly different. The likelihood ratio test P value is reported for the effect of age on MAPE at each specific speed level. All age groups (21–85 years) wore the Actical (N = 250), ActiGraph GT9X (Waist) (N = 254), ActiGraph GT9X (Wrist) (N = 254), activPAL (N = 249), Digi-Walker SW-200 (N = 258), GENEActiv (Waist) (N = 224), GENEActiv (Wrist) (N = 217), NL-1000 (N = 258), and the StepWatch (N = 253). Young Adults (21–40 years) also wore the Fitbit One (N = 80) and Garmin vivofit 2 (N = 80). Middle-Age Adults (41–60 years) also wore the Apple Watch Series 1 (N = 76), Fitbit One (N = 80), Garmin vivoactive HR (N = 77), Garmin vivofit 3 (N = 77), and the Samsung Gear Fit2 (N = 80). Older Adults (61–85 years) also wore the Apple Watch Series 1 (N = 98), Fitbit Ionic (N = 98), Fitbit Zip (N = 98), Garmin vivoactive 3 (N = 96), PiezoRx (N = 98), Polar M600 (N = 97), and the Samsung Gear Fit2 Pro (N = 98). See Additional file 2 for a graphical classification of wearable technologies by age groups.

Source: PubMed
