Factors Affecting the Quality of Person-Generated Wearable Device Data and Associated Challenges: Rapid Systematic Review

Sylvia Cho, Ipek Ensari, Chunhua Weng, Michael G Kahn, Karthik Natarajan

Abstract

Background: There is increasing interest in reusing person-generated wearable device data for research purposes, which raises concerns about data quality. However, the literature on data quality challenges, specifically those for person-generated wearable device data, is sparse.

Objective: This study aims to systematically review the literature on factors affecting the quality of person-generated wearable device data and their associated intrinsic data quality challenges for research.

Methods: The literature was searched in the PubMed, Association for Computing Machinery, Institute of Electrical and Electronics Engineers, and Google Scholar databases by using search terms related to wearable devices and data quality. Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, studies were reviewed to identify factors affecting the quality of wearable device data. Studies were eligible if they included content on the data quality of wearable devices, such as fitness trackers and sleep monitors. Both research-grade and consumer-grade wearable devices were included in the review. Relevant content was annotated and iteratively categorized into semantically similar factors until a consensus was reached. If a study mentioned any data quality challenges, that content was extracted and categorized as well.

Results: A total of 19 papers were included in this review. We identified three high-level factors that affect data quality: device- and technical-related factors, user-related factors, and data governance-related factors. Device- and technical-related factors include problems with hardware, software, and the connectivity of the device; user-related factors include device nonwear and user error; and data governance-related factors include a lack of standardization. The identified factors can potentially lead to intrinsic data quality challenges, such as incomplete, incorrect, and heterogeneous data. Although missing and incorrect data are widely known data quality challenges for wearable devices, the heterogeneity of data is another aspect of data quality that should be considered for wearable devices. Heterogeneity in wearable device data exists at three levels: heterogeneity in data generated by a single person using a single device (within-person heterogeneity); heterogeneity in data generated by multiple people who use the same brand, model, and version of a device (between-person heterogeneity); and heterogeneity in data generated from multiple people using different devices (between-person heterogeneity), which would apply especially to data collected under a bring-your-own-device policy.
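
The three intrinsic challenges named in the results (incomplete, incorrect, and heterogeneous data) can be illustrated with a minimal sketch. It is not a method from the review: the record layout (`person_id`, `device`, `heart_rate`), the function name `summarize_quality`, and the plausibility bounds of 30 to 220 beats per minute are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical plausibility bounds for heart rate in beats per minute
# (an assumption for illustration, not a value taken from the review).
HR_MIN, HR_MAX = 30, 220


def summarize_quality(records):
    """Summarize completeness, plausibility, and device heterogeneity.

    Each record is a dict with hypothetical keys 'person_id', 'device',
    and 'heart_rate' (None when the value is missing).
    """
    total = len(records)
    present = [r for r in records if r["heart_rate"] is not None]
    plausible = [r for r in present if HR_MIN <= r["heart_rate"] <= HR_MAX]

    # Group people by device to expose the between-person heterogeneity
    # that arises when participants use different devices.
    people_by_device = defaultdict(set)
    for r in records:
        people_by_device[r["device"]].add(r["person_id"])

    return {
        # Share of records with a value (incomplete data lowers this).
        "completeness": len(present) / total if total else 0.0,
        # Share of non-missing values inside the assumed bounds.
        "plausibility": len(plausible) / len(present) if present else 0.0,
        # Number of distinct people contributing data per device.
        "people_per_device": {d: len(p) for d, p in people_by_device.items()},
    }


records = [
    {"person_id": "p1", "device": "A", "heart_rate": 72},
    {"person_id": "p1", "device": "A", "heart_rate": None},
    {"person_id": "p2", "device": "B", "heart_rate": 400},
    {"person_id": "p3", "device": "B", "heart_rate": 65},
]
summary = summarize_quality(records)
print(summary)
```

In a real study the bounds, the unit of analysis, and the definition of a missing value would need to be justified for the specific device and population.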

Conclusions: Our study identifies potential intrinsic data quality challenges that could occur when analyzing wearable device data for research and three major contributing factors for these challenges. As poor data quality can compromise the reliability and accuracy of research results, further investigation is needed on how to address the data quality challenges of wearable devices.

Keywords: data accuracy; data quality; fitness trackers; mobile phone; patient generated health data; wearable device.

Conflict of interest statement

Conflicts of Interest: None declared.

©Sylvia Cho, Ipek Ensari, Chunhua Weng, Michael G Kahn, Karthik Natarajan. Originally published in JMIR mHealth and uHealth (http://mhealth.jmir.org), 19.03.2021.

Figures

Figure 1. Flow diagram of the literature selection process. ACM: Association for Computing Machinery; IEEE: Institute of Electrical and Electronics Engineers.
Figure 2. Data heterogeneity on three levels.

Source: PubMed
