Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence

Gary S Collins, Paula Dhiman, Constanza L Andaur Navarro, Jie Ma, Lotty Hooft, Johannes B Reitsma, Patricia Logullo, Andrew L Beam, Lily Peng, Ben Van Calster, Maarten van Smeden, Richard D Riley, Karel GM Moons

Abstract

Introduction: The Transparent Reporting of a multivariable prediction model of Individual Prognosis Or Diagnosis (TRIPOD) statement and the Prediction model Risk Of Bias ASsessment Tool (PROBAST) were both published to improve the reporting and critical appraisal of prediction model studies for diagnosis and prognosis. This paper describes the processes and methods that will be used to develop an extension to the TRIPOD statement (TRIPOD-artificial intelligence, AI) and to the PROBAST tool (PROBAST-AI) for prediction model studies that apply machine learning techniques.

Methods and analysis: TRIPOD-AI and PROBAST-AI will be developed following published guidance from the EQUATOR Network, and will comprise five stages. Stage 1 will consist of two systematic reviews (across all medical fields and specifically in oncology) to examine the quality of reporting in published machine-learning-based prediction model studies. In stage 2, we will consult a diverse group of key stakeholders using a Delphi process to identify items to be considered for inclusion in TRIPOD-AI and PROBAST-AI. Stage 3 will comprise virtual consensus meetings to consolidate and prioritise the key items to be included in TRIPOD-AI and PROBAST-AI. Stage 4 will involve developing the TRIPOD-AI checklist and the PROBAST-AI tool, and writing the accompanying explanation and elaboration papers. In stage 5, we will disseminate TRIPOD-AI and PROBAST-AI via journals, conferences, blogs, websites (including the TRIPOD, PROBAST and EQUATOR Network websites) and social media. TRIPOD-AI will provide researchers working on prediction model studies based on machine learning with a reporting guideline that can help them report the key details that readers need to evaluate study quality and interpret study findings, potentially reducing research waste. We anticipate that PROBAST-AI will give researchers, clinicians, systematic reviewers and policymakers a robust, standardised tool for critically appraising the design, conduct and analysis of machine-learning-based prediction model studies and evaluating their risk of bias.

Ethics and dissemination: Ethical approval was granted by the Central University Research Ethics Committee, University of Oxford, on 10 December 2020 (R73034/RE001). Findings from this study will be disseminated through peer-reviewed publications.

PROSPERO registration numbers: CRD42019140361 and CRD42019161764.

Keywords: epidemiology; general medicine (see internal medicine); statistics & research methods.

Conflict of interest statement

Competing interests: None declared.

© Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY. Published by BMJ.


Source: PubMed
