An Early Warning Risk Prediction Tool (RECAP-V1) for Patients Diagnosed With COVID-19: Protocol for a Statistical Analysis Plan

Francesca Fiorentino, Denys Prociuk, Ana Belen Espinosa Gonzalez, Ana Luisa Neves, Laiba Husain, Sonny Christian Ramtale, Emma Mi, Ella Mi, Jack Macartney, Sneha N Anand, Julian Sherlock, Kavitha Saravanakumar, Erik Mayer, Simon de Lusignan, Trisha Greenhalgh, Brendan C Delaney

Abstract

Background: Since the start of the COVID-19 pandemic, efforts have been made to develop early warning risk scores to help clinicians decide which patients are likely to deteriorate and require hospitalization. The RECAP (Remote COVID-19 Assessment in Primary Care) study investigates the predictive risk of hospitalization, deterioration, and death of patients with confirmed COVID-19, based on a set of parameters chosen through a Delphi process performed by clinicians. We aim to use rich data collected remotely via electronic data templates integrated into the electronic health record systems of several general practices across the United Kingdom to construct accurate predictive models. The models will use preexisting conditions and monitored clinical parameters (eg, blood oxygen saturation) to make reliable predictions of a patient's risk of hospital admission, deterioration, and death.

Objective: This statistical analysis plan outlines the statistical methods that will be used to build the prediction model for prioritizing patients in the primary care setting. The statistical analysis plan for the RECAP study includes the development and validation of the RECAP-V1 prediction model as a primary outcome. This prediction model will be adapted into a three-category risk score split into red (high risk), amber (medium risk), and green (low risk) for any patient with suspected COVID-19. The model will predict the risk of deterioration and hospitalization.
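As an illustration only, the sketch below shows how a fitted model's predicted probability might be mapped onto the red/amber/green categories described above. The cutoff values and function name are assumptions for illustration, not the thresholds that will be derived in the study.

```python
# Illustrative sketch only: maps a predicted deterioration probability to a
# red/amber/green category. The cutoff values below are hypothetical
# placeholders, not the thresholds that RECAP-V1 will actually derive.
def rag_category(predicted_risk: float,
                 amber_cutoff: float = 0.05,
                 red_cutoff: float = 0.20) -> str:
    """Return 'green', 'amber', or 'red' for a probability in [0, 1]."""
    if predicted_risk >= red_cutoff:
        return "red"      # high risk: consider urgent escalation
    if predicted_risk >= amber_cutoff:
        return "amber"    # medium risk: closer monitoring
    return "green"        # low risk: routine safety-netting

print(rag_category(0.12))  # -> 'amber'
```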

Methods: After the data have been collected, we will assess the degree of missingness and use a combination of traditional multiple imputation by chained equations and more novel machine learning approaches to impute the missing data for the final analysis. For predictive model development, we will use multiple logistic regression analyses to construct the model. We aim to recruit a minimum of 1317 patients for model development and validation. We will then externally validate the model on an independent dataset of 1400 patients. The model will also be applied to multiple different datasets to assess both its performance in different patient groups and its applicability to different methods of data collection.
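A minimal sketch, assuming scikit-learn, of the general workflow the plan describes: chained-equation-style imputation followed by a multivariable logistic regression. The synthetic predictor names and the use of IterativeImputer as a stand-in for multiple imputation by chained equations are assumptions for illustration, not the study's specified implementation.

```python
# Illustrative sketch of the workflow described above (not the study's code):
# chained-equation-style imputation followed by multivariable logistic regression.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical predictors loosely mirroring the RECAP parameters.
n = 1317  # minimum development sample size stated in the plan
X = pd.DataFrame({
    "age": rng.normal(55, 15, n),
    "spo2": rng.normal(96, 2, n),          # blood oxygen saturation
    "resp_rate": rng.normal(18, 4, n),
    "comorbidity_count": rng.poisson(1.2, n),
})
y = rng.binomial(1, 0.1, n)                 # deterioration/hospitalization outcome

# Introduce missingness, then impute with a chained-equations-style imputer.
X.loc[rng.choice(n, 150, replace=False), "spo2"] = np.nan

model = make_pipeline(
    IterativeImputer(random_state=0),       # MICE-like imputation
    LogisticRegression(max_iter=1000),      # multivariable logistic regression
)
print(cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())
```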

Results: As of May 10, 2021, we have recruited 3732 patients. A further 2088 patients have been recruited through the National Health Service Clinical Assessment Service, and approximately 5000 patients have been recruited through the DoctalyHealth platform.

Conclusions: The methodology for the development of the RECAP-V1 prediction model as well as the risk score will provide clinicians with a statistically robust tool to help prioritize COVID-19 patients.

Trial registration: ClinicalTrials.gov NCT04435041; https://ichgcp.net/clinical-trials-registry/NCT04435041.

International Registered Report Identifier (IRRID): DERR1-10.2196/30083.

Keywords: COVID-19; early warning; modeling; remote assessment; risk score.

Conflict of interest statement

Conflicts of Interest: Simon de Lusignan is the Director of the Royal College of General Practitioners Research and Surveillance Centre. He has also received a grant through his university from AstraZeneca to study vaccine effectiveness and explore adverse events of interest. All other authors declare no conflicts of interest.

©Francesca Fiorentino, Denys Prociuk, Ana Belen Espinosa Gonzalez, Ana Luisa Neves, Laiba Husain, Sonny Christian Ramtale, Emma Mi, Ella Mi, Jack Macartney, Sneha N Anand, Julian Sherlock, Kavitha Saravanakumar, Erik Mayer, Simon de Lusignan, Trisha Greenhalgh, Brendan C Delaney. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 05.10.2021.

Figures

Figure 1. RECAP-V1 model predictor variables and their clinical severity, adapted from Greenhalgh et al [11]. O/E: on examination; ORCHID: Oxford-Royal College of General Practitioners Clinical Informatics Digital Hub; RECAP: Remote COVID-19 Assessment in Primary Care; SNOMED: Systematized Nomenclature of Medicine.

Figure 2. Sensitivity and specificity calculations based on test risk prediction and the real outcome, along with the positive and negative predictive value formulas. NPV: negative predictive value; PPV: positive predictive value.
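For reference, a small sketch of the quantities summarized in Figure 2, computed from the counts of a 2x2 confusion matrix; the function name and example counts are illustrative assumptions, not values from the study.

```python
# Sensitivity, specificity, PPV, and NPV from 2x2 confusion-matrix counts,
# as summarized in Figure 2 (the counts passed below are illustrative only).
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

print(diagnostic_metrics(tp=80, fp=40, fn=20, tn=860))
```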

References

    1. COVID-19 rapid guideline: managing COVID-19. National Institute for Health and Care Excellence. 2021. [2021-04-07].
    2. RCGP clarification on the use of Roth scores in the assessment of patients with potential COVID-19. Royal College of General Practitioners. 2020. [2021-04-07].
    3. Greenhalgh T. Should the Roth Score be used in the remote assessment of patients with possible COVID-19? The Centre for Evidence-Based Medicine. 2020. [2021-04-07].
    4. Wu Z, McGoogan JM. Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the Chinese Center for Disease Control and Prevention. JAMA. 2020 Apr 07;323(13):1239–1242. doi: 10.1001/jama.2020.2648.
    5. Espinosa-Gonzalez A, Neves AL, Fiorentino F, Prociuk D, Husain L, Ramtale SC, Mi E, Mi E, Macartney J, Anand SN, Sherlock J, Saravanakumar K, Mayer E, de Lusignan S, Greenhalgh T, Delaney BC. Predicting risk of hospital admission in patients with suspected COVID-19 in a community setting: protocol for development and validation of a multivariate risk prediction tool. JMIR Res Protoc. 2021 May 25;10(5):e29072. doi: 10.2196/29072.
    6. Bennett KE, Mullooly M, O'Loughlin M, Fitzgerald M, O'Donnell J, O'Connor L, Oza A, Cuddihy J. Underlying conditions and risk of hospitalisation, ICU admission and mortality among those with COVID-19 in Ireland: a national surveillance study. Lancet Reg Health Eur. 2021 Jun;5:100097. doi: 10.1016/j.lanepe.2021.100097.
    7. Riley RD, Snell KI, Ensor J, Burke DL, Harrell FE, Moons KG, Collins GS. Minimum sample size for developing a multivariable prediction model: PART II - binary and time-to-event outcomes. Stat Med. 2019 Mar 30;38(7):1276–1296. doi: 10.1002/sim.7992.
    8. The Health Service (Control of Patient Information) Regulations 2002. National Health Service, England and Wales. 2002. [2021-04-16].
    9. Greenhalgh T, Koh GCH, Car J. Covid-19: a remote assessment in primary care. BMJ. 2020 Mar 25;368:m1182. doi: 10.1136/bmj.m1182.
    10. Shielded patient list: guidance for general practice. NHS Digital. 2020. [2021-04-07].
    11. Greenhalgh T, Thompson P, Weiringa S, Neves AL, Husain L, Dunlop M, Rushforth A, Nunan D, de Lusignan S, Delaney B. What items should be included in an early warning score for remote assessment of suspected COVID-19? Qualitative and Delphi study. BMJ Open. 2020 Nov 12;10(11):e042626. doi: 10.1136/bmjopen-2020-042626.
    12. Grant S, Collins GS, Nashef SAM. Statistical Primer: developing and validating a risk prediction model. Eur J Cardiothorac Surg. 2018 Aug 01;54(2):203–208. doi: 10.1093/ejcts/ezy180.
    13. Hosmer DW, Hosmer T, Le Cessie S, Lemeshow S. A comparison of goodness-of-fit tests for the logistic regression model. Stat Med. 1997 May 15;16(9):965–980. doi: 10.1002/(SICI)1097-0258(19970515)16:9<965::AID-SIM509>3.0.CO;2-O.
    14. Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996 Feb 28;15(4):361–387. doi: 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4.
    15. Froud R, Abel G. Using ROC curves to choose minimally important change thresholds when sensitivity and specificity are valued equally: the forgotten lesson of Pythagoras. Theoretical considerations and an example application of change in health status. PLoS One. 2014;9(12):e114468. doi: 10.1371/journal.pone.0114468.
    16. Riley RD, Ensor J, Snell KIE, Debray TPA, Altman DG, Moons KGM, Collins GS. External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges. BMJ. 2016 Jun 22;353:i3140. doi: 10.1136/bmj.i3140.
    17. Austin PC, van Klaveren D, Vergouwe Y, Nieboer D, Lee DS, Steyerberg EW. Validation of prediction models: examining temporal and geographic stability of baseline risk and estimated covariate effects. Diagn Progn Res. 2017;1:12. doi: 10.1186/s41512-017-0012-3.
    18. White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. 2011 Feb 20;30(4):377–399. doi: 10.1002/sim.4067.
    19. Marshall A, Altman DG, Holder RL, Royston P. Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines. BMC Med Res Methodol. 2009 Jul 28;9:57. doi: 10.1186/1471-2288-9-57.
    20. Jakobsen JC, Gluud C, Wetterslev J, Winkel P. When and how should multiple imputation be used for handling missing data in randomised clinical trials - a practical guide with flowcharts. BMC Med Res Methodol. 2017 Dec 06;17(1):162. doi: 10.1186/s12874-017-0442-1.
    21. Sterne JAC, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, Wood AM, Carpenter JR. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009 Jun 29;338:b2393. doi: 10.1136/bmj.b2393.
    22. Fine JP, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc. 1999 Jun;94(446):496–509. doi: 10.1080/01621459.1999.10474144.
    23. Shah A, Bartlett JW, Carpenter J, Nicholas O, Hemingway H. Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study. Am J Epidemiol. 2014 Mar 15;179(6):764–774. doi: 10.1093/aje/kwt312.
    24. Biessmann F, Rukat T, Schmidt P, Naidu P, Schelter S, Taptunov A, Lange D, Salinas D. DataWig: missing value imputation for tables. J Mach Learn Res. 2019;20(175):1–6.
    25. Yoon J, Jordon J, van der Schaar M. GAIN: missing data imputation using generative adversarial nets. Proceedings of the 35th International Conference on Machine Learning; July 10-15, 2018; Stockholm, Sweden. 2018. pp. 5689–5698.
    26. Austin PC, van Klaveren D, Vergouwe Y, Nieboer D, Lee DS, Steyerberg EW. Geographic and temporal validity of prediction models: different approaches were useful to examine model performance. J Clin Epidemiol. 2016 Nov;79:76–85. doi: 10.1016/j.jclinepi.2016.05.007.
    27. Steyerberg EW, Harrell FE. Prediction models need appropriate internal, internal-external, and external validation. J Clin Epidemiol. 2016 Jan;69:245–247. doi: 10.1016/j.jclinepi.2015.04.005.
    28. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015 Jan 07;350:g7594. doi: 10.1136/bmj.g7594.

Source: PubMed
