Minimum sample size for developing a multivariable prediction model: PART II - binary and time-to-event outcomes

Richard D Riley, Kym I E Snell, Joie Ensor, Danielle L Burke, Frank E Harrell Jr, Karel G M Moons, Gary S Collins

Abstract

When designing a study to develop a new prediction model with binary or time-to-event outcomes, researchers should ensure their sample size is adequate in terms of the number of participants (n) and outcome events (E) relative to the number of predictor parameters (p) considered for inclusion. We propose that the minimum values of n and E (and subsequently the minimum number of events per predictor parameter, EPP) should be calculated to meet the following three criteria: (i) small optimism in predictor effect estimates, as defined by a global shrinkage factor of ≥0.9; (ii) a small absolute difference of ≤0.05 between the model's apparent and adjusted Nagelkerke's R²; and (iii) precise estimation of the overall risk in the population. Criteria (i) and (ii) aim to reduce overfitting conditional on a chosen p, and require prespecification of the model's anticipated Cox-Snell R², which we show can be obtained from previous studies. The values of n and E that meet all three criteria provide the minimum sample size required for model development. Upon application of our approach, a new diagnostic model for Chagas disease requires an EPP of at least 4.8, and a new prognostic model for recurrent venous thromboembolism requires an EPP of at least 23. This reinforces why rules of thumb (e.g., 10 EPP) should be avoided. Researchers might additionally ensure the sample size gives precise estimates of key predictor effects; this is especially important when key categorical predictors have few events in some categories, as this may substantially increase the numbers required.
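As a rough illustration of criterion (i), the sketch below computes the smallest n for which the expected global shrinkage factor reaches a target value, given p and an anticipated adjusted Cox-Snell R². It assumes the closed-form expression n = p / ((S − 1) ln(1 − R²_CS_adj / S)), which is not reproduced in this abstract and should be checked against the full paper before use. The function name and the choice of p = 25 are hypothetical; the values 0.051 and 0.9 are taken from the figure captions and the abstract.

```python
import math


def min_n_for_shrinkage(p, r2_cs_adj, shrinkage=0.9):
    """Smallest n giving an expected global shrinkage factor >= `shrinkage`.

    Assumed formula (not stated in the abstract itself):
        n = p / ((S - 1) * ln(1 - R2_CS_adj / S))
    p          : number of candidate predictor parameters
    r2_cs_adj  : anticipated (optimism-adjusted) Cox-Snell R-squared
    shrinkage  : target shrinkage factor S, strictly between 0 and 1
    """
    if p <= 0:
        raise ValueError("p must be positive")
    if not 0 < shrinkage < 1:
        raise ValueError("shrinkage must lie strictly between 0 and 1")
    if not 0 < r2_cs_adj < shrinkage:
        raise ValueError("r2_cs_adj must lie strictly between 0 and shrinkage")
    # Both factors in the denominator are negative, so n is positive.
    n = p / ((shrinkage - 1) * math.log(1 - r2_cs_adj / shrinkage))
    return math.ceil(n)


# Hypothetical input: p = 25 candidate parameters (an assumption for illustration),
# with R2_CS_adj = 0.051 and a target shrinkage of 0.9.
n_required = min_n_for_shrinkage(p=25, r2_cs_adj=0.051, shrinkage=0.9)
print(n_required)
```

This covers criterion (i) only; the minimum sample size is the largest n across all three criteria, and E (and hence EPP) then follows from n and the anticipated outcome proportion or event rate.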

Keywords: binary and time-to-event outcomes; logistic and Cox regression; multivariable prediction model; pseudo R-squared; sample size; shrinkage.

© 2018 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

Figures

Figure 1
Summary of the steps involved in calculating the minimum sample size required for developing a multivariable prediction model for binary or time‐to‐event outcomes
Figure 2
Events per predictor parameter required to achieve various expected shrinkage (S_VH) values for a new prediction model of venous thromboembolism recurrence risk with an assumed R²_CS_adj of 0.051 [Colour figure can be viewed at wileyonlinelibrary.com]
Figure 3
Sample size required (based on Equation (11)) for a particular number of predictor parameters (p) to achieve a particular value of expected shrinkage (S_VH), for a new prediction model of venous thromboembolism recurrence risk with an assumed R²_CS_adj of 0.051 [Colour figure can be viewed at wileyonlinelibrary.com]

Source: PubMed
