On the practice of ignoring center-patient interactions in evaluating hospital performance

Machteld Varewyck, Stijn Vansteelandt, Marie Eriksson, Els Goetghebeur

Abstract

We evaluate the performance of medical centers based on a continuous or binary patient outcome (e.g., 30-day mortality). Common practice adjusts for differences in patient mix through outcome regression models, which include patient-specific baseline covariates (e.g., age and disease stage) in addition to center effects. Because a large number of centers may need to be evaluated, the typical model postulates that the effect of a center on outcome is constant over patient characteristics. This may be violated, for example, when some centers specialize in children or in geriatric patients. Including interactions between certain patient characteristics and the many fixed center effects in the model increases the risk of overfitting, however, and could imply a loss of power for detecting centers with deviating mortality. We therefore assess how the common practice of ignoring such interactions impacts the bias and precision of directly and indirectly standardized risks. The reassuring conclusion is that working with only the main effects of a center has minor impact on hospital evaluation, unless some centers actually perform substantially better on a specific group of patients and there is strong confounding through the corresponding patient characteristic. The bias is then driven by an interplay of the relative center size, the overlap between covariate distributions, and the magnitude of the interaction effect. Interestingly, the bias on indirectly standardized risks is smaller than on directly standardized risks. We illustrate our findings by simulation and in an analysis of 30-day mortality using data from Riksstroke, the Swedish stroke register.
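As a minimal sketch of the two standardization schemes contrasted above (all data, center effects, and coefficients are illustrative assumptions, not values from the paper or Riksstroke), the main-effects working model yields both risks as follows:

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy cohort of (center, standardized age, 30-day death) records.
patients = [
    ("A", -1.0, 0), ("A", -0.5, 0), ("A", 0.0, 1),
    ("B",  0.5, 0), ("B",  1.0, 1), ("B",  1.5, 1),
]
alpha = {"A": -1.2, "B": -0.8}  # assumed fixed center effects
beta = 0.9                      # assumed common age effect (no interaction)

def risk(center, x):
    """Predicted mortality under the main-effects working model."""
    return expit(alpha[center] + beta * x)

# Direct standardization: average a center's predicted risk over the
# covariate distribution of the FULL patient population.
direct = {c: sum(risk(c, x) for _, x, _ in patients) / len(patients)
          for c in alpha}

# Indirect standardization: observed deaths in a center divided by the
# deaths expected for that center's own case mix under an average center.
def avg_risk(x):
    return sum(risk(c, x) for c in alpha) / len(alpha)

indirect = {}
for c in alpha:
    own = [(x, y) for ctr, x, y in patients if ctr == c]
    observed = sum(y for _, y in own)
    expected = sum(avg_risk(x) for x, _ in own)
    indirect[c] = observed / expected  # observed-to-expected (O/E) ratio

print(direct, indirect)
```

Note that direct standardization extrapolates each center's model to the full population's covariate range, whereas indirect standardization only evaluates the model at the center's own patients; this is why misspecification from an omitted interaction tends to matter less for the indirect version.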

Keywords: Firth correction; causal effects; direct and indirect standardization; misspecified model; quality of care.

© 2015 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

Figures

Figure 1
Extrapolation in the directly and indirectly standardized risk considering two centers (small or large center size). The 30‐day mortality risk is estimated by a model with or without interaction between center and patient's age.
Figure 2
Regression line for the data‐generating model (1) and for the working model (8) both with linear link function, considering two centers (center 0 at the bottom and center 1 on top) and a scalar L.
Figure 3
Estimated bias and precision for direct and indirect standardization, based on S = 500 simulations. Black dots denote direct standardization and gray triangles indirect standardization; solid lines denote models without interactions and dotted lines models with interactions. (a) Standard normal distribution for L and (b) Beta distribution for L.
Figure 4
Center‐specific values of patient's age and time to hospital (hours) for one imputed dataset. Bubble size is proportional to center size. Centers 47 and 54 show more than a 1% difference in their estimated potential full-population risk when interactions with time to hospital are ignored (MI).
Figure 5
The directly or indirectly standardized risk per center, with or without interactions between center and patient's age or time to hospital (gray without and black with interactions), as a function of the standard deviation of the center‐specific distribution of patient's age or time to hospital, for the multiple imputation analysis. Bubble size is proportional to center size, and ellipses indicate centers with more than a 1% difference in estimated mortality risk.


Source: PubMed
