Trial Sequential Analysis in systematic reviews with meta-analysis
Jørn Wetterslev, Janus Christian Jakobsen, Christian Gluud
Abstract
Background: Most meta-analyses in systematic reviews, including Cochrane reviews, do not have sufficient statistical power to detect or refute even large intervention effects. A meta-analysis should therefore be regarded as an interim analysis on its way towards a required information size. The results of a meta-analysis should relate the total number of randomised participants to the estimated required meta-analytic information size, accounting for statistical diversity. When the number of participants and the corresponding number of trials in a meta-analysis are insufficient, use of the traditional 95% confidence interval or the 5% statistical significance threshold will lead to too many false positive conclusions (type I errors) and too many false negative conclusions (type II errors).
Methods: We developed a methodology for interpreting meta-analysis results, using generally accepted, valid evidence on how to adjust thresholds for significance in randomised clinical trials when the required sample size has not been reached.
Results: The Lan-DeMets trial sequential monitoring boundaries in Trial Sequential Analysis offer adjusted confidence intervals and restricted thresholds for statistical significance when the diversity-adjusted required information size and the corresponding number of required trials for the meta-analysis have not been reached. Trial Sequential Analysis provides a frequentist approach to controlling both type I and type II errors. We define the required information size and the corresponding number of required trials in a meta-analysis, as well as the diversity (D²) measure of heterogeneity. We explain the reasons for applying Trial Sequential Analysis to a meta-analysis when the actual information size fails to reach the required information size. We present examples drawn from traditional meta-analyses using unadjusted naïve 95% confidence intervals and 5% thresholds for statistical significance. Spurious conclusions in systematic reviews with traditional meta-analyses can be reduced using Trial Sequential Analysis. Several empirical studies have demonstrated that Trial Sequential Analysis provides better control of type I and type II errors than traditional naïve meta-analysis.
Conclusions: Trial Sequential Analysis offers an analysis of meta-analytic data with transparent assumptions and better control of type I and type II errors than traditional meta-analysis using naïve unadjusted confidence intervals.
Keywords: Diversity; Fixed-effect model; Group sequential analysis; Heterogeneity; Information size; Interim analysis; Meta-analysis; Random-effects model; Sample size; Trial sequential analysis.
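The quantities named in the abstract (required information size, diversity D², and Lan-DeMets monitoring thresholds) can be illustrated with a minimal Python sketch. It is not the Trial Sequential Analysis software's implementation. All function names are hypothetical, and the formulas are the commonly stated forms, assumed here for illustration: a two-group binary-outcome information size, D² = (v_R − v_F) / v_R with a DerSimonian-Laird between-trial variance, the adjustment factor 1 / (1 − D²), and the O'Brien-Fleming-type alpha-spending function.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for quantiles and tail areas


def fixed_information_size(p_control, rrr, alpha=0.05, beta=0.10):
    """Unadjusted required number of participants for a binary outcome.

    Assumes a two-sided alpha, equal allocation, a control-group event
    proportion `p_control`, and an anticipated relative risk reduction `rrr`.
    """
    p_experimental = p_control * (1 - rrr)
    p_average = (p_control + p_experimental) / 2
    delta = p_control - p_experimental
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(1 - beta)
    return 4 * (z_alpha + z_beta) ** 2 * p_average * (1 - p_average) / delta ** 2


def diversity(effects, variances):
    """D^2 = (v_R - v_F) / v_R from per-trial effect estimates and variances.

    The between-trial variance tau^2 is estimated with the
    DerSimonian-Laird method (an assumption of this sketch).
    """
    weights = [1 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    c = total_weight - sum(w ** 2 for w in weights) / total_weight
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    v_fixed = 1 / total_weight
    v_random = 1 / sum(1 / (v + tau2) for v in variances)
    return (v_random - v_fixed) / v_random


def diversity_adjusted_information_size(fixed_size, d2):
    """Inflate the fixed-effect information size by 1 / (1 - D^2); requires D^2 < 1."""
    return fixed_size / (1 - d2)


def obf_alpha_spent(information_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent at a given information fraction (0 < t <= 1)
    under the O'Brien-Fleming-type Lan-DeMets spending function."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _STD_NORMAL.cdf(z_alpha / sqrt(information_fraction)))
```

Under these assumptions, a meta-analysis that has accrued only half of its diversity-adjusted required information size has spent far less than the nominal 5% of alpha, which is why the monitoring boundary at that point is much stricter than the conventional threshold. The inputs (event proportion, relative risk reduction, error levels) are choices the reviewer must state explicitly.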
Source: PubMed