Are we assuming too much with our statistical assumptions? Lessons learned from the ALTTO trial

E M Holmes, I Bradbury, L S Williams, L Korde, E de Azambuja, D Fumagalli, A Moreno-Aspitia, J Baselga, M Piccart-Gebhart, A C Dueck, R D Gelber, ALTTO Trial Study Team

Abstract

Background: Design, conduct, and analysis of randomized clinical trials (RCTs) with time-to-event end points rely on a variety of assumptions regarding event rates (hazard rates), proportionality of treatment effects (proportional hazards), and differences in intensity and type of events over time and between subgroups.

Design and methods: In this article, we use the experience of the recently reported Adjuvant Lapatinib and/or Trastuzumab Treatment Optimization (ALTTO) RCT, which enrolled 8381 patients with human epidermal growth factor receptor 2-positive early breast cancer between June 2007 and July 2011, to highlight how routinely applied statistical assumptions can impact RCT result reporting.

Results and conclusions: We conclude that (i) futility stopping rules are important to protect patient safety, but stopping early for efficacy can be misleading because short-term results may not imply long-term efficacy; (ii) biologically important differences between subgroups may drive clinically different treatment effects and should be taken into account, e.g. by pre-specifying primary subgroup analyses and restricting end points to events that are known to be affected by the targeted therapies; (iii) the usual focus on the Cox model may be misleading if non-proportionality of the hazards is not carefully considered, and the results of the accelerated failure time model illustrate that giving more weight to later events (as in the log rank test) can affect conclusions; (iv) the assumption that accruing additional events will always ensure a gain in power needs to be challenged, and changes in hazard rates and hazard ratios over time should be considered; and (v) the required family-wise control of type 1 error at ≤ 5% in clinical trials with multiple experimental arms discourages investigations designed to answer more than one question.
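The cost of family-wise type 1 error control in point (v) can be illustrated with a little arithmetic. The sketch below is not taken from the ALTTO analysis. It assumes, for simplicity, that the k comparisons are independent (comparisons against a shared control arm are not), and the function names `familywise_error` and `bonferroni_level` are hypothetical helpers introduced only for this illustration.

```python
def familywise_error(alpha: float, k: int) -> float:
    """Probability of at least one false positive among k tests,
    each run at level alpha, ASSUMING the tests are independent."""
    return 1.0 - (1.0 - alpha) ** k


def bonferroni_level(fwer: float, k: int) -> float:
    """Per-comparison level under a Bonferroni split of the
    family-wise error rate across k comparisons."""
    return fwer / k


if __name__ == "__main__":
    # k is a hypothetical number of experimental-versus-control comparisons.
    for k in (1, 2, 3):
        unadjusted = familywise_error(0.05, k)
        per_test = bonferroni_level(0.05, k)
        print(f"k={k}: FWER at 0.05 each = {unadjusted:.4f}, "
              f"Bonferroni per-test level = {per_test:.4f}")
```

Under these assumptions, each added comparison either inflates the family-wise error rate or tightens the per-comparison threshold, and a tighter threshold requires more events for the same power. This is one way to read the concern that family-wise control discourages multi-question designs.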

Trial registration: clinicaltrials.gov Identifier NCT00490139.

Keywords: accelerated failure time models; early breast cancer; family-wise type 1 error; power; proportional hazards; stopping boundaries.

© The Author(s) 2019. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

Figures

Figure 1.
Forest plot of interim and primary analyses for all patients (top) and for subgroups defined by hormone receptor status (positive [HR+] and negative [HR-]) and by chemotherapy timing (concurrent, sequential). L + T, lapatinib + trastuzumab treatment arm; T, trastuzumab-alone treatment arm; CI, confidence interval; median FU, median follow-up.
Figure 2.
Log effect function plot illustrating the estimated treatment effect of L + T versus T as a function of the years since randomization, for all patients (solid blue line) and for separate patient cohorts defined by hormone receptor status and chemotherapy timing. [Note: hazard ratio = exp (log effect function); that is y-axis values for the log effect function = 0.25, 0.00, −0.25, −0.50, and −0.75 correspond to hazard ratios = 1.28, 1.00, 0.78, 0.61, and 0.47, respectively]. L + T, lapatinib + trastuzumab treatment arm; T, trastuzumab-alone treatment arm; HR, hormone receptor; Chemo Timing, chemotherapy timing.
Figure 3.
Disease-free survival (DFS) event rates (number of DFS events per 1000 patient-years of follow-up) for all patients (on the left) and for separate patient cohorts defined by hormone receptor (HR) status and chemotherapy timing showing overall results and by type of DFS event for the time period from start of study to interim analysis (interim) and for the time period from interim analysis to primary analysis (interim to primary). Each bar shows the contributions to the overall DFS event rate from each of the four types of DFS event (distant recurrence, locoregional recurrence, second primary malignancy, and death without recurrence).
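Two quantities in the captions above are simple transformations and can be checked directly. The first is the conversion from the log effect function to a hazard ratio in Figure 2. The second is the event rate per 1000 patient-years in Figure 3. The following is a minimal sketch. The function names are hypothetical, and the event count and follow-up time in the second example are made-up numbers for illustration, not ALTTO data.

```python
import math


def hazard_ratio(log_effect: float) -> float:
    """Convert a log effect (log hazard ratio) to a hazard ratio."""
    return math.exp(log_effect)


def events_per_1000_patient_years(events: int, patient_years: float) -> float:
    """Event rate expressed per 1000 patient-years of follow-up."""
    if patient_years <= 0:
        raise ValueError("patient_years must be positive")
    return 1000.0 * events / patient_years


if __name__ == "__main__":
    # y-axis values quoted in the Figure 2 caption
    for y in (0.25, 0.00, -0.25, -0.50, -0.75):
        print(f"log effect {y:+.2f} -> hazard ratio {hazard_ratio(y):.2f}")

    # Hypothetical illustration only: 50 events over 2000 patient-years
    print(events_per_1000_patient_years(50, 2000.0))  # 25.0
```

The first loop reproduces the hazard ratios listed in the Figure 2 note (1.28, 1.00, 0.78, 0.61, and 0.47).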

Source: PubMed
