EFFECTIVELY SELECTING A TARGET POPULATION FOR A FUTURE COMPARATIVE STUDY

Lihui Zhao, Lu Tian, Tianxi Cai, Brian Claggett, L J Wei

Abstract

When comparing a new treatment with a control in a randomized clinical study, the treatment effect is generally assessed by evaluating a summary measure over a specific study population. The success of the trial heavily depends on the choice of such a population. In this paper, we show a systematic, effective way to identify a promising population, for which the new treatment is expected to have a desired benefit, utilizing the data from a current study involving similar comparator treatments. Specifically, using the existing data, we first create a parametric scoring system as a function of multiple baseline covariates to estimate subject-specific treatment differences. Based on this scoring system, we specify a desired level of treatment difference and obtain a subgroup of patients, defined as those whose estimated scores exceed this threshold. An empirically calibrated threshold-specific treatment difference curve across a range of score values is constructed. The subpopulation of patients satisfying any given level of treatment benefit can then be identified accordingly. To avoid bias due to overoptimism, we utilize a cross-training-evaluation method for implementing the above two-step procedure. We then show how to select the best scoring system among all competing models. Furthermore, for cases in which only a single pre-specified working model is involved, inference procedures are proposed for the average treatment difference over a range of score values using the entire data set, and are justified theoretically and numerically. Lastly, the proposals are illustrated with the data from two clinical trials in treating HIV and cardiovascular diseases.
Note that if we are not interested in designing a new study for comparing similar treatments, the new procedure can also be quite useful for the management of future patients, so that treatment may be targeted towards those who would receive nontrivial benefits to compensate for the risk or cost of the new treatment.
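
The two-step procedure with cross-training-evaluation can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: it assumes a continuous outcome, ordinary least-squares working models fitted separately in each arm, and simulated data. All function names (`fit_linear`, `score`, `subgroup_difference`, `cross_training_evaluation`) and the simulation settings are hypothetical; only the 4:1 training-to-evaluation ratio is taken from the Figure 7 caption.

```python
import numpy as np

def fit_linear(X, y):
    # Least-squares fit with an intercept column (assumed working model).
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict_linear(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

def score(X_train, y_train, trt_train, X_new):
    # Step 1: one working model per arm; the score is the predicted
    # treatment-minus-control difference at each new covariate vector.
    b1 = fit_linear(X_train[trt_train == 1], y_train[trt_train == 1])
    b0 = fit_linear(X_train[trt_train == 0], y_train[trt_train == 0])
    return predict_linear(b1, X_new) - predict_linear(b0, X_new)

def subgroup_difference(y, trt, s, c):
    # Step 2: observed difference in mean outcome between arms among
    # subjects whose score is at least the threshold c.
    treated = (s >= c) & (trt == 1)
    control = (s >= c) & (trt == 0)
    if treated.sum() == 0 or control.sum() == 0:
        return np.nan
    return y[treated].mean() - y[control].mean()

def cross_training_evaluation(X, y, trt, thresholds,
                              n_splits=50, train_frac=0.8, seed=0):
    # Build the score on a random training part, evaluate the subgroup
    # difference on the held-out part, and average over random splits.
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(train_frac * n)
    curves = np.full((n_splits, len(thresholds)), np.nan)
    for b in range(n_splits):
        perm = rng.permutation(n)
        tr, ev = perm[:n_train], perm[n_train:]
        s = score(X[tr], y[tr], trt[tr], X[ev])
        for j, c in enumerate(thresholds):
            curves[b, j] = subgroup_difference(y[ev], trt[ev], s, c)
    return np.nanmean(curves, axis=0)

# Simulated example (hypothetical): the true treatment effect equals
# the second covariate, so higher thresholds should select larger benefit.
rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 2))
trt = rng.integers(0, 2, size=n)
y = X[:, 0] + trt * X[:, 1] + rng.normal(size=n)
thresholds = np.array([-1.0, 0.0, 1.0])
curve = cross_training_evaluation(X, y, trt, thresholds)
```

Here `curve[j]` estimates the average treatment difference among patients whose score is at least `thresholds[j]`. Because the score is built and evaluated on disjoint subjects within each split, the estimate avoids the overoptimism of reusing the same data for both steps.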

Keywords: Cross-training-evaluation; Lasso procedure; Personalized medicine; Prediction; Ridge regression; Stratified medicine; Subgroup analysis; Variable selection.

Figures

Figure 1
Estimated average treatment difference for patients whose estimated score, evaluated at covariates Z, is ≥ c, using the scoring system built with two baseline covariates, log(CD4) and log10(RNA), for the ACTG 320 data. (a) Without cross-validation. (b) With cross-validation (Solid: point estimate with cross-validation; Dot-dashed: point estimate without cross-validation; Dashed: 95% pointwise confidence interval; Shaded: 95% simultaneous confidence interval).
Figure 2
Comparing the two estimated average treatment differences for patients with the largest 100(1 − q)% scores, using the systems built with and without log10(RNA), for the ACTG 320 data
Figure 3
Comparing the estimated average treatment difference curves using various scoring systems based on 500 replicates of cross-validation for the ACTG 320 data (left panel: two separate models; right panel: a single interaction model)
Figure 4
Comparing the estimated average treatment difference curves using different scoring systems with respect to 72-month survival rate, based on 500 replicates of cross-validation for the PEACE data (left panel: two separate models; right panel: a single interaction model)
Figure 5
Comparing the estimated average treatment difference curves using different scoring systems with respect to restricted mean survival time up to 72 months, based on 500 replicates of cross-validation for the PEACE data (left panel: two separate models; right panel: a single interaction model)
Figure 6
Estimated average treatment difference for patients whose estimated score, evaluated at covariates Z, is ≥ c, using the scoring system built with two separate models and 7 covariates for the PEACE data (left panel: 72-month survival rate; right panel: restricted mean survival time up to 72 months)
Figure 7
Comparisons between the estimation procedures with and without cross-validation with n = 870; (a) and (b) are based on simulation with the 9 covariates mimicking the HIV example; (c) and (d) are based on simulation with the 9 covariates plus 50 noise variables. Panels (a) and (c) present the average treatment difference curves: the solid curve is the “truth”, the dashed curve is the empirical average using the cross-validation procedure with a 4:1 ratio of training to evaluation samples, and the dotted curve is the empirical average without cross-validation. In (b) and (d), the solid and dashed lines are the coverage probabilities of the 95% confidence intervals without and with cross-validation, respectively.

Source: PubMed
