Imputation-based strategies for clinical trial longitudinal data with nonignorable missing values

Xiaowei Yang, Jinhui Li, Steven Shoptaw

Abstract

Biomedical research is plagued with problems of missing data, especially in clinical trials of medical and behavioral therapies that adopt a longitudinal design. After a literature review on modeling incomplete longitudinal data based on full-likelihood functions, this paper proposes a set of imputation-based strategies for implementing selection, pattern-mixture, and shared-parameter models for handling intermittent missing values and dropouts that are potentially nonignorable according to various criteria. Within the framework of multiple partial imputation, intermittent missing values are first imputed several times; then, each partially imputed data set is analyzed to deal with dropouts, with or without further imputation. Depending on the choice of imputation model or measurement model, various strategies can be jointly applied to the same data set to study the effect of a treatment or intervention from multiple perspectives. For illustration, the strategies were applied to a data set with continuous repeated measures from a smoking cessation clinical trial.
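The two-stage idea described above (impute only the intermittent gaps several times, and leave dropouts for a later analysis step) can be sketched as follows. This is a minimal illustration, not the authors' method: the function names `split_missingness` and `partially_impute` are hypothetical, and the imputation model (linear interpolation plus Gaussian noise) is a placeholder assumption standing in for whatever imputation model is actually chosen.

```python
import numpy as np

def split_missingness(y):
    """Classify missing visits for one subject.

    A missing value is treated as dropout if no observed value follows it,
    and as intermittent otherwise (an assumption made for this sketch).
    """
    missing = np.isnan(y)
    observed_idx = np.flatnonzero(~missing)
    last_obs = observed_idx[-1] if observed_idx.size else -1
    positions = np.arange(y.size)
    dropout = missing & (positions > last_obs)
    intermittent = missing & (positions < last_obs)
    return intermittent, dropout

def partially_impute(Y, m=5, seed=0):
    """Return m copies of Y with only the intermittent gaps filled in.

    Values missing after dropout stay NaN so that a later model
    (selection, pattern-mixture, or shared-parameter) can handle them.
    """
    rng = np.random.default_rng(seed)
    copies = []
    for _ in range(m):
        Yk = Y.copy()
        for i in range(Y.shape[0]):
            intermittent, _ = split_missingness(Y[i])
            if not intermittent.any():
                continue
            obs = ~np.isnan(Y[i])
            t_obs = np.flatnonzero(obs)
            t_mis = np.flatnonzero(intermittent)
            # Placeholder imputation model: interpolate between observed
            # visits and add noise scaled by the subject's observed spread.
            center = np.interp(t_mis, t_obs, Y[i, obs])
            spread = np.std(Y[i, obs])
            Yk[i, t_mis] = center + rng.normal(0.0, spread, size=t_mis.size)
        copies.append(Yk)
    return copies

# Toy data: rows are subjects, columns are visits, NaN marks a missing value.
Y = np.array([
    [1.0, np.nan, 3.0, np.nan, np.nan],  # one intermittent gap, then dropout
    [2.0, 2.5, np.nan, 4.0, 5.0],        # one intermittent gap, completer
])
copies = partially_impute(Y, m=3)
```

Each of the `m` partially imputed copies would then be analyzed separately and the results combined, which is the step where the choice among the dropout models enters.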

(c) 2008 John Wiley & Sons, Ltd.

Figures

Figure 1
The average and SD curves for the log-scaled carbon monoxide levels. On this plot, the four mean curves of the log-scaled carbon monoxide levels and the corresponding pointwise standard errors are drawn for each of the four treatment conditions: Control, RP-only, CM-only, and RP+CM (RP = relapse prevention, CM = contingency management). Vertical bars indicate the estimated standard errors of the average carbon monoxide levels. The stars (‘*’) over the x-axis mark the time points (i.e. visit numbers) at which the carbon monoxide levels differ significantly, as indicated by a pointwise ANOVA (p-value < 0.001). The y-axis shows carbon monoxide levels after the log(1 + x) transform. The x-axis shows the clinic visit number for study participants (1, …, 36; three visits per week).
Figure 2
Missingness patterns for the carbon monoxide levels across treatment conditions. For each treatment condition, an image depicts the missingness indicators of carbon monoxide levels for each smoker at each research visit. Dark areas indicate that the corresponding carbon monoxide levels were observed, while white areas indicate that the corresponding data were missing intermittently or missing after dropout. The four treatment conditions are Control, RP-only, CM-only, and RP+CM (RP = relapse prevention, CM = contingency management).
Figure 3
Mean carbon monoxide levels for completers and early terminators. The 174 smokers are divided into two groups, completers (n1 = 112) and early terminators (n2 = 62); within each group, the mean curves of carbon monoxide levels are depicted for subjects receiving CM (contingency management) and for subjects receiving no CM.
Figure 4
Plate 1. Pattern-dependent distribution of carbon monoxide levels. Using the software package ‘MPI 2.0’, profiles and mean curves of carbon monoxide levels are drawn within each of the five groups determined by the dropout times: dropout at or before week 5, 7, 9, 11, and 12. In the plots, green curves show the mean carbon monoxide levels of subjects who received CM (contingency management), red curves show the mean curves of subjects who did not receive CM, and gray dashed lines depict the profiles of all the subjects within each group. The bottom-right plot depicts all the mean profiles corresponding to the five dropout patterns.

Source: PubMed
