Longitudinal functional principal component analysis

Sonja Greven, Ciprian Crainiceanu, Brian Caffo, Daniel Reich

Abstract

We introduce models for the analysis of functional data observed at multiple time points. The dynamic behavior of functional data is decomposed into a time-dependent population average, baseline (or static) subject-specific variability, longitudinal (or dynamic) subject-specific variability, subject-visit-specific variability and measurement error. The model can be viewed as the functional analog of the classical longitudinal mixed effects model where random effects are replaced by random processes. Methods have wide applicability and are computationally feasible for moderate and large data sets. Computational feasibility is assured by using principal component bases for the functional processes. The methodology is motivated by and applied to a diffusion tensor imaging (DTI) study designed to analyze differences and changes in brain connectivity in healthy volunteers and multiple sclerosis (MS) patients. An R implementation is provided.
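As a rough illustration of the decomposition described above, the sketch below simulates curves of the form Y_ij(d) = eta(d, T_ij) + X_i0(d) + T_ij X_i1(d) + U_ij(d) + eps_ij(d), that is, a population mean, a random functional intercept, a random functional slope, a visit-specific deviation and measurement error. This is not the authors' R implementation. The function name `simulate_lfpca`, the mean function, the sine/cosine bases, the eigenvalues, the uniform visit times and the Gaussian scores are all hypothetical choices made for illustration only.

```python
import math
import random


def simulate_lfpca(I=100, J=4, D=120, n_x=4, n_u=4, sigma=0.1, seed=0):
    """Simulate Y[i][j][p] for I subjects, J visits each, D sample points.

    Model sketch (all concrete choices below are assumptions):
      Y_ij(d) = eta(d, T_ij) + X_i0(d) + T_ij * X_i1(d) + U_ij(d) + eps_ij(d)
    """
    rng = random.Random(seed)
    grid = [p / (D - 1) for p in range(D)]  # sample points on [0, 1]

    # Hypothetical bases. The intercept part phi0 and slope part phi1 share
    # the same subject-level score xi_k; phiU carries the visit-level score.
    phi0 = [[math.sin(2 * math.pi * (k + 1) * d) for d in grid] for k in range(n_x)]
    phi1 = [[math.cos(2 * math.pi * (k + 1) * d) for d in grid] for k in range(n_x)]
    phiU = [[math.sqrt(2) * math.sin(math.pi * (k + 1) * d) for d in grid]
            for k in range(n_u)]

    # Hypothetical, geometrically decaying score variances.
    lam = [0.5 ** k for k in range(n_x)]
    nu = [0.5 ** k for k in range(n_u)]

    T, Y = [], []
    for _ in range(I):
        times = sorted(rng.uniform(0.0, 1.0) for _ in range(J))  # visit times
        xi = [rng.gauss(0.0, math.sqrt(v)) for v in lam]         # subject scores
        curves = []
        for t in times:
            zeta = [rng.gauss(0.0, math.sqrt(v)) for v in nu]    # visit scores
            curve = []
            for p, d in enumerate(grid):
                mean = math.sin(2 * math.pi * d) + 0.5 * t       # eta(d, t)
                x0 = sum(xi[k] * phi0[k][p] for k in range(n_x))
                x1 = sum(xi[k] * phi1[k][p] for k in range(n_x))
                u = sum(zeta[k] * phiU[k][p] for k in range(n_u))
                curve.append(mean + x0 + t * x1 + u + rng.gauss(0.0, sigma))
            curves.append(curve)
        T.append(times)
        Y.append(curves)
    return grid, T, Y
```

Calling `grid, T, Y = simulate_lfpca()` yields 100 subjects with 4 visits and 120 sample points per curve. Because the same scores `xi` multiply both `phi0` and `phi1`, the intercept and slope of each subject are correlated through a single set of principal components, which is the feature that distinguishes this structure from fitting two independent functional components.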

Figures

Fig 1
Top: Sagittal image of the corpus callosum in one of the study subjects, a healthy 33-year-old man, showing the segmentation used [following 46] for construction of the tract profile. Values denote the bin number at the boundary point from the splenium (back of the head) to the genu/rostrum (closer to the eyes). Bottom: Two example subjects (both MS patients) from the tractography data with 5 and 6 complete visits, respectively. Shown is the fractional anisotropy along the corpus callosum, measured at the 120 sample points. Different visits for the same subject are indicated by color and overlaid.
Fig 2
True and estimated eigenfunctions $\phi_k^X = (\phi_k^0, \phi_k^1)$ and $\phi_k^U$, $k = 1, \ldots, 4$. The left column gives results for the part $\phi_k^0$ corresponding to the random functional intercept $X_{i,0}$, the middle column for the part $\phi_k^1$ corresponding to the random functional slope $X_{i,1}$, and the right column for the component $\phi_k^U$ corresponding to the visit-specific functional deviation $U_{ij}$. Shown are the true function (thick black line), the mean of the estimated functions over 1000 simulations (dashed red line), the pointwise 5th and 95th percentiles of the estimated functions (blue), and the estimated functions from the first 50 simulations (grey). Simulations were based on model (2.3) with $N_X = N_U = 4$, a balanced design with $I = 100$ and $J_i = 4$ for all $i$, non-orthogonal $\phi_k^0$, $\phi_k^1$ and $\phi_k^U$ with unequal weight on $\phi_k^1$ and $\phi_k^0$, a mixture distribution for the scores $\xi_{ij}$ and $\zeta_{ijk}$, and no bivariate smoothing of the covariance functions.
Fig 3
Computation time for LFPCA for a simulated data set with the given number of subjects I and number of observations per subject J, and with D sample points per curve. Specifics of how computation time was measured are given in Section 4.3.
Fig 4
The first three estimated principal components for the random intercept and slope process $X$. The left column gives estimates for the $\phi_k^0$, corresponding to the random functional intercept $X_{i,0}$. Depicted are estimates for the overall mean $\eta(d)$ (solid line), and for $\eta(d) \pm 2\lambda_k \phi_k^0$, $k = 1, 2, 3$ (+ and −, respectively). The middle column gives the corresponding results for the random functional slope $X_{i,1}$. The right column shows boxplots for the estimates of the scores $\xi_{ik}$ corresponding to $(\phi_k^0, \phi_k^1)$, $k = 1, 2, 3$, by case/control group. Estimated scores for the two example patients with tract profiles shown in Figure 1 are indicated by A and B, respectively.
Fig 5
The first three estimated principal components for the visit-specific deviation process $U$. The left column gives results for the principal components $\phi_k^U$, depicting estimates for the overall mean $\eta(d)$ (solid line), and for $\eta(d) \pm 2\nu_k \phi_k^U$, $k = 1, 2, 3$ (+ and −, respectively). The right column shows boxplots for the estimates of the scores $\zeta_{ik}$ corresponding to $\phi_k^U$, $k = 1, 2, 3$, by case/control group. Estimated scores for example visits of the two patients with tract profiles shown in Figure 1 are indicated by A (visit 8) and B (visit 2), respectively.

Source: PubMed
