TwinMARM: two-stage multiscale adaptive regression methods for twin neuroimaging data

Yimei Li, John H Gilmore, Jiaping Wang, Martin Styner, Weili Lin, Hongtu Zhu

Abstract

Twin imaging studies have been valuable for understanding the relative contributions of environment and genes to brain structures and their functions. Conventional analyses of twin imaging data include three sequential steps: spatially smoothing the imaging data, independently fitting a structural equation model at each voxel, and finally correcting for multiple comparisons. However, conventional analyses are limited by the use of the same amount of smoothing throughout the whole image, the arbitrary choice of smoothing extent, and the decreased power to detect environmental and genetic effects that results from smoothing raw images. The goal of this paper is to develop a two-stage multiscale adaptive regression method (TwinMARM) for spatial and adaptive analysis of twin neuroimaging and behavioral data. The first stage establishes the relationship between twin imaging data and a set of covariates of interest, such as age and gender. The second stage disentangles the environmental and genetic influences on brain structures and their functions. In each stage, TwinMARM employs hierarchically nested spheres with increasing radii at each location and then captures spatial dependence among imaging observations via consecutively connected spheres across all voxels. Simulation studies show that TwinMARM significantly outperforms conventional analyses of twin imaging data. Finally, we use our method to detect statistically significant effects of genetic and environmental variations on white matter structures in a neonatal twin study.

Figures

Fig. 1
Diagram for the structural equation model for twin data. The correlation of additive effects (a1, a2) is 1 for MZ twins and 0.5 for DZ twins. The correlation of dominant effects (d1, d2) is 1 for MZ twins and 0.25 for DZ twins. The twins share the same common environmental effect (c). Residual effects (e1, e2) for the twins are not correlated.
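The correlations listed in this caption determine the variance and within-pair covariance of the two twins' measurements. The following is a minimal sketch of that arithmetic, assuming the usual additive decomposition in which the a, d, c, and e effects are mutually independent; the function name `twin_covariance` and its arguments are hypothetical and not taken from the paper.

```python
def twin_covariance(var_a, var_d, var_c, var_e, zygosity):
    """Return (variance, within-pair covariance) implied by the diagram.

    var_a, var_d, var_c, var_e are the variances of the additive genetic,
    dominant genetic, common environmental, and residual effects.
    Assumption: the four effects are mutually independent.
    """
    if zygosity == "MZ":
        corr_a, corr_d = 1.0, 1.0    # MZ twins: a and d fully correlated
    elif zygosity == "DZ":
        corr_a, corr_d = 0.5, 0.25   # DZ twins: correlations from the caption
    else:
        raise ValueError("zygosity must be 'MZ' or 'DZ'")

    # Each twin carries all four components.
    variance = var_a + var_d + var_c + var_e
    # c is shared by the pair; e is uncorrelated, so it does not contribute.
    covariance = corr_a * var_a + corr_d * var_d + var_c
    return variance, covariance


mz_var, mz_cov = twin_covariance(1.0, 0.0, 0.5, 0.5, "MZ")
dz_var, dz_cov = twin_covariance(1.0, 0.0, 0.5, 0.5, "DZ")
```

With no dominance variance, the MZ and DZ covariances differ by half of the additive genetic variance, which is the contrast that lets twin designs separate genetic from shared environmental effects.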
Fig. 2
Diagram for the TwinMARM method for twin data. The first stage is to estimate β = {β(v) : v ∈ V}, while the second stage is to estimate η = {η(v) : v ∈ V}. In each stage, we reformulate the problem of estimating β (or η) as a regression model and then apply the multiscale adaptive regression method (MARM) to spatially and adaptively calculate β̂(v; h_s) and η̂(v; h_s) and their associated test statistics. Moreover, y_ij denotes the imaging measures for twin pairs, x_ij is a vector of clinical variables, β is the vector of unknown regression parameters, r_ij are the residuals obtained from the first stage, η includes the variances of the additive genetic, common environmental, and residual effects, and z_ij is the design matrix for the second stage. ω(v, v′; h_s) is the weight of the first stage, whereas ω_e(v, v′; h_s) is the weight of the second stage.
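To make the two-stage idea concrete, here is a rough single-voxel sketch in Python with NumPy. It is not the TwinMARM algorithm: it omits the multiscale adaptive weights entirely and swaps in plain least squares at both stages, with the second stage regressing squared and cross products of first-stage residuals on a design built from the MZ and DZ correlations (a method-of-moments stand-in). The function `fit_voxel_two_stage`, its arguments, and the exact form of the second-stage design are assumptions made for illustration.

```python
import numpy as np


def fit_voxel_two_stage(y, x, is_mz):
    """Unweighted two-stage fit at one voxel (illustrative only).

    y:     (n_pairs, 2) imaging measure for each twin in each pair
    x:     (n_pairs, 2, p) covariates for each twin
    is_mz: (n_pairs,) boolean, True for MZ pairs, False for DZ pairs
    Returns beta (p,) and eta = (var_a, var_c, var_e).
    """
    n_pairs, _, p = x.shape

    # Stage 1: ordinary least squares of y on x, pooling both twins.
    beta, *_ = np.linalg.lstsq(x.reshape(-1, p), y.reshape(-1), rcond=None)
    resid = y - x @ beta  # (n_pairs, 2) first-stage residuals

    # Stage 2: moment regression of residual products on a design matrix.
    # E[r_j^2]  = var_a + var_c + var_e
    # E[r_1r_2] = corr_a * var_a + var_c, with corr_a = 1 (MZ) or 0.5 (DZ)
    corr_a = np.where(is_mz, 1.0, 0.5)
    ones = np.ones(n_pairs)
    zeros = np.zeros(n_pairs)
    z_square = np.column_stack([ones, ones, ones])
    z_cross = np.column_stack([corr_a, ones, zeros])
    z = np.vstack([z_square, z_square, z_cross])
    target = np.concatenate(
        [resid[:, 0] ** 2, resid[:, 1] ** 2, resid[:, 0] * resid[:, 1]]
    )
    eta, *_ = np.linalg.lstsq(z, target, rcond=None)
    return beta, eta
```

Both MZ and DZ pairs must be present for the second-stage design to have full rank; with only one zygosity, the additive genetic and common environmental variances cannot be separated. Unconstrained least squares can also return negative variance estimates in small samples, which a real analysis would need to handle.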
Fig. 3
Results from a simulation study comparing EM and TwinMARM at 2 different sample sizes (n = 100, 400). The first row contains the results for σ_a²(v) for n = 100: Panel (A) is the ground truth image of σ_a²(v), in which five ROIs with black, blue, red, yellow, and white color represent σ_a²(v) = 0, 0.3, 0.6, 0.9, and 1.2, respectively. Panel (B) is a selected slice of σ̂_a²(v; h_10) obtained from a simulated dataset by using the TwinMARM method. Panel (C) is a selected slice of σ̂_a²(v) obtained from the same dataset as panel (B) by using the voxel-wise EM method. Panel (D) is a selected slice of σ̂_a²(v) obtained by using EM, after smoothing the same simulated dataset as panel (B). The second row contains panels (E), (F), and (G) for n = 400, which are the corresponding results of panels (B), (C), and (D), respectively.
Fig. 4
Results from a simulation study comparing EM and TwinMARM at 2 different sample sizes (n = 100, 400). The first row contains the results for β_1(v) for n = 100: Panel (A) is the ground truth image of β_1(v), in which five ROIs with black, blue, red, yellow, and white color represent β_1(v) = 0, 0.5, 1.0, 1.5, and 2.0, respectively. Panel (B) is a selected slice of β̂_1(v; h_10) obtained from a simulated dataset by using TwinMARM. Panel (C) is a selected slice of β̂_1(v) obtained from the same simulated dataset as panel (B) by using EM. Panel (D) is a selected slice of β̂_1(v) obtained by using EM, after smoothing the same simulated dataset as panel (B). The second row contains panels (E), (F), and (G) for n = 400, which are the corresponding results of panels (B), (C), and (D), respectively.
Fig. 5
Results from a simulation study comparing EM and TwinMARM at 2 different sample sizes (n = 100, 400). The first row contains the results for σ_a²(v) for n = 100: Panel (A) is the bias curve for σ_a²(v) = 0, 0.3, 0.6, 0.9, and 1.2, respectively. Panel (B) is the SD curve of σ̂_a²(v; h_10) obtained from a simulated dataset by using TwinMARM, EM, and EM with smoothed data. Panel (C) is the ratio of RMS to SD. The second row contains panels (D), (E), and (F) for n = 400, which are the corresponding results of panels (A), (B), and (C), respectively.
Fig. 6
Results from a simulation study comparing EM and TwinMARM at 2 different sample sizes (n = 100, 400). The first row contains the results for β_1(v) for n = 100: Panel (A) is the bias curve for β_1(v) = 0, 0.5, 1.0, 1.5, and 2.0, respectively. Panel (B) is the SD curve of β̂(v) obtained from a simulated dataset by using TwinMARM, EM, and EM with smoothed data. Panel (C) is the ratio of RMS to SD. The second row contains panels (D), (E), and (F) for n = 400, which are the corresponding results of panels (A), (B), and (C), respectively.
Fig. 7
Simulation results for W_A(v; h): rejection rates for pixels inside the five ROIs, obtained by using TwinMARM at the h_10 scale, EM, and EM after smoothing the same simulated data, at 2 different sample sizes (n = 100, 400) and α = 5%. For each case, 1,000 simulated datasets were used.
Fig. 8
Results from the 49 twin pairs in a neonatal project on brain development for the selected 27th and 30th slices. Panels (A)–(F): the −log10(p) values for testing genetic effects by using TwinMARM at the 1st and 10th iterations and EM with FWHM equal to 0 mm, 3 mm, 6 mm, and 9 mm for the 27th slice; Panels (A′)–(F′): the corresponding −log10(p) values for testing environmental effects for the 30th slice.

Source: PubMed
