Population structure in admixed populations: effect of admixture dynamics on the pattern of linkage disequilibrium

C L Pfaff, E J Parra, C Bonilla, K Hiester, P M McKeigue, M I Kamboh, R G Hutchinson, R E Ferrell, E Boerwinkle, M D Shriver

Abstract

Gene flow between genetically distinct populations creates linkage disequilibrium (admixture linkage disequilibrium [ALD]) among all loci (linked and unlinked) that have different allele frequencies in the founding populations. We have explored the distribution of ALD by using computer simulation of two extreme models of admixture: the hybrid-isolation (HI) model, in which admixture occurs in a single generation, and the continuous-gene-flow (CGF) model, in which admixture occurs at a steady rate in every generation. Linkage disequilibrium patterns in African American population samples from Jackson, MS, and from coastal South Carolina resemble patterns observed in the simulated CGF populations, in two respects. First, significant association between two loci (FY and AT3) separated by 22 cM was detected in both samples. The retention of ALD over relatively large (>10 cM) chromosomal segments is characteristic of a CGF pattern of admixture but not of an HI pattern. Second, significant associations were also detected between many pairs of unlinked loci, as observed in the CGF simulation results but not in the simulated HI populations. Such a high rate of association between unlinked markers in these populations could result in false-positive linkage signals in an admixture-mapping study. However, we demonstrate that by conditioning on parental admixture, we can distinguish between true linkage and association resulting from shared ancestry. Therefore, populations with a CGF history of admixture not only are appropriate for admixture mapping but also have greater power for detection of linkage disequilibrium over large chromosomal regions than do populations that have experienced a pattern of admixture more similar to the HI model, if methods are employed that detect and adjust for disequilibrium caused by continuous admixture.
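The contrast between the two models can be illustrated with a deterministic recursion for the expected two-locus disequilibrium. The Python sketch below is not the authors' simulation, which is stochastic and whose equations are not reproduced in this abstract. It assumes the textbook result that mixing two populations in proportions m and 1 - m creates disequilibrium m(1 - m)δAδB, that existing disequilibrium decays by a factor (1 - θ) per generation of random mating, and that gene flow in the CGF model comes from a single parental population at a constant rate alpha. The function names, the ordering of recombination and migration within a generation, and the use of θ = 0.05 for loci 5 cM apart are illustrative assumptions; the parameter values are those given in the caption to Figure 3.

```python
def hi_ld(delta_a, delta_b, m, theta, generations):
    """Expected LD per generation under hybrid isolation (one admixture pulse).

    Assumes D0 = m(1 - m) * delta_a * delta_b, followed by decay at
    rate (1 - theta) per generation of random mating.
    """
    d0 = m * (1.0 - m) * delta_a * delta_b
    return [d0 * (1.0 - theta) ** t for t in range(generations + 1)]


def cgf_ld(delta_a, delta_b, alpha, theta, generations):
    """Expected LD per generation under continuous gene flow (assumed recursion).

    The admixed population starts as one pure parental population (D = 0).
    Each generation, existing LD first decays through recombination, then a
    fraction alpha of migrants from the other parental population adds new LD.
    """
    diff_a, diff_b = delta_a, delta_b  # allele-frequency gaps to the migrant source
    d = 0.0
    trajectory = [d]
    for _ in range(generations):
        d *= (1.0 - theta)  # recombination erodes LD already present
        d = (1.0 - alpha) * d + alpha * (1.0 - alpha) * diff_a * diff_b
        diff_a *= (1.0 - alpha)  # admixed population drifts toward the source
        diff_b *= (1.0 - alpha)
        trajectory.append(d)
    return trajectory


if __name__ == "__main__":
    for label, theta in (("unlinked", 0.5), ("linked, theta = 0.05", 0.05)):
        hi = hi_ld(0.54, 0.49, 0.5, theta, 36)
        cgf = cgf_ld(0.54, 0.49, 0.019, theta, 36)
        print(f"{label}: HI D_36 = {hi[-1]:.6f}, CGF D_36 = {cgf[-1]:.6f}")
```

Under these assumptions the CGF trajectory retains more disequilibrium after 36 generations than the HI trajectory at both recombination fractions, which is the qualitative pattern the abstract describes.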

Figures

Figure 1
Schematic of the two models of admixture used in computer simulations (adapted from Long [1991])
Figure 2
Observed association statistics between PAAs for Jackson (A) and South Carolina (B). Each bar represents the association statistic observed between a pair of PAA markers listed in table 1. The FY-AT3 pair, linked at 22 cM, is shown as a black bar, marker pairs that are unlinked but in association (G>3.84, P<.05) are shown as hatched bars, and unlinked pairs that are not associated (P>.05) are shown as unblackened bars. Marker pairs containing the FY locus are indicated by asterisks (*).
Figure 3
Amount of ALD expected under each model of admixture for unlinked loci and loci linked at 5 cM. The results shown are for two loci with δ = .54 and .49, and with 50% admixture in the first generation, for the HI model, and 36 generations of 1.9% admixture, for the CGF model (equivalent to 50% total).
Figure 4
Average proportion of simulation rounds demonstrating significant association for HI (▒), CGF (▴), and combination (▪) models, for a panel of population-association alleles with characteristics of those listed in table 1. Simulations were run such that the total amount of admixture in each model was 17%. The simulation parameters were as follows: 1,000 rounds of simulation, population size 100,000, sample size 1,000, significance level 5%, and 15 generations of admixture (the combination model was run with 12 generations of CGF, followed by 3 generations of random mating).
Figure 5
Simulated results of Dt compared with calculated (D0) for both the HI and the CGF models. For these simulations, D0 is calculated according to equation (2), for both models of admixture. In this case, Dt is the observed LD value, rather than the theoretical value predicted by equation (1).
Figure 6
Comparison of observed Dt/D0 ratios for two African American sample populations, from Jackson (A) and South Carolina (B). Unlinked pairs of markers are shown as blackened circles (●), and the FY-AT3 pair is shown as an unblackened circle (○). D0 is calculated according to equation (2).
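
The captions to Figures 3 and 4 equate a per-generation admixture rate with a total admixture proportion. The Python sketch below works through that arithmetic, assuming one-way gene flow at a constant rate alpha, so that the fraction of original ancestry remaining after t generations is (1 - alpha)^t. The function names are hypothetical, and this compounding rule is an assumption rather than a formula quoted from the paper.

```python
def total_admixture(alpha, generations):
    """Cumulative admixture after repeated gene flow at rate alpha (assumed model)."""
    return 1.0 - (1.0 - alpha) ** generations


def per_generation_rate(total, generations):
    """Constant per-generation rate that accumulates to `total` admixture."""
    return 1.0 - (1.0 - total) ** (1.0 / generations)


if __name__ == "__main__":
    # Figure 3 caption: 36 generations at 1.9% per generation.
    print(f"total admixture: {total_admixture(0.019, 36):.3f}")
    # Figure 4 caption: 17% total admixture over 15 generations.
    print(f"per-generation rate: {per_generation_rate(0.17, 15):.4f}")
```

With the Figure 3 values, 36 generations at 1.9% gives a total close to the 50% stated in the caption.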

Source: PubMed
