DNA methylation mediates development of HbA1c-associated complications in type 1 diabetes

Zhuo Chen, Feng Miao, Barbara H Braffett, John M Lachin, Lingxiao Zhang, Xiwei Wu, Delnaz Roshandel, Melanie Carless, Xuejun Arthur Li, Joshua D Tompkins, John S Kaddis, Arthur D Riggs, Andrew D Paterson, DCCT/EDIC Study Group, Rama Natarajan, Barbara H Braffet, John M Lachin, Zhuo Chen, Feng Miao, Lingxiao Zhang, Rama Natarajan, Andrew D Paterson, Zhuo Chen, Feng Miao, Barbara H Braffett, John M Lachin, Lingxiao Zhang, Xiwei Wu, Delnaz Roshandel, Melanie Carless, Xuejun Arthur Li, Joshua D Tompkins, John S Kaddis, Arthur D Riggs, Andrew D Paterson, DCCT/EDIC Study Group, Rama Natarajan, Barbara H Braffet, John M Lachin, Zhuo Chen, Feng Miao, Lingxiao Zhang, Rama Natarajan, Andrew D Paterson

Abstract

Metabolic memory, the persistent benefits of early glycaemic control on preventing and/or delaying the development of diabetic complications, has been observed in the Diabetes Control and Complications Trial (DCCT) and in the Epidemiology of Diabetes Interventions and Complications (EDIC) follow-up study, but the underlying mechanisms remain unclear. Here, we show the involvement of epigenetic DNA methylation (DNAme) in metabolic memory by examining its associations with preceding glycaemic history, and with subsequent development of complications over an 18-yr period in the blood DNA of 499 randomly selected DCCT participants with type 1 diabetes who are also followed up in EDIC. We demonstrate the associations between DNAme near the closeout of DCCT and mean HbA1c during DCCT (mean-DCCT HbA1c) at 186 cytosine-guanine dinucleotides (CpGs) (FDR < 15%, including 43 at FDR < 5%), many of which were located in genes related to complications. Exploration studies into biological function reveal that these CpGs are enriched in binding sites for the C/EBP transcription factor, as well as enhancer/transcription regions in blood cells and haematopoietic stem cells, and open chromatin states in myeloid cells. Mediation analyses show that, remarkably, several CpGs in combination explain 68-97% of the association of mean-DCCT HbA1c with the risk of complications during EDIC. In summary, DNAme at key CpGs appears to mediate the association between hyperglycaemia and complications in metabolic memory, through modifying enhancer activity at myeloid and other cells.

Trial registration: ClinicalTrials.gov NCT00360815 NCT00360893.

Conflict of interest statement

Competing interests: All the authors declare that there are no competing interests associated with this manuscript.

Figures

Extended Data Fig. 1. Sensitivity analyses of…
Extended Data Fig. 1. Sensitivity analyses of additional clinical variables on the association significance between mean-DCCT HbA1c and DNAme at the HbA1c-assoc CpGs
For each HbA1c-assoc CpG, one additional clinical variable (indicated on top of each plot) was added as covariate in the multiple linear regression model used for the identification of HbA1c-assoc CpGs using all the samples (two-sided tests based on t-statistic, n=499). The resulting association significance in –log10p (y-axis) was compared with that obtained from the model without the specific variable (x-axis) and shown as one dot in the scatter plots. The blue line represents all the dots with same significance values. Abbreviations: SBP- systolic blood pressure; DBP-diastolic blood pressure; BMI-body mass index; CHL-cholesterol; TRG-triglyceride; LDL-low density lipoprotein; HDL-high density lipoprotein; PDR- proliferative diabetic retinopathy; SNPDR-severe nonproliferative diabetic retinopathy; CSME- clinically significant macular edema; AER30-AER > 30mg/24h; CARV-cardiovascular disease.
Extended Data Fig. 2. Sensitivity analyses of…
Extended Data Fig. 2. Sensitivity analyses of additional clinical variables on the percentage of association coefficients changes between mean-DCCT HbA1c and DNAme at the HbA1c-assoc CpGs.
For each HbA1c-assoc CpG, one additional clinical variable (indicated on top of each plot) was added as covariate in the multiple linear regression model with two-sided tests based on t-statistic used for identification of HbA1c-assoc CpGs using all the samples (n=499). For each clinical variable, the percentage change of the resulting association coefficients versus the coefficients estimated using the original model without the specific variable was calculated. For each additional variable, the distribution of the calculated changes was plotted using density plots. Abbreviations: COEF-coefficient; SBP- systolic blood pressure; DBP-diastolic blood pressure; BMI-body mass index; CHL-cholesterol; TRG-triglyceride; LDL-low density lipoprotein; HDL-high density lipoprotein; PDR- proliferative diabetic retinopathy; SNPDR-Severe nonproliferative diabetic retinopathy; CSME- clinically significant macular edema; AER30-AER > 30mg/24h; CARV-cardiovascular disease.
Extended Data Fig. 3. Validation of HbA1c-assoc…
Extended Data Fig. 3. Validation of HbA1c-assoc CpGs identified in DCCT499 in a meta-analysis of DCCT499 and Joslin195.
Fixed meta-analysis was performed using Bacon R package based on coefficients and standard error of CpGs obtained in both datasets (n=499 in DCCT499 and n=195 in Joslin195). Scatter plot is shown to compare the association estimates generated from meta-analysis and original models across 185 HbA1c-assoc CpGs, after excluding one HbA1c-assoc CpG not covered reliably in Joslin195 dataset due to detection p > 0.05 in at least one sample. Each dot represents one CpG with color indicating the significance levels obtained from meta-analysis. Red dots represent Bonferroni-adjusted p

Extended Data Fig. 4. Genomic locations and…

Extended Data Fig. 4. Genomic locations and IPA analyses of HbA1c-assoc genes.

a , Genomic…
Extended Data Fig. 4. Genomic locations and IPA analyses of HbA1c-assoc genes.
a, Genomic locations of HbA1c-assoc CpGs and non-associated CpGs relative to Refseq genes depicted using pie charts. TSS, transcription start site; TSS1500, distal promoter region from 1500bp upstream of TSS to TSS; TSS200, proximal promoter from 200bp upstream to 1500bp upstream of TSS; 5’UTR, 5’ untranslated region; 3’UTR, 3’ untranslated region; intergenic, genomic regions excluding TSS1500, TSS200 and gene body. b, Enriched canonical pathways (Benjamini-Hochberg adjusted P < 0.05) identified among annotated genes of HbA1c-assoc CpGs. Right-tailed Fisher’s exact test was used in IPA analysis. Total of 143 unique annotated genes were used as input for IPA, and 141 genes after excluding 2 that did not match in the IPA database were finally included in the analysis. Pathway names are listed on the left with corresponding significance level in –log10 form presented in the middle as bar plot. The pathways labeled in red font are known to be associated with diabetes and its complications. HbA1c-assoc genes identified in each pathway are shown on the right. Each row represents one pathway and each column represents one HbA1c-assoc gene found in the enriched pathway. Red indicates the corresponding specific gene (with name shown on the top of the column) that is involved in the pathway specified on the left of the panel. The majority of them contain several NFAT pathway genes, including TGFBR2, GNAI2, CACNB1, PLCB4, MEF2D, TGFB2, ADCY7, PRKD1 and CACNA1A.c, Top network related to cellular growth, proliferation and embryonic development. The 24 HbA1c-assoc genes labeled in red font are those identified to depict interactions with several proteins related to diabetes, its complications and insulin sensitivity including nuclear factor kappaB (NF-kB), protein kinaseB (PKB), phosphoinositide 3-kinase (PI3K), p38 mitogen-activated protein kinase (P38MAPK) and protein phosphatase2 catalytic (PPP2C) subunits.

Extended Data Fig. 5. Enrichment of 15…

Extended Data Fig. 5. Enrichment of 15 chromatin states at 186 HbA1c-assoc CpGs compared to…

Extended Data Fig. 5. Enrichment of 15 chromatin states at 186 HbA1c-assoc CpGs compared to all the non-HbA1c-assoc CpGs.
Fifteen genome-wide chromatin states were defined in 5 major blood cell types and hematopoietic stem cell (HSC) by the NIH roadmap Epigenetics Program (https://egg2.wustl.edu/roadmap/web_portal/chr_state_learning.html#core_15state for details). For each cell type, the percentage of CpGs located in the genomic regions depicting each chromatin state among HbA1c-assoc CpGs (black bars) versus all CpGs covered by the EPIC array (named as “all CpGs”, white bars) are shown side by side using bar plots. The enrichment significances of HbA1c-assoc CpGs versus the other CpGs were tested by right-tailed Fisher’s exact tests on 186 HbA1c-assoc CpGs vs. all the 815,246 non HbA1c-assoc CpGs reliably covered by the EPIC array in each specific chromatin state. **** p < 5e-08; *** p < 5e-05; ** p < 5e-03 and * p < 5e-02. States with statistically significant enrichment in at least one cell type are highlighted with colored background/shading: green for transcription-related (TxFlnk, Tx and TxWk) and yellow for enhancer-related (EnhG or Enh). The other states are shown in alternating grey and white shades for better visualization. Abbreviations: HSC-hematopoietic stem cell; TSS-transcription start site; TssA-Active TSS; TssAFlnk-Flanking Active TSS; TxFlnk-Transcribed state at gene 5’ and 3’; Tx-Strong transcription; TxWk-Weak transcription; EnhG-Genic enhancers; Enh-Enhancers; ZNF/Rpts- ZNF genes & repeats;Het- Heterochromatin; TssBiv- Bivalent/Poised TSS; BivFlnk- Flanking Bivalent TSS/Enh; EnhBiv- Bivalent Enhancer; ReprPC- Repressed PolyComb; ReprPCWk- Weak Repressed PolyComb; Quies-Quiescent/Low.

Extended Data Fig. 6. Enrichment of chromatin…

Extended Data Fig. 6. Enrichment of chromatin (epigenetic) states at HbA1c-assoc CpG sites in blood…

Extended Data Fig. 6. Enrichment of chromatin (epigenetic) states at HbA1c-assoc CpG sites in blood cells.
a, Enrichment of chromatin states in each cell-type at the 186 HbA1c-assoc CpGs. For each cell-type, the percentage of CpGs located in genomic regions enriched with 18 chromatin states (labeled under the plots) at HbA1c-assoc CpGs (dark bars) versus all CpGs covered by the EPIC array (white bars) in 4 major blood cell-types are shown side-by-side in the bar plots. Right-tailed Fisher’s exact tests were conducted to identify the significance of enrichment of each state in 186 HbA1c-assoc CpGs relative to all the 815,246 non HbA1c-assoc CpGs reliably covered by EPIC array. States with significant enrichment in at least one cell-type are highlighted with colored background/shading: green for transcription-related (Tx and TxWk) and yellow for enhancer-related states (EnhG1, EnhG2, EnhA1, EnhA2 or EnhWk). P values listed on top of each significantly enriched state.**** p< 5e-08; *** p< 5e-05; ** p< 5e-03 and * p< 5e-02. The other states are shown in alternating grey and white shades for better visualization. b, Enrichment of enhancer- or transcription-related states across 4 different cell-types in HbA1c-assoc CpGs versus all CpGs. The states for each plot are shown in the heading. P-values were determined using same tests and sample sizes as in panel a. c, Comparison of heatmaps of 15 chromatin states in 6 blood cell-types (top) with heatmaps of 18 states at 4 blood cell-types (bottom) at HbA1c-assoc CpGs. These states are defined by the NIH Roadmap Epigenetics Program. Unsupervised hierarchical analysis was performed on data from 15 states and visualized in the top panel using colors depicted in the boxed legend for 15 states (bottom left). 18-chromatin states at HbA1c-assoc CpGs, presented in the same order as 15 states, are shown in the lower panel with corresponding color legends in the bottom right box. Each row represents one cell-type and each column, one CpG.

Extended Data Fig. 7. Ribbon plots for…

Extended Data Fig. 7. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc…

Extended Data Fig. 7. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc CpGs) and the expression of genes located within 500kb (FDR + cells.
The associations of DNAme at HbA1c-assoc CpGs with the expression of gene (s) in monocytes or CD4+ T cells were analyzed on each pair of HbA1c-assoc CpGs and corresponding nearby genes within 500kb. Multiple linear regression models using two-sided tests based on t-statistic (n=1202 for monocytes and n=214 for CD4+ T cells) were applied to published datasets containing both DNAme and gene expression profiles from the same samples to identify HbA1c-assoc CpGs whose DNAme is associated with the expression of nearby gene (s) (expression-associated genes) with FDR < 0.05 in monocytes and CD4+ cells. The associations for each CpG in both monocytes (left) and CD4+ cells (right) are shown side-by-side within each panel. The order of CpGs is based on the significance level of the CpG with its most significantly expression-associated gene in both cell types. The height of the ribbon represents the significance level in –log10(p). Blue indicates negative association while red indicates positive association. Grey represents associations with nominal p < 0.05 but FDR >0.05. Most of the 9 common CpGs are in enhancers, suggesting a more ubiquitous regulatory role for DNAme at enhancers across different blood cell types.

Extended Data Fig. 8. Ribbon plots for…

Extended Data Fig. 8. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc…

Extended Data Fig. 8. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc CpGs) and expression of genes located within 500bp (FDR + cells).
The associations of DNAme at HbA1c-assoc CpGs with the expression of gene (s) in monocytes were analyzed on each pair of HbA1c-assoc CpGs and corresponding nearby genes within 500kb. Multiple linear regression models using two-sided tests based on t-statistic were applied to published datasets containing both DNAme and gene expression profiles from same monocyte samples (n=1202) to identify HbA1c-assoc CpGs whose DNAme is associated with the expression of nearby gene (s) (expression-associated genes) with FDR 0.05.

Extended Data Fig. 9. Ribbon plots for…

Extended Data Fig. 9. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc…

Extended Data Fig. 9. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc CpGs) and expression of genes located within 500bp (FDR + cells (not in monocytes).
The associations of DNAme at HbA1c-assoc CpGs with the expression of gene(s) in CD4+ T cells were analyzed on each pair of HbA1c-assoc CpGs and corresponding nearby genes within 500kb. Multiple linear regression models using two-sided tests based on t-statistic were applied to published datasets containing both DNAme and gene expression profiles from same CD4+ T cell samples (n=214) to identify HbA1c-assoc CpGs whose DNAme is associated with the expression of nearby gene(s) (expression-associated genes) with FDR < 0.05. The order of the CpGs is based on significance level of the CpG with its most significantly associated gene (expression data). The height of the ribbon represents the significance level in –log10(p). Blue indicates negative association while red indicates positive association. Grey represents the associations with nominal p < 0.05 but FDR >0.05.

Extended Data Fig. 10. Pipeline depicting the…

Extended Data Fig. 10. Pipeline depicting the step-wise sequence used for selecting CpGs for mediation…

Extended Data Fig. 10. Pipeline depicting the step-wise sequence used for selecting CpGs for mediation analyses of DNAme in between mean-DCCT HbA1c and complication development during EDIC.
At each selection step, the number of CpGs (n) is shown at the specified cut-off criteria. Arrows with same color reflect the same analytical step using the same model for each complication. The 4 steps as described in the Methods are specified on the right side of the figure. “n” represents number of CpGs identified at each step. In step 1, Model 1 was applied to each complication to analyze the association of mean-DCCT HbA1c with risk of complication development during EDIC (up to EDIC year 18). In step 2, Model 2 was applied to each complication to identify the HbA1c-assoc CpGs whose DNAme is associated with risk of complication development. In step 3, Model 3 was applied to each complication using similar model as Model 1 and Model 2, in which both DNAme and mean-DCCT HbA1c were now included. To identify the best combinations of CpGs to explain the association between HbA1c and complication development in step 4, only the top 10 CpGs (based on the association significance between risk of complication and DNAme identified using model 2 in step2) were considered in order to capture the major mediation effect, while maintaining reasonable computational complexity (which increases dramatically as the number of CpGs included increases using Brute-force approach).

Figure 1.. Identification of HbA1c-assoc CpGs.

a…

Figure 1.. Identification of HbA1c-assoc CpGs.

a , Study workflow for identification of CpGs whose…
Figure 1.. Identification of HbA1c-assoc CpGs.
a, Study workflow for identification of CpGs whose DNAme is associated with mean-DCCT HbA1c, subsequent exploration of biological functions of the identified CpGs, and mediation effect of DNAme on HbA1c-associated future complication development. b, Manhattan plot showing genome-wide association of DNAme with mean-DCCT HbA1c across 499 samples. Linear regression adjusted for covariates with two-sided test based on t-statistic was used to study the association of DNAme (M-values) with mean-DCCT HbA1c at each of the 815432 CpGs reliably covered by the Infinium MethylationEPIC array on all samples (n=499, see Methods). Each CpG is presented as a dot across genomic location (x-axis) of 24 chromosomes (alternating dark blue and orange). The significance level (in –log10) of each CpG is represented in the y-axis. 11 CpGs with Bonferroni-adjusted p < 0.05 (p=6.13e-08, red line) are shown as red dots, CpGs identified with FDR < 5% (blue line) as blue dots and those with FDR<15% (black line) as black dots. The top association CpG (cg19693031 at 3’UTR of TXNIP) is highlighted with red arrow. c&d. Comparison of the associations between DNAme and HbA1c in INT vs. CONV groups at 186 HbA1c-assoc CpGs by bubble plots. For each HbA1c-assoc CpG, linear regression model with addition of interaction term between DNAme and treatment (INT or CONV) was applied to the complete dataset (n=499) to identify CpGs having different associations of DNAme with HbA1c in INT vs. CONV. Each CpG is represented by one bubble dot. The size of the dot represents the significance level of interaction between DNAme and treatment (-log10Pinteraction) estimated by linear regression model. Pink represents the CpGs with Pinteraction<0.05 while blue represents CpGs with Pinteraction>0.05. In panel c, the x-axis represents significance level of association estimated by the same linear regression model described in panel b in the CONV group (n=249), and the y-axis represents the INT group (n=250). In panel d, x-axis represents the association coefficients (COEF) in CONV, and y-axis in the INT group. Significance levels obtained in all the multiple linear regression models in panels b-d were determined by two-sided tests based on t-statistic.

Figure 2.. Cross-cohort validation of HbA1c-assoc CpGs…

Figure 2.. Cross-cohort validation of HbA1c-assoc CpGs in DCCT41, SAFHS850 and Joslin195 and their persistence…

Figure 2.. Cross-cohort validation of HbA1c-assoc CpGs in DCCT41, SAFHS850 and Joslin195 and their persistence of DNAme in EDIC61.
a-c, Spearman correlation of HbA1c-assoc coefficients in DCCT499 (X-axis) versus log2 fold change (log2FC, Y-axis) between cases (history of hyperglycemia) and controls (history of normoglycemia) in the DCCT41 (A), FBG-associated coefficients in SAFHS850 (b), and associations of coefficients of HbA1c at sample collection in Joslin195 (c) across all the HbA1c-assoc CpGs identified in DCCT499 (FDR < 15%) and covered in each validation cohort. Each CpG is represented by one dot. Red/blue represents pos-assoc/neg-assoc CpGs with same trends of associations in both datasets respectively. Those with p < 0.05 in validation cohorts are shown as bigger dots. The remaining CpGs with discrepant association direction are shown in small green dots. The black diagonal line represents linear regression line. d, Heatmap on number (#) of validation cohorts in which each HbA1c-assoc CpG is validated. For each pos-assoc/neg-assoc CpG, total number of validation cohorts (one internal and two external cohorts) which covered the specific CpG and validated at different significance levels (same trend, nominal p < 0.05 or Bonferroni-adjusted p < 0.05) are summarized and depicted as heatmap with each column representing one CpG and each row representing one validation criterion as shown on the right side of the heatmap. Pos-assoc CpGs are shown in red and neg-assoc CpGs in blue as indicated in the color bar. The CpGs not depicting the same trends are shown as white box. The number of CpGs validated in at least 1, 2 or 3 cohorts at each criterion is summarized on the right side of the panel. e, Similar Spearman correlation between DCCT499 (X-axis) and EDIC61(Y-axis) as shown in panel a. f&g, DNAme differences between cases vs. controls in EDIC61 at 54 pos-assoc CpGs (f), and at 30 neg-assoc CpGs (g). Spearman correlation tests were used in panels a-c and e to determine the correlation coefficient and its significance using all the covered CpGs (n=84 in EDIC41, 77 in SAFHS850, 185 in Joslin195, and 84 in EDIC61). In each cohort, coefficients and significance levels of the association between DNAme and glucose-related variables at each covered CpGs were obtained by multiple linear regression model using all the samples in the corresponding cohort (n=41 in EDIC41, n=850 in SAFHS850, n=185 in Joslin195, and n=61 in EDIC61).

Figure 3.. Functional explorations of HbA1c-assoc CpGs.

Figure 3.. Functional explorations of HbA1c-assoc CpGs.

a , Heatmap of chromatin states within 5…
Figure 3.. Functional explorations of HbA1c-assoc CpGs.
a, Heatmap of chromatin states within 5 major peripheral blood cells and HSCs at 186 HbA1c-assoc CpGs. Fifteen genome-wide chromatin states defined by NIH roadmap Epigenetics Program (https://egg2.wustl.edu/roadmap/web_portal/chr_state_learning.html#core_15state for details) were presented using different colors as depicted in the legend (adjacent box). Each row represents one cell-type and each column represents one CpG. b, Enrichment of states related to enhancer regions (EnhG or Enh or EnhBiv) or transcription (TxFlnk, Tx and TxWk) across different cell types at HbA1c-assoc CpGs versus all CpGs. P values (right-tailed Fisher’s exact test comparing 186 HbA1c-assoc CpGs with all the 815,246 non HbA1c-assoc CpGs reliably covered by the EPIC array in each specific chromatin state) are shown on top of each cell type with the most significant ones emphasized by red boxes. TxFlnk: Flanking Active TSS; Tx: Strong transcription; TxWk: Weak transcription; EnhG: Genic enhancers; Enh: Enhancers; EnhBiv: Bivalent Enhancer. c, Enrichment of DNAse I hypersensitive regions containing HbA1c-assoc CpGs relative to all CpGs in blood cells based on the ENCODE Study (https://www.encodeproject.org). Cell types are classified into 3 categories (stem cells, myeloid progenitors and lymphoid progenitors) shown in different background colors. Each cell type is presented by a dot with –log10p as Y-axis. The color and size of the dots represent the FDR levels (Benjamini-Yekutieli-corrected Q value) as shown in the legend below. The binomial test was used in eFORGE to test 186 HbA1c-assoc CpGs compared to 1,000 matching background CpG sets, each containing the same number of CpGs with matching gene annotation and CpG island. d, The top 6 enriched motifs in the ±250bp regions of neg-assoc CpGs identified by RSAT analysis. For each motif, p value from binomial tests by comparing input sequences vs. background sequences, the corresponding Bonferroni-adjusted p-value (E-value), and number of input sequences containing the specific motif (n in parenthesis) are listed on top of the motif logo. Known JASPAR transcription factor (s) for the motif, if any, is/are listed below the corresponding logo. CEBPs are highlighted in red font. e, Scheme for the identification of promoters interacting with PIRs containing HbA1c-assoc CpGs in enhancers. f&g, IPA pathway (f) and the network (g) characterized using 224 genes identified by steps shown in panel e as input. Right-tailed Fisher’s Exact test on 179 genes (after excluding 45 without match in the IPA datatabase) was used to identify the enriched pathways at B-H adjusted p < 0.05 in panel f.

Figure 4.. Association of DNAme with gene…

Figure 4.. Association of DNAme with gene expression and genotypes.

a-e , The associations of…
Figure 4.. Association of DNAme with gene expression and genotypes.
a-e, The associations of DNAme at HbA1c-assoc CpGs with the expression of gene(s) in monocytes or CD4+ T cells were analyzed on each pair of HbA1c-assoc CpGs and its nearby genes within 500kb. Multiple linear regression models (two-sided tests based on t-stastistic, n=1202 for monocytes and n=214 for CD4+ T cells) were applied to published datasets containing both DNAme and gene expression profiles from same sample to identify HbA1c-assoc CpGs whose DNAme is associated with the expression of nearby gene(s) (expression-associated genes) with FDR < 0.05 in monocytes and CD4+ cells. a, Venn diagram of HbA1c-assoc CpGs with expression-associated genes in monocytes and CD4+ cells. b, Bar plots comparing the number (#) of CpGs whose most significant expression-associated gene is the same gene in which the corresponding CpG is located or not. c&d, Dot plots showing the association significance level between DNAme and gene expression at CpGs classified based on their genomic location relative to the expressed gene in monocytes (panel c) and CD4+ T cells (panel d). e, Heatmap of the number of pos-assoc or neg-assoc intronic CpGs with different chromatin states in monocytes and CD4+ cell types. Blue dots indicate neg-assoc CpGs and red dots indicate pos-assoc CpGs. Chromatin states related to transcription start site are represented as TSS, related to transcription as Tx, related to Enhancer as Enh, and quiescent state as Quies. f-i, Regional association plots of CpGs with methylation quantitative trait loci (meQTLs) identified genome-wide. The eight most significant independent HbA1c-assoc CpGs (p < 5e-08) were selected for meQTL analyses. meQTLs were identified from a GWAS by two-sided linear regression under an additive genetic model using DNAme and genotyping data of 474 European DCCT participants included in the current study. The 4 CpGs with statistically significant cis- (f-h) or trans-meQTLs (i) at Bonferroni-adjusted p < 0.05 (p < 5e-08) are shown, with corresponding CpGs listed on top of the plot. In each plot, each dot represents one SNP with left y-axis representing nominal association p value between SNPs and DNAme in –log10 format, right y-axis representing recombination rate, and x-axis representing chromosome coordinates (HG19). The most significant associated SNP is shown as a purple diamond while linkage disequilibrium (LD) between this and the other SNPs is shown in colors defined in the color scheme. The LD measures are based on 1000 Genomes Nov 2014 EUR population. The plot was created using LocusZoom http://locuszoom.sph.umich.edu/locuszoom.

Figure 5.. The association of mean-DCCT HbA1c…

Figure 5.. The association of mean-DCCT HbA1c and DNAme at DCCT closeout with retinopathy/DKD development…

Figure 5.. The association of mean-DCCT HbA1c and DNAme at DCCT closeout with retinopathy/DKD development during EDIC.
a, Survival plots of retinopathy (PDR, SNPDR and CSME) or DKD (AER300, GFR60) and cumulative distribution of normalized eGFR slopes during EDIC are stratified based on group. The follow-up period for retinopathy is through EDIC year 18 (interval censored data due to staggered clinic visit intervals, up to EDIC year 23) and, for DKD up to EDIC year 18. Each group (PRIM INT, SCND INT, PRIM CONV and SCND CONV) was plotted in different colors as indicated in the legend with 95% confidence interval (CI) depicted by the shaded area. b, Forest plots for association of mean-DCCT HbA1c (10% increase from previous time point) with risk of complications development (as indicated on the top of each panel), or with normalized eGFR slopes. Black dots with red lines represent either hazard ratios (HR) with 95% CI in forest plots, or coefficient (COEF) with standard error (SE) in dot plot. c, Forest plots for the association of DNAme with risk of complication development or with eGFR slope at HbA1c-assoc CpGs. The top 10 most significant HbA1c-assoc CpGs for each complication/manifestation are shown in order of significance level from high to low, each labeled with CpG ID on the left side and nominal p values listed on top of the plot. FDR is estimated by BH-adjustment over 186 HbA1c-assoc CpGs. *** FDR < 5%; ** FDR < 10%; * nominal P < 0.05. d, Percentage of association (mediation %) between history of HbA1c and complications that is explained by DNAme at the indicated HbA1c-assoc CpGs alone (each labeled with CpG ID), or in combination, using mediation analyses. The best combination of CpGs (indicated by red bar) which explains the association between mean DCCT HbA1c and risk of complications development for each indicated complication was identified from the top 10 CpGs among the HbA1c-assoc CpGs associated with complications. The explanation percentages of each CpG (up to 10 CpGs) in the combination are also presented as blue bars. cg19693031 located in TXNIP-3’UTR is shown in pink font. In a-d, participants who did not develop the corresponding retinopathies or DKDs during DCCT were included in the analyses for each complication (PDR: n=482; SNPDR: n=473; CSME: n=464; AER300: n=485; and GFR60: n=498). For eGFR slope, all the participants (n=499) were included in the analyses. In panel b-d, two-sided CoxPH for analyses related to each complication and linear regression models for eGFR slope adjusting for covariates were applied. See Methods for details.

Figure 6.. Functional exploration of DNAme at…

Figure 6.. Functional exploration of DNAme at the TXNIP genomic region.

a , Genomic location…
Figure 6.. Functional exploration of DNAme at the TXNIP genomic region.
a, Genomic location of the depicted region at/near TXNIP. b, Manhattan plots of association between mean-DCCT HbA1c and DNAme at the covered CpGs. Multiple linear regression models adjusting for covariates were applied to each CpG across all the samples (two-sided tests based on t-statistic, n=499). X-axis represents CpG location and Y-axis represents –log(p). Two HbA1c-assoc regions are highlighted with pink background. c, Heatmap showing the 15 chromatin states (top) and 18 states (bottom). Colors are shown in the legends (panel j). d, Ribbon plots representing the genomic interactions identified by IHEC PCHi-C data. Red represents the interactions involved in PIRs containing TXNIP 3’UTR. Blue represents interactions containing TXNIP promoter. Height of each ribbon represents the average CHiCAGO score (Y-axis) of the corresponding interaction across 5 major blood cells including monocytes, neutrophils, CD4+ T-cells, CD8+ T-cells and B-cells. e, Annotations of the RefSeq genes in the depicted region. f, Pearson correlation of DNAme at 33 CpGs in TXNIP and its 25kb flanking region. The associations of DNAme with mean-DCCT HbA1c at each CpG are obtained by the same linear regression model (n=499) indicated in panel b and shown as a Manhattan plot in the upper panel. HbA1c-assoc CpGs (FDR < 15%) are labeled in red font with the most significant cg19693031 underlined. Pair-wise correlations of DNAme of these CpGs were analyzed by Pearson correlation with coefficients shown as heatmap. Blue/red represents negative/positive correlation, respectively. The majority of the CpGs depict positive correlations including all 9 HbA1c-assoc CpGs (in red), while 5 CpGs (in blue) showed little to negative correlation with other CpGs. g, Ribbon plot representing the association of DNAme at cg19693031with expression of its nearby genes (including TXNIP) in monocytes. Multiple linear regression models adjusting for covariates based were applied to DNAme at cg19693031and expression of each nearby gene within 500 bp distance (n=1202 monocytes) to identify DNAme-assoc genes with nominal p < 0.05 (two-sided tests based on t-statistic). Blue indicates negative association at FDR < 0.05, while grey indicates associations with p < 0.05. h, In-vitro DNA hypomethylation at 3 CpGs in the 3’UTR of TXNIP (including cg19693031) induced by high glucose (HG) treatment of human primary bone marrow CD34+ cells. DNAme was measured by amplicon-seq. i, Upregulation of TXNIP expression induced by HG treatment of human primary bone marrow CD34+ cells. Cells were cultured in medium (see Methods) containing 25mmol/L glucose (control), or same medium with the addition of 20mmol/L glucose (45mmol/L total) for 72 hours (HG treatment). RT-PCRs were performed in triplicate, and the data shown represent means of triplicates from one experiment. j, Color codes of chromatin states in the heatmaps for 15 and 18 chromatin states shown in panel c.
All figures (16)
Extended Data Fig. 4. Genomic locations and…
Extended Data Fig. 4. Genomic locations and IPA analyses of HbA1c-assoc genes.
a, Genomic locations of HbA1c-assoc CpGs and non-associated CpGs relative to Refseq genes depicted using pie charts. TSS, transcription start site; TSS1500, distal promoter region from 1500bp upstream of TSS to TSS; TSS200, proximal promoter from 200bp upstream to 1500bp upstream of TSS; 5’UTR, 5’ untranslated region; 3’UTR, 3’ untranslated region; intergenic, genomic regions excluding TSS1500, TSS200 and gene body. b, Enriched canonical pathways (Benjamini-Hochberg adjusted P < 0.05) identified among annotated genes of HbA1c-assoc CpGs. Right-tailed Fisher’s exact test was used in IPA analysis. Total of 143 unique annotated genes were used as input for IPA, and 141 genes after excluding 2 that did not match in the IPA database were finally included in the analysis. Pathway names are listed on the left with corresponding significance level in –log10 form presented in the middle as bar plot. The pathways labeled in red font are known to be associated with diabetes and its complications. HbA1c-assoc genes identified in each pathway are shown on the right. Each row represents one pathway and each column represents one HbA1c-assoc gene found in the enriched pathway. Red indicates the corresponding specific gene (with name shown on the top of the column) that is involved in the pathway specified on the left of the panel. The majority of them contain several NFAT pathway genes, including TGFBR2, GNAI2, CACNB1, PLCB4, MEF2D, TGFB2, ADCY7, PRKD1 and CACNA1A.c, Top network related to cellular growth, proliferation and embryonic development. The 24 HbA1c-assoc genes labeled in red font are those identified to depict interactions with several proteins related to diabetes, its complications and insulin sensitivity including nuclear factor kappaB (NF-kB), protein kinaseB (PKB), phosphoinositide 3-kinase (PI3K), p38 mitogen-activated protein kinase (P38MAPK) and protein phosphatase2 catalytic (PPP2C) subunits.
Extended Data Fig. 5. Enrichment of 15…
Extended Data Fig. 5. Enrichment of 15 chromatin states at 186 HbA1c-assoc CpGs compared to all the non-HbA1c-assoc CpGs.
Fifteen genome-wide chromatin states were defined in 5 major blood cell types and hematopoietic stem cell (HSC) by the NIH roadmap Epigenetics Program (https://egg2.wustl.edu/roadmap/web_portal/chr_state_learning.html#core_15state for details). For each cell type, the percentage of CpGs located in the genomic regions depicting each chromatin state among HbA1c-assoc CpGs (black bars) versus all CpGs covered by the EPIC array (named as “all CpGs”, white bars) are shown side by side using bar plots. The enrichment significances of HbA1c-assoc CpGs versus the other CpGs were tested by right-tailed Fisher’s exact tests on 186 HbA1c-assoc CpGs vs. all the 815,246 non HbA1c-assoc CpGs reliably covered by the EPIC array in each specific chromatin state. **** p < 5e-08; *** p < 5e-05; ** p < 5e-03 and * p < 5e-02. States with statistically significant enrichment in at least one cell type are highlighted with colored background/shading: green for transcription-related (TxFlnk, Tx and TxWk) and yellow for enhancer-related (EnhG or Enh). The other states are shown in alternating grey and white shades for better visualization. Abbreviations: HSC-hematopoietic stem cell; TSS-transcription start site; TssA-Active TSS; TssAFlnk-Flanking Active TSS; TxFlnk-Transcribed state at gene 5’ and 3’; Tx-Strong transcription; TxWk-Weak transcription; EnhG-Genic enhancers; Enh-Enhancers; ZNF/Rpts- ZNF genes & repeats;Het- Heterochromatin; TssBiv- Bivalent/Poised TSS; BivFlnk- Flanking Bivalent TSS/Enh; EnhBiv- Bivalent Enhancer; ReprPC- Repressed PolyComb; ReprPCWk- Weak Repressed PolyComb; Quies-Quiescent/Low.
Extended Data Fig. 6. Enrichment of chromatin…
Extended Data Fig. 6. Enrichment of chromatin (epigenetic) states at HbA1c-assoc CpG sites in blood cells.
a, Enrichment of chromatin states in each cell-type at the 186 HbA1c-assoc CpGs. For each cell-type, the percentage of CpGs located in genomic regions enriched with 18 chromatin states (labeled under the plots) at HbA1c-assoc CpGs (dark bars) versus all CpGs covered by the EPIC array (white bars) in 4 major blood cell-types are shown side-by-side in the bar plots. Right-tailed Fisher’s exact tests were conducted to identify the significance of enrichment of each state in 186 HbA1c-assoc CpGs relative to all the 815,246 non HbA1c-assoc CpGs reliably covered by EPIC array. States with significant enrichment in at least one cell-type are highlighted with colored background/shading: green for transcription-related (Tx and TxWk) and yellow for enhancer-related states (EnhG1, EnhG2, EnhA1, EnhA2 or EnhWk). P values listed on top of each significantly enriched state.**** p< 5e-08; *** p< 5e-05; ** p< 5e-03 and * p< 5e-02. The other states are shown in alternating grey and white shades for better visualization. b, Enrichment of enhancer- or transcription-related states across 4 different cell-types in HbA1c-assoc CpGs versus all CpGs. The states for each plot are shown in the heading. P-values were determined using same tests and sample sizes as in panel a. c, Comparison of heatmaps of 15 chromatin states in 6 blood cell-types (top) with heatmaps of 18 states at 4 blood cell-types (bottom) at HbA1c-assoc CpGs. These states are defined by the NIH Roadmap Epigenetics Program. Unsupervised hierarchical analysis was performed on data from 15 states and visualized in the top panel using colors depicted in the boxed legend for 15 states (bottom left). 18-chromatin states at HbA1c-assoc CpGs, presented in the same order as 15 states, are shown in the lower panel with corresponding color legends in the bottom right box. Each row represents one cell-type and each column, one CpG.
Extended Data Fig. 7. Ribbon plots for…
Extended Data Fig. 7. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc CpGs) and the expression of genes located within 500kb (FDR + cells.
The associations of DNAme at HbA1c-assoc CpGs with the expression of gene (s) in monocytes or CD4+ T cells were analyzed on each pair of HbA1c-assoc CpGs and corresponding nearby genes within 500kb. Multiple linear regression models using two-sided tests based on t-statistic (n=1202 for monocytes and n=214 for CD4+ T cells) were applied to published datasets containing both DNAme and gene expression profiles from the same samples to identify HbA1c-assoc CpGs whose DNAme is associated with the expression of nearby gene (s) (expression-associated genes) with FDR < 0.05 in monocytes and CD4+ cells. The associations for each CpG in both monocytes (left) and CD4+ cells (right) are shown side-by-side within each panel. The order of CpGs is based on the significance level of the CpG with its most significantly expression-associated gene in both cell types. The height of the ribbon represents the significance level in –log10(p). Blue indicates negative association while red indicates positive association. Grey represents associations with nominal p < 0.05 but FDR >0.05. Most of the 9 common CpGs are in enhancers, suggesting a more ubiquitous regulatory role for DNAme at enhancers across different blood cell types.
Extended Data Fig. 8. Ribbon plots for…
Extended Data Fig. 8. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc CpGs) and expression of genes located within 500bp (FDR + cells).
The associations of DNAme at HbA1c-assoc CpGs with the expression of gene (s) in monocytes were analyzed on each pair of HbA1c-assoc CpGs and corresponding nearby genes within 500kb. Multiple linear regression models using two-sided tests based on t-statistic were applied to published datasets containing both DNAme and gene expression profiles from same monocyte samples (n=1202) to identify HbA1c-assoc CpGs whose DNAme is associated with the expression of nearby gene (s) (expression-associated genes) with FDR 0.05.
Extended Data Fig. 9. Ribbon plots for…
Extended Data Fig. 9. Ribbon plots for the association between DNAme at candidate loci (HbA1c-assoc CpGs) and expression of genes located within 500bp (FDR + cells (not in monocytes).
The associations of DNAme at HbA1c-assoc CpGs with the expression of gene(s) in CD4+ T cells were analyzed on each pair of HbA1c-assoc CpGs and corresponding nearby genes within 500kb. Multiple linear regression models using two-sided tests based on t-statistic were applied to published datasets containing both DNAme and gene expression profiles from same CD4+ T cell samples (n=214) to identify HbA1c-assoc CpGs whose DNAme is associated with the expression of nearby gene(s) (expression-associated genes) with FDR < 0.05. The order of the CpGs is based on significance level of the CpG with its most significantly associated gene (expression data). The height of the ribbon represents the significance level in –log10(p). Blue indicates negative association while red indicates positive association. Grey represents the associations with nominal p < 0.05 but FDR >0.05.
Extended Data Fig. 10. Pipeline depicting the…
Extended Data Fig. 10. Pipeline depicting the step-wise sequence used for selecting CpGs for mediation analyses of DNAme in between mean-DCCT HbA1c and complication development during EDIC.
At each selection step, the number of CpGs (n) is shown at the specified cut-off criteria. Arrows with same color reflect the same analytical step using the same model for each complication. The 4 steps as described in the Methods are specified on the right side of the figure. “n” represents number of CpGs identified at each step. In step 1, Model 1 was applied to each complication to analyze the association of mean-DCCT HbA1c with risk of complication development during EDIC (up to EDIC year 18). In step 2, Model 2 was applied to each complication to identify the HbA1c-assoc CpGs whose DNAme is associated with risk of complication development. In step 3, Model 3 was applied to each complication using similar model as Model 1 and Model 2, in which both DNAme and mean-DCCT HbA1c were now included. To identify the best combinations of CpGs to explain the association between HbA1c and complication development in step 4, only the top 10 CpGs (based on the association significance between risk of complication and DNAme identified using model 2 in step2) were considered in order to capture the major mediation effect, while maintaining reasonable computational complexity (which increases dramatically as the number of CpGs included increases using Brute-force approach).
Figure 1.. Identification of HbA1c-assoc CpGs.
Figure 1.. Identification of HbA1c-assoc CpGs.
a, Study workflow for identification of CpGs whose DNAme is associated with mean-DCCT HbA1c, subsequent exploration of biological functions of the identified CpGs, and mediation effect of DNAme on HbA1c-associated future complication development. b, Manhattan plot showing genome-wide association of DNAme with mean-DCCT HbA1c across 499 samples. Linear regression adjusted for covariates with two-sided test based on t-statistic was used to study the association of DNAme (M-values) with mean-DCCT HbA1c at each of the 815432 CpGs reliably covered by the Infinium MethylationEPIC array on all samples (n=499, see Methods). Each CpG is presented as a dot across genomic location (x-axis) of 24 chromosomes (alternating dark blue and orange). The significance level (in –log10) of each CpG is represented in the y-axis. 11 CpGs with Bonferroni-adjusted p < 0.05 (p=6.13e-08, red line) are shown as red dots, CpGs identified with FDR < 5% (blue line) as blue dots and those with FDR<15% (black line) as black dots. The top association CpG (cg19693031 at 3’UTR of TXNIP) is highlighted with red arrow. c&d. Comparison of the associations between DNAme and HbA1c in INT vs. CONV groups at 186 HbA1c-assoc CpGs by bubble plots. For each HbA1c-assoc CpG, linear regression model with addition of interaction term between DNAme and treatment (INT or CONV) was applied to the complete dataset (n=499) to identify CpGs having different associations of DNAme with HbA1c in INT vs. CONV. Each CpG is represented by one bubble dot. The size of the dot represents the significance level of interaction between DNAme and treatment (-log10Pinteraction) estimated by linear regression model. Pink represents the CpGs with Pinteraction<0.05 while blue represents CpGs with Pinteraction>0.05. In panel c, the x-axis represents significance level of association estimated by the same linear regression model described in panel b in the CONV group (n=249), and the y-axis represents the INT group (n=250). In panel d, x-axis represents the association coefficients (COEF) in CONV, and y-axis in the INT group. Significance levels obtained in all the multiple linear regression models in panels b-d were determined by two-sided tests based on t-statistic.
Figure 2.. Cross-cohort validation of HbA1c-assoc CpGs…
Figure 2.. Cross-cohort validation of HbA1c-assoc CpGs in DCCT41, SAFHS850 and Joslin195 and their persistence of DNAme in EDIC61.
a-c, Spearman correlation of HbA1c-assoc coefficients in DCCT499 (X-axis) versus log2 fold change (log2FC, Y-axis) between cases (history of hyperglycemia) and controls (history of normoglycemia) in the DCCT41 (A), FBG-associated coefficients in SAFHS850 (b), and associations of coefficients of HbA1c at sample collection in Joslin195 (c) across all the HbA1c-assoc CpGs identified in DCCT499 (FDR < 15%) and covered in each validation cohort. Each CpG is represented by one dot. Red/blue represents pos-assoc/neg-assoc CpGs with same trends of associations in both datasets respectively. Those with p < 0.05 in validation cohorts are shown as bigger dots. The remaining CpGs with discrepant association direction are shown in small green dots. The black diagonal line represents linear regression line. d, Heatmap on number (#) of validation cohorts in which each HbA1c-assoc CpG is validated. For each pos-assoc/neg-assoc CpG, total number of validation cohorts (one internal and two external cohorts) which covered the specific CpG and validated at different significance levels (same trend, nominal p < 0.05 or Bonferroni-adjusted p < 0.05) are summarized and depicted as heatmap with each column representing one CpG and each row representing one validation criterion as shown on the right side of the heatmap. Pos-assoc CpGs are shown in red and neg-assoc CpGs in blue as indicated in the color bar. The CpGs not depicting the same trends are shown as white box. The number of CpGs validated in at least 1, 2 or 3 cohorts at each criterion is summarized on the right side of the panel. e, Similar Spearman correlation between DCCT499 (X-axis) and EDIC61(Y-axis) as shown in panel a. f&g, DNAme differences between cases vs. controls in EDIC61 at 54 pos-assoc CpGs (f), and at 30 neg-assoc CpGs (g). Spearman correlation tests were used in panels a-c and e to determine the correlation coefficient and its significance using all the covered CpGs (n=84 in EDIC41, 77 in SAFHS850, 185 in Joslin195, and 84 in EDIC61). In each cohort, coefficients and significance levels of the association between DNAme and glucose-related variables at each covered CpGs were obtained by multiple linear regression model using all the samples in the corresponding cohort (n=41 in EDIC41, n=850 in SAFHS850, n=185 in Joslin195, and n=61 in EDIC61).
Figure 3.. Functional explorations of HbA1c-assoc CpGs.
Figure 3.. Functional explorations of HbA1c-assoc CpGs.
a, Heatmap of chromatin states within 5 major peripheral blood cells and HSCs at 186 HbA1c-assoc CpGs. Fifteen genome-wide chromatin states defined by NIH roadmap Epigenetics Program (https://egg2.wustl.edu/roadmap/web_portal/chr_state_learning.html#core_15state for details) were presented using different colors as depicted in the legend (adjacent box). Each row represents one cell-type and each column represents one CpG. b, Enrichment of states related to enhancer regions (EnhG or Enh or EnhBiv) or transcription (TxFlnk, Tx and TxWk) across different cell types at HbA1c-assoc CpGs versus all CpGs. P values (right-tailed Fisher’s exact test comparing 186 HbA1c-assoc CpGs with all the 815,246 non HbA1c-assoc CpGs reliably covered by the EPIC array in each specific chromatin state) are shown on top of each cell type with the most significant ones emphasized by red boxes. TxFlnk: Flanking Active TSS; Tx: Strong transcription; TxWk: Weak transcription; EnhG: Genic enhancers; Enh: Enhancers; EnhBiv: Bivalent Enhancer. c, Enrichment of DNAse I hypersensitive regions containing HbA1c-assoc CpGs relative to all CpGs in blood cells based on the ENCODE Study (https://www.encodeproject.org). Cell types are classified into 3 categories (stem cells, myeloid progenitors and lymphoid progenitors) shown in different background colors. Each cell type is presented by a dot with –log10p as Y-axis. The color and size of the dots represent the FDR levels (Benjamini-Yekutieli-corrected Q value) as shown in the legend below. The binomial test was used in eFORGE to test 186 HbA1c-assoc CpGs compared to 1,000 matching background CpG sets, each containing the same number of CpGs with matching gene annotation and CpG island. d, The top 6 enriched motifs in the ±250bp regions of neg-assoc CpGs identified by RSAT analysis. For each motif, p value from binomial tests by comparing input sequences vs. background sequences, the corresponding Bonferroni-adjusted p-value (E-value), and number of input sequences containing the specific motif (n in parenthesis) are listed on top of the motif logo. Known JASPAR transcription factor (s) for the motif, if any, is/are listed below the corresponding logo. CEBPs are highlighted in red font. e, Scheme for the identification of promoters interacting with PIRs containing HbA1c-assoc CpGs in enhancers. f&g, IPA pathway (f) and the network (g) characterized using 224 genes identified by steps shown in panel e as input. Right-tailed Fisher’s Exact test on 179 genes (after excluding 45 without match in the IPA datatabase) was used to identify the enriched pathways at B-H adjusted p < 0.05 in panel f.
Figure 4.. Association of DNAme with gene…
Figure 4.. Association of DNAme with gene expression and genotypes.
a-e, The associations of DNAme at HbA1c-assoc CpGs with the expression of gene(s) in monocytes or CD4+ T cells were analyzed on each pair of HbA1c-assoc CpGs and its nearby genes within 500kb. Multiple linear regression models (two-sided tests based on t-stastistic, n=1202 for monocytes and n=214 for CD4+ T cells) were applied to published datasets containing both DNAme and gene expression profiles from same sample to identify HbA1c-assoc CpGs whose DNAme is associated with the expression of nearby gene(s) (expression-associated genes) with FDR < 0.05 in monocytes and CD4+ cells. a, Venn diagram of HbA1c-assoc CpGs with expression-associated genes in monocytes and CD4+ cells. b, Bar plots comparing the number (#) of CpGs whose most significant expression-associated gene is the same gene in which the corresponding CpG is located or not. c&d, Dot plots showing the association significance level between DNAme and gene expression at CpGs classified based on their genomic location relative to the expressed gene in monocytes (panel c) and CD4+ T cells (panel d). e, Heatmap of the number of pos-assoc or neg-assoc intronic CpGs with different chromatin states in monocytes and CD4+ cell types. Blue dots indicate neg-assoc CpGs and red dots indicate pos-assoc CpGs. Chromatin states related to transcription start site are represented as TSS, related to transcription as Tx, related to Enhancer as Enh, and quiescent state as Quies. f-i, Regional association plots of CpGs with methylation quantitative trait loci (meQTLs) identified genome-wide. The eight most significant independent HbA1c-assoc CpGs (p < 5e-08) were selected for meQTL analyses. meQTLs were identified from a GWAS by two-sided linear regression under an additive genetic model using DNAme and genotyping data of 474 European DCCT participants included in the current study. The 4 CpGs with statistically significant cis- (f-h) or trans-meQTLs (i) at Bonferroni-adjusted p < 0.05 (p < 5e-08) are shown, with corresponding CpGs listed on top of the plot. In each plot, each dot represents one SNP with left y-axis representing nominal association p value between SNPs and DNAme in –log10 format, right y-axis representing recombination rate, and x-axis representing chromosome coordinates (HG19). The most significant associated SNP is shown as a purple diamond while linkage disequilibrium (LD) between this and the other SNPs is shown in colors defined in the color scheme. The LD measures are based on 1000 Genomes Nov 2014 EUR population. The plot was created using LocusZoom http://locuszoom.sph.umich.edu/locuszoom.
Figure 5.. The association of mean-DCCT HbA1c…
Figure 5.. The association of mean-DCCT HbA1c and DNAme at DCCT closeout with retinopathy/DKD development during EDIC.
a, Survival plots of retinopathy (PDR, SNPDR and CSME) or DKD (AER300, GFR60) and cumulative distribution of normalized eGFR slopes during EDIC are stratified based on group. The follow-up period for retinopathy is through EDIC year 18 (interval censored data due to staggered clinic visit intervals, up to EDIC year 23) and, for DKD up to EDIC year 18. Each group (PRIM INT, SCND INT, PRIM CONV and SCND CONV) was plotted in different colors as indicated in the legend with 95% confidence interval (CI) depicted by the shaded area. b, Forest plots for association of mean-DCCT HbA1c (10% increase from previous time point) with risk of complications development (as indicated on the top of each panel), or with normalized eGFR slopes. Black dots with red lines represent either hazard ratios (HR) with 95% CI in forest plots, or coefficient (COEF) with standard error (SE) in dot plot. c, Forest plots for the association of DNAme with risk of complication development or with eGFR slope at HbA1c-assoc CpGs. The top 10 most significant HbA1c-assoc CpGs for each complication/manifestation are shown in order of significance level from high to low, each labeled with CpG ID on the left side and nominal p values listed on top of the plot. FDR is estimated by BH-adjustment over 186 HbA1c-assoc CpGs. *** FDR < 5%; ** FDR < 10%; * nominal P < 0.05. d, Percentage of association (mediation %) between history of HbA1c and complications that is explained by DNAme at the indicated HbA1c-assoc CpGs alone (each labeled with CpG ID), or in combination, using mediation analyses. The best combination of CpGs (indicated by red bar) which explains the association between mean DCCT HbA1c and risk of complications development for each indicated complication was identified from the top 10 CpGs among the HbA1c-assoc CpGs associated with complications. The explanation percentages of each CpG (up to 10 CpGs) in the combination are also presented as blue bars. cg19693031 located in TXNIP-3’UTR is shown in pink font. In a-d, participants who did not develop the corresponding retinopathies or DKDs during DCCT were included in the analyses for each complication (PDR: n=482; SNPDR: n=473; CSME: n=464; AER300: n=485; and GFR60: n=498). For eGFR slope, all the participants (n=499) were included in the analyses. In panel b-d, two-sided CoxPH for analyses related to each complication and linear regression models for eGFR slope adjusting for covariates were applied. See Methods for details.
Figure 6.. Functional exploration of DNAme at…
Figure 6.. Functional exploration of DNAme at the TXNIP genomic region.
a, Genomic location of the depicted region at/near TXNIP. b, Manhattan plots of association between mean-DCCT HbA1c and DNAme at the covered CpGs. Multiple linear regression models adjusting for covariates were applied to each CpG across all the samples (two-sided tests based on t-statistic, n=499). X-axis represents CpG location and Y-axis represents –log(p). Two HbA1c-assoc regions are highlighted with pink background. c, Heatmap showing the 15 chromatin states (top) and 18 states (bottom). Colors are shown in the legends (panel j). d, Ribbon plots representing the genomic interactions identified by IHEC PCHi-C data. Red represents the interactions involved in PIRs containing TXNIP 3’UTR. Blue represents interactions containing TXNIP promoter. Height of each ribbon represents the average CHiCAGO score (Y-axis) of the corresponding interaction across 5 major blood cells including monocytes, neutrophils, CD4+ T-cells, CD8+ T-cells and B-cells. e, Annotations of the RefSeq genes in the depicted region. f, Pearson correlation of DNAme at 33 CpGs in TXNIP and its 25kb flanking region. The associations of DNAme with mean-DCCT HbA1c at each CpG are obtained by the same linear regression model (n=499) indicated in panel b and shown as a Manhattan plot in the upper panel. HbA1c-assoc CpGs (FDR < 15%) are labeled in red font with the most significant cg19693031 underlined. Pair-wise correlations of DNAme of these CpGs were analyzed by Pearson correlation with coefficients shown as heatmap. Blue/red represents negative/positive correlation, respectively. The majority of the CpGs depict positive correlations including all 9 HbA1c-assoc CpGs (in red), while 5 CpGs (in blue) showed little to negative correlation with other CpGs. g, Ribbon plot representing the association of DNAme at cg19693031with expression of its nearby genes (including TXNIP) in monocytes. Multiple linear regression models adjusting for covariates based were applied to DNAme at cg19693031and expression of each nearby gene within 500 bp distance (n=1202 monocytes) to identify DNAme-assoc genes with nominal p < 0.05 (two-sided tests based on t-statistic). Blue indicates negative association at FDR < 0.05, while grey indicates associations with p < 0.05. h, In-vitro DNA hypomethylation at 3 CpGs in the 3’UTR of TXNIP (including cg19693031) induced by high glucose (HG) treatment of human primary bone marrow CD34+ cells. DNAme was measured by amplicon-seq. i, Upregulation of TXNIP expression induced by HG treatment of human primary bone marrow CD34+ cells. Cells were cultured in medium (see Methods) containing 25mmol/L glucose (control), or same medium with the addition of 20mmol/L glucose (45mmol/L total) for 72 hours (HG treatment). RT-PCRs were performed in triplicate, and the data shown represent means of triplicates from one experiment. j, Color codes of chromatin states in the heatmaps for 15 and 18 chromatin states shown in panel c.

References

    1. Nathan DM et al. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. N Engl J Med 329, 977–986, doi:10.1056/NEJM199309303291401 (1993).
    1. Nathan DM The Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications study at 30 years: overview. Diabetes Care 37, 9–16, doi:10.2337/dc13-2112 (2014).
    1. Lachin JM et al. Effect of intensive diabetes therapy on the progression of diabetic retinopathy in patients with type 1 diabetes: 18 years of follow-up in the DCCT/EDIC. Diabetes 64, 631–642, doi:10.2337/db14-0930 (2015).
    1. de Boer IH et al. Intensive diabetes therapy and glomerular filtration rate in type 1 diabetes. N Engl J Med 365, 2366–2376, doi:10.1056/NEJMoa1111732 (2011).
    1. DCCT/EDIC research group. Effect of intensive diabetes treatment on albuminuria in type 1 diabetes: long-term follow-up of the Diabetes Control and Complications Trial and Epidemiology of Diabetes Interventions and Complications study. Lancet Diabetes Endocrinol 2, 793–800, doi:10.1016/S2213-8587(14)70155-X (2014).
    1. Kato M & Natarajan R Epigenetics and epigenomics in diabetic kidney disease and metabolic memory. Nat Rev Nephrol, doi:10.1038/s41581-019-0135-6 (2019).
    1. Reddy MA, Zhang E & Natarajan R Epigenetic mechanisms in diabetic complications and metabolic memory. Diabetologia 58, 443–455, doi:10.1007/s00125-014-3462-y (2015).
    1. Cooper ME & El-Osta A Epigenetics: mechanisms and implications for diabetic complications. Circ Res 107, 1403–1413, doi:10.1161/CIRCRESAHA.110.223552 (2010).
    1. Bird A Perceptions of epigenetics. Nature 447, 396–398, doi:10.1038/nature05913 (2007).
    1. Russo VEA, Martienssen RA & Riggs AD Epigenetic mechanisms of gene regulation. (Cold Spring Harbor Laboratory Press, 1996).
    1. Jirtle RL & Skinner MK Environmental epigenomics and disease susceptibility. Nat Rev Genet 8, 253–262, doi:10.1038/nrg2045 (2007).
    1. Simmons R Epigenetics and maternal nutrition: nature v. nurture. Proc Nutr Soc 70, 73–81, doi:10.1017/S0029665110003988 (2011).
    1. Susztak K Understanding the epigenetic syntax for the genetic alphabet in the kidney. J Am Soc Nephrol 25, 10–17, doi:10.1681/ASN.2013050461 (2014).
    1. Rosen ED et al. Epigenetics and Epigenomics: Implications for Diabetes and Obesity. Diabetes 67, 1923–1931, doi:10.2337/db18-0537 (2018).
    1. Sandholm N et al. The Genetic Landscape of Renal Complications in Type 1 Diabetes. J Am Soc Nephrol 28, 557–574, doi:10.1681/ASN.2016020231 (2017).
    1. Pollack S et al. Multiethnic Genome-Wide Association Study of Diabetic Retinopathy Using Liability Threshold Modeling of Duration of Diabetes and Glycemic Control. Diabetes 68, 441–456, doi:10.2337/db18-0567 (2019).
    1. Hosseini SM et al. The association of previously reported polymorphisms for microvascular complications in a meta-analysis of diabetic retinopathy. Hum Genet 134, 247–257, doi:10.1007/s00439-014-1517-2 (2015).
    1. Ko YA et al. Cytosine methylation changes in enhancer regions of core pro-fibrotic genes characterize kidney fibrosis development. Genome Biol 14, R108, doi:10.1186/gb-2013-14-10-r108 (2013).
    1. Wing MR et al. DNA methylation profile associated with rapid decline in kidney function: findings from the CRIC study. Nephrol Dial Transplant 29, 864–872, doi:10.1093/ndt/gft537 (2014).
    1. Chu AY et al. Epigenome-wide association studies identify DNA methylation associated with kidney function. Nat Commun 8, 1286, doi:10.1038/s41467-017-01297-7 (2017).
    1. Qiu C et al. Cytosine methylation predicts renal function decline in American Indians. Kidney Int 93, 1417–1431, doi:10.1016/j.kint.2018.01.036 (2018).
    1. Miao F et al. Evaluating the role of epigenetic histone modifications in the metabolic memory of type 1 diabetes. Diabetes 63, 1748–1762, doi:10.2337/db13-1251 (2014).
    1. Chen G et al. Aberrant DNA methylation of mTOR pathway genes promotes inflammatory activation of immune cells in diabetic kidney disease. Kidney Int, doi:10.1016/j.kint.2019.02.020 (2019).
    1. Gluck C et al. Kidney cytosine methylation changes improve renal function decline estimation in patients with diabetic kidney disease. Nat Commun 10, 2461, doi:10.1038/s41467-019-10378-8 (2019).
    1. Salem RM et al. Genome-Wide Association Study of Diabetic Kidney Disease Highlights Biology Involved in Glomerular Basement Membrane Collagen. J Am Soc Nephrol 30, 2000–2016, doi:10.1681/ASN.2019030218 (2019).
    1. Chen Z et al. Epigenomic profiling reveals an association between persistence of DNA methylation and metabolic memory in the DCCT/EDIC type 1 diabetes cohort. Proc Natl Acad Sci U S A 113, E3002–3011, doi:10.1073/pnas.1603712113 (2016).
    1. Shalev A Minireview: Thioredoxin-interacting protein: regulation and function in the pancreatic beta-cell. Mol Endocrinol 28, 1211–1220, doi:10.1210/me.2014-1095 (2014).
    1. De Marinis Y et al. Epigenetic regulation of the thioredoxin-interacting protein (TXNIP) gene by hyperglycemia in kidney. Kidney Int 89, 342–353, doi:10.1016/j.kint.2015.12.018 (2016).
    1. Kumar A & Mittal R Mapping Txnip: Key connexions in progression of diabetic nephropathy. Pharmacol Rep 70, 614–622, doi:10.1016/j.pharep.2017.12.008 (2018).
    1. Singh LP Thioredoxin Interacting Protein (TXNIP) and Pathogenesis of Diabetic Retinopathy. J Clin Exp Ophthalmol 4, doi:10.4172/2155-9570.1000287 (2013).
    1. Rich SS et al. A genome-wide association scan for acute insulin response to glucose in Hispanic-Americans: the Insulin Resistance Atherosclerosis Family Study (IRAS FS). Diabetologia 52, 1326–1333, doi:10.1007/s00125-009-1373-0 (2009).
    1. Neve B et al. Role of transcription factor KLF11 and its diabetes-associated gene variants in pancreatic beta cell function. Proc Natl Acad Sci U S A 102, 4807–4812, doi:10.1073/pnas.0409177102 (2005).
    1. Fernandez-Zapico ME et al. MODY7 gene, KLF11, is a novel p300-dependent regulator of Pdx-1 (MODY4) transcription in pancreatic islet beta cells. J Biol Chem 284, 36482–36490, doi:10.1074/jbc.M109.028852 (2009).
    1. Park IY et al. Dual Chromatin and Cytoskeletal Remodeling by SETD2. Cell 166, 950–962, doi:10.1016/j.cell.2016.07.005 (2016).
    1. Kulkarni H et al. Novel epigenetic determinants of type 2 diabetes in Mexican-American families. Hum Mol Genet 24, 5330–5344, doi:10.1093/hmg/ddv232 (2015).
    1. Yamanouchi M et al. Improved clinical trial enrollment criterion to identify patients with diabetes at risk of end-stage renal disease. Kidney Int 92, 258–266, doi:10.1016/j.kint.2017.02.010 (2017).
    1. Toperoff G et al. Genome-wide survey reveals predisposing diabetes type 2-related DNA methylation variations in human peripheral blood. Hum Mol Genet 21, 371–383, doi:10.1093/hmg/ddr472 (2012).
    1. Jones PA Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat Rev Genet 13, 484–492, doi:10.1038/nrg3230 (2012).
    1. Kundaje A et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330, doi:10.1038/nature14248 (2015).
    1. Thurman RE et al. The accessible chromatin landscape of the human genome. Nature 489, 75–82, doi:10.1038/nature11232 (2012).
    1. Avellino R & Delwel R Expression and regulation of C/EBPalpha in normal myelopoiesis and in malignant transformation. Blood 129, 2083–2091, doi:10.1182/blood-2016-09-687822 (2017).
    1. Javierre BM et al. Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters. Cell 167, 1369–1384 e1319, doi:10.1016/j.cell.2016.09.037 (2016).
    1. Rickels R & Shilatifard A Enhancer Logic and Mechanics in Development and Disease. Trends Cell Biol 28, 608–630, doi:10.1016/j.tcb.2018.04.003 (2018).
    1. Reynolds LM et al. Age-related variations in the methylome associated with gene expression in human monocytes and T cells. Nat Commun 5, 5366, doi:10.1038/ncomms6366 (2014).
    1. Roshandel D et al. Meta-genome-wide association studies identify a locus on chromosome 1 and multiple variants in the MHC region for serum C-peptide in type 1 diabetes. Diabetologia 61, 1098–1111, doi:10.1007/s00125-018-4555-9 (2018).
    1. Paterson AD et al. A genome-wide association study identifies a novel major locus for glycemic control in type 1 diabetes, as measured by both A1C and glucose. Diabetes 59, 539–549, doi:10.2337/db09-0653 (2010).
    1. Hainsworth DP et al. Risk Factors for Retinopathy in Type 1 Diabetes: The DCCT/EDIC Study. Diabetes Care 42, 875–882, doi:10.2337/dc18-2308 (2019).
    1. Perkins BA et al. Risk Factors for Kidney Disease in Type 1 Diabetes. Diabetes Care 42, 883–890, doi:10.2337/dc18-2062 (2019).
    1. Kriebel J et al. Association between DNA Methylation in Whole Blood and Measures of Glucose Metabolism: KORA F4 Study. PLoS One 11, e0152314, doi:10.1371/journal.pone.0152314 (2016).
    1. Ronn T et al. Impact of age, BMI and HbA1c levels on the genome-wide DNA methylation and mRNA expression patterns in human adipose tissue and identification of epigenetic biomarkers in blood. Hum Mol Genet 24, 3792–3813, doi:10.1093/hmg/ddv124 (2015).
    1. Hidalgo B et al. Epigenome-wide association study of fasting measures of glucose, insulin, and HOMA-IR in the Genetics of Lipid Lowering Drugs and Diet Network study. Diabetes 63, 801–807, doi:10.2337/db13-1100 (2014).
    1. Walaszczyk E et al. DNA methylation markers associated with type 2 diabetes, fasting glucose and HbA1c levels: a systematic review and replication in a case-control sample of the Lifelines study. Diabetologia 61, 354–368, doi:10.1007/s00125-017-4497-7 (2018).
    1. Chambers JC et al. Epigenome-wide association of DNA methylation markers in peripheral blood from Indian Asians and Europeans with incident type 2 diabetes: a nested case-control study. Lancet Diabetes Endocrinol 3, 526–534, doi:10.1016/S2213-8587(15)00127-8 (2015).
    1. Soriano-Tarraga C et al. Epigenome-wide association study identifies TXNIP gene associated with type 2 diabetes mellitus and sustained hyperglycemia. Hum Mol Genet 25, 609–619, doi:10.1093/hmg/ddv493 (2016).
    1. Cardona A et al. Epigenome-Wide Association Study of Incident Type 2 Diabetes in a British Population: EPIC-Norfolk Study. Diabetes 68, 2315–2326, doi:10.2337/db18-0290 (2019).
    1. Ye J et al. Identification of loci where DNA methylation potentially mediates genetic risk of type 1 diabetes. J Autoimmun 93, 66–75, doi:10.1016/j.jaut.2018.06.005 (2018).
    1. Vigorelli V et al. Abnormal DNA Methylation Induced by Hyperglycemia Reduces CXCR 4 Gene Expression in CD 34(+) Stem Cells. J Am Heart Assoc 8, e010012, doi:10.1161/JAHA.118.010012 (2019).
    1. Tsukada J, Yoshida Y, Kominato Y & Auron PE The CCAAT/enhancer (C/EBP) family of basic-leucine zipper (bZIP) transcription factors is a multifaceted highly-regulated system for gene regulation. Cytokine 54, 6–19, doi:10.1016/j.cyto.2010.12.019 (2011).
    1. Nagareddy PR et al. Hyperglycemia promotes myelopoiesis and impairs the resolution of atherosclerosis. Cell Metab 17, 695–708, doi:10.1016/j.cmet.2013.04.001 (2013).
    1. Mitroulis I et al. Modulation of Myelopoiesis Progenitors Is an Integral Component of Trained Immunity. Cell 172, 147–161 e112, doi:10.1016/j.cell.2017.11.034 (2018).
    1. Woroniecka KI et al. Transcriptome analysis of human diabetic kidney disease. Diabetes 60, 2354–2369, doi:10.2337/db10-1181 (2011).
    1. Kojima H, Kim J & Chan L Emerging roles of hematopoietic cells in the pathobiology of diabetic complications. Trends Endocrinol Metab 25, 178–187, doi:10.1016/j.tem.2014.01.002 (2014).
    1. The Diabetes Control and Complications Trial (DCCT). Design and methodologic considerations for the feasibility phase. The DCCT Research Group. Diabetes 35, 530–545 (1986).
    1. Fortin JP, Triche TJ Jr. & Hansen KD Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics 33, 558–560, doi:10.1093/bioinformatics/btw691 (2017).
    1. Aryee MJ et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369, doi:10.1093/bioinformatics/btu049 (2014).
    1. Houseman EA et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics 13, 86, doi:10.1186/1471-2105-13-86 (2012).
    1. Krueger F & Andrews SR Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572, doi:10.1093/bioinformatics/btr167 (2011).
    1. Niewczas MA et al. A signature of circulating inflammatory proteins and development of end-stage renal disease in diabetes. Nat Med 25, 805–813, doi:10.1038/s41591-019-0415-5 (2019).
    1. Hemani G et al. The MR-Base platform supports systematic causal inference across the human phenome. Elife 7, doi:10.7554/eLife.34408 (2018).

Source: PubMed

3
Sottoscrivi