Chromatin Accessibility Identifies Regulatory Elements Predictive of Gene Expression and Disease Outcome in Multiple Myeloma

Benjamin G Barwick, Vikas A Gupta, Shannon M Matulis, Jonathan C Patton, Doris R Powell, Yanyan Gu, David L Jaye, Karen N Conneely, Yin C Lin, Craig C Hofmeister, Ajay K Nooka, Jonathan J Keats, Sagar Lonial, Paula M Vertino, Lawrence H Boise

Abstract

Purpose: Multiple myeloma is a malignancy of plasma cells. Extensive genetic and transcriptional characterization of myeloma has identified subtypes with prognostic and therapeutic implications. In contrast, relatively little is known about the myeloma epigenome.

Experimental design: CD138+CD38+ myeloma cells were isolated from fresh bone marrow aspirate or from the same aspirate after freezing for 1-6 months. Gene expression and chromatin accessibility were compared between fresh and frozen samples by RNA sequencing (RNA-seq) and assay for transposase-accessible chromatin sequencing (ATAC-seq). Chromatin accessible regions were used to identify regulatory RNA expression in more than 700 samples from newly diagnosed patients in the Multiple Myeloma Research Foundation CoMMpass trial (NCT01454297).

Results: Gene expression and chromatin accessibility of cryopreserved myeloma recapitulated that of freshly isolated samples. ATAC-seq performed on a series of biobanked specimens identified thousands of chromatin accessible regions, hundreds of which were highly coordinated with gene expression. More than 4,700 of these chromatin accessible regions were transcribed in newly diagnosed myelomas from the CoMMpass trial. Regulatory element activity alone recapitulated myeloma gene expression subtypes, and in particular myeloma subtypes with immunoglobulin heavy chain translocations were defined by transcription of distal regulatory elements. Moreover, enhancer activity predicted oncogene expression, implicating gene regulatory mechanisms in aggressive myeloma.

Conclusions: These data demonstrate the feasibility of using biobanked specimens for retrospective studies of the myeloma epigenome and illustrate the unique enhancer landscapes of myeloma subtypes that are coupled to gene expression and disease progression.

©2021 American Association for Cancer Research.

Figures

Figure 1.
The mRNA content and chromatin accessibility of myeloma are maintained in cryopreserved specimens. A-B, Principal component analysis of mRNA expression (A) and chromatin accessibility (B). Principal components (PC) 1 and 2 are shown with the percent of variation explained by each component denoted in parentheses. C, mRNA-seq reads for CCND1, CCND2, and MYC genes. Note that specimens 1562 and 1563 were FISH positive for t(11;14). D, ATAC-seq data for fresh and frozen replicates at CCND2 (left) and MYC (right). Gray shading denotes regions of differences between the samples. The scale is either reads per million (RPM; C), log2(RPM+1) (A), reads per peak million (RPPM; D) or log2(RPPM+1) (B).
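The scales named in this caption can be illustrated with a minimal sketch. It assumes RPM means raw counts scaled to one million total reads, and that RPPM is the same calculation with the total restricted to reads falling in peaks; the caption does not spell out either definition, and the function names and example counts below are hypothetical.

```python
import math


def per_million(counts, total=None):
    """Scale raw read counts to reads per million.

    `total` is the library size used for scaling. If omitted, the sum of
    `counts` is used. Passing the number of reads in peaks as `total`
    gives an RPPM-style value (an assumption about how RPPM is defined).
    """
    if total is None:
        total = sum(counts.values())
    if total <= 0:
        raise ValueError("total read count must be positive")
    return {name: value * 1e6 / total for name, value in counts.items()}


def log2_plus_one(values):
    """Apply the log2(x + 1) transform used for the PCA panels."""
    return {name: math.log2(value + 1) for name, value in values.items()}


# Hypothetical raw counts for three genes in one sample.
raw_counts = {"CCND1": 250, "CCND2": 250, "MYC": 500}
rpm = per_million(raw_counts)
log_rpm = log2_plus_one(rpm)
```

The +1 pseudocount keeps features with zero reads defined after the log transform, since log2(0 + 1) is 0.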
Figure 2.
Chromatin accessibility clusters at highly expressed genes in myeloma. A, Heatmap of ATAC-seq data at 91,632 chromatin accessible autosomal regions in eight CD138+CD38+ myeloma specimens from bone marrow aspirates. Accessible regions were sorted from most (top) to least (bottom) accessible. B, ATAC-seq signal (reads per peak million; RPPM) in stretch regions of chromatin accessibility. C, Top 5 gene ontology biological processes for genes proximal to stretch regions of chromatin accessibility. D, Genome plot of stretch regions of chromatin accessibility at IRF4, CD38, and SLAMF7 shown for all eight samples.
Figure 3.
Chromatin accessibility and H3K27ac correspond with gene expression. A, Schematic of analysis where RNA expression (top) was correlated with regions of chromatin accessibility (ATAC; bottom) within 100 kb of the top 5% of variably expressed genes (see Supplementary Figure S3). B, Correlation of ATAC and RNA shown for the region identified by the red arrow in part A. C, Frequency of H3K27ac overlap and accessible regions that are negatively (Neg.) or positively (Pos.) associated with gene expression. The frequency of patients with overlapping H3K27ac enriched regions is shown in gray scale and P-values for significant differences are shown on top (Fisher’s exact test). D, Correlation (Pearson R) of H3K27ac level and proximal gene expression for regions that overlap chromatin accessibility which is negatively or positively associated with gene expression. P-values for significant differences in correlation distribution are shown on top (Mann-Whitney U test). E, cis-regulatory elements identified upstream of the promoter and in the first intron (see red arrow) of NEK6 using both chromatin accessibility (top; blue) and H3K27ac from Jin et al. (bottom; green). The scale is RPPM (ATAC) or RPM (ChIP-seq) and RNA expression is shown (right). F, Scatterplot of chromatin accessibility and gene expression (left) or H3K27ac enrichment and gene expression (right) for the loci shown in part E (see red arrow) with samples from Emory (blue) and Jin et al. (green) denoted by color.
Figure 4.
Regulatory element transcription reflects the myeloma gene expression program. A, Frequency of detected transcription at 13,452 regulatory elements (RE) in 768 newly diagnosed myeloma patients from the CoMMpass study. Detection for each sample and regulatory element is determined based on the actual signal compared to 1,000 permuted enhancers (P ≤0.01; see methods). B, Gene set enrichment analysis of the top pathway enriched at genes proximal to transcribed regulatory elements. C, Genome plot of the PIK3CB and MDM4 locus (left) and the FOS locus (right) with regulatory elements (grayed boxes) defined as overlapping regions with both H3K27ac ChIP-seq (from Jin et al.) and chromatin accessibility (ATAC) that are 500 bp from the TSS and 5 kb from TTS. Both stranded mRNA-seq data (Emory) and unstranded mRNA-seq data (CoMMpass) are shown. The scale for ChIP-seq is reads per million (RPM), ATAC-seq is reads per peak million (RPPM), and mRNA-seq is log2(RPM+1); all tracks represent a composite of all samples analyzed.
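The detection rule in panel A (observed signal compared against 1,000 permutations at P ≤0.01) can be sketched as an empirical permutation test. This is only an illustration under stated assumptions: the caption defers to the methods for the actual procedure, so the one-sided comparison, the add-one correction, and the function names here are all hypothetical choices rather than the authors' implementation.

```python
def empirical_p_value(observed, null_signals):
    """One-sided empirical P-value of `observed` against a null set.

    Counts how many null (permuted) signals are at least as large as the
    observed signal. The +1 in numerator and denominator is a common
    correction that avoids a P-value of exactly zero (an assumption, not
    taken from the source).
    """
    n = len(null_signals)
    if n == 0:
        raise ValueError("null_signals must not be empty")
    at_least_as_large = sum(1 for s in null_signals if s >= observed)
    return (at_least_as_large + 1) / (n + 1)


def is_detected(observed, null_signals, alpha=0.01):
    """Call a regulatory element transcribed if its empirical P <= alpha."""
    return empirical_p_value(observed, null_signals) <= alpha


# Hypothetical null distribution: signal at 1,000 permuted regions.
null = [float(i) for i in range(1000)]
strong = is_detected(995.0, null)   # only 5 null values are >= 995
weak = is_detected(900.0, null)     # 100 null values are >= 900
```

With 1,000 permutations and this correction, the smallest attainable P-value is 1/1001, so a threshold of 0.01 allows at most nine permuted regions to match or exceed the observed signal.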
Figure 5.
Distal regulatory elements define myeloma subtype gene expression. A, t-SNE analysis of gene expression data (left), transcription at promoter proximal (middle) and distal (right) regulatory elements. Samples are colored by myeloma gene expression subtype (key bottom). B, t-SNE distance between samples within a given subtype (denoted by color) for mRNA as well as promoter proximal and distal regulatory elements. ***P<0.001, **P<0.01, *P<0.05, NS: not significant; determined by a paired Mann-Whitney U test. C, Genome plot of regions distinctly regulated between myeloma subtypes near the CCND2 locus. Regulatory elements (REs) are denoted on top with cumulative ATAC (RPPM) and H3K27ac (RPM) signal shown and mean transcription is shown (log2(RPM+1)) for each myeloma subtype. Transcription for the shaded region is shown (right). D, Scatterplot of CCND2 expression and regulatory element transcription.
Figure 6.
Distal regulatory element activity predicts expression of genes prognostic of myeloma outcome. A, Volcano plot of gene expression (GX) associated with overall survival (OS). The y-axis represents the hazard ratio (HR) of OS given gene expression with genes significantly associated with poor outcome denoted in blue (less expression) or red (more expression). The significance line (FDR ≤0.01) is shown (dashed red line). B, Correlation (Pearson R) between transcribed distal regulatory elements and gene expression. C, Frequency of regulatory elements relative to the average gene. All expressed distal regulatory elements are denoted in gray and those that are positively associated with gene expression are shown in green (FDR ≤0.01). D, Top transcription factor consensus binding motifs enriched in distal regulatory elements that predict gene expression associated with poor outcome. Only the top factor for each family is shown (see Supplementary Data S6 for full results). NRF: Nuclear Respiratory Factor; ZF: Zinc Finger; NR: Nuclear Receptor; HLH: Helix-Loop-Helix; AP2: Activating Protein 2; IRF: Interferon Regulatory Factor. E, Overlap odds ratio of transcription factor binding motifs in regulatory elements positively associated with gene expression indicative of poor outcome relative to all transcribed regulatory elements (95% confidence intervals are shown). F, Overall survival (OS) hazard ratio of expression of the transcription factors (95% confidence intervals are shown). G, Genome plot of RUNX2 (denoted in red). The correlation of regulatory element transcription with gene expression (GX) is shown below the genes with the height of each line denoting the significance of correlation. Regulatory elements, and cumulative ATAC and H3K27ac signal are shown below. A specific distal regulatory region is enlarged as denoted by a black polygon with regulatory element transcription stratified by quartile of expression. H, Regulatory element transcription quartiles (left) and corresponding gene transcription for RUNX2. The Pearson correlation (R) and significance (FDR) of reRNA and RUNX2 expression are shown.

Source: PubMed
