RNA-seq analysis of prostate cancer in the Chinese population identifies recurrent gene fusions, cancer-associated long noncoding RNAs and aberrant alternative splicings

Shancheng Ren, Zhiyu Peng, Jian-Hua Mao, Yongwei Yu, Changjun Yin, Xin Gao, Zilian Cui, Jibin Zhang, Kang Yi, Weidong Xu, Chao Chen, Fubo Wang, Xinwu Guo, Ji Lu, Jun Yang, Min Wei, Zhijian Tian, Yinghui Guan, Liang Tang, Chuanliang Xu, Linhui Wang, Xu Gao, Wei Tian, Jian Wang, Huanming Yang, Jun Wang, Yinghao Sun

Abstract

There are remarkable disparities among patients of different races with prostate cancer; however, the mechanism underlying this difference remains unclear. Here, we present a comprehensive landscape of the transcriptome profiles of 14 primary prostate cancers and their paired normal counterparts from the Chinese population using RNA-seq, revealing tremendous diversity across prostate cancer transcriptomes with respect to gene fusions, long noncoding RNAs (long ncRNA), alternative splicing and somatic mutations. Three of the 14 tumors (21.4%) harbored a TMPRSS2-ERG fusion, and the low prevalence of this fusion in Chinese patients was further confirmed in an additional tumor set (10/54=18.5%). Notably, two novel gene fusions, CTAGE5-KHDRBS3 (20/54=37%) and USP9Y-TTTY15 (19/54=35.2%), occurred frequently in our patient cohort. Further systematic transcriptional profiling identified numerous long ncRNAs that were differentially expressed in the tumors. An analysis of the correlation between expression of long ncRNA and genes suggested that long ncRNAs may have functions beyond transcriptional regulation. This study yielded new insights into the pathogenesis of prostate cancer in the Chinese population.
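The prevalence figures quoted above follow directly from the reported counts. The short sketch below recomputes them; the counts and cohort sizes are taken from the abstract, while the dictionary labels and the helper name `prevalence_percent` are illustrative choices, not part of the study.

```python
# Fusion-positive counts and cohort sizes as reported in the abstract.
# The labels are illustrative; only the numbers come from the text.
cohorts = {
    "TMPRSS2-ERG (RNA-seq set)": (3, 14),
    "TMPRSS2-ERG (additional set)": (10, 54),
    "CTAGE5-KHDRBS3": (20, 54),
    "USP9Y-TTTY15": (19, 54),
}


def prevalence_percent(positive, total):
    """Return the share of fusion-positive tumors as a percentage (one decimal)."""
    return round(100.0 * positive / total, 1)


prevalence = {name: prevalence_percent(p, n) for name, (p, n) in cohorts.items()}

for name, pct in prevalence.items():
    print(f"{name}: {pct}%")
```

The recomputed values agree with the abstract: 21.4%, 18.5%, 37.0% (reported as 37%) and 35.2%.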

Figures

Figure 1
The landscape of gene fusions in prostate cancer. (A) A Circos plot of the genomic landscape of gene fusions discovered by RNA-seq in the 14 prostate cancer samples. The outer ring shows chromosome ideograms. The gene fusions are shown as arcs linking the two genomic loci, each colored according to the frequency with which the gene fusion was found in the 14 prostate cancer samples (red=3 and black=1). (B) The TMPRSS2-ERG fusion in three prostate cancers. The TMPRSS2-ERG fusion was between exon 1 of TMPRSS2 (red) and exon 4 of ERG (blue). The number of reliable paired-end and fusion-spanning reads in each sample is indicated to the right of each read. The sample ID is indicated in brackets. (C) The CTAGE5-KHDRBS3 fusion in one prostate cancer, revealed by one paired-end and one fusion-spanning read. The CTAGE5-KHDRBS3 fusion was between exon 23 of CTAGE5 (blue) and exon 8 of KHDRBS3 (red). (D) Representative experimental validation of the fusion gene transcript by RT-PCR and Sanger sequencing. (E) Prevalence of the TMPRSS2-ERG, USP9Y-TTTY15, CTAGE5-KHDRBS3, RAD50-PDLIM4, and SDK1-AMACR fusions in prostate cancer. (F) Interphase FISH on formalin-fixed, paraffin-embedded tissue confirming the fusion of SDK1 and AMACR. Probes for SDK1 (red) and AMACR (green) demonstrate the fusion of the genomic loci (yellow arrows) in cancerous cells.
Figure 2
Transcriptional landscape of human long ncRNAs in prostate cancer. (A) Supervised hierarchical clustering analysis using 137 long ncRNAs that were consistently upregulated or downregulated in more than 50% of the prostate cancer samples (≥ 2-fold and FDR ≤ 0.001). Shades of red and green indicate whether the expression value is above (red) or below (green) the mean expression value across all samples (each row in the data was normalized from −1 to +1). (B) Correlation heatmap between the expression of long ncRNAs and genes. Rows represent genes aligned according to their chromosomal locations, and columns represent differentially expressed long ncRNAs. Red indicates a positive correlation and green a negative correlation (absolute correlation coefficient |R| ≥ 0.85, FDR ≤ 0.01). (C) Expression levels of DD3, FR0257520, FR0348383, and MALAT1, assessed by qRT-PCR in the additional set of 40 pairs of prostate cancer and adjacent normal tissues. (D) Comparison of the expression levels of DD3, FR0257520, FR0348383, and MALAT1 between prostate cancer and normal tissues by qRT-PCR.
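As a rough illustration of two steps named in this caption, the sketch below scales one expression row to the range −1 to +1 around its mean and applies the |R| ≥ 0.85 correlation cutoff. The scaling scheme (mean-centering, then dividing by the largest absolute deviation) is an assumption, since the caption does not state how rows were normalized. The expression values are invented for illustration, and the FDR filter is not implemented here.

```python
from math import sqrt


def scale_row(values):
    """Center a row on its mean and scale to [-1, +1].

    Assumed scheme: divide by the largest absolute deviation from the mean,
    so values above the mean are positive and values below are negative.
    """
    mean = sum(values) / len(values)
    deviations = [v - mean for v in values]
    largest = max(abs(d) for d in deviations)
    if largest == 0:
        return [0.0 for _ in values]
    return [d / largest for d in deviations]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def passes_cutoff(r, cutoff=0.85):
    """Apply the caption's absolute-correlation threshold (FDR step omitted)."""
    return abs(r) >= cutoff


# Hypothetical expression values across four samples.
lncrna = [1.0, 2.0, 3.0, 4.0]
gene_b = [8.0, 6.0, 4.0, 2.0]   # perfectly anti-correlated with lncrna
gene_c = [1.0, 3.0, 2.0, 1.0]   # weakly correlated with lncrna

print(scale_row([1.0, 2.0, 3.0]))
print(passes_cutoff(pearson_r(lncrna, gene_b)))
print(passes_cutoff(pearson_r(lncrna, gene_c)))
```

In this toy data, `gene_b` passes the cutoff as a negative correlation and `gene_c` does not; a real analysis would also need the multiple-testing correction behind the FDR threshold.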
Figure 3
The landscape of somatic mutations in prostate cancers. (A) The distribution of somatic mutations among different locations in the genome. (B) The frequencies of different substitutions. (C) Validation of a somatic mutation in CHAF1A. The mapped reads are shown in the top panel. The mutated residue is highlighted by a red box. An electropherogram of the Sanger sequencing validation of the mutation and its surrounding nucleotides is shown in the bottom panel.
Figure 4
The landscape of alternative splicing in prostate cancer. (A) A Circos plot showing the genomic landscape of AS events in the 14 prostate cancer samples discovered by RNA-seq. The outer ring shows chromosome ideograms. The bars along each inner ring represent AS events in a prostate cancer sample. (B) An example of RNA-seq data indicative of intron retention in the KLK3 (PSA) gene. The line plot displays the expression of each exon (e1, e2, etc.) and intron (in1, in2, etc.), and alternative expression events are highlighted in yellow. (C) An example of RNA-seq data indicative of exon skipping in the AMACR gene. The line plot displays the expression of each exon (e1, e2, etc.) and exon junction (e1-e2 and others), and alternative expression events are highlighted in yellow. (D) Validation of KLK3 intron retention and AMACR exon skipping by RT-PCR. A pair of primers was designed to detect only KLK3 intron retention.
Figure 5
Three major signaling pathways are altered in prostate cancers. (A) Genes altered in the RAS-PI3K-AKT pathway. (B) Genes altered in the AR signaling pathway. (C) Genes altered in the RB signaling pathway. The activated genes are colored in a red gradient, and the inactivated genes are colored in a blue gradient according to the percentage of alterations in the 14 prostate cancer samples. The darker the color, the greater the percentage.

Source: PubMed
