Comparison of triple-negative breast cancer molecular subtyping using RNA from matched fresh-frozen versus formalin-fixed paraffin-embedded tissue

Bojana Jovanović, Quanhu Sheng, Robert S Seitz, Kasey D Lawrence, Stephan W Morris, Lance R Thomas, David R Hout, Brock L Schweitzer, Yan Guo, Jennifer A Pietenpol, Brian D Lehmann

Abstract

Background: Triple-negative breast cancer (TNBC) is a heterogeneous disease that lacks unifying molecular alterations to guide therapy decisions. We previously identified distinct molecular subtypes of TNBC (TNBCtype) using gene expression data generated on a microarray platform from frozen tumor specimens. Tumors and cell lines representing the identified subtypes are distinctly enriched in biologically relevant transcripts and differ in sensitivity to standard chemotherapies and targeted agents. Since our initial discoveries, RNA-sequencing (RNA-seq) has evolved into a sensitive and quantitative tool for measuring transcript abundance.

Methods: To demonstrate that TNBC subtypes were similar between platforms, we compared gene expression from matched specimens in The Cancer Genome Atlas (TCGA) that were profiled by both microarray and RNA-seq. In the clinical care of patients with TNBC, tumor specimens collected for diagnostic purposes are processed by formalin fixation and paraffin embedding (FFPE). Thus, for TNBCtype to eventually have broad and practical clinical utility, we compared RNA-seq gene expression and molecular classification between fresh-frozen (FF) and FFPE tumor specimens.

Results: Analysis of TCGA data showed consistent subtype calls for 91% of evaluable samples, demonstrating conservation of TNBC subtypes across microarray and RNA-seq platforms. We compared RNA-seq performed on 21 paired FF and FFPE TNBC specimens and evaluated genome alignment, transcript coverage, differential transcript enrichment and concordance of TNBC molecular subtype calls. We demonstrate that subtype accuracy between matched FF and FFPE samples increases with sequencing depth and with the strength of correlation to an individual TNBC subtype.
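The concordance analysis described in the Results can be illustrated with a minimal sketch. It assumes that a subtype is called by picking the centroid with the highest Spearman correlation to a sample's expression profile, and that concordance is the fraction of matched FF/FFPE pairs receiving the same call. The centroid values, the subtype labels (`subtype_A`, `subtype_B`), the sample data and the `min_corr` filter are hypothetical illustrations and not the published TNBCtype procedure.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def call_subtype(expression, centroids):
    """Return (subtype, correlation) for the best-correlated centroid."""
    scores = {name: spearman(expression, c) for name, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

def concordance(calls_a, calls_b, min_corr=0.0):
    """Fraction of pairs with identical calls, among pairs where both
    correlations reach min_corr. Returns None if no pair passes the filter."""
    kept = [(a, b) for a, b in zip(calls_a, calls_b)
            if a[1] >= min_corr and b[1] >= min_corr]
    if not kept:
        return None
    return sum(a[0] == b[0] for a, b in kept) / len(kept)

# Hypothetical centroids over five genes and three matched FF/FFPE pairs.
centroids = {"subtype_A": [1, 2, 3, 4, 5], "subtype_B": [5, 4, 3, 2, 1]}
ff = [[1.1, 2.0, 2.9, 4.2, 5.1], [5.0, 3.9, 3.1, 2.2, 0.9], [2.0, 1.0, 3.0, 5.0, 4.0]]
ffpe = [[0.9, 2.1, 3.3, 3.9, 4.8], [4.7, 4.1, 2.8, 1.9, 1.2], [4.0, 5.0, 3.0, 1.0, 2.0]]

ff_calls = [call_subtype(s, centroids) for s in ff]
ffpe_calls = [call_subtype(s, centroids) for s in ffpe]

overall = concordance(ff_calls, ffpe_calls)
high_corr_only = concordance(ff_calls, ffpe_calls, min_corr=0.9)
print(overall, high_corr_only)
```

In this toy data the third pair correlates more weakly with its best centroid and is the only discordant one, so restricting to strongly correlated pairs raises the concordance. This mirrors the reported trend only qualitatively.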

Conclusions: TNBC subtypes were reliably identified from FFPE samples, with the highest accuracy in samples less than 4 years old, and the reproducibility of subtyping increased with sequencing depth. To reproducibly subtype tumors using gene expression, it is critical to select genes that do not vary with platform type, tissue processing or RNA isolation method. The majority of transcripts differentially expressed between matched FF and FFPE samples could be attributed to selection by the RNA enrichment method. While differentially expressed transcripts did not impact TNBC subtyping, they will provide guidance on which transcripts to avoid when implementing a gene set size reduction strategy.
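As a rough illustration of how transcripts that differ between FF and FFPE might be excluded before reducing a gene set, the sketch below flags transcripts by their mean paired log2 fold change across matched samples. This is a simple heuristic chosen for brevity, not the differential expression method used in the study. The transcript names, counts, pseudocount and the `max_abs_log2fc` threshold are all hypothetical.

```python
import math

def mean_paired_log2fc(ff_counts, ffpe_counts, pseudocount=1.0):
    """Mean log2(FFPE / FF) across matched pairs for one transcript.
    The pseudocount avoids division by zero for unexpressed transcripts."""
    diffs = [math.log2((p + pseudocount) / (f + pseudocount))
             for f, p in zip(ff_counts, ffpe_counts)]
    return sum(diffs) / len(diffs)

def stable_transcripts(ff, ffpe, max_abs_log2fc=1.0):
    """Return transcripts whose mean paired fold change stays within the
    threshold. ff and ffpe map transcript name -> counts in matched order."""
    return [t for t in ff
            if abs(mean_paired_log2fc(ff[t], ffpe[t])) <= max_abs_log2fc]

# Hypothetical counts for three transcripts across three matched pairs.
ff = {"GENE_A": [100, 120, 90], "GENE_B": [50, 60, 55], "GENE_C": [10, 12, 8]}
ffpe = {"GENE_A": [95, 130, 85], "GENE_B": [400, 480, 500], "GENE_C": [11, 10, 9]}

keep = stable_transcripts(ff, ffpe)
print(keep)
```

Here `GENE_B` is roughly eightfold higher in FFPE and is dropped, while the other two transcripts remain candidates for a reduced gene set.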

Trial registration: NCT00930930 07/01/2009.

Keywords: Formalin-fixed paraffin embedded; Fresh-frozen; RNA-seq; TNBCtype.

Figures

Fig. 1
TNBC molecular subtype concordance between matched FF and FFPE samples processed on microarray and RNA-seq improves with increased prediction confidence. a Scatterplot shows TNBC subtype accuracy between microarray and RNA-seq as a function of prediction confidence in the TCGA breast (BRCA) cohort. b Plot shows RNA-seq prediction accuracy by confidence score. The vertical line marks the prediction confidence score that yields 95% concordance between platforms. c Scatterplot shows the concordance between microarray and RNA-seq platforms by strength of correlation to a subtype (prediction score)
Fig. 2
MiSeq and HiSeq platform mapped read comparison from FF- and FFPE-derived RNA sequences. a Barplot depicts the percentage of mapped reads that are on-target or off-target (intronic and intergenic) for FF and FFPE samples processed on MiSeq and HiSeq platforms. b Beeswarm box plot shows mapped reads (%) from individual FF (blue) and FFPE (red) samples processed on the HiSeq
Fig. 3
FF and FFPE transcript correlation improves with increased sequencing depth. Density plots show the pairwise Spearman correlation of matched FF and FFPE samples for a all transcripts, b protein-coding transcripts or c TNBC centroid transcripts processed on the HiSeq or MiSeq platform
Fig. 4
Removal of differential transcripts improves FF and FFPE gene expression correlation. Heatmaps display unsupervised hierarchical clustering of a sample-wise correlation coefficients, b all transcripts (n = 27,577) or c principal component analysis (PCA) of all transcripts. Following removal of differentially expressed transcripts between FF and FFPE samples, remaining transcripts (n = 15,624) were used to perform d sample-wise correlation coefficients e hierarchical clustering or f PCA. Underlined samples indicate clustering of paired FF and FFPE samples
Fig. 5
Differential transcripts are enriched for longer transcripts in FFPE compared to FF samples. a Boxplot shows transcript length (log10 bp) distribution for all protein-coding transcripts (n = 16,630), non-differential transcripts (n = 4450), transcripts enriched in FF (n = 2338) or FFPE (n = 2112). b Beeswarm boxplot shows the distribution of length for individual protein-coding transcripts enriched in FF or FFPE. Line graphs show c TTN and d SYNE1 exon-level expression (count) along the transcript in paired FF and FFPE samples
Fig. 6
Accuracy of TNBC subtype calls between FF and FFPE depends on prediction confidence and sequencing depth. a Table summarizes TNBC subtype correlations, prediction calls, prediction confidence and concordance between matched FF and FFPE samples processed on the Illumina HiSeq and MiSeq. b Scatterplots show concordance (blue) between FF and FFPE samples run on HiSeq and MiSeq as a function of prediction confidence. c Scatterplots show the prediction confidence and prediction accuracy for FFPE (left) or FF (right) samples processed on the HiSeq (top) or MiSeq (bottom). d Scatterplots show prediction confidence and prediction strength for FFPE and FF samples processed on the HiSeq and MiSeq. Samples with concordant subtype calls are indicated in blue and those with discordant calls in red

Source: PubMed
