Perturbation and stability of PAM50 subtyping in population-based primary invasive breast cancer

Srinivas Veerla, Lennart Hohmann, Deborah F Nacer, Johan Vallon-Christersson, Johan Staaf

Abstract

PAM50 gene expression subtypes represent a cornerstone in the molecular classification of breast cancer and are included in risk prediction models to guide therapy. We aimed to illustrate the impact of included genes and biological processes on subtyping while considering a tumor's underlying clinical subgroup defined by ER, PR, and HER2 status. To do this we used a population-representative and clinically annotated early-stage breast tumor cohort of 6233 samples profiled by RNA sequencing and applied a perturbation strategy of excluding co-expressed genes (gene sets). We demonstrate how PAM50 nearest-centroid classification depends on biological processes present across, but also within, ER/PR/HER2 subgroups and PAM50 subtypes themselves. Our analysis highlights several key aspects of PAM50 classification. Firstly, we demonstrate the tight connection between a tumor's nearest and second-nearest PAM50 centroid. Additionally, we show that the second-best subtype is associated with overall survival in ER-positive, HER2-negative, and node-negative disease. We also note that ERBB2 expression has little impact on PAM50 classification in HER2-positive disease regardless of ER status and that the Basal subtype is highly stable in contrast to the Normal subtype. Improved awareness of the commonly used PAM50 subtyping scheme will aid in our understanding and interpretation of breast tumors that have seemingly conflicting PAM50 classification when compared to clinical biomarkers. Finally, our study adds further support in challenging the common misconception that PAM50 subtypes are distinct classes by illustrating that PAM50 subtypes in tumors represent a continuum with prognostic implications.

Conflict of interest statement

The authors declare no competing interests.

© 2023. Springer Nature Limited.

Figures

Fig. 1. Patterns of PAM50NC versus PAM50NC_2nd subtype.
In panels (a–e), the left panels show the cross-tabulated PAM50NC subtype versus the PAM50NC_2nd subtype for separate tumor subsets, whereas the right panels show the corresponding difference (delta) in Spearman correlation between PAM50NC and PAM50NC_2nd subtype based on the average Spearman correlation of the 100 NC classifications for each case. In the cross tables, colored boxes highlight consistent subtype patterns between PAM50NC and PAM50NC_2nd subtypes. Of all 6233 tumors, 6228 had an unambiguous second-best subtype based on NC classification. a All SCAN-B tumors. b TNBC tumors. c ERnHER2p tumors. d ERpHER2p tumors. e ERpHER2n tumors. f Heatmap of Pearson correlations between PAM50 centroids. Heatmap cells marked with colored boxes show centroid correlation patterns consistent with the PAM50NC and PAM50NC_2nd subtype patterns shown in panels (a–e). g Scatter plot of LumA correlation values versus LumB correlation values for tumors classified as LumANC – LumBNC_2nd or LumBNC – LumANC_2nd (n = 1599). The red line corresponds to a 1:1 relationship between correlation estimates. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range.
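The nearest and second-nearest centroid assignment described in this legend can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`average_ranks`, `spearman`, `rank_centroids`), the five-gene centroids, and all expression values are hypothetical, and the gene-centering and the averaging over 100 repeated NC classifications used in the study are omitted.

```python
from typing import Dict, List, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def rank_centroids(
    sample: Dict[str, float], centroids: Dict[str, Dict[str, float]]
) -> Tuple[str, str, float]:
    """Return (nearest subtype, second-nearest subtype, correlation delta)."""
    genes = sorted(sample)
    x = [sample[g] for g in genes]
    scores = {
        subtype: spearman(x, [centroid[g] for g in genes])
        for subtype, centroid in centroids.items()
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    best, second = ordered[0], ordered[1]
    return best, second, scores[best] - scores[second]


# Hypothetical centroid and sample values, for illustration only.
toy_centroids = {
    "LumA": {"ESR1": 2.0, "FOXA1": 1.5, "ERBB2": 0.0, "KRT5": -0.5, "MKI67": -1.5},
    "LumB": {"ESR1": 1.5, "FOXA1": 1.0, "ERBB2": 0.2, "KRT5": -1.5, "MKI67": 0.8},
    "Basal": {"ESR1": -2.0, "FOXA1": -1.8, "ERBB2": -0.5, "KRT5": 2.0, "MKI67": 1.5},
}
toy_sample = {"ESR1": 3.1, "FOXA1": 2.4, "ERBB2": 0.3, "KRT5": -0.2, "MKI67": -1.0}

best, second, delta = rank_centroids(toy_sample, toy_centroids)
print(best, second, round(delta, 3))
```

A small delta indicates a tumor lying close to the boundary between two centroids, which is the situation the right-hand panels of the figure summarize.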
Fig. 2. Association of PAM50NC_2nd subtype with patient outcome.
a Overall survival (OS) for endocrine-treated ERpHER2nLNn patients >50 years of age that were PAM50NC subtyped as LumANC. Patients are stratified by their PAM50NC_2nd subtype. b Overall survival for endocrine-treated ERpHER2nLNn patients >50 years of age that were PAM50NC subtyped as LumBNC. Patients are stratified by their PAM50NC_2nd subtype. c Distributions for age at diagnosis (left), tumor size (center), and ROR T0 scores (right) obtained from ref. in endocrine-treated ERpHER2nLNn patients >50 years of age comparing cases subtyped as LumANC – LumBNC_2nd versus LumANC – NormalNC_2nd. d Distributions for age at diagnosis (left), tumor size (center), and ROR T0 scores (right) obtained from ref. in endocrine-treated ERpHER2nLNn patients >50 years of age comparing cases subtyped as LumBNC – HER2ENC_2nd versus LumBNC – LumANC_2nd. e Overall survival for endocrine-treated ERpHER2nLNn patients >50 years of age that were PAM50NC subtyped as LumANC and as ROR-low risk category according to ref. . Patients are stratified by their PAM50NC_2nd subtype. f Overall survival for endocrine-treated ERpHER2nLNn patients >50 years of age that were PAM50NC subtyped as LumBNC and as ROR-high risk category according to ref. . Patients are stratified by their PAM50NC_2nd subtype. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range.
Fig. 3. Study overview and PAM50 reclassification results for the leave-oneGeneCluster-out strategy.
a Study outline, perturbation methodology, and subtype switch concept. A sample is called as having a subtype switch if the PAM50NC subtype is observed in ≤50% of the 100 PAM50perturb reclassifications (right panel). b Left panel, size of identified SRIQ core gene clusters defined from 9206 RNA sequencing profiles from ref. . Center panel, heatmap of average PAM50 centroid value for each gene set for each PAM50 centroid subtype. Right panel, Spearman correlation of average SRIQ FPKM gene cluster expression for each gene set combination in all 9206 RNA sequencing profiles. c Spearman correlation matrix of average SRIQ FPKM gene cluster expression versus rank-based scores for eight reported biological metagenes from Fredlund et al. for the 6233 tumors included in this study. d Heatmap of scaled FPKM expression for PAM50 genes stratified by SRIQ gene cluster definition and ordered by clinical group and PAM50NC subtype for the 6233 included tumors. e Percent of tumors switching subtype (i.e., a different PAM50perturb subtype compared to PAM50NC) by the leave-oneGeneCluster-out strategy on a whole cohort level stratified by PAM50NC subtypes for the 6233 included tumors. f Percent of tumors switching subtype by the leave-oneGeneCluster-out strategy on a whole cohort level stratified by tumors’ ER, PR, and HER2 status.
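The subtype switch rule stated in panel (a) can be expressed as a short check. The sketch below assumes that the 100 PAM50perturb reclassifications of one tumor are available as a list of subtype labels; the function names and the example labels are hypothetical, and only the 50% threshold comes from the legend.

```python
from typing import Sequence


def retention_fraction(original: str, perturbed_calls: Sequence[str]) -> float:
    """Fraction of perturbed reclassifications that keep the original subtype."""
    if not perturbed_calls:
        raise ValueError("at least one perturbed classification is required")
    kept = sum(1 for call in perturbed_calls if call == original)
    return kept / len(perturbed_calls)


def is_subtype_switch(
    original: str, perturbed_calls: Sequence[str], max_fraction: float = 0.5
) -> bool:
    """Call a switch when the original subtype is seen in <= max_fraction of calls."""
    return retention_fraction(original, perturbed_calls) <= max_fraction


# Hypothetical example: 100 reclassifications of a tumor originally called LumA.
borderline = ["LumA"] * 50 + ["LumB"] * 50
stable = ["LumA"] * 51 + ["LumB"] * 49

print(is_subtype_switch("LumA", borderline))  # exactly 50% retained counts as a switch
print(is_subtype_switch("LumA", stable))
```

Note that the threshold is inclusive, so a tumor retaining its original subtype in exactly half of the reclassifications is counted as switching.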
Fig. 4. PAM50 reclassification results for the leave-oneGeneCluster-out strategy when stratified for molecular and clinical subgroup.
a Heatmap showing the proportion of tumors that switched subtype after gene set exclusion stratified by molecular and clinical subgroup. Numbers represent total group sizes per row. b Top panel shows the proportion of ERpHER2n LumANC tumors with a PAM50perturb subtype different from their PAM50NC subtype, i.e., switching subtype, when excluding a specific gene set in leave-oneGeneCluster-out reclassification. Lower panel shows the distribution of the PAM50perturb subtypes in tumors that switched in the top panel, numbers on top represent the total number of samples that switched subtype. c The same illustration as in (b), but for ERpHER2n LumBNC tumors. d Heatmap showing the proportion of tumors that switched subtype after gene set exclusion that had a PAM50perturb subtype similar to the PAM50NC_2nd subtype. Numbers represent total group sizes per row. e Summary bar plots of the percentage of tumors in each clinical group further stratified by their PAM50NC subtype that never switch subtype across all gene set perturbations, i.e., the PAM50perturb subtype is the same as the PAM50NC subtype in all perturbations. These cases are hereon referred to as K0 cases.
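The K0 definition in panel (e) and the comparison with the second-best subtype in panel (d) can be illustrated with a small sketch. It assumes one PAM50perturb call per tumor and excluded gene set, stored in nested dictionaries; the data layout, function names, tumor identifiers, and gene set labels are hypothetical.

```python
from typing import Dict, List

# perturbed[gene_set][tumor_id] holds the PAM50perturb subtype
# obtained after excluding that gene set (assumed layout).
Calls = Dict[str, Dict[str, str]]


def k0_cases(original: Dict[str, str], perturbed: Calls) -> List[str]:
    """Tumors whose perturbed subtype equals the original in every perturbation."""
    return [
        tumor
        for tumor, subtype in original.items()
        if all(calls[tumor] == subtype for calls in perturbed.values())
    ]


def fraction_switching_to_second(
    original: Dict[str, str], second: Dict[str, str], calls: Dict[str, str]
) -> float:
    """Among tumors that switched for one gene set, the share landing on the second-best subtype."""
    switched = [t for t in original if calls[t] != original[t]]
    if not switched:
        return 0.0
    return sum(1 for t in switched if calls[t] == second[t]) / len(switched)


# Hypothetical toy data for three tumors and two excluded gene sets.
original = {"T1": "LumA", "T2": "LumB", "T3": "Basal"}
second = {"T1": "LumB", "T2": "LumA", "T3": "HER2E"}
perturbed = {
    "set1": {"T1": "LumA", "T2": "LumA", "T3": "Basal"},
    "set3": {"T1": "Normal", "T2": "LumB", "T3": "Basal"},
}

print(k0_cases(original, perturbed))
print(fraction_switching_to_second(original, second, perturbed["set1"]))
```

In this toy example only T3 keeps its subtype under both exclusions, and the single tumor that switches for `set1` moves to its second-best subtype.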
Fig. 5. Leave-oneGeneCluster-out perturbation and association to patient outcome.
Forest plot of hazard ratios with 95% confidence intervals from univariate Cox regression, using DRFI as clinical endpoint, for tumors that switched subtype versus tumors that did not switch subtype (reference) after exclusion of a gene set in a TNBC tumors, b ERpHER2n tumors, and c endocrine-treated ERpHER2n tumors only. d Kaplan–Meier plot of DRFI for PAM50perturb subtypes in endocrine-treated ERpHER2n LumANC tumors after exclusion of gene set 1 (proliferation). e Kaplan–Meier plot of DRFI for PAM50perturb subtypes in endocrine-treated ERpHER2n LumBNC tumors after exclusion of gene set 3 (basal keratins). f Boxplots of rank-based scores for the mitotic progression, basal, steroid response, and lipid metagenes for endocrine-treated ERpHER2n LumANC tumors in panel (d). g Boxplots of rank-based scores for the mitotic progression, basal, steroid response, and lipid metagenes for endocrine-treated ERpHER2n LumBNC tumors in panel (e). Note that not all included cases in the study have DRFI outcome data, thus the difference in sample numbers between boxplots and survival plots. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range.
Fig. 6. Refined single sample PAM50 subtyping in ERpHER2n tumors based on leave-oneGeneCluster-out perturbation stable tumors.
a Outline of the scheme to create refined ERpHER2n PAM50 centroids (termed PAM50K0) used for single sample classification by Spearman correlation based on FPKM values only (i.e., no gene centering). b Sankey plot of subtype change for ERpHER2n tumors when performing PAM50K0 classification as outlined in (a). c Kaplan–Meier plot of DRFI for PAM50K0 subtypes in endocrine-treated ERpHER2n tumors. d Boxplots of rank-based scores for the mitotic checkpoint, steroid response, and basal metagenes for endocrine-treated ERpHER2n tumors stratified by PAM50K0 subtypes. e Kaplan–Meier plot of DRFI for PAM50K0 subtypes in endocrine-treated ERpHER2n LumANC tumors. HER2EK0 and NormalK0 groups excluded due to size. f Left panel, Kaplan–Meier plot of OS for PAM50K0 subtypes in all endocrine-treated ERpHER2n LumANC tumors. HER2EK0 and NormalK0 groups excluded due to size. Right panel, same plot but only for non-K0 tumors (i.e., tumors not included in the PAM50K0 centroid creation). g Boxplots of rank-based scores for the mitotic checkpoint, steroid response, and basal metagenes for endocrine-treated ERpHER2n LumANC tumors stratified by PAM50K0 subtypes. Note that not all included cases in the study have DRFI outcome data, thus the difference in sample numbers between boxplots and survival plots. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range.

References

    1. Sung H, et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021;71:209–249. doi: 10.3322/caac.21660.
    2. Goldhirsch A, et al. Personalizing the treatment of women with early breast cancer: highlights of the St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2013. Ann. Oncol. 2013;24:2206–2223. doi: 10.1093/annonc/mdt303.
    3. Cardoso F, et al. 70-gene signature as an aid to treatment decisions in early-stage breast cancer. N. Engl. J. Med. 2016;375:717–729. doi: 10.1056/NEJMoa1602253.
    4. Gnant M, et al. Predicting distant recurrence in receptor-positive breast cancer patients with limited clinicopathological risk: using the PAM50 Risk of Recurrence score in 1478 postmenopausal patients of the ABCSG-8 trial treated with adjuvant endocrine therapy alone. Ann. Oncol. 2014;25:339–345. doi: 10.1093/annonc/mdt494.
    5. Sparano JA, et al. Adjuvant chemotherapy guided by a 21-gene expression assay in breast cancer. N. Engl. J. Med. 2018;379:111–121. doi: 10.1056/NEJMoa1804710.
    6. Bartlett JM, et al. Comparing breast cancer multiparameter tests in the OPTIMA prelim trial: no test is more equal than the others. J. Natl Cancer Inst. 2016;108:djw050.
    7. Parker JS, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 2009;27:1160–1167. doi: 10.1200/JCO.2008.18.1370.
    8. Perou CM, et al. Molecular portraits of human breast tumours. Nature. 2000;406:747–752. doi: 10.1038/35021093.
    9. Laenkholm AV, et al. Population-based study of Prosigna-PAM50 and outcome among postmenopausal women with estrogen receptor-positive and HER2-negative operable invasive lobular or ductal breast cancer. Clin. Breast Cancer. 2020;20:e423–e432. doi: 10.1016/j.clbc.2020.01.013.
    10. Laenkholm AV, et al. PAM50 risk of recurrence score predicts 10-year distant recurrence in a comprehensive Danish cohort of postmenopausal women allocated to 5 years of endocrine therapy for hormone receptor-positive early breast cancer. J. Clin. Oncol. 2018;36:735–740. doi: 10.1200/JCO.2017.74.6586.
    11. Gnant M, et al. Identifying clinically relevant prognostic subgroups of postmenopausal women with node-positive hormone receptor-positive early-stage breast cancer treated with endocrine therapy: a combined analysis of ABCSG-8 and ATAC using the PAM50 risk of recurrence score and intrinsic subtype. Ann. Oncol. 2015;26:1685–1691. doi: 10.1093/annonc/mdv215.
    12. Ohnstad HO, et al. Prognostic value of PAM50 and risk of recurrence score in patients with early-stage breast cancer with long-term follow-up. Breast Cancer Res. 2017;19:120. doi: 10.1186/s13058-017-0911-9.
    13. Sorlie T, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc. Natl Acad. Sci. USA. 2003;100:8418–8423. doi: 10.1073/pnas.0932692100.
    14. Staaf J, et al. RNA sequencing-based single sample predictors of molecular subtype and risk of recurrence for clinical assessment of early-stage breast cancer. NPJ Breast Cancer. 2022;8:94. doi: 10.1038/s41523-022-00465-3.
    15. Fredlund E, et al. The gene expression landscape of breast cancer is shaped by tumor protein p53 status and epithelial-mesenchymal transition. Breast Cancer Res. 2012;14:R113. doi: 10.1186/bcr3236.
    16. Paquet ER, Hallett MT. Absolute assignment of breast cancer intrinsic molecular subtype. J. Natl Cancer Inst. 2015;107:357. doi: 10.1093/jnci/dju357.
    17. Wallden B, et al. Development and verification of the PAM50-based Prosigna breast cancer gene signature assay. BMC Med. Genomics. 2015;8:54. doi: 10.1186/s12920-015-0129-6.
    18. Sorlie T, et al. The importance of gene-centring microarray data. Lancet Oncol. 2010;11:719–720. doi: 10.1016/S1470-2045(10)70174-1.
    19. Staaf J, Ringner M. Making breast cancer molecular subtypes robust? J. Natl Cancer Inst. 2015;107:386. doi: 10.1093/jnci/dju386.
    20. Ringner M, Jonsson G, Staaf J. Prognostic and chemotherapy predictive value of gene-expression phenotypes in primary lung adenocarcinoma. Clin. Cancer Res. 2016;22:218–229. doi: 10.1158/1078-0432.CCR-15-0529.
    21. Prat A, Parker JS. Standardized versus research-based PAM50 intrinsic subtyping of breast cancer. Clin. Transl. Oncol. 2020;22:953–955. doi: 10.1007/s12094-019-02203-x.
    22. Vallon-Christersson J, et al. Cross comparison and prognostic assessment of breast cancer multigene signatures in a large population-based contemporary clinical series. Sci. Rep. 2019;9:12184. doi: 10.1038/s41598-019-48570-x.
    23. Burstein HJ, et al. Customizing local and systemic therapies for women with early breast cancer: the St. Gallen International Consensus Guidelines for treatment of early breast cancer 2021. Ann. Oncol. 2021;32:1216–1235. doi: 10.1016/j.annonc.2021.06.023.
    24. Kuilman MM, et al. BluePrint breast cancer molecular subtyping recognizes single and dual subtype tumors with implications for therapeutic guidance. Breast Cancer Res. Treat. 2022;195:263–274. doi: 10.1007/s10549-022-06698-x.
    25. Prat A, et al. Phenotypic and molecular characterization of the claudin-low intrinsic subtype of breast cancer. Breast Cancer Res. 2010;12:R68. doi: 10.1186/bcr2635.
    26. Lien TG, et al. Sample preparation approach influences PAM50 risk of recurrence score in early breast cancer. Cancers. 2021;13:6118.
    27. Prat A, Perou CM. Deconstructing the molecular portraits of breast cancer. Mol. Oncol. 2011;5:5–23. doi: 10.1016/j.molonc.2010.11.003.
    28. Nielsen TO, et al. Immunohistochemical and clinical characterization of the basal-like subtype of invasive breast carcinoma. Clin. Cancer Res. 2004;10:5367–5374. doi: 10.1158/1078-0432.CCR-04-0220.
    29. Nielsen T, et al. Analytical validation of the PAM50-based Prosigna Breast Cancer Prognostic Gene Signature Assay and nCounter Analysis System using formalin-fixed paraffin-embedded breast tumor specimens. BMC Cancer. 2014;14:177. doi: 10.1186/1471-2407-14-177.
    30. Ryden L, et al. Minimizing inequality in access to precision medicine in breast cancer by real-time population-based molecular analysis in the SCAN-B initiative. Br. J. Surg. 2018;105:e158–e168. doi: 10.1002/bjs.10741.
    31. Saal LH, et al. The Sweden Cancerome Analysis Network-Breast (SCAN-B) Initiative: a large-scale multicenter infrastructure towards implementation of breast cancer genomic analyses in the clinical routine. Genome Med. 2015;7:20. doi: 10.1186/s13073-015-0131-9.
    32. Karlstrom J, Aine M, Staaf J, Veerla S. SRIQ clustering: a fusion of Random Forest, QT clustering, and KNN concepts. Comput. Struct. Biotechnol. J. 2022;20:1567–1579. doi: 10.1016/j.csbj.2022.03.036.
    33. Staaf J, et al. High-resolution genomic and expression analyses of copy number alterations in HER2-amplified breast cancer. Breast Cancer Res. 2010;12:R25. doi: 10.1186/bcr2568.
    34. Kuleshov MV, et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016;44:W90–W97. doi: 10.1093/nar/gkw377.
    35. Chen EY, et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14:128. doi: 10.1186/1471-2105-14-128.
    36. Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000;28:27–30. doi: 10.1093/nar/28.1.27.
    37. Kanehisa M, Furumichi M, Sato Y, Kawashima M, Ishiguro-Watanabe M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 2023;51:D587–D592. doi: 10.1093/nar/gkac963.
    38. Ashburner M, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000;25:25–29. doi: 10.1038/75556.
    39. Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2021;49:D325–D334. doi: 10.1093/nar/gkaa1113.
    40. Nacer DF, et al. Molecular characteristics of breast tumors in patients screened for germline predisposition from a population-based observational study. Genome Med. 2023;15:25. doi: 10.1186/s13073-023-01177-4.

Source: PubMed
