An immune response gene expression module identifies a good prognosis subtype in estrogen receptor negative breast cancer

Andrew E Teschendorff, Ahmad Miremadi, Sarah E Pinder, Ian O Ellis, Carlos Caldas

Abstract

Background: Estrogen receptor (ER)-negative breast cancer specimens are predominantly of high grade, have frequent p53 mutations, and are broadly divided into HER2-positive and basal subtypes. Although ER-negative disease has overall worse prognosis than does ER-positive breast cancer, not all ER-negative breast cancer patients have poor clinical outcome. Reliable identification of ER-negative tumors that have a good prognosis is not yet possible.

Results: We apply a recently proposed feature selection method in an integrative analysis of three major microarray expression datasets to identify molecular subclasses and prognostic markers in ER-negative breast cancer. We find a subclass of basal tumors, characterized by over-expression of immune response genes, which has a better prognosis than the rest of ER-negative breast cancers. Moreover, we show that, in contrast to ER-positive tumours, the majority of prognostic markers in ER-negative breast cancer are over-expressed in the good prognosis group and are associated with activation of complement and immune response pathways. Specifically, we identify an immune response related seven-gene module and show that downregulation of this module confers greater risk for distant metastasis (hazard ratio 2.02, 95% confidence interval 1.2-3.4; P = 0.009), independent of lymph node status and lymphocytic infiltration. Furthermore, we validate the immune response module using two additional independent datasets.

Conclusion: We show that ER-negative basal breast cancer is a heterogeneous disease with at least four main subtypes. Furthermore, we show that the heterogeneity in clinical outcome of ER-negative breast cancer is related to the variability in expression levels of complement and immune response pathway genes, independent of lymphocytic infiltration.

Figures

Figure 1
FDR comparison in ER- and ER+ breast cancer. For various significance thresholds (sigth), we plot the fraction of observed genes with P values less than the significance threshold (black) as well as the corresponding fraction of false positives, as estimated using a q value analysis (red). (a) Overall survival for ER+ breast cancer. (b) Overall survival for ER- breast cancer. (c) Time to distant metastasis for ER+ breast cancer. (d) Time to distant metastasis for ER- breast cancer. P values were obtained from the log-rank test using Cox regression models. ER, estrogen receptor; FDR, false discovery rate.
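The comparison in this caption can be sketched in Python under one stated assumption: that the false positive fraction is estimated in the spirit of the q value approach of Storey and Tibshirani, where the proportion of true null genes is estimated from the P values above a tuning value. The function names and the tuning value lam = 0.5 are illustrative assumptions and are not taken from the paper.

```python
def estimate_pi0(pvals, lam=0.5):
    """Estimate the proportion of true null hypotheses (assumed tuning value lam)."""
    m = len(pvals)
    pi0 = sum(1 for p in pvals if p > lam) / (m * (1.0 - lam))
    return min(1.0, pi0)


def fdr_at_threshold(pvals, threshold, lam=0.5):
    """Return (observed fraction, expected false positive fraction, estimated FDR).

    observed fraction: share of genes with P value <= threshold
    expected false positive fraction: pi0 * threshold (share of all genes)
    estimated FDR: ratio of the two, capped at 1
    """
    m = len(pvals)
    n_sig = sum(1 for p in pvals if p <= threshold)
    observed = n_sig / m
    expected_fp = estimate_pi0(pvals, lam) * threshold
    if n_sig == 0:
        return observed, expected_fp, 0.0
    return observed, expected_fp, min(1.0, expected_fp / observed)


if __name__ == "__main__":
    # Toy P values, invented purely for illustration.
    toy = [0.001, 0.002, 0.01, 0.2, 0.4, 0.6, 0.7, 0.8, 0.9, 0.95]
    for t in (0.01, 0.05, 0.1):
        print(t, fdr_at_threshold(toy, t))
```

When the observed fraction sits well above the expected false positive fraction at a given threshold, the estimated FDR is low, which is the pattern the figure is designed to reveal.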
Figure 2
PACK flowchart. (a) A schematic diagram of PACK, as used in this study. For each gene expression profile an unbiased estimate of its kurtosis, K, is computed. Genes with negative kurtosis are selected because only these define large subgroups (of sizes >22% of the total sample size). Further unsupervised clustering may then be performed on this subset of negative kurtosis profiles to find novel tumor subclasses. Alternatively, to find robust prognostic markers, negative kurtosis profiles are filtered further based on whether there is evidence of bimodality (C = 2). This step requires a cluster inference algorithm and a model selection criterion to discard those profiles that are best described by a single Gaussian (C = 1; by random chance, Gaussian profiles may have negative kurtosis). Correlation with a categorical phenotype is then assessed with Fisher's test, which evaluates whether the distribution of the phenotype across the two clusters differs significantly from random. (b) Density curves of typical bimodal negative and positive kurtosis gene expression profiles. X-axis shows gene expression on a log2 scale. PACK, Profile Analysis using Clustering and Kurtosis.
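Two steps of this flowchart, the kurtosis filter and the Fisher's test, can be sketched in Python. This is a sketch under assumptions: the kurtosis estimator below is the standard bias-corrected sample excess kurtosis, which may differ from the exact estimator used in PACK, and the function names are hypothetical. The intermediate bimodality step (cluster inference with model selection) is omitted.

```python
from math import comb


def excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (assumed estimator, not confirmed by the source)."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least four samples")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("profile has zero variance")
    g2 = m4 / m2 ** 2 - 3.0
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)


def select_negative_kurtosis(profiles):
    """Keep the genes whose expression profile has negative kurtosis."""
    return [gene for gene, values in profiles.items() if excess_kurtosis(values) < 0]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two expression clusters, columns the two phenotype classes.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x samples in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at least as unlikely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    # Toy profiles, invented for illustration only.
    profiles = {
        "bimodal": [0.0] * 10 + [1.0] * 10,
        "outlier": [0.0] * 18 + [10.0, -10.0],
    }
    print(select_negative_kurtosis(profiles))
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

A profile split evenly between two expression levels yields strongly negative kurtosis, whereas a profile driven by a few outliers yields positive kurtosis, which is why the filter favors genes that define large subgroups.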
Figure 3
Molecular subclasses in ER- breast cancer. (a) Complete linkage hierarchical clustering of 186 ER- breast tumors over 813 genes with negative kurtosis profiles. Five sample clusters were identified and characterized in terms of the patterns of over-expression and under-expression of four gene clusters related to cell cycle (CC; blue), immune response (IR; red), extracellular matrix (ECM; green), and steroid hormone response (SR; pink) functions. Panels show the distribution of the SSP subtype [23], the lymphocytic infiltration score, histologic grade, basal marker [27], and ERBB2+ amplifier subtype. Panel color codes: SSP (pink = HER2, brown = basal, dark green = normal, sky blue = luminal A, and blue = luminal B); LYM.INF (black = high, gray = low, and white = missing); GRADE (black = high, blue = intermediate, sky blue = low, and white = missing), BASAL.MARK. (black = high and white = low), ERBB2-AMP (black = high and white = low). The BASAL.MARK. profile represents an average over validated basal markers in [27], whereas the ERBB2-AMP profile was calculated as an average over three genes in the ERBB2 amplicon (ERBB2, STARD3, GRB7). (b) Kaplan-Meier curves for time to distant metastasis (years) and for the five subclasses identified in panel (a). (c) Partitioning around medoids clustering over the seven-gene prognostic immune response module. Panel color codes: purple = cluster over-expressing module, yellow = cluster under-expressing module, black = poor outcome samples, gray = good outcome samples, green = relative under-expression, and red = relative over-expression. (d) Kaplan-Meier curves for time to distant metastasis for the two groups identified in panel (c). Hazard ratio, 95% confidence interval, and log-rank test P values are shown. ER, estrogen receptor; SSP, single sample predictor.
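The Kaplan-Meier curves in panels (b) and (d) rest on the product-limit estimator, which can be sketched in a few lines of Python. The function name and the toy follow-up data are illustrative assumptions; they are not taken from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each sample (for example, years to distant metastasis)
    events: 1 if the event was observed at that time, 0 if the sample was censored
    Returns a list of (event time, estimated survival probability) pairs.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        n_events = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Toy data, invented for illustration: five samples, two of them censored.
    print(kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1]))
```

Censored samples contribute to the number at risk until their last follow-up time but never trigger a drop in the curve.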
Figure 4
Expression profiles of selected prognostic markers in ER- breast cancer. Expression profile (on a log2 scale) of selected prognostic markers (a) IGLC2 and (b) C1QA in the integrated cohort of 186 ER- tumours (NKI2 + EMC + NCH), and in the validation cohorts UPP and JRH-2. Good outcome samples are shown in green, and poor outcome samples in blue. Clusters were inferred using the variational Bayesian approach in NKI2 + EMC + NCH and the pam algorithm in the UPP and JRH-2 cohorts. Inferred clusters are indicated by different shapes (triangles and diamonds). ER, estrogen receptor; pam, partitioning around medoids.
Figure 5
Pam clustering over IR module in external ER- cohorts. Heatmap of gene expression of the seven-gene IR module in ER- samples of the (a) UPP and (b) JRH-2 cohorts. Shown are the clusters over-expressing (purple) and under-expressing (yellow) the IR module, as predicted by the pam algorithm. Good outcome samples are shown in gray, and poor outcome samples in black. Green indicates relative under-expression, and red indicates relative over-expression. (c) Kaplan-Meier survival curves over the combined external cohorts (for UPP the end-point was disease-specific survival, and for JRH-2 it was recurrence-free survival), with the number of events and samples in each of the two predicted groups. ER, estrogen receptor; pam, partitioning around medoids.
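Splitting samples into two groups around medoids can be sketched in Python as follows. As an assumption made for brevity, the sketch finds the best pair of medoids by exhaustive search rather than by the BUILD and SWAP steps of the pam algorithm, and it uses Euclidean distance; both choices, along with the function name and toy data, are illustrative and not taken from the paper.

```python
from itertools import combinations
from math import dist


def two_medoid_clusters(samples):
    """Partition samples into two clusters around the best pair of medoids.

    samples: list of equal-length numeric tuples (one expression vector per sample)
    Returns ((medoid index 0, medoid index 1), cluster label per sample).
    """
    n = len(samples)
    d = [[dist(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    best_cost, best_pair = float("inf"), None
    for i, j in combinations(range(n), 2):
        # Total distance of every sample to its nearest medoid.
        cost = sum(min(d[k][i], d[k][j]) for k in range(n))
        if cost < best_cost:
            best_cost, best_pair = cost, (i, j)
    i, j = best_pair
    labels = [0 if d[k][i] <= d[k][j] else 1 for k in range(n)]
    return best_pair, labels


if __name__ == "__main__":
    # Toy two-gene expression vectors, invented for illustration.
    toy = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
    print(two_medoid_clusters(toy))
```

Exhaustive search is only practical for small cohorts because the number of candidate pairs grows quadratically with the number of samples.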

References

    1. Brenton JD, Carey LA, Ahmed AA, Caldas C. Molecular classification and molecular forecasting of breast cancer: ready for clinical application? J Clin Oncol. 2005;23:7350–7360. doi: 10.1200/JCO.2005.03.3845.
    2. Rakha EA, El-Sayed ME, Green AR, Paish EC, Lee AH, Ellis IO. Breast carcinoma with basal differentiation: a proposal for pathology definition based on basal cytokeratin expression. Histopathology. 2007;50:434–438. doi: 10.1111/j.1365-2559.2007.02638.x.
    3. van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002;415:530–536. doi: 10.1038/415530a.
    4. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004;351:2817–2826. doi: 10.1056/NEJMoa041588.
    5. Wang Y, Klijn JG, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J, et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet. 2005;365:671–679.
    6. Pawitan Y, Bjohle J, Amler L, Borg AL, Egyhazi S, Hall P, Han X, Holmberg L, Huang F, Klaar S, et al. Gene expression profiling spares early breast cancer patients from adjuvant therapy: derived and validated in two population-based cohorts. Breast Cancer Res. 2005;7:R953–R964. doi: 10.1186/bcr1325.
    7. Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, Nordgren H, Farmer P, Praz V, Haibe-Kains B, et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006;98:262–272.
    8. Foekens JA, Atkins D, Zhang Y, Sweep FC, Harbeck N, Paradiso A, Cufer T, Sieuwerts AM, Talantov D, Span PN, et al. Multicenter validation of a gene expression-based prognostic signature in lymph node-negative primary breast cancer. J Clin Oncol. 2006;24:1665–1671. doi: 10.1200/JCO.2005.03.9115.
    9. Naderi A, Teschendorff AE, Barbosa-Morais NL, Pinder SE, Green AR, Powe DG, Robertson JF, Aparicio S, Ellis IO, Brenton JD, et al. A gene-expression signature to predict survival in breast cancer across independent data sets. Oncogene. 2007;26:1507–1516. doi: 10.1038/sj.onc.1209920.
    10. Teschendorff AE, Naderi A, Barbosa-Morais NL, Pinder SE, Ellis IO, Aparicio S, Brenton JD, Caldas C. A consensus prognostic gene expression classifier for ER positive breast cancer. Genome Biol. 2006;7:R101. doi: 10.1186/gb-2006-7-10-r101.
    11. van de Rijn M, Perou CM, Tibshirani R, Haas P, Kallioniemi O, Kononen J, Torhorst J, Sauter G, Zuber M, Kochli OR, et al. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am J Pathol. 2002;161:1991–1996.
    12. Malzahn K, Mitze M, Thoenes M, Moll R. Biological and prognostic significance of stratified epithelial cytokeratins in infiltrating ductal breast carcinomas. Virchows Arch. 1998;433:119–129. doi: 10.1007/s004280050226.
    13. Rakha EA, El-Rehim DA, Paish C, Green AR, Lee AH, Robertson JF, Blamey RW, Macmillan D, Ellis IO. Basal phenotype identifies a poor prognostic subgroup of breast cancer of clinical importance. Eur J Cancer. 2006;42:3149–3156. doi: 10.1016/j.ejca.2006.08.015.
    14. Rakha EA, El-Sayed ME, Green AR, Lee AH, Robertson JF, Ellis IO. Prognostic markers in triple-negative breast cancer. Cancer. 2007;109:25–32. doi: 10.1002/cncr.22381.
    15. Jumppanen M, Gruvberger-Saal S, Kauraniemi P, Tanner M, Bendahl PO, Lundin M, Krogh M, Kataja P, Borg A, Ferno M, et al. Basal-like phenotype is not associated with patient survival in estrogen-receptor-negative breast cancers. Breast Cancer Res. 2007;9:R16. doi: 10.1186/bcr1649.
    16. Eden P, Ritz C, Rose C, Ferno M, Peterson C. 'Good Old' clinical markers have similar power in breast cancer prognosis as microarray gene expression profilers. Eur J Cancer. 2004;40:1837–1841. doi: 10.1016/j.ejca.2004.02.025.
    17. Teschendorff AE, Naderi A, Barbosa-Morais NL, Caldas C. PACK: Profile Analysis using Clustering and Kurtosis to find molecular classifiers in cancer. Bioinformatics. 2006;22:2269–2275. doi: 10.1093/bioinformatics/btl174.
    18. van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002;347:1999–2009. doi: 10.1056/NEJMoa021967.
    19. Miller LD, Smeds J, George J, Vega VB, Vergara L, Ploner A, Pawitan Y, Hall P, Klaar S, Liu ET, Bergh J. An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc Natl Acad Sci USA. 2005;102:13550–13555. doi: 10.1073/pnas.0506230102.
    20. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci USA. 2003;100:9440–9445. doi: 10.1073/pnas.1530509100.
    21. Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, Deng S, Johnsen H, Pesich R, Geisler S, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci USA. 2003;100:8418–8423. doi: 10.1073/pnas.0932692100.
    22. Sotiriou C, Neo SY, McShane LM, Korn EL, Long PM, Jazaeri A, Martiat P, Fox SB, Harris AL, Liu ET. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc Natl Acad Sci USA. 2003;100:10393–10398. doi: 10.1073/pnas.1732912100.
    23. Hu Z, Fan C, Oh DS, Marron JS, He X, Qaqish BF, Livasy C, Carey LA, Reynolds E, Dressler L, et al. The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics. 2006;7:96. doi: 10.1186/1471-2164-7-96.
    24. Farmer P, Bonnefoi H, Becette V, Tubiana-Hulin M, Fumoleau P, Larsimont D, Macgrogan G, Bergh J, Cameron D, Goldstein D, et al. Identification of molecular apocrine breast tumours by microarray analysis. Oncogene. 2005;24:4660–4671. doi: 10.1038/sj.onc.1208561.
    25. Doane AS, Danso M, Lal P, Donaton M, Zhang L, Hudis C, Gerald WL. An estrogen receptor-negative breast cancer subset characterized by a hormonally regulated transcriptional program and response to androgen. Oncogene. 2006;25:3994–4008. doi: 10.1038/sj.onc.1209415.
    26. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, et al. Molecular portraits of human breast tumours. Nature. 2000;406:747–752. doi: 10.1038/35021093.
    27. Charafe-Jauffret E, Ginestier C, Monville F, Finetti P, Adelaide J, Cervera N, Fekairi S, Xerri L, Jacquemier J, Birnbaum D, Bertucci F. Gene expression profiling of breast cell lines identifies potential new basal markers. Oncogene. 2006;25:2273–2284. doi: 10.1038/sj.onc.1209254.
    28. Zhang B, Schmoyer D, Kirov S, Snoddy J. GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies. BMC Bioinformatics. 2004;5:1–8. doi: 10.1186/1471-2105-5-1.
    29. Kaufman L, Rousseeuw PJ. Finding Groups in Data: An Introduction to Cluster Analysis. New York: Wiley; 1990.
    30. Huang E, Cheng SH, Dressman H, Pittman J, Tsou MH, Horng CF, Bild A, Iversen ES, Liao M, Chen CM, et al. Gene expression predictors of breast cancer outcomes. Lancet. 2003;361:1590–1596. doi: 10.1016/S0140-6736(03)13308-9.
    31. Lee AH, Gillett CE, Ryder K, Fentiman IS, Miles DW, Millis RR. Different patterns of inflammation and prognosis in invasive carcinoma of the breast. Histopathology. 2006;48:692–701. doi: 10.1111/j.1365-2559.2006.02410.x.
    32. Marques LA, Franco EL, Torloni H, Brentani MM, da Silva-Neto JB, Brentani RR. Independent prognostic value of laminin receptor expression in breast cancer survival. Cancer Res. 1990;50:1479–1483.
    33. Nixon AJ, Neuberg D, Hayes DF, Gelman R, Connolly JL, Schnitt S, Abner A, Recht A, Vicini F, Harris JR. Relationship of patient age to pathologic features of the tumor and prognosis for patients with stage I or II breast cancer. J Clin Oncol. 1994;12:888–894.
    34. Rilke F, Colnaghi MI, Cascinelli N, Andreola S, Baldini MT, Bufalino R, Della Porta G, Menard S, Pierotti MA, Testori A. Prognostic significance of HER-2/neu expression in breast cancer and its relationship to other prognostic factors. Int J Cancer. 1991;49:44–49. doi: 10.1002/ijc.2910490109.
    35. Aaltomaa S, Lipponen P, Eskelinen M, Kosma VM, Marin S, Alhava E, Syrjanen K. Lymphocyte infiltrates as a prognostic variable in female breast cancer. Eur J Cancer. 1992;28A:859–864. doi: 10.1016/0959-8049(92)90134-N.
    36. Holmberg L, Adami HO, Lindgren A, Ekbom A, Sandstrom A, Bergstrom R. Prognostic significance of the Ackerman classification and other histopathological characteristics in breast cancer. An analysis of 1,349 consecutive cases with complete follow-up over seven years. APMIS. 1988;96:979–990.
    37. Carlomagno C, Perrone F, Lauria R, de Laurentiis M, Gallo C, Morabito A, Pettinato G, Panico L, Bellelli T, Apicella A, et al. Prognostic significance of necrosis, elastosis, fibrosis and inflammatory cell reaction in operable breast cancer. Oncology. 1995;52:272–277.
    38. Bertucci F, Finetti P, Cervera N, Charafe-Jauffret E, Mamessier E, Adelaide J, Debono S, Houvenaeghel G, Maraninchi D, Viens P, et al. Gene expression profiling shows medullary breast cancer is a subgroup of basal breast cancers. Cancer Res. 2006;66:4636–4644. doi: 10.1158/0008-5472.CAN-06-0031.
    39. Teschendorff AE, Wang Y, Barbosa-Morais NL, Brenton JD, Caldas C. A variational Bayesian mixture modelling framework for cluster analysis of gene-expression data. Bioinformatics. 2005;21:3025–3033. doi: 10.1093/bioinformatics/bti466.
    40. Agresti A. Categorical Data Analysis. Wiley Series in Probability and Statistics. New York: Wiley; 2002.
    41. Boehm JS, Zhao JJ, Yao J, Kim SY, Firestein R, Dunn IF, Sjostrom SK, Garraway LA, Weremowicz S, Richardson AL, et al. Integrative genomic approaches identify IKBKE as a breast cancer oncogene. Cell. 2007;129:1065–1079. doi: 10.1016/j.cell.2007.03.052.
    42. Buckley NE, Hosey AM, Gorski JJ, Purcell JW, Mulligan JM, Harkin DP, Mullan PB. BRCA1 regulates IFN-gamma signaling through a mechanism involving the type I IFNs. Mol Cancer Res. 2007;5:261–270. doi: 10.1158/1541-7786.MCR-06-0250.
    43. Racila E, Racila DM, Ritchie JM, Taylor C, Dahle C, Weiner GJ. The pattern of clinical breast cancer metastasis correlates with a single nucleotide polymorphism in the C1qA component of complement. Immunogenetics. 2006;58:1–8. doi: 10.1007/s00251-005-0077-y.
    44. Allan AL, George R, Vantyghem SA, Lee MW, Hodgson NC, Engel CJ, Holliday RL, Girvan DP, Scott LA, Postenka CO, et al. Role of the integrin-binding protein osteopontin in lymphatic metastasis of breast cancer. Am J Pathol. 2006;169:233–246. doi: 10.2353/ajpath.2006.051152.
    45. de Silva Rudland S, Martin L, Roshanlall C, Winstanley J, Leinster S, Platt-Higgins A, Carroll J, West C, Barraclough R, Rudland P. Association of S100A4 and osteopontin with specific prognostic factors and survival of patients with minimally invasive breast cancer. Clin Cancer Res. 2006;12:1192–1200. doi: 10.1158/1078-0432.CCR-05-1580.
    46. Alter O, Brown PO, Botstein D. Generalized singular value decomposition for comparative analysis of genome-scale expression data sets of two different organisms. Proc Natl Acad Sci USA. 2003;100:3351–3356. doi: 10.1073/pnas.0530258100.
    47. Tibshirani R, Hastie T, Narasimhan B, Chu G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA. 2002;99:6567–6572. doi: 10.1073/pnas.082099299.
    48. Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T, et al. The Ensembl genome database project. Nucleic Acids Res. 2002;30:38–41. doi: 10.1093/nar/30.1.38.
    49. Balanda KP, MacGillivray HL. Kurtosis: a critical review. Am Stat. 1988;42:111–119. doi: 10.2307/2684482.
    50. Snedecor GW, Cochran WG. Statistical Methods. 6th ed. Ames, IA: Iowa State University Press; 1967.
    51. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6:461–464. doi: 10.1214/aos/1176344136.
    52. Yeung KY, Fraley C, Murua A, Raftery AE, Ruzzo WL. Model-based clustering and data transformations for gene expression data. Bioinformatics. 2001;17:977–987. doi: 10.1093/bioinformatics/17.10.977.
    53. Attias H. Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence; 30-31 July 1999; Stockholm, Sweden. San Francisco, CA: Morgan Kaufmann; 1999. Inferring parameters and structure of latent variable models by variational bayes. pp. 21–30.
    54. MacKay DJ. Neural Networks: Artificial Intelligence and Industrial Applications. Proceedings of the 3rd Annual Symposium on Neural Networks: 14-15 September 1995; Nijmegen, The Netherlands. Berlin: Springer; 1995. Developments in probabilistic modelling with neural networks-ensemble learning. pp. 191–198.
    55. R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2003.

Source: PubMed
