Breast cancer genome and transcriptome integration implicates specific mutational signatures with immune cell infiltration

Marcel Smid, F Germán Rodríguez-González, Anieta M Sieuwerts, Roberto Salgado, Wendy J C Prager-Van der Smissen, Michelle van der Vlugt-Daane, Anne van Galen, Serena Nik-Zainal, Johan Staaf, Arie B Brinkman, Marc J van de Vijver, Andrea L Richardson, Aquila Fatima, Kim Berentsen, Adam Butler, Sancha Martin, Helen R Davies, Reno Debets, Marion E Meijer-Van Gelder, Carolien H M van Deurzen, Gaëtan MacGrogan, Gert G G M Van den Eynden, Colin Purdie, Alastair M Thompson, Carlos Caldas, Paul N Span, Peter T Simpson, Sunil R Lakhani, Steven Van Laere, Christine Desmedt, Markus Ringnér, Stefania Tommasi, Jorunn Eyford, Annegien Broeks, Anne Vincent-Salomon, P Andrew Futreal, Stian Knappskog, Tari King, Gilles Thomas, Alain Viari, Anita Langerød, Anne-Lise Børresen-Dale, Ewan Birney, Hendrik G Stunnenberg, Mike Stratton, John A Foekens, John W M Martens

Abstract

A recent comprehensive whole genome analysis of a large breast cancer cohort was used to link known and novel drivers and substitution signatures to the transcriptome of 266 cases. Here, we validate that subtype-specific aberrations show concordant expression changes for genes such as TP53, PIK3CA, PTEN, CCND1 and CDH1. We find that CCND3 expression levels do not correlate with amplification, while increased GATA3 expression in mutant GATA3 cancers suggests GATA3 is an oncogene. In luminal cases the total number of substitutions, irrespective of type, associates with cell cycle gene expression and adverse outcome, whereas the number of mutations of signatures 3 and 13 associates with immune-response specific gene expression, increased numbers of tumour-infiltrating lymphocytes and better outcome. Thus, while earlier reports imply that the sheer number of somatic aberrations could trigger an immune response, our data suggest that substitutions of a particular type are more effective in doing so than others.

Figures

Figure 1. Clustered correlation matrix of 266 breast cancer cases.
The left panel shows the dendrogram and clustered correlation matrix (red is positive, blue negative correlation) of 266 breast cancer cases. The top 5,000 most variable transcripts were used for correlating the samples. For the columns in the right panel, colour codes are as follows: ER: ER-positive dark grey, ER-negative light grey. Subtype: red, basal; dark blue, Luminal B; light blue, Luminal A; green, normal-like; and dark yellow, HER2. Grade: white, NA; light grey, grade 1; grey, grade 2; and black, grade 3. Tissuetype: red, ductal; blue, lobular; light blue, micropapillary; grey, mucinous; dark yellow, papillary; yellow, apocrine; and dark green, other type. # subs: The length of the green bar is proportional to the number of substitutions. Cases with >10,000 substitutions are shown with a soft-red coloured bar of equal length. The remaining nine columns show the status of driver genes. Light grey, wild type; dark yellow, copy-number amplification; blue, homozygous deletion; and mutations (substitution, indels, rearrangements) are dark green if activating and red if inactivating.
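The sample-correlation step described in this caption can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the caption does not say which variability measure or correlation coefficient was used, so variance and Pearson correlation are assumptions here, and the function names and toy values are invented for the example.

```python
from math import sqrt
from statistics import mean, pvariance


def top_variable(expr, n_top):
    """Return the names of the n_top transcripts with the highest variance."""
    ranked = sorted(expr, key=lambda t: pvariance(expr[t]), reverse=True)
    return ranked[:n_top]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def sample_correlation(expr, n_top):
    """Sample-by-sample correlation matrix over the most variable transcripts."""
    keep = top_variable(expr, n_top)
    n_samples = len(next(iter(expr.values())))
    # One profile per sample: its values across the retained transcripts.
    profiles = [[expr[t][j] for t in keep] for j in range(n_samples)]
    return [[pearson(p, q) for q in profiles] for p in profiles]


# Toy data (invented): transcript -> log2 expression in three samples.
toy = {
    "tA": [1.0, 5.0, 9.0],
    "tB": [2.0, 2.1, 2.0],  # nearly flat, so it is dropped when n_top=3
    "tC": [8.0, 4.0, 1.0],
    "tD": [3.0, 3.0, 7.0],
}
kept = top_variable(toy, 3)
matrix = sample_correlation(toy, 3)
```

With the cohort in the caption, `n_top` would be 5,000 and the matrix 266 by 266; the dendrogram would then come from hierarchical clustering of that matrix, which is not shown here.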
Figure 2. Boxplots of CCND3 and GATA3 expression.
All y axes show expression levels in our cohort (n=266, left panels, log2-FPKM) and from TCGA cases (n=960, right panels, RNA Seq V2 RSEM, log2) according to copy-number state of CCND3 (top panel, our cohort n=7 amplified, TCGA n=152 with loss and n=233 amplified) and GATA3 mutation state (bottom panel, our cohort n=25 mutated, TCGA n=95 mutated). The box is bounded by the first and third quartile with a horizontal line at the median, whiskers extend to the maximum and minimum value. The notch shows the 95% CI of the median.
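The numbers behind one such notched box can be sketched as below. The caption does not state how the 95% CI of the median was computed; the widely used approximation median ± 1.57 × IQR / √n is assumed here, and the quartile interpolation method is likewise an assumption. `box_stats` and the toy values are invented for the example.

```python
from math import sqrt
from statistics import median, quantiles


def box_stats(values):
    """Summary numbers for a notched boxplot of one group of samples."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    # Assumed notch formula: median +/- 1.57 * IQR / sqrt(n).
    half_notch = 1.57 * (q3 - q1) / sqrt(len(values))
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        # Whiskers run to the extremes, as stated in the caption.
        "whisker_low": min(values),
        "whisker_high": max(values),
        "notch_low": med - half_notch,
        "notch_high": med + half_notch,
    }


# Toy log2 expression values for one group (invented).
stats = box_stats([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
```

Under this approximation, notches of two groups that do not overlap are commonly read as rough evidence that the medians differ; it is not a substitute for a formal test.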
Figure 3. Mitotic cell cycle gene activity related to mutational signatures and outcome.
(a) The average expression of genes (n=409) from Gene-Ontology term Mitotic Cell Cycle (GO:0000278) was used to rank ER-positive samples. The vertical black line indicates the third quartile border. Top panel: heatmap of median centred expression values in log2-FPKM, red indicates above median, blue below median expression. Genes are in rows, samples in columns. Below the heatmap: 'Average MCC' shows the average expression of the cell cycle genes. Grade: pathological grade; white, NA; light grey, grade 1; grey, grade 2; and black, grade 3. Last five rows: the length of the green bar is proportional to the number of substitutions. # subs: the total number of substitutions, samples with >10,000 substitutions are shown with a soft-red coloured bar and are of equal length. For the columns labelled Sig. 1, 3, 8 and 13, soft-red indicates samples with >3,000 such substitutions. (b,c) Overall and relapse-free survival Kaplan–Meier curves. Blue indicates patients with less than the median number of substitutions, red indicates higher than median. P values are logrank-test values. The x-axis shows time in months, y-axis shows the proportion of surviving patients.
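The ranking in panel (a) amounts to a per-sample gene-set score followed by a third-quartile cut. A minimal sketch, assuming the score is the plain mean of log2 values over the gene set and that median centring is applied per gene for display only; the caption does not spell out either detail, and the function names and toy data are hypothetical.

```python
from statistics import mean, median, quantiles


def median_centre(row):
    """Subtract a gene's median across samples (heatmap display values)."""
    m = median(row)
    return [v - m for v in row]


def rank_by_gene_set(expr, gene_set):
    """Rank samples by average expression of gene_set and flag the top quartile.

    expr maps gene -> list of log2 expression values, one per sample.
    Returns (order, scores, high): order lists sample indices from lowest
    to highest score; high marks samples above the third quartile.
    """
    genes = [g for g in gene_set if g in expr]
    n_samples = len(expr[genes[0]])
    scores = [mean(expr[g][j] for g in genes) for j in range(n_samples)]
    q3 = quantiles(scores, n=4, method="inclusive")[2]
    order = sorted(range(n_samples), key=lambda j: scores[j])
    high = [s > q3 for s in scores]
    return order, scores, high


# Toy data (invented): two "cell cycle" genes measured in five samples.
toy = {
    "g1": [4.0, 1.0, 10.0, 2.0, 3.0],
    "g2": [4.0, 1.0, 6.0, 2.0, 3.0],
}
order, scores, high = rank_by_gene_set(toy, ["g1", "g2", "not_measured"])
centred = median_centre(toy["g1"])
```

The same function applies unchanged to any other gene set, for example a TIL signature, by passing a different `gene_set`.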
Figure 4. Activity of TIL-signature genes related to mutational signatures.
The average expression of genes (n=116) from a TIL specific RNA-signature was used to rank ER-positive samples. The vertical black line indicates the third quartile border. Top panel: heatmap of median centred expression values in log2-FPKM, red indicates above median, blue below median expression. Genes are in rows, samples in columns. Below the heatmap: the first row shows the average expression of the TIL genes. Last three rows: the length of the green bar is proportional to the number of substitutions of the indicated signatures. Samples with >3,000 substitutions are shown with a soft-red coloured bar and are of equal length.
Figure 5. Combined MCC and TIL-signature genes and outcome.
Overall (a) and relapse-free (b) survival Kaplan–Meier curves of our cohort and (c) metastasis-free survival of independent in-house and public data sets. Green line indicates patients with high expression of TIL genes (top quartile) and low expression of MCC genes (bottom three quartiles). The red line indicates patients with low expression of TIL genes and high expression of MCC genes. Blue indicates the remaining patients. P values are logrank-test for trend values. The x-axis shows time in months, y-axis shows the proportion of patients. For the numbers at risk, 1 indicates the highTIL/lowCC group (green line), 3 indicates the lowTIL/highCC (red line) and 2 the remaining patients.
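The three-group coding used for the numbers at risk can be sketched as follows. It assumes that "high" means above the third quartile of the respective score for both TIL and MCC genes, which the caption states for TIL and implies for MCC; the function names and toy scores are invented for illustration.

```python
from statistics import quantiles


def third_quartile(scores):
    """Third quartile of a list of scores (inclusive interpolation assumed)."""
    return quantiles(scores, n=4, method="inclusive")[2]


def assign_groups(til_scores, mcc_scores):
    """Label each sample 1, 2 or 3 following the numbers-at-risk coding."""
    til_cut = third_quartile(til_scores)
    mcc_cut = third_quartile(mcc_scores)
    groups = []
    for til, mcc in zip(til_scores, mcc_scores):
        high_til = til > til_cut
        high_mcc = mcc > mcc_cut
        if high_til and not high_mcc:
            groups.append(1)  # highTIL/lowCC (green line)
        elif high_mcc and not high_til:
            groups.append(3)  # lowTIL/highCC (red line)
        else:
            groups.append(2)  # remaining patients (blue line)
    return groups


# Toy per-sample average scores (invented).
til = [1.0, 2.0, 3.0, 4.0, 9.0, 8.0, 2.0, 1.0]
mcc = [1.0, 9.0, 2.0, 3.0, 1.0, 8.0, 2.0, 7.0]
groups = assign_groups(til, mcc)
```

Because the labels are ordered 1, 2, 3 from the presumed best to worst group, they can be fed directly into a logrank test for trend, which expects ordered groups.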

Source: PubMed
