The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups

Christina Curtis, Sohrab P Shah, Suet-Feung Chin, Gulisa Turashvili, Oscar M Rueda, Mark J Dunning, Doug Speed, Andy G Lynch, Shamith Samarajiwa, Yinyin Yuan, Stefan Gräf, Gavin Ha, Gholamreza Haffari, Ali Bashashati, Roslin Russell, Steven McKinney, METABRIC Group, Anita Langerød, Andrew Green, Elena Provenzano, Gordon Wishart, Sarah Pinder, Peter Watson, Florian Markowetz, Leigh Murphy, Ian Ellis, Arnie Purushotham, Anne-Lise Børresen-Dale, James D Brenton, Simon Tavaré, Carlos Caldas, Samuel Aparicio

METABRIC Group members (not listed above): David Huntsman, Helen Bardwell, Zhihao Ding, Linda Jones, Bin Liu, Irene Papatheodorou, Stephen J Sammut, Steven Chia, Karen Gelmon, Caroline Speers, Roger Blamey, Douglas Macmillan, Emad Rakha, Cheryl Gillett, Anita Grigoriadis, Emanuele de Rinaldis, Andy Tutt, Michelle Parisien, Sandra Troup, Derek Chan, Claire Fielding, Ana-Teresa Maia, Sarah McGuire, Michelle Osborne, Sara M Sayalero, Inmaculada Spiteri, James Hadfield, Lynda Bell, Katie Chow, Nadia Gale, Maria Kovalik, Ying Ng, Leah Prentice

Abstract

The elucidation of breast cancer subgroups and their molecular drivers requires integrated views of the genome and transcriptome from representative numbers of patients. We present an integrated analysis of copy number and gene expression in a discovery and validation set of 997 and 995 primary breast tumours, respectively, with long-term clinical follow-up. Inherited variants (copy number variants and single nucleotide polymorphisms) and acquired somatic copy number aberrations (CNAs) were associated with expression in ~40% of genes, with the landscape dominated by cis- and trans-acting CNAs. By delineating expression outlier genes driven in cis by CNAs, we identified putative cancer genes, including deletions in PPP2R2A, MTAP and MAP2K4. Unsupervised analysis of paired DNA–RNA profiles revealed novel subgroups with distinct clinical outcomes, which reproduced in the validation cohort. These include a high-risk, oestrogen-receptor-positive 11q13/14 cis-acting subgroup and a favourable prognosis subgroup devoid of CNAs. Trans-acting aberration hotspots were found to modulate subgroup-specific gene networks, including a TCR deletion-mediated adaptive immune response in the ‘CNA-devoid’ subgroup and a basal-specific chromosome 5 deletion-associated mitotic network. Our results provide a novel molecular stratification of the breast cancer population, derived from the impact of somatic CNAs on the transcriptome.

Figures

Figure 1. Germline and somatic variants influence tumour expression architecture
a, Venn diagrams depict the relative contribution of SNPs, CNVs and CNAs to genome-wide, cis and trans tumour expression variation for significant expression associations (Šidák adjusted P-value ≤0.0001). b, Histograms illustrate the proportion of variance explained by the most significantly associated predictor for each predictor type, where several of the top associations are indicated.
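The Šidák adjustment behind the significance threshold in Figure 1 can be sketched as follows. This is a minimal illustration of the standard formula, 1 − (1 − p)^m for m independent tests, and not the authors' pipeline; the function name `sidak_adjust` and the example numbers are hypothetical, since the number of tests is not stated here.

```python
import math


def sidak_adjust(p: float, m: int) -> float:
    """Sidak-adjusted P-value for one of m independent tests: 1 - (1 - p)**m."""
    if not 0.0 <= p <= 1.0 or m < 1:
        raise ValueError("need 0 <= p <= 1 and m >= 1")
    if p == 1.0:
        return 1.0
    # expm1/log1p keep precision when p is tiny and m is large.
    return -math.expm1(m * math.log1p(-p))


# Hypothetical numbers (not from the paper): raw P-value 1e-9 over 1,000 tests.
adjusted = sidak_adjust(1e-9, 1000)
is_significant = adjusted <= 1e-4  # threshold quoted in the caption
```

With a small raw P-value the adjusted value is close to m × p, so the adjustment behaves much like a Bonferroni correction in that regime.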
Figure 2. Patterns of cis outlying expression refine putative breast cancer drivers
A genome-wide view of outlying expression coincident with extreme copy number events in the CNA landscape highlights putative driver genes, as indicated by the arrows and numbered regions. The frequency (absolute count) of cases exhibiting an outlying expression profile at regions across the genome is shown, as is the distribution across subgroups for several regions in the insets. High-level amplifications are indicated in red and homozygous deletions in blue. Red asterisks above the bar plots mark observed distributions that differ significantly from those expected based on the overall population frequency (χ² test, P < 0.0001).
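The comparison of an observed subgroup distribution with the one expected from population frequencies, as marked by the asterisks in Figure 2, can be illustrated with a χ² goodness-of-fit statistic. This is a sketch under stated assumptions: the function name `chi_square_gof` and all counts and proportions are hypothetical, and the authors' exact test setup is not given here.

```python
def chi_square_gof(observed, expected_props):
    """Return (chi-square statistic, degrees of freedom) for observed counts
    against expected proportions that sum to 1."""
    if len(observed) != len(expected_props):
        raise ValueError("observed and expected_props must have equal length")
    if abs(sum(expected_props) - 1.0) > 1e-9:
        raise ValueError("expected proportions must sum to 1")
    total = sum(observed)
    expected = [total * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Hypothetical data: outlier cases in three subgroups versus the share of
# each subgroup in the whole population.
observed_counts = [30, 10, 10]
population_props = [0.2, 0.4, 0.4]
stat, df = chi_square_gof(observed_counts, population_props)
# A P-value would come from the chi-square survival function with df degrees
# of freedom, for example scipy.stats.chi2.sf(stat, df) if SciPy is available.
```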
Figure 3. Trans-acting aberration hotspots modulate concerted molecular pathways
a, Manhattan plot illustrating cis and trans expression-associated copy number aberrations from the eQTL analysis (top panel). The matrix of significant predictor–expression associations (adjusted P-value ≤0.0001) exhibits strong off-diagonal patterns (middle panel), and the frequency of mRNAs associated with a particular copy number aberration further illuminates these trans-acting aberration hotspots (bottom panel). The directionality of the associations is indicated as follows: cis: positive, red; negative, pink; trans: positive, blue; negative, green. b, Enrichment map of immune response modules in the trans-associated TRA network, where letters in parentheses represent the source database as follows: b, NCI-PID BioCarta; c, cancer cell map; k, KEGG; n, NCI-PID curated pathways; p, PANTHER; r, Reactome.
Figure 4. The integrative subgroups have distinct copy number profiles
Genome-wide frequencies (F, proportion of cases) of somatic CNAs (y-axis, upper plot) and the subtype-specific association (−log10 P-value) of aberrations (y-axis, bottom plot) based on a χ² test of independence are shown for each of the 10 integrative clusters. Regions of copy number gain are indicated in red and regions of loss in blue in the frequency plot (upper plot). Subgroups were ordered by hierarchical clustering of their copy number profiles in the discovery cohort (n = 997). For the validation cohort (n = 995), samples were classified into each of the integrative clusters as described in the text. The number of cases in each subgroup (n) is indicated, as are the in-group proportion (IGP) and associated P-value and the distribution of PAM50 subtypes within each cluster.
Figure 5. The integrative subgroups have distinct clinical outcomes
a, Kaplan–Meier plot of disease-specific survival (truncated at 15 years) for the integrative subgroups in the discovery cohort. For each cluster, the number of samples at risk is indicated as well as the total number of deaths (in parentheses). b, 95% confidence intervals for the Cox proportional hazard ratios are illustrated for the discovery and validation cohorts for selected values of key covariates, where each subgroup was compared against IntClust 3.
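The Kaplan–Meier curves in Figure 5a rest on the product-limit estimator, which multiplies the conditional survival fractions at each event time. The sketch below is a generic illustration, not the authors' code; the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time per case
    events -- 1 if the case died of disease at that time, 0 if censored
    Returns [(t, S(t))] at each distinct time with at least one death.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # deaths plus censorings leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up (years): deaths at 1 and 3, one case censored at 2.
curve = kaplan_meier([1, 2, 3], [1, 0, 1])
```

Truncation at 15 years, as in the figure, could be mimicked by treating every case still under follow-up at 15 years as censored at that time before calling the function.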

Source: PubMed
