Mutational landscape and significance across 12 major cancer types

Cyriac Kandoth, Michael D McLellan, Fabio Vandin, Kai Ye, Beifang Niu, Charles Lu, Mingchao Xie, Qunyuan Zhang, Joshua F McMichael, Matthew A Wyczalkowski, Mark D M Leiserson, Christopher A Miller, John S Welch, Matthew J Walter, Michael C Wendl, Timothy J Ley, Richard K Wilson, Benjamin J Raphael, Li Ding

Abstract

The Cancer Genome Atlas (TCGA) has used the latest sequencing and analysis methods to identify somatic variants across thousands of tumours. Here we present data and analytical results for point mutations and small insertions/deletions from 3,281 tumours across 12 tumour types as part of the TCGA Pan-Cancer effort. We illustrate the distributions of mutation frequencies, types and contexts across tumour types, and establish their links to tissues of origin, environmental/carcinogen influences, and DNA repair defects. Using the integrated data sets, we identified 127 significantly mutated genes from well-known (for example, mitogen-activated protein kinase, phosphatidylinositol-3-OH kinase, Wnt/β-catenin and receptor tyrosine kinase signalling pathways, and cell cycle control) and emerging (for example, histone, histone modification, splicing, metabolism and proteolysis) cellular processes in cancer. The average number of mutations in these significantly mutated genes varies across tumour types; most tumours have two to six, indicating that the number of driver mutations required during oncogenesis is relatively small. Mutations in transcriptional factors/regulators show tissue specificity, whereas histone modifiers are often mutated across several cancer types. Clinical association analysis identifies genes having a significant effect on survival, and investigations of mutations with respect to clonal/subclonal architecture delineate their temporal orders during tumorigenesis. Taken together, these results lay the groundwork for developing new diagnostics and individualizing cancer treatment.

Conflict of interest statement

The authors declare no competing financial interests.

Figures

Figure 1. Mutation frequencies, spectra and contexts across 12 cancer types.
a, Distribution of mutation frequencies across 12 cancer types. Dashed grey and solid white lines denote average across cancer types and median for each type, respectively. b, Mutation spectrum of six transition (Ti) and transversion (Tv) categories for each cancer type. c, Hierarchically clustered mutation context (defined by the proportion of A, T, C and G nucleotides within ±2 bp of variant site) for six mutation categories. Cancer types correspond to colours in a. Colour denotes degree of correlation: yellow (r = 0.75) and red (r = 1).
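The six-category spectrum in panel b can be illustrated with a minimal sketch. It assumes each single-nucleotide variant is given as a (reference, alternate) base pair and that strand-complementary changes are collapsed into one category, as in labels such as C>T/G>A. The function names `classify_snv` and `mutation_spectrum` are hypothetical and are not taken from the TCGA pipeline.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# After collapsing to an A or C reference base, only these two changes are transitions.
TRANSITIONS = {"A>G", "C>T"}


def classify_snv(ref, alt):
    """Return (category label, 'Ti' or 'Tv') for one single-nucleotide variant."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    # Collapse strand-complementary changes so the reference base is A or C.
    if ref in ("G", "T"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    key = f"{ref}>{alt}"
    label = f"{key}/{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"
    kind = "Ti" if key in TRANSITIONS else "Tv"
    return label, kind


def mutation_spectrum(snvs):
    """Return the proportion of variants falling in each of the six categories."""
    counts = Counter(classify_snv(ref, alt)[0] for ref, alt in snvs)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```

Calling `mutation_spectrum` once per cancer type on that type's variant list would give the per-type proportions that a stacked bar chart like panel b displays.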
Figure 2. The 127 SMGs from 20 cellular processes in cancer identified in 12 cancer types.
Percentages of samples mutated in individual tumour types and Pan-Cancer are shown, with the highest percentage in each gene among 12 cancer types in bold.
Figure 3. Distribution of mutations in 127 SMGs across Pan-Cancer cohort.
Box plot displays median numbers of non-synonymous mutations, with outliers shown as dots. In total, 3,210 tumours were used for this analysis (hypermutators excluded).
Figure 4. Unsupervised clustering based on mutation status of SMGs.
Tumours having no mutation or more than 500 mutations were excluded. A mutation status matrix was constructed for 2,611 tumours. Major clusters of mutations detected in UCEC, COAD, GBM, AML, KIRC, OV and BRCA were highlighted. Complete gene list shown in Extended Data Fig. 3.
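A binary mutation status matrix of the kind described in this legend could be assembled roughly as follows. This is a sketch under stated assumptions: the input is a list of (tumour, gene) mutation records, "no mutation" is read as "no mutation in any SMG", and the 500-mutation cutoff applies to all records for a tumour. The function name `build_status_matrix` and the input layout are hypothetical.

```python
def build_status_matrix(mutations, smgs, max_mutations=500):
    """Build a tumour-by-gene 0/1 matrix from (tumour_id, gene) records.

    Returns (sorted gene list, {tumour_id: [0/1 per gene]}).
    """
    # Group every mutated gene by tumour so the total burden can be checked.
    per_tumour = {}
    for tumour, gene in mutations:
        per_tumour.setdefault(tumour, []).append(gene)

    smg_set = set(smgs)
    genes = sorted(smg_set)
    matrix = {}
    for tumour, mutated in per_tumour.items():
        if len(mutated) > max_mutations:
            continue  # drop tumours with more than max_mutations mutations
        hit = smg_set.intersection(mutated)
        if not hit:
            continue  # assumption: drop tumours with no mutated SMG
        matrix[tumour] = [1 if gene in hit else 0 for gene in genes]
    return genes, matrix
```

The resulting rows can then be passed to any clustering routine; the legend does not say which distance measure or linkage was used, so none is assumed here.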
Figure 5. Driver initiation and progression mutations and tumour clonal architecture.
a, Variant allele fraction (VAF) distribution of mutations in SMGs across tumours from AML, BRCA and UCEC for mutations (≥20× coverage) in copy neutral segments. SMGs having ≥5 mutation data points were included. ChrX, chromosome X. b, In AML sample TCGA-AB-2968 (WGS), two DNMT3A mutations are in the founding clone, and one NRAS mutation is in the subclone. In BRCA tumour TCGA-BH-A18P (exome), one FOXA1 mutation is in the founding clone, and PIK3R1 and MLL3 mutations are in the subclone. In UCEC tumour TCGA-B5-A0JV (exome), PIK3CA, ARID1A and CTCF mutations are in the founding clone, and NRAS, PTEN and KRAS mutations are in the secondary clone. Asterisk denotes stop codon.
Extended Data Figure 1. Mutation context across 12 cancer types.
Mutation context showing proportions of A, T, C and G nucleotides within ±5 bp for all validated mutations of type C>G/G>C and C>T/G>A across all 12 cancer types. The y axis denotes the total number of mutations in each category.
Extended Data Figure 2. The distribution of KRAS hotspot mutations across tumour types.
Distribution of changes caused by mutations of the KRAS hotspot at amino acids 12 and 13. Lung adenocarcinoma has a significantly higher proportion of Gly12Cys mutations than other cancers (P < 3.2 × 10⁻¹⁰), caused by the increase in C>A transversions in the genomic DNA at that location.
Extended Data Figure 3. Unsupervised clustering based on mutation status of SMGs.
Tumours having no mutation or more than 500 mutations were excluded to reduce noise. A mutation status matrix was constructed for 2,611 tumours. Major clusters of mutations detected in UCEC, COAD, GBM, AML, KIRC, OV and BRCA were highlighted. The shorter version is shown in Fig. 4.
Extended Data Figure 4. Mutation relation analysis in individual tumour types and the Pan-Cancer set.
a, Exclusivity and co-occurrence between SMGs in each tumour type. The −log₁₀(P) value appears in either red or green if the pair shows exclusivity or co-occurrence, respectively. b, Exclusivity and co-occurrence between genes in the most significant (q < 0.05) pairs in Pan-Cancer set. Colour scheme is as in a.
Extended Data Figure 5. Mutually exclusive mutations identified by Dendrix in the Pan-Cancer and individual cancer type data sets.
a, The highest scoring exclusive set of mutated genes in 127 SMGs contains several genes that are strongly associated with one cancer type. b, The highest scoring exclusive set of mutations in the top 600 genes (not enriched for mutations in one cancer type) reported by MuSiC. c, Relationships between exclusive gene sets identified by Dendrix in individual cancer types. Eight types include TP53 in the most exclusive set, three include KRAS, and two include PTEN, with the remaining genes appearing in only a single type. d, Exclusivity and co-occurrence assessed at the Pan-Cancer level. The −log₁₀(P) value appears in red or green if the pair shows exclusivity or co-occurrence, respectively. KIRC is most exclusive to other tumour types, whereas COAD/READ presented strong co-occurrence with other types.
Extended Data Figure 6. Kaplan–Meier plots for genes significantly associated with survival.
Plots are shown for 24 genes showing significant (P ≤ 0.05) association in individual cancer types. Although NPM1 mutations in patients with AML having intermediate cytogenetic risk are relatively benign in the absence of internal tandem duplications in FLT3, we did not stratify patients based on cytogenetics or FLT3 internal tandem duplication status in this analysis, and cannot discern this effect. Because most patients with OV (95%) have TP53 mutations, we could not obtain sufficient non-TP53 mutant controls for confidently dissecting the relationship between TP53 status and survival in OV.
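For readers unfamiliar with how a Kaplan–Meier curve is derived, the following is a minimal sketch of the standard product-limit estimator. It assumes follow-up times and event indicators (1 for death, 0 for censored) as inputs; the function name `kaplan_meier` is hypothetical, and this is not the authors' analysis code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve
```

Comparing curves for mutant and non-mutant patients in one cancer type would additionally need a significance test such as the log-rank test, which this sketch does not include.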
Extended Data Figure 7. VAF distribution of mutations in SMGs across tumours from BLCA, KIRC, HNSC, LUAD, LUSC, COAD/READ, OV and GBM.
To minimize the effect of copy number alterations on VAFs, only mutations residing in copy number neutral segments were used for this analysis. Only mutation sites with ≥20× coverage were used for analysis and plotting. SMGs with at least five data points were included in the plot.
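The three filters described in this legend can be sketched as below. The sketch assumes each mutation site is a record with read counts, a gene name and a copy-neutral flag, and that coverage is the sum of reference and variant reads. The field names and the function name `vaf_by_gene` are hypothetical.

```python
from collections import defaultdict


def vaf_by_gene(sites, min_coverage=20, min_points=5):
    """Return {gene: [VAF, ...]} for sites that pass the coverage and copy number filters."""
    per_gene = defaultdict(list)
    for site in sites:
        # Assumption: coverage is reference reads plus variant reads.
        depth = site["ref_reads"] + site["alt_reads"]
        if depth < min_coverage or not site["copy_neutral"]:
            continue
        per_gene[site["gene"]].append(site["alt_reads"] / depth)
    # Keep only genes with enough data points to plot a distribution.
    return {gene: vafs for gene, vafs in per_gene.items() if len(vafs) >= min_points}
```

In a pure, copy-neutral tumour a heterozygous mutation in the founding clone is expected near a VAF of 0.5, so lower values for a gene are consistent with subclonal mutations or normal-cell contamination.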
Extended Data Figure 8. Mutation expression and tumour clonal architecture in AML, BRCA and UCEC.
a, Density plots of expressed VAFs for mutations in SMGs (blue) and non-SMGs (red). b, SciClone clonality example plots for AML (validation data), BRCA and UCEC. Two plots are shown for each case: kernel density (top), followed by the plot of tumour VAF by sequence depth for sites from selected copy number neutral regions. Mutations (with annotations) in SMGs were shown.
Extended Data Figure 9. Summary of major findings in Pan-Cancer 12.
Systematic analysis of the TCGA Pan-Cancer mutation dataset identifies SMGs, cancer-related cellular processes, and genes associated with clinical features and tumour progression.

Source: PubMed
