Signatures of mutational processes in human cancer

Ludmil B Alexandrov, Serena Nik-Zainal, David C Wedge, Samuel A J R Aparicio, Sam Behjati, Andrew V Biankin, Graham R Bignell, Niccolò Bolli, Ake Borg, Anne-Lise Børresen-Dale, Sandrine Boyault, Birgit Burkhardt, Adam P Butler, Carlos Caldas, Helen R Davies, Christine Desmedt, Roland Eils, Jórunn Erla Eyfjörd, John A Foekens, Mel Greaves, Fumie Hosoda, Barbara Hutter, Tomislav Ilicic, Sandrine Imbeaud, Marcin Imielinski, Natalie Jäger, David T W Jones, David Jones, Stian Knappskog, Marcel Kool, Sunil R Lakhani, Carlos López-Otín, Sancha Martin, Nikhil C Munshi, Hiromi Nakamura, Paul A Northcott, Marina Pajic, Elli Papaemmanuil, Angelo Paradiso, John V Pearson, Xose S Puente, Keiran Raine, Manasa Ramakrishna, Andrea L Richardson, Julia Richter, Philip Rosenstiel, Matthias Schlesner, Ton N Schumacher, Paul N Span, Jon W Teague, Yasushi Totoki, Andrew N J Tutt, Rafael Valdés-Mas, Marit M van Buuren, Laura van 't Veer, Anne Vincent-Salomon, Nicola Waddell, Lucy R Yates, Australian Pancreatic Cancer Genome Initiative, ICGC Breast Cancer Consortium, ICGC MMML-Seq Consortium, ICGC PedBrain, Jessica Zucman-Rossi, P Andrew Futreal, Ultan McDermott, Peter Lichter, Matthew Meyerson, Sean M Grimmond, Reiner Siebert, Elías Campo, Tatsuhiro Shibata, Stefan M Pfister, Peter J Campbell, Michael R Stratton, Alexander Claviez, Andreas Rosenwald, Arndt Borkhardt, Benedikt Brors, Bernhard Radlwimmer, Chris Lawerenz, Cristina Lopez, David Langenberger, Dennis Karsch, Dido Lenze, Dieter Kube, Ellen Leich, Gesine Richter, Jan Korbel, Jessica Hoell, Jürgen Eils, Kebriah Hezaveh, Lorenz Trümper, Maciej Rosolowski, Marc Weniger, Marius Rohde, Markus Kreuz, Markus Loeffler, Markus Schilhabel, Martin Dreyling, Martin-Leo Hansmann, Michael Hummel, Monika Szczepanowski, Ole Ammerpohl, Peter F Stadler, Peter Möller, Ralf Küppers, Siegfried Haas, Sonja Eberth, Stefan Schreiber, Stephan H Bernhart, Steve Hoffmann, Sylwester Radomski, Ulrike Kostezka, Wolfram Klapper, Christos Sotiriou, Denis Larsimont, Delphine Vincent, Marion Maetens, Odette Mariani, Anieta M Sieuwerts, John W M Martens, Jon G Jonasson, Isabelle Treilleux, Emilie Thomas, Gaëtan Mac Grogan, Cécile Mannina, Laurent Arnould, Laura Burillier, Jean-Louis Merlin, Magali Lefebvre, Frédéric Bibeau, Blandine Massemin, Frédérique Penault-Llorca, Qian Lopez, Marie-Christine Mathieu, Per Eystein Lonning, Margrete Schlooz-Vries, Jolien Tol, Hanneke van Laarhoven, Fred Sweep, Peter Bult

Abstract

All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.

Figures

Figure 1. The prevalence of somatic mutations across human cancer types
Every dot represents a sample while the red horizontal lines are the median numbers of mutations in the respective cancer types. The vertical axis (log scaled) shows the number of mutations per megabase while the different cancer types are ordered on the horizontal axis based on their median numbers of somatic mutations. We would like to thank Gad Getz and colleagues for the design of this figure.
Figure 2. Validated mutational signatures found in human cancer
Each signature is displayed according to the 96-substitution classification defined by the substitution class and the sequence context immediately 3′ and 5′ to the mutated base. The probability bars for the six types of substitutions are displayed in different colors. The mutation types are on the horizontal axes, while the vertical axes depict the percentage of mutations attributed to a specific mutation type. All mutational signatures are displayed based on the trinucleotide frequency of the human genome. Higher-resolution versions of the panels are provided in Supplementary Figures 2 to 23, respectively.
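As an illustration of the 96-type classification described in this legend, the following minimal Python sketch enumerates the six substitution classes in each of the 16 possible flanking contexts. The label format `A[C>T]G`, the function names, and the rule of reporting each substitution from the pyrimidine strand are assumptions made for this example; they are not taken from the authors' software.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, written from the pyrimidine of the mutated base pair
# (an assumed convention for this sketch).
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def mutation_types():
    """Return the 96 labels: 6 substitution classes x 4 bases 5' x 4 bases 3'."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def classify(five, ref, alt, three):
    """Map one substitution and its flanking bases onto one of the 96 labels.

    If the reference base is a purine, the trinucleotide is reverse-complemented
    so that the mutated base is always reported as a pyrimidine.
    """
    if ref in "AG":
        ref, alt = ref.translate(COMPLEMENT), alt.translate(COMPLEMENT)
        # Reverse complement swaps the flanks as well as complementing them.
        five, three = three.translate(COMPLEMENT), five.translate(COMPLEMENT)
    return f"{five}[{ref}>{alt}]{three}"
```

For example, a G>A change in the context 5′-TGC-3′ is the same event as a C>T change in 5′-GCA-3′ on the opposite strand, so both are counted under the label `G[C>T]A`.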
Figure 3. The presence of mutational signatures across human cancer types
Cancer types are ordered alphabetically as columns while mutational signatures are displayed as rows. “Other” indicates mutational signatures for which we were not able to perform validation or for which validation failed (Supplementary Figs 24 to 28). Prevalence in cancer samples indicates the percentage of samples from our dataset of 7,042 cancers in which the signature contributed a significant number of somatic mutations. For most signatures, a significant number of mutations in a sample is defined as more than 100 substitutions or more than 25% of all mutations in that sample.
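The significance rule in this legend (more than 100 substitutions, or more than 25% of all mutations in the sample) can be sketched as follows. The function names, the input layout (one dictionary of per-signature mutation counts per sample), and the use of the sum of those counts as the sample total are assumptions made for illustration; the legend also notes that the rule applies to most, not all, signatures.

```python
def contributes_significantly(signature_count, total_mutations,
                              min_count=100, min_fraction=0.25):
    """Apply the stated rule: > 100 substitutions OR > 25% of the sample's mutations."""
    if total_mutations <= 0:
        return False
    return (signature_count > min_count
            or signature_count / total_mutations > min_fraction)


def prevalence(samples, signature):
    """Percentage of samples in which `signature` contributes significantly.

    `samples` is a list of dicts mapping signature name -> number of mutations
    attributed to that signature in one sample (hypothetical input layout).
    """
    if not samples:
        return 0.0
    hits = sum(
        contributes_significantly(sample.get(signature, 0), sum(sample.values()))
        for sample in samples
    )
    return 100.0 * hits / len(samples)
```

Both thresholds are strict inequalities here, matching the wording "more than"; a signature contributing exactly 100 substitutions or exactly 25% of a sample's mutations does not count.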
Figure 4. The contributions of mutational signatures to individual cancers of selected cancer types
Each bar represents a typical selected sample from the respective cancer type and the vertical axis denotes the number of mutations per megabase. Contributions across all cancer samples can be found in Supplementary Figures 29 to 58. A summary of the total contributions for all operative mutational processes in a cancer type can be found in Supplementary Figures 59 to 88. “Other” indicates mutational signatures for which we were not able to perform validation or for which validation failed (Supplementary Figs 24 to 28).
Figure 5. Mutational signatures with strong transcriptional strand bias
Mutations are shown according to the 192-mutation classification incorporating the substitution type, the sequence context immediately 5′ and 3′ to the mutated base and whether the mutated pyrimidine is on the transcribed or untranscribed strand. The mutation types are displayed on the horizontal axis, while the vertical axis depicts the percentage of mutations attributed to a specific mutation type. Higher-resolution versions of the panels are provided in Supplementary Figures 89 to 95, respectively.
Figure 6. Kataegis in three cancers
Each of these “rainfall” plots represents an individual cancer sample in which each dot represents a single somatic mutation ordered on the horizontal axis according to its position in the human genome. The vertical axis denotes the genomic distance of each mutation from the previous mutation. Clusters of mutations in kataegis are arrowed.
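The quantity on the vertical axis of a rainfall plot, the distance from each mutation to the previous one, can be computed with a short sketch like the one below. The input format (pairs of chromosome name and position), the choice to compute distances separately within each chromosome, and the omission of the first mutation on each chromosome are assumptions for this example; the legend does not specify how these cases are handled.

```python
from collections import defaultdict


def intermutation_distances(mutations):
    """Return, per chromosome, (position, distance to previous mutation) pairs.

    `mutations` is an iterable of (chromosome, position) tuples in any order.
    The first mutation on each chromosome has no predecessor and is omitted
    (an assumption made for this sketch).
    """
    by_chrom = defaultdict(list)
    for chrom, pos in mutations:
        by_chrom[chrom].append(pos)

    result = {}
    for chrom, positions in by_chrom.items():
        positions.sort()  # order mutations by genomic position
        result[chrom] = [
            (cur, cur - prev) for prev, cur in zip(positions, positions[1:])
        ]
    return result
```

Plotting the distances on a log scale against genomic position would give the rainfall layout described above, in which runs of closely spaced mutations appear as dips toward small distances.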


Source: PubMed
