Mutational processes molding the genomes of 21 breast cancers

Serena Nik-Zainal, Ludmil B Alexandrov, David C Wedge, Peter Van Loo, Christopher D Greenman, Keiran Raine, David Jones, Jonathan Hinton, John Marshall, Lucy A Stebbings, Andrew Menzies, Sancha Martin, Kenric Leung, Lina Chen, Catherine Leroy, Manasa Ramakrishna, Richard Rance, King Wai Lau, Laura J Mudie, Ignacio Varela, David J McBride, Graham R Bignell, Susanna L Cooke, Adam Shlien, John Gamble, Ian Whitmore, Mark Maddison, Patrick S Tarpey, Helen R Davies, Elli Papaemmanuil, Philip J Stephens, Stuart McLaren, Adam P Butler, Jon W Teague, Göran Jönsson, Judy E Garber, Daniel Silver, Penelope Miron, Aquila Fatima, Sandrine Boyault, Anita Langerød, Andrew Tutt, John W M Martens, Samuel A J R Aparicio, Åke Borg, Anne Vincent Salomon, Gilles Thomas, Anne-Lise Børresen-Dale, Andrea L Richardson, Michael S Neuberger, P Andrew Futreal, Peter J Campbell, Michael R Stratton, Breast Cancer Working Group of the International Cancer Genome Consortium

Abstract

All cancers carry somatic mutations. The patterns of mutation in cancer genomes reflect the DNA damage and repair processes to which cancer cells and their precursors have been exposed. To explore these mechanisms further, we generated catalogs of somatic mutation from 21 breast cancers and applied mathematical methods to extract mutational signatures of the underlying processes. Multiple distinct single- and double-nucleotide substitution signatures were discernible. Cancers with BRCA1 or BRCA2 mutations exhibited a characteristic combination of substitution mutation signatures and a distinctive profile of deletions. Complex relationships between somatic mutation prevalence and transcription were detected. A remarkable phenomenon of localized hypermutation, termed "kataegis," was observed. Regions of kataegis differed between cancers but usually colocalized with somatic rearrangements. Base substitutions in these regions were almost exclusively of cytosine at TpC dinucleotides. The mechanisms underlying most of these mutational signatures are unknown. However, a role for the APOBEC family of cytidine deaminases is proposed.

Copyright © 2012 Elsevier Inc. All rights reserved.

Figures

Graphical abstract
Figure 1
Somatic Mutation Profiles of 21 Breast Cancers, Related to Table S1. Breast cancers are grouped according to subtype on the far left. (A) Base substitution mutation spectra. ∗Ultra-deep sequenced PD4120a has an alternative scale on the x axis (0 to 45,000). (B) Mutation spectra of double substitutions from all 21 samples. (C) Genomic heatmap constructed from counts of each mutation-type at each mutation context, corrected for the frequency of each trinucleotide in the reference genome. Log-transformed values of these ratios are plotted in the heatmap. The 5′ base to each mutated base is shown on the vertical axis and the 3′ base on the horizontal axis. The heatmap scale is at the bottom. (D) Proportion of the total substitutions contributed by each of the five mutational signatures, as identified by NMF analysis, for all 21 cancer genomes.
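The correction described for panel (C) can be sketched in Python. This is a minimal illustration, not the authors' pipeline: the function names, the convention of reporting every substitution on the pyrimidine strand, and the use of the natural logarithm are all assumptions, because the caption states only that counts were corrected for trinucleotide frequency and log-transformed.

```python
import math
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def classify(ref_context, alt):
    """Return (substitution, trinucleotide) reported on the pyrimidine strand.

    ref_context is the 3-base reference sequence centered on the mutated base;
    alt is the mutant base on the same strand.
    """
    if ref_context[1] in "AG":  # purine reference base: switch strands
        ref_context = revcomp(ref_context)
        alt = alt.translate(COMPLEMENT)
    return f"{ref_context[1]}>{alt}", ref_context

def trinucleotide_counts(sequence):
    """Count reference trinucleotides, folded onto the pyrimidine strand."""
    counts = Counter()
    for i in range(len(sequence) - 2):
        tri = sequence[i:i + 3]
        if set(tri) <= set("ACGT"):  # skip N and other ambiguity codes
            counts[revcomp(tri) if tri[1] in "AG" else tri] += 1
    return counts

def log_ratios(mutation_counts, background):
    """Log of (mutation count / reference trinucleotide count) per channel.

    Channels with a zero count or no background are skipped (an assumption).
    """
    out = {}
    for (sub, tri), n in mutation_counts.items():
        if n > 0 and background.get(tri, 0) > 0:
            out[(sub, tri)] = math.log(n / background[tri])
    return out
```

For example, a G>A change in the reference context TGA is reported as C>T at TCA, so mutations on either strand fall into the same heatmap cell.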
Figure 2
Five Mutational Signatures Extracted by NMF in 21 Breast Cancers, Related to Figure S1. (A) Fractional contribution of each mutation-type at each context for the five mutational signatures identified by NMF analysis. The major components contributing to each signature are highlighted with arrows. (B) Cluster dendrogram generated by unsupervised hierarchical clustering based on the contributions of the five mutational signatures identified by NMF to the 21 breast cancer genomes.
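The idea behind the NMF decomposition can be illustrated with a small pure-Python sketch. It assumes the classic multiplicative update rules of Lee and Seung for the squared-error objective; the captions do not say which NMF variant, initialization, or stopping rule the authors used, so the function names, iteration count, and random seed here are illustrative only. The input matrix V has one row per mutation-type/context channel and one column per sample; W holds the signatures and H holds their contributions to each sample.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, k, iterations=500, seed=0, eps=1e-9):
    """Approximate V (channels x samples) as W (channels x k) times H (k x samples).

    Multiplicative updates keep every entry of W and H non-negative.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H

def reconstruction_error(V, W, H):
    """Frobenius norm of V - WH."""
    R = matmul(W, H)
    return sum((v - r) ** 2 for vrow, rrow in zip(V, R)
               for v, r in zip(vrow, rrow)) ** 0.5
```

Running `nmf` for a range of k values and comparing `reconstruction_error` gives a crude version of the error curve used for model selection in Figure S1; normalizing each column of W to sum to 1 would give per-signature fractions comparable to panel (A).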
Figure 3
Kataegis, Regional Hypermutation of Base Substitutions, Related to Figure S2. (A) Rainfall plot of PD4107a. Mutations are ordered on the x axis from the first variant on the short arm of chromosome 1 to the last variant on the long arm of chromosome X and are colored according to mutation-type. The distance between each mutation and the one prior to it (the intermutation distance) is plotted on the vertical axis on a log scale. Most mutations in this genome have an intermutation distance of ∼10^5 bp to ∼10^6 bp. Mutations in a region of hypermutation present as a cluster of lower intermutation distances. (B) Rainfall plot for PD4103a demonstrating kataegis occurring at multiple loci through the genome. (C) Rainfall plot for PD4085a, showing no kataegis. (D) Plots of flanking sequence of all C>X mutations and C>X mutations within the regions of kataegis in PD4107a. The mutated base is at position 0 with ten bases of flanking sequence provided, demonstrating a strong preference for T at the −1 position.
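The quantity plotted in a rainfall plot can be computed as sketched below. The function names are hypothetical, and two choices are assumptions rather than details from the caption: the first mutation on each chromosome is skipped because it has no predecessor on the same chromosome, and the thresholds in `dense_runs` (gaps of at most 1,000 bp, at least six mutations) are illustrative values, not the authors' definition of kataegis.

```python
def intermutation_distances(mutations, chrom_order):
    """Order (chrom, pos) pairs genome-wide and measure the gap to the prior one.

    Returns (mutation_number, chrom, pos, distance) tuples. The first mutation
    on each chromosome is omitted (assumption: no cross-chromosome distances).
    """
    rank = {chrom: i for i, chrom in enumerate(chrom_order)}
    ordered = sorted(mutations, key=lambda m: (rank[m[0]], m[1]))
    rows = []
    for number, (chrom, pos) in enumerate(ordered):
        if number > 0 and ordered[number - 1][0] == chrom:
            rows.append((number, chrom, pos, pos - ordered[number - 1][1]))
    return rows

def dense_runs(positions, max_distance=1000, min_mutations=6):
    """Find runs of closely spaced mutations on one chromosome.

    positions must be sorted. Returns (first_pos, last_pos, count) for each
    run whose consecutive gaps are all <= max_distance. Both thresholds are
    illustrative assumptions.
    """
    runs, start = [], 0
    for i in range(1, len(positions) + 1):
        if i == len(positions) or positions[i] - positions[i - 1] > max_distance:
            if i - start >= min_mutations:
                runs.append((positions[start], positions[i - 1], i - start))
            start = i
    return runs
```

Plotting the distance column on a log scale against the mutation number reproduces the layout of the rainfall plot: background mutations sit high on the axis, and clustered mutations drop toward the bottom.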
Figure 4
Rainfall Plot for Chromosome 6 of PD4107a. (A) The x axis shows the genomic coordinates of the mutations. Rearrangements are presented as brown triangles (rearrs is an abbreviation for rearrangements). The region of kataegis is highlighted at increasing resolution to demonstrate microclusters within the macrocluster. The processive nature of C>T mutations at TpC context occurring in cis is seen in the lowest panel (G-browse image). (B) Alternating processivity of kataegis in PD4107a. Long regions of C>T mutations are interspersed with regions of G>A mutations. (C) Kataegis occurs with a variety of rearrangement architectures. The thick top line shows the copy number segments for the region of chromosome 6 of PD4107a. Point mutations are shown in the lower panel as black points. The x axis reflects genomic position and the y axis represents variant allele fraction. The proportions of reads derived from contaminating normal cells are depicted in gray, and the fraction coming from each of the copies of that segment in the tumor cells is depicted by the multiple bars from green to yellow to pink to white. Early mutations will be found relatively higher up these bars, whereas late ones will be seen toward the bottom of the variant allele fraction axis. Gray vertical lines represent rearrangements. Interconnecting lines indicate intrachromosomal rearrangements. On a macroscopic scale, this demonstrates how kataegis can be associated with chromothripsis (within region 130–135 Mb) as well as other rearrangement architectures.
Figure 5
Processivity and Complex Colocalization of Rearrangement Architecture with Kataegis in PD4103a. (A) Stretches of C>T alternate with stretches of G>A on chromosome 4 in PD4103a. (B) Alternating C>G and G>C mutations on the same chromosome in PD4103a. (C) The complex web of rearrangements involving eight chromosomes in PD4103a colocalizing with kataegis.
Figure 6
Relationship between Mutation Prevalence and Transcription and/or Expression, Related to Figure S3. Mutation prevalence is expressed as the number of mutations per Mb, from 0 to 2 per Mb, on the vertical axis. Log2 expression levels range from 6 to 12 on the horizontal axis. Lines are fitted curves to the data for A and B. (A) C>A mutations; and (B) T>A mutations. Breast cancer samples without expression data are shown in gray. (C) Effect of distance from transcription start site on mutation prevalence. Each dot represents a 1 kb bin at increasing distances from all transcription start sites (TSS) up to 200 kb. The y axis shows the percentage of genes in each bin carrying a somatic mutation. The mutation prevalence increases as distance increases from the TSS. (D) This is particularly marked in the first 1 kb after the TSS. Each dot represents a 100 bp bin.
Figure 7
Somatic Mutation Profile of Indels (A) The x axis shows indel size from 1–10 and all larger indels between 11-50 bp in size grouped in a single bin. The y axis shows the number in each genome from 0–300. (B) Frequency of indels by indel size. This demonstrates how repeat-mediated indels are usually of smaller size. From a Kolmogorov-Smirnov (K-S) test, the distribution of indel lengths for repeats and microhomologies is significantly different (p −16). (C) Observed number of bases involved in microhomology at junction of indels versus expected number of bases if microhomology occurred simply by chance.
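The comparison in panel (C) rests on measuring how many bases at a deletion junction are shared between the deleted segment and its flank. The sketch below shows one common way to measure this for deletions; it is not the authors' classifier. The function names, the rule that a match spanning the full deletion length counts as repeat-mediated, and the uniform-base null model (a 0.25 match probability per base) are all assumptions made for illustration.

```python
def junction_homology(reference, start, length):
    """Bases shared between a deleted segment and its immediate flanks.

    reference[start:start + length] is the deleted segment. The count is the
    number of positions the deletion could be shifted left or right without
    changing the resulting sequence.
    """
    deleted = reference[start:start + length]
    right_flank = reference[start + length:start + 2 * length]
    left_flank = reference[max(0, start - length):start]
    right = 0
    while right < min(len(deleted), len(right_flank)) and \
            deleted[right] == right_flank[right]:
        right += 1
    left = 0
    while left < min(len(deleted), len(left_flank)) and \
            deleted[-1 - left] == left_flank[-1 - left]:
        left += 1
    return left + right

def classify_deletion(reference, start, length):
    """Label a deletion as 'repeat', 'microhomology', or 'none' (assumed rule)."""
    shared = junction_homology(reference, start, length)
    if shared >= length:
        return "repeat", shared      # deleted unit is copied next to itself
    if shared > 0:
        return "microhomology", shared
    return "none", 0

def chance_probability(k, p_match=0.25):
    """P(exactly k matching bases on one side) under a naive uniform-base null."""
    return p_match ** k * (1 - p_match)
```

Comparing the observed distribution of shared bases against `chance_probability` gives a rough analogue of the observed-versus-expected contrast in the panel; a real analysis would use the actual base composition of the genome.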
Figure S1
Selection of the Optimal Number of Signatures via the NMF Model Selection Framework, Related to Figure 2. (A) The x axis depicts the number of signatures, whereas the y axis shows the cophenetic correlation coefficient. As an indicator of stable reproducibility, the cophenetic correlation coefficient is highest between 2 and 6 processes. Given that there are no further peaks after 6 for this data set, the number of signatures recognized by the NMF algorithm here is up to six. (B) The error in reconstruction for each number of potential signatures, k, showed a marked reduction in the slope of the reconstruction error until k = 5, suggesting that the model was stable at five mutational signatures. (C) A typical comparison between the reconstructed and original mutation profile, demonstrating how well the extracted signatures and their exposures describe the original data for five signatures. (D) Signatures A, C, and D with contributions from each of the 96 trinucleotides corrected for the frequency of trinucleotides in the genome. This form of representation highlights the contrast between Signatures A and C and demonstrates the differences between Signatures C and D. Note the absence of C>T transitions at XpCpG in Signature D.
Figure S2
Rainfall Plots for 18 Genomes, Related to Figure 3. PD4115a, PD4116a, PD3904a, PD3945a, PD4005a, and PD4006a show an excess of mutations with an intermutation distance of 1 bp, in keeping with the observed excess of double substitutions in these genomes. Subtle regions of kataegis are present in many samples (PD4199a, PD4192a, PD4198a, PD4248a, PD4116a, PD3904a, PD4005a, and PD4006a). Intermutation distance (bp) is presented on the vertical axis and mutation number is presented on the horizontal axis.
Figure S3
Relationship between Mutation Prevalence, Transcription and Gene Expression, Related to Figure 6. Overall effect of transcription and gene expression on mutation prevalence by mutation-type. p values of significance are provided for each mutation-type if a strong effect was seen in either strand bias and/or relationship with expression. Mutation prevalence is expressed as the number of mutations per Mb, from 0 to 2 per Mb, on the vertical axis. Log2 expression levels range from 6 to 12 on the horizontal axis. Lines are fitted curves to the data for A and B.

Source: PubMed
