Ultra-Sensitive Mutation Detection and Genome-Wide DNA Copy Number Reconstruction by Error-Corrected Circulating Tumor DNA Sequencing

Sonia Mansukhani, Louise J Barber, Dimitrios Kleftogiannis, Sing Yu Moorcraft, Michael Davidson, Andrew Woolston, Paula Zuzanna Proszek, Beatrice Griffiths, Kerry Fenwick, Bram Herman, Nik Matthews, Ben O'Leary, Sanna Hulkki, David Gonzalez De Castro, Anisha Patel, Andrew Wotherspoon, Aleruchi Okachi, Isma Rana, Ruwaida Begum, Matthew N Davies, Thomas Powles, Katharina von Loga, Michael Hubank, Nick Turner, David Watkins, Ian Chau, David Cunningham, Stefano Lise, Naureen Starling, Marco Gerlinger

Abstract

Background: Circulating free DNA sequencing (cfDNA-Seq) can portray cancer genome landscapes, but highly sensitive and specific technologies are necessary to accurately detect mutations with often low variant frequencies.

Methods: We developed a customizable hybrid-capture cfDNA-Seq technology using off-the-shelf molecular barcodes and a novel duplex DNA molecule identification tool for enhanced error correction.

Results: Modeling based on cfDNA yields from 58 patients showed that this technology, requiring 25 ng of cfDNA, could be applied to >95% of patients with metastatic colorectal cancer (mCRC). cfDNA-Seq of a 32-gene, 163.3-kbp target region detected 100% of single-nucleotide variants present at 0.15% variant frequency in spike-in experiments. Molecular barcode error correction reduced false-positive mutation calls by 97.5%. In 28 consecutively analyzed patients with mCRC, 80 out of 91 mutations previously detected by tumor tissue sequencing were called in the cfDNA. Call rates were similar for point mutations and indels. cfDNA-Seq identified typical mCRC driver mutations in patients in whom biopsy sequencing had failed or did not include key mCRC driver genes. Mutations only called in cfDNA but undetectable in matched biopsies included a subclonal resistance driver mutation to anti-EGFR antibodies in KRAS, parallel evolution of multiple PIK3CA mutations in 2 cases, and TP53 mutations originating from clonal hematopoiesis. Furthermore, cfDNA-Seq off-target read analysis allowed simultaneous genome-wide copy number profile reconstruction in 20 of 28 cases. Copy number profiles were validated by low-coverage whole-genome sequencing.
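To put the input amount and variant frequency above into molecule counts, the following is a minimal back-of-the-envelope sketch. It assumes a haploid human genome mass of roughly 3.3 pg, a commonly used approximation that is not stated in the abstract, and the function names are hypothetical. It also restates the 80-of-91 tissue concordance as a percentage.

```python
# Assumption (not from the abstract): one haploid human genome weighs ~3.3 pg.
HAPLOID_GENOME_PG = 3.3


def haploid_genome_copies(input_ng: float) -> float:
    """Approximate number of haploid genome copies in a cfDNA input mass."""
    return input_ng * 1000.0 / HAPLOID_GENOME_PG  # ng -> pg, then divide by mass per copy


def expected_mutant_molecules(input_ng: float, vaf: float) -> float:
    """Expected number of input molecules carrying a variant at a given frequency."""
    return haploid_genome_copies(input_ng) * vaf


copies = haploid_genome_copies(25.0)              # 25 ng input, as in the abstract
mutant = expected_mutant_molecules(25.0, 0.0015)  # 0.15% variant frequency
concordance = 80 / 91                             # tissue mutations also called in cfDNA

print(f"haploid genome copies: {copies:.0f}")
print(f"expected mutant molecules at 0.15%: {mutant:.1f}")
print(f"tissue-to-cfDNA concordance: {concordance:.1%}")
```

Under this assumption, a 0.15% variant corresponds to only about a dozen mutant molecules in the input, which illustrates why error correction matters at this frequency.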

Conclusions: This error-corrected, ultradeep cfDNA-Seq technology with a customizable target region and publicly available bioinformatics tools enables broad insights into cancer genomes and evolution.

ClinicalTrials.gov identifier: NCT02112357.

© 2018 American Association for Clinical Chemistry.

Figures

Figure 1
(A) Percentage of reads on-target before de-duplication in samples prepared with 65°C vs 70°C post-capture washes. (B) Graphic depicting the principles of MBC error correction. Reads with the same MBC that map to the identical genomic location are grouped into a consensus family. If a variant (pink) occurs in all reads, then the consensus read sequence will be variant for that base (top). However, if a variant (green) is only detected in a small fraction of the reads in the family, it will be disregarded and the consensus read sequence will be wild-type (bottom). (C) cfDNA mixing experiment: 25 ng mixes of donor A spiked into donor B at 0.15%, 0.075% and 0.0375%. (D) Illustration of duplex read pair detection. A double-stranded cfDNA fragment (black) containing a variant (green) is depicted, ligated to Y-shaped MBC-tagged adapters (grey). (E) Expected and observed variant allele frequencies (VAF) and genomic positions for the 16 SNPs in the cfDNA mixing experiment. (F) Impact of MBC error correction on true positive and false positive calls. The top panels show the number of true positive variants (expected SNPs) that were bioinformatically called in the mixing experiment with standard de-duplication (left) and MBC de-duplication (right) using different variant call quality thresholds. The lower panel shows the number of likely false positive variant calls (not observed in the deep sequencing of either cfDNA sample used in the mix) for standard de-duplication (left) and MBC de-duplication (right).
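The consensus-family principle described in panel (B) can be illustrated with a minimal sketch. This is not the authors' tool: the `Read` type, the `consensus_families` function, and the simple majority rule are assumptions chosen for illustration, and real pipelines also handle base qualities, alignment, and duplex strand pairing.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    barcode: str  # molecular barcode (MBC) sequence
    chrom: str
    start: int    # mapped start position
    seq: str      # read bases


def consensus_families(reads):
    """Group reads by (MBC, chromosome, start) and call a majority consensus.

    Assumption: a base is accepted when more than half of the family's reads
    agree on it; otherwise the position is reported as 'N'.
    """
    families = defaultdict(list)
    for read in reads:
        families[(read.barcode, read.chrom, read.start)].append(read.seq)

    consensus = {}
    for key, seqs in families.items():
        length = min(len(s) for s in seqs)  # compare only shared positions
        bases = []
        for i in range(length):
            base, count = Counter(s[i] for s in seqs).most_common(1)[0]
            bases.append(base if count / len(seqs) > 0.5 else "N")
        consensus[key] = "".join(bases)
    return consensus


# Hypothetical example with wild-type sequence ACGGA.
reads = [
    # Family 1: the variant T is present in every read, so it is kept.
    Read("AACGT", "chr1", 100, "ACTGA"),
    Read("AACGT", "chr1", 100, "ACTGA"),
    Read("AACGT", "chr1", 100, "ACTGA"),
    # Family 2: a C appears in one of four reads, so it is discarded.
    Read("GGTCA", "chr1", 100, "ACGGA"),
    Read("GGTCA", "chr1", 100, "ACGGA"),
    Read("GGTCA", "chr1", 100, "ACGGA"),
    Read("GGTCA", "chr1", 100, "ACCGA"),
]
result = consensus_families(reads)
for key, seq in result.items():
    print(key, seq)
```

A variant that arises during amplification or sequencing typically affects only some reads in a family, so it is outvoted, whereas a variant present in the original molecule is carried by every read.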
Figure 2
(A) Concordance of mutations identified by cfDNA-Seq and by sequencing of tumor material. Mutations identified in both cfDNA-Seq and tumor sequencing are colored green. Novel variants called by cfDNA-Seq and not by tumor sequencing are colored blue. Variants not detected by cfDNA-Seq that were detected in tumor sequencing are colored orange. Pink indicates clonal hematopoiesis. Red outlines indicate mutations reported as tumorigenic in COSMIC. Variants in grey have been identified in the cfDNA of patients that either had been sequenced using the limited 5-gene amplicon panel or failed FOrMAT sequencing. Percentages indicate VAF in cfDNA. (B) Read depth and number of consensus family reads supporting each of the 11 variants in cases 7, 8, and 21 that had not been called in cfDNA but had previously been detected in tumor tissue. Median VAF 0.066%. (C) ddPCR validation of the KRAS c.183A>C mutation that results in the amino acid change Q61H in case 10. Green dots: droplets with wild-type DNA, blue dots (outlined by the red quadrant): droplets with mutant DNA, black dots: droplets that have no incorporated DNA. (D) ddPCR validation of 6 subclonal mutations called in cfDNA but not in tumor tissue.
Figure 3
(A) Genome-wide copy number aberrations can be detected from targeted cfDNA-Seq, even where tumor content is low. Representative log copy ratio plots for five cases (green number) in our cohort with tumor content ranging from 53.5% to 8.6% (red number indicates max VAF) are shown. (B) Genome-wide heat map of segmented copy number raw log ratio data after amplitude normalization. Gains are red and losses are blue. Profiles are ordered (left to right) from highest to lowest tumor content (based on maximum VAF) for all 20 cases that had a visible CNA profile. (C) Focused log copy ratio plot of chromosome 17 for case 11, which had a high-level amplification of ERBB2.
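The log copy ratios plotted in this figure can be illustrated with a minimal sketch of how binned read counts are commonly turned into log2 ratios. This is a generic illustration, not the authors' pipeline: the function name, the use of a single reference sample, the library-size normalization, and the pseudocount are all assumptions, and the counts are invented.

```python
import math


def log2_copy_ratios(sample_counts, reference_counts, pseudocount=0.5):
    """Per-bin log2 ratio of sample to reference read counts.

    Assumption: each bin is normalized by its library's total read count,
    and a small pseudocount avoids taking the log of zero.
    """
    sample_total = sum(sample_counts)
    reference_total = sum(reference_counts)
    ratios = []
    for s, r in zip(sample_counts, reference_counts):
        sample_frac = (s + pseudocount) / sample_total
        reference_frac = (r + pseudocount) / reference_total
        ratios.append(math.log2(sample_frac / reference_frac))
    return ratios


# Hypothetical counts for four genomic bins; the third bin is gained in the sample.
sample = [100, 100, 200, 100]
reference = [100, 100, 100, 100]
ratios = log2_copy_ratios(sample, reference)
print([round(x, 2) for x in ratios])
```

Because the ratios are relative to the library total, a gain in one bin shifts the other bins slightly below zero. Segmentation and centering steps, which this sketch omits, are normally applied before interpreting gains and losses.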

Source: PubMed
