Integrated digital error suppression for improved detection of circulating tumor DNA

Aaron M Newman, Alexander F Lovejoy, Daniel M Klass, David M Kurtz, Jacob J Chabon, Florian Scherer, Henning Stehr, Chih Long Liu, Scott V Bratman, Carmen Say, Li Zhou, Justin N Carter, Robert B West, George W Sledge, Joseph B Shrager, Billy W Loo Jr, Joel W Neal, Heather A Wakelee, Maximilian Diehn, Ash A Alizadeh

Abstract

High-throughput sequencing of circulating tumor DNA (ctDNA) promises to facilitate personalized cancer therapy. However, low quantities of cell-free DNA (cfDNA) in the blood and sequencing artifacts currently limit analytical sensitivity. To overcome these limitations, we introduce an approach for integrated digital error suppression (iDES). Our method combines in silico elimination of highly stereotypical background artifacts with a molecular barcoding strategy for the efficient recovery of cfDNA molecules. Individually, these two methods each improve the sensitivity of cancer personalized profiling by deep sequencing (CAPP-Seq) by about threefold, and synergize when combined to yield ∼15-fold improvements. As a result, iDES-enhanced CAPP-Seq facilitates noninvasive variant detection across hundreds of kilobases. Applied to non-small cell lung cancer (NSCLC) patients, our method enabled biopsy-free profiling of EGFR kinase domain mutations with 92% sensitivity and >99.99% specificity at the variant level, and with 90% sensitivity and 96% specificity at the patient level. In addition, our approach allowed monitoring of NSCLC ctDNA down to 4 in 10^5 cfDNA molecules. We anticipate that iDES will aid the noninvasive genotyping and detection of ctDNA in research and clinical settings.

Figures

Figure 1. Framework for noninvasive profiling of ctDNA
(a) Theoretical ctDNA detection limits are shown for single-mutation non-invasive genotyping and multi-mutation monitoring based on a typical cfDNA yield from 10 mL blood (assuming ~50% molecule recovery; top) (Methods). Background noise of CAPP-Seq is shown to increase as a function of lower ctDNA allele fractions, and was determined using pooled cfDNA sequencing data from 30 healthy adult controls (Methods). (b) Schematic illustrating the potential application of CAPP-Seq to noninvasive (biopsy-free) genotyping and monitoring of ctDNA.
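The sampling argument behind panel (a) can be sketched in a few lines. This is a minimal illustration, not the authors' model (which is described in Methods): it assumes mutant molecules are Poisson-distributed among the recovered molecules and that one haploid genome equivalent (hGE) weighs about 3.3 pg. The function names and the cfDNA input used in the usage note are hypothetical.

```python
import math

# Assumption: approximate mass of one haploid genome equivalent (hGE), in pg.
PG_PER_HAPLOID_GENOME = 3.3


def recovered_hges(cfdna_ng: float, recovery: float = 0.5) -> float:
    """Genome equivalents expected to be recovered from a cfDNA input."""
    return cfdna_ng * 1000.0 / PG_PER_HAPLOID_GENOME * recovery


def detection_probability(allele_fraction: float, hges: float, n_mutations: int = 1) -> float:
    """P(at least one mutant molecule is sampled) under a Poisson model."""
    expected_mutant_molecules = allele_fraction * hges * n_mutations
    return 1.0 - math.exp(-expected_mutant_molecules)


def detection_limit(hges: float, n_mutations: int = 1, confidence: float = 0.9) -> float:
    """Lowest allele fraction detectable with the requested confidence."""
    return -math.log(1.0 - confidence) / (hges * n_mutations)
```

For example, `detection_limit(recovered_hges(33.0), n_mutations=1)` gives the single-mutation limit for a hypothetical 33 ng input at 50% recovery, and raising `n_mutations` lowers the limit proportionally. This is why tracking many mutations per patient improves monitoring sensitivity in this simplified model, which ignores background errors.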
Figure 2. Development of integrated digital error suppression (iDES)
(a) Diagram depicting the use of CAPP-Seq barcode adapters to suppress errors. Here, CAPP-Seq adapters are ligated to a double-stranded (duplex) DNA molecule containing a real biological mutation in both strands as well as a non-replicated, asymmetric base change in only one strand (top). The combined application of insert and index barcodes allows for (i) error suppression and (ii) recovery of single-stranded (center) and duplex (bottom) DNA molecules (Supplementary Fig. 1a, Methods). (b) Top: Heat map showing position-specific selector-wide error rates parceled into all possible base substitutions (rows) and organized by decreasing mean allele fractions (for each substitution type) across 12 cfDNA samples from healthy controls (columns; Supplementary Table 2). Background patterns are shown for different error suppression methods, including the combined application of barcoding and background polishing. Errors were defined as non-reference alleles excluding germline SNPs. Bottom: Selector-wide error metrics (Methods).
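The general idea of barcode-based error suppression in panel (a) can be illustrated with a generic consensus step. The sketch below is not the CAPP-Seq pipeline: it assumes reads arrive as hypothetical (barcode, sequence) pairs that are already aligned and equal in length within a family, and it simply discards single-read families, which the authors' own strategy may handle differently. All names and thresholds are illustrative.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size=2, min_agreement=1.0):
    """Collapse reads sharing a barcode into one consensus sequence each.

    reads: iterable of (barcode, sequence) pairs (hypothetical input format).
    Returns {barcode: consensus}, with 'N' at positions where fewer than
    min_agreement of the family members share the most common base.
    """
    families = defaultdict(list)
    for barcode, sequence in reads:
        families[barcode].append(sequence)

    consensus = {}
    for barcode, members in families.items():
        if len(members) < min_family_size:
            continue  # a singleton cannot be error-corrected in this sketch
        bases = []
        for column in zip(*members):
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(members) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus
```

A base change that appears in only some copies of a barcode family is masked as an error, whereas a change present in every copy survives into the consensus. Lowering `min_agreement` trades stringency for fewer masked positions.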
Figure 3. Technical performance of iDES
(a) Impact of alternative error suppression methods on nucleotide substitution classes. Error rates were calculated with respect to each of the four reference bases separately (Methods). (b) Distribution of background alleles uniquely eliminated by barcoding or polishing alone in healthy control cfDNA. (c) Comparison of iDES with various barcoding strategies for selector-wide error profiles and recovered hGEs. The barcoding strategy denoted by ‘2*’ maximizes the retention of sequenced molecules and is the approach used in this work (Methods). Data are presented as means +/− s.e.m. (d) Analytical modeling of detection limits for various error suppression methods as a function of available tumor-derived mutations (90% confidence detection limit; Methods). Sequencing throughput was calibrated to iDES, such that the quantity of reads needed to recover 5,000 hGEs was determined and then used to estimate the number of recovered hGEs for all other methods given their reported efficiencies (Supplementary Fig. 8a). The theoretical maximum detection limit of a given method, shown as a horizontal line, is bounded by the method’s error rate. For additional details, see Supplementary Figure 8. The same 12 normal control samples shown in Fig. 2b were used for the analyses in a–c.
Figure 4. Noninvasive tumor genotyping with iDES-enhanced CAPP-Seq
Noninvasive tumor genotyping with iDES-enhanced CAPP-Seq was assessed using technical controls (a–c) and patients with NSCLC (d–f). (a) A DNA reference blend containing known alleles spanning a broad AF range was diluted to 5% in normal cfDNA and analyzed in replicate (n=4) for both known variants (n=29) and 279 negative control variants (Supplementary Table 4, Methods). Left: Differential impact of barcoding, polishing, and iDES on genotyping results for a single representative replicate. Only variant calls with at least 2 supporting reads are shown. Asterisks highlight the complementary background profiles removed by barcoding and polishing. Note that all variant calls are ordered along the x-axis, first by validation status and then by AF. Identical calls are aligned vertically. Right: Performance metrics across all four replicates. Genotyping thresholds were determined as described in Methods. (b) AFs determined by iDES-enhanced CAPP-Seq in the 5% variant blend from panel a (observed) versus their concentrations determined by digital PCR (expected). Only variants in the reference blend with externally validated AFs targeted by our NSCLC selector are shown (n=13; Supplementary Table 4). Data are expressed as means ± s.e.m (n=4 replicates). (c) Heat map (top) and scatter plot (bottom) depicting candidate SNVs identified by noninvasive selector-wide genotyping of the 5% variant blend from panel a (Supplementary Fig. 10, Methods). SNVs were tracked across three additional replicates and a ten-fold lower spike. Horizontal lines depict mean AFs. (d–f) Noninvasive tumor genotyping of NSCLC patients. (d) Bottom: The number of hotspot SNVs noninvasively detected in 24 pretreatment NSCLC cfDNA samples by four methods, including iDES (barcoding + polishing). All queried variants are listed in Supplementary Table 4. Top: Positive predictive value (PPV) of each method (indicated below), based on the number of hotspot SNVs that were later confirmed in matching tumor biopsies.
(e) The performance of iDES for noninvasive tumor genotyping of two plasma cohorts was assessed using observed allele fractions with a Receiver Operating Characteristic (ROC) plot. In the first cohort (n=66 plasma samples from patients with matching tumor biopsies), hotspot variants from a predefined list of 292 variants were assessed (Supplementary Table 4). Results are shown for the 46 plasma samples with at least one detectable mutation (‘All genes’, n=24 patients); specificity was assessed using variants that were detected but that could not be verified in the primary tumor. In the second cohort, EGFR hotspot variants were assessed in an extended cohort of 103 plasma samples from 41 EGFR-positive patients with NSCLC (‘EGFR’). Specificity was assessed using 27 EGFR-wildtype subjects (Methods). The pie chart shows the distribution of detected EGFR variants. Only patients with genotyped tumors were analyzed. AUC, area under the curve. (f) Noninvasive genotyping of EGFR mutations in plasma samples from 37 patients with advanced NSCLC and with biopsy-confirmed EGFR mutations. Top: Performance of iDES-enhanced CAPP-Seq for the genotyping of actionable EGFR mutations (n=36 patients; 1 of 37 patients did not have an actionable alteration). All performance metrics were assessed at the variant level. Bottom: Comparison of error-suppression methods for noninvasive tumor genotyping of the entire EGFR kinase domain in all patients with biopsy-confirmed EGFR SNVs (n=29 of 37 patients). Performance metrics were assessed separately at the variant level and patient level (using 27 EGFR-wildtype subjects). Percentages indicate iDES performance only. Further details are provided in Methods. Sn, sensitivity; Sp, specificity; PPV, positive predictive value; NPV, negative predictive value.
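The performance metrics abbreviated in panel (f) follow the standard confusion-matrix definitions, sketched below for reference. The function name is hypothetical and any counts passed to it are the caller's own; the example counts in the usage note are invented for illustration and are not taken from the study.

```python
import math


def performance_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sn, Sp, PPV and NPV from true/false positive and negative counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # An undefined ratio (empty denominator) is reported as NaN.
        return numerator / denominator if denominator else math.nan

    return {
        "Sn": ratio(tp, tp + fn),   # sensitivity: detected among truly positive
        "Sp": ratio(tn, tn + fp),   # specificity: negative among truly negative
        "PPV": ratio(tp, tp + fp),  # positive calls that are correct
        "NPV": ratio(tn, tn + fn),  # negative calls that are correct
    }
```

With made-up counts, `performance_metrics(tp=8, fp=1, tn=9, fn=2)` yields a sensitivity of 0.8 and a specificity of 0.9. The same function applies at either the variant level or the patient level; only the unit being counted changes.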
Figure 5. Ultrasensitive ctDNA detection and monitoring with iDES-enhanced CAPP-Seq
(a) Analysis of ctDNA detection limits using a hypermutated glioblastoma (GBM) tumor mixed into normal control cfDNA in defined proportions. Here, 30 mutations were randomly selected from a pool of 1,502 total mutations known to be present in the GBM tumor and covered by the sequencing panel. Random sampling of 30 mutations was repeated 50 times and the results are presented as means +/− 95% confidence intervals. For further details, see Supplementary Fig. 12 and Methods. AF, allele fraction. (b) Comparison of error-suppression methods for the detection of ctDNA in pre- and post-treatment plasma from 30 NSCLC patients. Patient-derived somatic variants (columns; n=30 sets) were assessed in every plasma sample (rows; n=116), including 30 normal controls to evaluate specificity. The same samples were analyzed for each method (e.g., iDES) and are identically ordered in the heat map. Red squares denote a genetically matched sample (i.e., patient-derived tumor mutations were significantly detectable in a plasma sample from the same patient). Additional details are provided in Supplementary Fig. 13. (c) Using iDES, but not other methods, ctDNA was detectable prior to clinical progression in a stage IIIB NSCLC patient. (d) Top: Analysis of variants called from tumor biopsies versus variants called directly from pretreatment cfDNA with iDES-enhanced CAPP-Seq. Estimated ctDNA levels were compared by linear regression. Open circles/squares indicate time points without significantly detectable ctDNA. ND, not detected. Time points are shown in chronological order (1, pretreatment; >1, post-treatment). Bottom: Comparison of error suppression methods for the same analysis shown above but across all 8 evaluable patients (Methods). Linear regression was applied globally across all 37 plasma time points profiled for these eight patients.
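The repeated random sampling described in panel (a) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the caption does not say how the 95% confidence interval was computed, so a normal approximation (1.96 × s.e.m. across repeats) is assumed here, and the function name and input format (a list of per-mutation allele fractions) are hypothetical.

```python
import random
import statistics


def resampled_mean_af(mutation_afs, sample_size=30, n_repeats=50, seed=0):
    """Mean allele fraction over repeated random subsets of mutations.

    Returns (mean, lower, upper), where the bounds are an assumed
    normal-approximation 95% confidence interval across repeats.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    subset_means = [
        statistics.fmean(rng.sample(mutation_afs, sample_size))
        for _ in range(n_repeats)
    ]
    centre = statistics.fmean(subset_means)
    half_width = 1.96 * statistics.stdev(subset_means) / len(subset_means) ** 0.5
    return centre, centre - half_width, centre + half_width
```

Sampling without replacement, as `random.Random.sample` does, mirrors drawing 30 distinct mutations from the pool on each repeat.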
Figure 6. iDES-enhanced CAPP-Seq
Same as Fig. 1a, but showing the impact of iDES on the probability of background errors. Post-iDES background data were derived from cfDNA samples pooled from a test cohort of 18 normal donors, none of which were used for learning baseline background distributions. Further details are provided in Methods.

Source: PubMed
