Letter to the Editor: An ultra-sensitive assay using cell-free DNA fragmentomics for multi-cancer early detection

Hua Bao, Zheng Wang, Xiaoji Ma, Wei Guo, Xiangyu Zhang, Wanxiangfu Tang, Xin Chen, Xinyu Wang, Yikuan Chen, Shaobo Mo, Naixin Liang, Qianli Ma, Shuyu Wu, Xiuxiu Xu, Shuang Chang, Yulin Wei, Xian Zhang, Hairong Bao, Rui Liu, Shanshan Yang, Ya Jiang, Xue Wu, Yaqi Li, Long Zhang, Fengwei Tan, Qi Xue, Fangqi Liu, Sanjun Cai, Shugeng Gao, Junjie Peng, Jian Zhou, Yang Shao

Abstract

Early detection can benefit cancer patients through more effective treatment and better prognosis, but existing early screening tests are limited, especially for multi-cancer detection. This study investigated three of the most prevalent and lethal cancer types: primary liver cancer (PLC), colorectal adenocarcinoma (CRC), and lung adenocarcinoma (LUAD). Leveraging emerging cell-free DNA (cfDNA) fragmentomics, we developed a robust machine learning model for multi-cancer early detection. A total of 1,214 participants were enrolled, including 381 PLC, 298 CRC, and 292 LUAD patients and 243 healthy volunteers. Most of the 971 patients were at early stages (stage 0, N = 34; stage I, N = 799). The participants were randomly divided into a training cohort and a test cohort in a 1:1 ratio while maintaining the ratio of the major histology subtypes. An ensemble stacked machine learning approach was developed using multiple plasma cfDNA fragmentomic features. The model was trained solely on the training cohort and then evaluated on the test cohort. Our model showed an area under the curve (AUC) of 0.983 for differentiating cancer patients from healthy individuals. At 95.0% specificity, the sensitivity for detecting all cancers reached 95.5%, with 100%, 94.6%, and 90.4% for PLC, CRC, and LUAD, respectively. The cancer origin model demonstrated an overall accuracy of 93.1% for predicting cancer origin in the test cohort (97.4%, 94.3%, and 85.6% for PLC, CRC, and LUAD, respectively). Model sensitivity was consistently high for early-stage and small tumors. Furthermore, detection and origin classification performance remained strong when sequencing depth was reduced to 1× (cancer detection: ≥ 91.5% sensitivity at 95.0% specificity; cancer origin: ≥ 91.6% accuracy).
In conclusion, we incorporated plasma cfDNA fragmentomics into an ensemble stacked model and established an ultrasensitive assay for multi-cancer early detection, which may inform the development of early cancer screening in clinical practice.
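
The 1:1 split described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it stratifies by diagnostic group only (the study also preserved the ratio of the major histology subtypes), and the function name `stratified_half_split`, the seed, and the rule that an odd-sized group gives its extra sample to the training cohort are all assumptions.

```python
import random
from collections import Counter, defaultdict


def stratified_half_split(labels, seed=0):
    """Return (train_idx, test_idx), splitting each label group roughly 1:1.

    Hypothetical helper: groups of odd size put the extra sample in training.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, test = [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)              # random assignment within the group
        half = (len(idxs) + 1) // 2    # round up for odd-sized groups
        train.extend(idxs[:half])
        test.extend(idxs[half:])
    return train, test


# Group sizes taken from the abstract.
labels = ["PLC"] * 381 + ["CRC"] * 298 + ["LUAD"] * 292 + ["healthy"] * 243
train_idx, test_idx = stratified_half_split(labels)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(len(train_idx), dict(train_counts))
print(len(test_idx), dict(test_counts))
```

With the group sizes from the abstract, this rounding rule gives 608 training and 606 test participants, consistent with the cohort sizes reported in Fig. 1.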

Keywords: Cell-free DNA; Fragmentomics; Machine learning; Multi-cancer early detection.

Conflict of interest statement

Hua B, WT, XC, SW, XX, Shuang C, YW, Xian Z, Hairong B, RL, SSY, YJ, Xue W, and YS are employees of Nanjing Geneseeq Technology Inc., China. All other authors have declared no conflicts of interest.

© 2022. The Author(s).

Figures

Fig. 1
Schematic diagram of the study design. (A) The training cohort (N = 608) included 191 primary liver cancer (PLC), 149 colorectal cancer (CRC), and 146 lung adenocarcinoma (LUAD) patients and 122 healthy controls, and was used to train the cancer detection and cancer origin models. The test cohort (N = 606), which included 190 PLC, 149 CRC, and 146 LUAD patients and 121 healthy controls, was used to evaluate model performance. (B) Plasma samples were collected from PLC, CRC, and LUAD patients and from healthy volunteers. cfDNA was extracted from each participant's plasma sample and subjected to whole-genome sequencing (WGS). Five feature types were calculated: Fragment Size Coverage (FSC), Fragment Size Distribution (FSD), EnD Motif (EDM), BreakPoint Motif (BPM), and Copy Number Variation (CNV). For each feature type, a base model was constructed by ensemble learning over five algorithms: GLM, GBM, Random Forest, Deep Learning, and XGBoost. The base model predictions were then combined into a matrix, which was used to train the final ensemble stacked model.
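
The stacking scheme in panel B can be illustrated with a small sketch. It is a simplified stand-in under stated assumptions, not the study's pipeline: each base model here is a single hand-written logistic regression rather than an ensemble of GLM, GBM, Random Forest, Deep Learning, and XGBoost; the data are synthetic; and the helper names (`fit_logistic`, `out_of_fold_scores`), fold count, and learning settings are hypothetical. Only the five feature-type names come from the caption.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent; bias is the last weight."""
    n_feat = len(X[0])
    w = [0.0] * (n_feat + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_feat + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, X):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) for xi in X]


def out_of_fold_scores(X, y, k=5):
    """Score every sample with a base model that never saw it during fitting."""
    n = len(X)
    scores = [0.0] * n
    for fold in range(k):
        held = [i for i in range(n) if i % k == fold]
        kept = [i for i in range(n) if i % k != fold]
        w = fit_logistic([X[i] for i in kept], [y[i] for i in kept])
        for i, p in zip(held, predict_proba(w, [X[i] for i in held])):
            scores[i] = p
    return scores


# Synthetic stand-in data: 100 samples, 3 made-up features per feature type.
rng = random.Random(0)
FEATURE_TYPES = ["FSC", "FSD", "EDM", "BPM", "CNV"]
y = [i % 2 for i in range(100)]  # 0 = healthy, 1 = cancer (toy labels)
features = {
    name: [[rng.gauss(1.0 * yi, 1.0) for _ in range(3)] for yi in y]
    for name in FEATURE_TYPES
}

# One column of base-model predictions per feature type -> stacked matrix.
stacked = list(zip(*(out_of_fold_scores(features[name], y) for name in FEATURE_TYPES)))

# The final (meta) model is trained on the stacked base-model predictions.
meta_w = fit_logistic(stacked, y)
cancer_scores = predict_proba(meta_w, stacked)
print(len(stacked), len(stacked[0]))
```

Using out-of-fold predictions for the stacked matrix is a common design choice that keeps the final model from learning on scores the base models produced for their own training samples; whether the study used this exact scheme is not stated in the caption.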
Fig. 2
Performance and robustness evaluation of the ensemble stacked model. (A) ROC curves evaluating the cancer detection model in distinguishing cancer patients from healthy volunteers in the test cohort, overall and by cancer type. (B) Violin plots illustrating the cancer score distribution in the healthy, all cancer, primary liver cancer (PLC), colorectal cancer (CRC), and lung adenocarcinoma (LUAD) groups in the test cohort, as predicted by the cancer detection model. The 95% specificity cutoff for the cancer score was 0.39, as shown by the dotted line. (C) Performance of the cancer detection model in identifying all cancer patients. (D) Dot plot of cancer detection sensitivity by cancer type and/or stage at 95% specificity. The error bars represent the 95% confidence interval. (E) Robustness test for the cancer detection model using the test cohort with downsampled coverage depth (4× to 1×). The error bars were calculated from five repeats at each coverage. (F) Confusion matrix of the cancer origin model for the test cohort samples selected by the cancer detection model. (G) Violin plots illustrating the cancer origin score distribution in the PLC, CRC, and LUAD groups in the selected test cohort, as predicted by the cancer origin model. (H) Dot plot illustrating the robustness test for the cancer origin model using the selected test cohort with downsampled coverage depth (4× to 1×). (I) Heatmap illustrating the per-patient results of the robustness test for the cancer origin model.
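
The idea of reporting sensitivity at a fixed 95% specificity can be sketched as follows: a score cutoff is chosen from the healthy group, then applied to the cancer group. This is an illustrative sketch only. The scores below are made up, the helper names `cutoff_at_specificity` and `sensitivity` are hypothetical, and the convention that a score strictly above the cutoff counts as positive is an assumption; the study's own cutoff (0.39) came from its data.

```python
import math


def cutoff_at_specificity(healthy_scores, specificity=0.95):
    """Smallest observed score such that at least `specificity` of the
    healthy scores are at or below it (and so are called negative)."""
    ordered = sorted(healthy_scores)
    k = math.ceil(specificity * len(ordered))  # healthy samples to keep negative
    return ordered[k - 1]


def sensitivity(cancer_scores, cutoff):
    """Fraction of cancer samples scoring strictly above the cutoff."""
    return sum(s > cutoff for s in cancer_scores) / len(cancer_scores)


# Toy scores, not study data.
healthy = [i / 100 for i in range(1, 21)]   # 0.01, 0.02, ..., 0.20
cancer = [0.10, 0.50, 0.90, 0.80]

cutoff = cutoff_at_specificity(healthy)
achieved_specificity = sum(s <= cutoff for s in healthy) / len(healthy)
toy_sensitivity = sensitivity(cancer, cutoff)
print(cutoff, achieved_specificity, toy_sensitivity)
```

In this toy case, 19 of the 20 healthy scores fall at or below the cutoff and 3 of the 4 cancer scores fall above it.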

Source: PubMed
