Machine Learning Models to Improve the Differentiation Between Benign and Malignant Breast Lesions on Ultrasound: A Multicenter External Validation Study

Ling Huo, Yao Tan, Shu Wang, Cuizhi Geng, Yi Li, XiangJun Ma, Bin Wang, YingJian He, Chen Yao, Tao Ouyang

Abstract

Purpose: This study aimed to establish and evaluate the usefulness of a simple, practical, and easy-to-promote machine learning model based on ultrasound imaging features for diagnosing breast cancer (BC).

Materials and methods: Logistic regression, random forest, extra trees, support vector machine, multilayer perceptron, and XGBoost models were developed. The modeling data set of 1345 cases came from a tertiary class A hospital in China. The external validation data set of 1965 cases came from 3 tertiary class A hospitals and 2 primary hospitals. The area under the receiver operating characteristic curve (AUC) was used as the main evaluation index, and pathological biopsy was used as the gold standard for evaluating each model. Diagnostic capability was also compared with that of clinicians.

Results: Among the six models, the logistic model showed the best diagnostic efficiency, with AUCs of 0.771 and 0.906 and Brier scores of 0.181 and 0.165 in the test and validation sets, respectively. The AUCs of clinician diagnosis and the logistic model were 0.913 and 0.906, respectively. In the tertiary class A hospitals, both had an AUC of 0.915; in the primary hospitals, the AUCs were 0.894 and 0.873, respectively.
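
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how an AUC and a Brier score can be computed from predicted probabilities. It is not the authors' code: the function names `auc_score` and `brier_score` and the six-case toy data are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def auc_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Ties between a malignant and a benign score count as half a win.
    """
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and equal length")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical toy data: 1 = malignant, 0 = benign (not from the study).
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

print("AUC:", auc_score(labels, probs))
print("Brier score:", brier_score(labels, probs))
```

A higher AUC indicates better discrimination between malignant and benign lesions, while a lower Brier score indicates predicted probabilities that lie closer to the observed outcomes.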

Conclusion: The externally validated logistic model can be used to distinguish between malignant and benign breast lesions in ultrasound images. Compared with clinician diagnosis, the logistic model has better diagnostic efficiency, making it potentially useful to assist in screening, particularly in lower level medical institutions.

Trial registration: http://www.clinicaltrials.gov. ClinicalTrials.gov ID: NCT03080623.

Keywords: breast cancer; diagnostic accuracy; machine learning; patient stratification; screening modalities; ultrasound imaging.

Conflict of interest statement

The authors report no conflicts of interest in this work.

© 2021 Huo et al.

Figures

Figure 1
Representative ultrasound images showing malignant breast lesions. (A) A hypoechoic malignant lesion with irregular shape, calcification (thick arrow), and a non-circumscribed margin (thin arrow). (B) A hypoechoic lesion with an oval shape, circumscribed margins (thin arrow), and posterior enhancement (thick arrow). (C) A heterogeneous, hypoechoic area of structural disorder with irregular shape and parallel orientation.
Figure 2
ROC plots of the calibrated model in the test set (A) and validation set (B).
Figure 3
Calibration plots of the calibrated model in the test set (A) and validation set (B).
Figure 4
Probability distribution by model.

Source: PubMed
