Identifying normal mammograms in a large screening population using artificial intelligence

Kristina Lång, Magnus Dustler, Victor Dahlblom, Anna Åkesson, Ingvar Andersson, Sophia Zackrisson

Abstract

Objectives: To evaluate the potential of artificial intelligence (AI) to identify normal mammograms in a screening population.

Methods: In this retrospective study, 9581 double-read mammography screening exams, including 68 screen-detected cancers and 187 false positives, were analysed with a deep learning-based AI system. The exams were a subcohort of the prospective population-based Malmö Breast Tomosynthesis Screening Trial. The AI system assigns each mammogram a cancer risk score from 1 (lowest) to 10 (highest). The effect on cancer detection and false positives of excluding mammograms below different AI risk thresholds from radiologist reading was investigated. A panel of three breast radiologists assessed the radiographic appearance, type, and visibility of screen-detected cancers assigned low-risk scores (≤ 5). The reductions in normal exams, cancers, and false positives at the different thresholds were presented with 95% confidence intervals (CI).
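The threshold analysis described in Methods can be sketched in a few lines. This is a minimal illustration only, not the authors' code: the function name `summarise_exclusion`, the `TriageSummary` record, and the ten toy exams are all hypothetical, and the convention that an exam is excluded when its score is at or below the threshold is an assumption chosen to match "scored 1 and 2" and "low-risk scores (≤ 5)".

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TriageSummary:
    """Counts of exams that would be removed from reading at one threshold."""
    threshold: int
    exams_removed: int
    cancers_removed: int
    false_positives_removed: int


def summarise_exclusion(
    scores: Sequence[int],
    is_cancer: Sequence[bool],
    is_false_positive: Sequence[bool],
    threshold: int,
) -> TriageSummary:
    """Count exams, cancers and false positives with AI score <= threshold.

    Assumption: 'excluded' means score <= threshold (scores run 1..10).
    """
    removed = [i for i, score in enumerate(scores) if score <= threshold]
    return TriageSummary(
        threshold=threshold,
        exams_removed=len(removed),
        cancers_removed=sum(1 for i in removed if is_cancer[i]),
        false_positives_removed=sum(1 for i in removed if is_false_positive[i]),
    )


# Hypothetical toy data (NOT study data): ten exams with AI scores 1..10.
toy_scores = [1, 2, 2, 3, 5, 6, 8, 9, 10, 10]
toy_cancer = [False, False, False, True, False, False, False, True, True, True]
toy_false_pos = [False, True, False, False, False, True, True, False, False, False]

for t in (2, 5):
    print(summarise_exclusion(toy_scores, toy_cancer, toy_false_pos, t))
```

Running the same function over every threshold from 1 to 9 would give the kind of trade-off table the study reports: the workload removed at each threshold, set against the cancers and false positives removed with it.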

Results: If mammograms scored 1 and 2 were excluded from screen-reading, 1829 (19.1%; 95% CI 18.3-19.9) exams could be removed, including 10 (5.3%; 95% CI 2.1-8.6) false positives but no cancers. In total, 5082 (53.0%; 95% CI 52.0-54.0) exams, including 7 (10.3%; 95% CI 3.1-17.5) cancers and 52 (27.8%; 95% CI 21.4-34.2) false positives, had low-risk scores (≤ 5). All but one of the seven screen-detected cancers with low-risk scores were judged to be clearly visible.
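The percentages and intervals in Results can be checked from the raw counts. The abstract does not say which interval method was used. The sketch below assumes a normal-approximation (Wald) interval with z = 1.96, and the helper name `wald_ci` is hypothetical.

```python
import math


def wald_ci(count: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: Wald interval; the abstract does not name the method used.
    """
    p = count / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the Methods and Results text above.
reported = [
    ("Exams scored 1-2", 1829, 9581),
    ("False positives scored 1-2", 10, 187),
    ("Exams scored <= 5", 5082, 9581),
    ("Cancers scored <= 5", 7, 68),
    ("False positives scored <= 5", 52, 187),
]

for label, count, total in reported:
    p, lo, hi = wald_ci(count, total)
    print(f"{label}: {count}/{total} = {100 * p:.1f}% "
          f"(95% CI {100 * lo:.1f}-{100 * hi:.1f})")
```

Under this assumption the computed values agree with the reported figures to one decimal place. For the small groups (7 of 68 cancers, 10 of 187 false positives) a Wilson or exact interval would differ somewhat from the Wald interval.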

Conclusions: The evaluated AI system can correctly identify a proportion of a screening population as cancer-free and also reduce false positives. Thus, AI has the potential to improve mammography screening efficiency.

Key points:
• Retrospective study showed that AI can identify a proportion of mammograms as normal in a screening population.
• Excluding normal exams from screening using AI can reduce false positives.

Keywords: Artificial intelligence; Breast cancer; Mammography; Mass screening.

Conflict of interest statement

The authors of this manuscript declare relationships with the following companies: Siemens Healthineers (KL, IA, MD, and SZ received speaker fees).

Figures

Fig. 1
Distribution of AI risk scores for all mammography screening exams, screen-detected cancers, and false positives
Fig. 2
A cancer missed by the AI system. A 7-mm invasive tubular cancer (grade 1) with the radiographic appearance of a spiculated mass, which was assigned an AI risk score of 3. MLO, mediolateral oblique view; CC, craniocaudal view
Fig. 3
Distribution of AI risk scores in relation to radiographic appearance of screen-detected cancers. Three cancers are not included in the analysis (women recalled due to enlarged lymph node or due to symptoms)

Source: PubMed
