Validation of a Deep Learning Algorithm for the Detection of Malignant Pulmonary Nodules in Chest Radiographs

Hyunsuk Yoo, Ki Hwan Kim, Ramandeep Singh, Subba R Digumarthy, Mannudeep K Kalra

Abstract

Importance: The improvement of pulmonary nodule detection, which is a challenging task when using chest radiographs, may help to elevate the role of chest radiographs for the diagnosis of lung cancer.

Objective: To assess the performance of a deep learning-based nodule detection algorithm for the detection of lung cancer on chest radiographs from participants in the National Lung Screening Trial (NLST).

Design, setting, and participants: This diagnostic study used data from participants in the NLST to assess the performance of a deep learning-based artificial intelligence (AI) algorithm for the detection of pulmonary nodules and lung cancer on chest radiographs using separate training (in-house) and validation (NLST) data sets. Baseline (T0) posteroanterior chest radiographs from 5485 participants (full T0 data set) were used to assess lung cancer detection performance, and a subset of 577 of these images (nodule data set) was used to assess nodule detection performance. Participants aged 55 to 74 years who currently or formerly (ie, quit within the past 15 years) smoked cigarettes for 30 pack-years or more were enrolled in the NLST at 23 US centers between August 2002 and April 2004. Information on lung cancer diagnoses was collected through December 31, 2009. Analyses were performed between August 20, 2019, and February 14, 2020.

Exposures: Abnormality scores produced by the AI algorithm.

Main outcomes and measures: The performance of an AI algorithm for the detection of lung nodules and lung cancer on radiographs, with lung cancer incidence and mortality as primary end points.

Results: A total of 5485 participants (mean [SD] age, 61.7 [5.0] years; 3030 men [55.2%]) were included, with a median follow-up duration of 6.5 years (interquartile range, 6.1-6.9 years). For the nodule data set, the sensitivity and specificity of the AI algorithm for the detection of pulmonary nodules were 86.2% (95% CI, 77.8%-94.6%) and 85.0% (95% CI, 81.9%-88.1%), respectively. For the detection of all cancers, the sensitivity was 75.0% (95% CI, 62.8%-87.2%), the specificity was 83.3% (95% CI, 82.3%-84.3%), the positive predictive value was 3.8% (95% CI, 2.6%-5.0%), and the negative predictive value was 99.8% (95% CI, 99.6%-99.9%). For the detection of malignant pulmonary nodules in all images of the full T0 data set, the sensitivity was 94.1% (95% CI, 86.2%-100.0%), the specificity was 83.3% (95% CI, 82.3%-84.3%), the positive predictive value was 3.4% (95% CI, 2.2%-4.5%), and the negative predictive value was 100.0% (95% CI, 99.9%-100.0%). In digital radiographs of the nodule data set, the AI algorithm had higher sensitivity (96.0% [95% CI, 88.3%-100.0%] vs 88.0% [95% CI, 75.3%-100.0%]; P = .32) and higher specificity (93.2% [95% CI, 89.9%-96.5%] vs 82.8% [95% CI, 77.8%-87.8%]; P = .001) for nodule detection compared with the NLST radiologists. For malignant pulmonary nodule detection on digital radiographs of the full T0 data set, the sensitivity of the AI algorithm was higher (100.0% [95% CI, 100.0%-100.0%] vs 94.1% [95% CI, 82.9%-100.0%]; P = .32) compared with the NLST radiologists, and the specificity (90.9% [95% CI, 89.6%-92.1%] vs 91.0% [95% CI, 89.7%-92.2%]; P = .91), positive predictive value (8.2% [95% CI, 4.4%-11.9%] vs 7.8% [95% CI, 4.1%-11.5%]; P = .65), and negative predictive value (100.0% [95% CI, 100.0%-100.0%] vs 99.9% [95% CI, 99.8%-100.0%]; P = .32) were similar to those of NLST radiologists.
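As a reading aid for the figures above, the following minimal sketch shows how sensitivity, specificity, and predictive values with 95% CIs can be computed from a 2x2 table, and why the positive predictive value stays low in a screening population even when specificity is above 80%. The abstract does not report the underlying counts, the CI method, or the cancer prevalence, so everything here is an assumption: the counts are hypothetical (chosen only so that 65 + 512 equals the 577 images of the nodule data set), the normal-approximation (Wald) interval is one common choice that may differ from the study's method, and the 1% prevalence is illustrative.

```python
import math
from typing import Dict, NamedTuple


class Proportion(NamedTuple):
    estimate: float
    lower: float
    upper: float


def wald_ci(successes: int, total: int, z: float = 1.96) -> Proportion:
    """Proportion with a normal-approximation (Wald) CI, clipped to [0, 1]."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return Proportion(p, max(0.0, p - half_width), min(1.0, p + half_width))


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Proportion]:
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": wald_ci(tp, tp + fn),  # detected among diseased
        "specificity": wald_ci(tn, tn + fp),  # cleared among non-diseased
        "ppv": wald_ci(tp, tp + fp),          # diseased among test-positive
        "npv": wald_ci(tn, tn + fn),          # non-diseased among test-negative
    }


def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Bayes' rule: PPV implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical counts (NOT taken from the study): 65 images with nodules,
# 512 without, for 577 images in total.
metrics = diagnostic_metrics(tp=56, fp=77, fn=9, tn=435)
for name, m in metrics.items():
    print(f"{name}: {m.estimate:.1%} (95% CI, {m.lower:.1%}-{m.upper:.1%})")

# Illustrative prevalence of 1% (an assumption), combined with the reported
# sensitivity (75.0%) and specificity (83.3%) for detection of all cancers.
ppv = ppv_from_prevalence(sensitivity=0.750, specificity=0.833, prevalence=0.01)
print(f"PPV at 1% prevalence: {ppv:.1%}")
```

With a rare outcome, false positives from the large disease-free group outnumber true positives, which is consistent with the pattern reported above of a single-digit positive predictive value alongside a negative predictive value near 100%.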

Conclusions and relevance: In this study, the AI algorithm performed better than NLST radiologists for the detection of pulmonary nodules on digital radiographs. When used as a second reader, the AI algorithm may help to detect lung cancer.

Conflict of interest statement

Conflict of Interest Disclosures: Dr Yoo reported receiving personal fees from Lunit during the conduct of the study. Dr Kim reported receiving personal fees from Lunit during the conduct of the study. Dr Digumarthy reported receiving grants from Lunit during the conduct of the study and providing independent image analysis for hospital-contracted clinical trials for Abbvie, Bristol-Myers Squibb, Cascadian Therapeutics, Clinical Bay Laboratories, Gradalis, Merck, Novartis, Pfizer, Roche, Polaris Pharmaceuticals, and Zai Laboratories; receiving grants from Lunit; and receiving honoraria from Siemens outside the submitted work. Dr Kalra reported receiving grants from Riverain Technologies and Siemens Healthineers outside the submitted work. No other disclosures were reported.

Figures

Figure 1. Receiver Operating Characteristic Curve of the Performance of the Artificial Intelligence Algorithm vs NLST Radiologists for the Detection of Noncalcified Nodules in the Nodule Data Set
Colored lines represent results from the artificial intelligence algorithm, and colored Xs represent results from NLST radiologists. AUROC indicates area under the receiver operating characteristic; CR, computed radiography; DR, digital radiography; and NLST, National Lung Screening Trial.
Figure 2. Frontal Chest Radiographs of Patients With Malignant Pulmonary Nodules Missed by NLST Radiologists But Detected by Artificial Intelligence Algorithm
A, Chest radiograph of woman in her 60s (without AI detection). The woman was diagnosed with lung cancer 86 days after baseline imaging. B, Chest radiograph of woman in her 60s (with AI detection). The AI algorithm detected the missed subtle abnormality (in green, with nodule score of 38%) in the left perihilar region. C, Chest radiograph of man in his 50s (without AI detection). The man was diagnosed with lung cancer 127 days after baseline imaging. D, Chest radiograph of man in his 50s (with AI detection). The AI algorithm detected the missed subcentimeter nodule (in green, with nodule score of 53%) in the right upper lung zone. AI indicates artificial intelligence; and NLST, National Lung Screening Trial.

Source: PubMed
