Diagnostic effect of artificial intelligence solution for referable thoracic abnormalities on chest radiography: a multicenter respiratory outpatient diagnostic cohort study

Kwang Nam Jin, Eun Young Kim, Young Jae Kim, Gi Pyo Lee, Hyungjin Kim, Sohee Oh, Yong Suk Kim, Ju Hyuck Han, Young Jun Cho

Abstract

Objectives: We aimed to evaluate a commercial artificial intelligence (AI) solution on a multicenter cohort of chest radiographs and to compare physicians' ability to detect and localize referable thoracic abnormalities with and without AI assistance.

Methods: In this retrospective diagnostic cohort study, we investigated 6,006 consecutive patients who underwent both chest radiography and CT. We evaluated a commercially available AI solution intended to facilitate the detection of three chest abnormalities (nodules/masses, consolidation, and pneumothorax) against a reference standard to measure its diagnostic performance. In addition, twelve physicians, including thoracic radiologists, board-certified radiologists, radiology residents, and pulmonologists, assessed a dataset of 230 randomly sampled chest radiographic images. Each physician reviewed the images twice, with and without AI assistance, separated by a 4-week washout period. We measured the impact of AI assistance on the observers' AUC, sensitivity, specificity, and the area under the alternative free-response ROC curve (AUAFROC).
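For readers unfamiliar with these metrics, the following is a minimal sketch (not the authors' code) of how per-image sensitivity, specificity, and AUC can be computed from an AI solution's probability outputs against a binary reference standard; the arrays and the 0.5 operating threshold are hypothetical placeholders.

```python
# Illustrative sketch: per-image sensitivity, specificity, and AUC
# for a binary "referable abnormality" label. All values are made up.
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                    # CT-based reference standard
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3])   # AI probability per image

auc = roc_auc_score(y_true, y_score)                           # threshold-free ranking metric

threshold = 0.5                                                # assumed operating point, not from the paper
y_pred = (y_score >= threshold).astype(int)
tp = np.sum((y_pred == 1) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"AUC={auc:.3f}, sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```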

Results: In the entire set (n = 6,006), the AI solution showed average sensitivity, specificity, and AUC of 0.885, 0.723, and 0.867, respectively. In the test dataset (n = 230), the average AUC and AUAFROC across observers significantly increased with AI assistance (from 0.861 to 0.886; p = 0.003 and from 0.797 to 0.822; p = 0.003, respectively).
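Multireader studies such as this one typically compare aided versus unaided performance with dedicated multireader multicase methods (e.g., the Dorfman-Berbaum-Metz procedure). As a simpler illustration of the underlying idea, the sketch below uses a paired case-level bootstrap to put a confidence interval on one hypothetical reader's AUC gain with AI assistance; all data are simulated and this is not the authors' analysis.

```python
# Illustrative sketch: paired bootstrap CI for the AUC gain of one reader
# reading the same 230 cases with and without AI. Simulated data only.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 230
y_true = rng.integers(0, 2, n)                      # hypothetical reference labels
score_unaided = y_true + rng.normal(0, 0.9, n)      # hypothetical unaided reader scores
score_aided = y_true + rng.normal(0, 0.7, n)        # slightly better scores with AI

deltas = []
for _ in range(2000):                               # resample cases with replacement
    idx = rng.integers(0, n, n)
    if y_true[idx].min() == y_true[idx].max():      # skip degenerate single-class resamples
        continue
    deltas.append(roc_auc_score(y_true[idx], score_aided[idx])
                  - roc_auc_score(y_true[idx], score_unaided[idx]))
deltas = np.sort(deltas)
lo, hi = deltas[int(0.025 * len(deltas))], deltas[int(0.975 * len(deltas))]
print(f"AUC gain 95% CI: [{lo:.3f}, {hi:.3f}]")
```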

Conclusions: The diagnostic performance of the AI solution was acceptable for images from respiratory outpatient clinics. Physicians' diagnostic performance improved marginally with the use of the AI solution. Further evaluation of AI assistance for chest radiographs using a prospective design is required to prove its efficacy.

Key points: • AI assistance marginally improved physicians' performance in detecting and localizing referable thoracic abnormalities on chest radiographs. • The detection or localization of referable thoracic abnormalities by pulmonologists and radiology residents improved with AI assistance.

Keywords: Artificial intelligence; Cohort studies; Diagnosis; Radiography; Thorax.

Conflict of interest statement

Kwang Nam Jin and Hyungjin Kim received a research grant from Lunit for activities not related to the present article. The other authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

© 2021. The Author(s).

Figures

Fig. 1
Flow diagram of the study population and study design for AI augmentation test
Fig. 2
Graphs showing receiver operating characteristic curves (a) and jackknife alternative free-response receiver operating characteristic curves (b) of each physician and the AI solution for referable thoracic abnormalities on chest radiographs. TPF, true-positive fraction; FPF, false-positive fraction; LLF, lesion localization fraction; AI, artificial intelligence; GR, general radiologist; P, pulmonologist; RR, radiology resident; TR, thoracic radiologist
Fig. 3
A 54-year-old woman with pneumonia in the right lower lung zone. Chest radiography demonstrated ill-defined ground-glass opacity or consolidation in the right para-hilar area, which was marked with a white outline as the reference standard. a The AI solution correctly detected the lesion with a probability value of 69%. b Chest CT without contrast enhancement demonstrated consolidation and tiny ill-defined nodules in the right middle lobe. c Among the 12 observers, seven detected the lesion without AI assistance; with AI assistance, all 12 did. AI assistance thus led to correct detection of the pneumonia on chest radiographs by the remaining five observers (42%), including two pulmonologists, one thoracic radiologist, one general radiologist, and one radiology resident
Fig. 4
A 56-year-old man with adenocarcinoma of the right upper lobe. A chest radiograph shows a faint nodular opacity in the right upper lung zone. a The AI solution correctly detected the lesion with a probability value of 63%. b Chest CT with contrast enhancement demonstrated a spiculated nodule in the right upper lobe. c Among the 12 observers, two (one pulmonologist and one radiology resident) detected the lesion without AI assistance (unaided reading). In addition, two observers (one thoracic radiologist and one pulmonologist) marked a false-positive lesion in unaided reading. With AI assistance, all observers detected the lesion, and both observers withdrew the false-positive marks made in unaided reading. Regarding visual certainty for the lesion, three observers, including two thoracic radiologists and one pulmonologist, rated a higher score in AI-assisted reading than in unaided reading


Source: PubMed
