Association of Artificial Intelligence-Aided Chest Radiograph Interpretation With Reader Performance and Efficiency

Jong Seok Ahn, Shadi Ebrahimian, Shaunagh McDermott, Sanghyup Lee, Laura Naccarato, John F Di Capua, Markus Y Wu, Eric W Zhang, Victorine Muse, Benjamin Miller, Farid Sabzalipour, Bernardo C Bizzo, Keith J Dreyer, Parisa Kaviani, Subba R Digumarthy, Mannudeep K Kalra

Abstract

Importance: The efficient and accurate interpretation of radiologic images is paramount.

Objective: To evaluate whether a deep learning-based artificial intelligence (AI) engine used concurrently can improve reader performance and efficiency in interpreting chest radiograph abnormalities.

Design, setting, and participants: This multicenter cohort study was conducted from April to November 2021 and involved radiologists, including attending radiologists, thoracic radiology fellows, and residents, who independently participated in 2 observer performance test sessions: a reading session with AI and a session without AI, performed in a randomized crossover manner with a 4-week washout period in between. The AI produced a heat map and an image-level probability of the presence of each referable lesion. The data were collected at 2 quaternary academic hospitals in Boston, Massachusetts: Beth Israel Deaconess Medical Center (The Medical Information Mart for Intensive Care Chest X-Ray [MIMIC-CXR]) and Massachusetts General Hospital (MGH).
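The article describes the engine's output (a lesion heat map plus an image-level probability) but not its internals. As a concrete picture, the sketch below shows one common way such paired outputs are derived, a class activation mapping (CAM)-style head over a CNN's last convolutional features; the function `cam_and_probability`, the tensor shapes, and the bias value are illustrative assumptions, not the study's implementation.

```python
# Minimal CAM-style sketch (assumed, not the study's engine): derive an
# image-level probability and a heat map from last-layer conv features.
import numpy as np
from scipy.ndimage import zoom

def cam_and_probability(features, fc_weights, fc_bias, out_size=(512, 512)):
    """features: (C, H, W) conv activations; fc_weights: (C,) classifier weights."""
    # Image-level logit via global average pooling, then sigmoid -> probability.
    pooled = features.mean(axis=(1, 2))               # (C,)
    prob = 1.0 / (1.0 + np.exp(-(pooled @ fc_weights + fc_bias)))
    # Class activation map: channel-weighted sum of the feature maps.
    cam = np.tensordot(fc_weights, features, axes=1)  # (H, W)
    cam = np.maximum(cam, 0.0)
    cam /= cam.max() + 1e-8                           # normalize to [0, 1]
    # Upsample to radiograph resolution for overlay display.
    factors = (out_size[0] / cam.shape[0], out_size[1] / cam.shape[1])
    return prob, zoom(cam, factors, order=1)

# Demo with random activations: 16 channels over a 16 x 16 grid.
rng = np.random.default_rng(0)
prob, cam = cam_and_probability(rng.random((16, 16, 16)), rng.random(16), -2.0)
print(f"lesion probability: {prob:.3f}, heat map shape: {cam.shape}")
```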

Main outcomes and measures: The ground truth for each label was created via consensus reading by 2 thoracic radiologists. Each reader documented their findings in a customized report template, in which the presence of the 4 target chest radiograph findings and the reader's confidence in each finding were recorded. The time taken to report each chest radiograph was also recorded. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC) were calculated for each target finding.
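To make the stated measures concrete, here is a hedged sketch of how sensitivity, specificity, and AUROC can be computed for one target finding with scikit-learn; the labels, confidence scores, and the 0.5 operating threshold are illustrative placeholders, not study data.

```python
# Per-finding metrics named above, computed on illustrative (not study) data.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                   # ground-truth labels
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3])  # reported confidence

auroc = roc_auc_score(y_true, y_score)       # threshold-free ranking metric
y_pred = (y_score >= 0.5).astype(int)        # assumed operating threshold
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)                 # true-positive rate
specificity = tn / (tn + fp)                 # true-negative rate
print(f"AUROC={auroc:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```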

Results: A total of 6 radiologists (2 attending radiologists, 2 thoracic radiology fellows, and 2 residents) participated in the study. The study involved 497 frontal chest radiographs from adult patients with and without the 4 target findings (pneumonia, nodule, pneumothorax, and pleural effusion): 247 from the MIMIC-CXR data set (demographic data for patients were not available) and 250 from MGH (mean [SD] age, 63 [16] years; 133 men [53.2%]). The target findings were present in 351 of the 497 chest radiographs. Compared with the readers, the AI had higher stand-alone sensitivity for nodule (0.816 [95% CI, 0.732-0.882] vs 0.567 [95% CI, 0.524-0.611]), pneumonia (0.887 [95% CI, 0.834-0.928] vs 0.673 [95% CI, 0.632-0.714]), and pneumothorax (0.988 [95% CI, 0.932-1.000] vs 0.792 [95% CI, 0.756-0.827]) and comparable sensitivity for pleural effusion (0.872 [95% CI, 0.808-0.921] vs 0.889 [95% CI, 0.862-0.917]). AI-aided interpretation was associated with significantly improved reader sensitivities for all target findings, without negative effects on specificity. Overall, reader AUROCs improved for all 4 target findings, with significant improvements in the detection of pneumothorax and nodule. Mean reporting time decreased from 40.8 seconds without AI to 36.9 seconds with AI, an approximately 10% reduction (difference, 3.9 seconds; 95% CI, 2.9-5.2 seconds; P < .001).
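The approximately 10% time saving follows directly from the reported means; a quick check, with values copied from the Results above:

```python
# Reporting-time reduction implied by the Results section.
without_ai, with_ai = 40.8, 36.9       # mean seconds per radiograph
diff = without_ai - with_ai            # 3.9 s, the reported difference
print(f"reduction: {diff:.1f} s = {diff / without_ai:.1%} of the unaided time")  # ~9.6%
```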

Conclusions and relevance: These findings suggest that AI-aided interpretation was associated with improved reader performance and efficiency for identifying major thoracic findings on a chest radiograph.

Conflict of interest statement

Conflict of Interest Disclosures: Dr Naccarato reported receiving personal fees from Massachusetts General Hospital for time interpreting chest radiographs during the conduct of the study. Dr Digumarthy reported receiving personal fees from Siemens Healthineers and grants from Lunit, GE, Vuno, and QureAI outside the submitted work; Dr Digumarthy also provides independent image analysis for hospital-contracted clinical research trials programs for Merck, Pfizer, Bristol Myers Squibb, Novartis, Roche, Polaris, Cascadian, Abbvie, Gradalis, Bayer, Zai Laboratories, Shanghai Bioscience, Biengen, Resonance, Riverain, and Analise. Dr Kalra reported receiving grants from Lunit, Inc, for an unrelated study outside the submitted work. No other disclosures were reported.

Figures

Figure 1. Different Display Modes Available for the Artificial Intelligence Output
Shown are the color heat map (A), grayscale contour map (B), combined map (C), and single-color map (D).
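As a rough visual analogue of the 4 display modes in the caption, the sketch below renders a synthetic radiograph and heat map with matplotlib; the colormaps, transparency, contour levels, and the 0.5 mask threshold for the single-color mode are assumptions, not the vendor's settings.

```python
# Approximate the 4 display modes with synthetic data (not study images).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
img = rng.random((64, 64))                                # stand-in radiograph
yy, xx = np.mgrid[0:64, 0:64]
cam = np.exp(-((yy - 40) ** 2 + (xx - 24) ** 2) / 200.0)  # stand-in heat map

fig, axes = plt.subplots(1, 4, figsize=(12, 3))
for ax, title in zip(axes, ["Color heat map", "Grayscale contour",
                            "Combined", "Single color"]):
    ax.imshow(img, cmap="gray")                           # base radiograph
    ax.set_title(title)
    ax.axis("off")
axes[0].imshow(cam, cmap="jet", alpha=0.4)                # A: color overlay
axes[1].contour(cam, levels=5, colors="white")            # B: grayscale contours
axes[2].imshow(cam, cmap="jet", alpha=0.4)                # C: overlay + contours
axes[2].contour(cam, levels=5, colors="white")
axes[3].imshow(np.ma.masked_where(cam < 0.5, cam),
               cmap="autumn", alpha=0.5)                  # D: thresholded, one hue
plt.tight_layout()
plt.show()
```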
Figure 2. Receiver Operating Characteristic Curves of a Deep-Learning Artificial Intelligence (AI) Algorithm for the Target Findings and Comparison Against the Reader Performance
Graphs show data for nodules (A), pleural effusions (B), pneumonia (C), and pneumothorax (D). Diagonal lines denote the reference line of no discrimination (chance performance). A indicates attending radiologist; F, fellow; R, resident.
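Figure 2's layout, the AI's ROC curve with each reader plotted as a single operating point, can be reproduced schematically as below; the synthetic scores and the reader coordinates are hypothetical placeholders, not the study's values.

```python
# Schematic Figure 2-style panel: AI ROC curve plus reader operating points.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

rng = np.random.default_rng(2)
y = rng.integers(0, 2, 200)                                   # synthetic ground truth
scores = np.clip(0.4 * y + rng.normal(0.3, 0.25, 200), 0, 1)  # synthetic AI scores
fpr, tpr, _ = roc_curve(y, scores)

plt.plot(fpr, tpr, label="AI")
plt.plot([0, 1], [0, 1], "k--", label="Chance")               # no-discrimination line
# Hypothetical reader operating points: (1 - specificity, sensitivity).
for name, (x, s) in {"A1": (0.10, 0.62), "F1": (0.12, 0.70),
                     "R1": (0.18, 0.55)}.items():
    plt.scatter(x, s)
    plt.annotate(name, (x, s))
plt.xlabel("1 - Specificity")
plt.ylabel("Sensitivity")
plt.legend()
plt.show()
```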

Source: PubMed
