Effect of deep learning-based assistive technology use on chest radiograph interpretation by emergency department physicians: a prospective interventional simulation-based study

Ji Hoon Kim, Sang Gil Han, Ara Cho, Hye Jung Shin, Song-Ee Baek

Abstract

Background: Interpretation of chest radiographs (CRs) by emergency department (ED) physicians is inferior to that by radiologists. Recent studies have investigated the effect of deep learning-based assistive technology on CR interpretation (DLCR), although its relevance to ED physicians remains unclear. This study aimed to investigate whether DLCR supports CR interpretation and the clinical decision-making of ED physicians.

Methods: We conducted a prospective interventional study using a web-based performance assessment system. Participants were recruited through an official notice addressed to board-certified emergency physicians and residents working at the study ED. Of the eight ED physicians who volunteered, seven were included; one withdrew during the performance assessment. The seven physicians' CR interpretations and clinical decision-making were assessed using clinical data from 388 patients, including detection of the target lesion with DLCR. Participant performance was evaluated using area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and accuracy analyses; decision-making consistency was measured using kappa statistics. ED physicians with < 24 months of experience were defined as 'inexperienced'.
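
As an illustration of the measures named in the Methods, the following minimal Python sketch computes sensitivity, specificity, and accuracy against a reference standard, and a kappa value for agreement between a reader's decisions without and with assistance. It assumes binary labels (1 = abnormal) and uses Cohen's kappa; the abstract says only "kappa statistics", so that choice, the function names, and the toy data are assumptions for illustration and are not taken from the study.

```python
from collections import Counter


def confusion_metrics(truth, predicted):
    """Sensitivity, specificity, and accuracy for binary labels (1 = abnormal)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def cohen_kappa(first, second):
    """Cohen's kappa: chance-corrected agreement between two sets of ratings."""
    n = len(first)
    observed = sum(1 for a, b in zip(first, second) if a == b) / n
    count_first, count_second = Counter(first), Counter(second)
    labels = set(count_first) | set(count_second)
    expected = sum(count_first[k] * count_second[k] for k in labels) / n**2
    if expected == 1:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy data (not from the study): eight cases, one reader.
truth = [1, 1, 1, 1, 0, 0, 0, 0]         # reference standard
without_dlcr = [1, 1, 0, 0, 0, 0, 0, 1]  # reader's calls, unassisted
with_dlcr = [1, 1, 1, 0, 0, 0, 0, 1]     # reader's calls, assisted

print(confusion_metrics(truth, without_dlcr))
print(confusion_metrics(truth, with_dlcr))
print(cohen_kappa(without_dlcr, with_dlcr))
```

In this toy example the reader changes one call after assistance, so agreement between the two readings stays high while sensitivity rises; kappa near 1 indicates that assistance rarely changed the decision.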

Results: Among the 388 simulated cases, 259 (66.8%) had CR abnormality. The median DLCR abnormality score for these cases was 59.3 (31.77, 76.25), compared with 3.35 (1.57, 8.89) for cases with normal CR. Performance differed between ED physicians working with and without DLCR (AUROC: 0.801, P < 0.001). The diagnostic sensitivity and accuracy of CR interpretation were higher for all ED physicians when working with DLCR than when working without it. The overall kappa value for decision-making consistency was 0.902 (95% confidence interval [CI] 0.884-0.920); the kappa value was 0.956 (95% CI 0.934-0.979) for the experienced group and 0.862 (95% CI 0.835-0.889) for the inexperienced group.
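
For readers unfamiliar with AUROC, the sketch below shows one standard way to compute it from continuous scores: as the probability that a randomly chosen abnormal case receives a higher score than a randomly chosen normal case, with ties counted as half (the Mann-Whitney formulation). The function name and the scores are hypothetical and chosen only for illustration; they are not study data, and the abstract does not state how the authors computed AUROC.

```python
def auroc(scores_abnormal, scores_normal):
    """AUROC as the fraction of abnormal/normal pairs ranked correctly.

    Ties between an abnormal and a normal score count as half a correct pair.
    """
    correct = 0.0
    for a in scores_abnormal:
        for n in scores_normal:
            if a > n:
                correct += 1.0
            elif a == n:
                correct += 0.5
    return correct / (len(scores_abnormal) * len(scores_normal))


# Hypothetical abnormality scores on a 0-100 scale (not from the study).
abnormal_scores = [60, 32, 76, 5]
normal_scores = [3, 2, 9, 40]

print(auroc(abnormal_scores, normal_scores))
```

An AUROC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation of abnormal from normal cases.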

Conclusions: This study presents preliminary evidence that ED physicians using DLCR in a clinical setting perform better at CR interpretation than their counterparts who do not use this technology. DLCR use influenced the clinical decision-making of inexperienced physicians more strongly than that of experienced physicians. These findings require prospective validation before DLCR can be recommended for use in routine clinical practice.

Keywords: Chest radiograph; Decision-making; Deep learning-based assistive technology; Emergency department.

Conflict of interest statement

The authors declare that they have no competing interests.

© 2021. The Author(s).

Figures

Fig. 1
Representative case for performance assessment. (Left) CRs and the patients’ clinical and demographic characteristics were presented to the participating ED physicians in the first step. (Right) In the second step, the same information was presented, although the assessment was made using DLCR. CR, chest radiograph; ED, emergency department; DLCR, deep learning-based assistive technology on CR interpretation
Fig. 2
Flowchart of changes in clinical decisions after assessing CRs with deep learning-based assistive technology. CR, chest radiograph

Source: PubMed
