Does training improve diagnostic accuracy and inter-rater agreement in applying the Berlin radiographic definition of acute respiratory distress syndrome? A multicenter prospective study

Jin-Min Peng, Chuan-Yun Qian, Xiang-You Yu, Ming-Yan Zhao, Shu-Sheng Li, Xiao-Chun Ma, Yan Kang, Fa-Chun Zhou, Zhen-Yang He, Tie-He Qin, Yong-Jie Yin, Li Jiang, Zhen-Jie Hu, Ren-Hua Sun, Jian-Dong Lin, Tong Li, Da-Wei Wu, You-Zhong An, Yu-Hang Ai, Li-Hua Zhou, Xiang-Yuan Cao, Xi-Jing Zhang, Rong-Qing Sun, Er-Zhen Chen, Bin Du, China Critical Care Clinical Trial Group (CCCCTG)

Abstract

Background: Poor inter-rater reliability in chest radiograph interpretation has been reported in the context of acute respiratory distress syndrome (ARDS), although not for the Berlin definition of ARDS. We sought to examine the effect of training material on the accuracy and consistency of intensivists' chest radiograph interpretations for ARDS diagnosis.

Methods: We conducted a rater agreement study in which 286 intensivists (residents 41.3%, junior attending physicians 35.3%, and senior attending physicians 23.4%) independently reviewed the same 12 chest radiographs developed by the ARDS Definition Task Force ("the panel") before and after training. Radiographic diagnoses by the panel were classified into consistent (n = 4), equivocal (n = 4), and inconsistent (n = 4) categories and were used as the reference standard. The 1.5-hour training course attended by all 286 intensivists included an introduction to the diagnostic rationale and a subsequent in-depth discussion to reach consensus on all 12 radiographs.

Results: Overall diagnostic accuracy, which was defined as the percentage of chest radiographs that were interpreted correctly, improved but remained poor after training (42.0 ± 14.8% before training vs. 55.3 ± 23.4% after training, p < 0.001). Diagnostic sensitivity and specificity improved after training for all diagnostic categories (p < 0.001), with the exception of specificity for the equivocal category (p = 0.883). Diagnostic accuracy was higher for the consistent category than for the inconsistent and equivocal categories (p < 0.001). Comparisons of pre-training and post-training results revealed that inter-rater agreement was poor and did not improve after training, as assessed by overall agreement (0.450 ± 0.406 vs. 0.461 ± 0.575, p = 0.792), Fleiss's kappa (0.133 ± 0.575 vs. 0.178 ± 0.710, p = 0.405), and intraclass correlation coefficient (ICC; 0.219 vs. 0.276, p = 0.470).

Conclusions: The radiographic diagnostic accuracy and inter-rater agreement were poor when the Berlin radiographic definition was used, and were not significantly improved by the training set of chest radiographs developed by the ARDS Definition Task Force.

Trial registration: The study was registered at ClinicalTrials.gov (registration number NCT01704066) on 6 October 2012.

Keywords: Acute respiratory distress syndrome; Chest radiograph; Diagnostic accuracy; Inter-rater variability.

Figures

Fig. 1
Diagnostic accuracies for 12 chest radiographs for the 286 participating intensivists before and after training. Consistent, chest radiographs consistent with ARDS, as judged by the panel; equivocal, chest radiographs equivocal for ARDS, as judged by the panel; inconsistent, chest radiographs inconsistent with ARDS, as judged by the panel
Fig. 2
Distribution of 286 intensivists by numbers of correctly diagnosed chest radiographs before and after training


Source: PubMed
