Inter-Reader Reliability of Early FDG-PET/CT Response Assessment Using the Deauville Scale after 2 Cycles of Intensive Chemotherapy (OEPA) in Hodgkin's Lymphoma
Regine Kluge, Lidia Chavdarova, Martha Hoffmann, Carsten Kobe, Bogdan Malkowski, Françoise Montravers, Lars Kurch, Thomas Georgi, Markus Dietlein, W Hamish Wallace, Jonas Karlen, Ana Fernández-Teijeiro, Michaela Cepelova, Lorrain Wilson, Eva Bergstraesser, Osama Sabri, Christine Mauz-Körholz, Dieter Körholz, Dirk Hasenclever
Abstract
Purpose: The five-point Deauville (D) scale is widely used to assess interim PET metabolic response to chemotherapy in Hodgkin lymphoma (HL) patients. An International Validation Study reported good concordance among reviewers in ABVD-treated, advanced-stage HL patients for the binary discrimination between scores D1,2,3 and scores D4,5. Inter-reader reliability of the whole scale is not well characterised.
Methods: Five international expert readers scored 100 interim PET/CT scans from paediatric HL patients. Scans were acquired in 51 European hospitals after two courses of OEPA chemotherapy (according to the EuroNet-PHL-C1 study). Images were interpreted in direct comparison with staging PET/CTs.
Results: The probability that two random readers agree on the five-point D score of a random case is only 42% (global kappa = 0.24). Aggregating to a three-point scale (D1,2 vs. D3 vs. D4,5) improves concordance to 60% (kappa = 0.34). Concordance when one of two readers assigns a given score is 70% for D1,2, only 36% for D3, and 64% for D4,5. Concordance for the binary decision D1,2 vs. D3,4,5 is 67% (kappa = 0.36), and for D1,2,3 vs. D4,5 it is 86% (kappa = 0.56). If one reader assigns D1,2,3, the concordance probability is 92%, but it is only 64% if D4,5 is called. Discrepancies occur mainly in the mediastinum, neck and skeleton.
Conclusion: Inter-reader reliability of the five-point D scale is poor in this interobserver analysis of paediatric patients who underwent OEPA. Inter-reader variability is maximal in cases assigned to D2 or D3. The binary distinction D1,2,3 versus D4,5 is the most reliable criterion for clinical decision-making.
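The concordance measures quoted in the Results can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes that "concordance" means the fraction of agreeing reader pairs pooled over all cases, that the global kappa corrects this for chance using the pooled score distribution of all readers (a Fleiss-style correction), and that the conditional concordance is the probability that a second reader assigns a category given that a first reader did. The function names, the `THREE_POINT` and `BINARY` mappings, and the toy `ratings` table are hypothetical and are not study data.

```python
from collections import Counter
from itertools import combinations, permutations

# Toy data (made up, NOT from the study): one row per case, one score per reader.
ratings = [
    [1, 1, 2],
    [3, 4, 4],
    [5, 5, 5],
    [2, 3, 3],
]

# Hypothetical aggregation maps for the coarser scales discussed in the abstract.
THREE_POINT = {1: "D1,2", 2: "D1,2", 3: "D3", 4: "D4,5", 5: "D4,5"}
BINARY = {1: "D1,2,3", 2: "D1,2,3", 3: "D1,2,3", 4: "D4,5", 5: "D4,5"}


def aggregate(table, mapping):
    """Recode every score in the table through the given mapping."""
    return [[mapping[score] for score in case] for case in table]


def pairwise_concordance(table):
    """Fraction of reader pairs (within a case) that assign the same score."""
    agree = total = 0
    for case in table:
        for a, b in combinations(case, 2):
            agree += a == b
            total += 1
    return agree / total


def chance_concordance(table):
    """Agreement expected if both readers drew from the pooled score distribution."""
    counts = Counter(score for case in table for score in case)
    n = sum(counts.values())
    return sum((c / n) ** 2 for c in counts.values())


def global_kappa(table):
    """Chance-corrected concordance (assumption: Fleiss-style pooled marginals)."""
    p_obs = pairwise_concordance(table)
    p_exp = chance_concordance(table)
    return (p_obs - p_exp) / (1 - p_exp)


def conditional_concordance(table, category):
    """P(second reader assigns `category` | first reader assigned `category`)."""
    hits = total = 0
    for case in table:
        for first, second in permutations(case, 2):
            if first == category:
                total += 1
                hits += second == category
    return hits / total


if __name__ == "__main__":
    for name, table in [
        ("five-point", ratings),
        ("three-point", aggregate(ratings, THREE_POINT)),
        ("binary", aggregate(ratings, BINARY)),
    ]:
        print(name, round(pairwise_concordance(table), 3), round(global_kappa(table), 3))
    print("given D4,5:", conditional_concordance(aggregate(ratings, BINARY), "D4,5"))
```

Running the same functions on the aggregated tables shows why coarser scales yield higher raw concordance: merging categories can only turn disagreements into agreements, never the reverse. Kappa does not have to rise with it, because the chance-expected agreement grows as well.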
Conflict of interest statement
Competing Interests: The authors have declared that no competing interests exist.
Source: PubMed