Application of a Bayesian graded response model to characterize areas of disagreement between clinician and patient grading of symptomatic adverse events

Thomas M Atkinson, Bryce B Reeve, Amylou C Dueck, Antonia V Bennett, Tito R Mendoza, Lauren J Rogak, Ethan Basch, Yuelin Li

Abstract

Background: Traditional concordance metrics have shortcomings that depend on dataset characteristics (e.g., multiple attributes rated, missing data); it is therefore necessary to explore supplemental approaches to quantifying agreement between independent assessments. The purpose of this methodological paper is to apply an Item Response Theory (IRT)-based framework to an existing dataset that included unidimensional clinician and multiple-attribute patient ratings of symptomatic adverse events (AEs), and to explore the utility of this method in patient-reported outcome (PRO) and health-related quality of life (HRQOL) research.

Methods: Data were derived from a National Cancer Institute-sponsored study examining the validity of a measurement system (PRO-CTCAE) for patient self-reporting of AEs in cancer patients receiving treatment (N = 940). AEs included 13 multiple-attribute patient-reported symptoms that had corresponding unidimensional clinician AE grades. A Bayesian IRT model was fitted to calculate the latent grading thresholds between raters. The posterior mean values of the model-fitted item responses were calculated to represent model-based AE grades obtained from patients and clinicians.
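To make the idea of rater-specific grading thresholds concrete, the following is a minimal sketch of the graded response model's response function, assuming the standard logistic (Samejima) form. It is not the authors' Bayesian fitting procedure: the function names, the discrimination value, and both sets of thresholds are hypothetical and chosen only for illustration. Given a latent severity θ, it computes each rater's grade probabilities and the expected (model-based) grade.

```python
import math


def grm_category_probs(theta, discrimination, thresholds):
    """Graded response model: P(grade = k | theta) for k = 0..K.

    `thresholds` must be in increasing order; there is one threshold per
    boundary between adjacent grades.
    """
    # Cumulative probabilities P(grade >= k) for k = 1..K (logistic form).
    cumulative = [
        1.0 / (1.0 + math.exp(-discrimination * (theta - b)))
        for b in thresholds
    ]
    # Pad with P(grade >= 0) = 1 and P(grade >= K + 1) = 0, then difference.
    bounds = [1.0] + cumulative + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


def expected_grade(theta, discrimination, thresholds):
    """Model-based grade: the probability-weighted mean of the grades."""
    probs = grm_category_probs(theta, discrimination, thresholds)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical thresholds (NOT estimates from the study): the clinician needs
# a higher latent severity than the patient before moving up a grade.
PATIENT_THRESHOLDS = [-1.0, 0.0, 1.0, 2.0]
CLINICIAN_THRESHOLDS = [-0.5, 0.8, 2.0, 3.0]
DISCRIMINATION = 1.5  # hypothetical, shared by both raters for simplicity

if __name__ == "__main__":
    for theta in (-1.0, 0.0, 1.0, 2.0):
        patient = expected_grade(theta, DISCRIMINATION, PATIENT_THRESHOLDS)
        clinician = expected_grade(theta, DISCRIMINATION, CLINICIAN_THRESHOLDS)
        print(f"theta={theta:+.1f}  patient={patient:.2f}  "
              f"clinician={clinician:.2f}  difference={patient - clinician:.2f}")
```

In a Bayesian analysis the thresholds and discrimination would be sampled from their posterior distribution rather than fixed, and the expected grades would be averaged over those posterior draws.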

Results: Model-based AE grades showed a general pattern of clinician underestimation relative to patient-graded AEs. However, the magnitude of clinician underestimation was associated with AE severity: it was more pronounced for model-estimated AEs that were moderate to very severe, and less pronounced for mild AEs.

Conclusions: The Bayesian IRT approach reconciles multiple symptom attributes and elaborates on the patterns of clinician-patient non-concordance beyond that provided by traditional metrics. This IRT-based technique may be used as a supplemental tool to detect and characterize nuanced differences in patient-, clinician-, and proxy-based ratings of HRQOL and patient-centered outcomes.

Trial registration: ClinicalTrials.gov NCT01031641. Registered 1 December 2009.

Keywords: Clinician-patient agreement; Item response theory; Neoplasms; Patient-reported outcomes.

Conflict of interest statement

Ethics approval and consent to participate

The study was approved by the institutional review boards at the National Cancer Institute and all participating sites. All study participants provided written informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Graded Response Model tracelines depicting two orthopedic surgeons’ responses in classifying hip fracture severity. Note: Posterior mean severity locations of three patients are superimposed. Plot recreated using published data [6]
Fig. 2
Graded Response Model Estimates for Patients/Clinicians, and Difference between Patient and Clinicians for Two-Attribute Symptoms. Note: X-axis represents underlying distribution of AE in the population (θ parameter in the GRM); Y-axis represents the model estimated AE ratings. In the case of fatigue, θ represents severity and interference with daily activities
Fig. 3
Graded Response Model Estimates for Patients/Clinicians, and Difference between Patient and Clinicians for Three-Attribute Symptoms. Note: X-axis represents underlying distribution of AE in the population (θ parameter in the GRM); Y-axis represents the model estimated AE ratings. In the case of pain, θ represents frequency, severity, and interference with daily activities

References

    1. Uebersax JS. Validity inferences from interobserver agreement. Psychol Bull. 1988;104(3):405–416. doi: 10.1037/0033-2909.104.3.405.
    2. Uebersax JS. Modeling approaches for the analysis of observer agreement. Investig Radiol. 1992;27(9):738–743. doi: 10.1097/00004424-199209000-00017.
    3. Uebersax JS. Diversity of decision-making models and the measurement of interrater agreement. Psychol Bull. 1987;101(1):140–146. doi: 10.1037/0033-2909.101.1.140.
    4. Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychol Bull. 1979;86:420–428. doi: 10.1037/0033-2909.86.2.420.
    5. Atkinson TM, Li Y, Coffey CW, Sit L, Shaw M, Lavene D, Bennett AV, Fruscione M, Rogak L, Hay J, Gonen M, Schrag D, Basch E. Reliability of adverse symptom event reporting by clinicians. Qual Life Res. 2012;21(7):1159–1164. doi: 10.1007/s11136-011-0031-4.
    6. Baldwin P, Bernstein J, Wainer H. Hip psychometrics. Stat Med. 2009;28(17):2277–2292. doi: 10.1002/sim.3616.
    7. Linacre JM. Constructing measurement with a many-facet Rasch model. 1991.
    8. Atkinson TM, Rogak LJ, Heon N, Ryan SJ, Shaw M, Stark LP, Bennett AV, Basch E, Li Y. Exploring differences in adverse symptom event grading thresholds between clinicians and patients in the clinical trial setting. J Cancer Res Clin Oncol. 2017;143(4):735–743. doi: 10.1007/s00432-016-2335-9.
    9. Basch E. The missing voice of patients in drug-safety reporting. N Engl J Med. 2010;362(10):865–869. doi: 10.1056/NEJMp0911494.
    10. Xiao C, Polomano R, Bruner DW. Comparison between patient-reported and clinician-observed symptoms in oncology. Cancer Nurs. 2013;36(6):E1–E16. doi: 10.1097/NCC.0b013e318269040f.
    11. Atkinson TM, Ryan SJ, Bennett AV, Stover AM, Saracino RM, Rogak LJ, Jewell ST, Matsoukas K, Li Y, Basch E. The association between clinician-based common terminology criteria for adverse events (CTCAE) and patient-reported outcomes (PRO): A systematic review. Support Care Cancer. 2016;24(8):3669–3676. doi: 10.1007/s00520-016-3297-9.
    12. Basch E, Iasonos A, McDonough T, Barz A, Culkin A, Kris MG, Scher HI, Schrag D. Patient versus clinician symptom reporting using the National Cancer Institute common terminology criteria for adverse events: Results of a questionnaire-based study. Lancet Oncol. 2006;7:903–909. doi: 10.1016/S1470-2045(06)70910-X.
    13. Basch E, Jia X, Heller G, Barz A, Sit L, Fruscione M, Appawu M, Iasonos A, Atkinson T, Goldfarb S, Culkin A, Kris MG, Schrag D. Adverse symptom event reporting by patients vs clinicians: Relationships with clinical outcomes. J Natl Cancer Inst. 2009;101(23):1624–1632. doi: 10.1093/jnci/djp386.
    14. Bennett BK, Park SB, Lin CS, Friedlander ML, Kiernan MC, Goldstein D. Impact of oxaliplatin-induced neuropathy: A patient perspective. Support Care Cancer. 2012;20(11):2959–2967. doi: 10.1007/s00520-012-1428-5.
    15. Greimel ER, Bjelic-Radisic V, Pfisterer J, Hilpert F, Daghofer F, Pujade-Lauraine E, du Bois A. Toxicity and quality of life outcomes in ovarian cancer patients participating in randomized controlled trials. Support Care Cancer. 2011;19(9):1421–1427. doi: 10.1007/s00520-010-0969-8.
    16. Neben-Wittich MA, Atherton PJ, Schwartz DJ, Sloan JA, Griffin PC, Deming RL, Anders JC, Loprinzi CL, Burger KN, Martenson JA, Miller RC. Comparison of provider-assessed and patient-reported outcome measures of acute skin toxicity during a phase III trial of mometasone cream versus placebo during breast radiotherapy: The North Central Cancer Treatment Group (N06C4). Int J Radiat Oncol Biol Phys. 2011;81(2):397–402. doi: 10.1016/j.ijrobp.2010.05.065.
    17. Dueck AC, Mendoza TR, Mitchell SA, Reeve BB, Castro KM, Rogak LJ, Atkinson TM, Bennett AV, Denicoff AM, O'Mara AM, Li Y, Clauser SB, Bryant DM, Bearden JD 3rd, Gillis TA, Harness JK, Siegel RD, Paul DB, Cleeland CS, Schrag D, Sloan JA, Abernethy AP, Bruner DW, Minasian LM, Basch E, National Cancer Institute PRO-CTCAE Study Group. Validity and reliability of the US National Cancer Institute's patient-reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE). JAMA Oncol. 2015;1(8):1051–1059. doi: 10.1001/jamaoncol.2015.2639.
    18. National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services. Common Terminology Criteria for Adverse Events (CTCAE) Version 4.0. Published May 28, 2009; Revised Version 4.03 June 14, 2010. Available from: [Accessed February 28, 2018].
    19. Basch E, Reeve BB, Mitchell SA, Clauser SB, Minasian LM, Dueck AC, Mendoza TR, Hay J, Atkinson TM, Abernethy AP, Bruner DW, Cleeland CS, Sloan JA, Chilukuri R, Baumgartner P, Denicoff A, St. Germain D, O'Mara AM, Chen A, Kelaghan J, Bennett AV, Sit L, Rogak L, Barz A, Paul DB, Schrag D. Development of the National Cancer Institute's patient-reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE). J Natl Cancer Inst. 2014;106(9):dju244. doi: 10.1093/jnci/dju244.
    20. Hay JL, Atkinson TM, Reeve BB, Mitchell SA, Mendoza TR, Willis G, Minasian LM, Clauser SB, Denicoff A, O'Mara A, Chen A, Bennett AV, Paul DB, Gagne J, Rogak L, Sit L, Viswanath V, Schrag D, Basch E, NCI PRO-CTCAE Study Group. Cognitive interviewing of the US National Cancer Institute's patient-reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE). Qual Life Res. 2014;23(1):257–269. doi: 10.1007/s11136-013-0470-1.
    21. Mendoza TR, Dueck AC, Bennett AV, Mitchell SA, Reeve BB, Atkinson TM, Li Y, Castro KM, Denicoff A, Rogak LJ, Piekarz RL, Cleeland CS, Sloan JA, Schrag D, Basch E. Evaluation of different recall periods for the US National Cancer Institute's PRO-CTCAE. Clin Trials. 2017;14(3):255–263. doi: 10.1177/1740774517698645.
    22. Bennett AV, Dueck AC, Mitchell SA, Mendoza TR, Reeve BB, Atkinson TM, Castro KM, Denicoff A, Rogak LJ, Harness JK, Bearden JD, Bryant D, Siegel RD, Schrag D, Basch E, National Cancer Institute PRO-CTCAE Study Group. Mode equivalence and acceptability of tablet computer-, interactive voice response system-, and paper-based administration of the U.S. National Cancer Institute's patient-reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE). Health Qual Life Outcomes. 2016;14:24.
    23. Atkinson TM, Hay JL, Dueck AC, Mitchell SA, Mendoza TR, Rogak LJ, Minasian LM, Basch E. What do "none," "mild," "moderate," "severe" and "very severe" mean to patients with cancer? Content validity of PRO-CTCAE response scales. J Pain Symptom Manage. 2017.
    24. Mendoza TR, Dueck AC, Mitchell SA, Reeve BB, Li Y, Atkinson TM, Bennett AV, Clauser SB, Basch E. The effect of skip patterns on the validity and reliability of selected items from the patient-reported outcomes version of the common terminology criteria for adverse events (PRO-CTCAE). Paper presented at the Joint Statistical Meetings; 2012; San Diego, CA.
    25. Lord FM, Novick MR. Statistical theories of mental test scores. Reading, MA: Addison Wesley; 1968.
    26. Bjorner JB, Chang CH, Thissen D, Reeve BB. Developing tailored instruments: Item banking and computerized adaptive assessment. Qual Life Res. 2007;16(Suppl 1):95–108. doi: 10.1007/s11136-007-9168-6.
    27. Simpson D, Rue H, Riebler A, Martins TG, Sørbye SH. Penalising model component complexity: A principled, practical approach to constructing priors. Stat Sci. 2017;32(1):1–28. doi: 10.1214/16-STS576.
    28. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2017.
    29. Plummer M. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. 2017. Retrieved from: .
    30. Curtis SM. BUGS code for item response theory. J Stat Softw. 2010;36(1):1–34.
    31. Drotar D, editor. Measuring health-related quality of life in children and adolescents: Implications for research and practice. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.; 1998.
    32. Preen DB, Holman CD, Lawrence DM, Baynham NJ, Semmens JB. Hospital chart review provided more accurate comorbidity information than data from a general practitioner survey or an administrative database. J Clin Epidemiol. 2004;57(12):1295–1304. doi: 10.1016/j.jclinepi.2004.03.016.
    33. Patrick DL, Burke LB, Gwaltney CJ, Leidy NK, Martin ML, Molsen E, Ring L. Content validity-establishing and reporting the evidence in newly developed patient-reported outcomes (PRO) instruments for medical product evaluation: ISPOR PRO good research practices task force report: Part 2-assessing respondent understanding. Value Health. 2011;14(8):978–988. doi: 10.1016/j.jval.2011.06.013.

Source: PubMed
