Responsiveness of PROMIS and Patient Health Questionnaire (PHQ) Depression Scales in three clinical trials

Kurt Kroenke, Timothy E Stump, Chen X Chen, Jacob Kean, Teresa M Damush, Matthew J Bair, Erin E Krebs, Patrick O Monahan

Abstract

Background: The PROMIS depression scales are reliable and valid measures that have extensive normative data in general population samples. However, less is known about how responsive they are in detecting change in clinical settings and how their responsiveness compares with that of legacy measures. The purpose of this study was to assess and compare the responsiveness of the PROMIS and Patient Health Questionnaire (PHQ) depression scales in three separate samples.

Methods: We used data from three clinical trials (two in patients with chronic pain and one in stroke survivors) totaling 651 participants. At both baseline and follow-up, participants completed four PROMIS depression fixed-length scales as well as legacy measures: Patient Health Questionnaire 9-item and 2-item scales (PHQ-9 and PHQ-2) and the SF-36 Mental Health scale. We measured global ratings of depression change, both prospectively and retrospectively, as anchors to classify patients as improved, unchanged, or worsened. Responsiveness was assessed with standardized response means, statistical tests comparing change groups, and area-under-curve analysis.
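The abstract does not give the exact computations, so the following is only a minimal sketch of two of the responsiveness statistics named above, using their standard textbook definitions: the standardized response mean (mean change divided by the standard deviation of change) and the area under the ROC curve computed as a pairwise concordance probability. The function names and all scores are hypothetical illustrations, not data or code from the trials. It assumes higher scores mean more depression, so improvement appears as a negative change.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of the change scores / sample SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def auc_improved_vs_other(improved_changes, other_changes):
    """Probability that a randomly chosen improved patient has a larger score
    decrease than a randomly chosen non-improved patient (ties count 0.5)."""
    concordant = 0.0
    for i in improved_changes:
        for o in other_changes:
            if i < o:          # more negative change = greater improvement
                concordant += 1.0
            elif i == o:
                concordant += 0.5
    return concordant / (len(improved_changes) * len(other_changes))


# Hypothetical scores for four patients rated as improved on the anchor.
baseline = [10, 12, 14, 16]
follow_up = [8, 9, 13, 12]
srm = standardized_response_mean(baseline, follow_up)

# Hypothetical change scores for improved vs. unchanged/worsened patients.
auc = auc_improved_vs_other([-4, -3, -2], [-1, 0, -3])

print(f"SRM = {srm:.2f}, AUC = {auc:.2f}")
```

An SRM further from zero and an AUC further above 0.5 both indicate a measure that separates changed from unchanged patients more clearly. Comparing correlated AUCs across measures would additionally need a formal test, which this sketch omits.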

Results: The PROMIS depression and legacy scales had generally comparable responsiveness. Moreover, the four PROMIS depression scales of varying lengths were similarly responsive. In general, measures performed better in detecting depression improvement than depression worsening. For all measures, responsiveness varied based on the study sample and on whether depression improved or worsened.

Conclusions: Both PROMIS and PHQ depression scales are brief public domain measures that are responsive (i.e., sensitive to change) and thus appropriate as outcome measures in research as well as for monitoring treatment in clinical practice.

Trial registration: ClinicalTrials.gov ID: NCT01236521, NCT01583985, NCT01507688.

Keywords: Depression; PHQ-9; PROMIS; Psychometrics; Responsiveness; Sensitivity to change.

Conflict of interest statement

The authors report no competing interests.

Figures

Fig. 1
Comparative standardized response means (SRMs) between depression measures across trials
Fig. 2
Comparative standardized response means (SRMs) between PROMIS depression short forms of varying lengths

Source: PubMed
