Comparison of the SF6D, the EQ5D, and the Oswestry Disability Index in patients with chronic low back pain and degenerative disc disease

Lars G Johnsen, Christian Hellum, Oystein P Nygaard, Kjersti Storheim, Jens I Brox, Ivar Rossvoll, Gunnar Leivseth, Margreth Grotle

Abstract

Background: The need for cost-effectiveness analyses in randomized controlled trials that compare treatment options is increasing. The selection of the optimal utility measure is important, and a central question is whether the two most commonly used indexes, the EuroQol 5D (EQ5D) and the Short Form 6D (SF6D), can be used interchangeably. The aim of the present study was to compare change scores of the EQ5D and SF6D utility indexes in terms of some important measurement properties. The psychometric properties of the two utility indexes were compared to those of a disease-specific instrument, the Oswestry Disability Index (ODI), in the setting of a randomized controlled trial for degenerative disc disease.

Methods: In a randomized controlled multicentre trial, 172 patients who had experienced low back pain for an average of 6 years were randomized to either treatment with an intensive back rehabilitation program or surgery to insert disc prostheses. Patients filled out the ODI, EQ5D, and SF-36 at baseline and at two-year follow-up. The utility indexes were compared with respect to measurement error, structural validity, criterion validity, responsiveness, and interpretability according to the COSMIN taxonomy.

Results: At follow-up, 113 patients had change score values for all three instruments. The SF6D was more similar to the disease-specific instrument (ODI) in sensitivity, specificity, and responsiveness. Measurement error was lower for the SF6D (0.056) than for the EQ5D (0.155). The minimal important change score value was 0.031 for SF6D and 0.173 for EQ5D. The minimal detectable change score values at a 95% confidence level were 0.157 for SF6D and 0.429 for EQ5D, and the difference in mean change score values (SD) between them was 0.23 (0.29), which exceeded the clinically significant change score value for both instruments. Analysis of psychometric properties indicated that the indexes are unidimensional when considered separately, but that they do not measure exactly the same underlying construct.
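The abstract does not state how the minimal detectable change (MDC) was derived from the measurement error. A common convention, assumed here and not confirmed by the source, is MDC95 = 1.96 × √2 × SEM, where SEM is the standard error of measurement. The sketch below applies that formula to the reported measurement errors and compares the results with the reported MDC values. The function name `mdc95` and the `reported` dictionary are illustrative only.

```python
import math

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def mdc95(sem: float) -> float:
    """Minimal detectable change at 95% confidence.

    Assumes the conventional formula MDC95 = 1.96 * sqrt(2) * SEM;
    the abstract does not confirm that this is the formula the authors used.
    """
    return Z_95 * math.sqrt(2.0) * sem


# Values copied from the Results paragraph of the abstract.
reported = {
    "SF6D": {"sem": 0.056, "mic": 0.031, "mdc": 0.157},
    "EQ5D": {"sem": 0.155, "mic": 0.173, "mdc": 0.429},
}

for name, values in reported.items():
    computed = mdc95(values["sem"])
    print(
        f"{name}: computed MDC95 = {computed:.3f}, "
        f"reported MDC95 = {values['mdc']:.3f}, "
        f"MIC below reported MDC: {values['mic'] < values['mdc']}"
    )
```

Under this assumption the computed values land within a few thousandths of the reported ones, a gap that could be explained by rounding of the published measurement errors. For both indexes the reported minimal important change is smaller than the reported minimal detectable change.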

Conclusions: This study indicates that the difference in important measurement properties between EQ5D and SF6D is too large to consider them interchangeable. Since the two indexes differed considerably in their similarity with the "gold standard" (the disease-specific instrument), this could indicate that the choice of index should be determined by the diagnosis.

Figures

Figure 1. Bland-Altman plot.
Figure 2. Person-item threshold distribution for EQ5D.
Figure 3. Person-item threshold distribution for SF6D.
Figure 4. ROC curve.

Source: PubMed
