Measures of function in low back pain/disorders: Low Back Pain Rating Scale (LBPRS), Oswestry Disability Index (ODI), Progressive Isoinertial Lifting Evaluation (PILE), Quebec Back Pain Disability Scale (QBPDS), and Roland-Morris Disability Questionnaire (RDQ)

INTRODUCTION

Treatment of patients with chronic low back pain and the associated disability primarily aims to improve patients' levels of activity and participation. Self-reported questionnaires have mostly been used, for clinical as well as research purposes, to assess daily functioning (1, 2); the most commonly used of these are discussed below. However, self-reported information does not necessarily reflect a patient's real capacity to perform. A recent review showed that the correlation between self-reported disability and physical activity level was at best moderate for patients with chronic low back pain (3). To improve objectivity, measures of body function, e.g., spinal mobility and lumbar extensor muscle strength, have been used, although their correlation with the level of disability is very weak (4, 5). Furthermore, there are major concerns about their reliability and validity (6–8).

Besides the self-reported disability measures, many have urged the use of more objective and direct measures of low back pain–specific functional capacity (5, 9, 10). Capacity is defined as the highest probable level of functioning that a person may reach in an activity domain at any given moment in a standardized environment. Although there is still no consensus on the definition of functional capacity evaluation (FCE), several FCE measures have been developed in the past decades, of which the Isernhagen Work Systems Functional Capacity Evaluation (IWS-FCE) is among the most frequently used (11, 12). However, recently published psychometric data have shown that some of the tasks included in the IWS-FCE are not reliable (13, 14). Moreover, the entire sequence of tasks in, for example, the IWS-FCE is time consuming and expensive, as is the training of the test observer. Therefore, we decided not to include these measures in this review.

Nevertheless, to remain in line with recommendations such as those of the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials to evaluate several core outcome domains, including physical functioning (9), we wanted to include easy-to-use performance tasks. Several tasks have been described (8, 15–17), but most of them are not low back pain specific, and some, such as the Back Performance Scale, show insufficient factor structure, as in this measure the quality of the performance is also scored (1, 18). Therefore, we decided to include only a performance task that assesses lifting, an activity that might specifically be hampered by low back pain.

For the selection of the self-reported disability questionnaires and lifting performance tasks, we only selected questionnaires/tests that are low back pain specific and for which all psychometric properties, including responsiveness, have been studied in relevant low back pain populations and published in peer-reviewed journals.

Other criteria for selection were availability in at least English and, for performance task measures, being easy to administer, inexpensive, and not time consuming when used in clinical practice.

LOW BACK PAIN RATING SCALE (LBPRS)

Description

Purpose.

Developed by Manniche et al in 1985, the LBPRS is constructed to measure the 3 clinical illness components of low back pain: pain (back and leg), disability, and physical impairment (19). The scale has been widely used in randomized clinical trials to monitor outcome following therapeutic interventions for low back pain (20–28), including older patients (29).

Content.

The scale covers 3 domains: back and leg pain (60 points), disability (30 points), and physical impairment (40 points). The first domain includes six 11-point scales, concerning current pain, worst pain in the last 2 weeks, and average pain in the last 2 weeks for both the leg and lower back. The second domain consists of a disability index with 15 questions covering quality of sleep, social and occupational participation, daily activities, and emotional status. The last domain includes 4 measures of physical impairment: endurance of back muscles, back mobility, overall mobility, and the use of analgesics (19).

Number of items.

The pain domain consists of 6 items, the disability domain consists of 15 items, and the physical impairment domain consists of 4 items, yielding a total of 21 self-reported items and 4 performance-based measures.

Response options/scale.

The pain domain comprises six 11-point scales, where 0 = “no pain” and 10 = “the worst imaginable pain.” The second domain consists of 15 questions and each question is scored from 0–2, where 0 = “not a problem,” 1 = “can be a problem,” and 2 = “is a problem.” The last domain includes 4 measures of physical impairment, each being scored from 0–10 points (19).

Recall period for items.

The pain domain includes questions concerning current status and pain intensity in the past 2 weeks. The other 2 domains concern the patient's current status.

Endorsements.

The scale has been recommended for functional pain evaluation by researchers in the field (30, 31).

Examples of use.

The LBPRS has been widely used in randomized clinical trials, in particular those assessing the efficacy of surgical procedures.

Andersen T, Christensen FB, Egund N, Ernst C, Fruensgaard S, Ostergaard J, et al. The effect of electrical stimulation on lumbar spinal fusion in older patients: a randomized, controlled, multi-center trial. Part 2: fusion rates. Spine (Phila Pa 1976) 2009;34:2248–53 (20).

Andersen T, Christensen FB, Ernst C, Fruensgaard S, Ostergaard J, Andersen JL, et al. The effect of electrical stimulation on lumbar spinal fusion in older patients: a randomized, controlled, multi-center trial. Part 1: functional outcome. Spine (Phila Pa 1976) 2009;34:2241–7 (21).

Andersen T, Christensen FB, Hansen ES, Bunger C. Pain 5 years after instrumented and non-instrumented posterolateral lumbar spinal fusion. Eur Spine J 2003;12:393–9 (22).

Filiz M, Cakmak A, Ozcan E. The effectiveness of exercise programmes after lumbar disc surgery: a randomized controlled study. Clin Rehabil 2005;19:4–11 (23).

Radziszewski KR. Comparative retrospective analysis of pain afflictions in patients with lumbar discopathy receiving conservative or operative therapies. Pol Merkur Lekarski 2006;21:335–40. In Polish (24).

Radziszewski KR. The functional status in patients with discopathy of the lumbar spine receiving only conservative therapy or operative therapy. Wiad Lek 2008;61:23–9. In Polish (25).

Soegaard R, Christensen FB, Christiansen T, Bunger C. Costs and effects in lumbar spinal fusion: a follow-up study in 136 consecutive patients with chronic low back pain. Eur Spine J 2007;16:657–68 (26).

Laursen SO, Fugl IR. Outcome of treatment of chronic low back pain in inpatients: effect of individual physiotherapy including intensive dynamic training in inpatients with chronic low back trouble, evaluated by means of low back pain rating scale. Dan Med Bull 1995;42:290–3 (27).

Andersen T, Christensen FB, Niedermann B, Helmig P, Hoy K, Hansen ES, et al. Impact of instrumentation in lumbar spinal fusion in elderly patients: 71 patients followed for 2-7 years. Acta Orthop 2009;80:445–50 (29).

Christensen FB, Stender Hansen E, Laursen M, Thomsen K, Bunger CE. Long-term functional outcome of pedicle screw instrumentation as a support for posterolateral spinal fusion: randomized clinical study with a 5-year follow-up. Spine (Phila Pa 1976) 2002;27:1269–77 (32).

Christensen FB. Lumbar spinal fusion: outcome in relation to surgical methods, choice of implant and postoperative rehabilitation. Acta Orthop Scand Suppl 2004;75:2–43 (33).

Christensen FB, Hansen ES, Eiskjaer SP, Hoy K, Helmig P, Neumann P, et al. Circumferential lumbar spinal fusion with brantigan cage versus posterolateral fusion with titanium cotrel-dubousset instrumentation: a prospective, randomized clinical study of 146 patients. Spine (Phila Pa 1976) 2002;27:2674–83 (34).

Christensen FB, Laurberg I, Bunger CE. Importance of the back-cafe concept to rehabilitation after lumbar spinal fusion: a randomized clinical study with a 2-year follow-up. Spine (Phila Pa 1976) 2003;28:2561–9 (35).

Videbaek TS, Christensen FB, Soegaard R, Hansen ES, Hoy K, Helmig P, et al. Circumferential fusion improves outcome in comparison with instrumented posterolateral fusion: long-term results of a randomized clinical trial. Spine (Phila Pa 1976) 2006;31:2875–80 (36).

Practical Application

How to obtain.

No cost is involved in obtaining the LBPRS. A copy can be downloaded from online outcome measure databases (https://www.cebp.nl/?NODE=77&SUBNODE=1135).

Method of administration.

The scale may be completed by either the patient or the interviewer. A modified version of the questionnaire, omitting back muscle endurance, spinal mobility, and total mobility items, has been developed for mail or phone interviews (19).

Scoring.

Each of the 3 domains is scored separately and the total score represents a sum of all 3 domains. The score of the first domain ranges from 0–60 points, the score of the second domain ranges from 0–30 points, and the last domain ranges from 0–40 points (19). Together, the 3 domains form a rank scale, where an asymptomatic person scores 0 and a person with extreme disability scores 130 points. However, it is recommended not to use the total sum score, as subscores provide valuable information and are not subject to weighting bias.
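The domain ranges given above follow directly from the item counts and response scales, and the arithmetic can be sketched in a few lines. The following Python example is an illustration only and is not part of the published instrument; the function name and input layout are assumptions made for this sketch.

```python
# Illustrative sketch of LBPRS domain scoring (names are hypothetical).
PAIN_ITEMS = 6          # six pain ratings, each scored 0-10
DISABILITY_ITEMS = 15   # fifteen questions, each scored 0, 1, or 2
IMPAIRMENT_ITEMS = 4    # four physical impairment measures, each scored 0-10


def _check(values, n_items, max_score, name):
    """Validate the number of items and the range of each item score."""
    if len(values) != n_items:
        raise ValueError(f"{name}: expected {n_items} items, got {len(values)}")
    for v in values:
        if not 0 <= v <= max_score:
            raise ValueError(f"{name}: score {v} is outside 0-{max_score}")


def lbprs_scores(pain, disability, impairment):
    """Return the three domain subscores and their sum."""
    _check(pain, PAIN_ITEMS, 10, "pain")
    _check(disability, DISABILITY_ITEMS, 2, "disability")
    _check(impairment, IMPAIRMENT_ITEMS, 10, "impairment")
    scores = {
        "pain": sum(pain),              # 0-60
        "disability": sum(disability),  # 0-30
        "impairment": sum(impairment),  # 0-40
    }
    # Total (0-130) is the sum of the three subscores computed above.
    scores["total"] = sum(scores.values())
    return scores


print(lbprs_scores([5] * 6, [1] * 15, [5] * 4))
```

Returning the subscores alongside the total keeps the domain-level information available, in line with the recommendation to report subscores rather than the sum score alone.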

Score interpretation.

The 3 domains form a rank scale where an asymptomatic person scores 0 and a person with extreme disability scores 130 points. The sum score is influenced by a weighting bias since 3 answer options exist for physical impairment and disability index items and 11 options are used to indicate pain (31).

Respondent burden.

In general, the scale is easy to understand and complete. Some items, for instance, items 14 and 15 (item 14: “If it was a present interest do you think that there are certain jobs which you would not be able to manage because of your back trouble?” and item 15: “Do you think that the low back pain will influence your future?”), may be harder to interpret.

Administrative burden.

Approximately 15 minutes are required to complete the LBPRS. No training is necessary.

Translations/adaptations.

The scale is available in Danish and English (19), Turkish (23), German (37), and Polish (24, 25). The scale has been validated in Danish (19) and culturally adapted into German (37).

Psychometric Information

Method of development.

The items of the LBPRS were generated to account for the 3 clinical components of low back pain: pain, disability, and physical impairment. Included items concern the etiology of back pain as well as its impact on a patient's psychological, social, and work status. The scale was developed by a group of researchers and was primarily devised for use in clinical trials; however, it may also be used in clinical settings.

Acceptability.

No data on readability or floor or ceiling effects of the scale are available in the literature.

Reliability.

The scale presents high interrater reliability (97.7%) (19). No confidence intervals are provided. No information on the minimum detectable change (MDC) or SEM of the scale is available.

Validity.

Content and face validity.

The 4 components of the scale (back pain, leg pain, disability, and physical impairment) are marginally correlated and yet conditionally independent, suggesting that the LBPRS is a unidimensional scale (the latent variable accounts for 65.9% of the total variation of the scale components).

Construct validity.

Construct validity was assessed using the conditional Gaussian distribution, where conditional independence among variables was tested using likelihood ratio tests. Results rejected conditional independence of the LBPRS and doctor's assessment, given the patient's assessment (P < 0.00005), and conditional independence of the LBPRS and patient's assessment, given the doctor's assessment (P < 0.00005). This suggests that the LBPRS correlates strongly with both the doctor's global assessment and the patient's global assessment (19). The German version of the scale presents a high correlation (0.91, P < 0.001) with the Roland-Morris Disability Questionnaire (37).

Ability to detect change.

Standardized response means for the disability and pain components of the scale are 0.8 (95% confidence interval [95% CI] 0.4–1.3) and 1.3 (95% CI 1.0–1.6), respectively, for patients with low back pain only, and 0.8 (95% CI 0.3–1.2) and 1.3 (95% CI 0.7–1.9), respectively, for patients with low back and leg pain.

Minimum clinically important change (MCIC) was determined by an optimal cut point analysis using both the raw and percent change scores. For the raw scores, the MCIC for the disability and pain components of the scale in all patients was 17 and 10 points, respectively (38).
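For readers less familiar with the standardized response mean (SRM) reported above, it is conventionally the mean change score divided by the SD of the change scores. The Python sketch below illustrates that calculation only; the scores are invented for the example and are not data from any of the cited studies, and the sign convention (baseline minus followup, so that improvement is positive) is an assumption.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, followup):
    """SRM = mean of the change scores / SD of the change scores.

    Change is taken as baseline minus followup, so a drop in a
    pain or disability score counts as a positive change.
    """
    if len(baseline) != len(followup):
        raise ValueError("baseline and followup must have the same length")
    if len(baseline) < 2:
        raise ValueError("at least 2 paired scores are needed")
    changes = [b - f for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)


# Hypothetical pain subscores for 4 patients before and after treatment.
before = [20, 18, 25, 22]
after = [10, 12, 13, 16]
print(round(standardized_response_mean(before, after), 2))
```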

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths.

The 21 self-reported items of the scale are simple and demonstrate a well-balanced distribution of items across the International Classification of Functioning, Disability and Health components (31). It contains items concerning pain, activity limitation, including work activities and activities of daily life, and physical impairment. The pain domain of the scale is responsive and preferable to the numerical rating scale, as it provides more information on pain dimension at 2 different timeframes (38). The scale has been widely used in clinical research, in particular clinical trials involving postsurgical patients.

Caveats and cautions.

The disability domain presents lower responsiveness when compared to the Roland-Morris Disability Questionnaire and the Oswestry Disability Index. The LBPRS lacks information on important psychometric properties, such as the MDC and SEM. No information is available on the responsiveness of the performance-based (i.e., physical impairment) component of the scale.

Item-weighting bias has been suggested due to the discrepancy in score ranges across the 3 domains of the scale, and care should be taken when interpreting the total score.

Clinical usability.

Information on its MDC and SEM is lacking, but the scale is quick and simple to use and understand and assesses important aspects of the disease (i.e., pain, disability, and physical impairment).

Research usability.

Its use in research has been endorsed by experts in the field (30, 31). The scale is simple and has been widely used in clinical research in a variety of ways, including face-to-face interviews, mailed followups, and phone interviews.

OSWESTRY DISABILITY INDEX (ODI)

Description

Purpose.

The ODI has been developed to assess pain-related disability in people with acute, subacute, or chronic low back pain. Since it was first published in 1980 (version 1.0) (39), several different versions have been developed, including ODI version 2.0, ODI AAOS (modified by the American Academy of Orthopedic Surgeons), and the ODI Chiropractic Version (40). Version 2.0 is recommended for general use (40, 41). The rest of this article refers to ODI version 1.0 or ODI version 2.0.

Content.

The ODI covers 1 item on pain and 9 items on activities of daily living (personal care, lifting, walking, sitting, standing, sleeping, sex life, social life, and traveling).

Number of items.

10 items.

Response options/scale.

Each item is measured on a 6-point ordinal scale, ranging from the best scenario to the worst scenario. For example, for walking (item 4) the response options range from “pain does not prevent me walking any distance” to “I am in bed most of the time and have to crawl to the toilet.”

Recall period for items.

Version 1.0 does not specify a timeframe. Version 2.0 relates to “today.”

Endorsements.

The ODI has been recommended as a back pain–specific measure of disability by researchers in this field (42).

Examples of use.

Brox JI, Sorensen R, Friis A, Nygaard O, Indahl A, Keller A, et al. Randomized clinical trial of lumbar instrumented fusion and cognitive intervention and exercises in patients with chronic low back pain and disc degeneration. Spine (Phila Pa 1976) 2003;28:1913–21 (43).

Carette S, Leclaire R, Marcoux S, Morin F, Blaise GA, St-Pierre A, et al. Epidural corticosteroid injections for sciatica due to herniated nucleus pulposus. N Engl J Med 1997;336:1634–40 (44).

Fritzell P, Hagg O, Wessberg P, Nordwall A. 2001 Volvo Award Winner in Clinical Studies. Lumbar fusion versus nonsurgical treatment for chronic low back pain: a multicenter randomized controlled trial from the Swedish Lumbar Spine Study Group. Spine (Phila Pa 1976) 2001;26:2521–32 (45).

Malmivaara A, Hakkinen U, Aro T, Heinrichs ML, Koskenniemi L, Kuosma E, et al. The treatment of acute low back pain: bed rest, exercises or ordinary activity. N Engl J Med 1995;332:351–5 (46).

Weinstein JN, Tosteson TD, Lurie JD, Tosteson AN, Hanscom B, Skinner JS, et al. Surgical vs nonoperative treatment for lumbar disk herniation: the Spine Patient Outcomes Research Trial (SPORT). A randomized trial. JAMA 2006;296:2441–50 (47).

Practical Application

How to obtain.

No permission or cost is required to use the ODI. Copies of the ODI can be found in published sources (40, 41).

Method of administration.

The ODI is normally completed by patients using paper and pen. Administration by computer (through MODEMS, PO Box 2354, Des Plaines, IL 60017-2354) or by telephone is also possible (40).

Scoring.

For each item, the scoring increases incrementally by 1 with each response option, from 0 (first response option) to 5 (last response option). Missing values are omitted. A percentage is worked out to get the total score.
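The percentage calculation can be made explicit with a short sketch. The Python example below assumes the usual convention that the sum of the answered items is divided by the maximum possible score for those items (5 points each) and multiplied by 100; the function name and the use of None for a missing item are choices made for this illustration.

```python
def odi_score(responses):
    """Return the ODI total score as a percentage (0-100).

    responses: 10 entries, each an integer 0-5, or None when the
    item was left unanswered. Unanswered items are omitted from
    both the sum and the maximum possible score.
    """
    if len(responses) != 10:
        raise ValueError("the ODI has 10 items")
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no items were answered")
    for r in answered:
        if r not in range(6):
            raise ValueError(f"item score {r} is outside 0-5")
    return 100 * sum(answered) / (5 * len(answered))


# All 10 items answered with the third response option (score 2).
print(odi_score([2] * 10))
# Item 8 left blank: the denominator becomes 45 instead of 50.
print(odi_score([3, 3, 3, 3, 3, 3, 3, None, 3, 3]))
```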

Score interpretation.

The total ODI score ranges from 0 (no disability) to 100 (maximum disability). The original developers of the ODI intended for scores from 0–20 to indicate “minimal disability,” 20–40 to indicate “moderate disability,” 40–60 to indicate “severe disability,” 60–80 to indicate “housebound,” and 80–100 to indicate “bedbound” (39).
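The categories above can be expressed as a simple lookup. Because the published bands share their boundary values (e.g., 20 appears in both the first and second band), the Python sketch below assumes that a score falling exactly on a boundary is assigned to the lower category; this tie-breaking rule is an assumption of the example and is not stated by the developers.

```python
# Upper bound of each band and its label, as listed in the text.
ODI_BANDS = [
    (20, "minimal disability"),
    (40, "moderate disability"),
    (60, "severe disability"),
    (80, "housebound"),
    (100, "bedbound"),
]


def odi_category(score):
    """Map an ODI percentage score to its descriptive category.

    Assumption: boundary scores (20, 40, 60, 80) fall in the lower band.
    """
    if not 0 <= score <= 100:
        raise ValueError("ODI score must lie between 0 and 100")
    for upper, label in ODI_BANDS:
        if score <= upper:
            return label
    raise AssertionError("unreachable for scores within 0-100")


print(odi_category(34))
```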

Respondent burden.

The ODI is simple to read and can be completed by the respondent in <5 minutes.

Administrative burden.

Scoring takes <1 minute. No training is necessary.

Translations/adaptations.

The ODI was originally developed in English, but it has been culturally adapted and is available in a range of languages (40, 48), such as German, Mandarin, and Spanish.

The ODI Chiropractic Version was developed for patients with less disability, although this version is not recommended by some authors (40).

Psychometric Information

Method of development.

The ODI was developed by clinicians at the Robert Jones and Agnes Hunt Orthopaedic Hospital, Oswestry, Shropshire, UK. It is unclear how the items were generated (39).

Acceptability.

The ODI is simple to read. The floor or ceiling effects are unclear. Item 8, sex life, has the option of “if applicable” and is at times omitted. An alternative version replaces item 8 with work/housework.

Reliability.

The ODI has high internal consistency (Cronbach's α = 0.71–0.87) (40) and test–retest reliability (intraclass correlation coefficient 0.84, 95% confidence interval 0.73–0.91) (49). The standard error of measurement has been reported to be between 4 and 6 (49, 50). In a group of patients with back pain presenting to physiotherapy, the minimal detectable change was 15–19 (49).

Validity.

Content and face validity.

The ODI has adequate content validity, as it covers activities of daily living that are commonly experienced by patients with back pain. However, it lacks generic activities such as work, leisure, recreation, or sporting activities.

Internal construct validity.

The ODI has high internal consistency, with Cronbach's alpha between 0.71 and 0.87 (40, 41).

External construct/convergent validity.

It correlates with other measures of disability, such as the Roland-Morris Disability Questionnaire (RDQ), and shows moderate correlation with pain scales and the Short Form 36 (40, 41).

Ability to detect change.

There is evidence that the ODI is responsive in detecting change (area under the receiver operating characteristic curve >0.76) (49, 51, 52). Based on a literature review and discussion, an international panel has suggested 10 points or a 30% score improvement as the cutoff point for minimal important change (53).

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths.

The ODI measures pain-related disability, which is an important element affected in people with back pain and a core outcome in this population (42). It is simple to use and score, and has minimal respondent and administrator burden. The ODI has become one of the most commonly used measures of disability in back pain, along with the RDQ. Compared with the RDQ, the ODI is more sensitive in patients with more persistent severe disability, whereas the RDQ is more sensitive to change in patients with mild to moderate disability (2, 40).

Caveats and cautions.

The ODI has been administered by telephone; however, the multiple response options mean that face-to-face or computer administration would be the preferred method of administration.

Clinical usability.

The ODI has established psychometric properties and is easy to use, and therefore is suitable to be used clinically. It can be used both to assess and monitor outcome.

Research usability.

The ODI has established psychometric properties and is easy to use, and therefore is suitable to be used in research as a measure of outcome. The ODI is also frequently used as a comparator when evaluating other measures.

PROGRESSIVE ISOINERTIAL LIFTING EVALUATION (PILE), LUMBAR TEST

Description

Purpose.

To quantify frequent lifting capacity based on 3 primary limiting factors of patient capability, i.e., psychophysical, cardiovascular, and anthropometric, while employing isoinertial lifting characteristics. The PILE provides reasonable limits for a subject's frequent lifting in industry, as well as the limiting factor in lifting (psychophysical or cardiovascular).

By comparing the measured capacity to normative values in industrial workers, the test is able to predict a subject's capacity to tolerate strenuous lifting throughout a day, but it is not sufficient to disqualify applicants or to predict low back pain (LBP) incidents.

It is also used as an outcome measure to evaluate the effect of treatment in patients with chronic LBP (CLBP). The original test was published in 1988, with an erratum notice in 1990 regarding the scoring (54, 55).

One modified version has been described in which a starting weight of 4 kg and an incremental weight of 2 kg irrespective of sex are used (56). However, as no other study used this modification, this test will not be discussed.

Content.

The participant is asked to lift a box with handles (36 × 26 × 18 cm, 1.35 kg) with an additional weight 4 times in 20 seconds from the floor to a 75-cm high table and back, starting with a total weight of 3.6 kg for women and 5.85 kg for men. After every completed 20-second cycle, the weight is increased by 2.25 kg for women and 4.5 kg for men. The test is stopped when the participant is unable to complete the 4 lifts within 20 seconds or decides to quit due to fatigue or excessive discomfort (psychophysical end point), when the heart rate (HR) exceeds 85% of the maximal HR (220 − age; cardiovascular end point), or when the maximum weight that can be safely lifted has been reached (55–60% of body weight) or the assessor does not think it is safe to continue the test (safety end point).
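The weight progression and the cardiovascular end point described above reduce to simple arithmetic, sketched below in Python. Here a "cycle" is taken to mean one 20-second block of 4 lifts, numbered from 1; the function names are hypothetical, and the sketch is not a substitute for the published protocol or for the assessor's safety judgment.

```python
BOX_KG = 1.35                                   # empty box with handles
INCREMENT_KG = {"female": 2.25, "male": 4.5}    # added per 20-second cycle


def pile_weight_kg(sex, cycle):
    """Total weight (box plus added weights) lifted in a given cycle.

    cycle=1 is the starting load: 3.6 kg for women, 5.85 kg for men.
    """
    if cycle < 1:
        raise ValueError("cycle numbering starts at 1")
    return BOX_KG + INCREMENT_KG[sex] * cycle


def target_heart_rate(age):
    """Cardiovascular end point: 85% of the maximal HR (220 - age)."""
    return 0.85 * (220 - age)


def exceeds_cardiovascular_end_point(heart_rate, age):
    """True when the measured HR is above the age-based target."""
    return heart_rate > target_heart_rate(age)


for cycle in range(1, 5):
    print(cycle, round(pile_weight_kg("female", cycle), 2),
          round(pile_weight_kg("male", cycle), 2))
print(target_heart_rate(40))
```

The safety end point (55–60% of body weight) is left out of the sketch because the text gives a range rather than a single threshold.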

Endorsements.

The measure has been recommended for more objective functional evaluation in addition to self-report measures (54, 55).

Examples of use.

Rainville J, Sobel J, Hartigan C, Monlux G, Bean J. Decreasing disability in chronic back pain through aggressive spine rehabilitation. J Rehabil Res Dev 1997;34:383–93 (57).

Smeets RJ, Vlaeyen JW, Hidding A, Kester AD, Van Der Heijden GJ, Knottnerus JA. Chronic low back pain: physical training, graded activity with problem solving training, or both? The one-year post-treatment results of a randomized controlled trial. Pain 2008;134:263–76 (58).

Smeets RJ, Vlaeyen JW, Hidding A, Kester AD, van der Heijden GJ, van Geel AC, et al. Active rehabilitation for chronic low back pain: cognitive-behavioral, physical, or both? First direct post-treatment results from a randomized controlled trial. BMC Musculoskelet Disord 2006;7:5 (59).

Weiner DK, Rudy TE, Glick RM, Boston JR, Lieber SJ, Morrow LA, et al. Efficacy of percutaneous electrical nerve stimulation for the treatment of chronic low back pain in older adults. J Am Geriatr Soc 2003;51:599–608 (60).

Practical Application

How to obtain.

The procedure is described in the original publication and erratum notice (54). An extensive protocol in Dutch is available without costs from R. J. E. M. Smeets, MD, PhD (e-mail: r.smeets@adelante-zorggroep.nl).

Method of administration.

The assessor increases weight in a standardized manner every 20 seconds and records the maximum weight lifted, the total number of completed lifting cycles, and the HR after each lifting cycle. The assessor also judges whether continuation of the test is safe.

Equipment needed includes a stopwatch, a table of 75 cm height, a box with handles, a set of 2.25 kg and 4.5 kg weights, an HR monitoring system, and paper and pencil.

Scores.

Results are expressed as 1) maximum weight lifted, 2) endurance time to discontinuation of test, 3) final and target HRs, 4) total work (sum of forces multiplied by distance), and 5) power consumption (work/time). In order to correct for over- and underweight and facilitate intersubject comparisons, the “adjusted weight” (AW; derived from a Young Men's Christian Association height/weight chart) normalizing factor can be used (ratio of maximum weight lifted to AW).
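As a rough illustration of the total work and work/time outcomes listed above, the following Python sketch sums force times distance over the completed 20-second cycles. Several choices here are assumptions of the example and not part of the published scoring: only the upward lift to the 75-cm table is counted, force is taken as mass times a gravitational acceleration of 9.81 m/s², and elapsed time is the number of completed cycles times 20 seconds.

```python
G = 9.81  # m/s^2, assumed gravitational acceleration


def pile_work_and_rate(cycle_weights_kg, lifts_per_cycle=4,
                       height_m=0.75, cycle_seconds=20.0):
    """Return (total work in joules, work per second in watts).

    cycle_weights_kg: total weight lifted in each completed cycle.
    Only the upward movements are counted (an assumption).
    """
    work_j = sum(w * G * height_m * lifts_per_cycle
                 for w in cycle_weights_kg)
    elapsed_s = cycle_seconds * len(cycle_weights_kg)
    rate_w = work_j / elapsed_s if elapsed_s else 0.0
    return work_j, rate_w


# A woman completing the first two cycles (3.6 kg, then 5.85 kg).
work, rate = pile_work_and_rate([3.6, 5.85])
print(round(work, 1), round(rate, 2))
```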

For outcome measurement, the weight lifted (expressed as a percentage of AW or ideal weight) (57) or the maximum weight lifted in the last fully completed lifting cycle are most commonly used (61–65). In order to analyze results of men and women together, adjustment for the difference in starting weight and incremental weights between men and women is necessary. Therefore, Smeets et al suggested using the number of completed lifting cycles as the main outcome (66–68).

Score interpretation.

A normative database based on 61 male and 31 female mixed blue- and white-collar industrial workers (US) is available (54).

Respondent burden.

5–15 minutes; back pain can temporarily increase due to lifting. It is a safe procedure, and none of the studies reported severe side effects.

Administrative burden.

5–15 minutes; instruction of the patient using a written protocol, attaching an HR monitoring system, preparation of box and starting weight, increasing weight every 20 seconds, and recording on paper the maximum weight lifted in the last completed cycle or the total number of completed lifting cycles, as well as the HR after each completed lifting cycle.

Psychometric Information

Method of development.

The test was developed to measure dynamic lifting capacity without using anatomic stabilization or control of speed/acceleration variables and to mimic daily life lifting.

Acceptability.

Seven percent (69) to 11% (59) of the patients with LBP were not able to complete 1 lifting cycle before treatment, which might indicate a floor effect.

Reliability.

CLBP subjects.

Interrater reliability, assessed in 21 patients, was acceptable, with a mean difference in maximum weight lifted of −0.11 kg and limits of agreement (LOA) of −2.33 to 2.11 kg. Using data from 24 patients, the same study assessed intrarater reliability and found a repeatability (2 SD of mean change) of 4.0 kg (11% of range) in men and 3.6 kg (18.5% of range) in women (62).

Testing with a 2-day interval in 31 patients showed an intraclass correlation coefficient (ICC) of 0.69 for women and 0.91 for men and a smallest detectable change in maximum weight lifted of 6.2 kg for women (approximately 3 cycles) and 7.1 kg for men, with a mean test score of 11.8 kg (approximately 4.35 cycles) and 20.8 kg (approximately 4.2 cycles), respectively (65). It should be noted that the patients were instructed to discontinue the test when experiencing an increase of pain or discomfort.

A study using a 5–9-day interval in 50 patients with CLBP found an ICC of 0.92 (95% confidence interval [95% CI] 0.87–0.96) using the number of completed lifting cycles as the outcome (68). The LOA was 2 cycles, which is 48% of the mean score of 4.27 cycles.
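Limits of agreement such as those reported above are conventionally obtained with the Bland-Altman approach: the mean of the paired differences plus or minus 1.96 times their SD. The Python sketch below shows that calculation on invented paired measurements; the cited studies may have used a different multiplier or method. Under the 1.96 × SD assumption, the interrater limits of −2.33 to 2.11 kg around a mean difference of −0.11 kg would correspond to an SD of the differences of about 1.13 kg.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second, z=1.96):
    """Return (mean difference, lower limit, upper limit).

    first, second: paired measurements, e.g., maximum weight lifted
    as scored by two raters or on two test days.
    """
    if len(first) != len(second):
        raise ValueError("the two series must have the same length")
    if len(first) < 2:
        raise ValueError("at least 2 pairs are needed")
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical maximum weight lifted (kg) scored by two raters.
rater_a = [10, 12, 14, 16]
rater_b = [11, 11, 15, 15]
print(limits_of_agreement(rater_a, rater_b))
```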

Healthy subjects.

Test–retest for maximum weight lifted in 10 healthy industrial workers showed a correlation coefficient of 0.87 (54). Another study of 22 female nurses reported an ICC varying from 0.69–0.71 and LOA expressed as the logarithm of time elapsed at termination of 0.75–1.28 and 0.78–1.33 for a 3- and 14-day interval, respectively (70).

The intrarater and interrater reliability in 11 and 12 healthy subjects, respectively, was moderate to good, with an intrarater repeatability of 9.37 kg (25% of range) in men and 1.66 kg (8.6% of range) in women, and an interrater repeatability of 5.61 kg (15% of range) in men and 2.37 kg (12.2% of range) in women (62).

Validity.

Content and face validity.

The PILE measures at the World Health Organization level of activity and, by using 3 different end points, provides information about a potential limiting factor. Improvement in scores after treatment was similar for different CLBP groups (postdiscectomy versus nonsurgical) (71).

PILE testing showed a sensitivity of 0.85 and a specificity of 0.65 to discriminate between 27 patients with back or neck pain and 26 healthy persons (61).
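Sensitivity and specificity are proportions of correctly classified patients and healthy persons, respectively. The Python sketch below shows the calculation. The counts in the example (23 of 27 patients and 17 of 26 healthy persons correctly classified) are hypothetical values chosen only because they round to the reported figures; the actual counts are not given here.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from the 4 classification counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts consistent with 27 patients and 26 healthy persons.
sens, spec = sensitivity_specificity(true_pos=23, false_neg=4,
                                     true_neg=17, false_pos=9)
print(round(sens, 2), round(spec, 2))
```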

Construct validity in CLBP subjects.

Pearson's correlation with the Work-Well Systems Functional Capacity Evaluation was good (0.75) (72).

In 1 study, 90 subjects with CLBP were randomly assigned to perform at 60% or 100% of their effort. The ability of the tester to determine, on completion of the test, the level of effort at which the subject had been lifting showed reasonable specificity (84.1%) but unacceptable sensitivity (65.2%) (73).

Correlations with isokinetic lumbar lifting strength before and after CLBP treatment were low and negative for women (−0.08 to −0.32), and positive, although higher prior to treatment, for men (0.38 to 0.63) (55).

Ljungquist et al used linear regression to assess the influence of pain behavior during testing, pain intensity, duration of pain, more than 1 pain site, sick leave, physical activity during leisure time, and exertion during the test (63). Age, being a woman, and pain during the previous 4 weeks were significantly negatively associated with the PILE results, whereas neck pain and pain in more than 1 site were significantly positively associated.

The influence of psychosocial factors on the performance was confirmed in several studies, although each study included and controlled for many different factors. Geisser et al, while controlling for demographic, physiologic (body mass index [BMI], pain, metabolic equivalents, max HR, perceived effort), and other psychological variables, showed that activity avoidance is significantly associated with the percentage of maximum predicted weight lifted (74). In another study of this group, depression significantly contributed to PILE performance while controlling for age, sex, site of pain, and pain intensity. Furthermore, the physiologic effect during testing (measured by HR) mediated this relationship between depression and performance on the PILE (75). Smeets et al used a linear model, including age, sex, pain, radiating leg pain, duration of symptoms, maximum oxygen consumption (VO2max), fear of movement/injury, catastrophizing, and internal control, and showed that besides sex, depression and fear of movement significantly, although not very highly, influence the completed lifting cycles (67).

Construct validity in healthy subjects.

In a study of 74 healthy women, a linear regression model, including age, height, BMI, strength and endurance of muscle, cardiovascular endurance, trunk mobility, and coordination ability, explained only 40% of the variance with significant associations for flexion mobility, balance, VO2max, and body height (76). This study confirms that not only physiologic factors are of influence.

Von Garnier et al showed that treatment improving lifting capacity seems to be mediated by reduction of fear-avoidance beliefs about work in nurses with an LBP episode in the last 2 years but not experiencing acute LBP leading to sick leave (77).

Ability to detect change.

Mayer et al showed a doubling of lifting capacity after a work hardening program for CLBP, but provided no effect sizes (55).

Ljungquist et al used a combination of statistical methods to assess whether the PILE is sensitive enough to pick up clinically important changes in 3 other outcome measures (general health, disturbing pain, and self-efficacy) (64). They concluded that the PILE lumbar test is not responsive to clinically important change. Unfortunately, no raw data, such as effect sizes, were provided.

In a study of 223 CLBP patients (mean score 4.2 cycles) with general perceived effect as the external criterion and a threshold of ≥0.70 for the area under the receiver operating characteristic curve (AUC) as the criterion for responsiveness, the PILE did not appear to be responsive (AUC 0.59, 95% CI 0.49–0.69) (66). The same study showed that the minimum clinically important change (MCIC) varied from 1.5 (optimal cutoff point based on AUC, sensitivity 0.71, specificity 0.44) to 3.4 (minimum detectable change).
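The AUC criterion described above can be illustrated with a minimal sketch. It assumes hypothetical change scores (in completed lifting cycles) for patients who did and did not rate themselves as improved; the data and the function names `auc` and `is_responsive` are illustrative and are not taken from the cited study. The AUC is computed in its rank-based form, i.e., the probability that a randomly chosen improved patient shows a larger change than a randomly chosen unimproved patient, with ties counted as one-half.

```python
def auc(improved, unimproved):
    """Rank-based AUC: P(change in improved > change in unimproved); ties count 0.5."""
    wins = 0.0
    for x in improved:
        for y in unimproved:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(improved) * len(unimproved))


def is_responsive(auc_value, threshold=0.70):
    """Apply the >=0.70 responsiveness criterion mentioned in the text."""
    return auc_value >= threshold


# Hypothetical change scores (follow-up minus baseline, in lifting cycles).
improved = [3, 1, 2, 0, 4]
unimproved = [1, 0, 2, -1, 1]

value = auc(improved, unimproved)
print(f"AUC = {value:.2f}, responsive: {is_responsive(value)}")
```

With the AUC of 0.59 reported for the PILE, `is_responsive(0.59)` returns `False` under this criterion.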

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths.

Safe, inexpensive, and easy to administer even by inexperienced persons (e.g., nurse), psychophysical lifting end point, and unconstrained lifting (no anatomical stabilization or control of speed/acceleration variables) truly reflecting self-selected “real-world” lifting techniques.

Caveats and cautions.

Cannot be applied in patients taking rate-limiting cardiac medication. Inability to discriminate the “weak link” anywhere along the biomechanical lifting chain.

Approximately 7–11% of patients with CLBP will not be able to complete a lifting cycle before treatment, leaving much room for improvement.

Clinical usability.

The results on reliability, especially the LOA, the lack of responsiveness, and a rather high MCIC, are a major concern. Even though most of these studies used the alternative outcome (number of completed lifting cycles), we recommend not using the PILE as an outcome measure in the treatment of patients with CLBP.

Research usability.

There is sufficient evidence on the construct and content validity as well as moderate predictive validity to use the PILE for research, especially for increasing our insight into the complicated interaction between physical and psychosocial factors on frequent lifting, which is often impaired in patients with disabling CLBP. The test is easy to learn and administer, inexpensive, and not highly time consuming, and it requires only a limited amount of equipment. Despite a potential temporary increase of pain, the test appears to be safe for patients with CLBP.

QUEBEC BACK PAIN DISABILITY SCALE (QBPDS)

Description

Purpose.

Measures the level of functional disability (78). The questionnaire was originally developed for researchers and clinicians to monitor and compare the progress of ambulatory patients with various disability levels (78). It has since been used in various populations (acute low back pain [LBP] [79], chronic disabling pain [75], sacroiliac joint dysfunction [80], lumbar spinal stenosis [81], disc surgery [82], and posterior surgical decompression [83]) and settings.

The studies describing the development and the measurement properties of the QBPDS were published by Kopec et al in 1996 and 1995, respectively (78, 84). According to the results of these initial development studies, the authors suggested a few changes regarding the scale's format and the wording of some of the items to reach the final version of the questionnaire (84).

Content.

Items represent elementary daily activities that patients with back pain might perceive difficult to perform. Items can be classified into 6 domains of activity affected by back pain: bed/rest (items 1–3), sitting/standing (items 4–6), ambulation (items 7–9), movement (items 10–12), bending/stooping (items 13–16), and handling of large/heavy objects (items 17–20) (78).

Number of items.

20 items.

Response options/scale.

For each item, a 6-point Likert scale (0–5) to indicate the level of difficulty is used, where 0 = “not difficult at all,” 1 = “minimally difficult,” 2 = “somewhat difficult,” 3 = “fairly difficult,” 4 = “very difficult,” and 5 = “unable to do.” Kopec et al suggested using this scale's format rather than the numerical 11-point scale (0–10) used in the development studies (84).

Recall period for items.

Patients are asked to answer the QBPDS according to the difficulty they have performing the activities on the current day (“today”).

Endorsements.

The QBPDS is among the few back-specific questionnaires recommended in the literature (9, 10).

Examples of use.

Alschuler KN, Theisen-Goodvich ME, Haig AJ, Geisser ME. A comparison of the relationship between depression, perceived disability, and physical performance in persons with chronic pain. Eur J Pain 2008;12:757–64 (75).

Cusi M, Saunders J, Hungerford B, Wisbey-Roth T, Lucas P, Wilson S. The use of prolotherapy in the sacroiliac joint. Br J Sports Med 2010;44:100–4 (80).

Almeida DB, Prandini MN, Awamura Y, Vitola ML, Simiao MP, Milano JB, et al. Outcome following lumbar disc surgery: the role of fibrosis. Acta Neurochir (Wien) 2008;150:1167–76 (82).

Verbunt JA, Sieben JM, Seelen HA, Vlaeyen JW, Bousema EJ, van der Heijden GJ, et al. Decline in physical activity, disability and pain-related fear in sub-acute low back pain. Eur J Pain 2005;9:417–25 (85).

Sanchez K, Papelard A, Nguyen C, Jousse M, Rannou F, Revel M, et al. Patient-preference disability assessment for disabling chronic low back pain: a cross-sectional survey. Spine (Phila Pa 1976) 2009;34:1052–9 (86).

Reneman MF, Jorritsma W, Schellekens JM, Goeken LN. Concurrent validity of questionnaire and performance-based disability measurements in patients with chronic nonspecific low back pain. J Occup Rehabil 2002;12:119–29 (87).

Van den Hout JH, Vlaeyen JW, Heuts PH, Zijlema JH, Wijnen JA. Functional disability in nonspecific low back pain: the role of pain-related fear and problem-solving skills. Int J Behav Med 2001;8:134–48 (88).

Wilhelm F, Fayolle-Minon I, Phaner V, Le-Quang B, Rimaud D, Bethoux F, et al. Sensitivity to change of the Quebec Back Pain Disability Scale and the Dallas Pain Questionnaire. Ann Phys Rehabil Med 2010;53:15–23 (89).

Practical Application

How to obtain.

A web site (http://www.tac.vic.gov.au/upload/Quebec-Back-Pain.pdf) provides free access to the questionnaire in English. A copy of the questionnaire is also available in the publication by Fritz and Irrgang (79).

Method of administration.

The QBPDS is normally completed by patients using paper and pen. It can also be administered by mail (90) or telephone (78).

Scoring.

Items are not weighted and the total score is calculated by adding up the scores of all items. There are no specific instructions in case of item omission. Sometimes scores are given for each domain (89).

Score interpretation.

Scores range from 0 (no disability) to 100 (maximal disability).
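The scoring rule can be sketched as follows, using the 20 items rated 0–5 and the 6 activity domains with the item ranges listed under Content. The function name `score_qbpds` and the decision to reject incomplete questionnaires are assumptions made for this sketch, because the source gives no instructions for omitted items.

```python
# Item numbers per domain, as listed in the Content section (items 1-20).
QBPDS_DOMAINS = {
    "bed/rest": range(1, 4),
    "sitting/standing": range(4, 7),
    "ambulation": range(7, 10),
    "movement": range(10, 13),
    "bending/stooping": range(13, 17),
    "handling of large/heavy objects": range(17, 21),
}


def score_qbpds(responses):
    """Return (total, domain_scores) for 20 responses, each an integer 0-5.

    Assumption: incomplete or out-of-range questionnaires are rejected,
    since no rule for omitted items is specified.
    """
    if len(responses) != 20:
        raise ValueError("expected 20 item responses")
    if any(r not in range(6) for r in responses):
        raise ValueError("each response must be an integer from 0 to 5")
    total = sum(responses)  # unweighted sum, range 0-100
    domain_scores = {
        name: sum(responses[i - 1] for i in items)
        for name, items in QBPDS_DOMAINS.items()
    }
    return total, domain_scores


total, domains = score_qbpds([2] * 20)  # hypothetical patient
print(total, domains["ambulation"])
```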

Respondent burden.

Self-administration takes ∼5 minutes (84, 91). No specific difficulty has been reported regarding item reading or understanding.

Administrative burden.

Time to administer and to score the questionnaire is short; the need for training is not reported.

Translations/adaptations.

The questionnaire has been translated into French (French-Quebec) (78). The use of the French-Quebec version in patients living in France did not cause major problems (91). The QBPDS has been culturally adapted to Dutch (90), Iranian (92), Brazilian Portuguese (93), Turkish (94, 95), and Moroccan Arabic (96).

Psychometric Information

Method of development.

Several steps (involving clinicians and patients) were conducted to develop the QBPDS. Forty-eight items designed to assess limitations in elementary activities by using a numerical 11-point scale (ranging from 0 [not difficult at all] to 10 [extremely difficult]) to measure the level of difficulty were administered to 242 ambulatory patients from various settings who sought care for back pain. Patients were asked additional questions concerning item relevance and clarity. Test–retest, responsiveness, and item homogeneity analyses were performed; a statistical method based on item-response theory was applied to evaluate the discriminating ability of each item. Final item selection was guided by the analysis as well as by practical considerations. A major concern was to ensure that all types of physical activities relevant to back pain were represented. Developers also wanted the questionnaire to be highly reliable and discriminative over a wide range of disability levels, while at the same time being practical and acceptable to both patients and clinicians (78). Finally, 20 items representing 6 empirically derived categories of activity were selected (84).

Acceptability.

The QBPDS appears acceptable to both patients and clinicians (78, 91). Kopec et al reported low item omission (range 0.7–1.8%) (84). A higher rate of incomplete questionnaires (10.8%) was reported for questionnaires administered by mail (90).

According to some patients, a few items lack precision and the choice between response options 0 and 1 and between 4 and 5 is not always easy, and the item “throw a ball” surprised some patients (91). No ceiling or floor effects were reported (49).

Reliability.

Internal consistency.

The development study revealed a high internal consistency (Cronbach's α = 0.96) using the original numerical 11-point scale (84). Similarly high internal consistency was confirmed for the 6-point Likert scale (0–5) in other languages (Cronbach's α >0.90) (90, 92, 93, 95).
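For readers less familiar with the statistic, Cronbach's α for k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). The sketch below applies this standard formula to a small hypothetical data set; the function name `cronbach_alpha` and the data are illustrative only and do not reproduce any cited analysis.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed (total) score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of 5 patients to 4 items scored 0-5.
responses = [
    [1, 2, 1, 2],
    [3, 3, 2, 3],
    [4, 5, 4, 4],
    [0, 1, 1, 0],
    [2, 2, 3, 3],
]
print(round(cronbach_alpha(responses), 2))
```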

Reproducibility.

Reproducibility is good: the development studies (numerical 11-point scale) revealed high Pearson's correlation coefficients for all items (78) and an intraclass correlation coefficient (ICC) based on 2 self-administrations (spaced by 1–14 days, median 3.8 days) of 0.92 (84).

Davidson and Keating (49) studied test–retest reliability (6-week interval) with the 6-point Likert scale in 47 patients who were seeking treatment for LBP and who reported no change during the 6 weeks. They reported an ICC, SEM, and minimum detectable change (MDC) of 0.84 (95% confidence interval [95% CI] 0.73–0.91), 8 (95% CI 6–10), and 19 (95% CI 14–24), respectively. A similar study in patients with chronic LBP reported a slightly lower SEM (5.7) and smallest detectable change (15.8) (97).

Recently, Hicks and Manal reported an ICC, SEM, and MDC of 0.94, 4.73, and 11.04, respectively, in community-dwelling patients ages 62 years or older with current LBP (mean test–retest interval of 11 days) (98).
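The SEM and MDC values above can be related through a widely used convention, MDC = z × √2 × SEM, where z is the normal deviate for the chosen confidence level (1.96 for 95%, approximately 1.65 for 90%). Whether each cited study used exactly this formula is an assumption of this sketch; the reported pairs are, however, numerically consistent with it. The function name `mdc_from_sem` is illustrative.

```python
import math


def mdc_from_sem(sem, z=1.96):
    """Minimum detectable change from the SEM, assuming MDC = z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem


# SEM values quoted in the text; the confidence level used is inferred, not stated.
print(round(mdc_from_sem(5.7, z=1.96), 1))   # chronic LBP study: SEM 5.7
print(round(mdc_from_sem(4.73, z=1.65), 2))  # Hicks and Manal: SEM 4.73
print(round(mdc_from_sem(8, z=1.65)))        # Davidson and Keating: SEM 8
```

Under these assumptions, the first value corresponds to a 95% confidence level and the other two to a 90% level, which is worth keeping in mind when comparing MDC values across studies.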

Studies in other languages also revealed good reliability, with an ICC generally ≥0.9 (90, 93, 95).

The test–retest reliability (4-week interval) appeared lower in a group of patients with acute LBP (0.55) (79).

Validity.

Content and face validity.

Content and face validity were good (99), as the questionnaire contains various domains of activity that were selected by patients and health care providers, and has good measurement properties (78). Due to the poor response rate, developers did not include questions on sexual activities, although this topic may be important (78).

However, although patients were involved in the development of the questionnaire, the activities assessed by the QBPDS are not necessarily those in which disability matters most to patients (86).

Construct validity.

The scale is able to discriminate between groups of patients that are expected to differ in the disability level (84) or self-rated health (98).

Internal construct validity.

Kopec et al (78) reported a relatively high degree of interitem correlation (ranging from 0.24–0.87) as well as a very high item-total correlation (range 0.59–0.86). Later, the literature reported interitem correlations lower than 0.80 (suggesting absence of redundancy) (91) and item-total correlation ranging from 0.44–0.83 (90, 91).

External construct/convergent validity.

The QBPDS correlated strongly with other self-reported functional limitation measures such as the Roland-Morris Disability Questionnaire (RDQ), the Oswestry Disability Index (ODI), and the physical function subscale of the Short Form 36 (r = 0.77, 0.80, and 0.72, respectively) (84). Correlations with pain were weak to moderate (r = 0.54) (84).

Recent studies confirmed moderate to strong associations with other disability questionnaires, with correlation coefficients ranging from 0.6–0.91 (87, 90–93, 95, 98), and moderate to weak correlations with pain (90, 92, 93, 95), direct measure of physical function (75, 81, 87, 95), and psychosocial variables (91, 98).

Ability to detect change.

The few studies dealing with this measurement property (49, 79, 84, 97, 100) appeared heterogeneous in populations, the external criterion used, and statistical methods to estimate minimal important change, resulting in a wide range of values.

Responsiveness.

Responsiveness was good and similar to the ODI and RDQ (49); in the original studies the items proved highly sensitive to change (78) and the scale appeared able to detect relatively small changes in the level of disability over time (84). Although the difference in change scores between patients who said they had improved and those who said they had deteriorated was significant, the Norman-Streiner coefficient of sensitivity was low (0.26).

Davidson and Keating reported a standardized response mean (SRM) of 0.49 (49); however, Wilhelm et al reported a high sensitivity to change for the QBPDS (SRM 0.80, effect size 0.62) for the total score as well as for the scores of the 6 specific domains (89).
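The two responsiveness indices mentioned above differ only in their denominator: the SRM divides the mean change by the SD of the change scores, whereas the effect size divides it by the SD of the baseline scores. A minimal sketch with hypothetical scores follows; the function names and data are illustrative, and change is defined here as baseline minus follow-up so that improvement (a lower disability score) is positive.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Baseline minus follow-up, so a reduction in disability is positive."""
    return [b - f for b, f in zip(baseline, follow_up)]


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(changes)


def effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(baseline)


# Hypothetical QBPDS totals for 4 patients before and after treatment.
baseline = [50, 60, 40, 70]
follow_up = [40, 45, 35, 50]
print(round(standardized_response_mean(baseline, follow_up), 2))
print(round(effect_size(baseline, follow_up), 2))
```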

A recent study (97) focusing on the ability of the QBPDS to detect change in patients with chronic LBP referred for a multidisciplinary treatment performed a receiver operating characteristic analysis, and revealed an area under the curve (AUC) of 0.850 (versus 0.740 and 0.870 in the studies by Davidson and Keating and Fritz and Irrgang, respectively) (49, 79). Based on score change (expressed as the percentage) from baseline, the AUC was 0.856 (97).

Interpretability.

The literature reported minimal important change (MIC) values ranging from 8.5–32.9 points in patients with back pain (53). This is mainly due to the heterogeneity of the studies. Recently, a group of experts with a particular focus on primary care proposed considering the MIC for the QBPDS as a decrease of 20 points or 30% relative to the baseline score without taking into account the statistical method used (53). However, they specified that different MICs may be more appropriate for different populations and contexts.
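The proposed MIC can be applied to an individual patient as in the sketch below. It reports the absolute criterion (a decrease of 20 points) and the relative criterion (a decrease of 30% of the baseline score) separately, because how the two should be combined is left open here; the function name `mic_criteria` and the example scores are hypothetical.

```python
def mic_criteria(baseline, follow_up, abs_threshold=20.0, rel_threshold=0.30):
    """Check a QBPDS score decrease against the absolute and relative MIC proposals.

    Returns a dict with one flag per criterion; combining them is left to the user.
    """
    decrease = baseline - follow_up
    relative = decrease / baseline if baseline > 0 else 0.0
    return {
        "decrease": decrease,
        "relative_decrease": relative,
        "meets_absolute": decrease >= abs_threshold,
        "meets_relative": relative >= rel_threshold,
    }


# Hypothetical patients: the two criteria can disagree depending on baseline.
print(mic_criteria(80, 58))  # 22-point decrease, 27.5% of baseline
print(mic_criteria(40, 27))  # 13-point decrease, 32.5% of baseline
```

The two hypothetical patients show why the baseline score matters: a high-baseline patient can meet the absolute criterion without meeting the relative one, and the reverse holds for a low-baseline patient.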

A recent study focusing on the ability of the QBPDS to detect change in patients with chronic LBP referred for a multidisciplinary treatment revealed an optimal cutoff value of 5 points based on a receiver operating characteristic analysis (versus 15 points in the study by Fritz and Irrgang [79]), and 18.1% when the score change was expressed as the percentage from baseline (97). This study confirmed that the baseline score has an impact on the magnitude of the optimal cutoff score (97).

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths.

This questionnaire measures the level of functional disability in daily life, which is essential in patients with LBP. Furthermore, it is short, easy to use, and acceptable (for patients and clinicians); has good clinimetric properties (reliability, face and construct validity, ability to detect changes); and is available in several validated translated versions. Therefore, it belongs to the few back-specific disability questionnaires recommended in the literature (2).

Caveats and cautions.

Because the authors of the original version suggested a few changes (regarding the scale's format and the wording of some of the items) following the 2 development studies (78, 84), one cannot be sure that all clinimetric properties reported in those studies are identical for the newly proposed version.

Despite the rather good clinimetric properties, the use of the QBPDS still remains much less frequent than the RDQ or ODI.

Clinical usability.

The administrative and respondent burden of the QBPDS is extremely low, which makes it easy to use clinically. However, the absence of a consensus regarding interpretability values, resulting from the limited number of studies and their high level of heterogeneity, makes the interpretation of individual score changes difficult.

Research usability.

The good clinimetric properties of the QBPDS support using it in research.

ROLAND-MORRIS DISABILITY QUESTIONNAIRE (RDQ)

Description

Purpose.

The RDQ was designed in 1983 (101) for use in primary care research to assess physical disability due to low back pain (LBP). It has been used extensively in clinical practice in different settings (primary care, injured workers, and multidisciplinary rehabilitation center) to monitor progress in patients with acute, subacute, and chronic LBP and sciatica (1, 102). The original description of the RDQ included a pain rating scale that is not recommended and consequently is no longer used (41).

Some modifications have been proposed: 1) changing the phrase “because of my back pain” into “because of my back or leg problems” to make it suitable for patients with sciatica (103), 2) reducing the 24 items to 18 due to analysis of redundancy (104), 3) removing 5 items to improve responsiveness and adding 4 additional items, resulting in a 23-item RDQ (103), and 4) changing the timeframe from the last 24 hours into “how many days of the previous month” the patient has been affected (105).

As these modifications resulted in only modest improvements, or have been insufficiently validated, the use of the original version has been recommended (41, 42, 102). In this review, only data regarding the 24-item original RDQ are shown.

Content.

The items represent the execution of daily physical activities and functions that may be affected by LBP, such as housework, sleeping, mobility, dressing, getting help, appetite, irritability, and pain severity. Although it is called a “disability” scale, it contains elements of impairment, disability, and handicap according to the International Classification of Functioning, Disability and Health (78, 106, 107).

Number of items.

24 items, no subscales.

Response options/scale.

In the original version, a patient has to tick a box against the statements that apply to him or her and leave the others blank. Modified versions use a “yes” and “no” response option for each item (108, 109).

Recall period for items.

Relates to the last 24 hours.

Endorsements.

Deyo et al (42) recommended using the RDQ (or the Oswestry Disability Index [ODI]) in a standard set of outcome measures for back pain. However, since then, in several reviews about functional status measures in back pain, no specific recommendations for a specific measurement tool were made (1, 102).

Examples of use.

Smeets RJ, Vlaeyen JW, Hidding A, Kester AD, Van Der Heijden GJ, Knottnerus JA. Chronic low back pain: physical training, graded activity with problem solving training, or both? The one-year post-treatment results of a randomized controlled trial. Pain 2008;134:263–76 (58).

Artus M, van der Windt DA, Jordan KP, Hay EM. Low back pain symptoms show a similar pattern of improvement following a wide range of primary care treatments: a systematic review of randomized clinical trials. Rheumatology (Oxford) 2010;49:2346–56 (110).

Wilkens P, Scheel IB, Grundnes O, Hellum C, Storheim K. Effect of glucosamine on pain-related disability in patients with chronic low back pain and degenerative lumbar osteoarthritis: a randomized controlled trial. JAMA 2010;304:45–52 (111).

Lamb SE, Hansen Z, Lall R, Castelnuovo E, Withers EJ, Nichols V, et al. Group cognitive behavioural treatment for low-back pain in primary care: a randomised controlled trial and cost-effectiveness analysis. Lancet 2010;375:916–23 (112).

Hay EM, Mullis R, Lewis M, Vohora K, Main CJ, Watson P, et al. Comparison of physical treatments versus a brief pain-management programme for back pain in primary care: a randomised clinical trial in physiotherapy practice. Lancet 2005;365:2024–30 (113).

Mannion AF, Muntener M, Taimela S, Dvorak J. A randomized clinical trial of three active therapies for chronic low back pain. Spine (Phila Pa 1976) 1999;24:2435–48 (114).

Practical Application

How to obtain.

A free download of the questionnaire in different languages is available from www.rmdq.org/Download.htm. Queries may be sent to mroland@man.ac.uk. A copy is available in the original publication (41).

Method of administration.

A self-completed questionnaire on paper and an electronic version. Both versions seem equivalent and can be used interchangeably (115). The RDQ can also be administered by telephone (41, 109).

Scoring.

Items are not weighted. The total score is calculated by adding up the “yes” answers or the items checked by the patient. Scoring does not include a “not applicable” option (as a result, the denominator remains 24 even if a statement is not applicable to the patient), which may be problematic.
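The scoring rule amounts to counting the endorsed statements, as in the following minimal sketch. The function name `score_rdq` and the representation of a completed questionnaire as a list of ticked item numbers are assumptions made for illustration.

```python
def score_rdq(checked_items):
    """Return the RDQ total: the number of distinct items (1-24) the patient ticked."""
    items = set(checked_items)  # a statement ticked twice still counts once
    if not items <= set(range(1, 25)):
        raise ValueError("item numbers must be between 1 and 24")
    # Unticked items count as 0; there is no 'not applicable' option,
    # so the maximum stays 24 for every patient.
    return len(items)


# Hypothetical patient who ticked 6 statements.
print(score_rdq([2, 5, 7, 11, 18, 23]))
```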

Score interpretation.

Scores range from 0 (no disability) to 24 (maximal disability). Female patients more frequently select item 5, “using a handrail to get upstairs,” and item 7, “holding on to something to get out of chair,” and patients ages ≥65 years more often select item 5 (116, 117). In the original study describing the natural history/evolution of LBP, median RDQ scores were 11, 8, and 4 on presentation, 7 days later, and 1 month later, respectively. Stratford et al (118) reported that 68% of patients with mechanical LBP had initial scores ranging from 7–17.

Respondent burden.

Completion takes ∼5 minutes (119). The RDQ is short and readily understood by patients (52).

Administrative burden.

Scoring takes <1 minute. Training is not necessary.

Translations/adaptations.

Translations are available in Arabic (Egyptian), Bulgarian, Chinese, Croatian, Czech, Danish (120), Dutch (121), English (Canadian, US, Australian), Flemish, French (122), German (123), Greek (124), Hungarian, Icelandic, Iranian (92), Italian (125), Japanese (126), Korean, Norwegian (127), Polish, Portuguese (128), Brazilian Portuguese (129), Moroccan (130), Romanian, Russian, Spanish (131), Argentinean (132), Colombian, Mexican, Puerto Rican, Venezuelan, Swedish (133), Thai, Tunisian (134), and Turkish (117), as well as for India (Hindi, Kannada, Marathi, Tamil, Telugu, Urdu). Several of these versions have not been validated.

Psychometric Information

Method of development.

The RDQ (101) includes 23 items selected from the Sickness Impact Profile (a 136-item health status measure) (135); 15 relate to the physical category, 3 to sleep and rest, 2 to psychosocial, 2 to home management, and 1 to eating. An additional item (“my back is painful almost all the time”) is related to the frequency of back pain. The authors chose these items because they describe activities usually affected by LBP; the phrase “because of my back pain” was added to all items to make it specific to LBP and to exclude disability due to another cause.

Acceptability.

The RDQ appears acceptable to both patients and clinicians. It is more discriminative in patients who have relatively little disability rather than a high level of disability (41). Proportions of items omitted by the patients are scarcely reported. Kovacs et al (131) and Scharovsky et al (132) reported no missing values compared to 19% and 18%, respectively, in the ODI. The Brazilian RDQ proved to be easy to understand and in 94% of the patients, no item was missing (129). In workers with back injury claims, 14.6% did not answer ≥1 items (109).

Neither floor nor ceiling effects were seen at baseline among workers with recent work-related back injuries (109, 136). In a study of patients with mild to moderate low back pain, 22% scored ≤2 at baseline, including 4.9% who scored 0 (137).

Reliability.

Good internal consistency is reported, with Cronbach's alpha ranging from 0.84–0.96 (41, 106, 109, 127, 130, 138, 139). Reliability for short time intervals (1–14 days) (101, 132) is higher compared to intervals longer than 6 weeks (49, 140). Pearson's correlation coefficients for test–retest in patients with acute/subacute LBP are 0.91 for the same day (101), 0.88 for 1 week (133), and 0.83 for 3 weeks (141). In patients with chronic LBP, a correlation coefficient of 0.72 (interval 2 days to 6 months) was found (140).

The intraclass correlation coefficients (ICCs) for test–retest in patients with acute/subacute LBP are 0.93 for 1–14 days (106), 0.91 for 2 weeks (142), and 0.86 for 3–6 weeks (118). In a mixed group of patients with acute/subacute and chronic LBP referred for physiotherapy, the ICC ranged from 0.42–0.53 (interval of 6 weeks) (49). Almost all studies with a time interval of >2 weeks have lower ICCs than the studies with a shorter interval (142).

In a mixed group of patients with acute/subacute and chronic LBP, SEMs of 3.7 and 4.1, respectively, are reported (49). The SEM depends on the statistical method used, time interval, and definition of unchanged patient, and for patients with chronic LBP it ranged from 1–2.1, 1.3–2.5, and 1.7–2.2, respectively (143).

Minimum detectable change (MDC) also depends on time interval (range 3.7–6.9), definition of “unchanged” patients (range 4.8–6), type of SEM measurement (range 2.7–5.8), treatment type (range 5.4–5.6), and baseline scores (range 5.5–6) (143). Other reports of MDC are generally in line with these ranges (118, 127, 144, 145).

Limits of agreement (LOA) tend to increase as time between tests increases for patients with chronic LBP. Demoulin et al (143) reported almost double values (range −5.8 to 7.8) for a time interval of 12 or more weeks compared to 1–2 weeks (range −3.5 to 3.9). For short intervals (2 weeks), LOA varied from −4.6 to 6.2 (142).
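As an illustration of how such limits are commonly obtained, the sketch below computes 95% LOA in the Bland-Altman manner, i.e., the mean test–retest difference ± 1.96 times the SD of the differences. That the cited studies used exactly this computation is an assumption; the function name `limits_of_agreement` and the data are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest, z=1.96):
    """Bland-Altman limits: mean difference +/- z * SD of the differences."""
    differences = [r - t for t, r in zip(test, retest)]
    center = mean(differences)
    spread = z * stdev(differences)
    return center - spread, center + spread


# Hypothetical RDQ scores of 4 stable patients at test and retest.
test = [10, 12, 8, 14]
retest = [11, 11, 10, 13]
lower, upper = limits_of_agreement(test, retest)
print(f"LOA: {lower:.2f} to {upper:.2f}")
```

A change in an individual patient's score that falls inside these limits cannot be distinguished from measurement error, which is why widening limits over longer intervals are a concern.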

Validity.

Content and face validity.

Only a limited range of problems in physical daily activities related to back pain is assessed. Evaluation of the different activities specified by patients with LBP produced a list of 325 activities (pooled in 56 similar activity groups) compared to the 24 items in the RDQ (146).

The RDQ contains a small number of psychosocial items that are not related to functional limitation per se, e.g., appetite, irritability.

Construct validity.

RDQ scores correlate moderately to strongly with other self-reported disability measures: the Quebec Back Pain Disability Scale (r = 0.60) (84, 87), the ODI (r = 0.50) (52, 87, 147), the Back Pain Functional Scale (r = 0.79) (148), the Aberdeen Back Pain Scale (r = 0.68) (149), the Isernhagen Work Systems Functional Capacity Evaluation (r = −0.20) (87), and the EuroQol (r = −0.50) (149, 150).

RDQ scores show weak to modest correlations with pain intensity (range 0.26–0.57) (149–151) and with physical impairment tests such as the straight-leg raising test and flexion range of motion (range 0.27–0.44) (152).

The RDQ largely satisfies the Rasch model for unidimensionality (108). However, there are too few items of higher difficulty to adequately evaluate persons with mild disability. Some misfitting items have been found and many of the items are of moderate difficulty with few easy or difficult items. This means that it is easier to detect change for individuals who start with scores in the middle of the range than those who start with high or low scores.

Ability to detect change.

The magnitude of responsiveness is dependent on the type of external criteria used (153). Furthermore, the time interval between tests, interpretation of the general perceived effect scale, and baseline scores have a considerable impact on the responsiveness indicators of the RDQ (143).

Several authors comparing the RDQ and ODI have concluded that the RDQ is more sensitive to change (2, 102), especially for minor levels of functional limitation. However, the RDQ may be relatively insensitive to deterioration in the patients' condition.

Responsiveness statistics, such as areas under curves (AUCs), ranged from 0.68–0.93 (49, 51, 52, 102, 137, 154). Demoulin et al (143) reported different AUC scores for different definitions of unchanged patients (range 0.83–0.90) and baseline scores (range 0.89–0.91). Effect sizes ranged from 0.50–1.60 (49, 103, 106, 118, 146).

In patients with subacute/chronic LBP, standardized response means (SRMs) ranged from 0.55–0.90 for a time interval of 6 weeks (49, 149) and were 0.72 at 6 months and 0.83 at 1 year (149). SRMs (3-week period) were 1.34 for patients with acute LBP, 0.80 for patients with subacute LBP, and 0.48 for patients with chronic LBP (155). For another population with chronic LBP (interval of 28 weeks), the SRM ranged from 1.33–2.64 (153).

Cutoff points for relevant improvements strongly depend on baseline severity and methods used for estimation of minimum clinically important change (MCIC) (143, 145, 153). Systematic reviews concluded that as an approximate guide, changes of 2–3 points on the RDQ between groups should be considered the MCIC (41, 145). Kovacs et al (145) reported for patients with subacute and chronic LBP an MCIC ranging from 2.5–6.8 points in patients with baseline scores below 10 points, and from 5.5–13.8 in patients with baseline scores ≥15 points.

Based on an expert consensus, a 30% change from baseline was proposed as a clinically meaningful improvement, which normally means an absolute change of 5 points (53).

Critical Appraisal of Overall Value to the Rheumatology Community

Strengths.

The RDQ is the most comprehensively validated measure in low back pain. It is short, simple to complete, and readily understood by patients and clinicians. Psychometric properties are acceptable to good and the RDQ is available in many language versions. It can be used in patients with acute, subacute, and chronic LBP.

Caveats and cautions.

There is some evidence that the RDQ does not provide a sufficient spread of items representing activities on a continuum from easy to hard (116). The poor fit of some items to the factor “disability” needs further attention (108, 116). Garratt (108) stated that the RDQ could be improved through the removal of items with poor fit statistics and the addition of items toward the extremes of the scale hierarchy. None of the versions has enough items of higher difficulty to assess persons with low levels of disability, making it inadequate for assessing function in patients with little disability (116).

Clinical usability.

The administrative and respondent burden is very low. RDQ scores and change scores must be interpreted with caution due to poor-fitting items and the fact that the RDQ does not appear to have interval-level properties. It is inadequate for use in patients with little disability.

Research usability.

The psychometric quality is sufficient for using the RDQ in research. Score distributions must be examined before statistical analysis and Rasch-transformed scores can be used to adjust for the imperfections in the scale hierarchy (108).

Table  . Summary Table for Measures of Function in Low Back Pain/Disorders*
Scale: PILE
Purpose/content: Capacity to tolerate strenuous lifting throughout a day and to evaluate lifting capacity
Method of administration: Observer-led task
Respondent burden: 5–15 min, temporary increase of pain
Administrative burden: 5–15 min, increase of lifting weight, register time, HR, and quality of lifting
Score interpretation: Weight lifted adjusted for sex/weight or number of completed lifting cycles
Reliability evidence: Good ICCs; however, LOA large (48% of baseline score)
Validity evidence: Good and proof for construct validity
Ability to detect change: Poor to moderate
Strengths: Safe, inexpensive, easy to administer psychophysical lifting end point, and unconstrained lifting reflects “real-world” lifting
Cautions: N/A in patients taking HR-limiting medication. Unable to discriminate the “weak link” of the biomechanical lifting chain. 10% of patients are not able to perform task

Scale: ODI
Purpose/content: Measuring pain-related disability in people with acute, subacute, or chronic low back pain
Method of administration: Self-completed questionnaire by patient on paper and/or phone
Respondent burden: <5 min
Administrative burden: <1 min
Score interpretation: Total score ranges from 0 (no disability) to 100 (maximum disability)
Reliability evidence: Good ICCs
Validity evidence: Adequate content and construct validity; however, lacks generic activities such as work, leisure, recreational, or sporting activities
Ability to detect change: Cutoff point for minimum important change is 10 points or a 30% score improvement
Strengths: Simple to use and score, and has minimal respondent and administrative burden
Cautions: Face-to-face or computer administration would be the preferred method over telephone interview

Scale: LBPRS
Purpose/content: Measuring 3 clinical illness components of low back pain: pain (back and leg), disability, and physical impairment
Method of administration: Self-completed questionnaire by patient on paper or by interview
Respondent burden: ∼15 min
Administrative burden: ∼15 min
Score interpretation:
  • Score ranges: 0–60 for pain, 0–40 points for disability, 0–40 points for impairments
  • Recommended not to use the total sum score
  • Higher scores are indicative of more problems
Reliability evidence: High interrater reliability (97.7%)
Validity evidence: Correlates highly with RDQ
Ability to detect change: MCID for the disability scale is 17 and for pain scale is 10 points
Strengths: Simple and contains a well-balanced distribution of items across the ICF components pain, activity limitation, and physical impairment
Cautions: Responsiveness is lower compared to RDQ and ODI. Lacks information on MDC and SEM

Scale: RDQ
Purpose/content: Measuring daily physical activities and functions that may be affected by low back pain
Method of administration: Self-completed questionnaire by patient on paper and electronic version
Respondent burden: <5 min
Administrative burden: <1 min
Score interpretation: Scores range from 0 (no disability) to 24 (maximal disability)
Reliability evidence: Internal consistency and ICC are good. MDC and SEM are known, but are influenced by several factors (time intervals, methods used, etc.)
Validity evidence:
  • Acceptable; contains a small number of items that are not related to functional limitations
  • Correlates well with other disability measures
Ability to detect change: MCID ranges from 2–5 points. A 30% change from baseline was proposed as a clinically meaningful improvement (normally equivalent to an absolute change of 5 points)
Strengths: Short, simple to complete, and readily understood by patients and clinicians. Psychometric properties are acceptable to good and the RDQ is available in many language versions. It can be used in acute, subacute, and chronic low back pain patients
Cautions: Less suitable for patients with low levels of disability. Can be improved through the removal of items with poor fit statistics and the addition of items toward the extremes of the scale hierarchy

Scale: QBPDS
Purpose/content: Measuring elementary daily activities that patients with back pain might perceive difficult to perform. Items can be classified into 6 domains of activity affected by back pain
Method of administration: Self-completed questionnaire by patient on paper, mail, and/or phone
Respondent burden: <5 min
Administrative burden: <1 min
Score interpretation: Score ranges from 0 (no disability) to 100 (maximal disability)
Reliability evidence: Internal consistency and ICC are good
Validity evidence: Good, contains various domains of activity that were selected by patients and health care providers; correlates well with other disability measures
Ability to detect change: MCID ranges from 8.5–32.9 mainly due to the heterogeneity of the study populations. A 30% change from baseline was proposed as a clinically meaningful improvement
Strengths: Short, easy to use, and acceptable. Measures functional disability in daily life that is essential in patients with low back pain
Cautions: Due to changes regarding the scale's format and the wording of some of the items, one cannot be sure that all clinimetric properties reported in studies are identical for the newly proposed version

* PILE = Progressive Isoinertial Lifting Evaluation; HR = heart rate; ICC = intraclass correlation coefficient; LOA = limits of agreement; N/A = not applicable; ODI = Oswestry Disability Index; LBPRS = Low Back Pain Rating Scale; RDQ = Roland-Morris Disability Questionnaire; MCID = minimum clinically important difference; ICF = International Classification of Functioning, Disability and Health; MDC = minimum detectable change; QBPDS = Quebec Back Pain Disability Scale.

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published.
