Background
Osteoarthritis (OA), the most common rheumatic disease, primarily affects the articular cartilage and subchondral bone of a synovial joint and results in joint failure. The most typical radiographic features are the formation of osteophytes at the joint margins, joint space narrowing, subchondral sclerosis, subchondral cyst formation and chondrocalcinosis (Scott 1993). It has been estimated that about 40% to 80% of people with radiographic changes will have symptomatic disease. The Framingham Osteoarthritis Study found that 10% of people aged 63 years and over had symptomatic knee OA in the presence of radiographic changes (Felson 1987). A comparable population-based survey conducted amongst residents of Beijing found that Chinese women had an even higher prevalence of knee OA compared to the mostly Caucasian women in the Framingham study (Zhang 2001).
People with symptomatic OA of the knee complain of deep, aching pain. In early disease, pain is intermittent and mostly associated with joint use. For many people, symptomatic disease progresses and the pain becomes more chronic and may also be present at rest and during the night. The joint feels 'stiff', resulting in typical pain and difficulty when initiating movement after a period of rest. With advanced disease, patients may experience crepitus or deep 'creaking' sounds on movement and the range of joint motion often becomes limited. People with progressive symptomatic knee OA experience increasing difficulty with daily functional activities. In fact, knee OA is responsible for more disability in walking, stair climbing and housekeeping in non-institutionalised people aged 50 years and over than any other disease (Davis 1991; Guccione 1994). Ultimately, chronic OA involving the lower limb joints leads to reduced physical fitness with resultant increased risk of cardiovascular co-morbidity (Minor 1988; Philbin 1995).
Altered biomechanics, resulting in increased joint loading rate or localised stress in the articular cartilage, has an important role in both the initiation and progression of knee OA (Cooper 1995; Felson 1995; Kujala 1995; McAlindon 1999; Rangger 1995; Slemenda 1997; Zhang 1996). Currently, there is no known cure for OA. However, disease-related factors, such as impaired muscle function and reduced fitness, are potentially amenable to exercise (Buchner 1992; Fiatarone 1993). International guidelines advocate various non-pharmacological treatments, including exercise, as the first line of management for people with OA (Roddy 2005; Zhang 2008).
Objectives
To determine whether land-based therapeutic exercise is beneficial for people with knee OA in terms of reduced joint pain and improved physical function.
Methods
Criteria for considering studies for this review
Types of studies
Randomized or quasi-randomized controlled trials, published in the English language, comparing some form of land-based therapeutic exercise with a non-exercise group.
Types of participants
Adults, male or female, with either an established diagnosis of knee OA according to accepted criteria (Altman 1991) or self-reporting knee OA on the basis of chronic joint pain (without radiographic confirmation).
Types of interventions
Any land-based, non-perioperative therapeutic exercise regimen aiming to relieve the symptoms of OA, regardless of content, duration, frequency or intensity. The comparator (control) group could be an active group (any non-exercise intervention) or a placebo group (no treatment or waiting list).
Types of outcome measures
In accordance with international consensus regarding the core set of outcome measures for phase III clinical trials in OA (Bellamy 1997), the randomized clinical trial needed to include assessment of at least one of the following:
- self-reported pain;
- self-reported physical function.
For this systematic review, data from the outcomes assessment conducted immediately post-treatment (or the most immediate assessment post-treatment) have been used.
Search methods for identification of studies
Five electronic databases were searched: MEDLINE (January 1966 to December 2007), EMBASE, CINAHL (January 1982 to December 2007), PEDro (Physiotherapy Evidence Database), and the Cochrane Central Register of Controlled Trials (CENTRAL) (The Cochrane Library).
The five search strategies are outlined in Appendices 1 to 5.
Data collection and analysis
Two review authors (SM, MF) independently screened retrieved clinical studies for inclusion, extracted data from all included studies, and conducted the methodological quality assessment. If agreement was not achieved at any stage, a third review author (MB) adjudicated.
Risk of bias
The quality of the included studies was evaluated according to three criteria. Two of these criteria are recommended by Jadad et al (Jadad 1996).
1. Blinding of intervention provider, recipient, and outcomes assessment.
As it is arguably not possible to truly blind the intervention provider or recipient to the treatment allocation in RCTs evaluating exercise programs versus non-exercise programs, the studies were evaluated according to whether the outcomes assessment was blinded.
2. Handling of withdrawals and dropouts.
Studies were assessed according to whether the presented results were analysed as per the intention to treat (all randomized participants according to their study treatment allocation), or efficacy analysis (where only participants completing an outcomes assessment or only participants adhering to the study treatment allocation were included).
The two criteria were supplemented by an evaluation of the reported methods for allocation concealment (Schulz 1995).
3. Adequate allocation concealment, or unclear or inadequate allocation concealment.
The studies were then assigned an overall assessment:
- low risk of bias (all three criteria met);
- moderate risk of bias (one or two criteria met);
- high risk of bias (none of the criteria met).
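The overall assessment above can be expressed as a small rule. The following is a minimal sketch only; the function and parameter names are hypothetical and are not taken from the review.

```python
def overall_risk_of_bias(blinded_assessor: bool,
                         intention_to_treat: bool,
                         adequate_concealment: bool) -> str:
    """Map the three yes/no criteria to an overall risk-of-bias rating.

    Assumption: each criterion is recorded as simply met (True) or
    not met / unclear (False).
    """
    criteria_met = sum([blinded_assessor, intention_to_treat, adequate_concealment])
    if criteria_met == 3:
        return "low"       # all three criteria met
    if criteria_met == 0:
        return "high"      # none of the criteria met
    return "moderate"      # one or two criteria met
```

For example, a study with blinded outcomes assessment and adequate allocation concealment but an efficacy analysis would be rated "moderate" under this rule.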
Data analysis
As the studies used a variety of continuous data scales to evaluate outcomes, a unitless measure of treatment effect size was needed to allow the results of the various randomized controlled trials (RCTs) to be combined. Standardized mean differences (SMD) and a random-effects model were used to calculate treatment effect sizes from the entered mean change scores and related baseline standard deviations. Change scores were used as many of the studies were small and demonstrated baseline differences in scores for the outcomes between the allocation groups. The treatment effect size, therefore, was a unitless measure providing an indication of the size of the change in terms of its baseline variability.
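To illustrate the calculation described above, the sketch below computes an SMD from mean change scores and baseline standard deviations for a single two-group trial. It is not the software used for the review. It assumes a pooled baseline standard deviation, a Hedges-type small-sample correction, and a common large-sample approximation for the standard error; all names and the example numbers are hypothetical.

```python
import math

def smd_from_change_scores(mean_change_trt, mean_change_ctl,
                           sd_base_trt, sd_base_ctl,
                           n_trt, n_ctl):
    """Return (smd, standard_error) for one trial.

    Assumptions: the standardizer is the pooled baseline SD, and a
    Hedges-type small-sample correction is applied.
    """
    df = n_trt + n_ctl - 2
    pooled_sd = math.sqrt(((n_trt - 1) * sd_base_trt ** 2 +
                           (n_ctl - 1) * sd_base_ctl ** 2) / df)
    d = (mean_change_trt - mean_change_ctl) / pooled_sd
    n_total = n_trt + n_ctl
    correction = 1 - 3 / (4 * n_total - 9)   # small-sample correction factor
    g = d * correction
    # Large-sample approximation to the standard error of the SMD.
    se = math.sqrt(n_total / (n_trt * n_ctl) + g ** 2 / (2 * n_total))
    return g, se

# Hypothetical trial: improvement of 2.0 points with exercise versus 0.5
# in controls, baseline SD of 3.0 in both groups, 50 participants per group.
smd, se = smd_from_change_scores(2.0, 0.5, 3.0, 3.0, 50, 50)
ci_low, ci_high = smd - 1.96 * se, smd + 1.96 * se
```

Because the difference in change scores is divided by the baseline standard deviation, the result is unitless and can be combined across trials that used different scales.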
Sensitivity analyses were conducted according to treatment content (strengthening, aerobic, walking, other), treatment delivery mode (individual, class-based, home programs), and the number of directly supervised contact occasions. Sensitivity analyses were also conducted on various aspects of study methodology (blinding of outcomes assessment, statistical analysis method, allocation concealment, overall assessment of bias risk, and sample size).
Authors were contacted if the data could not be extrapolated in the desired form from the published manuscript.
Grading the strength of the evidence
A modified system for grading the strength of scientific evidence for a therapeutic intervention, described in Evidence-based Rheumatology (Tugwell 2004), was used to grade the evidence included in this systematic review. As blinding of participants to their allocation to an exercise intervention is not feasible in RCTs with a non-exercise comparator, only blinding of outcomes assessors (and not participants) was considered the essential blinding criterion for platinum and gold ranking.
Platinum level evidence
The platinum ranking is given to evidence from a systematic review that includes at least two individual controlled trials, each satisfying the following criteria for the major outcome(s) as reported:
- sample sizes of at least 50 per group. If they do not find a statistically significant difference, they are adequately powered for a 20% relative difference in the relevant outcome;
- blinding of assessors for outcomes;
- handling of withdrawals with > 80% follow up (imputations based on methods such as 'last observation carried forward' (LOCF) acceptable);
- concealment of treatment allocation.
Gold level
The gold ranking is given to evidence if at least one randomized clinical trial meets all of the following criteria for the major outcome(s) as reported:
- sample sizes of at least 50 per group. If they do not find a statistically significant difference, they are adequately powered for a 20% relative difference in the relevant outcome;
- blinding of assessors for outcomes;
- handling of withdrawals with > 80% follow up (imputations based on methods such as 'last observation carried forward' (LOCF) acceptable);
- concealment of treatment allocation.
Silver level
The silver ranking is given to evidence if a randomized trial does not meet the above criteria. Silver ranking would also include evidence from at least one study of non-randomized cohorts who did and did not receive the therapy, or evidence from at least one high quality case-control study. A randomized trial with a 'head-to-head' comparison of agents is considered silver level ranking unless a reference is provided to a comparison of one of the agents to placebo showing at least a 20% relative difference.
Bronze level
The bronze ranking is given to evidence if there is at least one high quality case series without controls (including simple before and after studies in which the patient acts as their own control), or if it is derived from expert opinion based on clinical experience without reference to any of the foregoing (for example, argument from physiology, bench research, or first principles).
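The four levels described above can be summarized as a decision rule. The sketch below is a simplified illustration with hypothetical names. It assumes the studies come from a single systematic review, it does not model the 'head-to-head' exception for silver, and it does not judge whether a cohort, case-control or case series study is 'high quality'.

```python
from dataclasses import dataclass

# Designs that can support at least a silver ranking (assumption).
CONTROLLED_DESIGNS = {"rct", "cohort", "case_control"}

@dataclass
class Study:
    design: str                  # "rct", "cohort", "case_control", "case_series" or "expert_opinion"
    min_group_size: int          # smallest allocation group
    significant: bool            # statistically significant difference found
    powered_for_20_percent: bool # adequately powered for a 20% relative difference
    blinded_assessor: bool
    follow_up_fraction: float    # e.g. 0.85 for 85% follow up
    concealed_allocation: bool

def meets_all_criteria(s: Study) -> bool:
    """Check the four criteria shared by the platinum and gold levels."""
    size_ok = s.min_group_size >= 50 and (s.significant or s.powered_for_20_percent)
    return (s.design == "rct" and size_ok and s.blinded_assessor
            and s.follow_up_fraction > 0.80 and s.concealed_allocation)

def grade_evidence(studies: list[Study]) -> str:
    qualifying = sum(meets_all_criteria(s) for s in studies)
    if qualifying >= 2:
        return "platinum"
    if qualifying == 1:
        return "gold"
    if any(s.design in CONTROLLED_DESIGNS for s in studies):
        return "silver"
    return "bronze"
```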
Clinical relevance tables
Clinical relevance tables were compiled under 'Additional tables' to improve the readability of the review. For outcomes pooled on different scales, the standardized mean difference (SMD) was multiplied by the baseline standard deviation in the control group to obtain the weighted absolute change. Relative per cent change from baseline was calculated as the absolute benefit divided by the baseline mean of the control group. Number needed to treat (NNT) was calculated using the Wells calculator software available at the Cochrane Musculoskeletal Group (CMSG) editorial office. The minimal clinically important difference (MCID) for each outcome was determined for input into the calculator.
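The back-conversion from an SMD to an absolute and relative change can be sketched as follows. The function name and the example values are hypothetical and are not data from the review; the NNT step is omitted because it relies on the calculator named above.

```python
def clinical_relevance(smd, control_baseline_sd, control_baseline_mean):
    """Convert a pooled SMD back to scale units and to a relative change.

    absolute change = SMD x baseline SD of the control group
    relative change = absolute change / baseline mean of the control group
    """
    absolute_change = smd * control_baseline_sd
    relative_change_pct = 100.0 * absolute_change / control_baseline_mean
    return absolute_change, relative_change_pct

# Hypothetical example: SMD 0.40 on a scale where the control group had a
# baseline mean of 10.0 and a baseline SD of 4.0.
absolute, relative_pct = clinical_relevance(0.40, 4.0, 10.0)
```

In this hypothetical case the absolute change is about 1.6 scale points, or about 16% of the control group's baseline mean.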
Results
Description of studies
See: Characteristics of included studies; Characteristics of excluded studies.
Of the 85 retrieved RCTs identified from the literature search, 32 studies met the inclusion criteria (Baker 2001; Bautch 1997; Bennell 2005; Deyle 2000; Ettinger 1997a/Ettinger 1997b; Foley 2003; Fransen 2001; Fransen 2007; Gur 2002; Hay 2006; Hopman-Rock 2000; Huang 2003; Huang 2005a; Hughes 2004; Keefe 2004; Kovar 1992; Maurer 1999; Messier 2004; Mikesky 2006; Minor 1989; O'Reilly 1999; Peloquin 1999; Petrella 2000; Quilty 2003; Rogind 1998; Schilke 1996; Song 2003; Talbot 2003; Thomas 2002; Thorstensson 2005; Topp 2002; van Baar 1998). Of the 32 studies, one had two clearly different exercise intervention groups and was treated as two trials, with the sample size of the control group divided equally between the two exercise intervention groups: aerobic walking (Ettinger 1997a) and resistance training (Ettinger 1997b). Four of the included studies recruited people with a diagnosis of either hip or knee OA (Foley 2003; Fransen 2007; Hopman-Rock 2000; van Baar 1998). These four studies provided data specific to participants with knee OA. Two of the included studies were three-armed trials (Foley 2003; Fransen 2007) with a hydrotherapy allocation. Only the land-based and control arms were included in the current review. Two studies allocated participants to two (Gur 2002) or three (Huang 2003) different forms of muscle strengthening. As the control groups in both studies were relatively small, the mean effects of the exercise allocations were combined and compared with the control group. One study (Huang 2005a) had two allocations combining exercise with ultrasound or hyaluron. Only the exercise alone allocation was considered in the current review. One study had four treatment allocations (Messier 2004), two of which included a weight reduction program. Only the exercise alone allocation versus the control group (healthy lifestyle education) was considered in the current review. One study (Mikesky 2006) included participants without knee pain; data were provided by the author on the 37 participants with knee pain and confirmed knee OA. One study (Keefe 2004) had four allocations, two involving a spouse-assisted coping strategy intervention. Only the exercise alone and the control groups were evaluated in the current review.
Fifty-three studies were excluded for reasons given in the 'Characteristics of excluded studies' table (Atamaz 2006; Borjesson 1996; Callaghan 1995; Chamberlain 1982; Cheing 2002; Cheing 2004; Deyle 2005; Dias 2003; Diracoglu 2005; Durmus 2007; Eungpinichpong 1997; Evcik 2002; Eyigor 2004; Forster 2007; Green 1993; Haslam 2001; Hinman 2007; Hoeksma 2004; Huang 2005b; Hurley 1998; Jan 1991; Kreindler 1989; Kuptniratsaikul 2002; Lankhorst 1982; Lim 2002; Lin 2004; Lin 2007; Mangione 1999; McCarthy 2004; Messier 1997; Messier 2000 (1); Messier 2000 (2); Moss 2007; Nicklas 2004; Ozdincler 2005; Penninx 2001; Penninx 2002; Peterson 1993; Quirk 1985; Ravaud 2004; Rejeski 1998; Sen 2004; Stitik 2007; Sullivan 1998; Sylvester 1989; Toda 2001; Tuzun 2004; van Baar 2001; Van Gool 2005; Wang 2006; Williamson 2007; Wyatt 2001; Yip 2007).
There was large variability between the 32 included studies in study participants recruited, exercise interventions assessed, and important aspects of study methodology. Most studies recruited between 50 and 150 participants. However, 11 (36%) studies recruited fewer than 25 participants in one or both allocation groups (Baker 2001; Bautch 1997; Foley 2003; Gur 2002; Keefe 2004; Mikesky 2006; Minor 1989; Rogind 1998; Schilke 1996; Song 2003; Talbot 2003) while one study recruited more than 750 participants (Thomas 2002).
Sample recruitment varied widely with studies recruiting: exclusively community volunteers (Bennell 2005; Ettinger 1997a/Ettinger 1997b; Fransen 2007; Hughes 2004; O'Reilly 1999; Peloquin 1999; Quilty 2003); specialist rheumatology or orthopaedic clinic patients (Foley 2003; Schilke 1996; Song 2003; Thorstensson 2005); a mix of community volunteers and specialist clinic patients (Bautch 1997; Keefe 2004; Minor 1989; Petrella 2000); general physician referrals (van Baar 1998; Hay 2006; Thomas 2002); or from physiotherapy waiting lists (Deyle 2000; Fransen 2001).
Approximately 50% of the sample in one study reported a symptom duration of less than a year (van Baar 1998) whilst other studies reported mean symptom durations of more than 10 years (Maurer 1999; Minor 1989). Most studies stated that the American College of Rheumatology diagnosis criteria were used for study inclusion. However, 'knee pain in the past week' (O'Reilly 1999) and patello-femoral knee pain (Quilty 2003) were sufficient in two studies; two other studies required at least Kellgren and Lawrence Grade III radiographic disease for study participation (Rogind 1998; Thorstensson 2005). Studies ranged from those excluding people taking non-steroidal anti-inflammatory drugs (NSAIDs) (Bautch 1997) to others only including people currently taking NSAIDs at least twice a week (Kovar 1992). Another study included prescribing a daily NSAID for all participants for the course of the study (Petrella 2000). One study targeted only overweight or obese participants (body mass index (BMI) ≥ 28 kg.m
A wide range of therapeutic exercise programs were assessed. At the most basic level, there were exercise programs delivered individually to the patient (Bennell 2005; Deyle 2000; Gur 2002; Huang 2003; Huang 2005a; Jan 1991; Maurer 1999; Quilty 2003; Schilke 1996; van Baar 1998), exercise programs delivered in a class-based format (Bautch 1997; Ettinger 1997a/Ettinger 1997b; Foley 2003; Fransen 2007; Hughes 2004; Keefe 2004; Kovar 1992; Minor 1989; Peloquin 1999; Rogind 1998; Song 2003; Thorstensson 2005), and exercise programs to be mostly undertaken by the patient at home (Baker 2001; Hay 2006; O'Reilly 1999; Petrella 2000; Thomas 2002; Topp 2002). Treatment content varied from simple quadriceps muscle strengthening using only straight leg raises (Jan 1991) and mostly aerobic walking programs (Ettinger 1997a; Kovar 1992; Messier 2004; Minor 1989; Talbot 2003) through to very complex, comprehensive programs including manual therapy, upper limb and/or truncal muscle strengthening and balance coordination (Bennell 2005; Deyle 2000; Peloquin 1999; Rogind 1998; van Baar 1998) in addition to the more usual lower limb muscle strengthening. Two studies evaluated Tai Chi classes (Fransen 2007; Song 2003).
Apart from delivery mode and content, treatment 'dosage' varied widely between studies. Monitored treatment sessions, either in individual or class-based format, ranged from 30 minutes (Bautch 1997; Hay 2006; Jan 1991; Maurer 1999; van Baar 1998) to 90 minutes (Kovar 1992) per session. The total number of monitored exercise sessions provided ranged from none (Petrella 2000; Talbot 2003) to 36 or more (Bautch 1997; Ettinger 1997a/Ettinger 1997b; Keefe 2004; Messier 2004; Minor 1989; Peloquin 1999). The total treatment duration ranged from one month (Deyle 2000) to six months (Messier 2004; Mikesky 2006; O'Reilly 1999) and two years (Thomas 2002). Treatment intensity ranged from 'maximum effort' muscle strengthening (Gur 2002; Schilke 1996) to low intensity aerobic walking (Bautch 1997; Messier 2004; Talbot 2003).
Seventeen studies used the Western Ontario and McMaster Universities Arthritis Index (WOMAC) (Baker 2001; Bennell 2005; Deyle 2000; Foley 2003; Fransen 2001; Fransen 2007; Hay 2006; Hughes 2004; Maurer 1999; Messier 2004; Mikesky 2006; O'Reilly 1999; Petrella 2000; Quilty 2003; Song 2003; Thomas 2002; Topp 2002) to score knee pain or self-reported physical function. A variety of scales were used by the other 15 studies. Nine studies used visual analogue scales (VAS) to measure pain (Bautch 1997; Bennell 2005; Gur 2002; Hopman-Rock 2000; Huang 2003; Huang 2005a; Quilty 2003; Rogind 1998; Tak 2005). Only one study included a separate patient's global assessment of treatment effectiveness (van Baar 1998).
Risk of bias in included studies
Only two of the 32 included studies claimed blinding of both participants and outcomes assessment (Bennell 2005; Petrella 2000). Bennell 2005 used sham ultrasound (US) with non-active gel as the placebo treatment; therefore, while the participants allocated to US were mostly (67%) unaware they were on a placebo allocation, participants allocated to the exercise allocation were not blinded to their intervention status. Petrella 2000 provided non-resistive 'sham exercise' and, therefore, the claim of double-blinding relied heavily on the assumption of limited insight of the study participants. With the unavoidable difficulty of masking either the study participant or the therapist to group allocation, it would seem essential to provide blinded outcomes assessment. Just over half (56%) of the 32 studies clearly stated that the outcomes assessor was blinded to group allocation (Bennell 2005; Deyle 2000; Ettinger 1997a/Ettinger 1997b; Foley 2003; Fransen 2007; Hay 2006; Hopman-Rock 2000; Huang 2005a; Maurer 1999; Messier 2004; Mikesky 2006; Peloquin 1999; Petrella 2000; Quilty 2003; Rogind 1998; Song 2003; Thomas 2002; van Baar 1998).
According to the above criteria (methodological quality assessment) a total of nine (28%) studies could be considered as achieving a 'low risk of bias' from the published report (Bennell 2005; Ettinger 1997a; Ettinger 1997b; Foley 2003; Fransen 2007; Hay 2006; Messier 2004; Quilty 2003; Thomas 2002; van Baar 1998). A further 14 (44%) were categorized as at 'moderate risk of bias' while the remaining nine (28%) had a 'high risk of bias' (see Comparison 09).
Effects of interventions
The pain outcome measure for one study (Petrella 2000) was not included as all participants were required to take daily NSAIDs, which we considered would unfairly attenuate any pain relieving benefit attributable to the exercise program. Two studies did not provide self-reported physical function as an outcome measure (Keefe 2004; Talbot 2003).
At the time of the original review, several attempts were made to contact seven authors for additional data. Four authors responded, with two able to provide the requested results for location of OA in the knee (Hopman-Rock 2000; van Baar 1998), one able to provide WOMAC scores disaggregated for pain and physical function (Deyle 2000), and one able to provide change scores for each allocation group (Thomas 2002). No contact could be established with the other three authors. Therefore, for one study a misprint assumption was made on one 'impossible' standard error of the mean score (Bautch 1997). For another study, two baseline standard deviations needed to be extrapolated from a similar sized study using the same self-report questionnaires (Maurer 1999). For the third study, the post-treatment results for the control group were used as the baseline for the active treatment groups (two-group analysis) (Ettinger 1997a/Ettinger 1997b). For the updated review, two studies recruiting participants with OA of the hip and/or OA of the knee (Foley 2003; Fransen 2007) provided data disaggregated according to the most symptomatic joint (hip or knee).
Comparison 01
All studies
The 30 included studies provided data on almost 3800 participants.
Combining all included studies providing the relevant data demonstrated a statistically significant benefit with a standardized mean difference (SMD) in a random-effects model of 0.40 (95% confidence interval (CI) 0.30 to 0.50) for self-reported pain; and SMD 0.37 (95% CI 0.25 to 0.49) for self-reported physical function. Both these effect sizes would be considered small (Cohen 1977). Between study heterogeneity was marked: I
Comparison 02
Treatment content
The 32 included studies were categorized according to the main treatment focus: simple quadriceps strengthening, lower limb muscle strengthening (Theraband, cuff weights, Cybex), strengthening and aerobic component (stationary bicycle or walking), walking program only, or 'other' (not specifically focused on lower limb muscle strengthening or increasing aerobic capacity).
The simple quadriceps programs achieved only borderline significance for both pain and physical function, and the 'other' programs resulted in a non-significant treatment effect for physical function. However, for both pain and physical function no significant difference in effect size could be demonstrated between the groups of exercise programs when testing the Chi
Comparison 03
Treatment delivery mode
All included studies were categorized according to three treatment delivery modes: individual treatments, class-based programs, or 'home' programs. However, many 'home' programs incorporated home visits by a trained nurse or community physiotherapist. Also, most individual treatments or class-based programs included provision of a home exercise program. Only one study included allocation to either individual treatments or a class-based program (Fransen 2001). Results for both these allocations were presented in the original manuscript for all participants (including those originally allocated to a waiting list control) and were presented as such for this comparison.
All three forms of treatment delivery achieved significant treatment benefits in terms of pain and physical function.
For pain, the difference in mean effect size between studies assessing individual treatments (SMD 0.55, 95% CI 0.29 to 0.81), exercise classes (SMD 0.37, 95% CI 0.24 to 0.51), or home programs (SMD 0.28, 95% CI 0.16 to 0.39) did not reach statistical significance. Similarly, for physical function the difference in mean effect size between studies assessing individual treatments (SMD 0.52, 95% CI 0.19 to 0.86), class-based sessions (SMD 0.36, 95% CI 0.19 to 0.50) or home programs (SMD 0.28, 95% CI 0.17 to 0.38) did not reach statistical significance.
Comparison 04
Number of contact occasions
All included studies were dichotomised according to the number of directly supervised sessions provided (in clinics or as home visits): fewer than 12 occasions, or 12 or more occasions.
Both categories achieved significant treatment benefits in terms of pain and physical function. However, direct supervision was clearly influential on treatment effect size. Studies evaluating programs providing fewer than 12 direct supervision occasions demonstrated only small mean effects for pain (SMD 0.28, 95% CI 0.16 to 0.40) and for physical function (SMD 0.23, 95% CI 0.09 to 0.37). Studies evaluating programs providing at least 12 direct supervision occasions demonstrated moderate mean effect sizes for pain (SMD 0.46, 95% CI 0.32 to 0.60) and physical function (SMD 0.45, 95% CI 0.29 to 0.62). The difference in mean treatment effect size between the two categories of studies was significant for pain (chidist (Chi
Comparison 05
Blinding of outcomes assessment
All included studies were categorized according to whether blinding of outcomes assessment was reported, or was either uncertain (not reported) or not part of the study design (unblinded).
Both categories achieved significant treatment benefits. However, the reported provision of blinded outcomes assessment was clearly influential on the magnitude of treatment effect. Studies stating the use of blinded outcomes assessment demonstrated small treatment effect sizes for pain (SMD 0.33, 95% CI 0.22 to 0.43) and for physical function (SMD 0.28, 95% CI 0.17 to 0.39). Studies not reporting blinded outcomes assessment or reporting unblinded outcomes assessment demonstrated moderate treatment effect sizes for pain (SMD 0.53, 95% CI 0.33 to 0.73) and for physical function (SMD 0.55, 95% CI 0.28 to 0.83). The difference in mean treatment effect size between these two study categories was significant for pain (chidist (5.64,1), P = 0.02) and for physical function (chidist (7.21,1), P = 0.01).
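The 'chidist' notation above appears to denote the upper-tail probability of a chi-squared statistic for a given number of degrees of freedom, as in the spreadsheet function of that name; this reading is an assumption. The sketch below reproduces that calculation using only the standard library, restricted to 1 or 2 degrees of freedom, for which simple closed forms exist.

```python
import math

def chidist(chi_squared: float, df: int) -> float:
    """Upper-tail P value of a chi-squared statistic (df = 1 or 2 only)."""
    if chi_squared < 0:
        raise ValueError("chi-squared statistic must be non-negative")
    if df == 1:
        # A chi-squared variable with 1 df is the square of a standard normal.
        return math.erfc(math.sqrt(chi_squared / 2))
    if df == 2:
        # With 2 df the distribution is exponential with mean 2.
        return math.exp(-chi_squared / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")

p_pain = chidist(5.64, 1)       # rounds to the reported P = 0.02
p_function = chidist(7.21, 1)   # rounds to the reported P = 0.01
```

A general implementation for any number of degrees of freedom would need the regularized incomplete gamma function, which a statistics library would normally provide.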
Comparison 06
Statistical analysis method
All included studies were categorized according to the method chosen to deal with study participants without follow-up data or dropouts: intention to treat, or efficacy analysis.
Both study categories achieved significant treatment benefits. However, the analysis method was highly influential on the magnitude of treatment effects. Studies using the more rigorous intention-to-treat analysis (all randomised participants) demonstrated small effect sizes for pain (SMD 0.36, 95% CI 0.21 to 0.51) and for physical function (SMD 0.30, 95% CI 0.16 to 0.45). Studies using efficacy analysis (only participants with follow-up data or only treatment completers) demonstrated moderate effect sizes for pain (SMD 0.45, 95% CI 0.32 to 0.58) and for physical function (SMD 0.43, 95% CI 0.23 to 0.64). The difference in mean treatment effect size between these two study categories did not reach statistical significance at the 0.05 level.
Comparison 07
Allocation concealment
All included studies were categorized according to the adequacy of allocation concealment into studies: reporting randomisation procedures providing adequate allocation concealment, and those not reporting sufficient details of the randomisation procedure to be certain that allocation was concealed.
Both study categories achieved significant treatment benefits. However, studies categorized as providing adequate allocation concealment reported small mean treatment effects for pain (SMD 0.33, 95% CI 0.23 to 0.44) and for physical function (SMD 0.28, 95% CI 0.18 to 0.38). Studies not reporting sufficient detail to be certain of adequate allocation concealment achieved moderate mean treatment effect sizes for pain (SMD 0.49, 95% CI 0.32 to 0.66) and for physical function (SMD 0.48, 95% CI 0.23 to 0.73). The difference in mean treatment effect size between these two study categories was significant for pain (chidist (3.94,1), P = 0.047) and for physical function (chidist (4.71,1), P = 0.03).
Comparison 08
Estimate of bias risk
Studies were categorized according to the estimated risk of bias: low risk, moderate risk, and high risk (see the methodological quality assessment).
All three study categories achieved significant mean treatment benefits in terms of pain and physical function. However, studies at low risk of bias demonstrated small mean treatment effect sizes for pain (SMD 0.28, 95% CI 0.15 to 0.42) and for physical function (SMD 0.25, 95% CI 0.13 to 0.38). Studies at moderate or high risk of bias demonstrated moderate mean treatment effect sizes for pain and small to moderate mean treatment effect sizes for physical function. The difference in mean treatment effect size between the three study categories was significant for pain (chidist (7.98, 2), P = 0.02) and for physical function (chidist (8.98, 2), P = 0.01).
Comparison 09
Sample size
Studies were categorised according to the sample sizes of the allocated groups: large studies with > 50 participants in each group, medium-sized studies with < 50 but more than 25 participants in each group, and small studies with fewer than 25 participants in any of the allocated groups.
All three study categories achieved significant mean treatment benefits for both pain and physical function. However, the large studies achieved small mean treatment benefits for pain (SMD 0.29, 95% CI 0.16 to 0.42) and for physical function (SMD 0.24, 95% CI 0.16 to 0.33), while both the medium and small-sized studies achieved larger mean treatment effects. The difference in mean treatment effect sizes between the three study categories was significant for both pain (chidist (8.79, 2), P = 0.01) and for physical function (chidist (9.01, 2), P = 0.01).
Comparisons 01-09
Both the mean effect sizes and 95% CIs tended to be slightly smaller with a fixed-effect model compared with the random-effects model used in this meta-analysis. However, this difference was never clinically meaningful or statistically significant.
For all sensitivity analyses (Comparisons 02 to 09), between study heterogeneity within each evaluated category remained marked in most instances.
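The comparison between fixed-effect and random-effects pooling, and the notion of between-study heterogeneity, can be illustrated with a short sketch. It assumes inverse-variance weighting and the DerSimonian-Laird estimate of between-study variance, which is one common random-effects approach and not necessarily the exact routine used for this review. The function name and the input values are hypothetical.

```python
import math

def pool_effects(effects, standard_errors):
    """Inverse-variance pooling under fixed-effect and random-effects models."""
    weights = [1 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q and the DerSimonian-Laird between-study variance (tau^2).
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1 / (se ** 2 + tau2) for se in standard_errors]
    sum_re = sum(re_weights)
    random_effect = sum(w * y for w, y in zip(re_weights, effects)) / sum_re

    # I^2: percentage of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "fixed": fixed, "se_fixed": math.sqrt(1 / sum_w),
        "random": random_effect, "se_random": math.sqrt(1 / sum_re),
        "tau2": tau2, "i2": i2,
    }

# Hypothetical SMDs and standard errors from three trials.
result = pool_effects([0.2, 0.5, 0.8], [0.1, 0.1, 0.1])
```

When the trials disagree more than chance would predict, tau2 is positive, so the random-effects standard error, and hence the confidence interval, is wider than under the fixed-effect model.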
Comparison 10
To compile the clinical relevance tables for pain and function, studies were categorised according to the outcome measure: WOMAC pain, VAS pain, and WOMAC function. All categories showed significant mean treatment benefits. Studies measuring pain using the WOMAC scale showed a significant treatment benefit (SMD 0.35, 95% CI 0.21 to 0.48), with a number needed to treat to benefit (NNTB) of 7 (Table 1), while studies using the VAS showed a similar benefit for pain (SMD 0.43, 95% CI 0.21 to 0.65; NNTB of 6; Table 2). Studies measuring physical function using the WOMAC scale also showed a significant treatment benefit (SMD 0.28, 95% CI 0.17 to 0.39; NNTB of 8; Table 3).
Discussion
This systematic review was restricted to studies evaluating land-based therapeutic exercise for people with symptomatic knee OA in terms of self-reported knee pain and disease-specific physical function. Overall, meta-analysis demonstrated that the evaluated exercise programs resulted in a mean treatment benefit for both knee pain (SMD 0.40, 95% CI 0.30 to 0.50) and physical function (SMD 0.37, 95% CI 0.25 to 0.49). These mean treatment benefits, extrapolated from 32 randomised controlled clinical trials involving almost 3800 participants, would be considered small. They are, however, comparable to reported estimates for current simple analgesics and non-steroidal anti-inflammatory drugs taken for knee pain. If the meta-analysis results are restricted to the nine studies, with a total of 2024 participants, evaluated as having a low risk of bias, land-based therapeutic exercise demonstrated smaller but still significant benefits in terms of knee pain (SMD 0.28, 95% CI 0.15 to 0.42) and physical function (SMD 0.25, 95% CI 0.13 to 0.38).
It should be noted that this review evaluated the effectiveness only in terms of self-reported pain and physical function in accordance with the recent international consensus recommendations (Bellamy 1997). However, regular exercise has been demonstrated to offer many other overall physical and mental health benefits apart from those related to OA-induced disease impairments.
Due to marked heterogeneity within the evaluated exercise programs, sensitivity analyses were conducted according to the stated main focus of the evaluated exercise program, the mode of treatment delivery, and the number of directly supervised treatment occasions. While these subgroup analyses should be viewed as exploratory because they are non-randomised comparisons, some interesting findings were demonstrated. Increasing both lower limb muscle strength and general aerobic capacity is recommended in most international guidelines for the management of knee OA. Only one study attempted to directly compare two different forms of exercise, aerobic walking and muscle strengthening, but lack of study power for this particular research question led to inconclusive results (Ettinger 1997a/Ettinger 1997b). Interestingly, meta-analysis could also not demonstrate significant differences in the magnitude of treatment effect between programs that provided either lower limb muscle strengthening or monitored walking programs compared with those that included both components (Comparison 02). However, programs targeting simple quadriceps strengthening only, or programs without a primary focus on either lower limb muscle strengthening or increasing aerobic capacity (other), appeared to be less beneficial to people with knee OA, particularly in terms of improving physical function.
We examined the influence of the exercise program delivery mode (Comparison 03). While studies assessing home programs demonstrated effect sizes for pain and physical function that were consistently lower than with either of the more closely supervised forms of treatment delivery (individual treatments or class-based programs), the differences between the various forms of treatment delivery were not statistically significant. This non-significant finding is likely to reflect the regular home or clinic visits by trained health professionals that were incorporated into several of these home programs (Baker 2001; Hay 2006; O'Reilly 1999; Thomas 2002). This hypothesis is supported by the finding that the magnitude of the treatment effect for both pain and physical function was significantly influenced by the number of directly supervised occasions provided either as home visits, monitored classes, or individual clinic-based treatments (Comparison 04). Clearly most people with knee OA need some form of ongoing monitoring or supervision to enable an exercise program to provide optimal clinical benefits.
Exercise 'dosage' is a function of frequency, intensity, and program duration and varies considerably between the studies included in this review. Uncertainties in actual dosage arise because exercise intensity depends not only upon the exercise prescription but also upon individual exertion. The influence of program duration upon dosage is difficult to quantify, with simple addition not providing a sufficiently physiologically plausible model. None of the included studies attempted to evaluate the influence of exercise dosage. Furthermore, there were insufficient studies with comparable exercise program content to provide a meaningful subgroup analysis of the influence of exercise dosage on treatment effectiveness. Specific recommendations cannot, therefore, be made about optimal dosage (frequency, intensity, duration).
To achieve a 'platinum' or 'gold' standard for the level of scientific evidence, apart from adequate allocation concealment and limited loss to follow up, blinding of both participants and outcomes assessment is usually required. This approach provides the best protection that trial results will be free of selection, performance, attrition and detection bias. Blinding of study participants is arguably impossible to achieve in studies evaluating exercise programs. Using 'sham' exercise as the control intervention is fraught with ethical concerns (substantial wasted time for control participants attending an ineffective program) and is likely to be fairly transparent to the majority of people with OA. Therefore, a slight modification to the usual criteria has been used in this systematic review, that is, only blinding of outcomes assessment was required. It is of concern, therefore, that only 18 (56%) studies reported using blinded outcomes assessment (Comparison 05); only 14 (43%) studies used an intention-to-treat analysis (Comparison 06); and only 14 (43%) studies reported adequate allocation concealment. Not unexpectedly, the nine studies evaluated as having a low risk of bias (by fulfilling all three methodological quality criteria) demonstrated significantly lower mean effect sizes for pain and physical function compared with the other studies having moderate or high risk of bias (Comparison 08).
There are some important caveats to this review. The first concerns the responsiveness of self-reported pain and physical function. Many of the studies included in this systematic review included mostly participants with early or mild symptomatic disease. Although people with early disease frequently demonstrate reduced muscle strength and aerobic capacity compared with their age and gender peers without symptomatic OA, these physiological impairments are often not yet large enough to translate into reportable difficulties on simple questionnaires. This lack of reportable difficulties would considerably reduce the potential range of improvement that was possible (ceiling effect) on self-report questionnaires in people with early or mild disease. One of the potential benefits of exercise in people with early disease, in terms of increasing their physiological reserve capacity, will not be captured by these questionnaires. Objective measures of physical performance not only strengthen the methodological quality of a study where masking to allocation is unattainable for the participant but also potentially provide data better able to discriminate between people with early disease, where disease-related impairments have not yet developed into self-reported functional limitations or disability. Secondly, regular exercise provides general health benefits beyond reducing joint symptoms. This review is, therefore, likely to be underestimating the overall beneficial effect of exercise amongst people with knee OA.
Most people with knee OA have a pattern of chronic, fluctuating symptoms. Long-term adherence to exercise, or increased leisure-time physical activity, is required to maintain the benefits of exercise. Long-term adherence, however, usually requires the stimulus of regular supervision or monitoring (Woodard 2001). Unfortunately most individuals or healthcare systems do not have sufficient resources to allow ongoing unrestricted access to individually provided treatments for chronic musculoskeletal conditions. This review could not establish a significant difference in mean benefits, in terms of knee pain or physical function, between studies assessing individual treatments, class-based programs or (usually closely individually monitored) home programs. It could, however, be argued that the class-based format potentially provides a cost-effective alternative that could be more regularly accessed by older people when introduced to community centres or gymnasiums; and that the social contact with peers, particularly those experiencing similar disease-related symptoms, is highly likely to encourage treatment adherence.
Authors' conclusions
Implications for practice
There is platinum level evidence that land-based therapeutic exercise has a benefit in terms of reduced knee pain and disability for people with knee OA. Health professionals and people with OA can be reassured that any type of exercise program that is done regularly and is closely monitored by health professionals can improve pain and physical function related to knee OA in the short term. This allows a great deal of choice, ranging from individual physiotherapy-led sessions and exercise classes to home-based programs. Exercise programs that involved more than 12 directly supervised sessions were associated with greater improvements in knee pain and physical function. The results of this meta-analysis are limited to evaluating immediate symptomatic benefits. There is still no clear evidence for the effect of regular therapeutic exercise on disease progression in people with knee OA.
Implications for research
The treatment effect for many of the studies was only modest. Multi-faceted interventions that incorporate the exercise strategies into patient care may provide greater benefit and should be tested. Initiate research to assess the effectiveness of therapeutic exercise for people with OA of the knee.
Acknowledgements
Dr Mary Bell, University of Toronto, for adjudicating on study inclusion.
Ms Louise Falzon, Mt Sinai Medical Centre, New York, for designing the literature search strategy.
Dr Renea Johnson, Coordinator, Australian Editorial Base, Cochrane Musculoskeletal Group, for overall guidance and expert advice.
Data and analyses
Appendices
Appendix 1. MEDLINE search strategy
1. exp osteoarthritis/
2. osteoarthr$.tw.
3. (degenerative adj2 arthritis).tw.
4. arthrosis.tw.
5. or/1-4
6. Knee/
7. exp Knee Joint/
8. knee$.tw.
9. or/6-8
10. exp EXERCISE/
11. exp exertion/
12. exp Physical Fitness/
13. exp Exercise Test/
14. exp Exercise Tolerance/
15. exp Sports/
16. exp PLIABILITY/
17. exp Physical Endurance/
18. exertion$.tw.
19. exercis$.tw.
20. sport$.tw.
21. ((physical or motion) adj5 (fitness or therap$)).tw.
22. (physical$ adj2 endur$).tw.
23. ((strength$ or isometric$ or isotonic$ or isokinetic$ or aerobic$ or endurance or weight$) adj5 (exercis$ or train$)).tw.
24. exp physical therapy modalities/
25. physiotherap$.tw.
26. manipulat$.tw.
27. kinesiotherap$.tw.
28. exp Rehabilitation/
29. rehab$.tw.
30. (skate$ or skating).tw.
31. run$.tw.
32. jog$.tw.
33. treadmill$.tw.
34. swim$.tw.
35. bicycl$.tw.
36. (cycle$ or cycling).tw.
37. walk$.tw.
38. (row or rows or rowing).tw.
39. muscle strength$.tw.
40. or/10-39
41. randomized controlled trial.pt.
42. controlled clinical trial.pt.
43. randomized.ab.
44. placebo.ab.
45. drug therapy.fs.
46. randomly.ab.
47. trial.ab.
48. groups.ab.
49. 41 or 42 or 43 or 44 or 45 or 46 or 47 or 48
50. humans.sh.
51. 49 and 50
52. and/5,9,40,51
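Each entry in the strategy above is identified by its position in the list, and combination entries such as or/1-4 and and/5,9,40,51 refer back to earlier entries by those positions. As a minimal sketch of that logic only, the snippet below treats each entry's result as a set of record identifiers and evaluates a combination expression against them. The function name and the record identifiers are hypothetical, and this is not how the Ovid platform itself is implemented.

```python
import re


def combine(expression, results):
    """Evaluate a combination such as 'or/1-4' or 'and/5,9,40,51'.

    `results` maps an entry's position in the strategy to the set of
    (hypothetical) record identifiers retrieved by that entry.
    """
    match = re.fullmatch(r"(or|and)/([\d,\-]+)", expression.strip())
    if match is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    operator, spec = match.groups()

    # Expand '1-4' into 1, 2, 3, 4 and keep single positions as they are.
    positions = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            positions.extend(range(int(first), int(last) + 1))
        else:
            positions.append(int(part))

    sets = [results[p] for p in positions]
    if operator == "or":
        return set().union(*sets)   # records found by any listed entry
    return set.intersection(*sets)  # records found by every listed entry
```

In this reading, the final entry keeps only records that match the osteoarthritis block, the knee block, the exercise block, and the trial filter at the same time.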
Appendix 2. Embase (Ovid) search strategy
1. exp osteoarthritis/
2. osteoarthr$.tw.
3. (degenerative adj2 arthritis).tw.
4. arthrosis.tw.
5. or/1-4
6. Knee/
7. knee$.tw.
8. 6 or 7
9. exp EXERCISE/
10. fitness/
11. exercise test/
12. exercise tolerance/
13. exp Sport/
14. pliability/
15. exp "physical activity, capacity and performance"/
16. exertion$.tw.
17. exercis$.tw.
18. sport$.tw.
19. ((physical or motion) adj5 (fitness or therap$)).tw.
20. (physical$ adj2 endur$).tw.
21. ((strength$ or isometric$ or isotonic$ or isokinetic$ or aerobic$ or endurance or weight$) adj5 (exercis$ or train$)).tw.
22. exp physiotherapy/
23. physiotherap$.tw.
24. manipulat$.tw.
25. kinesiotherap$.tw.
26. exp REHABILITATION/
27. rehab$.tw.
28. (skate$ or skating).tw.
29. run$.tw.
30. jog$.tw.
31. treadmill$.tw.
32. swim$.tw.
33. bicycl$.tw.
34. (cycle$ or cycling).tw.
35. walk$.tw.
36. (row or rows or rowing).tw.
37. muscle strength$.tw.
38. or/9-37
39. and/5,8,38
40. random$.ti,ab.
41. factorial$.ti,ab.
42. (crossover$ or cross over$ or cross-over$).ti,ab.
43. placebo$.ti,ab.
44. (doubl$ adj blind$).ti,ab.
45. (singl$ adj blind$).ti,ab.
46. assign$.ti,ab.
47. allocat$.ti,ab.
48. volunteer$.ti,ab.
49. crossover procedure.sh.
50. double blind procedure.sh.
51. randomized controlled trial.sh.
52. single blind procedure.sh.
53. or/40-52
54. exp animal/ or nonhuman/ or exp animal experiment/
55. exp human/
56. 54 and 55
57. 54 not 56
58. 53 not 57
59. 39 and 58
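The last six entries of the Embase strategy remove animal-only records from the trial filter before combining it with the topic search. A minimal sketch of that sequence, written with Python sets and hypothetical record identifiers, may make the order of operations easier to follow; the function and parameter names are illustrative assumptions.

```python
def apply_trial_and_human_filter(topic, trial_terms, animal, human):
    """Mirror the closing entries of the strategy using set operations.

    Each argument is a set of (hypothetical) record identifiers:
    topic        osteoarthritis AND knee AND exercise block
    trial_terms  combined trial-design terms
    animal       records indexed as animal or nonhuman
    human        records indexed as human
    """
    animal_and_human = animal & human                  # "54 and 55"
    animal_only = animal - animal_and_human            # "54 not 56"
    trials_kept = trial_terms - animal_only            # "53 not 57"
    return topic & trials_kept                         # "39 and 58"
```

One consequence of excluding animal-only records, rather than requiring a human index term, is that records carrying neither index term are retained.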
Appendix 3. The Cochrane Library (Wiley Interscience) search strategy
- #1 MeSH descriptor Osteoarthritis explode all trees
- #2 osteoarthr*:ti,ab
- #3 (degenerative next arthritis):ti,ab
- #4 arthrosis:ti,ab
- #5 (#1 OR #2 OR #3 OR #4)
- #6 MeSH descriptor Knee explode all trees
- #7 MeSH descriptor Knee Joint explode all trees
- #8 knee*:ti,ab
- #9 (#6 OR #7 OR #8)
- #10 MeSH descriptor Exercise explode all trees
- #11 MeSH descriptor Exertion explode all trees
- #12 MeSH descriptor Physical Fitness explode all trees
- #13 MeSH descriptor Exercise Test explode all trees
- #14 MeSH descriptor Exercise Tolerance explode all trees
- #15 MeSH descriptor Sports explode all trees
- #16 MeSH descriptor Pliability explode all trees
- #17 MeSH descriptor Physical Endurance explode all trees
- #18 exertion*:ti,ab
- #19 exercis*:ti,ab
- #20 sport*:ti,ab
- #21 ((physical or motion) near/5 (fitness or therap*)):ti,ab
- #22 (physical* near/2 endur*):ti,ab
- #23 ((strength* or isometric* or isotonic* or isokinetic* or aerobic* or endurance or weight*) near/5 (exercis* or train*)):ti,ab
- #24 MeSH descriptor Physical Therapy Modalities explode all trees
- #25 (physical next therap*):ti,ab
- #26 physiotherap*:ti,ab
- #27 manipulat*:ti,ab
- #28 kinesiotherap*:ti,ab
- #29 MeSH descriptor Rehabilitation explode all trees
- #30 rehab*:ti,ab
- #31 (skate* or skating):ti,ab
- #32 run*:ti,ab
- #33 jog*:ti,ab
- #34 treadmill*:ti,ab
- #35 swim*:ti,ab
- #36 bicycl*:ti,ab
- #37 (cycle* or cycling):ti,ab
- #38 walk*:ti,ab
- #39 (row or rows or rowing):ti,ab
- #40 muscle next strength:ti,ab
- #41 (#10 OR #11 OR #12 OR #13 OR #14 OR #15 OR #16 OR #17 OR #18 OR #19 OR #20 OR #21 OR #22 OR #23 OR #24 OR #25 OR #26 OR #27 OR #28 OR #29 OR #30 OR #31 OR #32 OR #33 OR #34 OR #35 OR #36 OR #37 OR #38 OR #39 OR #40)
- #42 (#5 AND #9 AND #41)
Appendix 4. CINAHL (EBSCOhost) search strategy
- S56 S55 and S42
- S55 S54 or S53 or S52 or S51 or S50 or S49 or S48 or S47 or S46 or S45 or S44 or S43
- S54 TI Allocat* random* or AB Allocat* random*
- S53 (MH "Quantitative Studies")
- S52 (MH "Placebos")
- S51 TI Placebo* or AB Placebo*
- S50 TI Random* allocat* or AB Random* allocat*
- S49 (MH "Random Assignment")
- S48 TI Randomi?ed control* trial* or AB Randomi?ed control* trial*
- S47 TI singl* mask* or TI doubl* mask* or TI treb* mask* or TI tripl* mask* or AB singl* mask* or AB doubl* mask* or AB treb* mask* or AB tripl* mask*
- S46 TI singl* blind* or TI doubl* blind* or TI treb* blind* or TI tripl* blind* or AB singl* blind* or AB doubl* blind* or AB treb* blind* or AB tripl* blind*
- S45 TI "clinic* trial*" or AB "clinic* trial*"
- S44 PT Clinical Trial
- S43 (MH "Clinical Trials+")
- S42 S41 and S40 and S5
- S41 S39 or S38 or S37 or S36 or S35 or S34 or S33 or S32 or S31 or S30 or S29 or S28 or S27 or S26 or S25 or S24 or S23 or S22 or S21 or S20 or S19 or S18 or S17 or S16 or S15 or S14 or S13 or S12 or S11 or S10 or S9 or S8 or S7 or S6
- S40 S8 or S7 or S6
- S39 (ti "muscle strength*") or (ab "muscle strength*")
- S38 (ti row or rows or rowing) or (ab row or rows or rowing)
- S37 (ti walk*) or (ab walk*)
- S36 (ti cycle* or cycling) or (ab cycle* or cycling)
- S35 (ti bicycl*) or (ab bicycl*)
- S34 (ti swim*) or (ab swim*)
- S33 (ti swim*) or (ab swim*)
- S32 (ti treadmill*) or (ab treadmill*)
- S31 (ti jog*) or (ab jog*)
- S30 (ti run*) or (ab run*)
- S29 (ti skate* or skating) or (ab skate* or skating)
- S28 (ti rehab*) or (ab rehab*)
- S27 (MH "Rehabilitation+")
- S26 (ti kinesiotherap*) or (ab kinesiotherap*)
- S25 (ti manipulat*) or (ab manipulat*)
- S24 (ti physiotherap*) or (ab physiotherap*)
- S23 (MH "Physical Therapy+")
- S22 TI ( strength* or isometric* or isotonic* or isokinetic* or aerobic* or endurance or weight* ) or AB ( strength* or isometric* or isotonic* or isokinetic* or aerobic* or endurance or weight* )
- S21 TI physical* n2 endur* or AB physical* n2 endur*
- S20 TI physical N5 fitness or TI physical N5 therap* or AB physical N5 fitness or AB physical N5 therap* or TI motion n5 therap* or AB motion n5 therap*
- S19 (ti sport*) or (ab sport*)
- S18 (ti exercis*) or (ab exercis*)
- S17 (ti exertion*) or (ab exertion*)
- S16 (MH "Physical Endurance+")
- S15 (MH "Pliability
- S14 (MH "Sports+")
- S13 (MH "Exercise Tolerance+")
- S12 (MH "Exercise Test+")
- S11 (MH "Physical Fitness")
- S10 (MH "Exertion+")
- S9 (MH "Exercise+")
- S8 (ti knee*) or (ab knee*)
- S7 (MH "Knee Joint
- S6 (MH "Knee")
- S5 S4 or S3 or S2 or S1
- S4 (ti arthrosis) or (ab arthrosis)
- S3 (ti degenerative N2 arthritis) or (ab degenerative N2 arthritis)
- S2 (ti osteoarthr*) or (ab osteoarthr*)
- S1 (MH "Osteoarthritis+")
Appendix 5. PEDro search strategy
- Advanced search
- Therapy: Fitness training OR Strength training
- Body Part: Lower leg or knee
What's new
Last assessed as up-to-date: 12 August 2008.
History
Review first published: Issue 3, 2003
Contributions of authors
M Fransen and S McConnell conducted the updated review.
M Fransen is the guarantor of the review.
Declarations of interest
None
Sources of support
Internal sources
- National Health and Medical Research Council, Australia.
External sources
- No sources of support supplied
Differences between protocol and review
The methods in the review have been updated since the original protocol, in accordance with the current recommended methods of the Cochrane Musculoskeletal Group.
Notes
The original protocol was for a review, 'Exercise for osteoarthritis of the hip or knee'. Since the original review, the editors decided to subdivide the review into two reviews of separate conditions.
Index terms
Medical Subject Headings (MeSH)
*Exercise Therapy; Arthralgia [rehabilitation]; Osteoarthritis, Knee [*rehabilitation]; Randomized Controlled Trials as Topic
MeSH check words
Humans
