Background
The loss of muscle strength in old age is a prevalent and disabling condition. Muscle strength declines with age such that, on average, the strength of people in their 80's is about 40 per cent less than that of people in their 20's (Doherty 1993). Muscle weakness, particularly of the lower limbs, is associated with reduced walking speed (Buchner 1996), increased risk of disability (Guralnik 1995) and falls in older people (Tinetti 1986).
Contrary to long held beliefs, the muscles of older people (i.e. people aged 60 years and older) continue to be adaptable, even into the extremes of old age (Frontera 1988). Trials have revealed that older people can experience large improvements in their muscle strength, particularly if their muscles are significantly overloaded during training (Brown 1990; Charette 1991; Fiatarone 1994). This training approach is referred to as progressive resistance training (PRT), since the participants exercise their muscles against some type of resistance which is progressively increased as strength improves.
Despite evidence of benefit from PRT in terms of improving muscle strength, there is still uncertainty about how these effects translate into changes in substantive outcomes such as a reduction in physical disability (Chandler 1998). Most studies have been under-powered to determine the effects of PRT on these outcomes or have included PRT as part of a complex intervention. In addition, there is uncertainty about the effects of PRT when more pragmatic, home or hospital-based programs are used, and about the safety and effectiveness of this intervention in older adults who have health problems and/or functional limitations. Finally, there is uncertainty about the relative benefits of PRT compared to other exercise programs, and about the effectiveness of varying doses of PRT (i.e. programs of varying intensity and duration).
Objectives
The main objective of this review was to determine whether PRT, as a single exercise intervention, reduces physical disability in older people. The specific null hypothesis tested was: in randomised controlled trials, there is no evidence that participation in PRT produces improvement in physical disability in older people. In this review, measures of physical disability included measures of daily activities and measures of physical domains of health-related quality of life.
Secondary outcomes explored included the effects of PRT on:
- functional limitations (i.e. gait speed, timed up and go test, chair rise)
- impairment (i.e. strength, balance, aerobic capacity)
- pain and vitality
- falls
- adverse events
- hospitalization, health service use and death.
Methods
Criteria for considering studies for this review
Types of studies
Any randomised or quasi-randomised clinical trials meeting the specifications below were included. Quasi-randomised trials are those that allocate participants to treatment groups by a method that is not strictly random, such as date of birth or alternation. All non-randomised controlled trials (e.g. controlled before and after studies) were excluded.
Types of participants
Older people, resident in institutions or at home in the community. Trials were included if the mean age of participants was 60 or over, but excluded if participants aged less than 50 were enrolled. The participants could include frail or disabled older people, people with identified diseases or health problems, or fit and healthy people.
Types of interventions
Any trial that had one group of participants who received PRT as a primary intervention was considered for inclusion. Progressive resistance strength training was defined as a strength training programme in which the participants exercised their muscles against an external force that was set at a specific intensity for each participant, and in which this resistance was adjusted throughout the training programme. The type of resistance used included elastic bands or tubing (i.e. therabands), cuff weights, free weights, isokinetic machines or other weight machines. This type of training could take place in individual or group exercise programmes, and in a home-based or gymnasium/clinic setting. Studies that utilised only isometric exercises were excluded. Studies that included balance, aerobic or other training as part of the exercise intervention (and not simply as part of the warm-up or cool-down) were also excluded.
We found the following comparisons between groups in the trials:
- PRT versus no exercise (greatest difference between groups was expected)
- PRT versus regular care (including regular therapy or exercise)
- PRT versus another type of exercise (smaller difference between groups expected)
- low intensity or frequency of PRT versus high intensity or frequency of PRT (greatest effect expected in the higher intensity groups)
Types of outcome measures
The primary outcome of this review was physical disability. This was assessed as a continuous variable. The outcomes were categorized based on the Nagi model of health states (Nagi 1991). In this model, disability is considered to be a limitation in performance of socially defined roles and tasks that can relate to self-care, work, family etc. In this review, the primary assessment of physical disability included the evaluation of self-reported measures of activities of daily living (ADL, i.e. the Barthel Index) and the physical domains of health-related quality of life (HRQOL, i.e. the physical function domain of the SF-36). Data from these measures were pooled for the main analysis of physical disability. However, because these two types of measures (ADL and physical domains of HRQOL) evaluate different health concepts, they were also evaluated in separate analyses. The Nagi model also includes firstly, the domain of 'functional limitations' which are limitations in performance at the level of the whole person and includes activities such as walking, climbing or reaching, and secondly, 'impairments' that are defined as anatomical or physiological abnormalities.
Since the protocol of this review was written, the International Classification of Functioning, Disability and Health (ICF) has been released (WHO 2001). Under this system, disability is an umbrella term for impairments, activity limitations and participation restrictions. Using the ICF, the outcome measures evaluated in this review fall under the domains of impairments, limitations in simple activities (similar to 'functional limitations' in Nagi's system) and limitations in complex activities (similar to some aspects of disability in Nagi's model).
The following secondary outcomes were assessed as continuous variables:
Measures of impairment (outcome comparisons 1 and 2)
- muscle strength (e.g. 1 repetition maximum test, isokinetic and isometric dynamometry)
- aerobic capacity (e.g. 6 minute walk test, VO2 max)
Measures of functional limitation (simple physical activities)
- balance (e.g. Berg Balance Scale, Functional Reach Test)
- gait speed, timed walk
- timed 'up-and-go' test
- chair rise (sit to stand)
The dichotomous secondary outcomes assessed were adverse events, admission to hospital and death. The effect of PRT on falls was also evaluated, although these outcomes are considered in a separate Cochrane review. Pain and vitality measures were evaluated as continuous outcomes, and were used to provide additional information about the potential adverse effects or benefits of PRT.
In the original protocol for this review, measures of fear of falling and participation in social activities were also included as outcomes. However, when the size and complexity of this review became apparent, the authors decided to limit this review to assessments of physical disability as this was the prespecified primary aim of the review. Therefore, these outcomes are not included in the current review. In addition, the protocol also stated that assessments of disability using the Barthel Index and Functional Independence Measure (FIM) would be dichotomised. However, as no trials included the FIM as an outcome and only three trials used the Barthel, the decision was made to report these data as continuous outcomes only.
Search methods for identification of studies
We searched the Cochrane Bone, Joint and Muscle Trauma Group specialized register (to August 2002), the Cochrane Central Register of Controlled Trials (The Cochrane Library Issue 2, 2002), MEDLINE (1966 to February 1, 2002), EMBASE (1980 to February 1, 2002), CINAHL (1982 to February 1, 2002), SPORTDiscus (1948 to February 2002), PEDro - The Physiotherapy Evidence Database (http://ptwww.cchs.usyd.edu.au/pedro/; accessed February 1, 2002) and Digital Dissertations (accessed February 1, 2002). We contacted authors and searched reference lists of identified studies and reviews (Anonymous 2001; Buchner 1993; Chandler 1996; Fiatarone 1993; Keysor 2001; King 1998; King 2001; Mazzeo 1998; Singh 2002), and also handsearched the following conference proceedings:
- 16th International Association of Gerontology World Congress; 1997; Adelaide (Australia).
- 17th International Association of Gerontology World Congress; 2001; Vancouver (Canada).
- Proceedings of the 13th World Congress of Physical Therapy; 1995; Washington (DC).
- Proceedings of the 14th World Congress of Physical Therapy; 1999; Japan.
- New Zealand Association of Gerontology Conferences - 1996 Dunedin, 1999 Wellington and 2002 Auckland (New Zealand).
In MEDLINE (OVID Web) the subject specific search strategy (see Appendix 1) was combined with the optimal search strategy (Clarke 2001) and modified for use in other databases.
Data collection and analysis
Selection of studies
One of the reviewers (NL) performed the search strategy on the databases and downloaded the information to a file. Two reviewers (NL, CS) reviewed the titles, descriptors or abstracts identified from all literature searches to identify potentially relevant trials for full review. A copy of the full text of all trials that appeared to be potentially suitable for the review was obtained. The two principal reviewers independently used previously defined inclusion criteria to select the trials. The reviewers attempted to reach a consensus if they disagreed about the inclusion of a trial. A third reviewer (CA) was asked to participate in the final decision if disagreement persisted.
Data extraction and management
Two reviewers independently extracted the data and recorded information on a standardised paper form. They considered all primary and secondary outcomes. If the data were not reported in a form that enabled quantitative pooling, the authors were contacted for additional information. If the authors could not be contacted or if the information was no longer available, the trial was not included in the pooling for that specific outcome.
Assessment of risk of bias in included studies
In this review, risk of bias is implicitly assessed in terms of methodological quality.
The methodological quality of each trial was independently assessed by two reviewers (NL, CS) using a scoring system that was based on the Cochrane Bone, Joint and Muscle Trauma Group evaluation tool. The reviewers were blinded to the authors, the institution, the journal in which the trial was published and the results of the trial. A third reviewer (CA) was consulted if the two reviewers could not reach a consensus about trial quality. The following aspects of internal and external validity were assessed:
A. Was the assigned treatment adequately concealed prior to allocation?
2 = method did not allow disclosure of assignment.
1 = small but possible chance of disclosure of assignment or unclear.
0 = quasi-randomised or open list/tables.
B. Were the outcomes of patients/participants who withdrew described and included in the analysis (intention to treat)?
2 = withdrawals well described and accounted for in analysis.
1 = withdrawals described and analysis not possible.
0 = no mention, inadequate mention, or obvious differences and no adjustment.
C. Were the outcome assessors blind to treatment status?
2 = effective action taken to blind assessors.
1 = small or moderate chance of unblinding of assessors.
0 = not mentioned, or not possible.
D. Were the treatment and control group comparable at entry? Specifically, were the groups comparable with respect to age, medical co-morbidities (one or more of history of coronary artery disease, stroke, hypertension, diabetes, chronic lung disease), pre-entry physical dependency (independent vs dependent in self-care ADL) and mental status (clinical evidence of cognitive impairment, yes or no)?
2 = good comparability of groups, or confounding adjusted for in analysis.
1 = confounding small; mentioned but not adjusted for.
0 = large potential for confounding, or not discussed.
E. Were care programmes, other than the trial options, identical?
2 = care programmes clearly identical.
1 = clear but trivial differences.
0 = not mentioned or clear and important differences in care programmes.
F. Were the inclusion and exclusion criteria clearly defined?
2 = clearly defined.
1 = inadequately defined.
0 = not defined.
G. Were the interventions clearly defined?
2 = clearly defined interventions are applied with a standardised protocol.
1 = clearly defined interventions are applied but the application protocol is not standardised.
0 = intervention and/or application protocol are poorly or not defined.
H. Were the outcome measures used clearly defined?
For our primary outcome, physical disability in terms of self-report measures of physical function, we considered the outcome clearly defined if a validated and standardised scale was used and the method of data collection was clearly described.
Our secondary outcome measures included gait speed, muscle strength (e.g. one repetition maximum test, isokinetic and isometric dynamometry), balance (e.g. Berg Balance Scale, Functional Reach Test), aerobic capacity, and chair rise. These secondary outcomes were considered well defined if validated and standardised measures were used, and the method of data collection and scoring of any scales was clearly described.
2 = clearly defined measures and the method of data collection and scoring are clearly described
1 = inadequately defined measures
0 = not defined.
I. Was the surveillance active and of clinically appropriate duration (i.e. at least 3 months)?
2 = active and appropriate duration (three months follow-up or greater).
1 = active but inadequate duration (less than three months follow-up).
0 = not active or surveillance period not defined.
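The nine items above (A to I), each scored 0, 1 or 2, can be held as a simple per-trial record. The following is a minimal sketch only, not the reviewers' actual tool: the function names and the example scores are hypothetical, and no total score is computed because the review reports scores item by item.

```python
# Hypothetical sketch: one record per trial, with items A to I each
# scored 0, 1 or 2 as in the list above. Example values are invented.
QUALITY_ITEMS = "ABCDEFGHI"
ALLOWED_SCORES = {0, 1, 2}


def validate_quality_scores(scores):
    """Return the scores as a dict ordered A..I, or raise ValueError."""
    unknown = sorted(set(scores) - set(QUALITY_ITEMS))
    missing = [item for item in QUALITY_ITEMS if item not in scores]
    if unknown or missing:
        raise ValueError(f"unknown items {unknown}, missing items {missing}")
    for item in QUALITY_ITEMS:
        if scores[item] not in ALLOWED_SCORES:
            raise ValueError(f"item {item}: score must be 0, 1 or 2")
    return {item: scores[item] for item in QUALITY_ITEMS}


def items_with_score(scores, value):
    """List the items on which a trial received the given score."""
    checked = validate_quality_scores(scores)
    return [item for item, score in checked.items() if score == value]


# Invented example trial, for illustration only.
example_trial = {"A": 2, "B": 1, "C": 2, "D": 2, "E": 0,
                 "F": 2, "G": 2, "H": 1, "I": 1}
print(items_with_score(example_trial, 2))
```

Keeping the item-level scores separate makes it straightforward to stratify trials by a single design feature, such as item C (blinded outcome assessment).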
Data synthesis
Where it was thought appropriate, the results from the studies were combined. Data synthesis was carried out using MetaView in Review Manager version 4.0.4. For continuous outcomes, weighted mean differences (WMD) and 95% confidence intervals (CI) were calculated when similar measurement units were used. To pool outcomes using different units, standardised units (i.e. standardised mean differences, SMD) were created as appropriate. We employed Hedges' adjusted g for the standardised mean difference. This is very similar to Cohen's d, but includes an adjustment for small sample bias. We planned to calculate relative risks and 95% CI for dichotomous outcomes, where possible.
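The two effect measures for continuous outcomes can be sketched as follows. This is an illustration under stated assumptions rather than the exact computation performed by Review Manager 4.0.4: the small-sample correction and the standard error of g use commonly cited approximations, and the function names and example numbers are hypothetical.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, se): Hedges' adjusted g and an approximate standard error."""
    n_total = n_t + n_c
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_total - 2)
    )
    cohen_d = (mean_t - mean_c) / pooled_sd
    # Small-sample adjustment (a common approximation to the exact factor).
    correction = 1 - 3 / (4 * n_total - 9)
    g = cohen_d * correction
    # Approximate standard error of g (assumed formula, see lead-in).
    se = math.sqrt(n_total / (n_t * n_c) + g ** 2 / (2 * (n_total - 3.94)))
    return g, se


def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (difference, se) for outcomes measured in the same units."""
    diff = mean_t - mean_c
    se = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    return diff, se


# Invented example: 20 participants per group, SD of 4 in both groups.
g, se_g = hedges_g(10.0, 4.0, 20, 8.0, 4.0, 20)
print(round(g, 3), round(se_g, 3))
```

In this invented example Cohen's d is 0.5 and the adjustment shrinks it slightly, which is the small-sample correction described above.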
For each outcome, a test of statistical heterogeneity was carried out, with P<0.1 taken as the threshold for statistical heterogeneity. If only minimal statistical heterogeneity existed (i.e. the test did not reach this threshold), fixed effects meta-analysis was performed. If substantial statistical heterogeneity did exist, the reviewers looked for possible explanations. Specifically, we considered differences in age and baseline disability of the study participants, the methodological quality of the trials and the intensity and duration of the interventions. If the statistical heterogeneity could be explained, the reviewers considered the possibility of presenting the results as sub-group analyses. If the statistical heterogeneity could not be explained, the reviewers considered not combining the studies at all, using a random effects model with cautious interpretation or using both fixed and random effects models to assist in explaining the uncertainty around an analysis with heterogeneous studies. When adequate data were available, funnel plots were created to assess whether small study bias influenced the results.
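The pooling steps described above can be sketched in a few lines. The sketch assumes inverse-variance weighting, Cochran's Q as the heterogeneity test and the DerSimonian and Laird estimator for the random effects model; these are common choices, but the review does not state which estimator was used, and the function names and input values are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (series expansion)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    # Series for the regularised lower incomplete gamma function.
    while term > 1e-15 * total:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def pool(effects, std_errors):
    """Fixed and random effects pooled estimates with a heterogeneity test."""
    if len(effects) < 2 or len(effects) != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    weights = [1 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q and its degrees of freedom.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # Between-study variance (DerSimonian and Laird, truncated at zero).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    re_weights = [1 / (se ** 2 + tau2) for se in std_errors]
    sum_rw = sum(re_weights)
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum_rw
    return {
        "fixed": fixed, "fixed_se": math.sqrt(1 / sum_w),
        "random": random, "random_se": math.sqrt(1 / sum_rw),
        "q": q, "df": df, "p_heterogeneity": chi2_sf(q, df), "tau2": tau2,
    }


# Invented example: two trials with very different effect estimates.
result = pool([0.2, 0.8], [0.1, 0.1])
print(result["fixed"], result["random"], result["p_heterogeneity"])
```

The heterogeneity P value returned here can be compared with the 0.1 threshold mentioned above. A 95% CI for either pooled estimate is the estimate plus or minus 1.96 times its standard error, so the wider random effects interval reflects the added between-study variance.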
Sensitivity analyses were conducted to assess the effect of differences in methodological quality (including blinding of outcome assessors and intention to treat analysis), intensity of the intervention, duration of training, patient characteristics and the intervention the control group received.
Results
Description of studies
See: Characteristics of included studies; Characteristics of excluded studies.
Please see the Characteristics of Included Studies table.
Sixty-six trials with 3783 participants at entry were included in this review. Four studies were published only as abstracts and/or theses (Collier 1997; Fiatarone 1997; Moreland 2001; Newnham 1995). There was variation across the trials in the characteristics of the participants, the design of the PRT programs, the interventions provided for the comparison group and the outcomes assessed. More detailed information is provided in the Characteristics of Included Studies table; a brief summary is provided here.
Participants
The participants in 38 trials were healthy older people. In the remaining 28 studies, the participants had a health problem, functional limitation and/or were residing in a hospital or residential care. Fifteen of these 28 trials included older people with a specific medical condition, including osteoarthritis (Baker 2001; Ettinger 1997; Maurer 1999; Schilke 1996), peripheral arterial disease (Hiatt 1994; McGuigan 2001), recent stroke (Moreland 2001), congestive heart failure (Pu 2001; Tyni-Lenne 2001), chronic airflow limitation (Simpson 1992), clinical depression (Singh 1997), low bone-mineral density (Parkhouse 2000), obesity (Ballor 1996), chronic renal insufficiency (Castaneda 2001) and coronary artery bypass graft surgery three or more months before exercise training (Maiorana 1997). Eleven other trials recruited participants who did not have a specific health problem, but were considered frail and/or to have a functional limitation (Chandler 1998; Fiatarone 1994; Fiatarone 1997; Hennessey 2001; Jette 1999; Latham 2002; McMurdo 1995; Mihalko 1996; Newnham 1995; Skelton 1996; Westhoff 2000). In four of these studies the participants resided in a rest-home or nursing home (Fiatarone 1994; McMurdo 1995; Mihalko 1996; Newnham 1995). Two additional studies included participants who were in hospital at the time the exercise program was carried out (Donald 2000; Latham 2001). In the other trials, most or all of the participants lived in the community.
Most studies included both men and women, although six trials included only men (Hagerman 2000; Haykowsky 2000; Hepple 1997; Maiorana 1997; Newnham 1995; Sartorio 2001) and 13 trials included only women (Charette 1991; Damush 1999; Flynn 1999; Jones 1994; Kerr 2001; Nelson 1994; Nichols 1993; Parkhouse 2000; Pu 2001; Rhodes 2000; Sipila 1996; Skelton 1995; Taaffe 1996). In 32 studies the mean or median age of the participants was 60-69, in 23 studies the mean/median age was 70-79 and in 10 studies it was 80 years or greater.
PRT programmes
Most training programs took place in gym or clinic settings with all sessions fully supervised. Seven studies were entirely home-based (Baker 2001; Chandler 1998; Fiatarone 1997; Jette 1996; Jette 1999; Latham 2002; McMurdo 1995), while seven additional studies carried out some of the training at home and some in gym/clinic settings (Ettinger 1997; Jones 1994; Skelton 1995; Skelton 1996; Topp 1993; Topp 1996; Westhoff 2000).
The resistance training programs in most trials (i.e. 47 trials) involved high intensity training. Most of these trials used specialized exercise machines for training. Twelve trials used low- to moderate-intensity training, with most using elastic tubing or bands. All of the high-intensity training was carried out at least in part in gym or clinic based settings, with the exception of two published trials (Baker 2001; Latham 2002) and a trial published as an abstract (Fiatarone 1997).
The frequency of training was consistent across studies, with the exercise program carried out two to three times a week in almost all trials. Two exceptions to this were the two trials conducted in hospital which carried out the exercises on a daily basis (Latham 2001; Donald 2000). In contrast, there was large variation in the duration of the exercise programs and the number of exercises performed in each program. Although most of the programs (i.e. 38 trials) were eight to 12 weeks long, the duration ranged from two to 78 weeks. In 27 trials the exercise program was longer than 12 weeks. The number of exercises performed also varied, from one to more than 14.
Data about adherence to the PRT program are reported in the Included Studies table. These data are difficult to interpret because different definitions of adherence or compliance were used across the trials. In most trials, adherence referred to the percentage of exercise sessions attended compared to the total number of prescribed sessions. Many studies reported these data only for participants who completed the entire trial (i.e. excluded drop-outs), while some trials reported these data with drop-outs included.
Comparison interventions
Sixty-two trials compared PRT to a control group. In eleven trials PRT was compared to an aerobic training program (Ballor 1996; Buchner 1997; Earles 2001; Ettinger 1997; Hepple 1997; Hiatt 1994; Jubrias 2001; Kerr 2001; Pollock 1991; Sipila 1996; Wood 2001). Eight of these studies also included a control (non-exercise) group. One study compared PRT to balance training, combined PRT and balance training or a control group (Judge 1994). Another study compared PRT to functional training and to a control group (McMurdo 1995). Five studies compared PRT programs of different intensities (Hortobagyi 2001; Hunter 2001; Taaffe 1996; Tsutsumi 1997; Vincent 2002). One trial compared PRT performed at different frequencies (i.e. once, twice or three times per week; Taaffe 1999).
Outcomes
A variety of outcomes were assessed in these studies. Six studies did not report final means and standard deviations for some or all of their outcome measures but instead reported mean baseline scores and mean change in score from baseline, and additional data could not be obtained from the investigators (Buchner 1997; Chandler 1998; Fiatarone 1994; Hiatt 1994; Jette 1996; Topp 1996). The final mean score was calculated for these studies by adding the change in score to the baseline score, and the standard deviation of the baseline score was used for the final score.
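The imputation described above amounts to a one-line calculation per study arm. The sketch below restates it in code; the function name and the example values are hypothetical and are not taken from any included trial.

```python
def impute_final_score(baseline_mean, baseline_sd, mean_change):
    """Estimate the final mean and SD when only change scores are reported.

    The final mean is the baseline mean plus the mean change, and the
    baseline SD is carried forward as the SD of the final score.
    """
    final_mean = baseline_mean + mean_change
    final_sd = baseline_sd
    return final_mean, final_sd


# Invented example: gait speed of 1.0 m/s (SD 0.25) improving by 0.5 m/s.
print(impute_final_score(1.0, 0.25, 0.5))
```

Carrying the baseline SD forward is a simplification: if the spread of scores changed during the trial, the imputed SD will not reflect this.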
Excluded studies
The excluded studies and their reasons for exclusion are listed in the Characteristics of Excluded Studies table. The main reasons for exclusion were that the studies used a combination of exercise interventions (i.e. not resistance training alone), the strength training program did not use a progressive resistance approach, the participants were not elderly (i.e. did not have a mean age of at least 60 years and/or included some participants below 50 years of age), the study design caused serious threats to its internal validity or the study was not a randomised controlled trial.
Risk of bias in included studies
Methodological quality scores of each item for all included studies are given in Table 1. A summary of the findings for key indicators of internal validity is given below.
Blinded outcome assessment
Thirteen studies stated that they used a blinded assessor for all outcome measures (Buchner 1997; Chandler 1998; Ettinger 1997; Jette 1996; Jette 1999; Jones 1994; Judge 1994; Latham 2002; Maurer 1999; McMurdo 1995; Moreland 2001; Newnham 1995; Westhoff 2000). Five additional studies used a blinded outcome assessor for some, but not all outcome assessments (Baker 2001; Castaneda 2001; Fiatarone 1994; Pu 2001; Singh 1997).
Blinding of participants
Blinding of participants is difficult in studies of exercise interventions. However, the use of attention control groups (i.e. control group receives matching attention) can help to minimise bias. Eighteen studies used some type of attention program for the control group (Baker 2001; Castaneda 2001; Damush 1999; Ettinger 1997; Fiatarone 1994; Fiatarone 1997; Judge 1994; Latham 2002; Maurer 1999; McCartney 1995; McMurdo 1995; Mihalko 1996; Moreland 2001; Newnham 1995; Pu 2001; Singh 1997; Topp 1993; Topp 1996). In three of these studies the control group received 'sham' exercise programs (Baker 2001; Castaneda 2001; Pu 2001).
Intention-to-treat analysis
Nine studies stated that they used intention to treat analysis (Baker 2001; Buchner 1997; Ettinger 1997; Fiatarone 1994; Judge 1994; Latham 2002; Moreland 2001; Nelson 1994; Pu 2001).
Concealed randomisation
Twenty-one studies provided some information about the method of randomisation that was used which suggested that randomisation was probably concealed and/or randomisation lists were appropriately generated (Baker 2001; Buchner 1997; Chandler 1998; Donald 2000; Earles 2001; Fiatarone 1994; Haykowsky 2000; Judge 1994; Kerr 2001; Latham 2001; Latham 2002; Maurer 1999; McMurdo 1995; Moreland 2001; Nichols 1993; Pollock 1991; Sartorio 2001; Simpson 1992; Singh 1997; Skelton 1995; Vincent 2002).
Study size
Most of the studies were small, with fewer than 40 participants in total, but seven studies had 100 or more participants in total in the PRT and control groups (Chandler 1998; Ettinger 1997; Jette 1996; Jette 1999; Latham 2002; McCartney 1995).
Loss to follow-up
Some trials had high drop-out rates, with several studies reporting more than 20 per cent of their participants lost to follow-up (Donald 2000; Topp 1996). In some studies there was clear evidence of bias associated with the excluded patients, with people who failed to adhere to the exercise program (Topp 1996) or those who had adverse responses (Hagerman 2000) deliberately excluded. When the numbers of drop-outs from the PRT and control groups were compared, there were about 48 per cent more drop-outs in the PRT group (219 drop-outs) than in the control group (148 drop-outs).
Effects of interventions
See Graphs
PRT versus control
Measures of impairment
Strength
Many different muscle groups were tested and a number of methods were used to evaluate muscle strength in these trials. To minimise clinical heterogeneity, data were pooled from one muscle group. The leg extensor group of muscles was selected since this group was the most frequently evaluated. The effect size was calculated using standardised mean difference (SMD) to allow the pooling of data that used different units of measurement. Forty-one studies involving 1955 participants reported the effect of resistance training on a lower-limb extensor muscle group and provided data that would allow pooling. A moderate-to-large beneficial effect was found with the SMD 0.68 (95% CI 0.52 to 0.84) using a random effects model (fixed effects estimate 0.48; 95% CI 0.39 to 0.57).
Significant statistical heterogeneity was apparent in these data (P<0.0001). Since a large number of studies assessed this outcome, it was possible to explore this heterogeneity by stratifying the data. Differences in treatment effects due to the quality of the trials were investigated. We also explored subgroups of trials that were based on the design of the treatment programs and the characteristics of the participants.
To explore the effect of data quality on treatment effects, data were stratified by four design features that are associated with improved internal validity: the use of blinded assessors, attention control groups, concealed allocation and intention to treat analysis (ITT). Random effects models were used for all analyses. The effect estimates were lower in studies that used blinded assessors (10 trials, 1010 participants, SMD 0.29, 95% CI 0.12 to 0.47) compared to studies that did not use blinded assessors (31 trials, 945 participants, SMD 0.83, 95% CI 0.64 to 1.01). This was also true for studies that reported the use of concealed allocation (concealed allocation: nine studies, 570 participants, SMD 0.38, 95% CI 0.38 to 0.70; non-concealed allocation: 32 studies, 1385 participants, SMD 0.78, 95% CI 0.60 to 0.96) and intention to treat analysis (ITT: seven trials, 656 participants, SMD 0.33, 95% CI 0.05 to 0.61; no ITT: 34 trials, 1299 participants, SMD 0.76, 95% CI 0.59 to 0.93). The use of attention control groups did not appear to reduce the effect estimates (attention control: 12 studies, 830 participants, SMD 0.63, 95% CI 0.31 to 0.94; no attention control: 29 studies, 1125 participants, SMD 0.70, 95% CI 0.52 to 0.87).
Subgroup analyses were conducted to explore the effect of PRT when the design of the exercise program and the characteristics of the participants differed. The effect of differences in the exercise program was explored by examining effect estimates in studies that used different training intensities and durations. High intensity strength training was compared with low to moderate intensity training. This analysis suggests that while both training approaches are probably effective in improving strength, higher intensity training has a larger effect on strength than low to moderate intensity training (high intensity: 32 trials, SMD 0.81, 95% CI 0.60 to 1.01; low-moderate intensity: nine trials, SMD 0.34, 95% CI 0.18 to 0.51). Longer duration programs (i.e. greater than 12 weeks) were also compared with shorter duration programs (less than 12 weeks). The duration of the program appeared to have minimal effect on the strength outcome (<12 weeks of training: 25 trials, SMD 0.62, 95% CI 0.42 to 0.82; >12 weeks of training: 16 trials, SMD 0.77, 95% CI 0.50 to 1.05). Considerable statistical heterogeneity still exists across these data, which suggests that the design of the exercise program does not explain all of the variation in effect estimates across trials.
Treatment effects in older people with and without a chronic disease (or functional limitation) were also assessed. Again, resistance training appeared to be effective in improving strength in both groups of older people, but there was statistical heterogeneity in the effects. Studies that included participants who had specific health problems and/or functional limitations were compared with studies that included only healthy older people. There was no apparent difference in the effect of treatment between trials that recruited healthy older people and trials that recruited people with specific health problems (healthy older people: 26 studies, 939 participants, SMD 0.69, 95% CI 0.51 to 0.86 versus people with specific health problems: 15 studies, 1016 participants, SMD 0.66, 95% CI 0.38 to 0.93). On the other hand, PRT appeared to be less effective in studies that included people who had a physical disability or functional limitation than in studies that included people who did not have functional limitations (people with no functional limitations: 32 studies, 1084 participants, SMD 0.76, 95% CI 0.59 to 0.94 versus people with functional limitations: nine studies, 871 participants, SMD 0.36, 95% CI 0.11 to 0.60). However, this result could be confounded by the intensity of the PRT programs, as almost all programs that included people with functional limitations were carried out at a low to moderate intensity.
Aerobic capacity
The main measure of aerobic capacity was pooled from 16 studies (n = 777) using a fixed effect model (Ades 1996; Buchner 1997; Chandler 1998; Ettinger 1997; Hagerman 2000; Hiatt 1994; Maiorana 1997; McGuigan 2001; Moreland 2001; Pollock 1991; Pu 2001; Rall 1996; Simpson 1992; Singh 1997; Tsutsumi 1997; Tyni-Lenne 2001). These data suggest that PRT has a small, non-significant effect on aerobic capacity (SMD 0.13, 95% CI -0.02 to 0.27). Different measures of aerobic capacity were used, so further analyses were performed using WMD to pool data from two specific measures of aerobic capacity: VO2 max, measured in ml/kg/min, and the Six-Minute Walk Test, measured in metres. A consistent non-significant effect was found when data from measures of VO2 max alone (Ades 1996; Buchner 1997; Ettinger 1997; Hagerman 2000; Hiatt 1994; Maiorana 1997; Pollock 1991; Pu 2001; Rall 1996; Tsutsumi 1997; Tyni-Lenne 2001) were combined (11 trials, n = 496, WMD 0.47 ml/kg/min, 95% CI -0.03 to 0.97). However, when data from the Six-Minute Walk Test were combined (six trials, n = 212; Chandler 1998; McGuigan 2001; Pu 2001; Simpson 1992; Singh 1997; Tyni-Lenne 2001), a significant positive effect was found (WMD 53.69 metres, 95% CI 27.03 to 80.36).
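The distinction between the significant and non-significant results above follows from whether the 95% CI excludes zero. As a rough check, a standard error, z statistic and two-sided P value can be recovered from a reported estimate and its interval. The sketch below assumes a normal approximation and a symmetric interval; the function name is hypothetical, and because the inputs are rounded published values the results are approximate.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def summary_from_ci(estimate, lower, upper):
    """Recover (se, z, two-sided p) from an estimate and its 95% CI."""
    se = (upper - lower) / (2 * Z_95)
    z = estimate / se
    # Two-sided P value under the normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


reported = [
    ("All aerobic capacity measures (SMD)", 0.13, -0.02, 0.27),
    ("VO2 max (WMD, ml/kg/min)", 0.47, -0.03, 0.97),
    ("Six-Minute Walk Test (WMD, metres)", 53.69, 27.03, 80.36),
]
for label, estimate, lower, upper in reported:
    se, z, p = summary_from_ci(estimate, lower, upper)
    verdict = "P < 0.05" if p < 0.05 else "P >= 0.05"
    print(f"{label}: SE about {se:.2f}, z about {z:.2f}, {verdict}")
```

Only the Six-Minute Walk Test interval excludes zero, which is consistent with the pattern of results described above.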
Measures of functional limitations (simple physical activities)
Balance/ postural control
Results from all balance performance measures were pooled using SMD and a fixed effect model. A SMD of 0.11 (95% CI -0.03 to 0.25) was found in 12 studies with 789 participants (Buchner 1997; Chandler 1998; Jette 1999; Judge 1994; Latham 2001; Latham 2002; Newnham 1995; Schlicht 1999; Skelton 1995; Skelton 1996; Topp 1993; Westhoff 2000), suggesting a small, non-significant benefit (higher score indicates better balance). Because a variety of balance outcome measures were used in these trials, we explored further the effect of PRT when two different types of balance measures were pooled separately. The first type were simple measures of the time that a person is able to maintain postural control under different conditions (e.g. single leg stance, tandem standing) and the second type were measures of postural control during more complex activities (e.g. the Functional Reach Test and the Berg Balance test). The simple timed measures (five studies, 187 participants; Buchner 1997; Judge 1994; Schlicht 1999; Topp 1993; Westhoff 2000) found a SMD of 0.16, 95% CI -0.13 to 0.45, while the measures of more complex balance activities (seven studies, 602 participants; Chandler 1998; Jette 1999; Latham 2001; Latham 2002; Newnham 1995; Skelton 1995; Skelton 1996) found an overall SMD of 0.10, 95% CI -0.06 to 0.26. Both of these are consistent with the overall finding of a non-significant effect on balance.
Gait speed
Two different measures of walking speed were used: gait speed (measured in metres per second) and timed walk (i.e. time to walk a set distance, measured in seconds). A higher gait speed score indicates faster mobility, while a higher timed walk score indicates slower mobility. Because of this difference, these data were analysed separately using fixed effect models and WMD. Gait speed was measured in 14 studies (Brandon 2000; Buchner 1997; Chandler 1998; Fiatarone 1994; Judge 1994; Latham 2002; Newnham 1995; Schlicht 1999; Singh 1997; Sipila 1996; Skelton 1995; Topp 1993; Topp 1996; Tyni-Lenne 2001) that included 798 participants and produced a WMD of 0.07 m/s (95% CI 0.04 to 0.09), indicating that PRT has a modest but significant beneficial effect on gait speed. The limited number of trials and participants using the timed walk (seconds) as an outcome measure (Donald 2000; Latham 2001; Skelton 1996; Westhoff 2000) limited the analyses for this method of describing gait speed, and no evidence of an effect was found (81 participants, WMD 0.77 seconds, 95% CI -0.65 to 2.20).
Timed up-and-go
Timed up-and-go (i.e. time to stand, walk three metres, turn, and return to sitting, measured in seconds) was analysed using a fixed effect model and WMD. The timed up-and-go was analysed in six trials and a total of 494 participants (Jette 1999; Latham 2001; Latham 2002; Newnham 1995; Skelton 1996; Westhoff 2000). The WMD was -1.23 seconds (95% CI -2.80 to 0.35, lower score indicates improvement), which is consistent with no benefit from PRT on this mobility task.
Timed chair rise
The time to stand up from a sitting position was measured in four studies (n = 185; Brandon 2000; Judge 1994; Singh 1997; Skelton 1996). Because different numbers of sit-to-stands were counted, SMD were used to pool these results. A SMD of -0.67 (95% CI -1.31 to -0.02) was found, indicating a significant, moderate to large effect on this task (i.e. less time was required to stand up).
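The reason SMD suits outcomes measured on different scales, such as timed chair rises with different numbers of repetitions, can be illustrated with a short calculation. The sketch below uses one common definition of the SMD (the difference in means divided by the pooled standard deviation, with Hedges' small-sample correction); this review does not state which exact formula its software applied, so the choice is an assumption. All input values are hypothetical.

```python
import math

def standardised_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """SMD with Hedges' small-sample correction (one common definition)."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)
    return d * correction

# HYPOTHETICAL trials, not taken from this review:
# trial A times five chair rises, trial B times ten chair rises (seconds).
smd_a = standardised_mean_difference(12.0, 3.0, 20, 14.0, 3.0, 20)
smd_b = standardised_mean_difference(24.0, 6.0, 20, 28.0, 6.0, 20)
print(f"trial A SMD = {smd_a:.2f}, trial B SMD = {smd_b:.2f}")
```

Although the raw difference in trial B (4 seconds) is twice that in trial A (2 seconds), the two trials give the same SMD because each difference is expressed in units of its own standard deviation. A negative value here means less time was needed in the treatment group.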
Measures of physical disability / HRQOL (complex physical activities)
The main disability measures from trials that had appropriate data were pooled using SMD. It was necessary to conduct two separate analyses of the main disability measures because 10 studies (n = 722; Baker 2001; Buchner 1997; Chandler 1998; Damush 1999; Donald 2000; Hiatt 1994; Latham 2002; Moreland 2001; Singh 1997; Tsutsumi 1997) used measures in which a higher score indicates less disability/better function, while six studies (n = 559; Baker 2001; Ettinger 1997; Jette 1999; Schilke 1996; Singh 1997; Westhoff 2000) used scores in which a higher score indicates more disability/poorer function. SMD and fixed effect models were used for both comparisons. In both analyses, there was no evidence that PRT has an effect on physical disability (higher measure indicates less disability: SMD 0.01, 95% CI -0.14 to 0.16; higher measure indicates more disability: SMD -0.17, 95% CI -0.53 to 0.19). When HRQOL and ADL measures were examined separately, there was also no evidence of benefit. For example, when the physical function domain of SF-36 was pooled from seven studies (n = 493; Baker 2001; Chandler 1998; Damush 1999; Hiatt 1994; Latham 2002; Singh 1997; Tsutsumi 1997) using a fixed effect model, a small difference of less than one point on this 100-point scale was found (WMD 0.96, 95% CI -3.35 to 5.26). A number of studies had disability measures (i.e. measures of activity, function or HRQOL) that could not be pooled. The available data from these measures are reported in Additional Table 2.
Falls
Five studies collected data about the effect of resistance training on falls, but the outcomes reported did not allow pooling of the data. The available data are reported in Additional Table 3. Three of these studies (Buchner 1997; Fiatarone 1994; Judge 1994) were part of the FICSIT trial, a prospective preplanned meta-analysis to determine the effectiveness of exercise to prevent falls in older people (Province 1995). The data were extracted from the main FICSIT paper, because papers published about the individual exercise programs did not provide useful data about the effect of resistance training alone on falls. One additional trial investigated the effect of resistance training on falls in older people while they were in hospital (Donald 2000). A recent trial also assessed the effect of PRT on frail older people following discharge from hospital (Latham 2002).
With the exception of Latham 2002, all of these trials were small (i.e. fewer than 80 participants in the resistance training and control groups). Only the study by Donald 2000 found a significant reduction in falls, but there were few fall events in this trial.
Adverse Events
Adverse events reported
Thirty-five studies provided no comment at all about adverse events associated with the training program. Out of the 31 studies that did provide some comment about adverse events, 14 reported no adverse events and 17 reported some adverse reaction to the exercise program. An additional nine studies did not report adverse events as such, but it is possible that an event occurred since these studies reported drop-outs from the exercise group secondary to increasing pain or specific injuries (Chandler 1998; Charette 1991; Fiatarone 1997; Hagerman 2000; Hortobagyi 2001; Jette 1996; Kerr 2001; Maurer 1999; Topp 1993). Given that there were considerably more drop-outs from the PRT group than from the control group (see methodological quality section above), it is possible that the number of adverse events reported here is an underestimate.
Only seven of 66 studies provided an a priori definition of an adverse event in the study methods or objectives (Earles 2001; Ettinger 1997; Judge 1994; Latham 2002; Moreland 2001; Pollock 1991; Singh 1997). Six of these seven studies detected adverse events (Earles 2001; Ettinger 1997; Judge 1994; Latham 2002; Moreland 2001; Pollock 1991). However, there was little consistency in the definition that was used, with some studies only reporting serious events that the investigators thought to be possibly related to the exercise program (e.g. Ettinger 1997) while other studies reported all adverse events that occurred in each group. Most adverse events were musculoskeletal problems; there was no report of cardiac events or death associated with PRT. Further details about all adverse events reported in these trials are in Additional Table 4.
Pain
The bodily pain (BP) domain of the SF-36 health status measure was assessed in six studies involving 440 participants (Baker 2001; Buchner 1997; Damush 1999; Latham 2002; Singh 1997; Tsutsumi 1997). For this measure, a higher score indicates better health (i.e. less pain). When the BP domain was pooled, there was no evidence that PRT had an effect on bodily pain (WMD -0.14, 95% CI -4.45 to 4.18). In contrast, three studies with 311 participants (Baker 2001; Ettinger 1997; Schilke 1996) included pain measures where a higher score indicates more pain, and found evidence to support a modest reduction in pain following PRT (SMD -0.33, 95% CI -0.55 to -0.11). These three studies all included participants with osteoarthritis and used pain measures designed specifically for this population, which could have increased their sensitivity to change.
Vitality
The vitality (VT) domain of the SF-36 health status measure was assessed in five studies involving 389 participants (Baker 2001; Damush 1999; Latham 2002; Singh 1997; Tsutsumi 1997). For this measure, a higher score indicates better health (i.e. more vitality). When the VT domain was pooled there was no evidence of an effect of PRT (WMD 1.42, 95% CI -2.22 to 5.07).
Health service use, hospitalization and death
Three studies provided data about hospitalization rates, length of stay and/or outpatient visits. Donald 2000 reported that people who received PRT in addition to regular in-hospital physiotherapy had a length of stay of 27 days compared to 32 days for the control group. Latham 2002 found that 42/120 people in the PRT group were admitted to hospital over six months compared to 35/123 in the control group. The third trial, Singh 1997, reported that, over a 10 week period, people in the PRT group had a mean of 2.1 (SD 0.4) visits to a health professional and a mean of 0.24 (SD 0.2) hospital days, compared to a mean of 2.0 (SD 0.5) visits and a mean of 0.53 (SD 0.4) hospital days among controls. An additional study, Buchner 1997, provided data about health service use, but only reported data that were pooled to include participants in aerobic training, combined aerobic training and PRT, and PRT alone. This study found no difference in hospital admissions between those in the exercise and control groups, but an increased number of outpatient visits by those in the control group. Finally, two studies stated that there was no difference in health care visits (Fiatarone 1997) or hospitalization (Pu 2001) but no specific data were provided.
Six studies provided data about participant deaths that allowed pooling (Donald 2000; Ettinger 1997; Fiatarone 1994; Latham 2002; Moreland 2001; Newnham 1995). There was no evidence of difference in the number of deaths in treatment and control groups (treatment group: 10 deaths versus control group: 17 deaths; RR = 0.60, 95% CI 0.29 to 1.23).
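A risk ratio such as the one reported for deaths is computed from event counts in each group. The per-group denominators for deaths are not given in the text, so the sketch below instead uses the hospital admission counts reported above for Latham 2002 (42/120 versus 35/123). The resulting figures are not reported in this review and are shown only to illustrate the calculation, which assumes the standard large-sample confidence interval on the log scale.

```python
import math

def risk_ratio(events_t, total_t, events_c, total_c, z=1.96):
    """Risk ratio with a 95% CI computed on the log scale."""
    rr = (events_t / total_t) / (events_c / total_c)
    se_log = math.sqrt(1 / events_t - 1 / total_t + 1 / events_c - 1 / total_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hospital admissions over six months, as reported for Latham 2002
rr, lower, upper = risk_ratio(42, 120, 35, 123)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these counts the risk ratio is about 1.23 with a confidence interval that spans 1, so this single trial does not show a clear difference in admissions. The same reading applies to the pooled result for deaths, where the interval of 0.29 to 1.23 also includes 1.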
Comparison of PRT dosage
Strength
Six trials investigated the effects of different doses of PRT. Four studies (n = 85) examined the effect of low versus high intensity PRT on lower limb strength (Hortobagyi 2001; Hunter 2001; Taaffe 1996; Tsutsumi 1997). These data suggest that high intensity training results in greater lower limb strength, as a moderate effect was seen (SMD = 0.51, 95% CI 0.07 to 0.94; higher score favours high-intensity group). One trial (n = 24) compared high intensity PRT with variable intensity PRT with inconclusive results (SMD = 0.61, 95% CI -0.21 to 1.44; Hunter 2001). Finally, one trial (n = 46) compared PRT performed at different frequencies - once, twice or three times per week (Taaffe 1999). When the effect on strength of PRT once per week was compared to three times per week, no significant difference was found (SMD 0.40, 95% CI -0.44 to 1.25).
Aerobic capacity
One study compared the effect of high versus low intensity PRT on aerobic capacity (Tsutsumi 1997). This study (n = 27) found greater benefit from high intensity compared to low intensity training (WMD 5.20 ml/kg/min, 95% CI 1.30 to 9.10; higher score favours high-intensity group). Another trial compared high intensity training with variable intensity training, and found that the point estimate favoured variable intensity training (WMD = 1.30, 95% CI -0.12 to 2.72). However, there were few participants in these trials.
Physical disability, pain and vitality
One study (Tsutsumi 1997; n = 27) compared high versus low intensity PRT, and evaluated pain, vitality and physical function using the domains of the SF-36. No significant difference was found for any of these outcomes; however, few participants were included.
PRT versus aerobic training
Strength
Seven studies (n = 420) evaluated the effect of PRT compared to aerobic training on lower extremity strength (Ballor 1996; Buchner 1997; Earles 2001; Ettinger 1997; Pollock 1991; Sipila 1996; Wood 2001). A fixed effect model and SMD were used for this analysis. The analysis found that PRT had a non-significant benefit compared to aerobic training on strength, with a SMD of 0.11 (95% CI -0.08 to 0.30; higher score favours PRT group).
Aerobic capacity
The effect of aerobic training compared to PRT on aerobic capacity was evaluated in six studies involving 374 participants (Ballor 1996; Buchner 1997; Ettinger 1997; Hepple 1997; Hiatt 1994; Pollock 1991). This was measured using VO2 max in ml/kg/min. Aerobic training had a non-significant benefit compared to PRT for this outcome (WMD -0.47 ml/kg/min, 95% CI -1.00 to 0.05; lower score favours aerobic group).
Physical disability
Four studies evaluated the effect of PRT compared to aerobic training on physical disability. Three studies (Buchner 1997; Earles 2001; Hiatt 1994) used outcomes in which a higher score indicates less disability (n = 102), and found no significant difference, with a SMD of -0.26 (95% CI -0.65 to 0.14; lower score favours the aerobic training group). One large study (Ettinger 1997) also found no significant difference between the groups. This trial used a measure in which a higher score indicates more disability (n = 237), and found a SMD of 0.05 (95% CI -0.21 to 0.30; higher score favours aerobic group).
Pain
Ettinger 1997 evaluated pain in people who undertook PRT compared to aerobic training (n = 237) using a scale in which a lower score indicates less pain. The trial found no significant difference between groups (SMD 0.12; 95% CI -0.14 to 0.37; higher score favours aerobic group).
PRT versus balance or functional training
One study (Judge 1994) included an evaluation of PRT alone compared to balance retraining alone (n = 55). Both exercise programs were performed in a research center three times per week for three months. Balance training included training on a computerized balance platform and non-platform training (i.e. balancing on different surfaces, with varying bases of support, with different perturbations). This study found that strength improved in the PRT group, but not in the balance training group. Chair rise time and gait speed did not improve in any group, with gait speed actually declining in the balance training group. However, balance improved in the balance trained group compared with the PRT group.
A second study (McMurdo 1995) included an evaluation of PRT alone compared to 'mobility training' (n = 41). Participants in both groups carried out home exercise programmes, and were visited every three to four weeks by a physiotherapist for six months. There was no difference between the two groups for any measures, including measures of strength, physical performance, function or quality of life.
Discussion
Overview of main results
This review identified, graded and synthesised the available literature regarding the effect of a specific exercise intervention, PRT, on a particular population, older people. To increase the generalisability of these data, the trials included participants with a range of health problems, and the dose and delivery of the PRT programs varied. This made it possible to assess overall effects of the intervention on older people, while still providing adequate data to explore the effects on subgroups (i.e. in different groups of older people or with different doses of PRT). Overall, this review suggests that PRT has a large positive effect on strength, the most proximal impairment measure, and a small to modest positive effect on some other measures of impairment and functional limitations. However, the current data suggest that PRT does not have an effect on physical disability (complex activities). There is some evidence, however, that suggests that PRT might reduce pain in older people with osteoarthritis. Adverse events were poorly reported in most studies, which limits the ability of this review to assess the risks associated with this intervention. In addition, the sparse data did not allow an adequate assessment of the effect of PRT on fall risk.
Methodological quality
The 66 studies in this review were generally of poor methodological quality, as most of the studies did not use design features that are known to increase internal validity, such as intention to treat analysis, blinded outcome assessors, attention control groups, or concealed randomization. For example, only 13 of the 66 studies used blinded outcome assessors for all outcomes, and nine studies appeared to use intention to treat analysis. Therefore, caution is required when drawing conclusions from these data. When data were stratified by indicators of study quality, results from the high quality trials continued to support the positive effect of resistance training on strength. However, these data also indicate that low quality trials that comprise the majority of the studies in the review probably overestimate the effect of resistance training.
PRT versus control
PRT appears to have a large positive effect on strength in older people. However, there was a large amount of statistical heterogeneity associated with this estimate. This variation was reduced, but not eliminated, by investigating differences in outcome in different groups of participants, types of intervention and in trials that used different quality indicators. In exploratory subgroup analyses, it appeared that training intensity has the greatest effect on strength (i.e. high intensity training has a greater effect on strength than lower intensity training), while the duration of the training appears to have a reduced effect. The health status of the participants does not have a clear effect on their response to the intervention, as there was no difference in the effect on strength between people with specific health problems and those who were generally healthy. It did appear that people with pre-existing functional limitations had smaller gains in strength. However, these subgroup analyses must be interpreted with caution as the number of participants is reduced, which decreases the precision of these estimates. In addition, it is possible that study quality is a confounder for some of these observed differences, as several of the largest and highest quality trials included people with functional limitations and/or lower intensity training programs, and study quality appears to reduce the effect estimates.
PRT also appears to have a positive effect on aerobic capacity and most measures of functional limitations, including gait speed and the time to stand up from a chair. Some of these effects, such as the improvements in gait speed and the distance walked in the Six-Minute Walk Test, could be considered clinically as well as statistically significant. Despite these improvements in functional limitations, no effect of PRT was found on measures of disability. There were fewer data in these comparisons than for impairment measures, which would decrease the precision of these estimates. However, there was no evidence of a trend towards improved physical ability in the overall pooled estimates or when the activity and health-related quality of life measures were analysed separately.
It was not possible to pool fall data because falls were reported differently in the five studies that measured this outcome. These data might suggest a trend towards PRT reducing falls, since four of the five studies found that participants in the PRT group had fewer falls than those in the control group. However, the effect of PRT alone on falls is still not clear.
Adverse events were poorly monitored and reported in most of these trials. This makes it difficult to assess the risk of injury or other adverse events associated with resistance training. The finding that several studies reported drop-outs from the exercise program due to pain or injury, yet failed to report any adverse events, suggests that adverse events might have been under-reported in some trials. This hypothesis is further supported by the finding that the studies with a clear definition of adverse events in their study methods were more likely to detect these events than those with no definition. The large number of drop-outs from the PRT group compared to controls also raises the possibility that people are experiencing adverse effects from PRT that are not identified in these trials. However, it is reassuring that participants' pain and vitality were not affected by PRT, and in fact PRT appeared to decrease pain in people with osteoarthritis. Furthermore, there was no evidence of increased risk of hospitalization, and several studies reported decreased use of health care services in the PRT group. Finally, there were no reports of serious adverse events (i.e. myocardial infarction or death) associated with PRT.
Comparison of PRT dosage
There are currently few randomised data available to guide the dose and prescription of PRT. Six studies investigated different aspects of this issue, but all were small studies and most were of poor quality. The available data suggest that high intensity training has a greater effect on strength than lower intensity training. However, some caution is needed before applying this finding to all older people. All of the participants in these trials were healthy older people who participated in highly supervised, gym-based programs. Therefore, it is not clear if high intensity PRT is more beneficial than low intensity training in less fit or healthy older people and/or in home or hospital based programs.
PRT vs other training
Overall, no significant differences were found between the different types of training. When aerobic training is compared to PRT, it appears that aerobic training tended to produce larger gains in measures of aerobic capacity while PRT tended to produce larger gains in strength. This finding is to be expected, given that these outcomes are more specific to the particular type of training. There are fewer data available to determine the comparative effect of these types of training on physical disability, but the available data suggest that the two training programs have a similar effect on this outcome. There are too few data to draw conclusions about other forms of training such as balance or mobility training compared to PRT.
Authors' conclusions
Implications for practice
PRT increases muscle strength and has a positive effect on some functional limitations in older people. Therefore, it would appear to be an appropriate intervention for many older people to improve performance of some simple physical tasks. In particular, this intervention might be appropriate for older people with osteoarthritis as their pain appeared to decrease in response to PRT. However, based on current data, there is no evidence that PRT alone has an effect on physical disability. Other interventions, either other forms of exercise training to improve overall physical capacity and/or other strategies to increase self-efficacy, motivation or reduce barriers to participation might be required to impact at this level. Some caution is warranted with this intervention since adverse effects have been poorly monitored, but appear to occur when they are looked for in a trial. When used in clinical practice, clinicians should monitor for adverse effects, particularly when older people who might be at higher risk of injury (i.e. frail or recently ill older people) are undertaking PRT.
Implications for research
We recommend that future trials investigating the effect of PRT in older people should:
Future trials should include participants and interventions that are similar to those in health care settings (i.e. frail or recently ill older people), so that, if proven to be effective, resistance training can be incorporated into routine health care services. Well-designed trials are also required to determine the most appropriate dose of PRT to use with different participants and in different settings. This review will be updated every two years to incorporate new trials and new data from existing studies.
Acknowledgements
The reviewers would like to thank the Cochrane Bone, Joint and Muscle Trauma Group, particularly Lesley Gillespie and Leeann Morton, for their assistance throughout the review process. In particular, thanks to Lesley for searching the Cochrane registers and assistance with developing the search strategies. Thank you to Leeann for her advice and guidance about the procedures and content of this review. We would also like to thank the review group's editors and the external reviewers, Prof John Campbell and Dr Keith Hill, for their helpful comments on earlier drafts of this review.
Data and analyses
Appendices
Appendix 1. Search strategy for MEDLINE (OVID Web)
1 strength training.tw.
2 resist$ training.tw.
3 or/1-2
4 Exercise/
5 Exercise Therapy/
6 exercise$.tw.
7 or/4-6
8 resist$.tw.
9 and/7-8
10 or/3,9
11 limit 10 to "all aged <65 and over>"
12 (elderly or senior$).tw.
13 and/10,12
14 or/11,13
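The way the numbered search lines combine can be read as set operations on the records matched by each line. The sketch below is only an illustration of that Boolean logic: the record identifiers are hypothetical, the function name is ours, and the age limit applied in line 11 is modelled simply as an intersection with a set of records indexed as aged 65 and over.

```python
def combine_search(hits, aged_65_plus):
    """Combine per-line result sets following search lines 3, 7, 9, 10, 11, 13 and 14."""
    s3 = hits[1] | hits[2]             # 3: or/1-2
    s7 = hits[4] | hits[5] | hits[6]   # 7: or/4-6
    s9 = s7 & hits[8]                  # 9: and/7-8
    s10 = s3 | s9                      # 10: or/3,9
    s11 = s10 & aged_65_plus           # 11: limit 10 to "all aged <65 and over>"
    s13 = s10 & hits[12]               # 13: and/10,12
    return s11 | s13                   # 14: or/11,13

# HYPOTHETICAL record identifiers matched by each search line
hits = {
    1: {1},         # strength training.tw.
    2: {2},         # resist$ training.tw.
    4: {3, 4},      # Exercise/
    5: {5},         # Exercise Therapy/
    6: {3, 6},      # exercise$.tw.
    8: {3, 5, 7},   # resist$.tw.
    12: {1, 9},     # (elderly or senior$).tw.
}
aged_65_plus = {2, 5, 8}

final = combine_search(hits, aged_65_plus)
print(sorted(final))
```

In this toy example, record 3 matches both the exercise and resistance terms but is dropped at the final step because it is neither indexed as aged 65 and over nor matched by the elderly or senior text words.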
What's new
Last assessed as up-to-date: 2 December 2002.
History
Protocol first published: Issue 4, 2000
Review first published: Issue 2, 2003
Contributions of authors
All reviewers contributed to the development of the protocol, the analysis and interpretation of the data and the write-up of the review. Nancy Latham took the lead in conducting the analyses and writing the protocol and review. In addition, Nancy and Caroline Stretton conducted the searches, identified the trials, conducted the quality assessments and extracted the data. Derrick Bennet provided methodological and statistical guidance for the review. Craig Anderson served as the adjudicator when a consensus about data issues could not be reached between the two reviewers, and provided guidance about the methods and interpretation of the review.
Declarations of interest
None known.
Index terms
Medical Subject Headings (MeSH)
*Exercise; Muscle Weakness [*rehabilitation]; Randomized Controlled Trials as Topic
MeSH check words
Aged; Humans
* Indicates the major publication for the study
