Criteria for considering studies for this review
Types of studies
Randomised (parallel groups or cross-over, including cluster-randomised and quasi-randomised trials) or controlled clinical trials (CCTs) will be considered for inclusion.
Types of participants
Persons 18 years and older with a physician-confirmed (i.e. radiological or clinical or both) diagnosis of hand OA will be considered for inclusion. If a study includes participants both older and younger than 18 years, we will include it if separate results are reported for those older than 18 years. We will also include studies in which more than 90 percent of participants are older than 18 years. Studies including diverse populations will be accepted if the data can be extracted for the hand OA group separately.
Types of interventions
Interventions assessing the benefits and harms of exercises versus other interventions for pain and function in persons with hand OA will be considered for inclusion. Hand exercise therapy may comprise some or a combination of: strength, mobility, endurance or joint stability training. Studies of post-operative exercises will be excluded. However, studies in which other treatments were used (e.g. analgesics) will be included if the treatment was similar across intervention and control groups.
Specific comparison to be made
It is likely that the review will include studies testing different interventions. The following main comparisons will be considered for inclusion:
Exercise versus no intervention (e.g. usual care, wait control);
Exercises versus placebo;
Exercises versus other interventions; and
Comparison of different exercise programs.
We anticipate that there are no studies with real placebo exercise interventions. Sham or attention control interventions, provided to minimize the difference of placebo effect between the two groups, may be the comparator intervention. 'Other interventions' may include pharmacological interventions, hand surgery, patient education, use of assistive technology, functional or vocational activity training, local application of heat or cold packs, use of hot paraffin wax, use of ultrasound, laser or transcutaneous electrical nerve stimulation (TENS), acupuncture or acupressure, dietary supplements, creams, orthoses, gloves, or any combination of two or more of these interventions.
Types of outcome measures
Published recommendations for a core set of outcome measures exist for phase III clinical trials in knee, hip and hand OA and include physical function, pain, patient global assessment of disease impact, and for studies of one year or longer, joint imaging (Bellamy 1997; Maheu 2006).
The main outcomes for benefit are pain and physical function in addition to radiographic joint structure changes, quality of life and stiffness. Adverse events and sustained joint inflammation or pain associated with hand exercise therapy might be increased. If possible, the number of patients experiencing intervention related adverse events will be reported, as well as the total number of patients withdrawn from the studies.
The seven main outcomes for the 'Summary of findings' table will include:
1. Pain. When more than one measure of pain is reported in a study, we will choose the highest in the following hierarchy of outcome measures:
Pain overall (e.g. VAS pain)
Pain on hand usage
AUSCAN pain sub-scale
Other algofunctional scale validated for use in hand OA
Patient’s global assessment
Physician’s global assessment
2. Physical function. When more than one measure of physical function is reported in a study, we will choose the highest in the following hierarchy of outcome measures:
AUSCAN physical function sub-scale
Other algofunctional scale validated for use in hand OA
Hand function measured by performance-based tests (e.g. grip strength, pinch strength)
Global disability score
3. Radiographic joint structure changes;
4. Quality of life;
5. Stiffness;
6. Number of patients experiencing any adverse event; and
7. Number of patients withdrawn because of adverse events.
Minor outcomes will include:
Timing of outcome assessment
The main time point of interest will be the first assessment after completion of the exercise programme. When data for longer term follow-ups are available, such data will also be extracted and categorized into short (0 to 6 months), medium (7 to 12 months) and long term follow-up (> 12 months).
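The follow-up categories above can be expressed as a small helper. This is a minimal sketch, not part of the protocol: the function name is hypothetical, time is assumed to be counted in months from completion of the exercise programme, and assessments falling between 6 and 7 months (which the stated bands do not cover) are assumed to count as medium term.

```python
def follow_up_category(months_after_programme: float) -> str:
    """Classify a follow-up assessment as short, medium or long term.

    Bands follow the protocol text: short (0 to 6 months),
    medium (7 to 12 months), long (> 12 months).
    Assumption: values strictly between 6 and 7 months are treated as medium.
    """
    if months_after_programme < 0:
        raise ValueError("follow-up time cannot be negative")
    if months_after_programme <= 6:
        return "short"
    if months_after_programme <= 12:
        return "medium"
    return "long"
```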
Search methods for identification of studies
A search for studies published up until the date of search (i.e. from inception to February 2013) will be carried out in six electronic databases: the Cochrane Central Register of Controlled Trials (CENTRAL, published in The Cochrane Library), MEDLINE, EMBASE, CINAHL, PEDro, and OTseeker. The search strategies are provided in Appendix 1 (MEDLINE), Appendix 2 (EMBASE), Appendix 3 (CINAHL), Appendix 4 (AMED), Appendix 5 (PEDro), Appendix 6 (OTseeker), and Appendix 7 (CENTRAL).
Searching other resources
The reference lists of all included full-text articles will be screened to identify additional studies. Searches for unpublished complete studies and ongoing studies will be performed using the WHO’s International Clinical Trials Registry Platform (http://www.who.int/ictrp/en/) and other randomised controlled trials registers including:
Unpublished or grey literature will be reviewed using the database OpenSIGLE (System for Information on Grey Literature in Europe). Further, conference proceedings from Osteoarthritis Research Society International (OARSI), EULAR and the American College of Rheumatology (ACR) from 2008 until February 2013 will also be searched. We will not apply any language restrictions.
Data collection and analysis
Selection of studies
The first author, along with one of the co-authors (IK, GS), will independently screen retrieved clinical studies for inclusion, extract data from all included studies and conduct the risk of bias assessment. If agreement is not achieved at any stage, a third review author (KBH) will adjudicate.
The process of selecting studies will include the following steps:
merging search results using reference management software, and removing duplicate records of the same report;
examining titles and abstracts to remove obviously irrelevant reports;
retrieving the full text of the potentially relevant reports;
linking together multiple reports of the same study and identifying more than one study reported in the same article;
examining full-text reports against the eligibility criteria;
corresponding with investigators, where appropriate, to clarify study eligibility or other missing information; and
making final decisions on study inclusion.
Data extraction and management
Data from all reports will be extracted directly into a single data collection form including:
Methods: study design, total study duration, recruitment method, random sequence generation, allocation sequence concealment, blinding, other concerns about bias;
Participants: total number of participants, setting, diagnostic criteria, joint involvement, age, sex, country, co-morbidity, inclusion and exclusion criteria;
Interventions: total number and type of intervention groups (i.e. placebo, control, or comparative intervention), specific intervention, intervention details sufficient for replication, number of therapists providing study treatments, delivery mode (i.e. individual, group-based, home program), frequency (i.e. hours per week), treatment content (i.e. strength, mobility, endurance, stability), duration (i.e. total weeks of treatment), the number of directly supervised contact occasions and intensity (i.e. low, high), and number of follow-up contacts (i.e. face to face or by telephone, e-mail or text-messages);
Results: number of participants allocated to each intervention group, sample size, missing data or participants, summary data; and
Miscellaneous: funding source, key conclusions, miscellaneous comments from study author, miscellaneous comments from review authors.
Assessment of risk of bias in included studies
The risk of bias in the included studies will be assessed using the procedures recommended by The Cochrane Collaboration and described in chapter 8 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011). The following methodological domains will be assessed:
allocation sequence concealment;
blinding of participants, personnel and outcome assessors;
incomplete outcome data;
selective outcome reporting; and
other potential threats to validity (e.g. substantial imbalance of participant characteristics at baseline that is strongly related to outcome measures, blocked randomisation in unblinded trials, contamination).
Each of these criteria will be explicitly judged as 'low risk', 'high risk' or 'unclear risk' of bias.
Individual components rather than summary scores are recommended (Khan 2001) in non-pharmacological trials where blinding of participants proves challenging, meaning that blinding in most such trials would be assessed as high risk of bias. For such studies it is particularly important to assess whether assessors of outcome data were blinded.
Measures of treatment effect
The risk ratio (RR) and 95% confidence interval (CI) will be calculated for dichotomous outcomes. Shorter ordinal scales will be transformed into dichotomous data by combining adjacent categories together (especially if an established defensible cut-point is available).
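As an illustration of the calculation for dichotomous outcomes, the sketch below computes a risk ratio with a 95% CI on the log scale, using the usual large-sample standard error of log(RR). The function and parameter names are hypothetical, and no continuity correction for zero-event groups is applied, so it assumes at least one event in each group.

```python
import math


def risk_ratio(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Return (RR, lower, upper) for a two-group dichotomous outcome.

    The confidence interval is built on the log scale and back-transformed.
    Assumption: both groups have at least one event (no zero-cell correction).
    """
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Large-sample standard error of log(RR)
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

For example, 10 events among 100 exercise participants against 20 among 100 controls gives an RR of 0.5 with a CI that just crosses 1.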
The mean difference (MD) and 95% CI or standardised mean difference (SMD) and 95% CI will be calculated for continuous outcomes. The MD will be used when all studies have assessed the same outcome using the same scale or instrument. The SMD will be used when studies have assessed the same outcome using different scales or instruments.
The SMD expresses the size of the intervention effect in each study relative to the variability observed in that study. SMDs will be calculated by dividing the mean difference by the standard deviation of outcome among participants. SMDs greater than zero will indicate a beneficial effect of hand exercise therapy. The SMD will be interpreted as described by Cohen; i.e. an SMD of 0.2 is considered to indicate a small beneficial effect, 0.5 a medium effect, and 0.8 a large effect of hand exercise therapy (Cohen 1988). Longer ordinal scales (i.e. numeric rating scales with more than 10 points, or VAS) will be analysed in meta-analyses as continuous data.
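The SMD calculation and Cohen's benchmarks can be sketched as follows. This is an illustrative example only: it assumes the standard deviation in the denominator is the pooled within-group SD, omits any small-sample correction that meta-analysis software may apply, and uses hypothetical function names. The sign convention (treatment minus control) is also an assumption; scales on which lower scores are better would need to be reversed first so that positive values indicate benefit.

```python
import math


def standardised_mean_difference(mean_trt, sd_trt, n_trt, mean_ctl, sd_ctl, n_ctl):
    """Mean difference divided by the pooled within-group standard deviation."""
    pooled_var = ((n_trt - 1) * sd_trt**2 + (n_ctl - 1) * sd_ctl**2) / (n_trt + n_ctl - 2)
    return (mean_trt - mean_ctl) / math.sqrt(pooled_var)


def cohen_label(smd):
    """Label the magnitude of an SMD using Cohen's 0.2 / 0.5 / 0.8 benchmarks."""
    size = abs(smd)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"
```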
Unit of analysis issues
Effect estimates and standard errors from cluster-randomised trials may be meta-analysed using the generic inverse-variance method in RevMan. If a cluster-randomised trial is analysed incorrectly (e.g. using the individual as unit of analysis instead of taking account of the clustering), we can perform approximately correct analyses when the following information can be extracted (Higgins 2011):
The effective sample size of a single intervention group in a cluster-randomised trial is its original sample size divided by the ‘design effect’. The design effect is 1 + (M – 1) × ICC, where M is the average cluster size and ICC is the intracluster correlation coefficient. A common design effect is usually assumed across intervention groups.
For dichotomous data both the number of participants and the number experiencing the event should be divided by the same design effect. For continuous data only the sample size needs to be reduced; means and standard deviations will remain unchanged. This approach may be unsuitable for small trials as the resulting data must be rounded to whole numbers for entry into RevMan.
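The adjustment described above can be sketched in a few lines. The function names are hypothetical, and the rounding to whole numbers mirrors the data-entry constraint mentioned in the text rather than any particular software's behaviour.

```python
def design_effect(mean_cluster_size, icc):
    """Design effect 1 + (M - 1) * ICC for a cluster-randomised trial."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n, mean_cluster_size, icc):
    """Original sample size divided by the design effect (not rounded)."""
    return n / design_effect(mean_cluster_size, icc)


def adjust_dichotomous(events, n, mean_cluster_size, icc):
    """Divide events and participants by the same design effect.

    Results are rounded to whole numbers, as needed for data entry.
    """
    deff = design_effect(mean_cluster_size, icc)
    return round(events / deff), round(n / deff)
```

With an average cluster size of 21 and an ICC of 0.05, the design effect is 2, so a group of 100 participants with 30 events would be entered as 50 participants with 15 events. For continuous data, only the sample size would be reduced in this way.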
Cross-over studies will be included when outcome data from the first period is available or can be provided by the study authors on request.
Studies with multiple treatment groups
When studies have multiple treatment groups that are relevant for the present review, we will combine groups to create a single pair-wise comparison. This will be done by combining all relevant experimental intervention groups of the study into a single group, and by combining all relevant control intervention groups into a single control group.
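For continuous outcomes, combining two groups requires a combined sample size, mean and standard deviation. The sketch below uses the formula given for this purpose in the Cochrane Handbook; the function name is hypothetical, and applying it sequentially for more than two groups is assumed here. For dichotomous outcomes, events and sample sizes can simply be summed.

```python
import math


def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Combine two groups into one, returning (n, mean, sd).

    The combined SD accounts for both within-group variability and the
    difference between the two group means.
    """
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    variance = (
        (n1 - 1) * sd1**2
        + (n2 - 1) * sd2**2
        + (n1 * n2 / n) * (mean1 - mean2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(variance)
```

The result equals the mean and sample standard deviation that would be obtained if the individual participant data from both groups were pooled.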
Dealing with missing data
When possible, we will contact the original investigators to request missing studies, outcomes, or summary data. Further, we will clarify which assumptions were used to cope with missing data (e.g. whether the missing data are assumed to be 'missing at random' or 'not missing at random'). We will also examine which methods the investigators used for dealing with missing data. These methods could include ignoring the missing data, imputing the missing data with replacement values as if they were observed values, imputing the missing data and accounting for uncertainty, or using statistical models that allow for missing data and that make assumptions about the relationship with the observed data. Sensitivity analyses will be performed to assess how sensitive results are to reasonable changes in the assumptions that are made, and the potential impact of missing data on the findings will be discussed.
Assessment of heterogeneity
Heterogeneity will be assessed using the Chi² test, which tests the hypothesis that all studies measure the same effect. If P < 0.10, heterogeneity will be considered statistically significant. We will also assess the magnitude of heterogeneity with the I² statistic. If large and unexplained heterogeneity is identified, we will explore possible reasons for it using subgroup analyses. We will use the following thresholds for interpretation of I² (Higgins 2011):
0% to 40%: might not be important;
30% to 60%: may represent moderate heterogeneity;
50% to 90%: may represent substantial heterogeneity; and
75% to 100%: considerable heterogeneity.
The importance of the observed value of I² depends on (i) magnitude and direction of effects and (ii) strength of evidence for heterogeneity (e.g. P value from the Chi² test, or a confidence interval for I²).
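To make the two heterogeneity statistics concrete, the sketch below computes Cochran's Q (the Chi² heterogeneity statistic) and I² from study effect estimates and their standard errors, using inverse-variance weights. The function name is hypothetical, and the P value is not computed here because it requires a chi-squared distribution function from a statistics library.

```python
def heterogeneity(effects, standard_errors):
    """Return (Q, degrees of freedom, I-squared as a percentage)."""
    weights = [1 / se**2 for se in standard_errors]
    # Inverse-variance weighted (fixed-effect) pooled estimate
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is truncated at zero when Q is smaller than its degrees of freedom
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared
```

For three studies with effects 0.1, 0.3 and 0.5 and a common standard error of 0.1, Q is 8 on 2 degrees of freedom and I² is 75%.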
Assessment of reporting biases
To reduce the possibility of publication bias we will ensure that multiple sources are searched for studies that meet the eligibility criteria including ‘grey’ literature. We will examine funnel plots to inform us about possible publication bias if there are a sufficient number of studies (i.e. ≥ 10).
For clinically homogeneous studies, we will pool outcomes in a meta-analysis. We assume that the results from the included studies reflect a distribution of effect sizes rather than a fixed effect size, and will therefore employ a random-effects model. Separate statistical analyses will be performed for RCTs and CCTs.
Subgroup analysis and investigation of heterogeneity
If possible, we will do subgroup analysis according to joint involvement, defined as involvement of the base of the thumb joint (i.e. yes/no), and whether the participants have erosive OA or not. We will also perform subgroup analyses based on treatment content (i.e. strength, mobility, endurance, stability), intensity (i.e. 80% to 100% of one-repetition maximum (1RM) will be considered high intensity, 50% to 70% medium intensity, and 10% to 40% low intensity), and delivery mode (i.e. individual, group-based, home program). Dose (i.e. frequency and duration) and the number of directly supervised contact occasions will be analysed using meta-regression. Subgroup analyses for gender and age will also be conducted when possible.
If possible, sensitivity analyses will be conducted to assess the effect of decisions made during the review process, including the inclusion or exclusion of particular studies from a meta-analysis, methodological quality (e.g. whether the outcome assessors were blinded and whether the randomisation process was adequate), and analysis according to the intention-to-treat principle.
Summary of findings table
We will use the GRADE approach, as described in the Cochrane Handbook chapter 12 (Schünemann 2011), to assess the quality of the body of evidence. A summary of findings (SoF) table will be produced using the GRADEpro software. The table will provide key information concerning the quality of evidence and the magnitude of intervention effects on main outcomes.
SoF tables will be created for the following comparisons:
SoF Table 1: Exercise versus no intervention (e.g. usual care, wait control);
SoF Table 2: Exercises versus placebo;
SoF Table 3: Exercises versus other interventions; and
SoF Table 4: Comparison of different exercise programs.
The outcomes (in order of importance) reported in the SoF tables will include:
Radiographic joint structure changes;
Quality of life;
Number of patients experiencing any adverse event; and
Number of patients who withdrew because of adverse events.
In addition to the absolute and relative magnitude of effect provided in the 'Summary of findings' table, the number needed to treat (NNT) will be calculated from the control group event rate (unless the population event rate is known) and the risk ratio using the Visual RxNNT calculator (Cates 2013). For continuous outcomes, the NNT will be calculated using the Wells calculator software available from the CMSG editorial office. The minimal clinically important difference (MCID) for each outcome will be determined for input into the calculator.
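For dichotomous outcomes, the arithmetic behind an NNT derived from a control group event rate and a risk ratio is shown in the sketch below. It is an illustration of the calculation only, not a reproduction of the Visual Rx tool; the function name is hypothetical, and rounding up to the next whole patient is assumed as the reporting convention.

```python
import math


def number_needed_to_treat(control_event_rate, risk_ratio):
    """NNT from the control event rate (CER) and the risk ratio (RR).

    The absolute risk difference is CER * (1 - RR); the NNT is its
    reciprocal in absolute value, rounded up to a whole patient.
    Whether this is an NNT for benefit or for harm depends on the
    direction of the RR and on whether the event is desirable.
    """
    if not 0 < control_event_rate <= 1:
        raise ValueError("control event rate must be in (0, 1]")
    risk_difference = control_event_rate * (1 - risk_ratio)
    if risk_difference == 0:
        raise ValueError("NNT is undefined when the risk ratio equals 1")
    return math.ceil(1 / abs(risk_difference))
```

For example, a control event rate of 0.3 and an RR of 0.5 give a risk difference of 0.15 and an NNT of 7.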