Psychological interventions for rheumatoid arthritis: A meta-analysis of randomized controlled trials




To carry out a systematic review of the literature examining the efficacy of psychological interventions (e.g., relaxation, biofeedback, cognitive–behavioral therapy) in the treatment of rheumatoid arthritis (RA).


Studies that met the following criteria were included: random assignment; wait-list or usual care control condition; publication in peer-reviewed journals; treatment that included some psychological component beyond simply providing educational information; and separate data provided for patients with RA if subjects with conditions other than RA were included. Two investigators independently extracted data on study design, sample size and characteristics, type of intervention, type of control, and direction and nature of the outcome(s).


Twenty-five trials met the inclusion criteria. Methodologic quality was assessed, and effect sizes were calculated for 6 outcomes. Significant pooled effect sizes were found postintervention for pain (0.22), functional disability (0.27), psychological status (0.15), coping (0.46), and self efficacy (0.35). At followup (averaging 8.5 months), significant pooled effect sizes were observed for tender joints (0.33), psychological status (0.30), and coping (0.52). No clear or consistent patterns emerged when effect sizes for different types of treatment and control conditions were compared, or when higher quality trials were compared to lower quality ones. Findings do, however, suggest that these psychological interventions may be more effective for patients who have had the illness for shorter duration.


Despite some methodologic flaws in the literature, psychological interventions may be important adjunctive therapies in the medical management of RA.


Rheumatoid arthritis (RA) is a chronic condition that affects approximately 2.1 million Americans (1). The causes of RA remain poorly understood and there is at present no known cure for the disease. Although pharmacologic approaches can in some instances significantly reduce symptoms of the disease such as pain and disability, for many people with RA, disability, pain, psychological distress, fatigue, and poor quality of life persist in spite of such treatments. Although some recent medications (e.g., the cyclooxygenase-2 inhibitor celecoxib) appear to produce fewer adverse effects (2), for many patients nonsteroidal antiinflammatory drugs (NSAIDs) and disease modifying antirheumatic drugs can still give rise to a variety of side effects, particularly with prolonged use (3–6). As noted by Gotzsche (7), such adverse effects can in some instances become so problematic that patients either become noncompliant, or opt for drugs with fewer side effects, even if these medications are less effective.

Not surprisingly, given the persistent and chronic nature of the disease and the potential side effects associated with many pharmacologic approaches, persons with RA tend to be particularly high users of complementary and alternative medicine (CAM) therapies (8). For example, a recent survey (9) found that 46% of RA patients being seen in rheumatology clinics across the country reported using some form of CAM. Mind–body therapies, defined as psychological, social, or spiritual approaches to health, are among the most common CAM treatments used by patients with arthritis (10). Unlike many areas of CAM that have not been extensively researched, numerous clinical studies have been carried out examining the effects of mind–body (e.g., multimodal, psychosocial–behavioral) interventions in the treatment of arthritis. These protocols typically include some combination of relaxation strategies (e.g., progressive muscle relaxation), biofeedback therapy for reducing muscle tension, cognitive–behavioral strategies (e.g., cognitive restructuring, pain coping), and patient education.

There have been several narrative (i.e., nonsystematic) reviews and 2 meta-analyses of this literature. In a 1991 review, McCracken (11) concluded that despite some methodologic problems and inconsistencies in the studies, further investigations of cognitive–behavioral treatments for RA patients were warranted, given the positive findings on pain reduction. Parker et al (12) reached a similar conclusion in their 1993 review. However, Keefe and Van Horn (13) noted that although cognitive–behavioral treatments often result in reductions of pain and disability in the short term, more work is needed to identify effective relapse prevention strategies because the majority of studies have failed to demonstrate maintenance of treatment gains at followup. In the most recent narrative review, Bradley and Alberts (14) concluded that cognitive–behavioral interventions represented “well-established” pain management strategies for RA. (In this review, the criteria for “well-established,” as set forth by the American Psychological Association, included a demonstration of efficacy through at least 2 controlled outcome studies from different investigators.)

In 1987, Mullen et al (15) published a meta-analysis of 15 trials. They concluded that “psycho-educational” interventions can contribute to improved patient outcomes in both RA and osteoarthritis, although as they noted, the pooled effect sizes were relatively small (0.20 for pain, 0.27 for depression, 0.09 for disability). The methodologic quality of the trials was not assessed in this review. A 1996 meta-analysis (16) compared effect sizes found in randomized trials of patient education programs with those found in trials of NSAIDs. Compared with the NSAID studies, effect sizes were again fairly small for the patient education interventions (0.16 for pain; 0.18 for functional disability; 0.34 for tender joint count). However, the authors suggest that because most RA patients in these trials were already taking NSAIDs, the relatively small effect sizes probably represent the additional benefit over and above medication and may therefore be clinically relevant (16). There have been no randomized trials that directly compare such interventions to pharmacologic therapy.

Because the primary focus of this more recent meta-analysis (16) was patient education, the authors chose to include only studies that were informational in nature and to exclude studies that were psychological or behavioral without a significant informational component. As a result, with the exception of the previously cited 1987 meta-analysis, to date there has been no systematic review of the literature specifically examining psychological/behavioral interventions in RA. Therefore, we carried out a meta-analytic review of studies that compared “psychosocial” (e.g., cognitive–behavioral, psychoeducational) interventions to nonintervention controls (e.g., wait list, usual care, or attention placebo) in patients with RA.


A comprehensive literature search was carried out to identify randomized controlled trials of psychosocial interventions for RA. The MEDLINE, PsychLit, EMBASE, CAMPAIN, Science Citation Index, and Cochrane Library databases (including the Cochrane Registry of randomized trials) were searched using inclusion dates from their inception through June 2001. We employed more than 30 search terms related to psychological therapies (e.g., meditation, relaxation, cognitive–behavioral, psychoeducational, counseling, biofeedback, mind–body) and combined these with 11 search terms related to arthritis. We also hand searched our own files and the reference sections of identified trials and review articles for additional studies. Criteria for inclusion were as follows: 1) Random assignment of subjects; 2) If the trial included patients with conditions other than RA (e.g., osteoarthritis), data for RA patients had to be reported separately; 3) Appropriate control group (i.e., usual medical care, wait-list, attention placebo); 4) English language publication in a peer-reviewed journal; 5) Active treatment that included some psychological/psychosocial component beyond simply providing information (e.g., patient education) about the disease. There was one non-English trial (17) that met our inclusion criteria. However, this study was not included in our review because of difficulties getting the manuscript translated and based on evidence that inclusion of non-English trials does not significantly alter results of meta-analyses (18).

The methodologic quality of the trials was assessed using criteria outlined by Jadad et al (19). However, as was done by van Tulder et al (20) in their review of psychological treatments for low back pain, we included several additional factors that could be considered more appropriate for trials of behavioral interventions. For example, the Jadad scale gives strong weight to whether there is blinding of patients to the treatment condition and to whether or not a trial is placebo controlled. However, both of these conditions frequently cannot be met in psychosocial interventions. Therefore, we developed a set of additional quality criteria (Figure 1) that are particularly germane to the type of interventions being studied and that would, we believed, provide us with sufficient variability in quality ratings across trials.

Figure 1. Methodologic quality criteria.

Data extractions were carried out by 2 independent assessors (JA and WB). Differences were settled by consensus. In our review of the literature, we sought to identify outcomes that were assessed across a majority of the trials and that would provide the basis for a meta-analytic review. Six clinical outcomes were identified: pain, functional disability, psychological status (usually depression), coping, self efficacy (or helplessness), and tender joints (21). When studies provided sufficient data, effect sizes were calculated for each of these outcomes using Cohen's d (22), weighted for sample size. We examined effects on each of these outcomes postintervention (i.e., the first time point following the treatment) and at followup (the last time point assessed). The Hedges correction (23) was applied to all effect sizes and data were pooled using the fixed-effects model. Finally, we employed the Cochran Q-test for homogeneity, which tests whether there is variability within the set of effect sizes.
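The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual analysis code: the trial summary data are hypothetical, higher scores are taken to mean better outcomes (the sign would be flipped for outcomes such as pain), the Hedges correction is taken to be the usual small-sample factor J = 1 - 3/(4N - 9), and the variance formula is the common large-sample approximation. The function names are ours.

```python
import math


def cohens_d(mean_treat, sd_treat, n_treat, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_treat - 1) * sd_treat ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / (
        n_treat + n_ctrl - 2
    )
    return (mean_treat - mean_ctrl) / math.sqrt(pooled_var)


def hedges_g(d, n_treat, n_ctrl):
    """Apply the small-sample correction factor J = 1 - 3 / (4N - 9)."""
    return d * (1 - 3 / (4 * (n_treat + n_ctrl) - 9))


def effect_variance(g, n_treat, n_ctrl):
    """Common large-sample approximation to the variance of g."""
    n_total = n_treat + n_ctrl
    return n_total / (n_treat * n_ctrl) + g ** 2 / (2 * n_total)


def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooling plus Cochran's Q and its df."""
    weights = [1 / v for v in variances]
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    q = sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))
    return pooled, se, q, len(effects) - 1


# Hypothetical summary data: (mean, SD, n) for treatment, then control.
trials = [
    (5.0, 2.0, 20, 4.0, 2.0, 20),
    (6.0, 2.5, 30, 5.5, 2.5, 30),
    (4.5, 1.8, 25, 4.0, 1.8, 25),
]
effects, variances = [], []
for m_t, s_t, n_t, m_c, s_c, n_c in trials:
    g = hedges_g(cohens_d(m_t, s_t, n_t, m_c, s_c, n_c), n_t, n_c)
    effects.append(g)
    variances.append(effect_variance(g, n_t, n_c))

pooled, se, q, df = pool_fixed(effects, variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled g = {pooled:.2f}, 95% CI {low:.2f} to {high:.2f}")
print(f"Cochran Q = {q:.2f} on {df} df")
```

Under the null hypothesis of homogeneity, Q is referred to a chi-square distribution with k - 1 degrees of freedom, where k is the number of trials; a nonsignificant Q is what makes the fixed-effects model a defensible choice.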


Our initial search yielded 64 controlled clinical trials that examined psychological or behavioral interventions for RA. Thirty-nine studies were excluded, leaving 25 trials that met our inclusion criteria (Table 1) (24–48). Principal reasons for exclusion were lack of randomization, having an inadequate control condition (or comparison to another active treatment), the intervention being solely educational (i.e., not psychological) in nature, or being unpublished (although we were able to identify only 1 unpublished randomized controlled trial that met our other inclusion criteria). In some instances (49, 50), there was insufficient detail to determine the precise nature of the intervention and whether or not it was predominantly informational (as opposed to psychological). If it appeared the trial was primarily informational, it was not included in our analysis.

Table 1. Summary of study characteristics and findings
| First author (year) | Trial design | Sample size | Treatment | Control condition | Primary outcomes | Results | Quality score | Jadad score | Effect sizes post (followup) |
|---|---|---|---|---|---|---|---|---|---|
| Applebaum (1988) (24) | Randomized trial, 2 arms | 18 | 10 trials progressive muscle relaxation; 10 trials thermal biofeedback (w/autogenic phrases); cognitive pain management administered in 10 sessions over 6 weeks | Symptom monitoring wait list control | Postintervention and 18 months: Pain, sleep, depression, anxiety, disability, grip strength, timed walking | Reduction in peak pain rating; positive changes in perceptions of pain; changes in communication (daily activities questionnaire) and on joint range of motion; no effects at follow up | 5 | 2 | Pain 0.50 (−0.38); Disability 0.68 (−0.15) |
| Bradley (1987) (25) | Randomized trial, 3 arms | 53 | Cognitive-behavioral therapy consisting of 5 individual thermal biofeedback sessions and 10 small group meetings | Social support: 15 sessions with family or friends, active listening by group leaders, patients encouraged to develop coping strategies but not taught any skills; usual medical care | Post intervention and 6 months: Anxiety, depression, pain, locus of control, helplessness, rheumatoid factor, sedimentation rate, grip strength, tender joints | Positive changes in pain and disease activity as measured by composite “Rheumatoid Activity Index” post-intervention. No effects at follow-up | 5 | 1 | Pain 0.44 (−0.02); Tender joints 0.88 (−0.23); Depression 0.27 (0.47); Self-efficacy 0.20 (0.21) |
| DeVellis (1988) (26) | Randomized trial, 2 arms | 101 | Individualized problem solving intervention based on psychosocial interview (1 hour long) | Non-intervention control | Post-intervention: Function, general well being, self-esteem, problem resolution, treatment compliance | Treatment group showed improvement in compliance but not in psychosocial or physical function | 6 | 2 | |
| Flor (1983) (27) | Randomized trial, 3 arms | 24 | EMG biofeedback: 12 sessions over 2 weeks (no instructions for relaxation); also presented with cognitive model of disorder; home practice of relaxation after 3rd session | Pseudo therapy (mock biofeedback), told machine would relax and massage muscles; usual medical care | Post intervention/4–5 month follow up: Pain, medication use, physician consultations | Positive treatment effects post-intervention and at follow up in both pain and cognitions about pain | 5 | 3 | Pain 1.01; Coping 1.01 (1.41) |
| Germond (1992) (28) | Randomized trial, matched 2 group | 24 | 8 week (16 two hour meetings) stress inoculation/pain management training (progressive muscle relaxation, coping, neurolinguistic programming) | Control group met for 4 sessions (1–2 hours each) | Post intervention and at 8 weeks: Articular index, coping, locus of control, stress, mood, disability, pain, sedimentation rate, lymphocyte proliferation rate | No significant treatment effects | 5 | 2 | |
| Kaplan (1981) (29) | Randomized trial, matched 2 group | 34 | 12 weeks of nondirective/client centered group counseling (1–2 hours) | Usual care | Post intervention: Human service scale (satisfaction of human needs); self concept, depression; joint counts and tenderness, proximal interphalangeal joint ring sizes, duration of morning stiffness and impression of rheumatologist | Positive effects on understanding and knowledge of disease; some self-concepts (family self and self satisfaction) improved | 7 | 2 | Depression 0.73 |
| Kelley (1997) (30) | Randomized trial, 2 groups; matched for sex and disease status | 72 | Talk for 15 minutes in private on 4 consecutive days discussing topic into tape recorder (most stressful or significant traumatic event) | Controls described neutral pictures | 2 weeks post intervention, average of 3 month follow up: Stress, mood, disability (health status); joint count, grip strength, walking time | At 2 weeks groups did not differ but at 3 months treatment group showed improvements in affective disturbance and physical functioning; no effect on pain or joint condition | 8 | 4 | Pain 0.00 (−00); Tender joints (0.03); Depression −0.24 (0.66); Disability 0.08 (0.47) |
| Kraimaat (1995) (31) | Randomized trial, 3 arms | 77 | Cognitive behavior therapy (coping with pain, stress, relaxation, active coping, rational thinking); 10 weekly sessions of 2 hours each | Occupational therapy (usual care in the Netherlands); wait list control | Post treatment and at 6 months: Sedimentation rate, ESR, C-reactive protein, walking time and joint score, health status, pain and coping | Positive effect in only 1 of 6 pain coping strategies; no other significant treatment effects | 4 | 2 | Pain 0.05 (0.23); Depression −0.19 (0.22); Disability 0.46 (−0.16) |
| Lavigne (1992) (32) | Randomized trial, 2 arms | 8 | 6 bi-weekly individual sessions over 3 months with parent and child; PMR, EMG and thermal biofeedback with imagery | Delayed treatment group (wait list control) | Pain diaries; pediatric pain behavior questionnaire; physician ratings, physical therapist ratings, child behavior checklist | Significant changes only in maternal pain ratings for treatment group | 5 | 2 | Pain 0.58 |
| Leibing (1999) (33) | Randomized trial, 2 arms | 55 | 12 weekly 90 minute sessions of cognitive behavioral treatment involving information, relaxation/imagery, pain management, active coping | Usual medical care | Posttreatment (3 months) and 9 month followup to be reported later: C-reactive protein, sedimentation rate, grip strength, joint count, functional capacity, medication use, pain, anxiety, depression, helplessness, coping | | 6 | 2 | Pain 0.30; Depression 0.48; Disability 0.41; Tender joints −0.35; Coping 0.45; Self efficacy 0.69 |
| Lindroth (1997) (34) | Randomized trial, 2 arms | 96 | “Problem based” education that included relaxation training: 8 weeks, professional led, 2.5 hours per session | Wait list control | Post intervention and 12 month followup: Pain and disability | Effects posttreatment but not at 12 month followup in pain; fewer problems concerning self-confidence and relations to friends reported at followup | 5 | 2 | Pain 40 (.00); Depression −0.19 (0.22); Disability 0.08 (0.47); Self efficacy 0.00 (0.00) |
| Lundgren (1999) (35) | Randomized trial, 2 arm | 68 | 5 weeks supervised muscle relaxation training and 5 weeks “imaginary” techniques for pain reduction; 30 minutes twice weekly; used taped instructions at classes | No treatment control | Post intervention and 6 and 12 month follow up: Pain, disability, muscle function, sedimentation rate | Improvements in self care, recreation and pastime activities; at 6 months, treatment group showed improved mobility and arm function and muscle endurance and balance (no effects at 12 months) | 6 | 2 | |
| Maisiak et al. (1996) (47) | Randomized trial, 3 arms | 219 | Telephone counseling (using the model of “reality therapy” emphasizing control of stress, self-care, and improving communication with health care practitioners); 11 scheduled contacts over 9 months | Symptom monitoring; usual medical care | Physical and affective function, and pain status (as measured by the AIMS2) | Positive treatment effects observed post-intervention in affective and physical function but not pain | 7 | 2 | Pain 0.00; Depression 0.34; Disability 0.39 |
| Maisiak et al (1996) (48) | Randomized trial, parallel group 2 arm | 58 | “Person-centered” telephone counseling focusing on affective states and emotional coping (15–30 minute sessions every 4–6 weeks for 6 months) | Usual medical care | Physical and affective function, and pain status (as measured by the AIMS2) | No significant effects | 7 | 2 | Pain 0.23; Depression −0.03; Disability −0.13 |
| O'Leary (1988) (36) | Randomized trial, 2 arms | 33 | Cognitive behavioral: 5 weeks, 2 hour meeting once per week; included relaxation with imagery, attention refocusing, dissociation, relabeling, and self-encouragement as well as communication skills training | Controls received Arthritis Helpbook (as did treatment group) but no other instruction | Pain, disability, joint pain count, self efficacy, depression, stress, loneliness, sleep, sedimentation rate, T-lymphocyte subsets; Perceived Stress Scale; UCLA Loneliness; sleep behavior; ESR; T-lymphocyte subsets | Significant improvements in average and highest pain, joint impairment, self efficacy, no changes in psychological measures or sleep or immunologic measures; at 4 month followup: only changes in self efficacy and coping remained significant | 7 | 2 | Pain 0.69; Depression 0.22; Disability 0.20; Tender joints 0.70; Coping 0.54; Self efficacy 0.82 |
| Parker (1988) (37) | Randomized trial, 3 arms | 83 | Cognitive behavioral pain management (coping strategies, relaxation, family dynamics); 1 week inpatient program followed by an average of 6.6 support group sessions focusing on maintenance of treatment gains | Attention control (education); usual care | Post intervention and 6/12 months: Pain, disability, depression, coping, stress, helplessness, grip strength, sedimentation rate, walking speed, and estimates of morning stiffness, joint counts | Significant changes in pain coping evidenced postintervention and at both followups | 7 | 3 | |
| Parker (1995) (38) | Randomized trial, 3 arms; stratified according to functional class, clinic site, and stress level | 141 | 10 weekly individual 1.5 hour sessions of stress management training; 15 month maintenance program, subjects seen at least once every 3 months | Attention control (patient education); usual care control group | Post intervention, 3 month, 15 month followup: Pain, joint counts, stress, helplessness, depression, anxiety, self-efficacy, coping | Postintervention significant improvements in pain, stress, helplessness, self-efficacy, and coping with pain; at 15 month followup improvements in helplessness, self-efficacy, and coping | 8 | 2 | |
| Radojevic (1992) (39) | Randomized trial, 4 arms | 59 | Once a week behavioral therapy (coping, relaxation, imagery) for 90 minutes over 4 weeks followed by 2 weeks of self practice (skill consolidation); one arm with family support, one arm without | Attention control (education with family support, 4 video educational presentations); usual care no treatment control | Pre and post-intervention and 2 month follow up: Pain, physical functioning, psychological status, depression, joint condition, coping | Significant effects on severity of swelling and number of swollen joints (addition of family support not significant); no significant effects at followup | 3 | 1 | Pain 0.12 (0.36); Depression −0.25 (0.04); Disability 0.06 (0.06); Self efficacy 0.00 (0.00); Tender joints 0.19 (0.51); Coping 0.15 (0.15) |
| Scholten (1999) (40) | Randomized trial, 2 arms (followed by 5 year longitudinal study) | 68 | Training during 9 sessions scheduled over 2 weeks; psychological group counseling including coping strategies, relaxation, self efficacy enhancement | Wait list control | 2 weeks, 6 weeks, 52 weeks: Disability, coping, depression | Results unclear based on inappropriate statistical analysis (no between group analysis); coping and disability were improved at 5 year follow up (within group analysis) | 4 | 2 | |
| Sharpe (2001) (41) | Randomized trial, 2 arms | 53 | Cognitive-behavioral therapy (relaxation, pain coping, education) consisting of eight individual sessions over eight weeks | Usual care | Assessed postintervention and 6 month followup on: Anxiety, depression, pain, coping, disability, tender joints, ESR, C-reactive protein | Treatment group showed significant improvements in depression post test (ES = .73) and at followup, C-reactive protein postintervention and joint involvement | 8 | 3 | |
| Shearn (1985) (42) | Randomized trial, 3 arms | 105 | Stress management (identify sources of stress and learn relaxation) and support group; 10 weekly 90 minute sessions | Support group (enhance self-responsibility, exchange information, build relationships, decrease isolation); no intervention control | Post-intervention, 4 and 8 months followup: Pain, sedimentation rate, grip strength, walking speed, number of tender joints, disability, life satisfaction, and depression | Positive treatment effects observed postintervention in joint tenderness; no effects at followup | 5 | 3 | Pain .14; Depression −.07; Disability −.37; Tender joints .27 |
| Smyth (1999) (43) | Randomized trial, 2 arms | 49 | Writing about “the most stressful experience they'd ever undergone” for 20 minutes on 3 consecutive days | Control condition: asked to describe their plans for the day, time management to reduce stress (both groups informed that researchers were interested in their experience of stress) | 2 weeks, 2 months, 4 months: Physician interviews to assess disease activity, symptom severity, distribution of pain, tenderness, and swelling, daily function and psychosocial well being | 4 months postintervention, treatment group showed significant reductions in disease activity (considered clinically significant showing one category of improvement); improvement not evidenced at earlier time points | 9 | 4 | Tender joints −0.24 (1.00) |
| Strauss (1986) (44) | Randomized trial, 3 arms | 57 | Conventional psychotherapy for 6 months (mutual support through sharing of experiences and emotions) (N = 20); group assertion/relaxation training for 3 months (N = 17) | Non-intervention control | Postintervention and 1 year followup: Functional status, psychosocial adaptation, physician rating of disease activity | No significant treatment effects observed for either intervention at any time point | 4 | 2 | |
| Taal (1993) (45) | Randomized trial, 2 arms | 75 | 5 weekly 2 hour group sessions; professional led educational groups with relaxation and guided imagery, coping with depression, pain management | Non-intervention control | Postintervention, 4 and 14 months: Joint tenderness, sedimentation rate, hemoglobin, thrombocytes, depression, pain, anxiety, self-efficacy | Post-intervention positive effects observed in functional disability, self-efficacy, and practice of exercise; at 14 months only changes in self-efficacy and practice of exercise remained significant | 5 | 2 | Disability 0.61 (0.00); Tender joints 0.00 (0.45); Depression 0.00 (0.00); Self efficacy 0.57 (0.55) |
| Van Deusen (1987) (46) | Randomized trial, 2 arms | 46 | ROM Dance Program (combines Tai Chi movements with relaxation exercises, biofeedback and discussion of coping with stress); 8 ninety-minute weekly classes | Wait list control | Postintervention and 4 months: Range of motion (goniometry); range of motion on selected joints (goniometry) and shoulder flexion and rotation significantly improved in treated group | Significant effects in treated subjects at posttest in shoulder flexion, rotation, ankle flexion and lower extremity flexion; at 4 month followup total upper extremity showed improvement in treatment group | 5 | 2 | Disability 0.51 (0.47) |

Study characteristics

Study characteristics are summarized in Table 1. Across the 25 trials, sample sizes ranged from a low of 8 to a high of 219 with a mean of 67. Average length of time that patients had the disease was 10.6 years (SD 3.73). Thirteen studies (52%) (24, 25, 28, 31, 33, 36–42, 45) were either described by the researchers or could be characterized as multimodal, cognitive–behavioral interventions. These protocols typically involved some combination of relaxation, imagery, stress management, or the teaching of cognitive coping skills. Five studies (24, 25, 27, 32, 46) also included biofeedback as 1 of the treatment components. Five studies employed more traditional psychotherapeutic interventions, both group based (29, 34) and individual (26, 47, 48), and the intervention in 2 studies (30, 43) involved subjects' writing or speaking about difficult emotional or stressful experiences. Length of the interventions varied from 3 days to 9 months with a mean of 9.8 weeks. One study (38) utilized a “refresher” course following the actual intervention. Nineteen of the 25 studies collected followup data with followup time periods ranging from 2 to 18 months (mean of 8.6).

Methodologic quality of trials

Across studies, the mean score on our 10-point quality rating scale was 5.84, range 3–9. We considered trials scoring ≥7 to be high quality, those scoring 5–6 of average quality, and those scoring ≤4 to be low quality. The mean score on the Jadad scale was 2.24. Looking at the specific items that make up the Jadad quality scale (19) and the additional criteria we added (see Figure 1), the following was observed. All trials were randomized (a criterion for inclusion in the review). Five of the 25 studies (30, 37, 41–43) described a proper method of randomization (e.g., table of random numbers, computer generated). Patients were deemed to be blind to the treatment condition in 5 trials (27, 30, 37, 38, 43), although as noted earlier, such blinding is typically either not possible or not ethical in many behavioral interventions. Following van Tulder et al (20), in studies that utilized an attention control as the placebo condition, we considered patients to be blind if the researchers performed a treatment credibility analysis and there were no significant differences between the intervention and control groups in terms of the participants' assessment of credibility. Most trials (92%) described reasons for dropouts or withdrawals. We also assigned additional quality points to 15 trials that either statistically compared dropouts to study completers or that had fewer than 10% who dropped out or were lost to followup.

Because practitioners frequently cannot be blinded to the treatment condition in behavioral trials, as they can in drug trials, we also looked at whether the assessor or evaluator was blinded. In 11 trials (44%) (25, 26, 29, 32, 35, 36, 38, 41, 43, 47, 48), the assessor (i.e., data collector) was described as blinded to group assignment. Only 1 trial (43) described a proper method of allocation concealment. Five studies (20%) (26, 31, 34, 42, 44) did not adequately describe the control condition (e.g., simply referred to a control group but did not indicate the extent, if any, of contact with researchers or clinical staff). Finally, the majority of trials (92%) reported comparing baseline characteristics of treatment and control groups and 72% controlled for baseline differences in the statistical analysis (for which an additional quality point was given).
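The three quality tiers defined above can be written as a small helper and checked against Table 1. The sketch below is illustrative only: the function name is ours, and the list of scores is our transcription of the quality score column, read in table order, so any transcription error would carry through.

```python
from collections import Counter


def quality_tier(score):
    """Map a 10-point quality score to the tiers used in this review."""
    if score >= 7:
        return "high"
    if score >= 5:
        return "average"
    return "low"


# Quality scores as transcribed from Table 1, in table order (assumption:
# the flattened "quality score / Jadad score" digits were split correctly).
scores = [5, 5, 6, 5, 5, 7, 8, 4, 5, 6, 5, 6, 7,
          7, 7, 7, 8, 3, 4, 8, 5, 9, 4, 5, 5]

tiers = Counter(quality_tier(s) for s in scores)
mean_score = sum(scores) / len(scores)
print(dict(tiers), round(mean_score, 2))
```

With the scores as transcribed, the mean reproduces the reported 5.84 and the high-quality tier contains 9 trials, which matches the number used in the sensitivity analysis.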

Study outcomes

Table 1 summarizes the study characteristics, including sample size, type of intervention and control, outcomes assessed, principal findings, and quality ratings for the 25 trials. Figure 2 shows the pooled effect sizes both postintervention and at followup for our 6 selected outcome measures. When there was more than one followup time point, we used the final time point for all calculations.

Figure 2. Effect sizes.


Pain

Nineteen studies looked at pain as an outcome. In the majority of cases, the measure that was used to calculate effect sizes was a standard visual analog scale (51). The average effect size for the 13 trials that provided sufficient data to calculate effect size was 0.22 (P = 0.003; 95% confidence interval [95% CI] 0.07–0.37). At followup, the effect size was not significant (effect size = 0.06; 95% CI −0.17–0.29 for 6 studies).
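As a reading aid, the relationship between a pooled effect size, its 95% CI, and its P value can be sketched as follows. This is a rough check under a normal approximation, not the authors' computation; the function name is ours, and the inputs are the postintervention pain figures reported above.

```python
import math


def p_from_ci(effect, ci_low, ci_high):
    """Recover the standard error, z statistic, and two-sided p from a 95% CI."""
    se = (ci_high - ci_low) / (2 * 1.96)
    z = effect / se
    # For a standard normal variable, P(|Z| > z) equals erfc(z / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_two_sided


# Pooled postintervention pain result as reported in the text.
se, z, p = p_from_ci(0.22, 0.07, 0.37)
print(f"SE = {se:.3f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

Under these assumptions the interval implies a two-sided p of roughly 0.004, in line with the reported P = 0.003 once rounding of the published interval is allowed for.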

Functional disability

Across studies, several different self-report measures were used to assess functional disability. These included the Health Assessment Questionnaire (52) and the Arthritis Impact Measurement Scales (53). Eighteen studies assessed disability. Based on data from the 12 trials that provided sufficient data to calculate effect sizes, the pooled postintervention effect size was 0.27 (P = .00001, 95% CI 0.12–0.42). At followup, the effect size of 0.12 (95% CI −0.09, 0.33), based on 7 trials, was no longer significant.

Tender joints

The average effect size postintervention for tender joints (based on the 7 trials that provided sufficient data to calculate effect sizes for this outcome) was 0.15 (95% CI −0.09, 0.39) and was not significant. However, at followup, the effect size (based on data from 5 trials) was significant (effect size = 0.30; P = 0.005; 95% CI 0.04–0.56).

Psychological status

Across studies, the most frequently employed measure of psychological status was depression, e.g., the Center for Epidemiological Studies-Depression Scale (54), the Beck Depression Inventory (55). Based on results from 12 of the 19 trials that included depression or psychological status, e.g., affective health status as measured by the Arthritis Impact Measurement Scales 2 (47), as an outcome, we found a significant average effect size of 0.15 (P = 0.03, 95% CI −0.01, 0.31), postintervention. At followup, the average effect size (from 5 trials) was 0.33 (P = 0.01; 95% CI 0.07, 0.59).


Coping

Twelve studies included some measure of psychological or cognitive-emotional coping (i.e., with pain or related symptoms of the disease). Across the 6 outcomes we selected for analysis, the average effect size for coping was the largest. Based on 4 trials at postintervention, there was an average effect size of 0.46 (P = 0.007; 95% CI 0.09–0.83). At followup, data from 3 trials showed an average effect size of 0.52 (P = 0.04; 95% CI −0.07–1.11).

Self efficacy

We examined outcomes that looked at patients' sense of control or self efficacy, which has been defined as the subjective assessment that one has the internal–external resources to cope with a given or hypothetical situation. To increase our sample size, we combined studies that used either the Arthritis Self-Efficacy Scale (56) or Arthritis Helplessness Index (57). Among the 8 trials that assessed either of these outcomes, there was a significant effect size (based on 5 studies) postintervention (0.35; P = 0.017; 95% CI 0.11–0.59). At followup, the average effect size of 0.20 (95% CI −0.08, 0.48) based on only 3 trials was no longer significant.

Homogeneity of effect sizes

For each pooled effect size we calculated, the Q-test was not significant, indicating that the results across trials were homogeneous.
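The homogeneity check described above can be illustrated with a minimal sketch. It assumes inverse-variance (fixed-effect) pooling and Cochran's Q, which may differ in detail from the procedure actually used in this review. The function names and the three effect sizes and variances are hypothetical and chosen only for illustration; they are not data from the included trials.

```python
import math

def pooled_effect(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate with a 95% CI."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_w
    se = math.sqrt(1.0 / total_w)
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

def cochran_q(effects, variances):
    """Cochran's Q and its degrees of freedom (number of trials minus 1)."""
    weights = [1.0 / v for v in variances]
    pooled, _ = pooled_effect(effects, variances)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    return q, len(effects) - 1

def chi2_sf(x, df):
    """Upper-tail chi-square probability, using the power series for the
    regularized lower incomplete gamma function (adequate for small Q)."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-12 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

# Hypothetical effect sizes and sampling variances for three trials
effects = [0.10, 0.25, 0.40]
variances = [0.04, 0.05, 0.08]

pooled, ci = pooled_effect(effects, variances)
q, df = cochran_q(effects, variances)
p_value = chi2_sf(q, df)
print(f"pooled = {pooled:.2f}, 95% CI {ci[0]:.2f}, {ci[1]:.2f}")
print(f"Q = {q:.2f} on {df} df, P = {p_value:.2f}")
```

A nonsignificant Q (P above 0.05) is what the text refers to as homogeneous results. With few trials per outcome the test has limited power to detect heterogeneity, so a nonsignificant Q is weak evidence of homogeneity.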

Sensitivity analyses

Quality of trials

We examined whether the methodologic quality of the trials (as assessed by our 10-point rating scale) was related to the magnitude of the observed effect sizes. To do this, we calculated pooled effect sizes and significance levels for higher quality studies, i.e., those that had received a quality score ≥7 (n = 9). We were not able to calculate effect sizes for tender joints, coping, and self efficacy in this analysis because of the small number of studies that examined these outcomes. For pain (0.11) and disability (0.20), pooled effect sizes were somewhat smaller in trials judged to be of higher methodologic quality than in trials scoring lower in quality (0.32 and 0.33, respectively). The reverse pattern was observed for psychological status, with higher quality trials showing a higher pooled effect size (0.23) than lower quality trials (0.03). Overall, correlations between trial quality and effect sizes for each of the 6 outcomes we examined were not statistically significant.

Publication bias

Several methods have been developed to examine whether the results of meta-analyses may be affected by publication bias (i.e., the selective nonpublication of negative or null findings). We employed the “fail-safe n” (FSN) as one such assessment of publication bias (58). FSN estimates the number of findings of zero effect that would have to be present to render the pooled effect size nonsignificant. We calculated the FSN for each outcome that had a statistically significant averaged effect size postintervention. The results were as follows: pain FSN = 22, disability FSN = 49, depression FSN = 0, coping FSN = 10, and self efficacy FSN = 8. Rosenthal (59) has suggested that a reasonable tolerance level for the fail-safe value would be 5K + 10 (with “K” being the number of studies). Judged against this tolerance level, publication bias cannot be ruled out in the present meta-analysis. Along with calculating the FSN, we also examined funnel plot graphs for the 3 outcomes that had the largest number of effect sizes. The funnel plots for disability and psychological status appeared normal (i.e., did not suggest the presence of any publication bias); the plot for pain suggested an absence of studies with smaller sample sizes yielding larger effect sizes. Therefore, it is not clear from these analyses whether, or to what extent, publication bias may have been present.
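As a rough illustration of the fail-safe logic, the sketch below uses Rosenthal's formulation, in which the FSN is derived from the summed z scores of the individual trials. This is an assumption, since the exact FSN variant used here is not stated. The z scores passed to `rosenthal_fsn` are hypothetical. The FSN values and trial counts in `reported` are those given in the text for the three outcomes whose number of contributing trials is stated.

```python
Z_ALPHA = 1.645  # one-tailed critical z for alpha = 0.05 (assumed criterion)

def rosenthal_fsn(z_scores):
    """Number of additional zero-effect findings needed to bring the
    combined (Stouffer) z below the critical value."""
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (Z_ALPHA ** 2) - k

def tolerance(k):
    """Rosenthal's suggested tolerance level, 5K + 10."""
    return 5 * k + 10

# Hypothetical z scores from three trials, for illustration only
example_fsn = rosenthal_fsn([2.0, 1.5, 2.5])
print(f"example FSN = {example_fsn:.1f}, tolerance = {tolerance(3)}")

# FSN values and numbers of trials as reported in the text
reported = {
    "depression/psychological status": {"fsn": 0, "k": 12},
    "coping": {"fsn": 10, "k": 4},
    "self efficacy": {"fsn": 8, "k": 5},
}
below_tolerance = {
    name: r["fsn"] < tolerance(r["k"]) for name, r in reported.items()
}
for name, is_below in below_tolerance.items():
    r = reported[name]
    print(f"{name}: FSN = {r['fsn']}, 5K + 10 = {tolerance(r['k'])}, "
          f"below tolerance: {is_below}")
```

For each of these three outcomes the reported FSN falls below 5K + 10, which is why the FSN results cannot exclude publication bias.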

We also examined whether our results may have been affected by any reporting biases, that is, the tendency to only report results (or provide sufficient data to calculate effect sizes) for outcomes that are statistically significant. Such selective reporting of outcomes can potentially bias effect size estimates in meta-analyses (58). As shown in Figure 3, we calculated adjusted effect sizes based upon an assumed value of zero in cases where effect sizes for nonsignificant results could not be calculated, and an assumed effect size based on a P value equal to 0.05 for positive effects that could not be calculated. Overall, these adjusted effect sizes were somewhat smaller. However, only for the variable of psychological status (postintervention) did these adjustments alter the statistical significance of the findings.

Figure 3.

Adjusted effect sizes. Effect sizes were adjusted by inputting 0.00 when authors reported null findings but effect sizes could not be calculated, and by inputting an effect size based on P = 0.05 when authors reported positive findings but effect sizes could not be calculated.
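The adjustment rule described in the Figure 3 legend can be sketched as follows. The conversion from a P value to a standardized mean difference uses a common normal approximation (d ≈ z × sqrt(1/n1 + 1/n2)) with a two-tailed P value. Both are assumptions, as is the simple unweighted mean used to summarize the adjusted values. The trial records are hypothetical.

```python
from statistics import NormalDist, mean

def d_from_p(p_two_tailed, n_treat, n_control):
    """Approximate standardized mean difference implied by a two-tailed
    P value and the two group sizes (normal approximation)."""
    z = NormalDist().inv_cdf(1.0 - p_two_tailed / 2.0)
    return z * (1.0 / n_treat + 1.0 / n_control) ** 0.5

def adjusted_effect(trial):
    """Return the computed effect size if available; otherwise impute 0.00
    for a reported null finding, or the effect size implied by P = 0.05
    for a reported positive finding."""
    if trial.get("d") is not None:
        return trial["d"]
    if trial["reported"] == "null":
        return 0.0
    return d_from_p(0.05, trial["n_treat"], trial["n_control"])

# Hypothetical trials: two with computable effect sizes, two reporting null
# findings without usable data, one reporting a positive finding without data
trials = [
    {"d": 0.40},
    {"d": 0.30},
    {"d": None, "reported": "null"},
    {"d": None, "reported": "null"},
    {"d": None, "reported": "positive", "n_treat": 30, "n_control": 30},
]

unadjusted = mean(t["d"] for t in trials if t["d"] is not None)
adjusted = mean(adjusted_effect(t) for t in trials)
print(f"unadjusted mean = {unadjusted:.2f}, adjusted mean = {adjusted:.2f}")
```

In this made-up example the adjusted mean is smaller than the unadjusted mean because the imputed zeros outweigh the single imputed positive effect, mirroring the direction of the adjustment reported above.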

Effect of intervention and control group

As illustrated in Figure 4, we compared effect sizes in studies that used a wait list or treatment as usual control condition with those that employed an attention, education, or placebo control. For pain and disability, effects were quite comparable. For psychological status, the effect size in studies utilizing a wait list/usual care control (0.29) was larger than in those studies employing attention/placebo control (0.08). However, for tender joints, the reverse was observed with the effect size in studies using an attention/placebo control (0.31) being considerably larger than in those studies using a wait list/usual care (−0.01).

Figure 4.

Effect sizes by control group type.

We also examined effect sizes for studies that were either described as or deemed to be cognitive–behavioral and compared these with studies that tested any other intervention. The non-cognitive–behavioral category was too varied to be broken down further. As shown in Figure 5, across these 2 intervention categories, only minor differences were observed for the outcomes of pain, disability, and psychological status.

Figure 5.

Effect sizes by intervention type.

Subgroup analysis

Illness duration

We explored whether effect sizes differed as a function of the length of time subjects had RA. Among the 18 studies that reported illness duration, we analyzed effect sizes for studies in which the illness duration was greater or less than 11.5 years (the median split). Pooled effect sizes were somewhat lower in those studies in which patients had, on average, lived with the disease for a longer period of time. Results were as follows, with effect sizes for greater illness duration listed first: pain, 0.19 and 0.46; disability, 0.33 and 0.45; psychological status, 0.08 and 0.34; and coping, 0.43 and 0.49. These analyses should be interpreted with some caution, however, because of the small number of trials constituting each of the pooled effect sizes.


The results of our systematic review of psychological interventions (e.g., relaxation, biofeedback, cognitive–behavioral therapy, stress management) suggest that such approaches may be effective adjuncts to the conventional medical management of RA. Based on data from 25 randomized controlled trials that compared these interventions to usual medical care, wait-list, or attention controls, we found small but statistically significant average effect sizes for pain, functional disability, depression, coping, and self efficacy posttreatment. At followup (which averaged 8.5 months), effect sizes for coping, psychological status, and tender joints were significant, whereas those for pain, disability, and self efficacy became nonsignificant.

In part owing to our rather stringent inclusion criteria, the majority of trials (84%) were deemed to be of average to high quality (scoring 5 or higher on our 10-point quality rating scale), although quality was quite variable. Our subgroup analysis of the 8 trials judged to be of highest methodologic quality found no clear relationship between study quality and outcomes (among higher-quality studies, effect sizes were lower for 2 outcomes and higher for 1 outcome). That being said, the failure of the majority of trials to describe the method of randomization, use appropriate allocation concealment procedures, and/or ensure blinding of trial investigators/data evaluators represents a significant methodologic limitation that may have biased the results.

It is also possible that our findings may have been due to some publication bias (selective reporting or publication of positive results). However, the 2 tests we employed (FSN and funnel plot) produced conflicting results, making it difficult to determine whether or to what extent publication bias may have influenced the findings.

As in the 2 previous meta-analyses of patient education programs (15, 16), the effect sizes in our review, although statistically significant for most outcomes, were relatively small. In the face of these findings, the question arises, what, if any, is the clinical significance of these effects? As noted earlier, Superio-Cabuslay et al (16) have suggested that the relatively small effect sizes they observed were clinically relevant because, in most instances, patients in these interventions were already receiving standard medical care and, therefore, any changes in outcome (e.g., pain, function, number of tender joints, psychological status) over and above what patients were receiving as a result of their usual care would be clinically important.

We have highlighted a number of additional methodologic issues, as well as theoretical questions, that researchers may want to consider addressing in future studies.

Inconsistent results

Although we found statistically significant pooled effect sizes for most of the outcomes we analyzed, these effects were relatively small. In addition, at the individual study level, the majority of trials failed to show significant treatment effects on the 6 outcomes we examined, and a number of cases actually showed a negative treatment effect (i.e., controls showed greater improvement). One explanation for these findings is lack of statistical power. For example, sample sizes were fairly small across trials, with 11 of the 25 studies having fewer than 40 total participants. Along with ensuring adequate statistical power, questions that should be addressed in future trials include: 1) are certain of these interventions more effective than others (see below); 2) do certain patient characteristics (e.g., degree of disability, comorbid conditions such as depression, personality factors) make one more or less responsive or receptive to these kinds of interventions; 3) to what extent does efficacy vary as a function of either disease severity or the particular permutations of RA. Most studies that we reviewed did not examine treatment effectiveness as a function of any specific patient characteristic. However, our preliminary subgroup analysis examining illness duration suggests that such psychological interventions may be more effective in those patients who have had the disease for a shorter period of time.
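The power problem can be made concrete with a short calculation. The sketch below uses a normal approximation for a two-group comparison with equal group sizes and a two-sided α of 0.05. These settings are assumptions for illustration and are not the analysis plan of any included trial. The effect size of 0.22 is the pooled postintervention value for pain given in the abstract, and 20 per group corresponds to a trial with 40 total participants.

```python
from math import ceil, sqrt
from statistics import NormalDist

_norm = NormalDist()

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison for a
    standardized mean difference d (normal approximation)."""
    z_crit = _norm.inv_cdf(1.0 - alpha / 2.0)
    shift = d * sqrt(n_per_group / 2.0)
    return _norm.cdf(shift - z_crit) + _norm.cdf(-shift - z_crit)

def n_per_group_for_power(d, power=0.80, alpha=0.05):
    """Approximate group size needed to reach the requested power."""
    z_crit = _norm.inv_cdf(1.0 - alpha / 2.0)
    z_power = _norm.inv_cdf(power)
    return ceil(2.0 * ((z_crit + z_power) / d) ** 2)

small_trial_power = power_two_sample(0.22, 20)
needed = n_per_group_for_power(0.22)
print(f"power with 20 per group: {small_trial_power:.2f}")
print(f"participants per group for 80% power: {needed}")
```

Under these assumptions, a trial with 40 participants has power of roughly 0.1 to detect an effect of this size, and several hundred participants per group would be needed to reach 80% power. This is consistent with most individual trials failing to show significant effects even though the pooled estimates were significant.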

Most effective interventions

It is unclear from our review whether certain of these interventions are more or less effective than others (our analysis comparing cognitive–behavioral interventions to other treatment types was inconclusive). Because many of these treatments were multimodal (e.g., integrating a variety of different psychosocial approaches), it is also difficult to determine whether particular component parts or certain combinations of modalities may have been responsible for the observed treatment effects. Therefore, future research should directly compare these different interventions against one another. In addition, it would be interesting to design trials that directly compare such psychological treatments to pharmacologic approaches.

Long-term effects

An additional question not fully answered by our meta-analytic review is the extent to which these psychological interventions produce long-term effects. For some outcomes (pain and disability), treatment effects appeared to diminish over time, becoming statistically nonsignificant at followup. However, in the case of psychological status (depression) and tender joints, effects became somewhat more pronounced over time. Future trials should explore the potential value of building booster/relapse prevention strategies into the trial designs, as was done by Parker et al in their study (38).

Methodologic issues

Factors that may have potentially biased the results of our meta-analysis and should be addressed in future trials include: 1) the frequent use of multiple statistical comparisons and the failure in almost all instances to control for these; 2) significant dropout rates in some studies without carrying out intent-to-treat analyses; and 3) the failure to report or provide adequate data on nonsignificant outcomes. In addition, only 1 trial reported using allocation concealment methods. Because there is some evidence that the absence of such methods may serve to inflate effect size estimates, future trials examining psychological interventions for RA should utilize such methods.

Mechanisms of action

If, as our review suggests, an array of psychological-behavioral therapies may lessen pain and disability and potentially impact disease activity in RA, the question arises: what are the mechanisms through which these interventions produce their favorable clinical outcomes? For example, research by Lorig and colleagues suggests that the benefits of arthritis patient education programs cannot be explained by changes in health behaviors, but may instead result from effects on arthritis patients' self efficacy (the belief that one has the ability to control or cope with the symptoms of the disease) (60–63). Although studies in the present review did not specifically examine whether improvements in health status were mediated by changes in self efficacy or coping, average effect sizes were significant for both of these outcomes, with the effects on coping being the largest and most consistently positive of all outcomes we examined.

Impact on disease activity

In the studies we reviewed, a number of different measures were employed to assess disease activity. Although the results of these trials suggest that psychological interventions do not significantly influence biologic markers for the disease (such as erythrocyte sedimentation rate or serum rheumatoid factor), they do suggest that these interventions can influence certain objective clinical indices of disease activity, specifically tender joints. As with improvements in health status, it is therefore important for future researchers to more carefully examine whether observed changes in tender joints may be mediated by certain psychological factors such as attenuation of stress reactivity (64) and to explore the potential neurophysiological pathways (such as changes in immune function) through which these interventions may ultimately influence disease processes.

Particularly given the failure of many patients with RA to respond to conventional pharmacologic approaches, and the frequent adverse effects that are associated with such treatments, we believe the results of our meta-analysis point to the potential value of incorporating an array of psychological/behavioral therapies as adjunctive treatment in the medical management of RA. Additional research is needed to clarify which of these psychological interventions (or combinations of interventions) are most effective and for which specific types of patients, and to examine whether or not such treatments can potentially reduce the use of and reliance upon pharmacologic approaches.