In 1997, all RA patients registered in 4 outpatient rheumatology clinics and assumed to fulfill the inclusion criteria (Table 1) after their records were screened for demographic and disease-related characteristics were invited by mail to participate in the trial. Patients who were willing to participate, who still met these criteria after a screening by two trained investigators (ZdJ and AJ), and who gave written informed consent were subsequently randomized. The medical ethics committee of each participating center approved the study protocol.
Table 1. Eligibility criteria for inclusion in the study*
|Age 20–70 years|
|RA according to ACR 1987 revised criteria (22)|
|ACR functional classes I–III (23)|
|Stable DMARD regimen in past 3 months|
|Able to cycle|
|Willing to exercise biweekly on fixed schedule|
|Living within a predefined adherence region of training and/or assessment center|
|No prosthesis of a weight-bearing joint|
|No cardiopulmonary disease precluding intensive exercise|
|No comorbidity causing a short life expectancy|
|No serious psychiatric disease|
|Able to complete a questionnaire|
Permuted-block randomization (blocks of 4), stratified by center, age (<50 years and >50 years), and sex, was used to allocate the patients either to a high-intensity exercise program (the Rheumatoid Arthritis Patients In Training [RAPIT] program) or to UC while preventing an unbalanced distribution between the groups. The allocation sequence was produced with a random digit generator. An administrative assistant, not aware of the block size, allocated the interventions.
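The allocation scheme described above can be illustrated with a minimal sketch. It is not the trial's actual procedure: the class name, the stratum labels, and the use of Python's pseudorandom generator in place of the random digit generator are all assumptions made for illustration.

```python
import random
from collections import defaultdict

ARMS = ("RAPIT", "UC")   # the two interventions named in the text
BLOCK_SIZE = 4           # block size stated in the text


class StratifiedBlockRandomizer:
    """Hypothetical helper: permuted blocks of 4 within each stratum."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        # One pending block per stratum (center, age group, sex).
        self._pending = defaultdict(list)

    def _new_block(self):
        # Each block holds every arm equally often, in random order.
        block = list(ARMS) * (BLOCK_SIZE // len(ARMS))
        self._rng.shuffle(block)
        return block

    def allocate(self, center, age_group, sex):
        stratum = (center, age_group, sex)
        if not self._pending[stratum]:
            self._pending[stratum] = self._new_block()
        return self._pending[stratum].pop()


randomizer = StratifiedBlockRandomizer(seed=1997)
first_block = [randomizer.allocate("center A", "<50", "F") for _ in range(4)]
print(first_block)  # two RAPIT and two UC allocations, in random order
```

Because every completed block of 4 within a stratum contains 2 allocations to each arm, the group sizes within a stratum can never differ by more than 2.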
The patients randomized to the RAPIT group participated in a supervised biweekly group exercise program, with each session lasting 1.25 hours. Each session had 3 parts: “bicycle training” (20 minutes), “exercise circuit” (20 minutes), and “sport or game” (20 minutes). Each session was preceded by a “warm-up” and followed by a “cool-down.”
Bicycle load was based on 2 indicators: 1) heart rate during bicycling and 2) rating of perceived exertion (range 0–10). During training, the heart rate was kept at ∼70–90% of the predicted maximal heart rate, and the rating of perceived exertion was kept at 4–5.
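As an illustration of how the two load indicators combine, the sketch below checks whether a measurement falls inside the stated targets. The text does not say how the maximal heart rate was predicted; the common rule of thumb 220 minus age is used here purely as an assumption, and the function names are hypothetical.

```python
def target_heart_rate_zone(age_years):
    """Return (low, high) beats/minute for 70-90% of predicted maximum.

    ASSUMPTION: predicted maximal heart rate = 220 - age. The source does
    not state which prediction formula was used.
    """
    predicted_max = 220 - age_years
    return round(0.70 * predicted_max), round(0.90 * predicted_max)


def within_target(heart_rate, perceived_exertion, age_years):
    """True if heart rate is in the zone and exertion (0-10 scale) is 4-5."""
    low, high = target_heart_rate_zone(age_years)
    return low <= heart_rate <= high and 4 <= perceived_exertion <= 5


print(target_heart_rate_zone(50))  # (119, 153)
```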
The “exercise circuit” consisted of 8–10 different exercises intended to improve muscle strength, muscle endurance, joint mobility, and activities of daily living. The proportion of exercise duration/rest duration changed from 90 seconds/60 seconds in the first weeks of the program to 90 seconds/30 seconds after 6 months. Within the exercise circuit, each exercise was repeated 8–15 times.
The “sport or game” section of the program consisted of impact-delivering sporting activities such as badminton, volleyball, indoor soccer, and basketball. Impact loading was also applied during the “warm-up” (stepping aside and jumping) and “exercise circuit” sections of the program.
If necessary, the program was adapted to individual disabilities to reach the same aims. Patients assigned to the UC group were treated by a physical therapist only if this was regarded as necessary by their attending physician.
The attending physicians of patients in each group were informed about the treatment allocation. The physicians had free choice with respect to their medical prescriptions and other treatment strategies, including any form of additional individual physical therapy with the exception of high-intensity weight-bearing exercises. Attendance at any group or individual physical therapy sessions apart from the RAPIT program was recorded in both groups.
At baseline, sociodemographic characteristics were registered along with disease duration, presence of rheumatoid factor, number of disease-modifying antirheumatic drugs (DMARDs) taken since diagnosis except for the current DMARD (“past number of DMARDs”), and radiographic damage of the hands and feet (Larsen/Scott method) (24). The Larsen score of the hands and feet ranges from 0 (no joint space narrowing, no erosions) to 200 (maximal possible damage). The primary end point of effectiveness was functional ability; secondary end points were physical capacity and emotional status. The primary end point of safety was radiographic damage of the large joints, and the secondary end point was disease activity.
Functional ability was assessed with the McMaster Toronto Arthritis (MACTAR) Patient Preference Disability Questionnaire (25) and the Health Assessment Questionnaire (HAQ) (26, 27). The MACTAR Questionnaire is a semistructured interview consisting of transitional questions and status questions. Transitional questions concern perceived changes in disease activity and ability to perform previously impaired activities. These activities were elicited from and ranked by the patient at baseline and again at 1 year. The evaluation took place 6 months and 12 months after each elicitation. The change of weighted score from each baseline could vary from −38 (maximum deterioration) to +38 (maximum improvement) (28). The HAQ total score ranged from 0 (no functional limitations) to 3 (serious functional limitations). The change score could thus vary from −3 (maximal improvement) to +3 (maximal deterioration).
Physical capacity was determined by aerobic fitness and muscle strength. Aerobic fitness was measured by means of a standardized ergometer test and is given in watts (29). Muscle strength of the knee extensors was measured with an isokinetic dynamometer at an angular velocity of 60°/second and is given in newtons (3).
Emotional status was assessed with the Hospital Anxiety and Depression Scale (HADS). The total HADS score ranges from 0 to 42, and higher scores indicate higher levels of anxiety and/or depression (30).
Radiographic damage of the shoulders, elbows, hips, knees, ankles, and subtalar joints was scored independently by two experienced readers (HMK and ZdJ) using the Larsen method (31) without information about the time sequence, patient identity, or group allocation. The Larsen score of the large joints (LLJ score) ranges from 0 (no joint space narrowing, no erosions) to 60 (maximal possible damage) and is presented as the mean of the scores of the two readers.
Disease activity was assessed with the original Disease Activity Score with 4 variables (DAS4) (32). The DAS4 is a composite index based on the number of swollen joints, tender joint score (Ritchie Articular Index [RAI]), erythrocyte sedimentation rate (ESR), and patient's global assessment of general health measured on a visual analog scale. The DAS4 ranges from 0 (no disease activity) to 10 (severe disease activity).
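For orientation, the sketch below shows how the four components are usually combined. The coefficients are those commonly cited for the original DAS and are not taken from this article; they should be checked against reference 32 before any use. The function name, the 44-joint swollen count, and the 0–100 mm scale for general health are likewise assumptions.

```python
import math


def das4(ritchie_index, swollen_joints_44, esr_mm_per_hr, general_health_mm):
    """Composite disease activity index from 4 variables.

    ASSUMPTION: coefficients as commonly published for the original DAS;
    the article itself does not list them.
    """
    if esr_mm_per_hr <= 0:
        raise ValueError("ESR must be positive to take its logarithm")
    return (
        0.53938 * math.sqrt(ritchie_index)   # tender joint score (RAI)
        + 0.06465 * swollen_joints_44        # swollen joint count
        + 0.330 * math.log(esr_mm_per_hr)    # natural log of ESR
        + 0.00722 * general_health_mm        # visual analog scale, mm
    )


score = das4(ritchie_index=9, swollen_joints_44=8,
             esr_mm_per_hr=28, general_health_mm=45)
print(round(score, 2))
```

The square root and logarithm dampen the influence of very high tender joint scores and ESR values, so each component contributes on a comparable scale.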
The use of medication in the week preceding the visit (“current use of medication”) was registered at baseline and every 3 months along with information on whether participants had a paying job and how many hours per week they spent at this job. Outcome assessments were done at baseline and at 6, 12, 18, and 24 months. Disease activity and physical capacity were assessed every 3 months; radiographic damage of the large joints was assessed only at baseline and at 12 and 24 months.
All clinical outcome assessments were done by 4 research physical therapists who were trained thoroughly before the trial and after 1 year. A manual of procedures and assessment techniques was available in each center. A reproducibility study in 19 patients yielded intraclass correlation coefficients (ICCs) for aerobic fitness, muscle strength, swollen joint count, and RAI of 0.97, 0.98, 0.83, and 0.92, respectively. The ICC based on all readings by the readers of the radiographs of the large joints was 0.95; the mean ± SD difference in change scores after 2 years between the two readers was 0.030 ± 1.188.
Clinical outcome assessors were blinded to the treatment allocation and measures were taken throughout the trial to preserve blinding. The patients were instructed repeatedly not to discuss their treatment allocations with the assessor and were given tips on how to avoid unblinding. The rooms in which the assessments took place were located as far as possible from the training location. At the end of the last visit, the assessors were able to guess the treatment allocation correctly in 75% of participants.
There are no data on clinically relevant changes in the MACTAR Questionnaire score. The target sample size was based on the ability to detect a difference of 0.20 in the change in the HAQ score, which is assumed to be clinically relevant (33). Based on 0.9 power to detect a significant difference (2-sided P = 0.05) and assuming an SD of 0.5, we determined that 119 patients would be required for each study group. To compensate for an expected dropout rate of ∼20%, we planned to enroll at least 150 patients in each study group. As a threshold for relevant progression in radiographic damage of the large joints and a surrogate for clinically relevant increase in damage, we used the smallest detectable difference (SDD) of the change score calculated according to the method of Lassere et al (34). The analyses are based on intent to treat as initially assigned. All available data were used.
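The inputs of the sample size calculation can be explored with a textbook normal-approximation formula for comparing two means. This is a sketch only: the article does not state which method was used, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.90):
    """Patients per group to detect a mean difference `delta`.

    ASSUMPTION: two-sided test, equal group sizes, equal SDs, and the
    normal approximation n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def inflate_for_dropout(n, dropout_rate):
    """Enrollment needed so that about `n` patients remain after dropout."""
    return math.ceil(n / (1 - dropout_rate))


print(n_per_group(delta=0.20, sd=0.5))   # 132
print(inflate_for_dropout(119, 0.20))    # 149
```

With the stated inputs, this approximation gives 132 patients per group, somewhat more than the 119 reported, so the original calculation presumably relied on a different method or on assumptions not described here. Inflating 119 for a 20% dropout rate gives 149, in line with the planned enrollment of at least 150 per group.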
Measures with a Gaussian distribution are expressed as the mean ± SD, and measures with a non-Gaussian distribution are expressed as the median and interquartile range (IQR; expressed as the difference between the 75th and 25th percentiles). Differences between the groups at baseline were analyzed by Student's unpaired t-test, Mann-Whitney U test, or chi-square test, as appropriate. At each time point, changes from baseline were compared by analysis of variance (ANOVA) and are presented as the mean difference in change between the groups (95% confidence interval [95% CI]). All effect analyses were performed after correction for the baseline differences. To compare the effectiveness and safety over the total period of 2 years, repeated measures were analyzed with mixed-effects ANOVA models, with patient number as a random factor and treatment, time, and treatment × time interaction as fixed effects.
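One way to write out the mixed-effects model described above is given below. The notation is introduced here for illustration and is not the authors': time is treated as a categorical factor and the patient effect as a random intercept, both of which are assumptions, and the baseline correction terms are omitted for brevity.

```latex
% y_{ij}: outcome of patient i at assessment j
% G_i   : treatment indicator (1 = RAPIT, 0 = UC)
% k     : index over the follow-up assessments
\begin{equation*}
y_{ij} = \beta_0
       + \beta_1 G_i
       + \sum_{k} \gamma_k \,\mathbf{1}[j = k]
       + \sum_{k} \delta_k \, G_i \,\mathbf{1}[j = k]
       + b_i + \varepsilon_{ij},
\qquad
b_i \sim N(0, \sigma_b^2), \quad
\varepsilon_{ij} \sim N(0, \sigma^2)
\end{equation*}
```

Here the coefficients $\delta_k$ carry the treatment × time interaction, that is, how the difference between the groups changes over the assessments, and the random intercept $b_i$ accounts for the correlation among repeated measurements in the same patient.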