Dr. Simpson owns stock in sanofi-aventis. Ms. Dorrier has received consulting fees (less than $10,000) from Aventis.
Reduction of the efficacy of methotrexate by the use of folic acid: Post hoc analysis from two randomized controlled studies
Article first published online: 30 SEP 2005
Copyright © 2005 by the American College of Rheumatology
Arthritis & Rheumatism
Volume 52, Issue 10, pages 3030–3038, October 2005
How to Cite
Khanna, D., Park, G. S., Paulus, H. E., Simpson, K. M., Elashoff, D., Cohen, S. B., Emery, P., Dorrier, C. and Furst, D. E. (2005), Reduction of the efficacy of methotrexate by the use of folic acid: Post hoc analysis from two randomized controlled studies. Arthritis & Rheumatism, 52: 3030–3038. doi: 10.1002/art.21295
- Issue published online: 30 SEP 2005
- Manuscript Accepted: 23 JUN 2005
- Manuscript Received: 3 DEC 2004
- Aventis Pharmaceuticals, a member of the sanofi-aventis group
- Arthritis Foundation
- Scleroderma Foundation
To examine the effect of folic acid on the efficacy of methotrexate (MTX) treatment in rheumatoid arthritis (RA) at 12 months in 2 phase III randomized controlled trials (RCTs) of leflunomide in which MTX was used as a comparator.
Analyses were restricted to patients randomized to receive MTX who had rheumatoid factor data. The US study recruited 482 patients with active RA; 179 received at least 1 dose of MTX, and all were mandated to receive 1 mg of oral folic acid once or twice daily. The multinational European study recruited 999 patients with active RA; 489 received at least 1 dose of MTX, and oral folic acid was not required, although 50 received folate after developing an adverse event. Because of similar entry criteria for both studies, the data for patients with available primary outcome data at week 52 were pooled (n = 668), and the patients were grouped by folic acid use (n = 225) and nonuse (n = 443). To account for the significant between-study differences in the MTX groups, baseline covariates were adjusted using propensity scores so that folic acid users could be matched with nonusers. This allowed for a comparison of differences in American College of Rheumatology (ACR) 20% improvement criteria at week 52.
At study entry, non–folic acid users had a significantly lower mean body weight, shorter mean RA duration, and higher mean disease activity (measured by joint counts, patient's and physician's global assessments, and acute-phase reactant levels). The mean MTX dosage at week 52 was similar in the 2 RCTs. Using propensity score matching techniques, the proportion of patients achieving an ACR 20% response at week 52 averaged 17% higher in the non–folic acid group than in the folic acid group (range 15–21%). Similarly, the proportion of patients achieving ACR 50% and ACR 70% responses averaged 14% (range 12–16%) and 12% (range 9–14%) higher, respectively, in the non–folic acid group. Adverse events were reported in 93% of US study patients and 94% of the multinational study patients. Elevated liver transaminase levels (above the upper limit of normal) were reported in 29% of the US study patients (majority receiving folic acid) and 62% of the multinational study patients (majority not receiving folic acid).
After using propensity scores to adjust for differences in the baseline characteristics of folic acid users and non–folic acid users, 9–21% fewer MTX-treated RA patients taking folic acid had ACR 20%, 50%, or 70% improvement at 52 weeks compared with those who did not receive folic acid in the 2 phase III RA clinical trials. As a post hoc analysis, the results of this data analysis should be considered “hypothesis generating” and an impetus for future studies regarding the effects of folic acid on the efficacy of MTX in RA.
Rheumatoid arthritis (RA) is a chronic systemic inflammatory disorder of unknown etiology that primarily involves the joints. It may be remitting, but if uncontrolled, it may lead to deformity and destruction of joints as a result of cartilage and bone erosion. This symmetric disease often results in significant functional disability.
Methotrexate (MTX) has been advocated by the American College of Rheumatology (ACR) as the standard first-line disease-modifying antirheumatic drug (DMARD) treatment for severe RA (1). The use of MTX can be associated with side effects, and the concomitant use of daily folic acid reduces these side effects, especially the incidence of gastrointestinal intolerance and liver enzyme elevations (2–5). Controversy exists as to whether the use of concomitant folic acid also reduces the efficacy of MTX (2, 4, 5). Daily folic acid treatment is routinely used in the US to help prevent the side effects of MTX, whereas in Europe, it is prescribed only to ameliorate an adverse effect (6, 7). For example, in 2 published phase III randomized controlled trials (RCTs) of leflunomide in which MTX was used as the comparator, one a US study (8) and the other a multinational European study (6), daily folic acid was mandated in the US study protocol (98% received daily oral folic acid), but was used only on an “as needed” basis in the multinational study (10% of MTX patients received folic acid).
PATIENTS AND METHODS
Subjects who participated in 2 randomized, double-blind, parallel, controlled phase III trials of leflunomide versus MTX in RA comprised the patient population. Both studies have been published in detail elsewhere (6, 8).
The US study (8) compared daily leflunomide (20 mg/day after a loading dose of 100 mg/day for 3 days) with placebo and weekly MTX (7.5–15 mg/week) at the primary end point of 52 weeks. The study recruited 482 patients with active RA, 182 of whom received at least 1 dose of MTX. The 179 patients who had rheumatoid factor (RF) data available form the group of interest for the present analysis. All patients were mandated to receive 1 mg of oral folic acid once or twice daily.
The multinational European trial (6) compared daily leflunomide (20 mg/day after a loading dose of 100 mg/day for 3 days) with weekly MTX (7.5–15 mg/week) at the primary end point of 52 weeks; there was no placebo group. The study recruited 999 patients with active RA, 498 of whom received at least 1 dose of MTX. The 489 patients who had RF data available form the group of interest for the present analysis. In this study, oral folic acid was given, generally at a dosage of 1–2 mg/day, only if patients developed drug-related side effects.
In both studies, all analyses were performed using the last observation carried forward (LOCF) method for missing data. Since the entry criteria for both studies were similar, the MTX patient data were pooled for the present analysis.
In the MTX arm of the US study, 52% of patients achieved an ACR 20% improvement response, as compared with 65% of patients in the multinational trial (6, 7). Based on the comparability of MTX use and the disparity in folic acid use between the 2 studies, it has been speculated that folic acid use may reduce the efficacy of MTX in RA (6, 9).
However, there are significant differences in the baseline characteristics of the patients in the MTX arms of the 2 studies and in the pooled data between those receiving folic acid and those not receiving folic acid (e.g., a longer mean disease duration, lower mean joint count, lower mean levels of acute-phase reactants in the US study), making the comparison between the ACR 20% responses at week 52 (primary end point of the RCTs) difficult to interpret (7). To account for the differences in the baseline covariates, we used propensity score adjustment (10, 11). Traditionally, the propensity score is the probability of being assigned a treatment, given the observed baseline characteristics (10, 11). The propensity score matching method is then used for generalized matching based on the baseline characteristics of the patients. In the present analysis, we used this matching method to examine the effect of folic acid, a surrogate for study locale, on the ACR response at the end of the study, after adjusting for the observed baseline differences.
Methods other than propensity scores may be used to adjust for baseline variations between 2 groups, such as linear regression including the baseline variables and matching on baseline factors. However, linear regression with multiple baseline values requires important model assumptions, in particular, that the independent and dependent variables are normally distributed. Also, matching multiple variables in specific patients results in very few exact matches. We therefore selected the propensity score approach for our analysis.
Patients from both studies who had primary outcome data at week 52 were grouped by folic acid use for the present study; 225 of the patients had received folic acid and 443 patients had not. Of the 680 patients in the 2 studies who had received at least 1 dose of MTX, the data analysis was limited to the 668 patients who had RF data. Propensity scores were calculated using the probit model to produce a normal distribution of propensity scores (Stata software, release 8.2; Stata, College Station, TX). Complete data for calculating propensity scores were available for 544 patients at the end of 52 weeks; radiographic data were available for only 180 patients in the folic acid group and 432 in the non–folic acid group. Small amounts of other data (e.g., Health Assessment Questionnaire [HAQ] disability index scores, swollen joint counts) were missing.
In the probit model analysis, a binary-response regression model closely related to logistic regression, the use of folic acid was the dependent or outcome variable and the following were used as independent variables: age at baseline visit, disease duration, body mass index (BMI), number of DMARDs used in the past, number of swollen joints, number of tender joints, C-reactive protein level, pain assessed using a 0–10-cm visual analog scale (VAS), patient's global assessment, physician's global assessment, baseline total Sharp score for radiographic damage (12), functional disability (using the modified HAQ for the US study and the HAQ for the European study), rheumatoid factor status, and use of nonsteroidal antiinflammatory drugs (NSAIDs) and corticosteroids.
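To make the role of the probit model concrete, the sketch below shows how a fitted probit model turns a patient's baseline covariates into a propensity score between 0 and 1. The function name, the intercept, and the coefficients are hypothetical and chosen only for illustration; they are not the values estimated in this analysis, and only three of the covariates listed above are used.

```python
from statistics import NormalDist

def probit_propensity(intercept, coefs, covariates):
    """Propensity score = Phi(intercept + sum(coef * covariate)).

    Phi is the standard normal cumulative distribution function,
    which is what distinguishes a probit from a logistic model.
    """
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return NormalDist().cdf(linear)

# Hypothetical coefficients for BMI, swollen joint count, and RF positivity (1 = yes).
coefs = [0.05, -0.04, -0.30]
intercept = -1.0

heavier_less_active = probit_propensity(intercept, coefs, [30.0, 12, 0])
lighter_more_active = probit_propensity(intercept, coefs, [25.0, 18, 1])
print(round(heavier_less_active, 2), round(lighter_more_active, 2))
```

With these made-up coefficients, a heavier patient with fewer swollen joints receives a higher estimated probability of folic acid use, which mirrors the direction of the baseline differences between the two studies.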
Next, the fit of the probit model used to estimate the propensity scores was assessed. Calibration was examined with a goodness-of-fit test. Discrimination, that is, the ability of the model to separate folic acid users from nonusers, was assessed with the area under the receiver operating characteristic (ROC) curve. Values of 0.7 and above are considered to indicate acceptable discrimination.
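As an illustration of the discrimination measure, the area under the ROC curve equals the probability that a randomly chosen folic acid user has a higher propensity score than a randomly chosen nonuser, with ties counted as one half. The minimal sketch below computes it by comparing all pairs; the function name and the scores are invented for the example.

```python
def roc_auc(user_scores, nonuser_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for u in user_scores:
        for n in nonuser_scores:
            if u > n:
                wins += 1.0
            elif u == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(user_scores) * len(nonuser_scores))

# Hypothetical propensity scores for 4 users and 5 nonusers.
users = [0.80, 0.60, 0.55, 0.30]
nonusers = [0.70, 0.40, 0.35, 0.20, 0.10]
print(roc_auc(users, nonusers))  # 0.75
```

A value of 0.5 would mean the model separates the groups no better than chance, and 1.0 would mean perfect separation.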
The third step in the propensity score matching process was to match subjects based on the computed scores. Since propensity scores range from 0 to 1 in a continuum, there are an infinite number of possible scores. Because of this infinite number of possible scores and the limited number of subjects, it is impossible to get exact matches for every patient treated. To match as closely as possible, one can use various matching techniques, including stratification matching, radius matching, nearest neighbor matching, and kernel matching (13). Since no method was chosen a priori, all matching results are reported in this analysis. Stratification matching involves partitioning ranges of the propensity score into blocks, such that treated and untreated subjects have the same average propensity score in the same block. Radius matching uses a caliper, which defines a specific range (e.g., ±0.1) in which to match the propensity scores of treated and untreated subjects. The nearest neighbor matching method selects, for each treated subject, the untreated subject with the closest propensity score, which relaxes the predefined caliper region of the radius matching technique. Kernel matching matches each treated subject with a weighted average of all untreated subjects. The weights are inversely proportional to the distance between the propensity scores of treated and untreated subjects, such that closer matches have more weight than those that are farther away (13).
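The four matching techniques can be sketched as follows for a single treated subject. This is a simplified illustration and not the Stata routines of Becker and Ichino (13): the function names, the Gaussian kernel, the bandwidth, and the equal-width blocks used for stratification are all assumptions made for the example.

```python
import math

def nearest_neighbor(score, control_scores):
    """Index of the untreated subject whose score is closest to `score`."""
    return min(range(len(control_scores)),
               key=lambda i: abs(control_scores[i] - score))

def radius_matches(score, control_scores, caliper=0.1):
    """Indices of all untreated subjects within +/- caliper of `score`."""
    return [i for i, c in enumerate(control_scores) if abs(c - score) <= caliper]

def kernel_weights(score, control_scores, bandwidth=0.06):
    """Normalized weights for all untreated subjects; closer scores weigh more.

    A Gaussian kernel is assumed here purely for illustration.
    """
    raw = [math.exp(-0.5 * ((c - score) / bandwidth) ** 2) for c in control_scores]
    total = sum(raw)
    return [w / total for w in raw]

def stratum(score, n_blocks=5):
    """Block index for stratification, assuming equal-width blocks on [0, 1]."""
    return min(int(score * n_blocks), n_blocks - 1)

controls = [0.10, 0.28, 0.35, 0.70]   # hypothetical untreated scores
treated_score = 0.30                   # hypothetical treated score

print(nearest_neighbor(treated_score, controls))   # 1
print(radius_matches(treated_score, controls))     # [1, 2]
weights = kernel_weights(treated_score, controls)
print([round(w, 3) for w in weights])
print(stratum(treated_score))                      # 1
```

Nearest neighbor matching always returns exactly one match, radius matching may return several or none, and kernel matching uses every untreated subject but gives distant ones a weight close to zero.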
The propensity score matching methods proposed by Becker and Ichino (13) include the option to estimate the average treatment effect. Becker and Ichino called this the “average effect of treatment on the treated,” but we refer to this as the treatment effect for ease of understanding. The treatment effect is the average expected difference in outcome for folic acid users compared with nonusers, and represents the effect of a subject being treated versus not being treated. If not being treated results in a value of 0, then the treatment effect is the absolute change from 0, according to the model. For example, if a treatment effect is shown to be −0.20, then this means that, on average, a person taking folate would be 20% less likely to achieve an ACR response than a comparable person not taking folate. This is obtained by using the propensity scores to match a folic acid user to a nonuser or a weighted combination of nonusers, comparing their outcomes and averaging this value over all treated subjects.
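The treatment effect described above can be illustrated with a minimal nearest neighbor sketch: each folic acid user is paired with the nonuser whose propensity score is closest, the difference in outcome (1 = ACR response, 0 = no response) is taken for each pair, and the differences are averaged over all users. The data and the function name are hypothetical, and matching is done with replacement as a simplifying assumption.

```python
def att_nearest_neighbor(treated, controls):
    """Average effect of treatment on the treated.

    `treated` and `controls` are lists of (propensity_score, outcome)
    pairs, where outcome is 1 for a responder and 0 for a nonresponder.
    """
    differences = []
    for score, outcome in treated:
        # Closest untreated subject by propensity score (with replacement).
        _, matched_outcome = min(controls, key=lambda c: abs(c[0] - score))
        differences.append(outcome - matched_outcome)
    return sum(differences) / len(differences)

# Hypothetical folic acid users and nonusers.
users = [(0.30, 0), (0.55, 1), (0.80, 0), (0.42, 1), (0.65, 0)]
nonusers = [(0.28, 1), (0.50, 1), (0.78, 0), (0.40, 0), (0.61, 1), (0.10, 1)]

print(att_nearest_neighbor(users, nonusers))  # -0.2
```

In this toy data set the estimate is −0.20, the same magnitude as the example in the text: matched users respond 20 percentage points less often than comparable nonusers.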
Based on an ACR Annual Meeting abstract presented by the authors from the 2 leflunomide RCTs (14), the ACR response to MTX at 52 weeks was significantly different in patients with and without RF. To examine the effect of having seropositive RF, we also conducted a stratified analysis using propensity scores. We removed RF from the propensity model and, instead, compared propensity scores between RF-positive and RF-negative groups.
The tables in which treatment effects are reported show the 95% empirical confidence intervals (95% CIs), which were bias-corrected, derived from bootstrapping the sample with replacement (1,000 repetitions). The bootstrap method involves randomly sampling with replacement from the analyzed population to get a same-sized population, although some entries may be in the sample several times, while others are never included. A treatment effect is estimated for each bootstrap sample. After this is repeated 1,000 times to ensure randomness, the average treatment effect is estimated, along with its 95% CI.
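The resampling procedure can be sketched in a few lines. The example below is a simplification under stated assumptions: it bootstraps the unadjusted difference in ACR 20% response rates (109 of 225 folic acid users versus 291 of 443 nonusers) rather than a matched estimate, and it reports a plain percentile interval rather than the bias-corrected interval used in the tables. Function names and the random seed are illustrative.

```python
import random

def risk_difference(records):
    """Response rate in folic acid users minus the rate in nonusers."""
    users = [y for g, y in records if g == 1]
    nonusers = [y for g, y in records if g == 0]
    return sum(users) / len(users) - sum(nonusers) / len(nonusers)

def bootstrap_ci(records, stat, reps=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample records with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        stat(rng.choices(records, k=len(records))) for _ in range(reps)
    )
    low = estimates[int(reps * alpha / 2)]
    high = estimates[int(reps * (1 - alpha / 2)) - 1]
    return low, high

# One (group, outcome) record per patient; group 1 = folic acid user.
records = ([(1, 1)] * 109 + [(1, 0)] * 116 +
           [(0, 1)] * 291 + [(0, 0)] * 152)

point = risk_difference(records)
low, high = bootstrap_ci(records, risk_difference)
print(f"{point:.3f} ({low:.3f}, {high:.3f})")
```

Each bootstrap sample has the same size as the original data set, so some patients appear several times and others not at all, as described above.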
An intent-to-treat (prophylactic versus therapeutic use of folic acid) analysis of adverse events by laboratory and symptom characteristics was performed, comparing the US and multinational study groups. Potential statistical differences in toxicity, comparing patients using folic acid prophylactically and those not using this supplement initially, were analyzed using the chi-square test and Fisher's exact test. For all of the analyses, P values less than 0.05 were considered statistically significant.
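For readers who want to see what these two tests compute for a 2 × 2 table, a minimal standard-library sketch follows. It is not the software used for the analysis. The example counts are back-calculated from percentages reported in Table 4 for the multinational study (alanine aminotransferase above the upper limit of normal in 50.00% of 50 patients taking folic acid and 62.87% of 439 not taking it) and should be read as an assumption.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, P value) using 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail is erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P value.

    Sums the probabilities of all tables with the same margins that
    are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(k):  # probability of k events in the first row
        return math.comb(row1, k) * math.comb(row2, col1 - k) / denom

    observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in map(prob, range(k_min, k_max + 1))
               if p <= observed * (1 + 1e-9))

# Rows: taking / not taking folic acid; columns: elevated / not elevated.
stat, p_chi = chi_square_2x2(25, 25, 276, 163)
p_fisher = fisher_exact_2x2(25, 25, 276, 163)
print(f"chi-square = {stat:.2f}, P = {p_chi:.3f}; Fisher P = {p_fisher:.3f}")
```

With these counts the chi-square P value comes out near 0.08, above the 0.05 threshold, which is in line with the value shown in Table 4 for that comparison.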
RESULTS
For the purpose of this analysis, MTX-treated patients from the US and multinational studies were classified solely according to their use of folic acid. Thus, all of the patients from the multinational study who were taking folic acid (n = 50) were combined with those from the US study (n = 175) to yield a total of 225 patients in the folic acid group. All of the patients in the multinational study who were not taking folic acid (n = 439) were combined with those from the US study (n = 4) to yield a total of 443 patients in the non–folic acid group.
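The regrouping can be checked with simple arithmetic. The short sketch below uses only the counts given in the text; the dictionary layout and the helper name are illustrative.

```python
# Patients by (study, taking folic acid), as reported in the text.
counts = {
    ("US", True): 175,
    ("US", False): 4,
    ("Multinational", True): 50,
    ("Multinational", False): 439,
}

def total(study=None, folic_acid=None):
    """Sum the counts, optionally restricted to a study and/or folic acid use."""
    return sum(
        n for (s, f), n in counts.items()
        if (study is None or s == study)
        and (folic_acid is None or f == folic_acid)
    )

print(total(folic_acid=True), total(folic_acid=False), total())  # 225 443 668

for study in ("US", "Multinational"):
    share = 100 * total(study, True) / total(study)
    print(f"{study}: {total(study)} patients, {share:.0f}% taking folic acid")
```

The shares reproduce the figures quoted earlier: about 98% of the 179 US patients and about 10% of the 489 multinational patients took folic acid.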
Table 1 compares the baseline characteristics of the MTX-treated patients who entered the two 52-week studies as well as those who received and did not receive folic acid. Patients who did not receive folic acid had a significantly lower mean body weight, shorter mean duration of RA, and higher mean disease activity (as measured by joint counts, patient's and physician's global assessments, and levels of acute-phase reactants), as well as higher scores on the HAQ disability index and the Disease Activity Score in 28 joints (DAS28), and a greater incidence of RF positivity compared with the patients who received folic acid (P < 0.05) (Table 1).
|Characteristic||US study (n = 179)||Multinational study (n = 489)||P, US vs. multinational||Folic acid: yes (n = 225)||Folic acid: no (n = 443)||P, folic acid yes vs. no|
|Age, mean ± SD years||53.2 ± 11.8||57.7 ± 10.8||<0.0001||54.6 ± 11.81||57.5 ± 10.9||0.002|
|Height, mean ± SD cm||166.5 ± 10.4||166.0 ± 9.1||0.57||166.5 ± 10.5||166.0 ± 8.9||0.54|
|Weight, mean ± SD kg||81.2 ± 19.2||71.5 ± 13.3||<0.0001||78.9 ± 18.7||71.7 ± 13.4||<0.0001|
|Body mass index, mean ± SD||29.4 ± 6.7||26.0 ± 4.6||<0.0001||28.6 ± 6.5||26.1 ± 4.6||<0.0001|
|Disease duration, mean ± SD months||78.2 ± 97.7||44.8 ± 41.9||<0.0001||72.5 ± 89.2||44.3 ± 42.6||<0.0001|
|No. of previous DMARDs, mean ± SD||0.9 ± 1.0||1.1 ± 1.2||0.0022||1.0 ± 1.0||1.1 ± 1.2||0.05|
|MTX dosage at last visit, mean ± SD mg/week||11.9 ± 3.7||11.9 ± 2.9||0.93||11.8 ± 3.5||12.0 ± 2.9||0.46|
|Corticosteroid use, %||54||65||0.0012||59||64||0.21|
|NSAID use, %||73||89||<0.0001||76||89||<0.0001|
|Tender joint count, mean ± SD||15.8 ± 6.9||17.7 ± 6.7||<0.0001||15.6 ± 6.6||18.0 ± 6.8||<0.0001|
|Swollen joint count, mean ± SD||13.0 ± 5.7||16.5 ± 5.9||<0.0001||13.5 ± 5.8||16.6 ± 5.9||<0.0001|
|Pain score on 10-cm VAS, mean ± SD||58.0 ± 22.3||58.7 ± 20.1||0.67||58.1 ± 21.5||58.8 ± 20.2||0.67|
|Patient's global assessment, mean ± SD||53.9 ± 23.0||65.5 ± 16.6||<0.0001||56.4 ± 22.1||65.5 ± 16.8||<0.0001|
|Physician's global assessment, mean ± SD||58.9 ± 16.8||64.4 ± 15.1||<0.0001||60.1 ± 16.5||64.3 ± 15.4||0.001|
|ESR, mean ± SD mm/hour||33.8 ± 25.3||51.6 ± 22.8||<0.0001||38.6 ± 26.4||51.2 ± 22.7||<0.0001|
|CRP, mean ± SD mg/liter||1.9 ± 1.9||4.1 ± 3.7||<0.0001||2.5 ± 2.8||4.0 ± 3.7||<0.0001|
|RF positive, %||60||77||<0.0001||64||77||0.0003|
|HAQ DI score, mean ± SD†||0.78 ± 0.50||1.06 ± 0.60||<0.0001||0.83 ± 0.50||1.07 ± 0.61||<0.0001|
|DAS28 score, mean ± SD||6.15 ± 1.09||7.04 ± 0.80||<0.0001||6.32 ± 1.06||7.05 ± 0.81||<0.0001|
|Erosion score, mean ± SD‡||8.1 ± 18.4||9.8 ± 18.1||0.34||9.0 ± 17.9||9.5 ± 18.3||0.77|
|Total Sharp score, mean ± SD‡||22.8 ± 39.0||24.9 ± 34.2||0.55||24.8 ± 37.4||24.2 ± 34.5||0.85|
|Sharp score >0, %‡||78||87||0.0087||82||86||0.19|
|Alcohol use, %||NA§||63||–||67||62||0.44|
|Cigarette smoking, %|
For the model used to estimate the propensity scores, the goodness-of-fit test showed a P value of 0.114, indicating that the model fit the data well. In addition, we noted that the area under the ROC curve was 0.78, closely approaching 0.8, the level of excellent discrimination. These findings indicate that the model performed well by standards of calibration and discrimination. We proceeded to use this model to assess the effect of folic acid use on ACR improvement responses.
If there was no adjustment for baseline differences, a statistically significantly higher (P < 0.0001) percentage of MTX-treated patients who were not taking folic acid achieved an ACR 20% response (291 of 443 [65.7%]) compared with the MTX-treated patients who were taking folic acid (109 of 225 [48.4%]) (Table 2).
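The unadjusted comparison can be reproduced from the counts above. The sketch below computes the difference in response proportions with a standard error and a normal-approximation (Wald) 95% interval. The Wald interval is an assumption made for illustration, since the intervals in the tables were obtained by bootstrapping; the function name is also illustrative.

```python
import math

def risk_difference(resp_users, n_users, resp_nonusers, n_nonusers, z=1.959964):
    """Difference in response proportions (users minus nonusers).

    Returns (difference, standard error, (lower, upper)) using an
    unpooled standard error and a normal-approximation interval.
    """
    p_u = resp_users / n_users
    p_n = resp_nonusers / n_nonusers
    diff = p_u - p_n
    se = math.sqrt(p_u * (1 - p_u) / n_users + p_n * (1 - p_n) / n_nonusers)
    return diff, se, (diff - z * se, diff + z * se)

diff, se, (low, high) = risk_difference(109, 225, 291, 443)
print(f"{diff:.2f} {se:.2f} ({low:.2f}, {high:.2f})")
# prints: -0.17 0.04 (-0.25, -0.09)
```

These values agree with the last observation carried forward row for ACR 20% improvement in Table 3, which suggests that the bootstrap and the normal approximation give very similar answers for this unadjusted comparison.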
|Outcome||% of patients taking folic acid (n = 177)||% of patients not taking folic acid (n = 367)||Treatment effect||Standard error||95% CI for treatment effect|
|With propensity score adjustment and modeling†|
|Pr(ACR 20% improvement)||57.3||61.5||−0.16‡||0.05||−0.26, −0.07|
|Pr(ACR 50% improvement)||35.9||39.4||−0.13‡||0.04||−0.22, −0.04|
|Pr(ACR 70% improvement)||12.8||16.4||−0.09‡||0.03||−0.15, −0.03|
|Without propensity score adjustment or modeling|
|ACR 20% improvement||48.4||65.7||−0.17§||0.05||−0.26, −0.08|
|ACR 50% improvement||27.1||43.6||−0.17§||0.04||−0.25, −0.09|
|ACR 70% improvement||8.0||17.8||−0.10§||0.03||−0.16, −0.04|
The mean ± SD propensity score for the 544 patients with complete model data (see Patients and Methods) was 0.33 ± 0.22, with a range of scores from 0.05 to 0.99. The distribution of propensity scores in the 327 ACR 20% responders (0.31 ± 0.21), was statistically significantly different from that in the 217 nonresponders (0.35 ± 0.24) (P = 0.04). This indicates that even after adjusting for the many baseline differences between MTX-treated patients who took folic acid and those who did not, the ACR 20% response rates in the 2 groups were statistically significantly different.
The treatment response outcome was calculated with the LOCF method at week 52 (primary end point of the original trials) using the ACR improvement criteria. The estimated average treatment effect of folic acid on the ACR 20% response to MTX ranged from −0.15 to −0.21 (depending on the matching method used, i.e., stratification, radius, nearest neighbor, kernel), which means that folic acid use reduced the probability of an ACR 20% response by 15–21% (Figure 1).
Table 2 provides the ACR responses and treatment effects, both unadjusted and adjusted for baseline differences using the radius matching method of propensity scoring. For the 544 patients with data for propensity scoring, the treatment effects were statistically significant with or without adjustment at the ACR 20%, 50%, and 70% response levels. For example, using the radius matching method, the adjusted treatment effects were −0.16, −0.13, and −0.09, respectively, meaning that folic acid use reduced the probability of ACR 20%, 50%, and 70% responses by an average of 16%, 13%, and 9%, respectively (P < 0.05). Taking these same patients, without adjusting for baseline differences, their average folic acid treatment effects were −0.17, −0.17, and −0.10, respectively (P < 0.05). With this adjustment, the differences between the 2 folic acid use groups for patients achieving ACR responses are slightly smaller than without adjustment. However, the overlapping 95% CIs with the adjusted and unadjusted treatment effects at each ACR improvement level show that the propensity score adjustment results support what would have been found had we not adjusted for the baseline covariates.
The average treatment effects of folic acid on achieving an ACR 20% response with MTX were further stratified by the presence (n = 399) and the absence (n = 145) of RF. Of 399 RF-positive patients, 281 were not taking folic acid and 118 were taking folic acid. In the 145 RF-negative patients, 86 were not taking folic acid and 59 were taking folic acid. The estimated mean folic acid treatment effect on achieving an ACR 20% response to MTX for the RF-negative group ranged from −0.28 (for radius matching) to −0.44 (nearest neighbor matching), which means that the probability of an ACR 20% response with MTX was 28–44% lower among RF-negative RA patients who were taking folic acid. Similarly, the estimated treatment effect for ACR 20% improvement response (associated with folic acid use) in the RF-positive group ranged from −0.07 (kernel matching) to −0.14 (nearest neighbor matching). This suggests that the use of folic acid with MTX was associated with a much lower ACR 20% response rate in the RF-negative group than in the RF-positive group. However, these folic acid treatment effects were not statistically significant because of the smaller sample sizes available for analysis after stratification by RF status.
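As a consistency check on the stratified sample sizes, the counts in this paragraph can be summed and compared with the group sizes used for the propensity score analysis in Table 2 (177 folic acid users and 367 nonusers among 544 patients). The sketch below uses only numbers from the text; the variable names are illustrative.

```python
# Patients with complete propensity score data, by RF status and folic acid use.
strata = {
    "RF positive": {"folic acid": 118, "no folic acid": 281},
    "RF negative": {"folic acid": 59, "no folic acid": 86},
}

stratum_totals = {rf: sum(groups.values()) for rf, groups in strata.items()}
users = sum(groups["folic acid"] for groups in strata.values())
nonusers = sum(groups["no folic acid"] for groups in strata.values())

print(stratum_totals)                      # {'RF positive': 399, 'RF negative': 145}
print(users, nonusers, users + nonusers)   # 177 367 544

for rf, groups in strata.items():
    share = 100 * groups["folic acid"] / stratum_totals[rf]
    print(f"{rf}: {share:.1f}% taking folic acid")
```

The RF-negative stratum contains only 59 folic acid users, which helps explain why the stratified estimates have wide uncertainty.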
The data were reanalyzed using the nonresponder imputation method, in which those who were study noncompleters for any cause at week 52 were defined as treatment failures (ACR improvement nonresponders). In this analysis, the proportions of subjects achieving ACR responses were somewhat smaller, but the effects of folic acid treatment on the responses to MTX were similar for the LOCF and nonresponder imputation analyses. These data are presented in Table 3.
|Outcome||% of patients taking folic acid (n = 225)||% of patients not taking folic acid (n = 443)||Treatment effect||Standard error||95% CI for treatment effect|
|Last observation carried forward analysis|
|ACR 20% improvement||48.4||65.7||−0.17||0.04||−0.25, −0.09|
|ACR 50% improvement||27.1||43.6||−0.17||0.04||−0.25, −0.09|
|ACR 70% improvement||8.0||17.8||−0.10||0.02||−0.14, −0.06|
|Nonresponder imputation analysis|
|ACR 20% improvement||37.8||57.6||−0.20||0.04||−0.28, −0.09|
|ACR 50% improvement||22.2||40.0||−0.18||0.04||−0.26, −0.10|
|ACR 70% improvement||6.2||17.2||−0.11||0.02||−0.16, −0.06|
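The two ways of handling missing week-52 data that are compared in Table 3 can be sketched as follows. This is a simplified illustration with invented patients: response is treated as a single yes/no value per visit, whereas the ACR criteria are built from several components, and the function names are hypothetical.

```python
def locf(visits):
    """Last observation carried forward: last non-missing value (None = missing)."""
    observed = [v for v in visits if v is not None]
    return observed[-1] if observed else None

def nonresponder_imputation(visits):
    """Count anyone without a final-visit value as a nonresponder (0)."""
    return visits[-1] if visits[-1] is not None else 0

# Hypothetical response status (1 = responder) at three visits.
patients = {
    "A": [0, 1, 1],        # completed the study as a responder
    "B": [1, None, None],  # responded early, then withdrew
    "C": [0, 0, None],     # never responded, then withdrew
    "D": [0, 1, None],     # responded, then missed the final visit
}

locf_rate = sum(locf(v) for v in patients.values()) / len(patients)
nri_rate = sum(nonresponder_imputation(v) for v in patients.values()) / len(patients)
print(locf_rate, nri_rate)  # 0.75 0.25
```

Nonresponder imputation can only lower the response rate relative to last observation carried forward, which is the pattern seen in both folic acid groups in Table 3.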
Ninety-three percent of patients in the US study and 94% in the multinational study reported an adverse event. As expected, the patients receiving prophylactic therapy with folic acid (US study) had a lower incidence of liver function abnormalities (Table 4). Elevated levels (above the upper limit of normal) of alanine aminotransferase and aspartate aminotransferase were seen in 29.1% and 20.1%, respectively, of the patients in the US study (prophylactic folic acid) and in 62.87% and 47.21%, respectively, of the patients in the multinational study (folic acid given only after an adverse event was noted).
|Adverse event||US study*||Multinational study*||P, US vs. multinational||US study, taking folic acid†||Multinational study, taking folic acid†||Multinational study, not taking folic acid†||P, multinational taking vs. not taking folic acid||P, US versus multinational folic acid groups|
|No. of patients||166||461||–||163||49||412||–||–|
|Infection (no serious infections)||36.75||30.37||0.13||35.58||36.73||29.61||0.31||0.88|
|No. of patients||179||489||–||175||50||439||–||–|
|ALT above ULN, %||29.05||61.55||<0.0001||29.14||50.00||62.87||0.08||0.006|
|AST above ULN, %||20.11||46.63||<0.0001||20.00||38.00||47.21||0.20||0.009|
|Alk. phos. >1.2 times ULN, %||5.59||18.00||<0.0001||5.71||30.00||16.63||0.02||<0.0001|
|Creatinine above ULN, %||5.03||7.16||0.33||5.14||12.00||6.61||0.16||0.09|
|WBCs <3.5 × 10⁹/liter, %||6.70||6.75||0.98||6.86||6.00||6.83||1||1|
|Platelets <100,000 × 10⁹/liter, %||0||0.20||1||0||0||0.23||1||–|
Surprisingly, the US study group had a higher incidence of symptomatic side effects (76.1% versus 67.7%), including a higher incidence of diarrhea (21.08% versus 14.53%), headache (22.89% versus 11.28%; P < 0.05), and oral ulcers (10.8% versus 2.3%; P < 0.06), compared with the multinational study group (P < 0.05; Table 4). The dosage of MTX was similar among patients with and without oral ulcers (mean ± SD 0.17 ± 0.06 mg/kg versus 0.16 ± 0.04 mg/kg; P = 0.9). All multinational study patients who took folic acid had been prescribed the medication for presumed treatment-related side effects. Thus, the incidence of all of the adverse events except dyspepsia and headache was higher in the multinational group receiving folic acid than in the US group receiving folic acid.
DISCUSSION
There has been great interest as well as controversy regarding the role of folic acid and its impact on the efficacy of MTX therapy in RA. Much has been published regarding the difference in the ACR 20% responder rates in the MTX-treated groups in the 2 phase III RCTs of leflunomide in which MTX was used for comparison (6, 7, 9). In the MTX arm of the US study (98% of patients received daily folic acid), 52% of the patients achieved an ACR 20% response, as compared with 65% in the multinational study (10% of patients received folic acid and only after experiencing an adverse event). The mean ± SD dosage of MTX at the end of 52 weeks was not statistically significantly different between the studies: 11.9 ± 3.7 mg/week in the US study versus 11.9 ± 2.9 mg/week in the multinational study (P = 0.9). The potential effect of folic acid on the efficacy of MTX has been both supported (6, 9) and refuted (3, 7) in analyses of these studies.
Two randomized placebo-controlled trials comparing folic acid with placebo in RA patients being treated with MTX have been conducted (2, 4). The study by van Ede et al (2) used daily oral folic acid (1 mg) versus placebo in RA patients, with the primary objective of evaluating toxicity. All patients were started on 7.5 mg/week of MTX, and the dosage was increased by 2.5 mg every 6 weeks until a decrease of at least 1.08 in the DAS28 was achieved. In the efficacy evaluation at the end of the 48-week trial, the mean ± SD dosage of MTX in the placebo arm (n = 137) was 14.5 ± 5.8 mg/week and the mean ± SD change in DAS28 achieved was 1.54 ± 1.03. In contrast, the folic acid group (n = 133) was taking a statistically significantly higher dosage of MTX at 18.0 ± 5.5 mg/week and had a DAS28 change of 1.73 ± 1.06. Given the higher dosage of MTX in the folic acid group at the end of the study, the authors concluded “that folates decrease the efficacy of MTX to some extent and that as a result, higher dosages of MTX are needed for the same clinical response.” The study by Morgan et al (4) used weekly oral folic acid (5 mg) versus placebo in RA patients in whom MTX therapy was initiated in a 12-month double-blind RCT. No statistical difference in the efficacy outcomes, defined as an ordinal response of marked, moderate, or no improvement based on a composite score of joint swelling, tenderness, and pain indices, was shown. The mean ± SD MTX dosage was numerically, but not statistically, higher at the end of the study in the folic acid group (n = 23) compared with the placebo group (n = 19) (9.4 ± 2.4 versus 8.5 ± 1.5 mg/week).
The findings of our post hoc analysis using propensity scoring to adjust for baseline differences in a much larger group of patients suggest that the use of folic acid with MTX reduces the proportion of patients who achieve an ACR 20% improvement response by 15–21% at week 52, depending on the propensity matching method used. The results were consistent when comparing ACR 50% and ACR 70% improvement responses. Moreover, there was a statistical trend toward a much lower ACR 20% response rate in the RF-negative group taking folic acid (28–44% lower response rate) compared with the RF-positive group taking folic acid (7–14% lower response rate). This may suggest a preferential response to MTX in patients with active RA based on their RF status; however, we do not know the reason for this difference. Attribution of adverse events according to use or nonuse of folic acid is distorted in this analysis because the multinational study patients were not given folic acid until they experienced an adverse event. Although we believe it is acceptable to pool the US and multinational folic acid groups for the efficacy analysis, the most appropriate analysis of adverse events is by treatment strategy, that is, all patients treated prophylactically with folic acid to prevent adverse events versus no patients treated prophylactically (but therapeutic folic acid permitted after an adverse event occurs, as in the multinational study).
The incidence of adverse events noted on laboratory studies was higher for patients who did not receive prophylactic folic acid. There was a statistically higher incidence of clinically significant (defined as greater than the upper limit of normal) elevations in the levels of aspartate aminotransferase, alanine aminotransferase, and alkaline phosphatase. These observations are consistent with the published literature (2, 4). Surprisingly, the incidences of diarrhea and dyspepsia were higher in the group taking folic acid, even though NSAID use was higher in the group not taking folic acid. These observations are in contrast to some of the published literature. For example, a meta-analysis by Ortiz et al (5) found a 79% reduction in mucosal and gastrointestinal side effects with folic acid use compared with placebo use. Van Ede et al (2), however, did not find any statistical difference in the incidence of gastrointestinal side effects between the 2 groups, although gastrointestinal adverse events were numerically higher in the placebo arm. We expected fewer adverse events in the US versus the multinational patients taking folic acid because folic acid was given only after an adverse event occurred in the multinational study. Therefore, the comparison of the US versus the multinational study measures how prophylactic folic acid reduces the occurrence of adverse events.
The incidences of diarrhea, dyspepsia, and headache were significantly higher in the US study (Table 4), in which 98% of the patients (175 of 179) were taking folic acid. On average, the use of folic acid did not decrease the overall occurrence of symptomatic adverse events, but it substantially decreased the occurrence of abnormal liver function test results. The difference in the number of adverse events could possibly be related to different cultural tendencies to report side effects, rather than to a true difference. However, differential rates of reporting adverse events seem unlikely, since in the multinational study, 460 of 489 patients (94%) reported at least 1 adverse event, and in the US study, 166 of 179 patients (93%) reported at least 1 adverse event (P = 0.53).
There are limitations to our study. First, the effect of folic acid on the efficacy of MTX was not an a priori hypothesis of the 2 RCTs we analyzed. Moreover, as pointed out by Strand et al (7), comparison of the 2 studies is difficult because of differences in disease duration, higher use of NSAIDs in the multinational study, and lack of a placebo arm in the multinational study. Using propensity scoring, we were able to match patients according to disease duration, use of NSAIDs, and the other baseline measures noted by Strand et al, as well as additional factors, and make comparisons between patients with similar covariates. We could not, however, control for the lack of a placebo arm in the multinational study.
In conclusion, using propensity scores to adjust for baseline differences, we found that patients taking folic acid in the 2 phase III RA clinical trials had lower ACR responses at 52 weeks compared with patients who did not take folic acid and that prophylactic use of folic acid was associated with less frequent abnormal liver function test results. As with any post hoc analysis, the results of this data analysis should be considered “hypothesis generating” and be used as an impetus for future studies regarding the effect of folic acid on the efficacy of MTX in RA.
REFERENCES
- 2. van Ede et al. Effect of folic or folinic acid supplementation on the toxicity and efficacy of methotrexate in rheumatoid arthritis: a forty-eight–week, multicenter, randomized, double-blind, placebo-controlled study. Arthritis Rheum 2001; 44: 1515–24.
- 12. On behalf of the Leflunomide Rheumatoid Arthritis Investigators Group. Treatment with leflunomide slows radiographic progression of rheumatoid arthritis: results from three randomized controlled trials of leflunomide in patients with active rheumatoid arthritis. Arthritis Rheum 2000; 43: 495–505.
- 13. Becker and Ichino. Estimation of average treatment effects based on propensity scores. Stata J 2002; 2: 358–77.
- 14. Does folic acid (FA) decrease the efficacy as well as the toxicity of methotrexate (MTX) in rheumatoid arthritis (RA)? [abstract]. Arthritis Rheum 2001; 44 Suppl 9: S373.