To evaluate the effect of infliximab on progression of structural damage over 2 years in patients with ankylosing spondylitis (AS).
In the Ankylosing Spondylitis Study for the Evaluation of Recombinant Infliximab Therapy (ASSERT), a randomized, double-blind, placebo-controlled trial of the efficacy of infliximab compared with placebo, 279 patients with active AS received either placebo through week 24 and then infliximab 5 mg/kg from week 24 through week 96 (n = 78) or infliximab 5 mg/kg from baseline through week 96, administered every 6 weeks after a loading dose (n = 201; these patients were the focus of the radiographic analyses). Radiographic findings in patients from the ASSERT trial were compared with those in a historical control cohort of patients who had no prior use of anti–tumor necrosis factor agents (from the Outcome in Ankylosing Spondylitis International Study [OASIS] database; n = 192). Radiographic progression of structural damage from baseline to the 2-year followup was scored using the modified Stoke Ankylosing Spondylitis Spine Score (mSASSS). All images were scored in one batch.
Median changes in the mSASSS from baseline to year 2 were 0.0 for both the OASIS and the ASSERT cohorts (P = 0.541). Mean changes in the mSASSS were also similar between the OASIS and ASSERT cohorts (mean ± SD change over 2 years 1.0 ± 3.2 and 0.9 ± 2.6, respectively). In addition, results from sensitivity analyses did not show a statistically significant difference in the mSASSS between the OASIS and ASSERT cohorts.
AS patients who received infliximab from baseline through week 96 did not show a statistically significant difference in inhibition of structural damage progression at year 2, as assessed using the mSASSS scoring system, when compared with radiographic data from the historical control OASIS cohort. Improvements in clinical outcomes and spinal inflammation have been previously demonstrated with the use of infliximab therapy.
Ankylosing spondylitis (AS) is a chronic, progressive inflammatory disease characterized by inflammatory back pain, due to sacroiliitis, spondylitis, and enthesitis, that affects young men and women, commonly starting in the second and third decades of life (1). Traditional therapies for AS, including nonsteroidal antiinflammatory drugs (NSAIDs), disease-modifying antirheumatic drugs, and physical therapy, have limited efficacy (2, 3). In contrast, the anti–tumor necrosis factor (anti-TNF) agents infliximab (3, 4), etanercept (5, 6), and adalimumab (7) have shown strong clinical efficacy in short- and intermediate-term evaluations.
Recently, the effects of infliximab were evaluated in 279 patients with AS in the Ankylosing Spondylitis Study for the Evaluation of Recombinant Infliximab Therapy (ASSERT), a multicenter, randomized, double-blind, placebo-controlled trial of the efficacy of infliximab (Remicade) compared with placebo in patients with AS receiving standard antiinflammatory drug therapy. At week 24 of treatment with 5 mg/kg infliximab, administered once every 6 weeks, infliximab was found to be well tolerated and significantly more effective than placebo in improving the signs and symptoms of disease (4). In a substudy of 266 patients with AS, images of the spine obtained by magnetic resonance imaging (MRI) at baseline and at week 24 were evaluable, and patients who received infliximab showed a significant decrease in spinal inflammation as detected by MRI, compared with patients receiving placebo (8). This reduction was still present at the 2-year followup (9).
Whereas MRI evaluations are useful in assessing active spinal lesions, chronic spinal changes in AS are mainly assessed by conventional radiography (10). The modified Stoke Ankylosing Spondylitis Spine Score (mSASSS) has been identified as the most sensitive scoring method when evaluating chronic spinal changes on conventional radiographs (10, 11). In AS, spinal radiographic progression develops relatively slowly and may be detectable only after a minimum of 2 years in a considerable number of patients. Thus, it is ethically difficult to justify a radiographic study in AS that prospectively incorporates a control arm for 2 years, if patients in the active treatment arm already show major clinical improvement shortly after the TNF blocker has been initiated (12).
As a result, the radiographic data obtained from the ASSERT trial were compared with the radiographic data obtained in a prospective followup study, the Outcome in Ankylosing Spondylitis International Study (OASIS), with the provision that the radiographs were scored in one batch and in a blinded manner. Results of these comparisons of mSASSS findings are presented herein.
Criteria for inclusion and exclusion of patients in the ASSERT study have been described previously (4). In this double-blind, placebo-controlled clinical trial, patients were randomly assigned in a 3:8 ratio to 1 of 2 treatment groups. Patients in 1 group received placebo through week 24 and then infliximab 5 mg/kg from week 24 through week 96 (designated the placebo/infliximab group; n = 78), and patients in the other group received infliximab 5 mg/kg from baseline through week 96 (designated the infliximab 5 mg/kg group; n = 201). Patients in the latter group could have their dose increased to 7.5 mg/kg starting with the week 36 infusion, based on their disease activity (designated the infliximab 7.5 mg/kg group). Infusions were administered at weeks 0, 2, and 6 as the initial loading dose and then every 6 weeks through week 96. Radiographs of the lateral cervical and lumbar spine were obtained from patients at baseline and at year 2, or at the time of withdrawal from the ASSERT trial if a patient was withdrawn from treatment after the week 54 infusion. The 201 patients who were originally assigned to receive infliximab 5 mg/kg were the focus of these radiographic analyses.
The OASIS database is a prospectively defined and prospectively collected data set that contains information on ∼200 patients with AS, who received the best standard of care available at the time of their visits to multiple international centers. None of the OASIS patients were treated with anti-TNF therapy during the first 4 years of data collection. Information collected included, among other variables, radiographic images of the lateral cervical and lumbar spine, demographic data, and indices of pain and spinal mobility. Radiographic data obtained from OASIS patients at baseline and after 2 years of followup are included in the present analysis. All patients included in the OASIS cohort who had complete radiographs but did not have complete spinal ankylosis (n = 192) were the primary population used for comparison in the present study. In addition, a subset of patients from the OASIS database who would have satisfied the ASSERT eligibility criteria were identified; these 70 patients were termed the OASIS match population.
Lateral radiographic views of the cervical and lumbar spine were scored using the mSASSS scoring system (10). The total mSASSS (13) is the sum (range 0–72) of the numerical scores for the anterior site of the cervical spine from the lower border of C2 to the upper border of T1, and the anterior site of the lumbar spine from the lower border of T12 to the upper border of S1. Each corner of the vertebrae is scored as follows: 0 = normal; 1 = presence of erosions, sclerosis, or squaring; 2 = presence of syndesmophytes; and 3 = presence of bridging syndesmophytes.
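The scoring rule above can be sketched in a few lines of Python. This is an illustration only, not the study's scoring software: the function and constant names are hypothetical, and the sketch assumes a single reader's integer scores for 12 anterior vertebral corners per region (C2 lower border through T1 upper border, and T12 lower border through S1 upper border), which yields the stated range of 0–72.

```python
# Illustrative sketch; names are hypothetical and not taken from the study.
SITES_PER_REGION = 12      # 12 anterior corners each for cervical and lumbar spine
VALID_SCORES = {0, 1, 2, 3}  # 0 normal, 1 erosion/sclerosis/squaring,
                             # 2 syndesmophyte, 3 bridging syndesmophyte


def total_msasss(cervical, lumbar):
    """Return the total mSASSS (0-72) for one radiograph set from one reader."""
    for region_name, scores in (("cervical", cervical), ("lumbar", lumbar)):
        if len(scores) != SITES_PER_REGION:
            raise ValueError(f"{region_name}: expected {SITES_PER_REGION} sites")
        if any(score not in VALID_SCORES for score in scores):
            raise ValueError(f"{region_name}: each site must be scored 0, 1, 2, or 3")
    return sum(cervical) + sum(lumbar)


# Example: a fully ankylosed cervical spine with a normal lumbar spine.
example_total = total_msasss([3] * 12, [0] * 12)
```

How unreadable or missing vertebral corners were handled is not described here, so the sketch simply rejects incomplete input.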
Investigators at Bio-Imaging Technologies (Newtown, PA and Leiden, The Netherlands) digitized all radiographs from the OASIS and ASSERT cohorts; blinding was done to conceal patient identity as well as the origin and time point of each radiograph. All radiographs were mixed and transferred electronically, in one batch, to the 2 readers. The serial radiographs were grouped per patient, with blinding for time sequence. Two independent readers (XB and HH) and an adjudicator (AvT) who were unaware of the OASIS radiographic data met to discuss scoring rules, to improve the consistency in scoring of radiographic changes. The 2 readers performed the independent assessment of the radiographs for the entire population (ASSERT and OASIS). The adjudicator only evaluated the data from patients whose “change score” differed between the readers by a predefined threshold of at least 5 units.
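The adjudication trigger described above amounts to a simple threshold check on the two readers' change scores. The following Python sketch is illustrative only; the function name is hypothetical, and how the adjudicator's reading was subsequently combined with the readers' scores is not specified in the text, so it is not modeled.

```python
# Illustrative sketch; the name needs_adjudication is hypothetical.
ADJUDICATION_THRESHOLD = 5  # mSASSS units, the predefined threshold in the text


def needs_adjudication(change_reader_1, change_reader_2,
                       threshold=ADJUDICATION_THRESHOLD):
    """True when the two readers' change scores differ by at least the threshold."""
    return abs(change_reader_1 - change_reader_2) >= threshold
```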
The total mSASSS at each time point was calculated. The difference between the scores at 2 time points (for example, the 2-year mSASSS minus the baseline mSASSS) was defined as the change score for that patient. The proportions of patients exhibiting at least 1-point, 2-point, 3-point, and 4-point increases (worsening) in the mSASSS from baseline to year 2 were also determined.
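As a minimal sketch of these definitions, the change score and the proportion of patients with at least a k-point increase could be computed as follows. The helper names and the toy values are hypothetical and are not study data.

```python
# Illustrative sketch; helper names and toy values are hypothetical.
def change_score(baseline, year2):
    """Change score: the 2-year mSASSS minus the baseline mSASSS."""
    return year2 - baseline


def proportion_with_increase(change_scores, min_increase):
    """Fraction of patients whose mSASSS worsened by at least min_increase points."""
    if not change_scores:
        raise ValueError("no change scores supplied")
    n_worse = sum(1 for change in change_scores if change >= min_increase)
    return n_worse / len(change_scores)


# Toy (baseline, year 2) pairs, for illustration only.
pairs = [(10, 10), (4, 5.5), (20, 24), (7, 6), (0, 2)]
changes = [change_score(baseline, year2) for baseline, year2 in pairs]
proportions = {k: proportion_with_increase(changes, k) for k in (1, 2, 3, 4)}
```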
All requirements for inclusion in the study populations were prespecified. Change in total mSASSS from baseline to year 2 was compared between patients in the OASIS cohort and patients in the ASSERT cohort who were randomly assigned to the infliximab arm at baseline, with baseline mSASSS used as a covariate. This end point was analyzed using analysis of covariance on van der Waerden normal scores, i.e., after a rank-based transformation of the change scores to normally distributed scores.
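For readers unfamiliar with van der Waerden normal scores, the transformation replaces each observation by the standard normal quantile of its rank divided by n + 1. The Python sketch below shows only this transformation under the common r/(n + 1) convention with average ranks for ties; it is an assumption that the study's SAS analysis used the same convention, and the subsequent analysis of covariance is not reproduced here. Function names are hypothetical.

```python
# Illustrative sketch of the normal-scores transformation; names are hypothetical.
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def van_der_waerden_scores(values):
    """Map each value to the normal quantile of rank / (n + 1)."""
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf(rank / (n + 1)) for rank in average_ranks(values)]
```

Because the scores depend only on ranks, extreme change scores (such as the large maximum values in the OASIS cohort) have limited influence on the comparison.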
To evaluate the consistency of the overall mSASSS results, the median difference and 95% confidence interval were determined for subgroups that had been prespecified in the analysis plan, including those based on demographics (sex, age), disease characteristics at baseline (HLA–B27 status), and measures of disease activity at baseline (C-reactive protein level, Bath Ankylosing Spondylitis Disease Activity Index [BASDAI], Bath Ankylosing Spondylitis Functional Index [BASFI], Bath Ankylosing Spondylitis Metrology Index [BASMI] [scale of 0–10 for each Bath Index], and patient's global assessment of disease activity on a 0–100-mm visual analog scale). In addition, sensitivity analyses were conducted to evaluate the robustness of the data comparisons between the 2 patient cohorts. In particular, changes from baseline mSASSS to year 2 mSASSS, using the baseline mSASSS as a covariate, were assessed using the OASIS match population for comparison with the ASSERT population.
Using the main scores from the 2 readers for the radiographs from all patients and the reread radiographic data from 10% of the randomly selected patients, an interreader intraclass correlation coefficient (ICC) and a read–reread (intrareader) ICC were estimated for determinations of the mSASSS at baseline and at year 2. Also, changes in the mSASSS from baseline to year 2 were compared between readers.
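The text does not state which form of the ICC was used. As one plausible illustration, the Python sketch below computes the two-way random-effects, absolute-agreement, single-measure coefficient, often written ICC(2,1), from a subjects-by-readers table of scores. The choice of this form and the function name are assumptions.

```python
# Illustrative sketch; ICC(2,1) is an assumed form, not confirmed by the text.
def icc_2_1(ratings):
    """ICC(2,1) for a list of rows (one per patient), one score per reader."""
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 patients and 2 readers, no missing scores")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: patients (rows), readers (columns), residual.
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

The same function can be applied to status scores (baseline or year 2) or to change scores; with change scores, low between-patient variance tends to pull the coefficient down even when readers rarely disagree by much.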
All statistical tests were 2-sided and were performed at an alpha level of 0.05, with no statistical adjustments for multiple comparisons. All analyses and summaries were conducted using SAS software, version 8.2 (SAS Institute, Cary, NC).
Characteristics of the patients in the OASIS primary population were generally similar to those of the patients in the ASSERT population, with the following exceptions. Higher proportions of patients in ASSERT than in OASIS were taking NSAIDs (92.0% versus 75.5%), had uveitis (35.8% versus 14.6%), and had psoriasis (8.0% versus 4.7%) at baseline. Also, patients in the ASSERT cohort, compared with the OASIS primary population, appeared to have greater disease activity and more limited physical function at baseline (mean BASDAI 6.5 versus 3.5; mean BASFI 5.7 versus 3.5) (Table 1). The mSASSS values at baseline were similar among the groups (Table 2). As expected, characteristics of the patients in the OASIS match population more closely resembled those of the patients in the ASSERT population at baseline (Table 1).
Table 1. Patient characteristics at baseline

| Characteristic | ASSERT cohort (n = 201)† | OASIS cohort‡, primary (n = 192) | OASIS cohort‡, match (n = 70) |
| --- | --- | --- | --- |
| Male | 157 (78.1) | 130 (67.7) | 47 (67.1) |
| Age, years, mean ± SD | 39.6 ± 10.6 | 44.0 ± 12.5 | 44.2 ± 12.5 |
| Disease duration, years, mean ± SD | 10.2 ± 8.7 | 11.3 ± 8.6 | 9.9 ± 8.8 |
| HLA–B27 positive§ | 173 (86.5) | 152 (84.0) | 59 (84.3) |
| Concomitant NSAID use | 185 (92.0) | 145 (75.5) | 56 (80.0) |
| BASDAI (range 0–10), mean ± SD score | 6.5 ± 1.5 | 3.5 ± 2.1 | 5.7 ± 1.3 |
| BASFI (range 0–10), mean ± SD score | 5.7 ± 1.9 | 3.5 ± 2.6 | 4.9 ± 2.3 |
| BASMI (range 0–10), mean ± SD score | 4.1 ± 2.1 | 3.0 ± 2.3 | 3.4 ± 2.4 |
| Uveitis at baseline | 72 (35.8) | 28 (14.6) | 11 (15.7) |
| Psoriasis at baseline | 16 (8.0) | 9 (4.7) | 3 (4.3) |
Table 2. Radiographic mSASSS at baseline and year 2, and change from baseline to year 2

| | ASSERT cohort | OASIS cohort†, primary | OASIS cohort†, match |
| --- | --- | --- | --- |
| No. of patients in subset | 201 | 192 | 70 |
| Radiographic mSASSS (range 0–72) | | | |
| Baseline, mean ± SD score | 17.7 ± 17.9 | 15.8 ± 18.1 | 17.5 ± 19.1 |
| Year 2, mean ± SD score | 18.1 ± 17.5 | 16.6 ± 18.4 | 18.4 ± 19.0 |
| Change in mSASSS‡ | | | |
| No. of patients evaluated | 156 | 165 | 61 |
| Mean ± SD change score | 0.9 ± 2.6 | 1.0 ± 3.2 | 1.2 ± 3.9 |
| Median change score | 0.0 | 0.0 | 0.0 |
| Interquartile range | −0.5, 1.2 | 0.0, 1.3 | −0.2, 1.5 |
| Range | −6.6, 12.2 | −3.1, 25.7 | −2.3, 25.7 |
| P vs. ASSERT | – | 0.541 | 0.683 |
Median changes in the mSASSS from baseline to year 2 were 0.0 for both the OASIS and the ASSERT cohorts (P = 0.541). Mean changes were also similar between the cohorts (Table 2 and Figure 1A). In addition, the results of all subgroup analyses showed no statistically significant differences in the mSASSS between the cohorts (Figure 2).
The results from all sensitivity analyses performed, including those using the OASIS match population for comparison (Table 2 and Figure 1B) as well as the analysis of results without the use of adjudication (n = 15), confirmed the lack of a significant difference in the mSASSS between the OASIS and the ASSERT cohorts. In addition, results of post hoc analyses that controlled for patient's age and disease duration showed that these variables had no effect on the difference in the mSASSS between the OASIS and ASSERT cohorts (P = 0.65).
Approximately one-third of patients in both the ASSERT cohort (34.0%) and the OASIS cohort (35.2%) had at least a 1-point increase (worsening) in the mSASSS from baseline to year 2. The study populations were also similar with regard to the proportions of patients who had at least a 2-point increase (19.9% versus 17.6%), 3-point increase (14.7% versus 10.3%), and 4-point increase (10.9% versus 7.3%) from baseline. In all cases, the observed differences between the ASSERT and OASIS cohorts were not significant.
We observed a high correlation of 0.91 between the 2 readings for the week 0 measurement of total mSASSS, a high correlation of 0.92 between the 2 readings for the year 2 measurement of total mSASSS, and a moderately high correlation of 0.62 for the assessment of change in mSASSS from baseline. Data analyzed by each reader independently yielded results that were consistent with the primary analysis results expressed as the average of the 2 readings.
In the present evaluation of radiographic data from the ASSERT study, in which we used the mSASSS scoring system to determine changes in structural damage to the spine, patients with AS who received infliximab from week 0 through week 96 did not show a statistically significant difference in inhibition of structural damage progression at year 2 when compared with the historical control OASIS cohort. The OASIS population was utilized as a historical control group for the ASSERT study because radiographic progression in the spine develops relatively slowly and may be detectable only after a minimum of 2 years in a substantial number of patients. Maintaining a placebo control group for 2 years is therefore not appropriate, particularly in view of the established efficacy of anti-TNF therapy in improving the signs and symptoms of AS (11). Specifically, sustained efficacy has been shown after 2 years of infliximab treatment in 279 patients with active AS, with more than 70% of patients achieving a response according to the Assessment of SpondyloArthritis international Society (ASAS) criteria for 20% improvement by week 102 (17).
Of particular relevance, the comparison with a historical control group was made under a special condition, whereby the OASIS radiographs were randomized and merged with the ASSERT radiographs. Subsequently, all radiographs were scored simultaneously without indication of film origin. In addition, the study was sufficiently powered (73%) to detect a difference of 0.8 in the mSASSS between 2 groups with sample sizes of 177 (ASSERT cohort receiving infliximab only) and 176 (OASIS primary cohort) and a standard deviation of 2.9. However, some limitations to these analyses remain, including the fact that the 2 cohorts of patients were not randomized as one unit, and that patients in the OASIS and ASSERT groups differed at baseline in several clinical disease indices. Nevertheless, none of the many subgroup analyses performed showed inhibition of radiographic progression in the infliximab-treated patients.
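The quoted power can be checked approximately with a two-sided, two-sample normal approximation. The method actually used for the power calculation is not stated, so the formula choice and the function name in this Python sketch are assumptions; only the inputs (difference of 0.8, standard deviation of 2.9, group sizes of 177 and 176, alpha of 0.05) come from the text.

```python
# Illustrative sketch; the normal approximation is an assumed method.
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sd, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test with a common SD."""
    std_normal = NormalDist()
    standard_error = sd * sqrt(1 / n1 + 1 / n2)
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / standard_error
    # Probability of rejecting in either tail under the alternative.
    return std_normal.cdf(shift - z_critical) + std_normal.cdf(-shift - z_critical)


power = two_sample_power(delta=0.8, sd=2.9, n1=177, n2=176)
```

Under these assumptions the result is close to the 73% quoted above; an exact calculation based on the t distribution or on rank-based scores would differ slightly.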
Moreover, the only known predictor of radiographic progression is the presence of structural damage, which was included as a covariate in this analysis. Although it is known that physical function is determined by both disease activity and structural damage, it is unknown what effect a small progression in the mSASSS over a 2-year period could have on long-term function. Furthermore, TNF blockers impart major direct gains on both disease activity and function.
As an additional limitation, the mSASSS method of scoring radiographic images assesses the anterior sites of the lumbar and cervical spine (10). Thus, only part of the structural damage that may occur in a patient with AS is being evaluated, with exclusion of the posterior sites of the vertebrae, the facet joints, and the thoracic spine. Nevertheless, the validated mSASSS is the preferred method and accepted standard for scoring radiographic progression in AS (10, 18). Because this was applied to both the active treatment and control groups, it is unlikely that our results were affected by the lack of assessment of other potential sites of progression.
With regard to reader agreement, we observed high correlations (0.91 and 0.92, respectively) between the 2 readings for both the week 0 and year 2 total mSASSS, and a moderate correlation (0.62) between the 2 readings for the change in total mSASSS from baseline. The relatively lower ICC of change scores is well known and is inherent in the use of the ICC statistic, which is not always the most appropriate statistic if only small change scores in a minority of patients are observed. These findings were consistent with the reader agreement previously observed in studies using MRI-based scoring methods in AS patients (19). Moreover, the analysis of the data by each reader separately revealed the same results as when results were analyzed as the average score from the 2 readers.
Results from our analysis were consistent with those from a previous study of infliximab in AS that showed limited inhibition of radiographic progression, as assessed by the mSASSS scoring system, after 2 years of therapy (1). In that previous study, the mean ± SD change in the mSASSS from baseline to year 2 was 0.4 ± 2.7 among 41 patients who received infliximab 5 mg/kg every 6 weeks, compared with 0.7 ± 2.8 in the 41 patients who received no controlled intervention (P not significant). It should be noted that in both the previous study and the present analysis, patients' radiographs were blinded for time sequence in the process of mSASSS scoring, a reading method that has been associated with documentation of significantly less progression, i.e., less reader bias, than when readers are aware of the chronology of the radiographs (20). In the previous study, however, the origin of the treatment group (and therefore the treatment) was known to the reader.
In a third study that assessed the effect of infliximab on structural changes in AS over 4 years, using the mSASSS scoring method, there was less radiographic progression in the infliximab-treated patients (change in mSASSS within 4 years 1.6) as compared with published data from the OASIS cohort (change in mSASSS within 4 years 4.4) (21). However, in that study, mSASSS values for the control group were derived from published OASIS data and were scored chronologically, and the cohorts were each scored by a different reader. Moreover, both readers were aware of the patients' treatment. Taken together, these factors limit the usefulness of those preliminary data, which were derived in the context of a less rigorous study design without a direct comparator.
In a study utilizing MRI assessments to monitor disease progression, treatment with infliximab was shown to result in improvement in spinal inflammation in patients with AS (22). Similarly, treatment with the anti-TNF agent etanercept was found, by MRI, to lead to improvement in spinal inflammation in patients with AS; however, inhibition of radiographic progression was not detected following treatment with etanercept in these patients (23–25). Moreover, the disease progression observed in both the etanercept treatment group and the OASIS control group in that trial was consistent with, and numerically similar to, the progression that we observed in patients treated with infliximab in this study, validating the comparison between the ASSERT (infliximab only) and OASIS populations (24).
The pathologic characteristics of AS may explain why anti-TNF therapy does not appear to have a major effect on radiographic progression. The hallmark of AS is bone formation rather than bone resorption, leading to the fusion of joints and intervertebral spaces and to the development of bridging syndesmophytes (26). In predominantly destructive diseases like rheumatoid arthritis, the proinflammatory effects of TNF coincide with osteoclast activation and subsequent bone erosion, and there is a clear longitudinal relationship between clinical disease activity and subsequent radiographic damage (27, 28). In contrast, data from the OASIS cohort have demonstrated no relationship between radiographic progression and clinical disease activity parameters in AS (26). Thus, although anti-TNF therapy plays an important role in treating AS by significantly improving the clinical symptoms, effective targets for inhibiting structural remodeling in AS still need to be developed.
Dr. van der Heijde had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study design. Van der Heijde, Landewé, Williamson, Baker, Braun.
Acquisition of data. Van der Heijde, Landewé, Baraliakos, Houben, van Tubergen, Baker, Braun.
Analysis and interpretation of data. Van der Heijde, Landewé, Houben, Williamson, Baker, Goldstein, Braun, Michelle Perate, MS (nonauthor; Centocor, Inc.), Mary Whitman, PhD (nonauthor; Centocor, Inc.).
Manuscript preparation. Van der Heijde, Landewé, Houben, van Tubergen, Williamson, Baker, Goldstein, Braun.
Statistical analysis. Xu.
Centocor and Schering-Plough funded this study. The ASSERT study group supervised the study and assisted with study design. Data were collected by the investigators and entered into a Centocor database. Centocor statisticians and programmers conducted the analyses, and the authors, assisted by a medical writer, prepared this manuscript. All authors reviewed and approved the manuscript content before submission and jointly agreed to submit the final version of the manuscript.