Severity of baseline magnetic resonance imaging–evident sacroiliitis and HLA–B27 status in early inflammatory back pain predict radiographically evident ankylosing spondylitis at eight years

Abstract

Objective

Magnetic resonance imaging (MRI) is increasingly used to detect sacroiliitis earlier. This study was undertaken to investigate what proportion of patients with MRI-evident sacroiliitis develop ankylosing spondylitis (AS) in the long term and whether there are predictors of outcome.

Methods

Consecutive undiagnosed patients with early inflammatory back pain (IBP) (of <2 years' duration) were assessed clinically and radiologically. Baseline imaging assessments included fat-suppressed MRI sequences of the sacroiliac joints and lumbar spine that were scored for active bone marrow edema representative of acute inflammation, and anteroposterior radiographs of the pelvis and lateral radiographs of the lumbar spine, which were scored using the Stoke Ankylosing Spondylitis Spine Score. Patients were reassessed clinically and radiographically after 8 years. The primary outcome was the modified New York criteria for AS at followup.

Results

Fifty patients were assessed at the beginning of the study, and 40 patients were followed up after a mean of 7.7 years. Of these 40 patients, 58% were HLA–B27 positive, and 98% met the European Spondylarthropathy Study Group criteria. At baseline, 33 (83%) of the 40 patients followed up had MRI-evident sacroiliitis, and 6 (12%) had unequivocal AS according to the modified New York criteria. At followup, despite significant improvements in clinical outcomes, 13 of 39 patients (33.3%) had AS according to the modified New York criteria. The combination of severe sacroiliitis seen on MRI with HLA–B27 positivity was an excellent predictor of future AS (likelihood ratio [LR] 8.0, specificity 92%), while mild or no sacroiliitis, regardless of HLA–B27 status, was a predictor of not having AS (LR 0.4, specificity 38%).

Conclusion

Our findings indicate that in patients with early IBP, a combination of severe sacroiliitis and HLA–B27 positivity has a high specificity for development of AS, compared with mild or no sacroiliitis, regardless of HLA–B27 status, which confers a low likelihood of developing AS. This has implications for the diagnosis of “early” AS and possibly for selection of more aggressive therapies.

The current diagnostic criteria for ankylosing spondylitis (AS), the modified New York criteria (1), rely on the presence of radiographically evident sacroiliitis, which may only appear after years of inflammatory back pain (IBP) (2). However, it is known that much of the pathologic condition in axial spondylarthritis (SpA) is related to perifibrocartilage osteitis, and magnetic resonance imaging (MRI) has been proven to be superior to radiography for detecting sacroiliac (SI) joint osteitis by the depiction of bone marrow edema. Although MRI is being increasingly used to detect sacroiliitis early, it is unclear what proportion of patients with MRI-evident sacroiliitis develops AS in the long term and whether there are predictors of outcome.

Thus far, 2 studies have looked at the utility of baseline MRI of the SI joints as a predictor of subsequent AS (3, 4). However, the followup period in both studies was short, and in one the number of patients with baseline bone marrow edema was small (3). It is therefore still unclear what proportion of patients with IBP and sacroiliitis seen on MRI actually develop AS in the long term. Knowledge of this in patients with early IBP is crucial, since it can take up to a decade for radiographically evident sacroiliitis that is diagnostic for AS to appear (2), and these “early AS” patients have disease activity and pain similar to that in patients with established AS (5).

Furthermore, HLA–B27 shows a striking relationship with AS and is associated with the severity of MRI-evident osteitis in the axial and peripheral skeleton (6). Our aim was therefore to investigate the predictive value of the severity of MRI-evident sacroiliitis and the presence of HLA–B27 in the long-term development of AS.

PATIENTS AND METHODS

Study design and recruitment.

This was a prospective, longitudinal, inception cohort study (7). Consecutive undiagnosed patients were recruited from regional early arthritis and IBP clinics. To be included in the study, patients had to have IBP (according to the Calin criteria [8]) of <2 years' duration. Ethics approval was obtained, and all patients gave informed consent.

MRI assessment and scoring.

All patients had MRI assessments at baseline involving coronal oblique T1 and T2 spectral presaturation with inversion recovery (SPIR) sequences of the SI joints and sagittal T1 and T2 SPIR sequences of the lumbar spine. MRIs were scored for bone marrow edema using the Leeds Scoring System, a semiquantitative global scale. Each quadrant of each SI joint was graded on a scale of 1–3, where 1 = <25% of quadrant affected; 2 = 25–75% of quadrant affected; and 3 = >75% of quadrant affected, and a total lesion count in the spine was performed. This system has been used in previous studies (9) and reported by Outcome Measures in Rheumatology Clinical Trials (OMERACT) (10) as being equal to alternative scoring systems. Patients were categorized as having mild, moderate, or severe sacroiliitis if the highest bone marrow edema grade in any SI joint quadrant was 1, 2, or 3, respectively.
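The categorization rule described above (highest bone marrow edema grade in any SI joint quadrant) can be sketched in a few lines. This is an illustrative sketch only, not the authors' scoring software: the function name, the dictionary layout, and the "none" label for grade 0 are assumptions, the last inferred from the grade 0 patients reported in the Results.

```python
from typing import Dict, List

# Hypothetical label mapping. Grade 0 ("none") is an assumption inferred from
# the grade 0 patients reported in the Results; grades 1-3 follow the text.
SEVERITY_LABELS = {0: "none", 1: "mild", 2: "moderate", 3: "severe"}


def classify_sacroiliitis(quadrant_grades: Dict[str, List[int]]) -> str:
    """Return the severity category given per-quadrant edema grades (0-3).

    quadrant_grades maps a joint name (e.g. "left", "right") to the grades
    of its quadrants. The category is set by the single highest grade.
    """
    all_grades = [g for grades in quadrant_grades.values() for g in grades]
    if not all_grades:
        raise ValueError("no quadrant grades supplied")
    if any(g not in SEVERITY_LABELS for g in all_grades):
        raise ValueError("grades must be integers from 0 to 3")
    return SEVERITY_LABELS[max(all_grades)]


# Hypothetical patient: one quadrant of the left joint has grade 3 edema.
example = {"left": [1, 0, 3, 2], "right": [0, 0, 1, 1]}
print(classify_sacroiliitis(example))
```

Because only the maximum grade matters, a patient with a single grade 3 quadrant is classed as severe even if every other quadrant is normal.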

An experienced observer (DM), who was blinded with regard to clinical details, anonymously scored the scans. Intrareader and interreader reliability have previously been reported (7, 9). The reliability for distinguishing different grades of MRI bone marrow edema was good (κ = 0.73).

Radiographic assessment.

All patients had anteroposterior radiographs of the pelvis and lateral radiographs of the lumbar spine at baseline and at followup. Sacroiliac joint radiographs were scored using the modified New York criteria (1), and lumbar spine radiographs were scored using the Stoke Ankylosing Spondylitis Spine Score (SASSS) (11), a valid and reliable radiographic scoring system (12). Scoring of paired radiographs was completed by an investigator (DM) who was blinded with regard to clinical details and chronological order. For intrareader reliability, quadratic weights were chosen for kappa tables with >2 categories (used to assess agreement in the SI joints) because quadratic-weighted kappa values approximate intraclass correlation coefficients (ICCs), which were used to assess agreement in the spine. At baseline, the weighted kappa values were 0.60 and 0.50 for the left and right SI joints, respectively, and the ICC was 0.79 for the spine. For change in AS status from baseline to followup, the calculated kappa was 0.68 (substantial agreement).
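For readers unfamiliar with quadratic-weighted kappa, the following is a minimal sketch of the standard calculation for two sets of ordinal readings. It is not the authors' analysis (which was run in SPSS). The function name and the example readings are hypothetical, and the use of 5 categories assumes radiographic sacroiliitis grades 0–4.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    first: Sequence[int], second: Sequence[int], n_categories: int
) -> float:
    """Cohen's kappa with quadratic disagreement weights.

    first, second: paired ordinal grades coded 0 .. n_categories - 1.
    """
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("need two non-empty readings of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if any(not 0 <= g < n_categories for g in list(first) + list(second)):
        raise ValueError("grade outside 0 .. n_categories - 1")

    n = len(first)
    # Cross-tabulate the two readings.
    observed = [[0] * n_categories for _ in range(n_categories)]
    for a, b in zip(first, second):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_categories))
                  for j in range(n_categories)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Disagreement weight grows with the squared distance in grades.
            weight = (i - j) ** 2 / (n_categories - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all grades are identical")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical repeat readings of 10 SI joints (grades 0-4), not study data.
reading_1 = [0, 1, 2, 3, 4, 2, 1, 0, 3, 2]
reading_2 = [0, 1, 2, 2, 4, 3, 1, 1, 3, 2]
print(round(quadratic_weighted_kappa(reading_1, reading_2, 5), 2))
```

Quadratic weights penalize a two-grade disagreement four times as heavily as a one-grade disagreement, which is why the statistic behaves much like an ICC on ordinal grades.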

Clinical assessment.

Clinical evaluation included assessment for European Spondylarthropathy Study Group (ESSG) criteria for SpA (13), visual analog scale (VAS) assessments of night pain, day pain, and global health, Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) (14), Bath Ankylosing Spondylitis Functional Index (BASFI) (15), Ankylosing Spondylitis Quality of Life (ASQoL) scale (16), C-reactive protein (CRP) level, and HLA–B27 status. Clinical and radiographic assessments were repeated at followup.

Statistical analysis.

Cross-table analysis for radiographic outcome was completed using the modified New York criteria for AS (1) as the gold standard. Specificity, sensitivity, positive and negative predictive values, and positive likelihood ratios (LRs) were calculated for different grades of MRI-evident sacroiliitis and for presence and absence of HLA–B27. For clinical and serologic data, differences between groups were assessed using the Mann-Whitney U test, and changes over time within the whole cohort were assessed using Wilcoxon's signed rank test. Analyses were performed with SPSS software, version 15.0.1.1 (SPSS, Chicago, IL).
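The cross-table measures named above follow directly from a 2 × 2 table of predictor status against AS outcome. The sketch below is illustrative and is not the authors' SPSS analysis. The cell counts in the example are an assumption: they are back-calculated from the percentages reported in Table 2 for severe sacroiliitis with HLA–B27 positivity (n = 10, with 13 of 39 patients developing AS) and are not stated in the paper.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute diagnostic accuracy measures from 2 x 2 cell counts.

    tp: predictor present, AS at followup    fp: predictor present, no AS
    fn: predictor absent, AS at followup     tn: predictor absent, no AS
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("cell counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("each row and column total must be non-zero")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (tn + fp)  # equals 1 - specificity
    lr_positive = (
        float("inf") if fp == 0 else sensitivity / false_positive_rate
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": lr_positive,
    }


# Assumed counts, back-calculated from the reported percentages (see lead-in).
counts = {"tp": 8, "fp": 2, "fn": 5, "tn": 24}
for name, value in diagnostic_metrics(**counts).items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, the rounded results agree with the values reported for that predictor (specificity 92%, sensitivity 62%, PPV 80%, NPV 83%, positive LR 8.0).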

RESULTS

Baseline clinical characteristics.

Fifty patients from the inception cohort (7) were invited to take part in the study. Of these, 44 (88%) met the ESSG criteria, 44 (88%) reported buttock pain, and 43 (86%) had nocturnal pain. Patient demographics are shown in Table 1.

Table 1. Patient characteristics at baseline and followup*

| Characteristic | Baseline in patients not returning for followup (n = 10) | Baseline in patients returning for followup (n = 40) | Followup (n = 40) | P |
| --- | --- | --- | --- | --- |
| Age, mean (range) years | 28 (16–38) | 31 (18–48) | 39 (25–56) | |
| Male sex, no. (%) | 6 (60) | 25 (63) | 25 (63) | |
| Disease duration, median (range) weeks | 20 (6–72) | 24 (2–104) | 400 (308–521) | |
| HLA–B27 positive, no. (%) | 6 (60) | 23 (58) | 23 (58) | |
| Met ESSG criteria, no. (%) | 8 (80) | 36 (90) | 39 (98) | |
| Back pain during day, median (IQR) mm (0–100 VAS) | 55 (22–77) | 49 (34–68) | 16 (5–35) | 0.002 |
| Back pain during night, median (IQR) mm (0–100 VAS) | 61 (7–86) | 50 (28–75) | 12 (1–37) | 0.002 |
| BASFI, median (IQR) (0–10 scale) | 3.26 (0.75–4.25) | 4.33 (2.43–6.35) | 1.56 (0.49–2.48) | <0.001 |
| BASDAI, median (IQR) (0–10 scale) | 4.05 (2.33–6.25) | 5.8 (3.85–7.53) | 2.90 (1.14–4.41) | 0.001 |
| ASQoL score, median (IQR) (0–18 scale) | 6 (4–10) | 9 (6–13) | 6 (1–9) | 0.003 |
| CRP, median (IQR) mg/liter | 8.5 (5–13) | 9 (<5–21) | 5 (<5–<5) | <0.001 |
| % ever taken NSAIDs | 90 | 90 | 100 | |

* Patients were followed up after a mean of 7.7 years. ESSG = European Spondylarthropathy Study Group; IQR = interquartile range; VAS = visual analog scale; BASFI = Bath Ankylosing Spondylitis Functional Index; BASDAI = Bath Ankylosing Spondylitis Disease Activity Index; ASQoL = Ankylosing Spondylitis Quality of Life; CRP = C-reactive protein; NSAIDs = nonsteroidal antiinflammatory drugs.

P values indicate the difference between baseline (in patients returning for followup) and followup, by Wilcoxon's test.

Baseline imaging results.

At presentation, 6 patients (all of whom were HLA–B27 positive) had unequivocal radiographically evident sacroiliitis of grade 3 or greater and had AS according to the modified New York criteria. Two other patients also met the modified New York criteria for AS but had less definite bilateral grade 2 sacroiliitis. Seven other patients had unilateral grade 2 sacroiliitis, and 35 had normal radiographs of the SI joints (grade 0 or 1). The majority (84%; n = 42) had evidence of sacroiliitis on MRI. Of the 6 patients with unequivocal AS on baseline radiography, 5 had either moderate or severe sacroiliitis.

Characteristics at followup.

Forty (80%) of the patients (25 men and 15 women) agreed to followup at a mean of 7.7 years. The mean age of the patients at followup was 39 years, 23 (58%) were HLA–B27 positive, and 39 (98%) met the ESSG criteria. One patient declined repeat radiography. Of the 40 patients who were followed up, 33 (83%) had MRI-evident sacroiliitis at baseline (10 had grade 3, 10 had grade 2, 13 had grade 1, and 7 had grade 0). All patients had been treated with nonsteroidal antiinflammatory drugs (NSAIDs), 17 (43%) had been treated with disease-modifying antirheumatic drugs (DMARDs), and 5 (12.5%) had received anti–tumor necrosis factor (anti-TNF) therapies for short periods of time (for a median of 1 year [range 1–3 years]) subsequent to baseline assessments.

At followup, 13 of 39 patients (33.3%) (11 men and 2 women) had AS according to the modified New York criteria (85% were HLA–B27 positive). Of these, 8 had AS (7 were HLA–B27 positive), 2 had associated inflammatory bowel disease (IBD), 2 had reactive AS, and 1 had psoriasis. Of the 27 patients who did not have AS, 3 had psoriatic SpA, 6 had reactive SpA, 1 had IBD SpA, and 17 had undifferentiated SpA (9 were HLA–B27 positive).

Overall, mean clinical scores, quality of life scores, and CRP level improved significantly for the cohort as a whole (Table 1). However, analysis of subgroups showed that patients who had AS at followup were proportionately more likely to have worsened, compared with patients who did not have AS, according to the BASDAI (33% versus 5%), BASFI (33% versus 10%), ASQoL scale (30% versus 19%), and VAS assessment of day pain (30% versus 22%).

Analysis of predictors of AS showed that severe sacroiliitis seen on MRI (found in 10 patients, all of whom were HLA–B27 positive) had a very high specificity (92%) for future development of AS according to the modified New York criteria, and AS was 20 times more likely to develop in patients with severe sacroiliitis than in subjects with mild or no sacroiliitis, regardless of HLA–B27 status (Table 2 and Figure 1). If moderate sacroiliitis was combined with severe sacroiliitis, regardless of HLA–B27 status, then the specificity for future development of AS was 62%, and the sensitivity was 77%. Specificity improved to 77% when moderate/severe sacroiliitis was combined with HLA–B27 positivity. Bilateral sacroiliitis seen on MRI conferred no predictive value (LR 1.0) compared with unilateral sacroiliitis.
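The "20 times" figure can be traced to the ratio of the two positive likelihood ratios. The worked arithmetic below is a sketch under stated assumptions: the cell counts (8 of 13 and 2 of 26 for severe sacroiliitis with HLA–B27 positivity; 3 of 13 and 16 of 26 for mild sacroiliitis) are back-calculated from the percentages in Table 2 and are not given in the paper, and the mild group in Table 2 is used to stand in for the "mild or no sacroiliitis" comparison in the text.

```latex
\[
\mathrm{LR}^{+} = \frac{\text{sensitivity}}{1 - \text{specificity}}
\]
\[
\mathrm{LR}^{+}_{\text{severe, B27+}} = \frac{8/13}{2/26} = 8.0,
\qquad
\mathrm{LR}^{+}_{\text{mild}} = \frac{3/13}{16/26} = 0.375 \approx 0.4
\]
\[
\frac{\mathrm{LR}^{+}_{\text{severe, B27+}}}{\mathrm{LR}^{+}_{\text{mild}}}
  = \frac{8.0}{0.4} = 20
\]
```

The factor of 20 uses the rounded LR of 0.4. With the unrounded value of 0.375, the ratio would be about 21.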

Table 2. Baseline predictors of developing AS*

| Predictor | Specificity, % | Sensitivity, % | PPV, % | NPV, % | Positive LR |
| --- | --- | --- | --- | --- | --- |
| Severe sacroiliitis and HLA–B27 positivity (n = 10) | 92 | 62 | 80 | 83 | 8.0 |
| Moderate/severe sacroiliitis and HLA–B27 positivity (n = 16) | 77 | 77 | 63 | 87 | 3.3 |
| Moderate/severe sacroiliitis only (n = 20) | 62 | 77 | 50 | 84 | 2.0 |
| HLA–B27 positivity only (n = 23) | 54 | 85 | 48 | 88 | 1.8 |
| Mild sacroiliitis only (n = 19) | 38 | 23 | 16 | 50 | 0.4 |
| Bilateral sacroiliitis (any grade) (n = 18) | 53 | 46 | 33 | 67 | 1.0 |
| Persistence of moderate or severe sacroiliitis at 12 months (n = 6) | 84 | 33 | 50 | 73 | 2.1 |

* Severe sacroiliitis was defined as the presence of grade 3 bone marrow edema seen on magnetic resonance imaging (MRI) in any quadrant of either sacroiliac (SI) joint, moderate sacroiliitis was defined as the presence of at least grade 2 bone marrow edema seen on MRI in any quadrant of either SI joint, and mild sacroiliitis was defined as a maximum of grade 1 bone marrow edema seen on MRI in any quadrant of either SI joint. AS = ankylosing spondylitis; PPV = positive predictive value; NPV = negative predictive value; LR = likelihood ratio.

Figure 1.

Baseline radiograph, baseline magnetic resonance image (MRI), and followup radiograph of a representative patient. A, Baseline radiograph of the sacroiliac (SI) joints. Grade 0 (right) and grade 1 (left) sacroiliitis according to the New York criteria were observed. B, Baseline T2 spectral presaturation with inversion recovery MRI of the SI joints, demonstrating bilateral grade 3 bone marrow edema (arrow). C, Followup radiograph of the SI joints. Bilateral grade 3 sacroiliitis according to the New York criteria was observed.

We have previously shown that HLA–B27 positivity is associated with the severity and persistence of MRI-evident sacroiliitis (7) after 1 year of followup. However, in the present study, analysis of persistent MRI-evident sacroiliitis at 12 months gave no additional predictive value of the future development of AS according to the modified New York criteria over baseline severity of sacroiliitis and HLA–B27 status (Table 2).

Lesions were observed on the MRI of the lumbar spine in a minority of patients at baseline (24 lesions in 12 patients). Baseline radiographic abnormality in the lumbar spine was seen in only 4 patients (median SASSS 3 [range 2–4]). Little radiographic progression was noted at followup (median SASSS 6 [range 1–41]), with the exception of 1 patient, in whom the SASSS progressed from 4 to 41. There was no significant relationship between the baseline MRI of the lumbar spine and followup radiographs.

DISCUSSION

It is well recognized that it may take up to a decade before a diagnosis of AS or SpA (17) is established in patients with IBP. In addition, it is known that the disease is heterogeneous, with only mild sacroiliitis developing in some patients, and the disease not progressing to spinal fusion in many. Given the availability of anti-TNF therapies for symptom control in SpA and their possible utility in ameliorating osteitis, there is a need to better define those groups with a worse prognosis at baseline. This study showed that the combination of severe sacroiliitis at a mean disease duration of 37 weeks with HLA–B27 positivity predicted with 92% specificity those patients likely to have radiographically evident AS at 8 years.

Our findings support the concept that MRI is of great utility for the early diagnosis of AS. It has been reported that, in patients with well-established symptoms, the combination of MRI features typical of sacroiliitis with IBP and HLA–B27 positivity confers a 90% probability of axial SpA (18). However, the utility of MRI in predicting long-term radiographic structural change consistent with the modified New York criteria for AS has not previously been reported in early IBP.

Some previous studies have looked at the utility of MRI in early SpA. One longitudinal study investigated MRI, computed tomography, and radiographic changes in the SI joints over 1 year in 34 patients with early IBP (4). No correlation between MRI scores and radiographic changes at followup was found. Oostveen et al (3) investigated 25 HLA–B27–positive patients to assess the diagnostic value of MRI in the detection of early sacroiliitis. Of those patients followed up at 3 years, 9 (64%) of 14 with structural changes in sacroiliitis on MRI, and 6 (67%) of 9 with inflammatory changes on MRI at baseline, subsequently developed AS according to the modified New York criteria. Our study had a number of advantages, since more participants, including HLA–B27–negative patients, were observed, patients had shorter disease duration at baseline, and there was a longer followup period. Oostveen et al (3) emphasized the positive predictive value of structural changes seen on MRI for future radiographically evident AS rather than the predictive role of inflammatory changes. In the present study, we concentrated on grading acute inflammatory osteitis in the form of bone marrow edema rather than structural changes, since OMERACT 7 (10) concluded that scoring inflammation is more important than scoring structural changes on MRI. Also, in the present study of very early disease, we noted that structural changes were uncommon.

In our study, 6 patients with IBP had definite AS according to the modified New York criteria at baseline, after a mean of only 48 weeks of symptoms. We were surprised by this, given that radiographically evident AS can take up to a decade to develop (2). However, similar results were demonstrated by a previous study of an almost identical cohort with early IBP (4), which suggests that radiographic change may also occur in early IBP. Of particular note, however, in the present study 2 additional patients were classified as having radiographically evident AS at baseline but had bilateral grade 2 radiographically evident sacroiliitis. This is the minimal grade possible to meet the criteria, and in a region with variable radiographic interreader reliability (19), may have been considered by others to be insufficient to meet the modified New York criteria for AS.

Overall, there was clinical improvement in the cohort over 8 years. This may be for a number of reasons. First, in this cohort with very early disease, the patients presented during a disease flare but were not necessarily reviewed during a flare. Second, this was a mixed group of patients with SpA, with some subgroups, such as patients with reactive SpA, more likely to improve than others. Third, at followup all patients had received NSAIDs and/or DMARDs, and a small proportion had received anti-TNF therapy. Clinical and serologic improvements in a similar cohort after 1 year of followup have previously been reported (4).

There were no significant differences in improvement in clinical outcomes between patients with AS and those who did not have AS at followup. However, the proportion of patients who worsened clinically was higher in the group with AS at followup than in the group without AS. The difference was not significant because of the small number of patients in the AS group.

Overall, this study identifies potentially useful markers of future development of AS according to the modified New York criteria, based on biologically plausible reasoning. Severe sacroiliitis in combination with HLA–B27 positivity has a specificity of 92%, an LR of 8, and a sensitivity of 62% for development of AS. Conversely, in patients with IBP, mild or no sacroiliitis on MRI has a very low specificity of 38% and an LR of 0.4, and such patients are 20 times less likely than patients with severe sacroiliitis to develop AS according to the modified New York criteria, regardless of HLA–B27 status. For a screening test for a condition in which long-term, expensive treatments, such as anti-TNF therapy, are potentially required, a high specificity is preferable, since it would be undesirable to make such a large long-term financial commitment in patients with false-positive results.
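One way to read these likelihood ratios is through the standard odds form of Bayes' theorem. The worked example below is a sketch, not part of the original analysis. It takes the 13 of 39 patients with AS at followup as the pretest probability in this cohort, an assumption that would not hold in a population with a different prevalence of AS.

```latex
\[
\text{pretest odds} = \frac{13/39}{1 - 13/39} = \frac{13}{26} = 0.5
\]
\[
\text{posttest odds} = \text{pretest odds} \times \mathrm{LR}^{+}
\]
\[
\text{Severe sacroiliitis, HLA--B27 positive:}\quad
0.5 \times 8.0 = 4.0
\;\Rightarrow\;
P(\text{AS}) = \frac{4.0}{1 + 4.0} = 0.80
\]
\[
\text{Mild sacroiliitis:}\quad
0.5 \times 0.4 = 0.2
\;\Rightarrow\;
P(\text{AS}) = \frac{0.2}{1 + 0.2} \approx 0.17
\]
```

These posttest probabilities are consistent with the positive predictive values reported in Table 2 (80% and 16%, respectively). The small difference in the second case comes from rounding the LR to 0.4.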

In conclusion, this study showed that in patients with early IBP, a combination of severe sacroiliitis and HLA–B27 positivity had a high specificity for the future development of AS. In contrast, there was a very low likelihood that AS would develop in patients with mild or no osteitis, regardless of HLA–B27 status. These findings have important implications for the diagnosis of “early” AS and possibly for selection of patients for early, more aggressive interventions.

AUTHOR CONTRIBUTIONS

Dr. Marzo-Ortega had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study design. Bennett, McGonagle, Sivera, Emery, Marzo-Ortega.

Acquisition of data. Bennett, Emery, Marzo-Ortega.

Analysis and interpretation of data. Bennett, McGonagle, Hensor, Emery, Marzo-Ortega.

Manuscript preparation. Bennett, McGonagle, Coates, Emery, Marzo-Ortega.

Statistical analysis. Bennett, Hensor.

Scoring system development. McGonagle, O'Connor, Marzo-Ortega.
