To systematically evaluate the literature addressing the role of magnetic resonance imaging (MRI) in the diagnosis and prognosis of early undifferentiated inflammatory arthritis and rheumatoid arthritis (RA).
We performed a systematic literature review of the performance characteristics of MRI for diagnosing and prognosticating RA. We searched Ovid, supplementing this with manual searches of bibliographies, journals, meeting proceedings, and the ClinicalTrials.gov web site. To identify diagnostic studies, we included studies of any duration that prospectively examined whether MRI findings predicted RA diagnosis and reported adequate information to calculate sensitivity and specificity. To identify prognostic studies, we included prospective studies with at least a 12-month followup period that measured both baseline MRI findings and clinical and/or radiographic outcomes.
For diagnostic studies (n = 11), sensitivity and specificity of MRI findings for RA diagnosis ranged from 20–100% and 0–100%, respectively, depending upon the criteria used. Diagnostic performance of MRI improved when lower-quality studies or studies with longer disease duration were excluded. For prognostic studies (n = 17), MRI findings did not predict clinical remission, and the ability to predict radiographic progression varied widely (range 18–100% for sensitivity and 5.9–97% for specificity). Restricting the analysis to specific MRI findings or earlier disease improved MRI prognostic performance. The only prognostic study reporting 100% of a priori quality criteria found MRI bone edema to be the strongest predictor of radiographic progression.
Data evaluating MRI for the diagnosis and prognosis of early RA are currently inadequate to justify widespread use of this technology for these purposes, although MRI bone edema may be predictive of progression in certain RA populations.
Rheumatoid arthritis (RA) is a common debilitating disease (1). Treatment with disease-modifying antirheumatic drugs (DMARDs) provides effective symptom control and decreased risk of disability. Evidence supports a “window of opportunity,” perhaps within 3 to 6 months of symptom onset, during which initiating treatment maximizes improvement in long-term outcomes (2, 3). Recent evidence suggests that early initiation of aggressive treatment might improve the chance of sustained remission (4). Some have even raised the possibility of cure (5). However, while DMARDs are effective in reducing inflammation and restoring function, they are not without cost, including infectious and other complications (6), and treating all patients would expose some individuals to unacceptable levels of risk from treatment. Methods to improve RA diagnosis and prognostication are therefore of high priority. Magnetic resonance imaging (MRI) has been proposed as a means to improve rheumatologists' ability to diagnose early RA and predict which patients will likely develop progressive disease and thus should receive more aggressive treatment. While the overall utilization of MRI in RA is unknown, an unpublished national survey of rheumatologists found that >30% had used MRI for management of RA patients within the last year (Blum M: unpublished observations).
The ability of MRI to provide additional and more sensitive information than clinical examination or conventional radiography is well established (7, 8). MRI can identify bone erosions earlier than conventional radiography (7) and can detect bone marrow edema and synovitis, which may be important precursors to erosive disease (9, 10). Given these properties, MRI has been proposed as a diagnostic tool among individuals with suspected inflammatory arthritis and as a prognostic tool among those with known RA. However, MRI performance characteristics in the diagnosis and prognostication of early RA are not well defined, and false-positive results may counteract the benefits of high MRI sensitivity. Given the importance of accurate early diagnosis and prognostication in early RA and the rising utilization of MRI in RA, our objective was to systematically evaluate published reports describing the diagnostic and prognostic capability of MRI findings in undifferentiated inflammatory arthritis and early RA, respectively.
The following describes the eligibility criteria, search strategies, a priori criteria for methodologic quality, outcome measures, data extraction methods, and data analysis strategies for diagnostic and prognostic studies. We employed methods based upon Cochrane Collaboration guidelines, including systematic search strategies for all published literature to identify relevant articles, followed by comprehensive, standardized data extraction of relevant outcomes and study characteristics (11). We established a priori inclusion and exclusion criteria and employed both standard and topic-specific methodologic quality assessments.
The eligibility criteria for included diagnostic studies consisted of the following: 1) prospective English language studies of any duration that examined the ability of hand or wrist MRI findings to predict an RA diagnosis among adult patients with undifferentiated polyarthritis of the hand or wrist; 2) used American College of Rheumatology (ACR) 1987 revised criteria (12) and/or clinical assessment by a rheumatologist as the diagnostic gold standard; 3) reported adequate information to calculate sensitivity and specificity; and 4) reported data for >10 patients.
Undifferentiated polyarthritis was defined by published criteria (13, 14). These criteria included patients with characteristics, history, examination, or laboratory data suggesting an inflammatory arthritis, but without a specific diagnosis of a rheumatic disorder. Presentations in this category include arthralgias in a distribution typical of RA, with or without abnormal inflammatory markers or a positive rheumatoid factor, a dramatic response to corticosteroid medications, a convincing history of joint swelling, specific extraarticular features (e.g., nodules), or atypical joint swelling (e.g., asymmetric, oligoarticular, or unusual joint patterns) (14–16). Where possible, data from mixed populations of undifferentiated polyarthritis, arthralgia, and early suspected RA were examined separately.
Using Ovid, one reviewer (LGS) searched the specialized Cochrane Central Register of Controlled Trials and Medline (through April week 1, 2010). In order to capture all potential studies, we employed a broad search strategy using medical subject headings consisting of “exp Magnetic Resonance Imaging/” combined with “exp Arthritis/” or “exp Arthritis, Rheumatoid/”. Citation abstracts were searched by hand for studies meeting the above inclusion criteria. A secondary manual search included: 1) bibliographies of all included studies and relevant review articles identified by the preceding search within 3 years; 2) abstracts and meeting proceedings of journals and professional societies within 3 years; and 3) the ClinicalTrials.gov web site.
To identify and account for potential sources of selection and measurement biases, we assessed methodologic quality based on recently updated recommendations for assessing the methodologic quality of diagnostic studies (17, 18). We used the 14 items of the Quality Assessment of Diagnostic Accuracy Studies checklist (17) (see Supplementary Appendix A, available in the online version of this article at http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2151-4658) to assess study quality. The Outcome Measures in Rheumatology Clinical Trials (OMERACT) group also published the Rheumatoid Arthritis Magnetic Resonance Imaging Scoring system (RAMRIS) that combines MRI evidence of erosions, edema, and synovitis into a validated, reproducible scoring system (19) and recommended minimum core sequences to improve the quality of research in this field (20). Therefore, we also considered whether or not the study obtained minimum core MRI sequences recommended by the recent OMERACT working group and/or employed a validated MRI scoring method (i.e., the OMERACT RAMRIS).
For diagnosis studies, the primary outcome was the ability to predict RA diagnosis, defined as fulfilling the ACR 1987 revised criteria (12) and/or clinical assessment by a rheumatologist at followup, and reported as sensitivity and specificity.
All articles were abstracted in duplicate by two independent reviewers using standard abstraction forms. Data abstracted from diagnostic studies included: 1) study features (design, sample size, handling of missing data); 2) baseline patient demographics and clinical, plain radiographic, and MRI data; and 3) sensitivity and specificity (as reported and/or calculated from the information provided). Discrepancies were resolved by consensus discussion between the reviewers.
The following measures of test accuracy were computed for each study: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR), and negative LR. Sensitivity and specificity for test thresholds identified in each study were used to plot a summary receiver operating characteristic (ROC) curve and calculate the area under the curve (AUC). If appropriate, Cochran's Q statistic was used to assess homogeneity in measures of test accuracy across studies (P values greater than 0.1 were taken to indicate homogeneity). For homogeneous data, hierarchical summary ROC and bivariate random-effects models were used to calculate average sensitivity and specificity values. We also stratified the analysis by study and patient characteristics, including quality criteria, specific MRI parameters, and disease duration.
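As an illustration of the accuracy measures listed above, the following is a minimal sketch of how they follow from a 2 × 2 table of test results against the reference diagnosis. The function names and the example counts are hypothetical and are not taken from the review; the sketch uses the standard uncorrected formulas and assumes that both the diseased and nondiseased groups are non-empty.

```python
import math


def likelihood_ratios(sens, spec):
    """Positive and negative likelihood ratios from sensitivity and specificity (fractions)."""
    lr_pos = sens / (1 - spec) if spec < 1 else math.inf
    lr_neg = (1 - sens) / spec if spec > 0 else math.inf
    return lr_pos, lr_neg


def accuracy_measures(tp, fp, fn, tn):
    """Test-accuracy measures from the four cells of a 2 x 2 table.

    tp/fn: patients with the reference diagnosis who test positive/negative.
    fp/tn: patients without the reference diagnosis who test positive/negative.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    # Predictive values are undefined when no patient has that test result.
    ppv = tp / (tp + fp) if (tp + fp) else math.nan
    npv = tn / (tn + fn) if (tn + fn) else math.nan
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts, for illustration only.
example = accuracy_measures(tp=40, fp=5, fn=10, tn=45)
```

For example, a sensitivity of 82.5% and a specificity of 84.8% give `likelihood_ratios(0.825, 0.848)` of approximately 5.43 and 0.21. Studies with a sensitivity or specificity of exactly 0% or 100% yield zero or infinite ratios under these uncorrected formulas, so published values for such studies may have been computed with a correction that is not shown here.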
The eligibility criteria for including studies on prognosis consisted of: 1) prospective English language study of a duration of at least 12 months that collected and reported hand, wrist, and/or foot MRI and plain radiographic and any clinical data on early RA patients; 2) RA was defined by 1987 ACR or equivalent classification criteria (12); 3) “early” RA was broadly defined as a disease duration of <60 months in order to capture all relevant studies; and 4) reported data for >10 patients.
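The prognostic eligibility criteria above can be read as a simple screening rule. The sketch below is only an illustration of that rule; the `CandidateStudy` record and its field names are hypothetical and do not correspond to the abstraction forms used in the review.

```python
from dataclasses import dataclass


@dataclass
class CandidateStudy:
    """Hypothetical record of the fields needed to screen one prognostic study."""
    prospective: bool
    english_language: bool
    followup_months: float
    reports_mri_radiographic_and_clinical_data: bool
    ra_by_acr1987_or_equivalent: bool
    disease_duration_months: float
    n_patients: int


def meets_prognostic_criteria(study: CandidateStudy) -> bool:
    """Apply the four stated criteria: design/data, RA definition, early RA, sample size."""
    return (
        study.prospective
        and study.english_language
        and study.followup_months >= 12          # duration of at least 12 months
        and study.reports_mri_radiographic_and_clinical_data
        and study.ra_by_acr1987_or_equivalent
        and study.disease_duration_months < 60   # broad definition of "early" RA
        and study.n_patients > 10
    )


# Hypothetical candidate, for illustration only.
candidate = CandidateStudy(True, True, 12, True, True, 8.0, 42)
eligible = meets_prognostic_criteria(candidate)
```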
To identify prognostic studies, we used the identical search strategy described above for diagnostic studies. Citation abstracts were searched by hand for studies meeting the above prognostic study inclusion criteria. We performed a secondary manual search of bibliographies, meeting proceedings, journals, and the ClinicalTrials.gov web site to further identify prognostic studies.
Although recommendations for the assessment of methodologic quality for prognostic studies do not exist, we sought to identify and account for potential sources of selection, measurement and, where relevant, intervention biases in the included studies. We combined recommendations for the assessment of diagnostic study methodologic quality listed above with relevant criteria for assessment of clinical trials and observational studies (11), including clear descriptions of methods, inclusion and exclusion criteria, blinding, handling of missing data and losses to followup, and use of OMERACT core sequences and/or validated MRI scoring (see Supplementary Appendix A, available in the online version of this article at http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2151-4658). Intervention biases are of particular concern in prognostic studies because administration of treatment may alter the disease course. Therefore, we also assessed whether uniform standardized treatment protocols were applied to all or a subset of the study population and/or whether analyses adjusted for baseline disease severity.
For the prognosis studies, the primary outcome of interest was the ability to predict radiographic outcomes (Sharp or Larsen scores) at 12 months. Secondary outcome measures included the same radiographic outcomes at ≥12 months as well as clinical status (measured by the ACR 20% criteria for improvement [ACR20] and/or its components) and functional ability and/or quality of life (measured by Health Assessment Questionnaire [HAQ] score, Short Form 36 [SF-36], or other validated measure) at ≥12 months.
All articles were abstracted in duplicate by two independent reviewers using a standard abstraction form. Data abstracted from prognosis studies included: 1) study features (controls, randomization, sample size, therapeutic intervention, handling of missing data); 2) baseline patient demographics and clinical and MRI data; and 3) plain radiographic and MRI outcome data as well as clinical and functional outcomes (including ACR20 response, Disease Activity Score, HAQ scores, or SF-36 scores). Discrepancies were resolved by consensus discussion between the reviewers.
The primary comparison across studies was the effect size of the correlation between baseline MRI findings and 12-month radiographic progression as measured by Sharp, modified Sharp, or Larsen scores. In addition, we examined the sensitivity, specificity, PPV, NPV, positive LR, and negative LR of MRI findings to predict radiographic and clinical outcomes at ≥12 months. This information was plotted as a summary ROC curve and the AUC was calculated. If appropriate, Cochran's Q statistic was used to assess homogeneity (P values greater than 0.1 were taken to indicate homogeneity). We planned to pool effect sizes for homogeneous data. Stratified analyses, according to the study and patient characteristics, were also performed. Correlations for secondary outcomes were similarly compared.
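The homogeneity check described above can be sketched as follows. This is a generic inverse-variance formulation of Cochran's Q and is an assumption on our part: the text does not specify the scale on which effect sizes were compared or the software used. The function names and example inputs are hypothetical, and the chi-square tail probability is computed with a simple series so that the sketch needs only the standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))


def cochran_q(effects, variances):
    """Return (Q, degrees of freedom, P value) for per-study effects and variances."""
    if len(effects) < 2 or len(effects) != len(variances):
        raise ValueError("need at least two studies with one variance each")
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    return q, df, chi2_sf(q, df)


def is_homogeneous(effects, variances, threshold=0.1):
    """Treat studies as homogeneous when the Q-test P value exceeds the threshold."""
    return cochran_q(effects, variances)[2] > threshold


# Hypothetical effect estimates and variances, for illustration only.
q, df, p = cochran_q([0.0, 1.0], [0.5, 0.5])
```

Under this sketch, pooling would proceed only when `is_homogeneous` returns `True`, mirroring the P > 0.1 rule stated in the text.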
Search results for all included diagnostic and prognostic studies are shown in Figure 1 and are described below. The most common reasons for exclusion were a non-RA population; no hand, wrist, or foot imaging; ≤10 patients; and not original research (i.e., reviews and comments).
Our search for diagnostic studies yielded 11 studies comprising 606 individual patients. The mean/median disease duration was ≤18 months for the 8 studies (21–28) that reported this information (range 0.5–180 months), and the mean followup was <31 months (range 4–73 months). The study populations ranged from 67–100% female participants and the mean/median ages ranged from 40–57.7 years (range 13–80 years). Four studies (21, 23, 27, 28) did not report the prevalence of baseline radiographic erosions in their cohorts and the remainder excluded individuals with plain radiographic erosions at baseline. Two studies (25, 26) used low-field MRI (0.2T versus 1.0–1.5T for the rest of the studies).
There was marked variation among the studies regarding the MRI classification criteria used to diagnose RA (Table 1). Three studies (22, 24, 29) used the OMERACT RAMRIS scoring system, but only 1 study (22) reported a specific cutoff for positivity, and the remaining studies used the OMERACT definitions for synovitis, erosions, and bone edema without reporting score cutoffs. Eight studies (22–24, 26–28, 30, 31) considered the presence of anti–cyclic citrullinated peptide (anti-CCP) or rheumatoid factor antibodies in their analysis; 2 studies (30, 31) examined only synovitis or contrast enhancement and 1 study (22) examined only erosion scores or the presence of erosions.
| Author, year (ref.) | N | Mean (range) followup, months | RF positive, % | Anti-CCP positive, % | Plain radiograph erosions, % | Quality criteria reported, % | MRI definition of RA | Sensitivity, % | Specificity, % | PPV, % | NPV, % | Positive LR | Negative LR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sugimoto et al, 1996 (30) | 27 | 2.7 (NR) | 37 | NR | 0 | 53 | Presence of MRI synovitis† | 100 | 72.7 | 84.2 | 94.1 | 3.56 | 0.04 |
| Sugimoto et al, 2000 (31)‡ | 48 | 25 (4–72) | 73 | NR | 0 | 69 | Presence of bilateral MRI synovitis (periarticular enhancement) | 96.2 | 86.4 | 89.3 | 95.0 | 7.05 | 0.05 |
| Klarlund et al, 2000 (21) | 13 | 12 | 49 | NR | 6 | 62.5 | Presence of MRI tenosynovitis | 60.0 | 62.5 | 50.0 | 71.4 | 40.00 | 0.80 |
| | | | | | | | Presence of MRI bone erosions | 20.0 | 100.0 | 100.0 | 66.7 | 0.60 | 0.64 |
| Boutry et al, 2005 (29) | 47 | 29 (4–73) | NR | NR | 0 | 88 | Presence of OMERACT MRI synovitis | 100.0 | 0.0 | 59.6 | 50.0 | 1.01 | 0.68 |
| | | | | | | | Presence of OMERACT MRI bone erosions§ | 60.7–100.0 | 15.8–52.6 | 63.6–65.4 | 47.6–85.7 | 1.17–1.28 | 0.75–0.11 |
| | | | | | | | Presence of OMERACT MRI bone edema | 39.3–71.4 | 84.2–94.7 | 78.6–95.2 | 48.5–69.2 | 2.49–13.57 | 0.30–0.72 |
| Solau-Gervais et al, 2006 (22) | 30 | 30.6 (12 to NR) | 30 | NR | 0 | 88 | OMERACT RAMRIS erosion score >15 | 70.0 | 64.0 | – | – | – | – |
| | | | | | | | Presence of MRI carpus or MCP joint erosions | 66.7–80.0 | 27.3–53.8 | 55.6–66.7 | 27.3–70.0 | 1.73 | 0.37 |
| Tamai et al, 2006 (23) | 113 | 12 (NR) | 39 | 24 | NR | 63 | Presence of ≥2 of anti-CCP or RF antibodies, MRI bone edema or MRI erosions, and symmetric MRI synovitis | 82.5 | 84.8 | 93.0 | 66.7 | 5.43 | 0.21 |
| Narvaez et al, 2008 (24) | 40 | 20 (12–42) | 0 | 23 | 0 | 87 | Presence of OMERACT MRI synovitis with MRI bone edema or MRI erosions | 100.0 | 77.8 | 93.9 | 93.3 | 4.43 | 0.02 |
| Duer et al, 2008 (25) | 41 | 24 (NR) | 34 | NR | 0 | 73 | Presence of MRI synovitis | 100.0 | 60.0 | 48.0 | 100.0 | 2.50 | 0.00 |
| | | | | | | | Presence of MRI erosions | 64.0 | 77.0 | 50.0 | 85.0 | 2.78 | 0.47 |
| | | | | | | | Presence of MRI synovitis or erosions | 100.0 | 50.0 | 42.0 | 100.0 | 2.00 | 0.00 |
| | | | | | | | Presence of MRI synovitis and erosions | 64.0 | 87.0 | 64.0 | 87.0 | 4.92 | 0.41 |
| Mori et al, 2008 (27) | 21 | 27.4 (13–40) | 59 | 24 | NR | 71 | Presence of MRI symmetric synovitis¶ | 100.0 | 75.0 | 62.5 | 100.0 | 2.00 | 0.18 |
| Eshed et al, 2009 (26) | 99 | Median 8 (6–41) | 35 | 26 | 0 | 69 | Presence of MRI flexor tenosynovitis | 60.3 | 73.2 | 76.1 | 56.6 | 2.25 | 0.54 |
| | | | | | | | Presence of MRI extensor tenosynovitis | 24.1 | 87.8 | 73.7 | 45.0 | 1.98 | 0.86 |
| | | | | | | | Presence of MRI MCP joint synovitis | 82.8 | 39.0 | 65.7 | 61.6 | 1.36 | 0.44 |
| | | | | | | | Presence of anti-CCP antibodies and MRI tenosynovitis | 78.9 | 73.0 | – | – | – | – |
| Tamai et al, 2009 (28)# | 129 | >12 (NR) | 43 | 36 | NR | 63 | Presence of MRI symmetric synovitis | 74.7 | 59.3 | 71.8 | 62.7 | 1.84 | 0.43 |
| | | | | | | | Presence of MRI bone edema | 41.3 | 90.7 | 86.1 | 52.7 | 4.44 | 0.65 |
| | | | | | | | Presence of MRI erosions | 29.3 | 90.7 | 81.5 | 48.0 | 3.15 | 0.78 |
| | | | | | | | Presence of anti-CCP antibodies and MRI bone edema | 50.7 | 100.0 | 100.0 | 59.3 | 31.97 | 0.71 |
Overall, the sensitivity and specificity of MRI findings varied broadly (range 20–100% for sensitivity and 0–100% for specificity), even for comparable MRI definitions of RA (e.g., symmetric synovitis). Increasingly restrictive diagnostic criteria (i.e., requiring the combined presence of multiple MRI and/or other clinical or laboratory findings) improved specificity at the expense of sensitivity. However, Sugimoto et al (30) found decreased specificity with a diagnostic algorithm where rheumatoid factor and joint count assessment preceded MRI. MRI was less informative in differentiating between inflammatory conditions. Boutry et al (29) compared findings among individuals eventually found to have RA, systemic lupus erythematosus, and Sjögren's syndrome, and found no significant differences in MRI findings, including OMERACT erosion scores. However, other data (22) showed that this score differed significantly between individuals eventually diagnosed with RA and those with all other diseases pooled.
There was considerable variability in methodologic quality. Four studies (22, 24, 25, 29) met the criteria for the minimum recommended MRI sequences, and blinding, handling of missing data, and losses to followup were adequately reported by 4 or fewer studies. Where reported, missing data were handled by exclusion. Given apparent heterogeneity of MRI diagnostic criteria and study designs, we chose to provide stratified data, rather than pool diverse studies. Each graph in Figure 2 shows the sensitivity and specificity of included studies plotted in ROC space with a regression line and the R² value provided to demonstrate fit. Figure 2A shows the results for all 11 included studies, some of which provided data for multiple MRI RA definitions (22, 26–29). Figure 2B shows data from studies receiving the highest quartile of quality assessment (i.e., those studies with 80% or greater scores of a possible 100% for quality) (22, 24, 29). Figure 2C provides data from studies (23, 26, 28) in the highest-size quartile (>86 participants), none of which were in the highest quality quartile. Figure 2D shows data for MRI erosions, while Figure 2E shows data for measures of MRI synovitis. There was an insufficient number of studies to allow subgroup analysis of those using OMERACT RAMRIS scoring or examining MRI bone edema or mixed arthritis populations. Figure 2F shows data from studies (21, 23, 24, 28) examining patients with <6 months of disease.
While limiting the analysis to only those studies in the highest quality quartile or earliest disease improved MRI performance (AUCs for all, highest quality quartile, and <6 months of disease duration studies were 0.77, 0.80, and 0.81, respectively), limiting analysis to studies in the highest size quartile or to specific MRI parameters appeared to decrease MRI performance (AUCs for the highest size quartile and MRI erosion studies were 0.70 and 0.61, respectively; AUC not calculated for MRI synovitis studies due to extreme heterogeneity of results). Only 1 study (29) examined the diagnostic capability of MRI bone marrow edema independent of other parameters.
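To make the notion of an AUC over study-level points concrete, the sketch below computes a trapezoidal area under a curve drawn through (1 − specificity, sensitivity) points. This is an assumption-laden illustration, not the method used for Figure 2: the text does not state how the regression line or the AUC was derived, and the function name and example point are hypothetical. The result is therefore not expected to reproduce the AUCs reported above.

```python
def trapezoid_auc(points):
    """Trapezoidal area under a piecewise-linear ROC curve.

    points: iterable of (sensitivity, specificity) pairs expressed as fractions.
    The curve is anchored at (0, 0) and (1, 1) in ROC space.
    """
    # Convert to (false-positive rate, sensitivity) and order along the x axis.
    roc = sorted((1.0 - spec, sens) for sens, spec in points)
    roc = [(0.0, 0.0)] + roc + [(1.0, 1.0)]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(roc, roc[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical single study with 80% sensitivity and 90% specificity.
single_point_auc = trapezoid_auc([(0.8, 0.9)])
```

A single study with perfect sensitivity and specificity gives an area of 1.0, and a study on the chance diagonal gives 0.5. With several heterogeneous studies the piecewise-linear curve need not be monotonic, which is one reason a fitted summary curve is generally preferred.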
Seventeen prognostic studies comprising 710 individual patients met our inclusion criteria: 7 randomized clinical trials (32–38) and 10 observational studies (9, 10, 39–46). An additional study (20) examined the prognostic capability of MRI in both early RA and undifferentiated arthritis patients, but did not provide sufficient data for the RA cohort to allow inclusion. The mean followup was 24.4 months and the mean/median disease duration was <12 months (range 0.4–20.6 months) for all but 3 studies (32, 39, 45) that reported mean/median disease durations ≤25 months (range 3–264 months). Women comprised 56–80% of study participants, with mean/median ages 38–60 years (range 20–83 years). Six studies (9, 32, 33, 36, 37, 42) reported the prevalence of baseline plain radiographic erosions, which ranged from 24–62% of patients, and Cohen et al (38) used the presence of baseline radiographic erosions as an inclusion criterion in order to study a high-risk population. One study (42) used low-field (0.2T) MRI machines for all examinations, and Hetland et al (37) used machines with a range of 0.2–1.5T for their study; the remainder of studies used 1.5T MRIs.
One study (39) reported the prognostic capability of MRI to predict clinical outcomes (remission as defined by ACR criteria), but found no significant association. Sensitivity and specificity of MRI findings to predict radiographic progression, defined as either new erosions or increased Sharp score, varied broadly (range 18–100% for sensitivity and 5.9–97% for specificity), even for comparable MRI findings, such as baseline MRI erosions (range 60–88.9% for sensitivity and 5.9–94% for specificity) (Table 2).
| Author, year (ref.) | Study design | N | Mean followup, months | RF positive, % | Anti-CCP positive, % | Plain radiograph erosions, % | Quality criteria reported, % | MRI prognosis findings | Sensitivity, % | Specificity, % | PPV, % | NPV, % | OR (95% CI) | Correlation (P) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lee et al, 1997 (39) | Open-label trial† | 10 | 14 | 90 | NR | NR | 38 | No significant association between MRI synovial proliferation, bone edema or erosions, and ACR-defined remission | – | – | – | – | – | – |
| Ostergaard et al, 1999 (32) | Open RCT‡ | 26 | 12 | 58 | NR | 46 | 50 | MRI erosions at baseline predicted radiographic erosions at followup | 88.9 | 5.9 | 33.3 | 50.0 | – | – |
| | | | | | | | | MRI synovial membrane hypertrophy score significantly correlated with progression of radiographic erosions | – | – | – | – | – | 0.42 (< 0.05) |
| | | | | | | | | MRI synovial membrane hypertrophy score of ≥5 cm³ predicted progression of radiographic erosions | 100.0 | 29.4 | 42.9 | 100.0 | – | – |
| | | | | | | | | MRI synovial membrane hypertrophy score of ≥10 cm³ predicted progression of radiographic erosions | 44.4 | 64.7 | 40.0 | 68.8 | – | – |
| | | | | | | | | MRI AUC for synovial membrane hypertrophy of ≥4 predicted progression of radiographic erosions | 100.0 | 41.2 | 52.6 | 100.0 | – | – |
| | | | | | | | | MRI AUC for synovial membrane hypertrophy of ≥8 predicted progression of radiographic erosions | 80.0 | 93.8 | 88.9 | 88.2 | – | – |
| McQueen et al, 1999 (40) | Observational | 42 | 12 | 90 | NR | 36 | 50 | MRI erosions predicted radiographic erosions | 83.3 | 70.0 | 52.6 | 91.3 | 11.6 (NR) | – |
| | | | | | | | | MRI erosion score of ≥6 predicted radiographic erosions | 93.3 | 81.8 | 93.0 | – | 63 (7.7–513) | – |
| | | | | | | | | MRI erosion score of ≥13 predicted radiographic erosions | 82 | 73 | 53 | 92 | – | – |
| | | | | | | | | MRI bone edema predicted radiographic erosions | – | – | – | – | 6.47 (3.2–13.1) | – |
| | | | | | | | | MRI synovitis predicted radiographic erosions | – | – | – | – | 2.14 (1.3–3.7) | – |
| McQueen et al, 2001 (41)§ | Observational | 42 | 24 | 90 | NR | 36 | 50 | MRI erosion score of ≥13 predicted radiographic erosions at followup | 80 | 76 | 67 | 86 | 13.4 (2.65–60.5) | – |
| Conaghan et al, 2003 (33) | RCT¶ | 42 | 12 | 60 | NR | 45 | 75 | No significant prognostic findings reported# | – | – | – | – | – | – |
| McQueen et al, 2003 (9)** | Observational | 42 | 72 | 90 | NR | 36 | 50 | MRI bone edema significantly associated with 6-year radiographic progression | 18.0 | 96.8 | – | – | – | – |
| Quinn et al, 2005 (34) | RCT†† | 20 | 24 | 65 | NR | NR | 81 | No significant prognostic findings reported‡‡ | – | – | – | – | – | – |
| Lindegaard et al, 2006 (42) | Observational§§ | 24 | 12 | 95 | NR | 24 | 56 | OMERACT MRI baseline erosions or bone edema predicted radiographic progression at followup | 80.0 | 57.9 | 33.3 | 79.2 | – | – |
| | | | | | | | | OMERACT MRI baseline erosion score significantly correlated with radiographic progression at followup | – | – | – | – | – | 0.69 (< 0.001) |
| Jarrett et al, 2006 (35) | RCT¶¶ | 39 | 6 | 77 | NR | NR | 75 | No significant prognostic findings reported | – | – | – | – | – | – |
| Durez et al, 2007 (36) | RCT## | 44 | 12 | 77 | 70 | 27 | 75 | No significant prognostic findings reported | – | – | – | – | – | – |
| Hetland et al, 2009 (37) | RCT*** | 130 | 24 | 67 | 61 | 62 | 100 | OMERACT baseline MRI bone edema the only significant predictor of radiographic progression at followup††† | – | – | – | – | – | 0.50–0.64 (< 0.001) |
| Cohen et al, 2008 (38)‡‡‡ | RCT | 218 | 12 | 78 | NR | NR | 81 | No significant prognostic findings reported | – | – | – | – | – | – |
| Haavardsholm et al, 2008 (43) | Observational | 84 | 12 | 44 | 55 | NR | 69 | OMERACT baseline MRI bone edema score of >2 predicted radiographic progression at followup§§§ | – | – | – | – | 2.77 (1.06–7.21) | – |
| Boyesen et al, 2008 (10)¶¶¶ | Observational | 89 | 36 | 42 | 63 | NR | 69 | AUC of OMERACT baseline MRI synovitis score predicted radiographic progression at followup### | – | – | – | – | – | 0.53 (0.004) |
| Syversen et al, 2008 (44) | Observational | 82 | 12 | 44 | 55 | NR | 71 | MRI bone edema significantly associated with 1-year radiographic progression**** | – | – | – | – | – | – |
| Mundwiler et al, 2009 (45) | Observational | 46 | 24 | 90 | 76 | NR | 50 | OMERACT baseline MRI erosions predicted radiographic progression at 12-month followup | 60 | 94 | 10 | 99.5 | 32.2 (3.7–144.6) | – |
| | | | | | | | | OMERACT baseline MRI erosions predicted radiographic progression at 24-month followup | 75 | 94 | 17 | 99.5 | 43.6 (4.27–445) | – |
| | | | | | | | | OMERACT baseline MRI bone edema predicted radiographic progression at 12-month followup | 67 | 97 | 50 | 99 | 68.0 (13.6–338.9) | – |
| Hammer et al, 2009 (46)†††† | Observational | 58 | 12 | >25% | >50% | NR | 69 | No significant prognostic findings reported | – | – | – | – | – | – |
There was marked variation in methodologic quality among studies, with the percentage of adequately addressed quality criteria ranging from 38–100% (Table 2). Nine studies used uniform treatment administration over the course of the study, either as randomized clinical trials (n = 7) (32–38) or with standardized treatment protocols (n = 2) (39, 42); however, only 2 studies (32, 37) found that MRI findings (synovitis and bone edema) were significantly associated with subsequent radiographic progression, and only 1 study (32) reported sufficient information to calculate sensitivity and specificity of MRI to predict radiographic erosions at 1 year. Hetland et al (37) found that MRI bone edema was the only statistically significant predictor of radiographic progression by Sharp score at 2 years in a multivariable model adjusting for age, sex, smoking status, HLA status, baseline disease activity, presence of anti-CCP antibodies, and baseline MRI erosion and synovitis scores. This model predicted 25% and 41% of variance among 130 subjects with wrist MRIs and 84 subjects with both wrist and metacarpophalangeal joint MRIs, respectively.
Other methodologic quality criteria varied across the included studies. While the included randomized clinical trials provided the highest-quality assessments, they rarely provided adequate data regarding the prognostic capability of MRI findings within uniform treatment groups to predict radiographic or clinical outcomes. One randomized clinical trial (33) that attempted to report data on the ability of MRI to predict radiographic outcomes was limited by the absence of plain radiographic progression in their cohort. In addition, most clinical trial studies found either clinical stability or uniform improvement over time, and therefore were not able to assess the prognostic capability of MRI to predict clinical progression.
Similar to diagnostic studies, given the apparent heterogeneity in MRI prognostic criteria, we chose to stratify analyses rather than pool diverse studies. Figure 3A shows the sensitivity and specificity from all applicable studies (9, 32, 40–42, 45) plotted in ROC space with a regression line and the R² value provided to demonstrate fit (AUC 0.83). No study in the highest size or quality (i.e., quality scores of 75% or greater) quartile reported sensitivity and specificity data. Figures 3B and C show data examining MRI erosions and measures of synovitis, respectively. Figure 3D shows data (9, 10, 37, 40–44, 47) on patients with a disease duration of <6 months. There were insufficient data provided to pool odds ratios for studies that did not provide sensitivity and specificity. Limiting the analysis to studies examining only the presence of baseline MRI erosions or patients with a disease duration of <6 months slightly improved overall MRI performance (AUCs for MRI erosions and <6 months of disease duration were 0.84 and 0.86, respectively).
We performed a systematic review of published studies assessing the diagnostic and prognostic capability of MRI findings in undifferentiated inflammatory arthritis and early RA, respectively. To our knowledge, this is the first such systematic review. An exhaustive literature search found few published studies supporting the use of MRI for either of these roles. We found 11 studies addressing RA diagnosis and 17 evaluating prognosis; however, small study size, variability in methodologic quality, and lack of uniform treatment limited our ability to make robust statements about the utility of MRI in clinical practice.
The sensitivity and specificity of early MRI findings for RA diagnosis ranged from 20–100% and 0–100%, respectively, depending on the MRI criteria used. Among diagnostic studies, excluding lower-quality studies or studies of patients with longer symptom duration improved performance, while excluding small studies or examining individual MRI parameters, such as MRI erosions or synovitis, decreased MRI performance. No diagnostic study met 100% of our a priori methodologic quality criteria. Among prognostic studies, the ability of MRI to predict progressive radiographic damage varied widely (range 18–100% for sensitivity and 5.9–97% for specificity). Only one high-quality study examined the prognostic capability of MRI.
While data examining the utility of MRI in RA diagnosis and prognostication exist, there is no consensus on definitive MRI criteria for RA diagnosis. In addition, among prognostic studies, the only study to achieve a perfect rating for methodologic quality (37) found that MRI bone edema was the only significant predictor of radiographic progression. This study examined 130 early RA patients with a disease duration of less than 6 months receiving standardized treatment, making it a compelling statement in favor of the capability of MRI bone edema to predict radiographic erosions. However, despite the short disease duration of the study population, 62% of participants had baseline plain radiographic erosions. Because data suggest the prevalence of erosions in early RA ranges from 1% to 34% (33, 37, 40, 42, 46–48), this study may not be broadly generalizable. The utility of MRI to predict radiographic progression among individuals with no baseline radiographic erosions or to predict clinical outcomes such as remission remains undefined.
There are limitations to this analysis. We did not include unpublished data in this review. Patient-level meta-analysis of randomized clinical trial results where baseline MRI data were collected might improve our understanding of the role of MRI in predicting both response to therapy and the likelihood of clinical as well as radiographic progression. In addition, due to the rapidly expanding nature of this field, this analysis provides a temporary assessment of currently available data and will hopefully soon be superseded by definitive studies of the optimal role for MRI in the management of early RA. However, due to the uptake of MRI into clinical practice without adequate evidence to guide clinical decision making, reviews such as this are important additions to the literature.
The use of MRI in musculoskeletal diseases is expanding rapidly. According to the Medicare Payment Advisory Commission, a nearly 50% increase in spending on imaging occurred between 2000 and 2003, most of which was due to increases in computed tomography and MRI spending (49). Although national estimates of the utilization of MRI in rheumatoid arthritis are not available, office-based extremity MRI units are being directly marketed to rheumatologists for use in the management of early RA (50). Our findings suggest that, while some data support the use of MRI for both the diagnosis and risk stratification of early RA, results are discordant regarding which MRI findings are the most accurate at diagnosing RA and/or the most predictive of subsequent joint damage. Available data are limited by inconsistent MRI scoring systems, small sample and effect sizes, short followup, and lack of adjustment for disease severity and treatment.
Our findings suggest several approaches to improving the quality of literature in this field. Use of validated scoring systems, such as the OMERACT RAMRIS, and a uniform approach to combining radiographic and clinical information will significantly improve our understanding of the diagnostic role of MRI in undifferentiated inflammatory arthritis. In addition, future studies examining the utility of MRI in RA diagnosis should be adequately powered. Larger studies with multiyear followup and adjustment for disease severity and treatment, as are now underway in the form of early RA randomized clinical trials, may provide valuable insights into the incremental prognostic capability of MRI over currently available prognostic markers. Data evaluating MRI for the diagnosis and prognosis of early RA are currently inadequate to justify widespread use of this technology for these purposes, although MRI bone edema may be predictive among patients with early, severe RA.
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Suter had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Suter, Fraenkel, Braithwaite.
Acquisition of data. Suter.
Analysis and interpretation of data. Suter, Fraenkel, Braithwaite.