Dr. Maksymowych is a Senior Scholar of the Alberta Heritage Foundation for Medical Research.
Validation of the Spondyloarthritis Research Consortium of Canada magnetic resonance imaging spinal inflammation index: Is it necessary to score the entire spine?
Article first published online: 29 MAR 2007
Copyright © 2007 by the American College of Rheumatology
Arthritis Care & Research
Volume 57, Issue 3, pages 501–507, 15 April 2007
How to Cite
Maksymowych, W. P., Dhillon, S. S., Park, R., Salonen, D., Inman, R. D. and Lambert, R. G. W. (2007), Validation of the Spondyloarthritis Research Consortium of Canada magnetic resonance imaging spinal inflammation index: Is it necessary to score the entire spine? Arthritis & Rheumatism, 57: 501–507. doi: 10.1002/art.22627
- Issue published online: 29 MAR 2007
- Manuscript Accepted: 23 JUN 2006
- Manuscript Received: 8 FEB 2006
- Magnetic resonance imaging
- Ankylosing spondylitis
- SPARCC method
The Spondyloarthritis Research Consortium of Canada (SPARCC) magnetic resonance imaging (MRI) spinal inflammation index has been developed to objectively measure inflammation in ankylosing spondylitis (AS) and to assess change in response to therapeutic intervention. Scoring of the entire spine limits feasibility and a scoring method that records inflammation in only the more severely affected spinal segments may improve feasibility without sacrificing performance.
MRI films of 68 patients with AS were assessed in random order by 2 blinded readers. Interreader reliability was assessed by intraclass correlation coefficient. Pre- and posttreatment MRI films of 29 patients randomized to placebo or anti–tumor necrosis factor α (anti-TNFα) therapy were read by readers blinded to chronology, and responsiveness was assessed by effect size and standardized response mean. The performance of scores based on 6, 8, 10, and all 23 spinal discovertebral units (DVU) was compared.
The median number of affected spinal levels per patient was 6.0, and 62% of all affected levels were included when analysis was limited to only the 6 most severely affected levels per patient. Comparison of DVU scores that were limited to only the more severely affected DVU (6-, 8-, 10-DVU score) with scores for all 23 spinal DVU showed excellent interreader reliability for status and change scores (intraclass correlation coefficient >0.90) as well as similar construct validity. Responsiveness to anti-TNFα therapy was greater when the more limited scoring methods were used and was greatest with the 6-DVU score.
The SPARCC MRI spinal inflammation index performs better when analysis is limited to a maximum of 6 most severely affected levels compared with assessment of the entire spine. This should improve its feasibility in clinical trials and research.
Magnetic resonance imaging (MRI) is the most sensitive imaging modality for detection of inflammatory lesions in the spine and sacroiliac joints of patients with ankylosing spondylitis (AS) (1). This has been made possible through the use of MRI sequences, such as STIR, that suppress the signal from marrow fat. Elimination of fat signal on T2-weighted sequences promotes the visualization of abnormal increased water content due to the underlying bone marrow edema that is associated with inflammation. Typical appearances in the spine include increased T2 signal at the anterior corners of the vertebrae, reflecting inflammation at the attachment of the annulus fibrosus to the vertebral corner, and increased signal in the subchondral bone adjacent to the vertebral end plate (2). Furthermore, it has been shown that these lesions resolve following the institution of anti–tumor necrosis factor α (anti-TNFα) therapies and it has therefore been suggested that MRI can be used to assess the efficacy of treatment, particularly because clinical outcome measures are largely based on patient self-reported questionnaires (3). Accordingly, scoring systems have been developed to facilitate the evaluation of inflammatory lesions observed on MRI (3, 4). However, the optimal approach to the scoring of MRI lesions currently lacks consensus and is presently the subject of further evaluation by investigators using the Outcome Measures in Rheumatology Clinical Trials (OMERACT) approach to the validation of outcome instruments in musculoskeletal disorders (5). In particular, OMERACT has proposed that newly developed instruments meet the criteria of feasibility, truth, and discrimination. The latter is a function of both reproducibility and responsiveness to change.
Two methods have been reported for scoring inflammatory lesions in the spine (3, 4). Both rely on the assessment of the signal on fat-suppressed images (STIR, T2-weighted fat saturation) in the anterior segment of the spine (vertebral body) and do not score lesions in the posterior elements of the spine. Both methods also use the discovertebral unit (DVU) as the primary anatomic region for scoring inflammation. The DVU is defined as the region between 2 imaginary lines drawn through the middle of adjacent vertebrae and including adjacent vertebral end plates with the intervening disc. The Spondyloarthritis Research Consortium of Canada (SPARCC) MRI spinal inflammation index takes advantage of the ability of MRI to visualize lesions in several dimensions (4). The developers of this method have proposed that scoring be limited to a maximum of 6 of the most severely affected levels on the basis that the mean number of affected DVU per patient in a prior study was 3.2 (95% confidence interval 1.2–5.2).
Limiting the assessment to only the most severely affected levels improves feasibility in that the time necessary for evaluation is less than for the entire spine. Although this approach may introduce measurement error due to readers differing in their selection of levels for scoring, the alternative, assessment of the entire spine, is subject to significant problems. Being forced to score the entire spine results in the inclusion of less discernible lesions, which may reduce sensitivity to change, and forces the reader to score levels that are affected by signal artifact. This is not a trivial issue as some degree of phase-encoding artifact occurs in almost every case when scanning the entire spine with large fields of view. Consequently, it is not clear how many levels should be assessed to maximize sensitivity to change without compromising interobserver reproducibility. In this study we compared the performance of the SPARCC scoring method according to the OMERACT filter for all 23 spinal levels with a scoring scheme that is limited to only the most severely affected DVU. Our objective was to determine how many levels should be analyzed for optimal feasibility and discrimination.
PATIENTS AND METHODS
Patients and study protocol.
We studied 2 cohorts of patients with AS as defined by the modified New York criteria (6). Cohort A was a cross-sectional cohort of 39 patients with AS (29 men, mean age 42.3 years [range 22–68 years], mean disease duration 13.4 years [range 2–41 years], mean Bath Ankylosing Spondylitis Disease Activity Index [BASDAI] score of 5.5 [range 3.0–8.6]) who attended the outpatient clinic in the Rheumatic Disease Unit at the University of Alberta. All patients had been recruited to a prospective, longitudinal observational cohort, the Follow up Research Cohort of AS study (FORCAST), in which clinical and laboratory data are systematically collected every 6 months and plain radiographic imaging and MRI are obtained annually. Most patients (83%) receive nonsteroidal antiinflammatory drugs and/or physical therapy.
Cohort B comprised 29 patients who had severe, active disease as defined by a BASDAI score ≥4 and who had been randomized to receive either an anti-TNFα agent or placebo in a 24-week double-blind placebo-controlled trial of either adalimumab (n = 11; 1:1 randomization, adalimumab administered in a dose of 40 mg subcutaneously on alternate weeks) or infliximab (n = 18; 3:8 randomization of placebo:infliximab, infliximab administered in a dose of 5 mg/kg at 0, 2, and 6 weeks and every 6 weeks thereafter). Nineteen patients in cohort B were recruited at the University of Alberta and comprised 14 men and 5 women (mean age 43.4 years [range 33–65 years], mean disease duration 18.7 years [range 9–42 years]). Nine patients in cohort B were recruited at the University of Toronto and comprised 8 men and 1 woman (mean age 40.2 years, mean disease duration 16.1 years). The mean BASDAI score for the entire group of 29 patients was 6.1. Pre- and posttreatment MRI films from the 18 patients who were recruited to the infliximab trial had been scored 18 months prior to the current exercise by 1 (SSD) of the 2 readers (4).
Cohort A underwent MRI at a single time point whereas cohort B underwent MRI at baseline and either 12 weeks (adalimumab trial) or 24 weeks (infliximab trial) after randomization. We also included 6 controls with nonspecific back pain who underwent MRI at a single time point. The study was approved by the ethics committees of the University of Alberta and the University Health Network (Toronto).
Magnetic resonance imaging.
MRI of the spine was performed with 1.5T Siemens (Munich, Germany) or GE systems (Waukesha, WI) using appropriate surface coils. Sagittal sequences were obtained with 3–4-mm slice thickness and 11–15 slices acquired. Sequence parameters were as follows: T1-weighted spin echo (repetition time [TR] 517–618 msec, echo time [TE] 13 msec) and STIR (TR 2,720–3,170 msec, inversion time 140 msec, TE 38–61 msec). The spine was imaged in 2 parts: upper half comprising the entire cervical and most of the thoracic spine, lower half comprising the lower portion of the thoracic spine and entire lumbar spine. The specific MRI parameters for acquiring spine images are provided on our Web site (available at: www.arthritisdoctor.ca).
Scoring of MRI lesions.
Scoring of MRI lesions has been described previously (4). Briefly, our scoring method for active inflammatory lesions in the spine relies on the use of the STIR sequence that suppresses the normal marrow fat signal, the presence of which frequently obscures signal emanating from bone marrow edema associated with inflammation. T1-weighted spin-echo images were included for anatomic reference only and were not scored. For each DVU, 3 consecutive sagittal slices were scored, which allowed evaluation of the coronal extent of lesions as well as assessment in the sagittal and anteroposterior planes. Discal lesions were not scored.
Definition of abnormal STIR signal.
Bone marrow signal in the center of the vertebra or an adjacent normal vertebra constituted the reference for designation of normal signal. A set of reference AS cases were included to facilitate designation of abnormal signal on STIR.
Scoring of depth and intensity.
Signal from cerebrospinal fluid constituted the reference for designating an inflammatory lesion as intense. A lesion was graded as deep if there was a homogeneous and unequivocal increase in signal over at least 1 cm from the vertebral end plate. Assessment of depth was made possible by including a scale on the image.
Each DVU was divided into 4 quadrants: upper anterior, upper posterior, lower anterior, and lower posterior. The presence of increased T2 signal in each of these 4 quadrants was scored on a dichotomous basis (1 = increased signal, 0 = normal signal). This was repeated for each of 3 consecutive sagittal slices giving a maximum score of 12 per DVU. On each slice, the presence of a lesion exhibiting intense signal in any quadrant was given an additional score of 1. Similarly, the presence of a lesion exhibiting a depth ≥1 cm in any quadrant was given an additional score of 1, resulting in a maximum additional score of 6 for that level and bringing the total score to 18 per DVU.
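The per-DVU arithmetic described above can be sketched as follows. This is a minimal illustration rather than the authors' software: the function names and the input layout (one tuple per sagittal slice) are assumptions, as is the choice to ignore the intensity and depth points on a slice with no positive quadrant.

```python
from typing import Sequence, Tuple

QUADRANTS_PER_SLICE = 4   # upper/lower x anterior/posterior
SLICES_PER_DVU = 3        # 3 consecutive sagittal slices

# One slice = (four dichotomous quadrant scores, intense lesion?, lesion depth >= 1 cm?)
Slice = Tuple[Sequence[int], bool, bool]


def score_slice(quadrants: Sequence[int], intense: bool, deep: bool) -> int:
    """Score one sagittal slice of one DVU (range 0-6)."""
    if len(quadrants) != QUADRANTS_PER_SLICE or any(q not in (0, 1) for q in quadrants):
        raise ValueError("expected four quadrant scores, each 0 or 1")
    base = sum(quadrants)
    if base == 0:
        # Assumption of this sketch: no lesion on the slice, so no extra points.
        return 0
    return base + int(intense) + int(deep)


def score_dvu(slices: Sequence[Slice]) -> int:
    """Sum the 3 slice scores for one DVU (range 0-18)."""
    if len(slices) != SLICES_PER_DVU:
        raise ValueError("expected 3 consecutive sagittal slices")
    return sum(score_slice(q, intense, deep) for q, intense, deep in slices)


# Hypothetical DVU: a lesion on two of the three slices, intense on one.
example = [
    ([1, 0, 0, 0], False, False),
    ([1, 1, 0, 0], True, False),
    ([0, 0, 0, 0], False, False),
]
print(score_dvu(example))  # 1 + 3 + 0 = 4
```

With all 4 quadrants positive, intense, and deep on every slice, the function returns the stated maximum of 18 per DVU.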
MRI reading exercises.
A unique MRI study number was allocated to each patient and control, thereby ensuring blinding to all patient demographics. Allocation was done by a technologist unconnected with the study. Assessment was performed on a 3-monitor review station by 2 readers using computer software that has been optimized for this type of review (Merge efilm, Milwaukee, WI). Each patient was only identified by the MRI study number and films were read in random order. Pre- and posttreatment images were scored concurrently with the reader blinded to time sequence. Scoring was done from C2 to L5 in all cases. The 3-monitor review station readily permits simultaneous visualization of all segments of the spine on pre- and posttreatment images. Readers, trained in use of the SPARCC system, were instructed to identify the 6, 8, and 10 worst levels based on the scans from both time points; no other instructions were provided as to how the reader should select the most severely affected DVU. The same DVU were scored on pre- and posttreatment images in the assessment of the 6, 8, and 10 most severely affected DVU.
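To make the limited scores concrete, the sketch below derives a 6-DVU score from two lists of 23 per-DVU scores. In the study the readers chose the worst levels by their own judgment, so the ranking rule used here (highest combined pre- plus posttreatment score, ties broken by position) is purely a hypothetical stand-in, as are the function names and the example data. What the sketch does preserve is the requirement that the same DVU are scored at both time points.

```python
from typing import List, Sequence, Tuple


def select_worst_dvu(pre: Sequence[int], post: Sequence[int], n: int) -> List[int]:
    """Return the indices of the n DVU with the highest combined score.

    Hypothetical rule: rank by pre + post score, break ties by spinal position.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must cover the same DVU")
    combined = [a + b for a, b in zip(pre, post)]
    ranked = sorted(range(len(combined)), key=lambda i: (-combined[i], i))
    return sorted(ranked[:n])


def limited_scores(pre: Sequence[int], post: Sequence[int], n: int) -> Tuple[int, int]:
    """Sum pre- and posttreatment scores over the same n selected DVU."""
    selected = select_worst_dvu(pre, post, n)
    return sum(pre[i] for i in selected), sum(post[i] for i in selected)


# Hypothetical per-DVU scores for one patient (23 DVU, index 0 = most cranial).
pre = [0, 0, 3, 0, 0, 0, 9, 12, 0, 5, 0, 0, 7, 0, 0, 2, 0, 0, 0, 4, 0, 0, 1]
post = [0, 0, 1, 0, 0, 0, 4, 6, 0, 5, 0, 0, 2, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0]

print(select_worst_dvu(pre, post, 6))  # [2, 6, 7, 9, 12, 19]
print(limited_scores(pre, post, 6))    # (40, 22)
print(sum(pre), sum(post))             # 23-DVU totals: 43 22
```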
Statistical analysis.
Descriptive statistics (mean, median, interquartile range, standard deviation, maximum and minimum values) were used to describe the overall distribution of scores. Distribution of affected levels and DVU scores for the entire spine and according to spinal segment was based on the mean scores of the 2 readers. The interobserver reproducibility of status and change scores was calculated using analysis of variance to provide an intraclass correlation coefficient (ICC). A two-way mixed effects model with observer as a fixed factor was used. Values >0.6 represented good reproducibility, >0.8 represented very good reproducibility, and >0.9 represented excellent reproducibility. Reproducibility was also examined using Bland-Altman plots and 95% limits of agreement. Construct validity was assessed by comparing changes in the index score with changes in disease activity as quantified by the BASDAI (7), nocturnal back pain, and C-reactive protein (CRP) levels. This was done using Spearman's correlation coefficient analysis. Two statistical methods were used to assess responsiveness: the effect size and the standardized response mean. Values of 0.20, 0.50, and ≥0.80 were considered to represent small, moderate, and large degrees of responsiveness, respectively. Discrimination was not assessed because the open-label phase of the clinical trial is still ongoing and treatment codes remain unbroken at this time.
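The two reproducibility statistics named above can be sketched as follows. The text specifies a two-way mixed effects model with observer as a fixed factor but not whether consistency or absolute agreement was used, nor whether single or average measures were reported; this sketch assumes the single-measure consistency form, usually written ICC(3,1). The function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def icc_3_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(3,1): two-way mixed effects, consistency, single measure.

    ratings holds one row per patient and one column per reader.
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 patients and 2 readers, no missing scores")
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between readers
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def bland_altman_limits(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (lower limit, bias, upper limit) using bias +/- 1.96 SD of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias - 1.96 * sd, bias, bias + 1.96 * sd


# Hypothetical 23-DVU status scores from 2 readers for 5 patients.
reader_1 = [12, 40, 3, 25, 60]
reader_2 = [10, 44, 5, 22, 63]
print(round(icc_3_1(list(zip(reader_1, reader_2))), 3))
print(bland_altman_limits(reader_1, reader_2))
```

Because the consistency form ignores a constant offset between readers, two readers who differ by a fixed amount on every patient still yield an ICC of 1; the Bland-Altman bias is what exposes such an offset.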
RESULTS
The mean number of affected levels for the entire spine was 6.9 (median 6.0) and the majority of affected levels were in the thoracic spine (mean 4.2, median 4.0) (Table 1). The highest DVU scores were also recorded in the thoracic spine and 65.2% of all affected levels were located in this region. Only 15.0% and 19.8% of affected levels were located in the cervical and lumbar spines, respectively. The percentages of patients that were assessed by both observers as having no affected level in the cervical, thoracic, and lumbar spines were 33.8% (23 of 68), 13.2% (9 of 68), and 26.5% (18 of 68), respectively. Percentages of patients assessed as having no affected level by at least 1 observer in the cervical, thoracic, and lumbar spines were 57.4% (39 of 68), 29.4% (20 of 68), and 50% (34 of 68), respectively.
| Parameter | Mean ± SD | Median (IQR) | Range |
|---|---|---|---|
| No. of affected DVU | | | |
| Total | 6.9 ± 5.5 | 6.0 (2–11) | 0–19 |
| Cervical spine | 1.1 ± 1.3 | 1.0 (0–2) | 0–5 |
| Thoracic spine | 4.2 ± 3.7 | 4.0 (1–7) | 0–12 |
| Lumbar spine | 1.5 ± 1.6 | 1.0 (0–3) | 0–5 |
| DVU score | | | |
| Total (23 DVU) | 29.3 ± 32.5 | 20 (3–43) | 0–175 |
| Cervical spine | 4.4 ± 7.7 | 1 (0–6) | 0–57 |
| Thoracic spine | 19.1 ± 21.9 | 11 (1–32) | 0–94 |
| Lumbar spine | 5.8 ± 8.4 | 2 (0–8) | 0–41 |
| 10-DVU score | 26.6 ± 27.9 | 19 (3–41) | 0–135 |
| 8-DVU score | 24.6 ± 24.8 | 18 (3–37) | 0–112 |
| 6-DVU score | 21.6 ± 20.4 | 18 (3–32) | 0–86 |
Median scores for the 6-, 8-, 10-, and 23-DVU scores were similar. Approximately half of the patients (51.5%) had ≤6 affected levels and the percentages of patients that were assessed as having more than 6, 8, and 10 affected levels were 48.5% (33 of 68), 41.2% (28 of 68), and 26.5% (18 of 68), respectively. Of the 473 affected levels, 292 (61.7%) levels were scored when analysis was limited to only the 6 most severely affected DVU per patient. When scoring was limited to only the 8 and 10 most severely affected levels, the number of analyzed DVU increased to 352 (74.4%) and 409 (86.5%), respectively (Figure 1). The sum total DVU score for all 68 patients with AS was 1,992. Analysis that was limited to only 6 of the most severely affected levels captured 73.7% of the total DVU score, whereas analysis that was limited to 8 and 10 levels captured 84% and 90.8% of the total DVU score, respectively. Mean scores for controls were 4.4, 4.5, 4.5, and 4.5 for the 6-DVU, 8-DVU, 10-DVU, and 23-DVU scores, respectively (data not shown).
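The coverage percentages in this paragraph follow directly from the reported counts and from the mean scores in Table 1. The short check below reproduces them; because every score is averaged over the same 68 patients, the share of the total DVU score captured equals the ratio of the mean scores. The variable names are illustrative only.

```python
# Affected levels captured when scoring is limited to the n worst DVU per patient.
TOTAL_AFFECTED_LEVELS = 473
levels_captured = {6: 292, 8: 352, 10: 409}

# Mean DVU scores from Table 1 (68 patients).
MEAN_23_DVU_SCORE = 29.3
mean_limited_score = {6: 21.6, 8: 24.6, 10: 26.6}

pct_levels = {n: round(100 * c / TOTAL_AFFECTED_LEVELS, 1) for n, c in levels_captured.items()}
pct_score = {n: round(100 * m / MEAN_23_DVU_SCORE, 1) for n, m in mean_limited_score.items()}

print(pct_levels)  # {6: 61.7, 8: 74.4, 10: 86.5}
print(pct_score)   # {6: 73.7, 8: 84.0, 10: 90.8}
```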
The mean percentage agreement for selection of the 6 most severely affected DVU was 67.6% (range 33.4–100%). Interobserver reliability for status scores was good to very good for detection of affected levels and excellent for scoring of affected levels in the thoracic and lumbar spines (Table 2). Reliability was only moderate for scoring affected levels in the cervical spine. Reliability of both status and change scores was excellent regardless of whether all or only a limited number of levels were analyzed. Bland-Altman plots showed that measurement differences between the 2 observers were evident across the entire range of scores (Figure 2). This was similarly noted for the 6-, 8-, and 10-DVU scores (data not shown).
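The text does not state how percentage agreement for level selection was computed. One plausible reading, assumed here, is the share of the selected DVU that both readers chose, which for 6 selected levels gives values in steps of one-sixth. The function name and the example selections are hypothetical.

```python
from typing import Iterable


def selection_agreement(reader_a: Iterable[int], reader_b: Iterable[int]) -> float:
    """Percentage of selected DVU common to both readers (assumed definition)."""
    a, b = set(reader_a), set(reader_b)
    if not a or len(a) != len(b):
        raise ValueError("both readers must select the same nonzero number of DVU")
    return 100.0 * len(a & b) / len(a)


# Hypothetical selections of the 6 worst DVU (indices 0-22) for one patient.
print(round(selection_agreement([2, 6, 7, 9, 12, 19], [2, 6, 7, 9, 14, 20]), 1))  # 66.7
```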
| Parameter | Interobserver ICC, status (n = 68) | Interobserver ICC, change (n = 29) |
|---|---|---|
| Total (23 DVU) | 0.93 | 0.91 |
Significant and similar correlations were noted between changes in 6-, 8-, 10-, and 23-DVU scores and changes in CRP level in the 29 patients who received anti-TNF therapies (Table 3). No significant correlations were observed between changes in either nocturnal pain or BASDAI score and any DVU score.
| | Δ23 DVU | Δ10 DVU | Δ8 DVU | Δ6 DVU |
|---|---|---|---|---|
| Δ Nocturnal pain | 0.26 | 0.26 | 0.27 | 0.26 |
| Δ CRP level | 0.68† | 0.66† | 0.66† | 0.65† |
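Each coefficient in the table is a Spearman correlation between the change in a DVU score and the change in a clinical measure across patients. As a minimal sketch of that calculation, the code below ranks both series (averaging tied ranks) and takes the Pearson correlation of the ranks. The patient-level changes are hypothetical, since individual data are not reported, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical changes for 6 patients: 6-DVU score and CRP level.
delta_dvu = [-20, -5, -12, 0, -30, -8]
delta_crp = [-15, -2, -10, 1, -12, -9]
print(round(spearman_rho(delta_dvu, delta_crp), 2))
```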
Analysis of changes in response to anti-TNF therapy demonstrated that this was most readily apparent in the thoracic spine (Table 4). Responsiveness was minimal following assessment of the cervical spine. A more limited scoring system was more responsive than assessment of all 23 levels. Moreover, a scoring system that was limited to a maximum of 6 most severely affected levels demonstrated the greatest degree of responsiveness.
| Parameter | Pretreatment, mean ± SD | Posttreatment, mean ± SD | Effect size | Standardized response mean |
|---|---|---|---|---|
| No. of affected DVU | | | | |
| Total | 8.8 ± 5.6 | 6.0 ± 4.4 | 0.50 | 0.66 |
| Cervical spine | 1.5 ± 1.5 | 1.3 ± 1.6 | 0.13 | 0.16 |
| Thoracic spine | 5.5 ± 3.6 | 3.5 ± 3.1 | 0.55 | 0.67 |
| Lumbar spine | 1.8 ± 1.6 | 1.3 ± 1.4 | 0.35 | 0.49 |
| DVU score | | | | |
| Total (23 DVU) | 41.1 ± 37.8 | 20.9 ± 27.6 | 0.53 | 0.80 |
| Cervical spine | 6.8 ± 10.2 | 4.8 ± 10.9 | 0.19 | 0.30 |
| Thoracic spine | 27.3 ± 24.7 | 12.0 ± 15.9 | 0.62 | 0.82 |
| Lumbar spine | 7.1 ± 9.3 | 3.9 ± 6.0 | 0.34 | 0.43 |
| 6-DVU score | 28.6 ± 21.3 | 13.4 ± 15.3 | 0.71 | 0.86 |
| 8-DVU score | 33.4 ± 26.7 | 16.5 ± 19.3 | 0.64 | 0.84 |
| 10-DVU score | 36.6 ± 30.8 | 18.3 ± 22.6 | 0.60 | 0.86 |
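The article does not spell out its formulas for the two responsiveness statistics. The sketch below assumes the commonly used definitions: effect size as the mean change divided by the SD of the baseline scores, and standardized response mean as the mean change divided by the SD of the individual changes. Under that assumption the tabulated effect sizes can be recovered, to within rounding, from the tabulated means and SDs; the standardized response mean needs patient-level data, so hypothetical paired scores are used for it.

```python
from statistics import mean, stdev
from typing import Sequence


def effect_size(pre_mean: float, pre_sd: float, post_mean: float) -> float:
    """Mean change divided by the SD of the baseline scores (assumed definition)."""
    return (pre_mean - post_mean) / pre_sd


def standardized_response_mean(pre: Sequence[float], post: Sequence[float]) -> float:
    """Mean change divided by the SD of the individual changes (assumed definition)."""
    changes = [a - b for a, b in zip(pre, post)]
    return mean(changes) / stdev(changes)


# (pretreatment mean, pretreatment SD, posttreatment mean, reported effect size)
table_4 = {
    "23-DVU": (41.1, 37.8, 20.9, 0.53),
    "10-DVU": (36.6, 30.8, 18.3, 0.60),
    "8-DVU": (33.4, 26.7, 16.5, 0.64),
    "6-DVU": (28.6, 21.3, 13.4, 0.71),
}
for name, (pre_mean, pre_sd, post_mean, reported) in table_4.items():
    print(name, round(effect_size(pre_mean, pre_sd, post_mean), 2), "reported", reported)

# Hypothetical paired 6-DVU scores for 5 patients.
print(round(standardized_response_mean([30, 12, 45, 20, 8], [10, 9, 20, 15, 6]), 2))
```

The recomputed values differ from the reported ones by at most 0.01, which is consistent with the table entries having been rounded before publication.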
DISCUSSION
Our analyses of the SPARCC scoring method for the assessment of spinal inflammation by MRI demonstrated that limiting scoring to only the 6 most severely affected levels captures 62% of all affected DVU and 74% of the total DVU score. Furthermore, interobserver reliability was excellent regardless of whether analysis was limited to only the most severely affected levels or included the entire spine, whereas responsiveness was optimal when scoring was limited to only the 6 most severely affected levels. These observations, together with improved feasibility, indicate that although the entire spine is assessed in patients with AS, scoring all affected DVU is unnecessary, which may facilitate acceptance of this approach for clinical research and clinical trials.
These findings are not entirely surprising. Scoring of the entire spine, as opposed to only the more severely affected DVU, will include more subtle lesions that may be less responsive to change and more difficult to assess. If readers are permitted to select levels for scoring, some error in reading due to the presence of signal artifact may be eliminated because the reader has the choice of not selecting those levels that are clearly subject to phase-encoding, partial-volume, or other artifacts. In addition, reliability of assessment is not as good in the cervical spine and responsiveness to change in this region is poor. This finding likely reflects both a relative lack of involvement and the large field of view that is required to image the entire spine in 2 halves. In the lumbar spine, reader reliability in selection of affected DVU and reliability of change scores were also only moderate. The majority of affected levels and the greatest contribution to the total DVU score came from the thoracic spine. Accordingly, interreader reliability for status and change scores was maximal in this spinal segment. It is premature to conclude, however, that scoring should be confined to the thoracic spine because the distribution of spinal inflammatory lesions may vary according to disease duration and other demographic variables such as sex. Although stratification of our data according to disease duration and sex did not significantly influence the distribution of affected DVU and DVU scores in our cohorts (data not shown), this issue will require further study in larger data sets. This scoring method cannot be recommended for diagnostic evaluation at this time. Its primary purpose is to record change in inflammatory lesions for clinical and therapeutic trials research and no method for scoring MRI scans in clinical practice has yet shown consistent results.
A potential source of bias, which may primarily affect the reliability of the 23-DVU score, is introduced if the reader selects the 6 most severely affected DVU before the remaining DVU are scored. This may potentially reduce the variability in identification and scoring of the remaining less severely affected DVU, leading to higher ICC values for the 23-DVU score. In fact, readers were not provided with any instructions as to when the most severely affected DVU should be selected and they may, for instance, have scored all 23 DVU first and then chosen the 6 worst DVU. Alternatively, the reader may have made the selection first but could still have chosen to change the selection of the most severely affected DVU after scoring the entire spine. The impact of this study design on the reliability of the 23-DVU score is therefore not readily apparent. In contrast, the feasibility of a study design in which readers are asked to score 6, 8, 10, and 23 DVU in independent reads with the increasing likelihood of recall, particularly for severely affected DVU, is an open question.
The selection of the most severely affected DVU when assessing pre- and posttreatment images is based on a simultaneous assessment of these images using a 3-monitor review station. This readily permits simultaneous assessment of all spinal segments at both time points. Although limiting the selection to only the most severely affected DVU potentially adds to the measurement error, our data demonstrate that reliability of change scores is no different whether all 23 DVU or only the most severely affected DVU are scored. We consider it very important that viewing conditions are organized in a manner that readily permits simultaneous visualization of both pre- and posttreatment images.
One other scoring method for assessment of spinal inflammation by MRI has been published (3). This approach is also based on the assessment of a spinal DVU and scores bone edema and erosion in a single dimension from a sagittal image according to the proportion of the anteroposterior length of the DVU involved. Scores are weighted towards the presence of erosion and range from 0 to 6 per DVU. This approach uses both T2-weighted and gadolinium-enhanced MRI sequences. This index was shown to be reliable and responsive to change in patients receiving anti-TNFα therapy. Recently, the scoring approach has been modified to include the evaluation of edema only and the range of scores per DVU has accordingly been reduced to 0–3 (8). There has been no further work to determine whether a more focused approach to scoring the most severely affected levels might perform equally well compared with scoring the entire spine. Systematic examination of spinal lesions by MRI using this latter scoring method concurred with our observations that the majority of affected levels were located in the thoracic spine, although that examination revealed somewhat more lesions in the cervical spine than in the present study (8). Involvement of cervical DVU was evident in 16–26% of patients, although the number of affected cervical DVU per patient was not provided. There were no obvious differences in disease duration or severity that might account for these differences with our observations. Our analyses were based entirely on the assessment of STIR MRI sequences and it is recognized that gadolinium-enhanced MRI may reveal distinct lesions that score differently, although the likelihood of this affecting the total score for a patient is low (9, 10). Both approaches to scoring omit lesions in the posterior segment of the spine, including the facet joints, processes, and interspinous ligaments, which have not yet been systematically evaluated by MRI. 
Whether inclusion of these regions will improve the metrologic properties of MRI-based scoring systems requires further study.
Assessment of construct validity demonstrated that changes in spinal inflammation MRI scores primarily paralleled changes in CRP level regardless of the scoring method used in our study. The lack of correlation with the BASDAI may reflect the fact that the latter instrument is a self-reported measure of patient symptoms such as pain, stiffness, and fatigue and is therefore not necessarily specific for AS, but may equally reflect the symptomatology of nonspecific causes of back pain. Additional sources of back pain other than inflammation are possible in patients with AS with long disease duration who may either develop secondary structural damage and/or concomitant spinal disorders unrelated to AS.
Two reports have now shown that anti-TNF therapy for >2 years reduces MRI scores for disease activity in the spine as recorded by the Ankylosing Spondylitis Spinal MRI score, although there is persisting disease that amounts to 25–30% of the baseline score (11, 12). This could potentially raise concerns that a scoring system limited to only the most severely affected DVU might not capture residual disease, limiting its ability to record more effective treatment strategies. However, our data show that posttreatment scores are 46.9%, 49.4%, and 50% of pretreatment 6-, 8-, and 10-DVU scores, respectively, and are no different from an analysis of the entire spine (23-DVU score), which shows a posttreatment score that is 50.9% of the pretreatment score, allowing ample opportunity for assessment of more effective treatment strategies.
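The residual percentages quoted above are the posttreatment mean expressed as a share of the pretreatment mean, using the mean DVU scores in Table 4. A short check with illustrative variable names:

```python
# (pretreatment mean score, posttreatment mean score) from Table 4
mean_scores = {
    "6-DVU": (28.6, 13.4),
    "8-DVU": (33.4, 16.5),
    "10-DVU": (36.6, 18.3),
    "23-DVU": (41.1, 20.9),
}

residual_pct = {name: round(100 * post / pre, 1) for name, (pre, post) in mean_scores.items()}
print(residual_pct)  # {'6-DVU': 46.9, '8-DVU': 49.4, '10-DVU': 50.0, '23-DVU': 50.9}
```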
In conclusion, the SPARCC MRI spinal inflammation index requires assessment of the entire spine but performs better with respect to responsiveness when analysis is limited to a maximum of 6 most severely affected levels as compared with results derived from scoring the entire spine. Interreader reliability is excellent for both status and change scores with either scoring approach. The use of the 6-DVU scoring method should improve the feasibility of this tool in clinical trials and research.
Dr. Maksymowych had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study design. Maksymowych, Dhillon, Lambert.
Acquisition of data. Maksymowych, Dhillon, Park, Salonen, Inman, Lambert.
Analysis and interpretation of data. Maksymowych, Inman, Lambert.
Manuscript preparation. Maksymowych, Inman, Lambert.
Statistical analysis. Maksymowych.
- 3. Magnetic resonance imaging examinations of the spine in patients with ankylosing spondylitis, before and after successful therapy with infliximab: evaluation of a new scoring system. Arthritis Rheum 2003; 48: 1126–36.
- 8. Magnetic resonance imaging in ankylosing spondylitis: a detailed analysis [abstract]. Ann Rheum Dis 2005; 64 Suppl 3: 324.