Juvenile idiopathic arthritis (JIA) (1) can lead to destructive lesions of joint cartilage and periarticular bone. Since the introduction of more potent treatment strategies, the evaluation of radiographic joint damage has become more prominent in the assessment of disease progression in JIA (2–4). Because one of the aims of JIA treatment is to prevent or retard joint damage, and radiographs are able to document this damage, a standardized tool for the radiologic evaluation of the lesions of the joints over time is needed. Several methods have been proposed to assess radiographs in JIA (5–8), but none have been tested for sensitivity to change or discrimination between treatment groups in a trial. Recently, we introduced the Dijkstra score as a standardized method to evaluate the radiographs of patients with oligoarticular- and polyarticular-onset JIA. Data were obtained from the data set of a placebo-controlled sulfasalazine (SSZ) trial performed in The Netherlands (9). All radiographs were assessed for the presence of a comprehensive spectrum of JIA radiologic features. We found the reliability, feasibility, and measurement properties of the Dijkstra score to be adequate for its purpose.
To decide whether a measure or instrument is applicable in a particular clinical setting, the OMERACT filter can be applied (10). This tool was developed for the Outcome Measures in Rheumatology Clinical Trials (OMERACT) initiative and summarizes applicability with 3 criteria: 1) truth (Does the instrument measure what it is supposed to?), 2) discrimination (Can the instrument discriminate between situations of interest?), and 3) feasibility (Can the instrument be feasibly applied in the intended setting?). We confirmed, to some degree, the truth and feasibility of the Dijkstra score in a clinical trial setting in our previous study (9). In the current study, we focus on the discrimination criterion as it applies to the Dijkstra score. More specifically, we studied whether the Dijkstra score could detect radiographic change in a 6-month period, and also whether differences in change between the treatment groups could be detected. For this purpose, we propose a composite score for inflammation, damage, and growth disturbances, and a classification scheme to distinguish between progressive and nonprogressive radiographic joint damage.
PATIENTS AND METHODS
Data collection and scoring of radiographs.
A randomized, placebo-controlled trial of SSZ in patients with oligoarticular- and polyarticular-onset JIA (mean ± SD age 9 ± 4 years), performed by the Dutch JIA Study Group (3), yielded the data for this study. All conventional film-screen radiographs of the affected joints (those that are tender, painful, swollen, and/or limited in motion, as judged by the treating physician) and the contralateral joints, obtained at study entry and at 6 months' followup, were analyzed. After completion of the trial, the radiographs were scored in chronologic order in a single session by a skeletal radiologist (Piet F. Dijkstra) and a pediatric rheumatologist (MAJvR) in consensus. The readers were blinded to the subtype of JIA and the clinical condition of the patient.
The radiographs were scored in accordance with a standardized assessment method as described previously (9); this is referred to as the Dijkstra score. In summary, a maximum of 19 joints or joint groups were evaluated on radiographs: the cervical spine (1 joint), 2 mandibles, 2 shoulders, 2 elbows, 2 hands (for each hand, the joint group includes all finger, metacarpal, and wrist joints), 2 sacroiliac joints, 2 hips, 2 knees, 2 ankles, and 2 feet (for each foot, the joint group includes all tarsal, metatarsal, and toe joints). In the current report we use the term joint both for a large single joint and for a group of smaller joints (e.g., the hands). The following features (collectively defined as a radiologic abnormality) were scored as present or absent: soft tissue swelling, osteopenia, joint space narrowing (JSN), enlargement or other growth disturbances, subchondral bone cysts, erosions, and abnormal joint position, or malalignment. One joint/joint group could have a positive score for more than one feature, but each feature was only counted once (e.g., a joint/joint group could show more than one erosion, but the erosion score remained as 1).
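The scoring rules above can be sketched in code. This is a minimal illustration, not the authors' scoring software: the identifiers `JOINTS`, `FEATURES`, and `score_joint` are hypothetical names introduced here, and left/right labels are assumed for the paired sites.

```python
# Joint groups named in the text: 1 cervical spine plus 9 bilateral sites.
# All identifiers here are illustrative assumptions, not taken from the study.
BILATERAL_SITES = ["mandible", "shoulder", "elbow", "hand", "sacroiliac",
                   "hip", "knee", "ankle", "foot"]
JOINTS = ["cervical_spine"] + [f"{side}_{site}"
                               for site in BILATERAL_SITES
                               for side in ("left", "right")]

# The seven radiologic features scored as present (1) or absent (0).
FEATURES = ("swelling", "osteopenia", "jsn", "growth_disturbance",
            "bone_cysts", "erosions", "malalignment")


def score_joint(findings):
    """Return a 0/1 score per feature for one joint or joint group.

    `findings` is an iterable of feature names seen on the radiograph.
    A feature seen several times (e.g., multiple erosions) still scores 1.
    """
    present = set(findings)
    unknown = present - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    return {feature: int(feature in present) for feature in FEATURES}
```

For example, `score_joint(["erosions", "erosions", "jsn"])` gives a score of 1 for erosions and 1 for JSN, with every other feature scored 0.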
A set of radiographs was obtained for each joint and comprised the films obtained at study entry (baseline) and those obtained at 6 months' followup. A standard set of films was defined as radiographs of both hands, feet, and knees of 1 patient. The differences in scores over time were compared between radiologic signs, joints, and patients.
For a further standardized numeric evaluation of the data, we defined the Dijkstra composite scores for each radiographed joint as follows: the Dijkstra inflammation (DI) score (range 0–2) is the summation of scores for swelling (range 0–1) and osteopenia (range 0–1); the Dijkstra damage (DD) score (range 0–3) is the summation of scores for JSN (range 0–1), bone cysts (range 0–1), and erosions (range 0–1); and the Dijkstra growth (DG) score is the score for growth abnormalities (range 0–1). The DI, DD, and DG scores were calculated at baseline and at followup for each radiographed joint and for each patient, and the values at both time points were compared. An increase in any of the Dijkstra composite scores was deemed to indicate joint deterioration, while a decrease reflected improvement. The malalignment sign was excluded from our analyses, since its prevalence on the radiographs was too low to generate useful data.
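As a worked illustration of these definitions, the composite scores for one joint can be computed from its present/absent feature scores as follows. The function names and dictionary keys are assumptions made for this sketch. Malalignment is left out of the calculation, in line with its exclusion from the analyses.

```python
def composite_scores(features):
    """Compute the Dijkstra composite scores for one radiographed joint.

    `features` maps feature names to 0 (absent) or 1 (present).
    Key names are illustrative assumptions.
    """
    di = features["swelling"] + features["osteopenia"]                     # DI, range 0-2
    dd = features["jsn"] + features["bone_cysts"] + features["erosions"]   # DD, range 0-3
    dg = features["growth_disturbance"]                                    # DG, range 0-1
    return {"DI": di, "DD": dd, "DG": dg}


def score_changes(baseline, followup):
    """Followup minus baseline for each composite score.

    A positive value indicates deterioration, a negative value improvement.
    """
    before = composite_scores(baseline)
    after = composite_scores(followup)
    return {name: after[name] - before[name] for name in before}
```

A joint with swelling only at baseline, and with JSN and erosions but no swelling at followup, would show a DI change of -1, a DD change of +2, and a DG change of 0.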
Definition of progression.
Joint damage was subsequently categorized as progressive when either the DD or the DG score in a joint increased. The disease course in all other joints was considered to be nonprogressive, with subclassifications of normal (both DD and DG scores of 0, with no increase), abnormal–stable (either or both scores >0, with no change), and abnormal–improved (either or both scores >0, with a subsequent decrease in either or both scores at 6 months). Patients were defined as having a progressive disease course (progressor) when at least 1 radiographed joint showed progression as defined above.
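The classification rules can be expressed as a short decision procedure. The sketch below is one reading of the definition, with hypothetical function names and plain-hyphen category labels. It assumes that an increase in either score takes precedence, so a joint in which the DD score rises while the DG score falls is still counted as progressive.

```python
def classify_joint(baseline, followup):
    """Classify one joint from its DD and DG scores at baseline and followup.

    `baseline` and `followup` are dicts with keys "DD" and "DG".
    """
    dd_change = followup["DD"] - baseline["DD"]
    dg_change = followup["DG"] - baseline["DG"]
    if dd_change > 0 or dg_change > 0:
        return "progressive"
    # From here on, neither score has increased.
    if baseline["DD"] == 0 and baseline["DG"] == 0:
        return "normal"
    if dd_change < 0 or dg_change < 0:
        return "abnormal-improved"
    return "abnormal-stable"


def is_progressor(joint_classes):
    """A patient is a progressor if at least 1 radiographed joint progressed."""
    return any(label == "progressive" for label in joint_classes)
```

Under this reading, a patient whose joints are classified as normal, abnormal-stable, and progressive is a progressor, whereas a patient with only normal and abnormal-improved joints is not.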
Statistical analysis.
The presence of radiologic abnormalities was summarized both at the level of the various joints and at the level of the patient. At the level of the individual joints, a marginal regression model, as implemented in the GENMOD procedure of the SAS statistical program (SAS Institute, Cary, NC), was used to compare different patient groups with respect to the Dijkstra composite scores. Wald-type chi-square statistics using robust variance estimates were calculated to account for the possible correlation between joints from the same patient. The same approach was used to test the differences between baseline and followup radiographs, and to assess associations between the different composite scores (reflecting radiologic changes) per joint. The logistic regression approach in the GENMOD procedure was used to compare the percentage of progressive/nonprogressive joints between different patient groups. At the level of the individual patients, ordinary linear and logistic regression analyses were used to evaluate the effect of treatment and other patient characteristics on the Dijkstra composite scores and on the percentage of progressor/nonprogressor patients. P values less than or equal to 0.05 were considered significant.
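For readers less familiar with marginal models, the general form of such a model for a binary joint-level outcome, with a robust ("sandwich") variance estimate, can be written as below. This is the standard textbook formulation of generalized estimating equations, given here as an assumption about the general approach rather than as the exact specification used in the trial analysis. The symbols are introduced for illustration only: $Y_{ij}$ is the outcome (e.g., progression) for joint $j$ of patient $i$, $x_i$ is the treatment indicator, $\mu_i$ is the vector of expected outcomes for patient $i$, $D_i = \partial \mu_i / \partial \beta$, and $V_i$ is the working covariance matrix of the joints within patient $i$.

```latex
\operatorname{logit} \Pr(Y_{ij} = 1) = \beta_0 + \beta_1 x_i ,
\qquad
\widehat{\operatorname{Var}}(\hat{\beta}) = B^{-1} M B^{-1},

B = \sum_i D_i^{\top} V_i^{-1} D_i ,
\qquad
M = \sum_i D_i^{\top} V_i^{-1} (y_i - \hat{\mu}_i)(y_i - \hat{\mu}_i)^{\top} V_i^{-1} D_i .
```

Because $M$ is built from the observed within-patient residuals, the resulting Wald statistics remain valid even if the working covariance $V_i$ does not match the true correlation between joints of the same patient.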
RESULTS
The original placebo-controlled SSZ trial included 69 patients with JIA. For the present study, 3 patients were excluded (1 because of reclassification as having systemic JIA, and 2 because of missing radiographs). Therefore, the data comprised 418 sets of radiographs from 66 patients. At study entry, all affected and contralateral joints were radiographed; of the baseline films, 288 (69%) originated from joints with clinical symptoms (swelling, pain, limitation of motion). The patients' characteristics are listed in Table 1. At baseline, 10 patients (15%) had no abnormalities on their radiographs. The mean number of sets of radiographs (a single joint radiographed at baseline and at followup) per patient was 6.3 (SD 3.7, range 2–15). The sets of radiographs consisted of radiographs of the knees (23%), hands (20%), ankles (18%), feet (15%), and other joints (24%).
Table 1. Characteristics at study entry of 66 patients with juvenile idiopathic arthritis who participated in the placebo-controlled trial of sulfasalazine*
|Age, mean ± SD (range) years||9.0 ± 4.1 (2.5–17.6)|
|Disease onset before age 6 years||35 (53)|
|Disease onset between 6 and 10 years||16 (24)|
|Disease onset beyond age 10 years||15 (23)|
|Disease duration, months|| |
| Median (IQR)||24 (10–40)|
|Polyarticular-onset type JCA†||29 (44)|
|Oligoarticular-onset type JCA||37 (56)|
|>4 joints with clinical arthritis at study entry‡||41 (62)|
|Antinuclear antibodies present||33 (50)|
|IgM rheumatoid factor present||9 (14)|
|HLA–B27 positive||11 (17)|
|Local corticosteroid use ever||29 (44)|
|Disease-modifying antirheumatic drug use ever||5 (8)|
|Systemic corticosteroids use ever||2 (3)|
The proportion of radiographs showing abnormalities was stable over time; 35% of radiographs showed swelling and osteopenia at baseline versus 38% at followup, and 24% did not show these signs at baseline versus 26% at followup. After 6 months, 58% of the radiographed joints remained normal, 23% remained abnormal but stable, 14% showed an increase in signs, and 5% showed a decrease in signs; when swelling and osteopenia are excluded from the score, these proportions change to 71% remaining normal, 18% abnormal but stable, 3% showing an increase in signs, and 8% showing a decrease in signs.
Changes per joint.
The knees, hands, and feet were most likely to change, especially from normal to abnormal in the latter 2 sites (Table 2). The knees were most likely to change from abnormal back to normal. With regard to the separate radiologic signs scored according to the Dijkstra score in the different types of joints (Table 3), all signs showed changes over time in the hands and knees. In the joints of the feet, all radiologic signs showed changes except for swelling. Swelling and growth abnormalities changed most often in the knees.
Table 2. Changes from baseline to followup in overall status per joint according to the Dijkstra score*
Table 3. Changes from baseline to followup in scored radiologic signs according to the Dijkstra score*
Changes per patient.
Of the 66 patients with JIA, 8 (12%) had normal radiographic findings throughout followup, 18 (27%) showed abnormalities at some sites without change, and 40 (61%) showed change in at least 1 site. Of these 40 patients whose radiographs showed change, only 2 had normal findings at baseline and developed abnormalities at followup, while the other patients already had abnormalities on the radiographs at baseline. Changes in the number of scored signs occurred in 24 patients, of whom 13 showed an increase and 11 showed a decrease in signs. When swelling and osteopenia are excluded from the score, 11 patients showed an increase and 3 showed a decrease in the number of scored signs.
Progression of joint damage.
Changes in the clinical joint scores (swelling, pain, limitation of motion, overall clinical severity score) showed no correlation with changes in the radiologic joint scores (results not shown). The changes in the Dijkstra composite scores DI, DD, and DG varied considerably per type of joint and occurred most frequently in the knees, hands, and feet (Table 4). The DI and DG scores (inflammation and growth abnormalities) changed most often in the knees, while the DD scores (joint damage) changed most often in the hands and feet. Patients with different prognostic profiles (HLA–B27 positive or IgM rheumatoid factor positive) and patients for whom films of the standard set (hands, feet, and knees) were available showed comparable degrees of change in the Dijkstra composite scores (results not shown).
Table 4. Changes from baseline to followup in the Dijkstra composite score of the radiographed joints*
|Inflammation|| || || || || || || |
| DI unchanged||89||79||88||94||91||89||100|
| DI increased||6||8||7||3||8||9||–|
| Normal to abnormal||4||6||3||3||7||9||–|
| Increase in abnormality||2||2||4||–||1||–||–|
| DI decreased||5||13||5||3||1||2||–|
| Abnormal to normal||5||13||5||3||1||2||–|
| Decrease in abnormality||–||–||–||–||–||–||–|
|Damage|| || || || || || || |
| DD unchanged||92||95||86||85||97||92||96|
| DD increased||6||5||12||10||3||8||–|
| Normal to abnormal||4||4||4||8||3||8||–|
| Increase in abnormality||2||1||8||2||–||–||–|
| DD decreased||2||–||2||5||–||–||4|
| Abnormal to normal||1||–||2||3||–||–||4|
| Decrease in abnormality||1||–||–||2||–||–||–|
|Growth|| || || || || || || |
| DG unchanged||96||87||94||98||100||98||100|
| DG increased||2||3||6||2||–||2||–|
| Normal to abnormal||2||3||6||2||–||2||–|
| DG decreased||2||10||–||–||–||–||–|
| Abnormal to normal||2||10||–||–||–||–||–|
With regard to changes over time and disregarding treatment group, only the changes in the DD score were statistically significant (P = 0.035). Since most changes occurred in the standard set, there was little correlation between these and the (few) changes in the other joints, especially when considered according to the DI and DG scores. We did see some changes in the DD score among joints outside the standard set, but in all instances these were accompanied by DD score changes within the standard set. The disease course in 8% of joints was classified as progressive because either the DG or the DD score increased (Table 5). The remaining joints were nonprogressive: 74% of all joints were subclassified as normal, 15% as abnormal–stable, and 3% as abnormal–improved. In the standard set, the proportion considered normal was somewhat lower at 62%, whereas it was higher (90%) in the remaining films. These results suggest that restriction of assessment to the standard set does not lead to important loss of information, and may even enhance the signal-to-noise ratio.
Table 5. Classification of radiologic change from baseline to followup into progressive and nonprogressive change*
|Classification||All joints, %||Standard set, %||Remaining joints, %|
|Progressive|| || || |
| Increase in DD or DG score||8||11||3|
|Nonprogressive|| || || |
| Normal (DD and DG score 0, no increase)||74||62||90|
| Abnormal–stable (DD or DG score >0, no change)||15||22||6|
| Abnormal–improved (DD or DG score >0, decrease in DD or DG score at 6 months)||3||5||1|
Comparison of treatment groups.
Finally, we explored the data to identify a possible treatment effect of SSZ on the outcome displayed by the radiographs. In this study, 187 (45%) of the sets of radiographs originated from 31 placebo-treated patients and 231 (55%) were from 35 SSZ-treated patients. Significantly less deterioration in the DD scores occurred in the SSZ-treated patients than in the placebo-treated patients (P = 0.04), whereas the changes in the DI and DG scores did not differ significantly between treatment groups. In the standard set (65 patients, 240 hand/foot/knee radiographs), the SSZ group consistently showed less deterioration than the placebo group according to the DI, DG, and DD scores, but the difference approached significance only for the DD score (P = 0.052).
Classification of the radiologic change as progressive or nonprogressive improved the power of the comparison. When we classified films of the joints as progressive or nonprogressive, we found that 12% of the joints of the placebo group could be classified as progressive compared with 4% of the joints of the SSZ group (P = 0.037); the corresponding values from the films of the standard set were 16% and 7%, respectively (P = 0.025) (Table 6). We also analyzed the radiologic change at the individual patient level. When patients with at least 1 radiograph showing progression were classified as progressors, significantly more placebo-treated patients were considered to be progressors (P = 0.046). This difference between the treatment groups was no longer significant when the analysis was restricted to hand/foot/knee radiographs from patients with availability of radiographs from the standard set (P = 0.15).
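The joint-level percentages quoted above can be checked against the counts reported in the text and in Table 6. The sketch below only reproduces the crude proportions for all radiographed joints. It does not reproduce the reported P values, which were derived from the marginal model that accounts for correlation between joints of the same patient. The helper name `percent` is an assumption of this example.

```python
def percent(count, total):
    """Percentage of `count` in `total`, rounded to the nearest integer."""
    return round(100 * count / total)


# Sets of radiographs per treatment group, as reported in the text.
placebo_sets, ssz_sets = 187, 231
# Progressive joints per treatment group, as reported in Table 6.
placebo_progressive, ssz_progressive = 23, 9

print(percent(placebo_progressive, placebo_sets))  # 12
print(percent(ssz_progressive, ssz_sets))          # 4
```

The denominators for the standard set are not stated per treatment group in the text, so the corresponding percentages (16% and 7%) are not recomputed here.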
Table 6. Classification of outcome on the radiographs from patients treated with sulfasalazine (SSZ) versus placebo in the clinical trial*
|Outcome||Placebo, all joints||SSZ, all joints||Placebo, standard set||SSZ, standard set|
|Progressive||23 (12)||9 (4)||17 (16)||9 (7)|
|Nonprogressive|| || || || |
DISCUSSION
This study demonstrates that the Dijkstra scoring method of assessing radiographs in oligoarticular- and polyarticular-onset JIA can detect change over a trial period of 6 months. Changes could be demonstrated at the level of 1) the presence of radiologically scored signs, 2) the number of scored signs per joint, and 3) the number of scored radiologic signs per patient. In addition, the differences between placebo and SSZ treatment groups, many of which were statistically significant, could be demonstrated at these 3 levels. Finally, a simple classification scheme to identify progressors and nonprogressors proved discriminative between the treatment groups.
Reduction in the number of radiographs to a standard set of images of the hands, feet, and knees appears feasible without losing essential information. Radiologically scored abnormalities changed most often in the knees, hands, and feet. We therefore propose that the radiologic assessment of all joints of this standard set be carried out in clinical trials, regardless of disease activity.
Most radiographs that changed showed additional radiologic features at followup, but in some radiographs, abnormalities present at baseline were no longer present at followup. This normalization occurred in several types of joints, but most often in the knees. We assume that an increase in the number of scored signs reflects disease progression, but a stable number or a decrease in the number of scored signs does not inevitably reflect disease improvement (e.g., JSN or erosions can replace growth abnormality). Only joints without remaining signs have unquestionably improved. In addition, one should be aware that in the Dijkstra scoring system, an increase in identical signs within joints is not reflected in an increase in the number of scored signs.
Changes were evident in all aspects of the score, comprising inflammation (swelling and osteopenia), pathologic changes in the cartilage (JSN and growth abnormalities), and those in the bone (growth abnormalities, bone cysts, and erosions). In our previous study, we demonstrated that scores for swelling and osteopenia were only moderately reliable (9). Despite detectable change, we still believe that both swelling and osteopenia are of limited value in a scoring system. Nevertheless, at this stage of scoring development, we consider it too early to reject these radiologic findings for further evaluation.
Changes in growth abnormalities were detected in several types of joints; in particular, in the hand and knee, growth abnormalities both regressed and appeared during the followup period. This sign is considered a key manifestation of JIA (8, 11, 12), but in our investigations, its reproducibility was moderate (9). Definitions of growth disturbance therefore need further refinement (e.g., an atlas of reference films) to improve the value in a scoring system.
JSN is also considered a key manifestation of JIA. JSN was reliably reproducible in our previous study and in other studies (9, 13). In the present study, changes in JSN were demonstrated in all joints of the standard set. Scores for bone cysts and erosions changed in several joints and appeared reliably reproducible in our previous study (9). Quantification and refinement of the erosion and JSN scores might further improve the performance of radiologic scoring in JIA, consistent with that achieved in rheumatoid arthritis (RA) (14). Future studies with a longer period of followup are needed to elaborate on this subject. Changes in malalignment were only rarely detected in the hands, and we therefore have too few data to evaluate the value of scoring this sign in JIA.
To make the method applicable in trials, we developed a numeric score, the Dijkstra composite score, comprising separate values for inflammation, growth abnormalities, and damage. In our opinion, these 3 scores represent distinct radiologic information. The results of our study show that the DI, DG, and DD scores changed over time, although only the change in the DD score reached statistical significance, and that they elucidated specific changes in radiographs at the level of the joints and at the level of the patient. The study also demonstrates that the Dijkstra composite score adequately reflects the radiologic change in different patient groups.
For a further evaluation of change, we categorized the radiologic change as progressive or nonprogressive. In evaluations of radiologic outcome in adult RA trials, the progressor classification may provide a useful summary of the data per patient, although its significance for long-term prognosis remains to be determined (15–17). We posited that in JIA, clinically meaningful radiologic change would imply progression of either growth abnormalities or damage. Therefore, we defined progressive radiologic change as an increase in either the DG score or the DD score in a joint. In our study, application of this proposed definition resulted in a distinct discrimination between radiographs originating from placebo-treated and SSZ-treated patient groups. Moreover, individual patient-based analysis showed a significant difference in the proportion of progressors, to the advantage of SSZ-treated patients. These findings must be interpreted with caution, since the trial was not designed to evaluate differences in damage progression. Nevertheless, the radiologic findings are consistent with the clinical findings in the trial (3), and with the effects of SSZ in adult RA (18, 19). Thus, the findings appear to confer some additional construct validity to the composite scores and subsequent classifications.
In summary, this study completes the initial validation phase of the Dijkstra score. We suggest that it is the first radiologic measure in JIA to pass the OMERACT filter of truth, discrimination, and feasibility, at least in the setting of a placebo-controlled trial in oligoarticular- and polyarticular-onset JIA. Future studies by other investigators and in other data sets should put this measure to the test. For this purpose, we intend to produce training materials, and we will further validate the scoring method on the basis of a long-term followup of patients in the trial.
Piet F. Dijkstra, skeletal radiologist, was one of the initiators of the study. The Dijkstra score is based on Dr. Dijkstra's lifelong experience of reading radiographs from patients with rheumatic diseases.