To investigate the applicability of the Sharp and Larsen scoring methods for radiographic damage in juvenile idiopathic arthritis (JIA).
Wrist/hand radiographs of 25 patients with polyarthritis obtained at first observation and then yearly for 4–5 years were assessed independently by 2 pediatric rheumatologists according to the Sharp and Larsen methods. To facilitate score assignment, each patient radiograph was compared with a bone age–related standard. A third pediatric rheumatologist measured the Poznanski score, and a pediatric radiologist provided a semiquantitative assessment of radiographic damage severity.
Interobserver and intraobserver agreement on longitudinal scores were good for both Sharp and Larsen methods, with intraclass correlation coefficient >0.9. Agreement on change assessment was good for the Sharp method and moderate for the Larsen method. Both methods yielded a steady increase in scores during the study, with score change being more marked in the first year. Sharp and Larsen scores were highly correlated (rs = 0.96). Correlations of both scores with the Poznanski score were moderate to high (rs from −0.62 to −0.72). Radiologist score was correlated at borderline-high level with both Sharp (rs = 0.70) and Larsen (rs = 0.71) scores. Sharp and Larsen score change from baseline to final visit was moderately to highly correlated with the number of joints with active arthritis and restricted motion and the Childhood Health Assessment Questionnaire score at final visit.
Our results demonstrate that the Sharp and Larsen scoring systems are potentially reliable and valid for assessment of radiographic progression in patients with polyarticular JIA.
The assessment of radiographic joint damage is an important clinical tool in the evaluation of disease severity and progression in patients with chronic arthritis and is recommended as a main outcome measure in controlled clinical trials of disease-modifying therapies (1) as well as in longitudinal observational studies (2). A number of scoring systems are available to quantify radiographic changes in adults with rheumatoid arthritis (3, 4). However, little information exists on standardized measurement of radiographic damage in the investigation of disease outcome in children with juvenile idiopathic arthritis (JIA) (5–11). Furthermore, the assessment of radiographic progression has never been included in controlled trials of second-line agents in JIA. This chiefly reflects the lack of established radiographic scoring systems for use in children. However, because novel potent therapeutic agents are now available for children with JIA (12), there is a growing need for a clear and reproducible radiologic assessment standard to thoroughly investigate the effectiveness of these agents.
It is commonly believed that the traditional scoring methods used for adult rheumatoid arthritis, such as the scores developed by Sharp et al (13) and Larsen (14), which are based on the assessment of joint space narrowing (JSN) and erosions in the wrist and hand joints, may not be suitable for the evaluation of pediatric joint diseases. In contrast to adults, it is difficult to reliably determine cartilage loss and erosions in children by simple examination of radiographs because ossification is incomplete and the width of joint space varies with age (15). However, to date the adult rheumatoid arthritis scoring systems have never been tested in patients with JIA. The purpose of this study was to investigate the applicability of the Sharp and Larsen scoring systems in the assessment of radiographic progression in patients with polyarticular JIA.
Because we were principally interested in examining the value of scoring systems in the assessment of radiographic progression, we chose a followup design and aimed at analyzing continuous data. We reviewed the radiology records of patients seen at the study centers from May 1992 to November 2004 to identify those who had JIA according to the International League of Associations for Rheumatology revised criteria (16) and polyarthritis with wrist and/or hand joint involvement, and who had a standard radiograph of both wrists and hands in the posteroanterior view made at first observation (baseline) and then yearly in the subsequent 4–5 years.
Four scoring methods were used to score patient radiographs. The Sharp method (13) applies to 18 areas for JSN and 17 areas for erosions in each hand and wrist. Scores for JSN and erosion in each area range from 0 to 4 and from 0 to 5, respectively. The total Sharp score is calculated as the sum of the JSN (range 0–144) and erosion (range 0–170) scores and ranges from 0 to 314. In the younger children (generally boys with a bone age <5 years and girls with a bone age <6 years), some of the wrist areas included in the Sharp score were not assessable owing to incomplete ossification of the carpal bones. In these cases, each nonassessable area was arbitrarily assigned the average score of the assessable areas.
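The arithmetic of the total Sharp score can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes that a nonassessable area (coded here as `None`) receives the mean of the assessable areas of the same component (JSN or erosion) in the same hand; the text does not specify which areas enter that average.

```python
from statistics import mean

JSN_AREAS_PER_HAND = 18      # each area graded 0-4
EROSION_AREAS_PER_HAND = 17  # each area graded 0-5


def impute(scores):
    """Replace None (nonassessable area) with the mean of the assessable areas.

    Assumption: the average is taken within the list passed in.
    """
    assessable = [s for s in scores if s is not None]
    if not assessable:
        raise ValueError("no assessable areas to average")
    fill = mean(assessable)
    return [fill if s is None else s for s in scores]


def component_score(scores, n_areas, max_grade):
    """Sum the grades of one component (JSN or erosion) for one hand/wrist."""
    if len(scores) != n_areas:
        raise ValueError(f"expected {n_areas} areas, got {len(scores)}")
    filled = impute(scores)
    if any(not 0 <= s <= max_grade for s in filled):
        raise ValueError(f"grades must lie between 0 and {max_grade}")
    return sum(filled)


def total_sharp_score(jsn_left, jsn_right, erosion_left, erosion_right):
    """Total Sharp score: JSN (0-144) plus erosion (0-170), range 0-314."""
    jsn = (component_score(jsn_left, JSN_AREAS_PER_HAND, 4)
           + component_score(jsn_right, JSN_AREAS_PER_HAND, 4))
    erosion = (component_score(erosion_left, EROSION_AREAS_PER_HAND, 5)
               + component_score(erosion_right, EROSION_AREAS_PER_HAND, 5))
    return jsn + erosion
```

With every area at its maximum grade the function returns 314, which matches the stated upper bound of the score.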
The original Larsen method has been modified several times by its author. We used the version recommended for longitudinal observational studies (14). The grading scale ranges from 0 to 5 (0 = intact bony outlines and normal joint space, 5 = mutilating changes). The Larsen score ranges from 0 to 120.
In younger children, the changes in carpal bones and, to a lesser extent, in distal metacarpal epiphyses were frequently seen as deformity in shape, from squaring to squeezing to gross deformity, rather than as discrete erosions; in this case, bone shape deformity was considered equivalent to bony erosion and its severity was graded in the Sharp erosion and Larsen scores on the same 0–5 severity scale.
Because in childhood the degree of ossification and the width of joint spaces vary with age, the evaluation of time progression of JSN and bony erosion in a single patient is difficult; the same applies to comparison of films from patients of different ages. To facilitate assignment of Sharp and Larsen scores, we compared each study patient's radiograph with a wrist/hand radiograph from a healthy child of the same bone age. Radiographs from healthy boys and girls of all bone ages ranging from 1.5 to 16 years according to the atlas of Greulich and Pyle (17) were identified by reviewing a large sample of radiographs from children who had a bone age evaluation for short stature and were found to have a constitutional growth delay without endocrinologic abnormalities or who had a radiograph (disclosing no abnormalities) after a wrist/hand trauma.
The scoring method of Poznanski et al was developed for evaluating wrist/hand radiographs in children and has been assessed in patients with JIA and in children with congenital bone diseases (18). The Poznanski scoring method is based on the measurement of the radiometacarpal length, which is the distance from the base of the third metacarpal bone to the midpoint of the distal growth plate of the radius, and measurement of the maximal length of the second metacarpal bone. For each wrist, the number of SDs between the expected and the observed radiometacarpal length for the measured second metacarpal bone is calculated according to the formulas reported by Poznanski et al (18). The radiometacarpal/second metacarpal bone score, which constitutes the Poznanski score, reflects the amount of radiographic damage in the wrist. The more negative the Poznanski score is (that is, the shorter the radiometacarpal length is relative to the second metacarpal bone length), the more severe the radiographic damage. For each pair of wrists, the arithmetic sum of the 2 scores was used in the analyses.
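The structure of this calculation can be sketched as follows. The sketch deliberately does not reproduce the formulas of Poznanski et al (18): the expected radiometacarpal length is supplied by the caller as a function (`expected_rm`, a hypothetical name) together with its SD, and the numbers in the usage note are made up purely to show the sign convention.

```python
def wrist_sd_score(observed_rm, m2_length, expected_rm, sd):
    """Number of SDs between observed and expected radiometacarpal length.

    expected_rm: caller-supplied function mapping second metacarpal length
                 to the expected radiometacarpal length (not defined here).
    sd:          SD of the radiometacarpal length around that expectation.
    A shorter-than-expected radiometacarpal length gives a negative score.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (observed_rm - expected_rm(m2_length)) / sd


def poznanski_score(left, right, expected_rm, sd):
    """Arithmetic sum of the two wrist scores.

    left, right: (observed_rm, m2_length) pairs for each wrist.
    """
    return (wrist_sd_score(left[0], left[1], expected_rm, sd)
            + wrist_sd_score(right[0], right[1], expected_rm, sd))
```

For example, with an invented reference `expected_rm = lambda m2: 0.5 * m2` and `sd = 2.0`, wrists measured as `(20.0, 50.0)` and `(25.0, 50.0)` score −2.5 and 0, for a sum of −2.5.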
An overall semiquantitative score (radiologist score) was assigned to each radiograph by an experienced pediatric radiologist according to the following grading system: 0 = no abnormality, 1 = slight abnormality, 2 = definite JSN (no erosions), 3 = slight erosions, 4 = severe erosions, and 5 = mutilating abnormality or bony ankylosis. The radiologist was instructed to give each film the score of the most damaged area.
Two observers (FDD and OG) independently and simultaneously assigned the Sharp and Larsen scores to all study radiographs following a predefined order: the Sharp score first, the Larsen score second. Radiographs from each patient were read in sequential order, and previous radiographs and scores were available to observers when examining and scoring followup radiographs. Both observers were pediatric rheumatology fellows with 3 years of clinical experience in the field, but they were not familiar with radiographic scoring. Before the beginning of the study, the observers had a training session with the principal investigator (AR), who was a pediatric rheumatologist with ∼20 years of clinical experience and who was familiar with radiographic scoring, to gain experience with the Sharp and Larsen methods.
Interobserver reliability of each scoring method was assessed for all of the films read by the 2 observers. Intraobserver reliability was based on the scores of radiographs from a subset of 10 randomly selected patients whose films were read a second time in a blinded manner by the 2 observers (5 patients each) 6 weeks after the previous review.
The Poznanski score was measured independently by another observer (FR) who was a pediatric rheumatologist with 5 years of clinical experience and was familiar with this method. In our hands, the Poznanski score has proved quite reliable, as shown by the very high interobserver and intraobserver intraclass correlation coefficients (0.97–0.99) obtained in a previous study (9).
The radiologist score was assigned independently by one pediatric radiologist (MV) who was not familiar with radiographic scoring, but had much experience in examining radiographs of patients with JIA. Like the other observers, the radiologist read the radiographs in chronological order. All observers were blinded to patient clinical information.
Patient characteristics recorded at baseline included age at onset, sex, disease duration, and JIA subtype. The following clinical assessments were made at baseline and followup visits: number of joints with active disease, number of joints with restricted motion, functional ability assessment using the Italian version of the Childhood Health Assessment Questionnaire (CHAQ; 0 = best, 3 = worst) (19), and erythrocyte sedimentation rate (ESR; Westergren method).
Interobserver and intraobserver agreement for the Sharp and Larsen scores were analyzed by computing the intraclass correlation coefficient (ICC) (20) for both longitudinal score values and score changes between study time points. For the interpretation of ICC values, the following classification was used: <0.4 = poor agreement, 0.4 to <0.75 = moderate agreement, and ≥0.75 = good agreement (21). To visualize observer agreement, we plotted the scoring values (both absolute and changes) using the Bland and Altman method (22). The independent scores of the 2 observers for each radiograph were then averaged, and this average was used for the analyses.
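As an illustration of this step, the following sketch computes an ICC from a radiograph-by-observer table and applies the classification above. The form of the ICC is not stated in the text; the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is assumed here, and the function names are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows, one row per radiograph, one score per observer.

    Assumption: two-way random effects, absolute agreement, single rater.
    """
    n = len(ratings)        # number of radiographs
    k = len(ratings[0])     # number of observers
    if n < 2 or k < 2:
        raise ValueError("need at least 2 radiographs and 2 observers")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-radiograph mean square
    msc = ss_cols / (k - 1)                 # between-observer mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def classify_icc(icc):
    """Classification used in the text: <0.4 poor, 0.4 to <0.75 moderate, >=0.75 good."""
    if icc < 0.4:
        return "poor"
    if icc < 0.75:
        return "moderate"
    return "good"
```

Because this form measures absolute agreement, a constant offset between observers lowers the coefficient: for the toy ratings `[[1, 2], [3, 4], [5, 6]]` the result is 8/9 rather than 1.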
Because the ranges of the Sharp and Larsen scores differ, we compared the grading of joint damage for each method by normalizing each score by its possible range according to the following formula: (observed value − minimum value)/possible range × 100. Because no maximum and minimum values exist for the Poznanski score, the severity of damage measured by this method was normalized by the score range (−14 to 4) observed in a sample of >1,000 wrist radiographs obtained in our JIA population (unpublished observation). Because the Poznanski scores go in a direction opposite to the other scores, we used the following formula to obtain an ascending time curve: (maximum value − observed value)/possible range × 100. The comparison of score changes between study time points was also made after normalization of each observed change by the maximum possible change.
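The two normalization formulas can be written out as a short sketch. The function and constant names are hypothetical; the ranges are those given in the text (0–314 for the total Sharp score, 0–120 for the Larsen score, and the observed range −14 to 4 for the Poznanski score). Taking the maximum possible change to equal the possible range is an assumption.

```python
SHARP_RANGE = (0, 314)
LARSEN_RANGE = (0, 120)
POZNANSKI_OBSERVED_RANGE = (-14, 4)  # observed range, not a theoretical one


def normalize(observed, minimum, maximum):
    """(observed value - minimum value) / possible range x 100."""
    return (observed - minimum) / (maximum - minimum) * 100


def normalize_reversed(observed, minimum, maximum):
    """(maximum value - observed value) / possible range x 100.

    Used for the Poznanski score so that more damage gives a higher value.
    """
    return (maximum - observed) / (maximum - minimum) * 100


def normalize_change(change, minimum, maximum):
    """Observed change as a percentage of the maximum possible change.

    Assumption: the maximum possible change equals the possible range.
    """
    return change / (maximum - minimum) * 100
```

For instance, a total Sharp score of 157, a Larsen score of 60, and a Poznanski score of −5 all map to 50 on the normalized scale.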
Correlations among radiographic scores and between radiographic scores and clinical variables were assessed using Spearman's rank correlation coefficient. Because the Poznanski score is a measure of damage in the wrist alone, its correlation with the Sharp and Larsen scores was assessed by computing these scores only in the wrist areas. For the purposes of this analysis, correlations >0.7 were considered high, correlations ranging from 0.4 to 0.7 were considered moderate, and correlations <0.4 were considered low (23).
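A minimal sketch of Spearman's rank correlation and of the classification above follows. It uses only the Python standard library, assigns average ranks to ties, and applies the cutoffs to the absolute value of the coefficient (an assumption, made because the Poznanski score correlates negatively with the other scores). The function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rs: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def classify_correlation(rs):
    """>0.7 high, 0.4 to 0.7 moderate, <0.4 low (applied to |rs|)."""
    size = abs(rs)
    if size > 0.7:
        return "high"
    if size >= 0.4:
        return "moderate"
    return "low"
```

Under a strict reading of these cutoffs, a coefficient of exactly 0.70 falls in the moderate band.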
Because the primary objective of the study was the investigation of the level of agreement between observers in assessing the Sharp and Larsen scores, the ICC was chosen as indicator of reliability. By establishing the minimal acceptable level of reliability (ρ0) and the specified level of reliability (ρ1) between 2 observers at 2 different times at 0.2 and 0.6, respectively, we calculated that the minimum number of observations (radiographs) necessary to reach a power of 80% with a type 1 (alpha) error of 0.05 was 27 (24). Statistical analysis was performed with Statistica (StatSoft, Tulsa, OK).
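The sample size figure can be checked with a short sketch. It assumes the closed-form approximation for reliability studies described by Walter, Eliasziw, and Donner, with a one-sided alpha; whether this is the procedure of reference 24 is an assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def required_subjects(rho0, rho1, n_raters=2, alpha=0.05, power=0.80):
    """Approximate number of subjects needed to test H0: ICC = rho0 vs H1: ICC = rho1.

    Assumed approximation:
        k = 1 + 2 (z_alpha + z_beta)^2 n / ((ln C0)^2 (n - 1))
        C0 = (1 + n * theta0) / (1 + n * theta1),  theta = rho / (1 - rho)
    where n is the number of ratings per subject.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided test
    z_beta = NormalDist().inv_cdf(power)
    theta0 = rho0 / (1 - rho0)
    theta1 = rho1 / (1 - rho1)
    c0 = (1 + n_raters * theta0) / (1 + n_raters * theta1)
    k = 1 + (2 * (z_alpha + z_beta) ** 2 * n_raters
             / (math.log(c0) ** 2 * (n_raters - 1)))
    return math.ceil(k)


print(required_subjects(0.2, 0.6))  # prints 27
```

Under these assumptions the approximation gives about 26.7, which rounds up to the 27 observations reported above.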
A total of 25 patients (8 boys, 17 girls) eligible for the study were identified: 12 had systemic arthritis, 8 had polyarthritis (2 were rheumatoid factor positive), and 5 had extended oligoarthritis. The median age at disease onset was 4.8 years (range 1.7–12.1 years) and the median disease duration at baseline was 1.3 years (range 0.5–3.2 years). At baseline, the median number of joints with active disease was 8 (range 2–36), the median number of joints with restricted motion was 5 (range 0–36), the median CHAQ score was 0.3 (range 0–3), and the median ESR was 52 mm/hour (range 7–114 mm/hour). The number of longitudinal radiographs (baseline plus yearly radiographs) available for review was 6 in 21 patients and 5 in 4 patients, with the total number of radiographs being 146.
The interobserver and intraobserver agreement for the Sharp and Larsen methods, as assessed through the ICC, for all longitudinal score values and for the score changes between all study time points are shown in Table 1. The agreement on longitudinal scores was good by both methods, with all ICCs consistently >0.9. The reproducibility of change assessment was also good for the Sharp method, but only moderate for the Larsen method.
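|Score||Interobserver agreement: no. of radiographs||Interobserver agreement: ICC||Intraobserver agreement: no. of radiographs||Intraobserver agreement: ICC|
|Total Sharp score (longitudinal values)||146||0.97||54||0.97|
|Sharp JSN score (longitudinal values)||146||0.96||—||—|
|Sharp erosion score (longitudinal values)||146||0.96||—||—|
|Total Sharp score (score changes)||121||0.87||45||0.91|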
The Bland and Altman plot of absolute values of total Sharp score is shown in Figure 1. A maximum difference of 40 on a scoring range of 0–314 was demonstrated; the average difference between the 2 observers was 0.9 and the 95% limits of agreement of the difference between the 2 observers were −18.5, 20.3. The plot of score changes (data not shown) revealed a maximum difference of 31; the average difference between the 2 observers was −0.4 and the 95% limits of agreement of the difference between the 2 observers were −20, 19.2. For the Larsen score, the 95% limits of agreement were −9.7, 11.1 for the absolute values and −11.8, 11.6 for the score changes. The Bland and Altman plots of intraobserver agreement for both methods were very similar (data not shown).
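The quantities reported for the Bland and Altman analysis (average difference and 95% limits of agreement) can be computed as in the sketch below. It assumes the conventional limits of mean difference ± 1.96 SD of the differences; the multiplier is not stated in the text, and the function name is hypothetical.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b, multiplier=1.96):
    """Return (mean difference, lower limit, upper limit) for two observers.

    Assumption: limits of agreement are mean difference +/- 1.96 SD,
    with the sample SD of the paired differences.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both observers must score the same radiographs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    spread = multiplier * stdev(diffs)
    return bias, bias - spread, bias + spread
```

If the 1.96 multiplier was indeed used, limits of −18.5 and 20.3 around an average difference of 0.9 would correspond to an SD of the differences of about 9.9 Sharp score units.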
The median normalized values of the Sharp and Larsen scores at study time points are shown in Figure 2. All methods yielded a steady increase in scores during the study period. The total Sharp and Larsen scores showed a very close progression over time. The Sharp JSN scores increased more rapidly and remained consistently higher over time than the Sharp erosion scores.
The time course of the Sharp and Larsen scores computed only in the wrist areas and the time course of the Poznanski score are shown in Figure 3. At baseline, the Poznanski score captured more damage than the other methods. After the first year, the time course of the Poznanski score, which is essentially a measure of cartilage loss, became very close to that of the Sharp JSN score.
The serial wrist/hand radiographs of a representative patient who experienced significant progression of radiographic damage over time are presented in Figure 4. The Poznanski, Sharp, and Larsen scores of this patient are presented in Table 2.
|Score||Baseline (A)||1 year (B)||2 years (C)||3 years (D)||4 years (E)||5 years (F)|
|Total Sharp score||10||56||143||178||199||248|
The median normalized changes in Sharp and Larsen scores between the study time points are presented in Figure 5. The change for both methods was more marked in the first year of followup than in the subsequent period. However, the early score change appeared to be mostly due to cartilage loss rather than to bone damage, as shown by the greater change in the JSN than in the erosion component of the Sharp score. The rate of JSN change decreased markedly after the first year, whereas that of erosive change remained relatively constant from baseline to the third year of followup. The progression of radiographic damage tended to decrease after the third year and diminished markedly after the fourth year. This phenomenon was partially explained by the improvement of radiographic scores over time in some patients (data not shown).
The Spearman correlation between the longitudinal values of total Sharp and Larsen scores was high (rs = 0.96), as was the correlation between the Larsen score and the Sharp JSN and erosion scores (rs = 0.91 and 0.94, respectively). The radiologist score was correlated at borderline-high level with both the total Sharp score (rs = 0.70) and Larsen score (rs = 0.71) and at low-moderate level with the Poznanski score (rs = −0.45). The Poznanski score was moderately correlated with both the total Sharp score (rs = −0.63) and Larsen score (rs = −0.62), but correlation increased to −0.72 and −0.70, respectively, when the wrist-only versions of the Sharp and Larsen scores were analyzed. Spearman's correlations between radiographic score changes were similar to those observed for absolute score values (data not shown).
Spearman's correlation between radiographic score changes and clinical measures of JIA severity was calculated between the change in each score from baseline to final visit and the values of each clinical variable at final visit. Total Sharp, Larsen, and Poznanski score changes were moderately to highly correlated with the number of joints with active arthritis (rs = 0.63, 0.63, and −0.41, respectively) and restricted motion (rs = 0.57, 0.61, and −0.40, respectively) and with the CHAQ score (rs = 0.80, 0.70, and −0.68, respectively) and were poorly correlated with ESR (rs = 0.23, 0.24, and −0.36, respectively), whereas the radiologist score correlated poorly with all variables (rs = 0.02–0.12). Correlations between the score changes from baseline to the final visit and the corresponding score values at the final visit were in the moderate-to-high range for all methods (rs = 0.75 for total Sharp score, 0.83 for Larsen score, −0.87 for Poznanski score, and 0.63 for radiologist score).
Although it is commonly believed that JIA has a lesser destructive potential than adult rheumatoid arthritis, several studies have shown that many children with chronic arthritis experience significant radiographic joint damage (8, 9, 25, 26). Furthermore, a higher than expected percentage of children with chronic arthritis may have JSN and erosions early in their illness (27, 28). Radiographic changes are seen most frequently in patients with JIA who have a polyarticular course (25, 29, 30). The presence of polyarthritis is an essential requirement for patient inclusion in controlled trials of second-line or biologic agents (31–33).
It has been suggested that patients with JIA with polyarthritis and wrist disease are at high risk of experiencing radiographic progression (9, 34). The wrist, together with the hip, is the most vulnerable site of radiographic changes in patients with JIA (28, 29). In patients with JIA and polyarthritis, wrist disease is frequently associated with involvement of the small joints of the hands. Therefore, the wrist and hand joints represent suitable sites to investigate radiographic progression in patients with polyarticular JIA.
Little information exists on the use of standardized scoring systems of joint radiographs in JIA (5–11). Recent studies have shown that the Poznanski method is a reliable and sensitive instrument for assessing radiographic progression in clinical and research settings and in therapeutic trials (6, 7, 9). This method is easily applicable and reproducible and has important advantages for use in children: it is not dependent on the degree of ossification of the carpal bones, and normal standards are available. The Poznanski method has some disadvantages, however: it only measures cartilage damage (bone erosions are not captured), it is unreliable in case of advanced carpometacarpal joint destruction, and it cannot be used once there is radiographic closure of the growth plates of the second metacarpal bone (9).
In this study, we examined the applicability of the Sharp and Larsen scoring systems, which are widely used in adult rheumatoid arthritis, in 25 patients with polyarticular JIA. These measures are based on assessment of JSN and erosions in several areas of the wrist and hand joints. To overcome the difficulties in the assessment of cartilage loss and erosions in growing children, we compared each patient's radiograph with a wrist/hand radiograph obtained from a healthy child of the same sex and bone age. We chose bone age–related instead of age-related or size-related standards because patients with JIA frequently have advanced skeletal maturation (34, 35) and are small for their age (with their bones being correspondingly small), making these standards unreliable. Because in younger children the changes in carpal bones and, to a lesser extent, in distal metacarpal epiphyses were seen most frequently as deformity in shape rather than as discrete erosions, these deformities were considered equivalent to bony erosions and were graded on the same scale. This phenomenon is unique to JIA and is likely due to a combination of growth abnormalities, ossification of previous cartilage injury, and true bony erosions (35–37).
Under the experimental conditions chosen, the Sharp and Larsen methods proved quite reliable. Interobserver and intraobserver agreement on longitudinal score values were good for both scores; agreement on score changes was good for the Sharp score and moderate for the Larsen score. The overall good concordance among observers was confirmed with Bland and Altman analysis. The potential validity of the Sharp and Larsen scores was supported by the close relationship of their time course with that of an established scoring system for JIA (the Poznanski score) and by the strong correlation of their values and changes with those of the Poznanski and radiologist scores. Furthermore, the change in Sharp and Larsen scores from baseline to last visit (that is, the entire radiographic progression throughout the study period) was highly correlated with the severity of joint disease and the level of functional disability at last visit.
The rate of Sharp and Larsen score progression was more pronounced during the first year of observation and decreased thereafter. A greater progression early in the course of illness was also seen in a recent study on the Poznanski score (9). These findings support the view that radiographic damage occurs early in polyarticular JIA (25). We cannot exclude that a ceiling effect could partly account for the observed reduction in radiographic progression after the first year. However, the increase in the number of patients who improved over the years of study also played a role. It is well known that the regenerative capacity of articular cartilage is better in growing children than in adults (25, 35). The progression rate in the first year was more pronounced for the JSN score than for the erosion score of the Sharp method, implying that scoring systems that address cartilage loss are potentially more sensitive in detecting early destructive changes in JIA. Notably, the amount of damage captured by the Sharp and Larsen scores was relatively greater when these scores were applied to the wrist areas alone (Figure 3) than when the entire wrist and hand areas were evaluated (Figure 2), suggesting that the wrist joint is best suited for assessment of radiographic progression in patients with JIA.
Our study should be viewed in light of certain limitations. We chose a longitudinal design because we aimed at examining the reliability of scores in the assessment of radiographic progression. Reading of serial radiographs may have facilitated concordance among readers, whereas agreement on scoring of cross-sectional films might have been more difficult to achieve. Readers examined the radiographs in chronological sequence and were allowed to see the previous scores. The chronological scoring of films is less prone to measurement error, but this introduces the possibility of reader bias based on the expectation of increasing damage with time. However, blinding readers to the chronological order of children's radiographs is impossible due to readily apparent growth and maturation of the skeleton. Because the Sharp and Larsen scores were assessed simultaneously in each film, we could not establish whether one method might be more sensitive than the other to detect abnormal findings. We should recognize that because the Poznanski and the Sharp and Larsen scores assess radiographic damage in the wrist only and in the wrist/hand joints, respectively, our findings are of value only for patients with wrist and/or hand disease.
We conclude that in our cohort of patients with JIA, the Sharp and Larsen methods proved feasible and reliable, captured damage and its progression well, and correlated strongly with an established pediatric scoring system and with the clinical measures of disease severity. Additional analyses in larger patient samples and with different investigational design (i.e., cross-sectional) are needed to corroborate these findings. Furthermore, it should be established whether isolated measurement on the wrist/hand represents a good surrogate measure for severity of erosive joint disease in other joints throughout the body.