A novel ultrasonographic synovitis scoring system suitable for analyzing finger joint inflammation in rheumatoid arthritis

Abstract

Objective

To develop an ultrasonographic (US) synovitis scoring system suitable for evaluation of finger joint inflammation in patients with active rheumatoid arthritis (RA) and to compare semiquantitative US scoring with quantitative US measurements.

Methods

US was performed at the palmar and dorsal sides of the second through fifth metacarpophalangeal (MCP) and proximal interphalangeal (PIP) joints in 10 healthy subjects and in the clinically more affected hand in 46 RA patients. Ten patients additionally underwent magnetic resonance imaging (MRI). Synovitis was measured, standardized, and scored according to a semiquantitative method. The 2 methods (semiquantitative US scoring, quantitative US) were compared and statistical cutoffs were identified using receiver operating characteristic (ROC) curve analysis. MRI results were compared with semiquantitative US scoring and quantitative US results. The optimal US scoring method from 6 joint combinations was identified (ROC curve analysis).

Results

Synovitis was most frequently detected in the palmar proximal area (86% of affected joints). We found no significant differences between individual PIP joints or between individual MCP joints, indicating that all fingers within each of these joint groups should be treated equally for statistical calculations, although each joint group as a whole should be treated separately. The optimal cutoff point to distinguish between “health” and “pathology” was 0.6 mm both for MCP joints (sensitivity 94%, specificity 89%) and for PIP joints (sensitivity 90%, specificity 88%). There was no significant difference between semiquantitative US scores and quantitative US measurements. The best results for joint combinations were achieved using the “sum of 4 fingers” (second through fifth MCP and PIP joints) and “sum of 3 fingers” (second through fourth MCP and PIP joints) methods. Comparison of MRI results with semiquantitative US scores revealed high concordance.

Conclusion

US evaluation of finger joint synovitis can be considerably simplified by focusing on the palmar side and by applying semiquantitative grading instead of quantitative measurements. For evaluation of treatment efficacy based on synovitis in RA patients, we recommend using the “sum of 3 fingers” method in longitudinal trials.

Accurate assessment of disease activity and joint damage in rheumatoid arthritis (RA) is important for monitoring treatment efficacy and for predicting the outcome of the disease. It has been shown that in early RA, synovitis appears to be the primary abnormality, and bone damage occurs in proportion to the level of synovitis, but not in its absence (1). Proximal interphalangeal (PIP) and metacarpophalangeal (MCP) finger joints are usually among the first to be affected in RA, and findings in these joints are considered to be markers of overall joint damage in RA patients (2). Consequently, reliable assessment of these joints is of major importance (3).

The past 3 decades have seen extraordinary advances in medical imaging, and innovations such as computed tomography, ultrasound (US), and magnetic resonance imaging (MRI) have dramatically changed the way clinicians diagnose and manage disease in their patients (4). Conventional radiography has long been the standard method of identifying progressive joint damage in arthritis. This method is, however, insensitive in the detection of soft tissue changes, e.g., synovitis, and usually does not reveal early erosive lesions (5–7). MRI has shown efficacy in detecting early synovitis (6, 8–10), but it requires the use of contrast agents, is comparatively expensive, and involves long data acquisition times. US has also become an established imaging technique for diagnosis and followup of RA (6, 7, 11–15). As compared with clinical examination and conventional radiography, improved sensitivity for the detection of joint effusion, synovitis, and bone erosions with the use of US in RA joints has been described (7). This information is based mainly on the visualization of synovitis that was not detected by clinical examination or by conventional radiography (6, 7).

To date, MRI is the most extensively studied modality for detection of early arthritic changes (1, 8, 14, 16–18). Evaluations of the utility of US for this purpose are limited (4). Since US is a quick, inexpensive, and easily accessible technique for assessment of joint synovitis, it is highly relevant to consider the validity of US measures of RA joint inflammation (16).

Radiographic systems for scoring of erosions are well established (19–21), and systems for scoring of MRI-detected erosions and synovitis have recently been published (1, 9, 14, 18). A number of comparative imaging studies that have included US have been performed (3, 5–7, 22–25). However, US scoring methods with quantitative assessment of joint inflammation have not yet been standardized or appropriately validated (26, 27). There is a need for quantification of US-detected synovitis and evaluation of a US synovitis measurement system that can be easily applied, especially for evaluation of treatment efficacy in the new era of highly effective biologic agents.

The aim of the present study was to simplify and standardize US examination for detection of synovitis in finger joints in RA patients by developing a clinically suitable US measurement system. We identified the region of interest with regard to synovitis in finger joints and statistically defined the most useful cutoffs to distinguish between healthy and pathologic conditions. As a next step, we developed a US synovitis score. Quantitative measurements of synovitis were used, as was application of a semiquantitative score ranging from 0 to 3, and results obtained with the 2 techniques were compared. US data were compared with clinical examination results and were validated by comparison with MRI results in 10 patients. Six different ways of statistical analysis were used to compare quantitative US measurements with semiquantitative scoring.

PATIENTS AND METHODS

Patients and healthy controls.

Over a period of 6 months, 46 consecutive patients (37 women and 9 men; mean age 53 years, range 17–75 years) with RA according to the criteria of the American College of Rheumatology (formerly, the American Rheumatism Association) (28) were enrolled in the study. All patients were recruited from the rheumatology outpatient clinic at Charité University Hospital. The mean ± SD disease duration was 8.5 ± 8.2 years. All patients had been or were being treated with disease-modifying antirheumatic drugs, including 20 who were treated with biologic agents. Nine patients had early RA (disease duration <1 year).

For each patient, the hand that was more clinically affected with regard to soft tissue swelling and joint tenderness was studied. The study focused on finger joint synovitis. The second through fifth PIP and MCP joints were selected for evaluation. The first MCP joints and IP joints were excluded because, in our experience, evaluation of these joints has high interobserver variability and may be confounded due to the sesamoid bone.

For clinical evaluation, the 28-joint Disease Activity Score (DAS28) (29), representing overall disease activity, was ascertained, pain history was obtained, and the joints were physically examined. Special care was taken in precise clinical examination of the PIP and MCP joints, to facilitate one-to-one comparison with imaging data. A total of 368 joints were examined. In addition, serum markers of inflammation (C-reactive protein [CRP] and erythrocyte sedimentation rate [ESR]) were measured.

Ten controls (7 women and 3 men; mean age 37.5 years, range 25–60 years) without signs or symptoms of rheumatic disease were also studied. As in the patient group, the second through fifth PIP and MCP joints (80 joints) were studied. Clinical and US examinations were performed in exactly the same way as in the patient group.

The study was approved by the local ethics committee. All subjects provided informed consent prior to investigation.

Imaging methods.

All 46 RA patients underwent conventional radiography in 2 planes and US examination of the clinically more affected hand; 10 additionally underwent MRI. The reason for performing radiography was to exclude patients with extensive joint deformity, such as subluxation, luxation, ankylosis, and mutilation.

US was performed with an HDI 3500 high-end US system (Advanced Technologies Laboratories, Bothell, WA). We used a 10–5–MHz hockey-stick linear-array transducer for examination of the PIP and MCP joints. Two criteria for active inflammation were evaluated by US: joint effusion (visualized as a black, anechoic area), and thickening of the synovial membrane (visualized as hypo- or hyperechoic structures within the region affected by effusion). US evaluation was performed in 3 steps, as described below.

First, US was performed by one of the authors (AKS), as described in ref. 30, in the longitudinal and transverse planes (palmar and dorsal). In the longitudinal view, finger joints were divided into 4 quadrants: dorsal proximal, dorsal distal, palmar proximal, and palmar distal, and the location of synovitis was assessed. In most cases synovitis was detected in the palmar and proximal sites of the MCP and PIP joints. Only 14% of the affected joints did not show any signs of synovitis in the palmar proximal location. We therefore decided to evaluate the US images from the palmar side.

Second, the degree of joint effusion and hypertrophy was evaluated and classified on a 4-grade semiquantitative US scale according to a modification of the system described by Szkudlarek et al (3). Those authors presented separate scoring systems for synovitis and effusion (3), which is an excellent way to determine interobserver agreement. However, because the 2 phenomena appear concurrently in the majority of cases, and to simplify use in clinical practice, we combined synovitis and effusion into a single measure and adapted the scoring system as follows: when no anechoic, hypoechoic, or hyperechoic structure was visible, a semiquantitative score of 0 (no effusion/hypertrophy) was assigned; the larger the anechoic structure or the extent of synovial hypertrophy seen on US images, the higher the assigned score (1 = minimal effusion/hypertrophy; 2 = moderate effusion/hypertrophy; 3 = extensive effusion/hypertrophy) (Figure 1). Synovitis and effusion seen on US were evaluated from the palmar aspect with the hand in a neutral position. When the term “synovitis” is used below, it refers to the appearance of synovitis and/or effusion.

Figure 1.

Ultrasound images of the second proximal interphalangeal (PIP) joints of rheumatoid arthritis patients with different stages of synovitis. No erosions are visible in any of the images. Images were taken from the palmar side; the left side of the image is the proximal side of the hand and the right side the distal side. Effusion and synovitis of varying extents can be seen. Inflammation is seen mainly at the palmar proximal side of the PIP joint. All images were graded semiquantitatively with regard to the degree of inflammation, interpreted based on effusion and synovitis, as follows: 0 = none (a); 1 = little (b); 2 = moderate (c); and 3 = high (d). Measurements were performed in a standardized manner, at the proximal site perpendicular to the bone surface at the diaphysis at the point where the most synovitis was seen (on scale bars in b–d, hatch marks are 0.5 mm apart). Arrows indicate the margins of the synovitis.

Third, quantitative US measurements (mm) at the palmar side of the finger joint were performed according to a standardized protocol. Measurements were performed at the proximal site perpendicular to the bone surface at the diaphysis, at the point where most synovitis was seen (Figure 1).

Synovitis seen on stored US images was retrospectively scored, according to a semiquantitative measurement system, by 1 investigator experienced in musculoskeletal US (AKS), to ensure independence from the other scores. Quantitative measurements of synovitis on US images were evaluated by another examiner trained in musculoskeletal US (DP). When these evaluations were performed, both examiners were blinded with regard to diagnosis, clinical findings, and other US results. Results obtained with the 2 systems (semiquantitative US scoring and quantitative US) were compared, and statistical cutoffs for quantitative measurements were calculated. To determine interreader agreement, US images from 10 patients and 10 controls were rescored with both scoring systems (semiquantitative US, quantitative US) by another investigator with extensive experience in musculoskeletal US (MB), who was also blinded with regard to diagnosis, clinical findings, and other US results.

MRI, as an imaging gold standard, was performed in 10 patients by 1 of the authors (K-GAH). MRI was performed with a 0.2T unit (C-Scan; Esaote, Genoa, Italy). The protocol consisted of a T1-weighted spin-echo sequence, a STIR gradient-echo sequence, and a T1-weighted 3-dimensional gradient-echo sequence in coronal slice orientation before and after administration of gadolinium diethylenetriaminepentaacetic acid (0.2 mmoles/kg body weight). Reconstructions in transverse and sagittal section orientations were made. MRIs were evaluated using the RA MRI score (RAMRIS) as recommended by the Outcome Measures in Rheumatology Clinical Trials (OMERACT) group (1, 9, 18, 31). This semiquantitative score uses numerical values for synovitis (0 for unaffected joints; 1–3 for affected joints). These values were then compared with semiquantitative US and quantitative US findings.

Statistical evaluation and development of synovitis score.

For descriptive analysis, mean values, medians, minimum and maximum values, and standard deviations of numerical data were calculated.

Quantitative synovitis measurements were grouped according to the semiquantitative US score (0, 1, 2, or 3), which was used as the gold standard. Nonparametric repeated-measures analysis (32) was used to determine if there was a dependency of synovitis measurements by finger (second through fifth) (e.g., second PIP joint and second MCP joint should be treated equally) or within type of joint (joint group: either MCP or PIP).

Comparison of semiquantitative US scores and results of quantitative US of the diaphysis was performed applying receiver operating characteristic (ROC) analysis in the patient and control groups, for each finger and type of joint separately (33). ROC analysis resulted in the identification of optimal cutoffs between “healthy” and “pathologic” (synovitis). For calculation of cutoffs (mm), ROC analysis was used with semiquantitative US as the gold standard (e.g., cutoff between score 0 and 1: score 0 is considered as “healthy,” while all fingers with a score of 1, 2, and 3 are considered as “pathologic”). Cutoffs between the other scores were calculated accordingly. The Youden index (34) was used to extract these values from the ROC curves. In order to evaluate cutoffs between all scores, the following gold standards were used: cutoff between scores 0 and 1: score 0 is considered as “healthy,” while all fingers with a score of 1, 2, or 3 are considered as “pathologic”; cutoff between scores 1 and 2: fingers with scores of 0 or 1 are considered “healthy,” while scores 2 and 3 are considered “pathologic”; cutoff between scores 2 and 3: fingers with a score of 0, 1, or 2 are considered “healthy,” while fingers with a score of 3 are considered “pathologic.”
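
The cutoff selection described above can be sketched as follows. This is a minimal illustration and not the authors' software: the function name `youden_cutoff`, the sample measurements, and the convention that a value equal to the cutoff counts as “pathologic” are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def youden_cutoff(
    values: Sequence[float], is_pathologic: Sequence[bool]
) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A joint is called pathologic when its measurement is >= cutoff
    (the direction of the inequality is an assumption of this sketch).
    """
    n_pos = sum(1 for p in is_pathologic if p)
    n_neg = len(is_pathologic) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one healthy and one pathologic joint")

    best = None
    # Every observed value is tried as a candidate cutoff.
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, p in zip(values, is_pathologic) if p and v >= cutoff)
        tn = sum(1 for v, p in zip(values, is_pathologic) if not p and v < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical synovitis measurements (mm). Labels follow the 0/1 gold
# standard: False = semiquantitative score 0, True = score 1, 2, or 3.
measurements = [0.0, 0.2, 0.3, 0.4, 0.8, 0.6, 0.7, 0.9, 1.4, 2.0]
pathologic = [False] * 5 + [True] * 5

cutoff, sens, spec = youden_cutoff(measurements, pathologic)
print(f"cutoff={cutoff} mm, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

For the 1/2 and 2/3 cutoffs, the same function would be called with labels redefined according to the corresponding gold standard (for example, True only for scores 2 and 3).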

To evaluate a new US score suitable for evaluation of PIP and MCP joint synovitis in the clinically more involved hand in RA patients, 6 different statistical scores were applied for the 2 different methods (semiquantitative US scoring and quantitative US diaphysis measurement) and compared by ROC analysis. These scores were based on the following measurements: sum of 8 joints (second through fifth fingers) (s4), sum of 6 joints (second through fourth fingers) (s3), sum of 4 joints (second and third fingers) (s2), maximum of all values from 8 joints (second through fifth fingers) (Max), sum of 4 MCP joints (second through fifth fingers) (4MCP), and sum of 4 PIP joints (second through fifth fingers) (4PIP).
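
As an illustration of how the 6 scores are built from per-joint values, the following sketch computes them for one hand. The function name `composite_scores`, the dictionary layout (values keyed by finger number), and the example grades are hypothetical; the same function could be applied to quantitative measurements in mm.

```python
from typing import Dict

FINGERS = (2, 3, 4, 5)  # second through fifth fingers


def composite_scores(mcp: Dict[int, float], pip: Dict[int, float]) -> Dict[str, float]:
    """Compute s4, s3, s2, Max, 4MCP, and 4PIP from per-joint values."""

    def total(joints: Dict[int, float], fingers) -> float:
        return sum(joints[f] for f in fingers)

    return {
        "s4": total(mcp, FINGERS) + total(pip, FINGERS),      # 8 joints
        "s3": total(mcp, (2, 3, 4)) + total(pip, (2, 3, 4)),  # 6 joints
        "s2": total(mcp, (2, 3)) + total(pip, (2, 3)),        # 4 joints
        "Max": max(max(mcp[f], pip[f]) for f in FINGERS),     # largest single value
        "4MCP": total(mcp, FINGERS),                          # 4 MCP joints
        "4PIP": total(pip, FINGERS),                          # 4 PIP joints
    }


# Hypothetical semiquantitative grades (0-3) for one hand.
mcp_grades = {2: 1, 3: 0, 4: 0, 5: 0}
pip_grades = {2: 2, 3: 1, 4: 0, 5: 1}
scores = composite_scores(mcp_grades, pip_grades)
print(scores)
```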

For ROC analysis, the control group was considered to be “healthy” and the patient group to be “pathologic.” Area under the curve (AUC) and Youden index (34) were extracted from the ROC curves for each score and method (total 12 ROC curves). P values were calculated for the AUC in order to compare the 6 different scores by ROC analysis (35). Cutoffs between “healthy” and “pathologic” were extracted from the ROC curves (Youden index) for each score and method. Correlations between the US scores and the DAS28, and between the US scores and laboratory data (ESR and CRP), were determined using the Pearson correlation coefficient.
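
One way to make the AUC concrete is its rank interpretation: the AUC equals the probability that a randomly chosen “pathologic” hand has a higher score than a randomly chosen “healthy” hand, with ties counted as one-half. The sketch below uses that identity; it is not necessarily how the AUCs in this study were computed, and the function name `auc` and the sample scores are hypothetical.

```python
from typing import Sequence


def auc(pathologic_scores: Sequence[float], healthy_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 1/2)."""
    if not pathologic_scores or not healthy_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pathologic_scores:
        for h in healthy_scores:
            if p > h:
                wins += 1.0
            elif p == h:
                wins += 0.5
    return wins / (len(pathologic_scores) * len(healthy_scores))


# Hypothetical composite scores for 5 patients and 4 controls.
patients = [5, 3, 7, 2, 4]
controls = [0, 1, 2, 0]
print(auc(patients, controls))
```

An AUC of 0.5 would correspond to a score that does not separate the 2 groups at all, and 1.0 to perfect separation.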

As noted above, in 10 patients the second through fifth MCP and PIP joints of the same hand examined by US were additionally examined by MRI as the imaging gold standard, and were scored on a semiquantitative basis according to the OMERACT criteria (RAMRIS). Both semiquantitative US and quantitative US results were compared with MRI data. Evaluation of MRI and US data was performed with data from the palmar and dorsal sides as well as only the palmar side of the finger joint. For comparisons of the 2 imaging techniques, ROC analysis was applied when evaluating the presence of synovitis (0 = no synovitis; 1 = synovitis). For direct comparisons of MRI and semiquantitative US scores, the kappa statistic was applied.
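
The kappa statistic corrects the observed agreement between 2 sets of grades for the agreement expected by chance. The sketch below computes unweighted Cohen's kappa; whether an unweighted or a weighted variant was used in this study is not stated, so the unweighted form is an assumption, as are the function name `cohen_kappa` and the example grades.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(grades_a: Sequence[int], grades_b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for two equally long lists of grades."""
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(grades_a)
    # Proportion of joints given the same grade by both methods.
    observed = sum(1 for a, b in zip(grades_a, grades_b) if a == b) / n
    # Agreement expected by chance from the marginal grade frequencies.
    count_a, count_b = Counter(grades_a), Counter(grades_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods assigned one identical grade throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades (0-3) for 8 joints scored by US and by MRI.
us_grades = [0, 1, 1, 2, 3, 0, 2, 1]
mri_grades = [0, 1, 2, 2, 3, 0, 1, 1]
print(round(cohen_kappa(us_grades, mri_grades), 3))
```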

For determination of interreader agreement, a second reader (MB) performed semiquantitative measurements (semiquantitative US scoring) and quantitative measurements (quantitative US) in 10 patients and 10 controls. For quantitative diaphysis measurements, a Bland-Altman plot was used to compare results recorded by the 2 readers. Semiquantitative scoring was compared using the kappa statistic.
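
The quantities behind a Bland-Altman comparison can be sketched as follows. The limits of agreement (bias ± 1.96 SD) are the conventional summary for this type of plot and are included here for illustration only; they are not values reported in this study. The function name `bland_altman`, the 0.5-mm tolerance parameter, and the sample readings are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def bland_altman(reader1: Sequence[float], reader2: Sequence[float],
                 tolerance_mm: float = 0.5) -> dict:
    """Summarize paired readings: bias, limits of agreement, plot points."""
    if len(reader1) != len(reader2) or len(reader1) < 2:
        raise ValueError("need at least two paired readings")
    diffs = [a - b for a, b in zip(reader1, reader2)]           # y-axis
    averages = [(a + b) / 2 for a, b in zip(reader1, reader2)]  # x-axis
    bias = mean(diffs)
    sd = stdev(diffs)
    within = sum(1 for d in diffs if abs(d) <= tolerance_mm) / len(diffs)
    return {
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,
        "upper_loa": bias + 1.96 * sd,
        "fraction_within_tolerance": within,
        "points": list(zip(averages, diffs)),
    }


# Hypothetical synovitis measurements (mm) by two readers on 5 joints.
reader_a = [0.5, 1.2, 0.0, 2.0, 0.8]
reader_b = [0.4, 1.0, 0.2, 1.2, 0.9]
summary = bland_altman(reader_a, reader_b)
print(summary["bias"], summary["fraction_within_tolerance"])
```

If interreader agreement does not depend on the extent of synovitis, the plotted differences show no trend along the x-axis.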

RESULTS

Clinical and laboratory data.

Seventy-three of 184 MCP joints (39.7%) and 65 of 184 PIP joints (35.3%) were clinically swollen, while 64 MCP joints (34.8%) and 73 PIP joints (39.7%) were tender. There was no correlation between tender and swollen joints. The mean ± SD DAS28 score in the patient group was 4.87 ± 1.44, representing moderate disease activity. The mean ± SD ESR in the patient group was 25 ± 19 mm/hour, and the mean ± SD CRP value was 2.09 ± 2.43 mg/dl (normal <0.8). In the control group, none of the PIP or MCP joints were tender or swollen.

Imaging results.

Synovitis.

In the patient group, synovitis was detected most frequently at the palmar and proximal site of both the MCP and the PIP joints (in 86% of the affected joints overall). In only 14% of the affected joints was synovitis detected at the dorsal side without accompanying palmar synovitis. We therefore performed further evaluations from the palmar proximal side. In the control group as well, anechoic zones (interpreted as effusion) without synovial thickening could be detected only at the palmar proximal side, although to a much lesser extent than was observed in the patients.

The distribution of synovitis in the MCP and PIP joints of the patients, with semiquantitative grading, is presented in Figure 2. Interestingly, in contrast to clinical examination, US examination revealed more frequent involvement of PIP joints than MCP joints. The most frequent scores achieved in the patient cohort were 1 in PIP joints (38.3%) and 0 in MCP joints (56.1%).

Figure 2.

Detection of synovitis in metacarpophalangeal (MCP) joints and proximal interphalangeal (PIP) joints by ultrasound (US) examination and clinical examination in patients with rheumatoid arthritis. Note that synovitis in PIP joints was detected more frequently by US than by clinical examination. The distribution of synovitis as assessed using a semiquantitative US score (sUSS; range 1–3) is shown.

Extensive synovitis (grade 3) was more often detected in PIP joints compared with MCP joints. The mean ± SD synovitis measurement in the PIP joints was 1.25 ± 0.87 mm in the patient group and 0.385 ± 0.342 mm in the control group. Mean synovitis measurements in the MCP joints were 0.81 ± 1.2 mm and 0.25 ± 0.47 mm in the patients and controls, respectively.

There were significant differences in synovitis measurements between PIP and MCP joints (P < 0.002 by nonparametric repeated-measures analysis). However, within each joint group (MCP and PIP), there were no significant differences in terms of which finger was being assessed (second through fifth). Consequently, for interpretation of synovitis, for statistical calculations, and for further evaluation, only the type of joint is reported.

Box plots displaying quantitative measurements according to semiquantitative US score are shown in Figure 3. Semiquantitative US scoring and synovitis measurements (quantitative US) were performed by 2 different investigators who were blinded with regard to the other results. Nevertheless, as demonstrated in Figure 3, there was a clear concordance between the semiquantitative and the quantitative US results.

Figure 3.

Distribution of results obtained by quantitative US (qUS) in relation to scores obtained by semiquantitative US. Semiquantitative and quantitative US measurements were performed independently by 2 different examiners. A clear concordance between the results obtained using the 2 different measures can be seen. See Figure 2 for other definitions.

Optimal cutoffs to distinguish between healthy and pathologic (0/1) as well as the further thresholds (1/2, 2/3), with sensitivities and specificities, for semiquantitative US, were determined by ROC analysis and are displayed in Table 1. The optimal cutoff between “pathologic” and “healthy” was found to be 0.6 mm for both MCP joints (sensitivity 94%, specificity 89%) and PIP joints (sensitivity 90%, specificity 88%). Quantitative measurements for the further semiquantitative gradings were lower in PIP joints than in MCP joints (Table 1).

Table 1. Quantitative cutoffs between different semiquantitative US scores for each joint group (MCP or PIP)*

                      Semiquantitative US scores
                      0/1     1/2     2/3
MCP
  Synovitis, mm       0.58    1.75    2.08
  Sensitivity         0.94    0.96    1.00
  Specificity         0.89    0.96    0.94
PIP
  Synovitis, mm       0.63    1.15    1.83
  Sensitivity         0.90    0.90    0.78
  Specificity         0.88    0.93    0.94

* The cutoff between scores of 0 and 1 can be taken to distinguish between “healthy” and “pathologic.” US = ultrasound; MCP = metacarpophalangeal; PIP = proximal interphalangeal.
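
As an illustration of how the cutoffs in Table 1 could be applied, the sketch below maps a single quantitative measurement to a semiquantitative grade. The function name `grade_from_thickness` is hypothetical, and the rule that a measurement exactly equal to a cutoff takes the higher grade is an assumption, since the handling of boundary values is not specified.

```python
from bisect import bisect_right

# Cutoffs (mm) between grades 0/1, 1/2, and 2/3, taken from Table 1.
CUTOFFS_MM = {
    "MCP": (0.58, 1.75, 2.08),
    "PIP": (0.63, 1.15, 1.83),
}


def grade_from_thickness(joint_group: str, thickness_mm: float) -> int:
    """Map a synovitis measurement (mm) to a grade from 0 to 3.

    A value equal to a cutoff is assigned the higher grade (assumption).
    """
    if thickness_mm < 0:
        raise ValueError("thickness cannot be negative")
    # The number of cutoffs at or below the measurement is the grade.
    return bisect_right(CUTOFFS_MM[joint_group], thickness_mm)


# The same 1.2-mm measurement falls into different grades in the 2 joint groups.
print(grade_from_thickness("MCP", 1.2))
print(grade_from_thickness("PIP", 1.2))
```

The example shows why the 2 joint groups need separate cutoffs for the higher grades even though the 0/1 cutoff is nearly identical.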

Ultrasound synovitis score.

For semiquantitative US and quantitative US, of all 6 methods, the lowest AUC (0.69 semiquantitative US and 0.74 quantitative US) was achieved when the 4MCP method (sum of the second through fifth MCP joints) was applied. Using the Youden index, cutoffs of 1.2 (semiquantitative US score) and 0.3 (quantitative US) were determined, yielding a sensitivity of 45.7% by semiquantitative US scoring (quantitative US 71.7%) and a specificity of 90% by semiquantitative US scoring (quantitative US 70%). We therefore excluded the 4MCP method from further comparisons.

There were no significant differences in direct comparisons of the 5 remaining methods (P = 0.26 for quantitative US, P = 0.30 for semiquantitative US). However, the best AUCs were achieved when applying the s4 method (second through fifth MCP and PIP joints) and the s3 method (second through fourth MCP and PIP joints) for semiquantitative US scoring, both with an AUC of 0.90. Similarly good results were seen with the s2 method (second and third MCP and PIP joints) and the 4PIP method (second through fifth PIP joints) when semiquantitative US was applied (AUC 0.85 and 0.90, respectively). All cutoffs for the 5 different methods, with sensitivities and specificities for semiquantitative US and quantitative US, are shown in Table 2.

Table 2. Cutoffs between “healthy” and “pathologic” for the 5 different methods: sum of 8 joints in the second through fifth fingers (s4), sum of 6 joints in the second through fourth fingers (s3), sum of 4 joints in the second and third fingers (s2), sum of 4 PIP joints (4PIP), and maximum of all values from 8 joints (Max)*

               Semiquantitative US                   Quantitative US
               s4     s3     s2     4PIP   Max       s4     s3     s2     4PIP   Max
AUC            0.90   0.90   0.85   0.90   0.87      0.86   0.86   0.80   0.87   0.86
Threshold†     3.5    2.5    2.5    2.33   1.5       3.35   3.95   2.6    2.55   1.35
Sensitivity    0.76   0.80   0.61   0.80   0.67      0.76   0.63   0.65   0.76   0.78
Specificity    0.90   0.90   1.0    1.0    1.0       0.90   1.0    0.90   0.90   0.90

* AUC = area under the curve (see Table 1 for other definitions).
† Threshold values are scores for semiquantitative US and mm for quantitative US.

There were no significant differences between findings with the 2 techniques (semiquantitative US and quantitative US). However, all AUCs for the quantitative US measurements were slightly lower (mean 0.85) as compared with those for semiquantitative US.

We did not find any significant correlations between laboratory data (ESR, CRP) and US scores. Similarly, clinical findings as judged by the DAS28 did not correlate with US results.

MRI data.

Detection of synovitis from the palmar and dorsal side by US compared with detection by MRI showed a good concordance, with an AUC of 0.85 for MCP joints and 0.96 for PIP joints (evaluations performed by 2 different investigators who were each blinded with regard to the other result). Analyzing semiquantitative US findings solely at the palmar side compared with MRI findings at the palmar and dorsal sides led to an AUC of 0.72 for MCP joints and 0.88 for PIP joints. Kappa analysis showed the same tendency: comparing US palmar and dorsal findings with MRI palmar and dorsal findings led to a kappa coefficient of 0.69 for MCP joints and 0.78 for PIP joints, while comparing US palmar findings with MRI palmar and dorsal findings resulted in a kappa coefficient of 0.34 for MCP joints and 0.60 for PIP joints. Comparing US palmar findings and MRI palmar findings resulted in kappa coefficients of 0.69 for PIP and 0.70 for MCP joints.

Interreader agreement.

Comparison of the semiquantitative US scores recorded by the 2 readers revealed a high degree of concordance (mean kappa coefficient 0.88 for MCP joints and 0.93 for PIP joints). There was also a high rate of concordance in quantitative measurements by the 2 readers, as demonstrated in the Bland-Altman plot shown in Figure 4. The difference in the readings was grouped around 0, and 144 of 160 readings (90%) showed a difference of ≤0.5 mm. As seen in the Bland-Altman plot, interreader agreement was not dependent on the extent of the measured synovitis.

Figure 4.

Interreader agreement in scoring of synovitis by quantitative ultrasound. Quantitative diaphysis measurement was performed by 2 readers, and interreader agreement was assessed using a Bland-Altman plot in which the difference between the 2 readings was plotted against the average reading. The differences between the 2 readings were grouped around 0 (90% of the data lay in the interval between −0.5 and 0.5), and no trend could be seen (i.e., interreader agreement did not depend on the extent of the synovitis measured).

DISCUSSION

US has become a highly promising technique for the detection of joint inflammation in early RA. However, there remains a lack of standardization, and studies to define clear cutoffs between healthy and pathologic tissue have not previously been performed.

The first important observation in the present study pertained to the distribution of finger joint synovitis in MCP and PIP joints. We found that synovitis could be detected in the palmar and proximal sites of the finger joints in 86% of all fingers affected by synovitis. In 14% of affected joints, synovitis was visible at the dorsal side, while there was no palmar synovitis. This 14% is a proportion worth mentioning, but in relation to the time needed for evaluation from both sides of the finger joints, it is, in our opinion, negligible. We therefore recommend that a single US evaluation at the palmar side be considered sufficient for analysis of finger joint synovitis in, for instance, the setting of randomized clinical trials.

We performed MRI in 10 patients and compared the findings with US results to confirm the validity of synovitis detection by US in our study population. We found a good correlation between MRI and US for the detection of palmar synovitis (κ = 0.7). There is growing evidence that MRI is a valid method and can be used as the imaging gold standard against which other measures are compared (1, 14). Close correlation between findings with gadolinium-enhanced MRI and histologic findings has been demonstrated in knees (36–38) and MCP joints (39). Reliability of semiquantitative MRI scores has proven to be good, depending on the design of the study and reader training (9, 40). Thus, there is solid support for the notion that synovitis, as determined by gadolinium-enhanced T1-weighted MRI, represents true synovial inflammation.

US was more accurate than clinical examination in the detection of synovitis, which is consistent with the results of a number of earlier studies (3, 7, 24, 30, 41). However, we found more second and third MCP joints to be affected clinically than was evident by US. The reasons might be some limitations of clinical examination, causing overestimation of synovitis: in clinical examination, it is often not possible to clearly differentiate between joint synovitis, tenosynovitis, and “doughy” skin, especially in the MCP joints. Since the pathologic focus of the study was joint synovitis, US evaluation was performed with regard to this feature. Also, adding lateral and ulnar views might have slightly improved the sensitivity in detecting finger joint synovitis.

We used ROC analysis to identify optimal thresholds for distinguishing PIP and MCP joints with and without synovitis. Measurements were performed in a standardized manner (proximal site perpendicular to bone surface at diaphysis). The best cutoff to distinguish between “pathologic” and “healthy” was 0.6 mm for both MCP and PIP joints (second through fifth joints). However, cutoffs to distinguish between more advanced stages, as defined when applying semiquantitative measurements, were found to be higher for the MCP joints as opposed to the PIP joints, probably due to the larger anatomic measures (1.8 mm for grade 1 versus grade 2 and 2.1 mm for grade 2 versus grade 3 in MCP joints, 1.2 mm for grade 1 versus grade 2 and 1.8 mm for grade 2 versus grade 3 in PIP joints). Formerly, these cutoffs were achieved by calculating 2 standard deviations of the mean values in healthy tissue (16, 42), although this is only a descriptive way of presenting the data. Applying a 2-fold SD has several drawbacks caused by overlaps between the different scores and the underlying assumption of a normal distribution, which is clearly not present for US scores of 0. To define thresholds with sensitivities and specificities, it is necessary to perform ROC analysis (33), which has been applied with regard to musculoskeletal US for the determination of Achilles tendon thickness (43, 44) and hip joint synovitis (45).

One major goal of this study was to develop a validated scoring system for evaluation of finger joint synovitis. Hence, we investigated whether there was a dependency in synovitis measurements by finger or by type of joint, and we found no significant differences between individual PIP joints or between individual MCP joints. Therefore, PIP and MCP joints should be treated in the same manner within the “joint group,” for determination of cutoffs and for any statistical analysis. These findings aid in the simplification of a scoring system since cutoffs do not need to be calculated for each individual joint.

Semiquantitative scoring systems are frequently used in radiology, e.g., for MRI evaluation of synovitis and bone lesions (31). Different scoring systems have been established for conventional techniques in musculoskeletal imaging (19–21). The OMERACT group has agreed on a semiquantitative scoring system for MRI-measured synovitis in RA (1, 9, 18, 31). According to that system, the RAMRIS, synovitis is assessed in 3 regions of the wrist and all MCP joints; PIP joints are excluded. A scale of 0–3 is used, with a score of 0 considered normal, while scores of 1–3 indicate mild, moderate, or severe synovitis (31).

For US, 4 different 4-grade semiquantitative scoring systems have recently been presented (3). However, the US scoring systems introduced previously may have some limitations (26), since synovitis and effusion exist mostly in combination. We therefore applied a simplified semiquantitative scoring system that includes both parameters (although joints having only minor synovial thickening but moderate effusion will be graded the same as joints with very thickened synovium). Interestingly, in contrast to clinical examination, with US examination we saw more frequent and distinct involvement of PIP joints compared with MCP joints. We therefore believe it is necessary to include PIP joints for analysis of finger joint synovitis, although the RAMRIS system (31) does not require their inclusion. In this regard, US has advantages over MRI, since it is easier to perform examinations of all finger and hand joints by US than by MRI because the coils needed for MRI limit the joint area that can be examined. US measurement of synovial thickening is a valid indicator of joint inflammation. However, MRI studies have shown that in later disease, fibrosis may contribute to synovial membrane thickening, and when US studies are limited to B-mode US without a Doppler function, actively inflamed joints may be missed (46).

This is the first report of a study precisely comparing semiquantitative US scoring and quantitative US measurement. Six different statistical scores (s4, s3, s2, Max, 4MCP, and 4PIP) were used for the 2 different methods (semiquantitative US and quantitative US) and compared. As shown by ROC analysis, there was no significant difference between semiquantitative US and quantitative US findings. We therefore recommend semiquantitative US (score range 0–3) for analysis of finger joint synovitis. Quantitative US cutoffs should be used for differentiation between “pathologic” and “healthy” in evaluating single finger joints, e.g., joints that appear clinically swollen or tender. Comparison of different statistical scores revealed no significant differences between s4, s3, s2, Max, and 4PIP. Hence, all 5 statistical scores could be used for evaluation. However, the best AUC values (0.90) were achieved with the s3 method (sum of second through fourth PIP and MCP joints), s4 method, and 4PIP method. Since both MCP and PIP joints are often affected and good sensitivity/specificity pairs were achieved with the s3 method (Table 2), we suggest use of the s3 method with semiquantitative scoring for diagnosis and followup examination of RA synovitis. With this method, the optimal cutoff value to distinguish between “healthy” and “pathologic” was a score of 3, yielding a sensitivity of 80% and a specificity of 90%. However, further studies for evaluation of treatment efficacy applying the proposed score should be performed.
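The statistical scores compared above can be sketched as simple aggregations of per-joint semiquantitative grades. The definitions of s4 and s3 follow the text. Reading s2 as the sum over the second and third fingers, Max as the highest single-joint grade, and 4MCP/4PIP as the sums over the 4 joints of each group are assumptions made for illustration, as is treating an s3 score equal to the cutoff of 3 as "pathologic". The function names are hypothetical.

```python
def composite_scores(mcp, pip):
    """Compute the 6 summary scores from per-joint grades.

    mcp, pip: dicts mapping finger number (2-5) to a semiquantitative grade (0-3).
    """
    four = (2, 3, 4, 5)
    three = (2, 3, 4)
    two = (2, 3)  # assumption: s2 covers the second and third fingers

    def total(fingers):
        # Sum MCP and PIP grades over the selected fingers.
        return sum(mcp[f] + pip[f] for f in fingers)

    return {
        "s4": total(four),
        "s3": total(three),
        "s2": total(two),
        "Max": max(list(mcp.values()) + list(pip.values())),  # assumption
        "4MCP": sum(mcp[f] for f in four),  # assumption
        "4PIP": sum(pip[f] for f in four),  # assumption
    }


def s3_pathologic(scores, cutoff=3):
    """Apply the s3 cutoff of 3; counting a score equal to the cutoff is assumed."""
    return scores["s3"] >= cutoff


# Hypothetical grades for one hand, for illustration only.
mcp_grades = {2: 1, 3: 0, 4: 0, 5: 0}
pip_grades = {2: 2, 3: 1, 4: 0, 5: 1}

scores = composite_scores(mcp_grades, pip_grades)
print(scores, s3_pathologic(scores))
```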

The time needed for US examination of all 4 PIP and MCP joints from the dorsal and palmar views is quite short (∼5 minutes). However, documentation and quantitative measurement are time-consuming (∼15–20 minutes). Therefore, focusing only on the palmar view as well as reducing the number of joints examined to those in 3 fingers (6 joints) and semiquantitative scoring would considerably shorten the time needed for evaluation and documentation of finger joint synovitis. Our study provides evidence that it is possible to simplify US examination for evaluation of finger joint synovitis. However, US was performed with an imaging device (10–5-MHz linear-array transducer) with good B-image resolution, and results should be interpreted with knowledge of US device quality. The synovitis scoring method recommended (s3) has the potential to be effective especially for evaluation of treatment efficacy, which is of major importance since new therapeutic regimens are now widely available. However, the mean disease duration in our study was 8.5 years, and the data cannot necessarily be transferred to a population of patients with early RA. Further studies that include additional joints (e.g., hand joints), with evaluation of cutoffs by ROC analysis, are planned. While synovitis is among the first detectable pathologies in RA joints, there is also a need for scoring systems that address bone erosions, for long-term evaluation as well as longitudinal studies.

In conclusion, our data contribute to the validation and standardization of musculoskeletal US examination. It is possible to considerably simplify the US examination by focusing on the palmar side of the finger joints, especially for use in a clinical trial setting. The best cutoff between “pathologic” and “healthy” is 0.6 mm for MCP and PIP joints. Semiquantitative scoring yields the same results as quantitative synovitis measurement, and we suggest semiquantitative grading of the second through fourth MCP and PIP joints for assessment of finger synovitis. For evaluating synovitis and assessing treatment efficacy in RA, we recommend use of this “sum of 3 fingers” method (3 MCP and 3 PIP joints) in longitudinal studies.

Acknowledgements

We are grateful to Esaote (Genoa, Italy) for generously providing the MRI device.
