Items for developing revised classification criteria in systemic sclerosis: Results of a consensus exercise


Abstract

Objective

Classification criteria for systemic sclerosis (SSc; scleroderma) are being updated. Our objective was to select a set of items potentially useful for the classification of SSc using consensus procedures, including the Delphi and nominal group techniques (NGT).

Methods

Items were identified through 2 independent consensus exercises performed by the Scleroderma Clinical Trials Consortium and the European League Against Rheumatism Scleroderma Trials and Research Group. The first-round items from both exercises were collated and redundancies were removed, leaving 168 items. A 3-round Delphi exercise was performed using a 1–9 scale (where 1 = completely inappropriate and 9 = completely appropriate) and a consensus meeting using NGT was conducted. During the last Delphi round, the items were ranked on a 1–10 scale.

Results

In round 1, 106 experts rated the 168 items. Those with a median score of <4 were removed, resulting in a list of 102 items. In round 2, the items were again rated for appropriateness and subjected to a consensus meeting using NGT by European and North American SSc experts (n = 16), resulting in 23 items. In round 3, SSc experts (n = 26) individually scored each of the 23 items using an appropriateness score (1–9) and ranked their 10 most appropriate items for the classification of SSc. Presence of skin thickening, SSc-specific autoantibodies, abnormal nailfold capillary pattern, and Raynaud's phenomenon ranked highest in the final list, which also included items indicating internal organ involvement.

Conclusion

The Delphi exercise and NGT resulted in a set of 23 items for the classification of SSc that will be assessed for their discriminative properties in a prospective study.

INTRODUCTION

Systemic sclerosis (SSc; scleroderma) is a complex multisystem autoimmune disease whose pathogenesis is not completely understood (1). The 3 hallmark manifestations of the pathogenic process are vasculopathy of small vessels resulting in tissue ischemia, an immune response manifested as altered T and B lymphocyte function and production of autoantibodies, and fibroblast dysfunction leading to increased deposition of the extracellular matrix (2).

The current classification criteria for SSc were developed in 1980 by a subcommittee of the American College of Rheumatology (ACR) (3). The ACR criteria were not intended for diagnostic purposes but for inclusion of patients in clinical studies (Table 1). The ACR criteria were developed using patients with longstanding diffuse cutaneous SSc. As a consequence, patients with early SSc and a significant proportion of patients with limited cutaneous SSc (lcSSc) do not meet the current criteria (4–6).

Table 1. American College of Rheumatology preliminary classification criteria for SSc (adapted, with permission, from ref. 3)*

1980 preliminary criteria for the classification of SSc
Major criterion
 Proximal scleroderma: bilateral and symmetric sclerodermatous changes in any area proximal to the metacarpophalangeal joints or metatarsophalangeal joints
Minor criteria
 Sclerodactyly: sclerodermatous changes at fingers or toes only
 Digital pitting scars at fingertips or loss of distal finger pad substance (pulp loss)
 Bilateral basilar pulmonary fibrosis on chest radiograph
To be classified as SSc, a patient has to fulfill the major criterion OR at least 2 of the 3 minor criteria

* SSc = systemic sclerosis (scleroderma).
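
The decision rule in Table 1 (the major criterion, or at least 2 of the 3 minor criteria) can be written as a short Boolean check. The following is only an illustrative sketch: the function name and parameter names are hypothetical and do not come from the 1980 criteria document.

```python
def meets_acr_1980(proximal_scleroderma: bool,
                   sclerodactyly: bool,
                   pitting_scars_or_pulp_loss: bool,
                   bibasilar_fibrosis: bool) -> bool:
    """Illustrative encoding of the Table 1 rule (names are hypothetical).

    A patient is classified as SSc if the major criterion is present
    OR at least 2 of the 3 minor criteria are present.
    """
    # Booleans sum as 0/1, giving the number of minor criteria met.
    minor_count = sum([sclerodactyly,
                       pitting_scars_or_pulp_loss,
                       bibasilar_fibrosis])
    return proximal_scleroderma or minor_count >= 2
```

Under this rule a patient with sclerodactyly alone is not classified as SSc, which illustrates why patients with early or limited cutaneous disease can fall outside the 1980 criteria.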

Since the development of the 1980 criteria (3), knowledge regarding the association of SSc-specific autoantibodies with different SSc phenotypes has improved (4, 5). In addition, characteristic nailfold capillary changes have been found to be associated with the development of SSc (6, 7), and nailfold capillaroscopy is widely accepted as a diagnostic tool (7–9). In 2001, LeRoy and Medsger proposed to revise the classification criteria to include “early” cases of SSc (10). They suggested that patients may be classified as limited SSc (or “pre-SSc” or “unclassifiable SSc”) when having Raynaud's phenomenon (RP), an SSc-specific nailfold capillary pattern, and/or SSc-specific autoantibodies (10). The validity of the proposed criteria was supported in 2008 by a large cohort of patients with RP (7). The presence of SSc-specific autoantibodies and microvascular damage (as assessed by nailfold capillaroscopy) was predictive of the development of SSc: 80% of patients with RP who had both SSc-specific autoantibodies and capillary abnormalities developed SSc (study-specific definition) at 15 years (7). Also, there is now a greater understanding of the natural history of SSc (including internal organ involvement) and availability of better diagnostic tools (11–13).

Based on new knowledge about the pathogenesis of SSc and better diagnostic modalities to classify patients with early SSc, the ACR and the European League Against Rheumatism (EULAR) supported an international working group to revise the classification criteria for SSc. The main objective of the classification criteria was to distinguish patients with SSc from those without the disease. The international working group agreed that revised SSc classification criteria should meet additional requirements, and that they 1) should include the complete spectrum of SSc and should apply to patients that are early as well as late in the disease process; 2) should include vascular, immunologic, and fibrotic manifestations; and 3) should be feasible in daily clinical practice and clinical studies and be as close as possible to items used for diagnosis in clinical practice.

Currently, there is uncertainty about which items to include in classification criteria for SSc. To reduce the number of potential items to a set including the most promising items, we conducted consensus procedures using the Delphi technique and the nominal group technique (NGT) (14) to select a set of items as potentially useful for the classification of SSc.

MATERIALS AND METHODS

Design.

This study had 2 phases: item generation and subsequent item reduction. An internet-based Delphi approach was used for item generation, whereas internet-based Delphi rounds and a face-to-face meeting using the NGT were used for item reduction. These methodologies have recently been reviewed in detail (14).

International working group and expert panel.

The ACR/EULAR international working group for the classification of SSc consisted of 4 members from North America and 4 members from Europe. The international working group established a panel of expert diagnosticians in SSc to advise in the revision of SSc classification criteria. The expert panel consisted of members from both Europe (n = 14) and North America (n = 14).

Item generation process: Delphi exercise 1.

Potential items for classification criteria were identified through 2 internet-based consensus exercises, performed separately and independently by the Scleroderma Clinical Trials Consortium (SCTC) and the EULAR Scleroderma Trials and Research (EUSTAR) group (4). For the SCTC exercise, 96 randomly chosen SCTC members were invited to participate. During the first round of the 3-round SCTC Delphi exercise, participants were asked to nominate items that they incorporate in their daily practice to diagnose SSc (14). The EUSTAR exercise had a slightly different aim: to identify items suitable for diagnosis of early SSc (4). For the EUSTAR Delphi exercise, 121 items that were provided by 85 participants were subjected to a 3-round Delphi exercise. The first-round item lists of the 2 separate Delphi exercises were collated with removal of redundancies through consensus by 4 of the committee members, leading to a list of 168 items.

Item reduction process.

For item reduction, the list of 168 items was subjected to 3 rounds of a new internet-based Delphi exercise using standard methodology and a face-to-face consensus meeting with members of the expert panel using the NGT.

Delphi exercise 2.

Clinicians with expertise in scleroderma from the EUSTAR and SCTC were asked to participate by e-mail. One hundred six experts rated the appropriateness of the 168 items for classification of SSc on a 1–9 scale (where 1 = completely inappropriate and 9 = completely appropriate). The appropriateness score of an item was calculated as the median of the ratings (1–9). Items with a median appropriateness score of less than 4 in this round were removed, resulting in a list of 102 items.
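
The round 1 reduction step (median of the 1–9 ratings, with items below a median of 4 removed) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the item labels, and the ratings are all hypothetical.

```python
from statistics import median


def filter_by_median(ratings_by_item, cutoff=4):
    """Return {item: median rating} for items whose median is >= cutoff.

    Items with a median appropriateness score below the cutoff are
    dropped, mirroring the rule described in the text.
    """
    kept = {}
    for item, ratings in ratings_by_item.items():
        m = median(ratings)
        if m >= cutoff:
            kept[item] = m
    return kept


# Hypothetical ratings from 5 raters on the 1-9 appropriateness scale.
example = {
    "hypothetical item A": [9, 8, 9, 7, 9],  # median 9, kept
    "hypothetical item B": [2, 3, 1, 5, 3],  # median 3, removed
    "hypothetical item C": [4, 4, 6, 3, 5],  # median 4, kept
}
kept = filter_by_median(example)
```

The median is used rather than the mean so that a small number of extreme ratings does not decide whether an item is retained.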

NGT.

The 102 items were grouped into organ system domains and autoantibodies. Members of the expert panel met for a 1-day face-to-face NGT meeting in 2009. The day before the meeting, the available panelists (n = 16) again individually rated each of the 102 items for appropriateness. Median appropriateness scores were calculated for all items. To evaluate the level of (dis)agreement regarding the appropriateness of an item, the percentages of appropriateness scores falling in the lowest tertile (1–3) and the highest tertile (7–9) were calculated. During the NGT meeting, the 16 experts were presented with the item list and its scores and were encouraged to discuss each item in detail, and their opinions were recorded. This process ensured that all participants had an opportunity to contribute. The expert panel suggested by consensus that certain items be combined and others discarded, resulting in a list of 23 items.
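
The (dis)agreement measure described above, i.e., the share of ratings in the lowest (1–3) and highest (7–9) tertiles of the scale, can be sketched as below. The function name and the 16 ratings are hypothetical and serve only to illustrate the calculation.

```python
def tertile_percentages(ratings):
    """Return (% of ratings in 1-3, % of ratings in 7-9) for one item."""
    n = len(ratings)
    low = sum(1 for r in ratings if 1 <= r <= 3)
    high = sum(1 for r in ratings if 7 <= r <= 9)
    return 100 * low / n, 100 * high / n


# Hypothetical ratings from 16 panelists for a single item.
ratings = [2, 3, 3, 1, 5, 6, 4, 5, 7, 8, 9, 9, 7, 8, 6, 5]
low_pct, high_pct = tertile_percentages(ratings)
```

An item with sizable percentages in both tertiles would indicate disagreement among panelists even when its median falls in the middle of the scale.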

Delphi exercise 3.

Members of the expert panel (n = 26 of 28) individually scored each of the 23 items in this next internet-based Delphi round using the appropriateness score (1–9), and the median appropriateness score was calculated. In addition, the experts chose the 10 items from the list of 23 that they thought were most appropriate and important to include, and ranked those 10 items from most appropriate (rank 1) to least appropriate (rank 10), so that items with equal median appropriateness scores could be discriminated. The 23 items were ranked 1) in order of the median appropriateness score (1–9) and 2) in order of the average expert ranking (0–100%). The average expert ranking was calculated by giving scores to the ranks: the most favorite item scored 10 points and the least favorite item scored 1 point. For each item the rank scores were summed and divided by the theoretical maximum score (26 raters × 10 points = 260); they were then expressed as a percentage (100% maximum). For example, if 10 raters ranked a sign or symptom as 1 (10 points each), 10 ranked it as 2 (9 points each), and 6 ranked it as 3 (8 points each), the total score attributed to this sign or symptom is ([10 × 10] + [10 × 9] + [6 × 8]) = 238. The percentage attributed to this sign or symptom is 238/260 = 92%, and this percentage was used to determine the ranking of the particular sign or symptom.
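
The average expert ranking can be reproduced with a few lines of code. The sketch below uses hypothetical function and variable names and assumes that a rater who left an item out of the top 10 contributes 0 points for that item, which is implied by dividing by the theoretical maximum.

```python
def average_rank_percent(rank_counts, n_raters=26, top_n=10):
    """Average expert ranking of one item, as a percentage of the maximum.

    rank_counts maps a rank (1 = most appropriate) to the number of
    raters who gave the item that rank. Rank 1 earns top_n points and
    rank top_n earns 1 point; unranked items earn nothing (assumption).
    """
    total = sum((top_n + 1 - rank) * count
                for rank, count in rank_counts.items())
    return 100 * total / (n_raters * top_n)


# Worked example from the text: 10 raters gave rank 1, 10 gave rank 2,
# and 6 gave rank 3, i.e., 100 + 90 + 48 = 238 of 260 possible points.
worked_example = average_rank_percent({1: 10, 2: 10, 3: 6})
rounded = round(worked_example)
```

Because the denominator is fixed at 260, an item that only a few experts placed in their top 10 receives a low percentage even if those experts ranked it first.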

RESULTS

The Delphi exercise and NGT meeting resulted in 23 items (Table 2). The items from the 1980 SSc classification criteria (Table 1) are also included in this list but are not identical (Table 2). The 23 items include skin thickening, autoantibodies, nailfold capillary abnormalities, RP, and organ system involvement deemed feasible and appropriate to be included. The presence of scleroderma (skin thickening) had the highest average ranking and a median appropriateness score of 9. The other highly ranked items included autoantibodies (positivity for anti–topoisomerase I antibodies, anticentromere antibodies, and anti–RNA polymerase III antibodies), abnormal nailfold capillary pattern, fingertip ulcers, presence of RP, presence of interstitial lung disease, presence of renal crisis, and presence of palpable tendon or bursal friction rubs (see Table 3 for definitions).

Table 2. Results of round 3 Delphi exercise (n = 26) in order of ranking for appropriateness*

| Rank | Item | Median appropriateness score (1–9) | Appropriateness scores between 1 and 3, % | Average sum rank, % |
|---|---|---|---|---|
| 1 | Presence of scleroderma (skin thickening) | 9 | 0 | 67 |
| 2 | Positive anti–topoisomerase I antibody | 9 | 0 | 57 |
| 3 | Positive anticentromere antibody | 9 | 0 | 50 |
| 4 | Positive anti–RNA polymerase III antibody | 9 | 0 | 34 |
| 5 | Abnormal nailfold capillary pattern | 8 | 0 | 50 |
| 6 | Fingertip ulcers/pitting scars | 8 | 0 | 33 |
| 7 | Renal crisis | 8 | 0 | 28 |
| 8 | Raynaud's phenomenon | 7 | 4 | 27 |
| 9 | Interstitial lung disease/pulmonary fibrosis | 7 | 0 | 25 |
| 10 | Tendon or bursal friction rubs | 7 | 4 | 22 |
| 11 | Fingertip pulp loss or acroosteolysis | 7 | 4 | 7 |
| 12 | Esophageal dilatation on radiograph or CT | 7 | 16 | 9 |
| 13 | Calcinosis, dermal, subcutaneous, or intramuscular | 7 | 8 | 4 |
| 14 | Telangiectasia consistent with SSc | 6 | 4 | 11 |
| 15 | Puffy fingers | 6 | 8 | 10 |
| 16 | Pulmonary arterial hypertension | 6 | 13 | 8 |
| 17 | Positive antinuclear antibody | 5 | 12 | 9 |
| 18 | Flexion contractures of the fingers | 5 | 16 | 2 |
| 19 | Positive anti–PM-Scl antibodies | 5 | 28 | 0 |
| 20 | Reduced FVC percent predicted | 5 | 20 | 0 |
| 21 | Reduced DLCO percent predicted | 4 | 24 | 7 |
| 22 | Recurrent or persistent gastroesophageal reflux disease | 4 | 44 | 5 |
| 23 | Dysphagia for solid foods | 4 | 24 | 2 |

* The 23 items were rated according to their appropriateness (where 1 = most inappropriate and 9 = most appropriate) and raters indicated their 10 most favorite items from 10 (most favorite) to 1 (least favorite). Items were ranked in order of 1) the median appropriateness rating and 2) the top 10 favorites ranking. The median appropriateness score (1–9) was calculated from the scores of the raters. Ratings between 1 and 3 indicate the percentage of raters that rated the item as inappropriate. Top 10 favorites ranking indicates the average ranking of an item in the top 10, with higher percentages indicating a higher average rank (100% as maximum). CT = computed tomography; SSc = systemic sclerosis (scleroderma); FVC = forced vital capacity; DLCO = diffusing capacity for carbon monoxide.

Table 3. Proposed item definitions*

| Item | Definition |
|---|---|
| Abnormal nailfold capillary pattern consistent with SSc | Enlarged capillaries and/or capillary loss with or without pericapillary hemorrhages |
| Anticentromere antibody or centromere pattern on antinuclear antibody test | Positive according to local laboratory standards |
| Anti–topoisomerase I antibody | Positive according to local laboratory standards |
| Antinuclear antibody | Positive according to local laboratory standards |
| Anti–PM-Scl antibody | Positive according to local laboratory standards |
| Anti–RNA polymerase III antibody | Positive according to local laboratory standards. May not be available in all laboratories as part of ENA |
| Calcinosis | Detected either clinically or radiographically. Calcinosis is defined as palpable, dermal and/or subcutaneous, or intramuscular deposits. Usually located in fingers or toes or over large proximal joints or extensor surfaces of distal extremities |
| Reduced DLCO percent predicted | According to local laboratory standards, <80% of predicted or lower cut point |
| Digital pulp loss or acroosteolysis | Loss of substance from the fingertip pad as a result of ischemia rather than trauma or exogenous cause (3). Acroosteolysis: osteolysis of the distal phalanx/phalanges (27) |
| Dysphagia for solid foods | By history, a substernal discomfort on swallowing or sensation of food being held up or “stuck” in a retrosternal location |
| Esophageal dilatation | Esophagus dilatated by imaging (barium swallow, chest radiograph, or HRCT of the chest) |
| Finger flexion contractures | Inability to extend fingers to neutral position due to skin or tendon tightening; not due to arthritis deformities, Dupuytren's contracture, or other conditions |
| Fingertip ulcers or pitting scars | Ulcers or scars not thought to be due to trauma. Digital pitting scars are depressed areas at the tips as a result of ischemia rather than trauma or exogenous causes |
| Reduced FVC percent predicted | According to local laboratory standards, <80% of predicted or lower cut point |
| Interstitial lung disease or pulmonary fibrosis | Pulmonary fibrosis on HRCT or chest radiograph, most pronounced in the basilar portions of the lungs, or presence of “Velcro” crackles on auscultation |
| Persistent or recurrent gastroesophageal reflux disease | By history, endoscopy, or imaging |
| Puffy fingers | Swollen fingers or toes: a diffuse, usually nonpitting increase in soft tissue mass of the fingers or toes extending beyond the normal confines of the joint capsule. Normal fingers or toes are tapered distally with the tissues following the contours of the digital bone and joint structures. Swelling of the fingers or toes obliterates these contours |
| Pulmonary arterial hypertension | Diagnosed by right heart catheterization |
| Raynaud's phenomenon | Self-report or reported by a physician with at least a 2-phase color change in finger(s) and often toe(s) consisting of pallor, cyanosis, and/or reactive hyperemia in response to cold exposure or emotion; usually 1 phase is pallor |
| Renal crisis | New onset of a systolic BP ≥140 mm Hg and a diastolic BP ≥90 mm Hg, OR a rise in systolic BP ≥30 mm Hg compared to usual and rise in diastolic BP ≥20 mm Hg compared to usual, AND at least 1 of these features: 1) serum creatinine: increase of ≥50% above usual level; 2) proteinuria: ≥2+ by dipstick confirmed by protein:creatinine ratio > ULN; 3) hematuria: ≥2+ by dipstick or >10 RBCs/hpf (without menstruation); 4) thrombocytopenia: <100,000/mm³; 5) hemolysis (fragmented RBC by blood smear or increased reticulocyte count) |
| Scleroderma (skin thickening) | Skin thickening or hardening anywhere but not due to scarring after injury, trauma, etc. |
| Telangiectasias | In a scleroderma-like pattern, are round and well demarcated and found on the hands, lips, inside of the mouth, and/or are large mat-like telangiectasias. Are visible macular dilated superficial blood vessels that collapse upon pressure and fill slowly when pressure is released; distinguishable from rapidly filling spider angiomas with central arteriole and from dilated superficial vessels |
| Tendon or bursal friction rubs | One or more friction rubs detectable at places such as the shoulders, olecranon bursae, wrists (flexor or extensors), fingers (flexor or extensor), knees, and ankles (Achilles, peroneal, posterior tibial, or anterior tibial tendons) |

* SSc = systemic sclerosis (scleroderma); ENA = extractable nuclear antigen; DLCO = diffusing capacity for carbon monoxide; HRCT = high-resolution computed tomography; FVC = forced vital capacity; BP = blood pressure; ULN = upper limit of normal; RBC = red blood cell; hpf = high-power field.

Four items had a median score of 9 (most appropriate) (Table 2). Loss of fingertip pulp or acroosteolysis, esophageal dilatation (by imaging), and calcinosis fell below the highest 10 items but all had a median appropriateness score of 7. The other 10 items had median appropriateness scores between 4 and 6.

The median rating generally corresponded with the average rank. No items with a median appropriateness score of 4–6 had a “top 10” favorite score above 20%. There was minimal disagreement for the other items except for gastroesophageal reflux disease, for which more than 30% of appropriateness scores fell in the lowest tertile (1–3). There were no items with a median score below 4.

DISCUSSION

Classification criteria for SSc are being updated because the existing ACR classification criteria for SSc may not be sensitive for early SSc and a significant proportion of patients with lcSSc may not be classified as SSc with the previous criteria (2, 5, 6, 10, 15, 16). The results of these consensus procedures are a first step in the process. Using the Delphi and NGT, we have reduced the number of items to be considered in developing revised classification criteria for SSc in a prospective cohort study. The 23 items chosen during the NGT meeting include skin thickening, autoantibodies, nailfold capillary abnormalities, RP, and internal organ system involvement.

The 4 items in the existing ACR classification criteria for SSc (Table 1), i.e., proximal scleroderma (proximal to the metacarpophalangeal joints), sclerodactyly, digital pitting scars or pulp loss, and bilateral basilar pulmonary fibrosis, are also included in the list of 23 items and received relatively high rankings. Based on a better understanding of the role of SSc-specific autoantibodies and the presence of nailfold abnormalities as potential diagnostic criteria for SSc and considering the natural history of SSc, panelists also rated autoantibodies (antinuclear antibodies, anticentromere, anti–topoisomerase I, anti–RNA polymerase III, and anti–PM-Scl antibodies) high for appropriateness, after the presence of skin sclerosis. Other items not in the 1980 criteria that will be tested in the ongoing ACR/EULAR SSc criteria project include early vascular manifestations (RP, nailfold capillary pattern), skin and subcutaneous tissue manifestations (puffy fingers, calcinosis, telangiectasia), musculoskeletal manifestations (flexion contractures of the fingers, tendon or bursal friction rubs), and internal organ involvement (renal crisis, pulmonary arterial hypertension, interstitial lung disease [reduced forced vital capacity and reduced diffusing capacity for carbon monoxide], esophageal dilatation, gastroesophageal reflux disease, dysphagia).

Nearly all of the proposed items in the list of 23 can be considered to be frequently used to diagnose and assess patients with SSc in clinical practice (12, 17–19). For example, the presence of skin thickening is a hallmark of the clinical presentation of SSc and is present in 95% of patients (12). Other items such as the presence of RP and SSc-specific autoantibodies are characteristic of SSc (20). The absence of sclerodactyly, absence of RP, and/or absence of antinuclear antibodies and SSc-specific autoantibodies are used in clinical practice to differentiate SSc from scleroderma-like disorders such as eosinophilic fasciitis and scleromyxedema (20). Occurring in more than 90% of patients, RP often precedes skin and visceral fibrosis by years or decades, particularly in patients with lcSSc (7, 8, 21). Nailfold capillary microscopy is a useful noninvasive method of visualizing vascular abnormalities in SSc. It can be performed easily in the clinic without highly specialized equipment (22). The presence of dilated capillaries and pericapillary hemorrhage is very common in SSc, and the presence of these findings in patients with isolated RP predicts future evolution to SSc (7, 23). However, it is important to note that nailbed vascular changes are not unique to SSc because they can occur in other connective tissue diseases, especially dermatomyositis, in which RP is less frequent and the autoantibody profile is different. In late SSc, there may be much dropout of capillaries to the extent that no dilated capillaries can be seen by usual magnification.

SSc is also characterized by specific serum autoantibodies that include anti–topoisomerase I, anticentromere, anti-Th/To, anti–U3 RNP, anti–RNA polymerase I/III, and others (7, 24). These autoantibodies are known predictors of progression from isolated RP to SSc, internal organ involvement, and survival in SSc. Antibodies such as anti-Th/To and anti–U3 RNP, and in some cases RNA polymerase I/III, may have high specificity for SSc but are not yet commercially available.

While all 23 signs and symptoms are associated with SSc, the ultimate goal of this project is to find a combination of these items that is able to discriminate SSc from similar diseases. Some items that generally occur in patients with SSc, such as RP or positive antinuclear antibodies, may not discriminate well if they also frequently occur in other systemic connective tissue diseases. Their presence may nevertheless be useful in prompting the physician to begin a diagnostic evaluation in which SSc is just one of the diagnoses to be considered. Other items are quite specific for SSc, such as renal crisis or digital pulp loss, but their occurrence may be too low to be very useful in discriminating SSc from other diseases. In that case, it may be useful to combine certain items, for example, “having pulmonary arterial hypertension, interstitial lung disease, or renal crisis” or “positive for SSc-specific autoantibodies.” Moreover, classification criteria should apply to patients with established disease and ideally also to patients early in the disease process.

The items were tested in existing databases of SSc patients and patients with similar “SSc mimicking” disorders, such as systemic lupus erythematosus, myositis, or primary RP (25). Not all items were tested in all databases and not all “SSc mimicking” disorders were represented, but the prevalence in SSc and mimickers was compared and the discriminant validity of many items was very high.

The starting point of the Delphi had been an extensive list of items derived from the first rounds of 2 independent Delphi exercises by the EUSTAR and by the SCTC. The EUSTAR Delphi exercise resulted in the identification of 3 main domains for SSc diagnosis, i.e., skin, vascular, and laboratory, whereas RP, puffy fingers, and antinuclear antibodies were considered as “indicators” for referral to a specialist for potential SSc diagnosis (4). The SCTC Delphi exercise has been presented as an abstract (26).

We were able to conduct a Delphi exercise with a good response rate, using experts in the management of SSc from member institutions of the SCTC and EUSTAR. Eighty percent of SCTC members answered the first round and the response to the EULAR Delphi was 70%. Consensus methods such as Delphi and NGT are useful to apply if choices are to be made in the presence of considerable uncertainty. The proposed 23 items need further testing using data-driven methods and item reduction. Most importantly, the discriminatory capacity of the items should be tested prospectively in patients with SSc and in patients with diseases that mimic SSc such as eosinophilic fasciitis, scleromyxedema, systemic lupus erythematosus, polymyositis/dermatomyositis, primary RP, generalized morphea, and nephrogenic systemic fibrosis. However, some of these disorders may be even rarer than SSc. Once the univariate discriminatory ability of the items is known, further item reduction and weighting of items for a final criteria set may be required. Lastly, testing in an independent data set is needed.

In conclusion, many potential items for revised SSc criteria have been reduced to 23 items that include clinical, radiographic, and serologic features of SSc that capture the breadth and depth of the disease. These items are currently being validated in an international observational study for their inclusion in revised classification criteria for SSc.

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Khanna had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study conception and design. Fransen, Johnson, van den Hoogen, Baron, Czirják, Denton, Distler, Furst, Mueller-Ladner, Riemekasten, Sierakowski, Valentini, Veale, Vonk, Collier, Seibold, Tyndall, Matucci-Cerinic, Pope, Khanna.

Acquisition of data. Fransen, van den Hoogen, Baron, Allanore, Carreira, Czirják, Denton, Distler, Furst, Gabrielli, Herrick, Inanc, Kahaleh, Kowal-Bielecka, Medsger, Mueller-Ladner, Riemekasten, Sierakowski, Valentini, Veale, Vonk, Walker, Chung, Clements, Collier, Csuka, Merkel, Silver, Steen, Matucci-Cerinic, Pope, Khanna.

Analysis and interpretation of data. Fransen, Johnson, van den Hoogen, Baron, Allanore, Denton, Furst, Inanc, Medsger, Mueller-Ladner, Riemekasten, Sierakowski, Valentini, Veale, Vonk, Clements, Collier, Jimenez, Merkel, Seibold, Matucci-Cerinic, Pope, Khanna.
