Recently, the American College of Rheumatology (ACR) classification criteria for rheumatoid arthritis (RA), which were developed in 1987 (1), have been subjected to review by a joint task force of the ACR and the European League Against Rheumatism (EULAR). The aim of the review was to enable RA classification at an earlier disease stage compared to the 1987 ACR criteria, and the development of new criteria is an important step forward.
The development of the 2010 ACR/EULAR criteria comprised 3 phases. The first was a data-driven phase using findings in 3,115 patients from Europe and Canada. The second phase incorporated the expertise of 39 rheumatologists, and the third phase was a consensus phase undertaken by the same group (2–4). In coming years, the criteria will be studied in cohorts with different ethnic backgrounds and in dissimilar health care systems, in which the pretest probability for RA in new patients visiting rheumatologists differs.
The 2010 criteria are the first to include anti–citrullinated protein antibodies (ACPA) in addition to rheumatoid factor (RF). Presence of these autoantibodies can contribute substantially to the diagnosis of RA, for which ≥6 points are required; presence of ACPA or RF yields 2 points, and a high level of ACPA or RF yields 3 points. In the data-driven phase of criteria development, using data from several early arthritis cohorts, ACPA and RF were recognized as a theme in a factor analysis. Then, ACPA and RF were summarized as “serology.” Subsequently, the importance of serology, independent of other variables, was determined using a multivariate regression analysis. It was observed that within the group of patients with a positive serology, a level higher than the median received a higher weight than a level lower than the median. After the expert phase and the consensus phase, a high level was redefined as ≥3 times the reference value.
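The serology weighting described above can be sketched in a few lines. This is a minimal illustration, not the official scoring algorithm: the function name, the argument names, and the rule that "positive" means a value above the reference value are assumptions made for the example. Only the 2-point and 3-point weights and the cutoff of ≥3 times the reference value come from the text.

```python
def serology_points(acpa, rf, acpa_ref, rf_ref):
    """Return the serology score (0, 2, or 3 points) for one patient.

    acpa, rf          -- measured levels (hypothetical inputs)
    acpa_ref, rf_ref  -- reference (cutoff) values of the assays used
    Assumption: a result is "positive" when it exceeds the reference value.
    """
    def points(value, ref):
        if value >= 3 * ref:   # high level: >=3 times the reference value
            return 3
        if value > ref:        # positive, but below the high-level cutoff
            return 2
        return 0               # negative

    # The item is scored once; the higher of the two antibody results counts.
    return max(points(acpa, acpa_ref), points(rf, rf_ref))


if __name__ == "__main__":
    # Hypothetical patient: low-positive ACPA, negative RF.
    print(serology_points(acpa=10, rf=5, acpa_ref=7, rf_ref=20))
```

Because the high-level cutoff is tied to each assay's reference value, the same serum can fall on either side of the 3-point threshold depending on how that reference value was chosen.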
The present study investigated 2 main characteristics of the items defined as “serology,” particularly the RF level criterion, in the 2010 ACR/EULAR criteria for RA. The first characteristic was the discriminative ability of high levels of RF compared to ACPA for identifying early RA. Several studies have demonstrated an increased specificity for RA of a higher RF level compared to RF positivity (5, 6). However, an increased specificity for RA has also been observed for the presence of ACPA compared to the presence of RF (7). Thus far, the ability of increased RF levels to predict RA development has not been extensively compared with that of the presence of ACPA, notably anti–cyclic citrullinated peptide (anti-CCP) antibodies. In 3 separate prospective cohorts of patients with undifferentiated arthritis (UA) of recent onset from 3 different countries, RA development was studied in relation to baseline RF levels and ACPA. RA was diagnosed according to the 1987 ACR criteria (1). To verify that the results were not different when other outcome measures were used, analyses in patients with UA were repeated with arthritis persistence as the outcome measure. Furthermore, the same analyses were performed in RA patients, with the rate of joint destruction and the achievement of sustained disease-modifying antirheumatic drug (DMARD)–free remission as outcomes.
The second characteristic was the capacity of different assays to uniformly define a high RF level. Despite the existence of international units for RF, RF level measurement is not adequately standardized between different methods. Subsequent variations in RF levels may yield differences between laboratories with regard to the classification or diagnosis of RA. Therefore, we determined the degree of variation in RF levels obtained when the same RF-positive serum samples were tested by the methods that are currently most frequently applied (enzyme-linked immunosorbent assay [ELISA], nephelometry, and turbidimetry). Although previous studies have evaluated the correlations between results of the Rose-Waaler method and ELISA (8), data on head-to-head comparisons of currently applied methods are, to the best of our knowledge, not available.
Detailed knowledge of the individual items in the 2010 ACR/EULAR classification criteria for RA is necessary to optimally use these criteria in daily clinical practice. The inclusion of the item “low-positive RF” versus “high-positive RF” seems to hamper uniform application of the 2010 ACR/EULAR criteria.
In the present study, the test characteristics and prognostic ability of high RF levels were compared with those of the presence of ACPA in patients with early UA. The data, originating from 3 cohorts, revealed that the balance between positive LR and negative LR as well as between PPV and NPV was more favorable for ACPA positivity than for high RF level. These findings held both for the diagnosis of RA and for arthritis persistence. The same results were obtained when the severity of the course of RA was studied, which substantiated the findings.
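For readers less familiar with these measures, the following sketch shows how positive and negative likelihood ratios (LR) and predictive values (PPV, NPV) follow from a 2 × 2 table of test result against outcome. The function name and the counts in the usage example are hypothetical and are not data from the cohorts studied here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute test characteristics from a 2 x 2 table.

    tp -- test positive, outcome present (e.g., developed RA)
    fp -- test positive, outcome absent
    fn -- test negative, outcome present
    tn -- test negative, outcome absent
    Note: lr_pos is undefined (division by zero) when specificity is 1.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),  # positive LR
        "lr_neg": (1 - sensitivity) / specificity,  # negative LR
        "ppv": tp / (tp + fp),                      # positive predictive value
        "npv": tn / (tn + fn),                      # negative predictive value
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in diagnostic_metrics(tp=40, fp=10, fn=60, tn=90).items():
        print(f"{name}: {value:.2f}")
```

A test with a high positive LR but a negative LR close to 1 rules the outcome in well but does little to rule it out, which is why the balance between the two measures is considered rather than either one alone.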
The main outcome measure used in the current study was the development of RA according to the 1987 ACR criteria. An advantage of these criteria is that they could be uniformly applied in the different cohorts in Germany, Norway, and The Netherlands. In light of the new 2010 ACR/EULAR criteria, however, this outcome measure may seem to be an outdated definition of RA. Obviously, the 2010 ACR/EULAR criteria could not be used for the purpose of the present study because of circularity; both the presence of ACPA and RF level are part of these criteria. Using methotrexate (MTX) treatment as the outcome measure, as was done when deriving the 2010 ACR/EULAR criteria for RA, has limitations as well. The Leiden cohort began including UA patients in 1993, and at that time DMARDs were infrequently prescribed in early UA. Hence, there are differences in MTX prescription depending on the inclusion year, which impairs fair comparisons. In addition, MTX is prescribed for other diagnoses, such as psoriatic arthritis. An alternative outcome is expert opinion with regard to the presence of RA. However, expert opinion is likely not independent of the 1987 ACR criteria for RA. Having worked with the 1987 ACR criteria for ∼20 years, clinicians may, consciously or unconsciously, refer to these criteria in their judgments. In the present study, comparable findings were obtained using RA development, arthritis persistence, or RA severity as the outcome measure, suggesting that the findings were not dependent on the use of one particular outcome measure.
Two definitions of high RF level were studied in 3 cohorts. The definitions were RF50 (the definition of high RF level used in previous publications), and 3 times the reference value (the definition of high RF level used in the 2010 ACR/EULAR classification criteria for RA). It was observed that the posttest probabilities (PPV and NPV) varied between the cohorts. For example, the NPV was highest in the NOR-VEAC and lowest in the Berlin EAC. These values are influenced by the different percentages of UA patients who developed RA during the observation period (the pretest probability). Despite this difference, the same differences between the predictive ability of RF level and the predictive ability of ACPA were observed in all 3 cohorts, strengthening the findings. The sensitivities and specificities for high RF levels differed between the cohorts as well. This may be due partly to the different cutoff levels used to define RF positivity. RF50 may be a 2-fold increase compared to the cutoff value in some cohorts (as was the case in the Berlin EAC and the NOR-VEAC), but it may be a 10-fold increase when other methods are applied (as was the case in the Leiden EAC). Although this argument may apply to a lesser extent when the definition of high RF level of 3 times the reference value is used, in this case the stringency with which the reference value is chosen (according to manufacturer instructions or to in-house reference groups) may also affect the test characteristics. The differences in test characteristics of the presence of ACPA were smaller than for RF level.
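The dependence of the posttest probabilities on the pretest probability can be made explicit with Bayes' rule. The sketch below holds sensitivity and specificity fixed and varies only the pretest probability; the function name and all numerical values are hypothetical and chosen purely for illustration.

```python
def predictive_values(sensitivity, specificity, pretest):
    """Return (PPV, NPV) for a test applied at a given pretest probability.

    All arguments are proportions between 0 and 1. The four terms below are
    the expected fractions of the population in each cell of the 2 x 2 table.
    """
    tp = sensitivity * pretest
    fn = (1 - sensitivity) * pretest
    fp = (1 - specificity) * (1 - pretest)
    tn = specificity * (1 - pretest)
    return tp / (tp + fp), tn / (tn + fn)


if __name__ == "__main__":
    # Same hypothetical test, two hypothetical cohorts that differ only in
    # the proportion of UA patients who go on to develop RA.
    for pretest in (0.2, 0.5):
        ppv, npv = predictive_values(0.5, 0.9, pretest)
        print(f"pretest={pretest:.1f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With identical sensitivity and specificity, a higher pretest probability raises the PPV and lowers the NPV, so differences in PPV and NPV between cohorts do not by themselves indicate differences in assay performance.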
Another factor that may contribute to differences in measured RF levels, and thus in the resulting test characteristics, is the technique used to measure RF. ELISAs were used to measure RF in all cohorts investigated in this study. Generally, there are several variants of each technique, including both in-house and commercially available kits. The manufacturers of these commercially available tests have not fully standardized their kits against a reference kit with regard to detection and quantification of RF. International units (IU/ml) have previously been established, but they yield standardized results only when the Boehringer nephelometer is used. The prevalent methods also differ with regard to the origin of the antibodies that are directed against RF (human or rabbit) and the isotypes of the antibodies that are tested. Nephelometry usually measures complexes of IgM, IgG, and IgA RFs, whereas ELISAs are specifically directed against one isotype, for instance, IgM-RF.
Appropriate and uniform application of the RF level criterion of the 2010 criteria for RA requires harmonization of all available RF tests. Efforts to harmonize RF determinations have been undertaken by Dutch and European task forces. In The Netherlands, a standard serum consisting of pooled serum from RF-positive patients (RELARES) was developed. However, as shown in the present study (Figure 3C), this did not result in better reproducibility between laboratories. Considerable variability was still observed, not only between various methods for determining RF (such as ELISA, nephelometry, and turbidimetry), but also between different laboratories using the same method. Considering the present difficulties, it is unlikely that worldwide standardization of RF measurement will be achieved in the short term. This study did not address the possibility of standardizing anti-CCP level measurements. In our experience, harmonizing ACPA measurements may be less complicated (data not shown). Therefore, assuming that a modification of the 2010 ACR/EULAR criteria will be undertaken in the future, we propose omitting the RF level and using only ACPA, with different weighted scores for ACPA positivity and ACPA level.
In conclusion, defining a high RF level is complicated due to the variation in RF levels obtained when different methods are applied. This problem hampers uniform application of the 2010 ACR/EULAR criteria for RA. The results of the present study revealed that the overall prognostic ability of ACPA positivity outweighs that of high RF level in patients with UA. For this reason, we suggest that a future modification of the classification criteria for RA should include ACPA determination but not RF level.
AUTHOR CONTRIBUTIONS
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. van der Linden had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. van der Linden, Batstra, Bakker-Jonges, Burmester, Huizinga, van der Helm-van Mil.
Acquisition of data. van der Linden, Batstra, Bakker-Jonges, Detert, Bastian, Scherer, Burmester, Mjaavatten, Kvien, Huizinga, van der Helm-van Mil.
Analysis and interpretation of data. van der Linden, Batstra, Bakker-Jonges, Toes, Huizinga, van der Helm-van Mil.