The American College of Rheumatology (ACR; formerly, the American Rheumatism Association) 1987 criteria for the classification of rheumatoid arthritis (RA) were initially devised to distinguish subjects with established RA from subjects with other established rheumatic conditions (1) for the purposes of clinical and epidemiologic studies. Based on the prevalence of key disease features in groups of subjects with and without RA, 2 sets of criteria were proposed. One approach requires the presence of at least 4 of a list of 7 criteria (Table 1). The other approach is based on a decision tree, in which classification of subjects as RA positive requires membership in 1 of 5 subgroups based on different combinations of the disease features studied (Figure 1).
Table 1. Standard list format of the American College of Rheumatology 1987 criteria for the classification of rheumatoid arthritis*
|Criterion||Definition|
|Morning stiffness||Morning stiffness in and around the joints, lasting at least 1 hour|
|Arthritis of ≥3 joint areas||Soft-tissue swelling or fluid in at least 3 of the following areas: the left or right PIP, MCP, wrist, elbow, knee, ankle, or MTP joints|
|Arthritis of the hand joints||Swelling of wrist, MCP, or PIP joints|
|Symmetric arthritis||Simultaneous involvement of the same joint areas (as above) on both sides of the body (at least 50% of affected joint areas affected symmetrically)|
|Rheumatoid nodules||Subcutaneous nodules present|
|Rheumatoid factor||Detected by a method that yields positive findings in <5% of normal controls|
|Radiographic changes||Erosions or unequivocal bony decalcification localized to the joints of the hands and wrists|
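The 4-of-7 list rule in Table 1 can be illustrated with a minimal sketch. The criterion labels and function names below are hypothetical shorthand, not part of the published criteria, and the sketch assumes that a criterion with missing data is counted as absent.

```python
from typing import Mapping, Optional

# Hypothetical shorthand labels for the 7 criteria listed in Table 1.
CRITERIA = (
    "morning_stiffness",
    "arthritis_3_joint_areas",
    "hand_arthritis",
    "symmetric_arthritis",
    "rheumatoid_nodules",
    "rheumatoid_factor",
    "radiographic_changes",
)


def count_criteria(features: Mapping[str, Optional[bool]]) -> int:
    """Count criteria recorded as present.

    Assumption: a criterion that is missing or None (not assessed)
    is treated as absent.
    """
    return sum(1 for name in CRITERIA if features.get(name) is True)


def is_ra_by_list(features: Mapping[str, Optional[bool]], threshold: int = 4) -> bool:
    """Classify as RA positive when at least `threshold` of the 7 criteria are met."""
    return count_criteria(features) >= threshold


# Illustrative subject (made-up values): 3 criteria present, RF not tested.
subject = {
    "morning_stiffness": True,
    "arthritis_3_joint_areas": True,
    "hand_arthritis": True,
    "symmetric_arthritis": False,
    "rheumatoid_factor": None,
}
print(is_ra_by_list(subject))  # False
```

In this sketch a single untested criterion can keep a subject below the threshold, which is why the handling of missing data matters for prevalence estimates.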
Figure 1. Decision tree format of the American College of Rheumatology 1987 criteria for the classification of rheumatoid arthritis (RA), as applied to patients with inflammatory polyarthritis (IP). MCP = metacarpophalangeal; RF = rheumatoid factor.
Using the decision tree, if the status of erosions and/or rheumatoid factor (RF) is unknown, the surrogate variables of swelling of the metacarpophalangeal (MCP) joints (for radiologic erosions) and swelling of the wrist (for RF positivity) could be substituted without loss of validity in the original data set (1). This substitution option is potentially helpful in epidemiologic and clinical studies that rely on retrospective chart review, since the results of these investigations are frequently unavailable. There have, however, been few published reports comparing the performance of the list approach with that of the tree approach, nor have there been many studies assessing the influence, if any, of making these substitutions. Studies have suggested that the decision tree strategy is more sensitive (2). However, it is likely that any disagreement between the tree and list definitions is due to the use of surrogates, since the surrogate option is not applicable in the list definition. With the list approach, missing data on RF status or radiologic findings have to be considered as negative, reducing the RA prevalence estimates and the number of subjects available for study.
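The substitution described above can be sketched as a simple fallback, assuming that a surrogate is consulted only when the real result is unknown. The function and argument names are hypothetical, and the decision tree itself is not reproduced here.

```python
from typing import Dict, Optional


def resolve_feature(real: Optional[bool], surrogate: Optional[bool]) -> Optional[bool]:
    """Return the measured value when it is known, otherwise the surrogate."""
    return real if real is not None else surrogate


def tree_inputs(
    rf: Optional[bool],
    erosions: Optional[bool],
    wrist_swelling: Optional[bool],
    mcp_swelling: Optional[bool],
) -> Dict[str, Optional[bool]]:
    """Prepare the RF and erosion inputs for the tree.

    Wrist swelling stands in for an unknown RF result, and MCP joint
    swelling stands in for an unknown erosion status.
    """
    return {
        "rf": resolve_feature(rf, wrist_swelling),
        "erosions": resolve_feature(erosions, mcp_swelling),
    }


# RF was measured and negative, so wrist swelling is ignored;
# no radiograph is available, so MCP joint swelling is used instead.
print(tree_inputs(rf=False, erosions=None, wrist_swelling=True, mcp_swelling=True))
```

Because a measured value always takes precedence in this sketch, a subject classified through a positive surrogate can change classification once the real result becomes available and is negative.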
An additional concern is whether the ACR 1987 criteria, in either format, are appropriate for classifying subjects shortly after disease onset. The criteria, when originally developed, were applied to subjects with longstanding disease and are considered suitable for selecting such subjects for inclusion in, for example, clinical trials. Their ability to identify which individuals with new-onset inflammatory polyarthritis (IP) can be classified as having RA is more problematic (3). Indeed, it has been shown that applying the ACR criteria to patients with new-onset IP is not useful for predicting persistent, disabling, or erosive disease (4). One reason for this may be that a number of the individual criteria, for example, nodules and erosions, may not be present initially and develop only over time. Therefore, the performance of the criteria changes with disease duration (5). In contrast, other disease features, such as joint swelling, fluctuate with time (and in response to therapy). It is therefore advisable to use a cumulative approach to disease classification, and as a consequence, in any prospective study of new-onset IP, the proportion of subjects classified as having RA will increase over time.
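One possible reading of a cumulative approach is that a feature counts as present once it has been recorded at any visit up to the current one. The sketch below illustrates that reading only; the function name and feature labels are hypothetical.

```python
from typing import Dict, Iterable, List


def cumulative_features(visits: Iterable[Dict[str, bool]]) -> List[Dict[str, bool]]:
    """Running OR over visits.

    Assumption: a feature is treated as present from the first visit
    at which it is recorded, even if it later resolves.
    """
    seen: Dict[str, bool] = {}
    history: List[Dict[str, bool]] = []
    for visit in visits:
        for name, present in visit.items():
            seen[name] = seen.get(name, False) or bool(present)
        history.append(dict(seen))  # snapshot of the cumulative status
    return history


# Made-up example: swelling resolves by year 1, nodules appear by year 1.
visits = [
    {"joint_swelling": True, "nodules": False},
    {"joint_swelling": False, "nodules": True},
]
for year, status in enumerate(cumulative_features(visits)):
    print(year, status)
```

Under this reading the number of features counted for a subject can only stay the same or rise across visits, so the proportion of subjects meeting a fixed threshold cannot fall.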
In this study, we compared the use of the decision tree with the use of the list approach for the classification of subjects as RA positive among individuals with new-onset IP who were followed up for 5 years. We compared, over the period of followup, the proportion of subjects classified as having RA by each definition, as well as the agreement between the 2 approaches. In addition, we investigated the effect of using the surrogate variables on the tree definition by addressing 2 questions: 1) Compared with the use of actual laboratory and radiologic data, what was the influence of using surrogate variables on RA prevalence? 2) Were the same subjects identified as having RA when using the surrogate variables?
The frequencies, at baseline, of the individual features used to classify subjects as RA positive by either approach are shown in Table 2. In this primary care–derived cohort of subjects with IP, the prevalence of RF positivity at baseline was low (25%) compared with what would be expected in a hospital cohort of RA subjects, whereas the prevalence of its surrogate, wrist swelling, was higher (40%). The prevalence of erosions, based on its surrogate of MCP swelling, was substantial at baseline, at 66%, which is far in excess of the expected prevalence of radiologic erosions at baseline.
Table 2. Prevalence of individual rheumatoid arthritis criteria features at baseline*
|Feature||No. (%) of subjects, unless otherwise indicated|
|Age at onset, mean ± SD years||53 ± 15|
|Morning stiffness||568 (67)|
|Arthritis of ≥3 joint areas||561 (66)|
|Arthritis of the hand joints||686 (81)|
|Symmetric arthritis||537 (63)|
|Rheumatoid nodules||55 (6)|
|Rheumatoid factor positive||209 (25)|
|Swelling of the wrist†||338 (40)|
|Radiographic erosion||Not tested|
|Swelling of the MCP joints‡||559 (66)|
The cumulative prevalence of RA at each assessment is given in Table 3. The prevalence was initially higher using the tree definition; this was not surprising, because the tree approach permits a highly prevalent clinical surrogate to stand in for erosions, which are presumed to have a low prevalence. The difference between the 2 approaches decreased substantially over time, with no important differences at any of the subsequent time points. There was also some evidence that agreement in the classification of RA by the 2 approaches increased over time. The agreement at baseline was already reasonable (a kappa statistic of ≥0.6 is conventionally considered very good), but there was a continuing modest increase in agreement with each year. As mentioned above, one problem with applying the tree definition (as originally described) cumulatively is that subjects may change from being RA positive to RA negative if the surrogate variable is positive and the original variable, when measured later, is negative. This accounts for the slight fall in the prevalence of RA between baseline and 1 year, using the tree definition.
Table 3. Influence of disease duration on the prevalence of rheumatoid arthritis (RA), as determined using the list and decision tree approaches
|Year||RA prevalence, %||P||Expected agreement, %*||Observed agreement, %†||Kappa|
The effect of using surrogate variables was investigated in the 636 subjects for whom both RF status and radiologic data were available at 5 years. As anticipated from the baseline data, even at 5 years, the prevalence estimates of the clinical surrogates for RF positivity and erosions were both much higher than the prevalences based on the real results (Table 4). Furthermore, the agreement between the use of the real data and the use of the surrogate data was poor for both features. However, when used in the tree algorithm, the surrogate variables yielded prevalences of RA that were similar to those derived from the real data (Table 5). Using MCP joint swelling as a surrogate for erosions led to a small increase in RA prevalence (78% versus 70%). Furthermore, in comparing the use of surrogates with the use of the original variables, the agreement between classifications of subjects as RA positive was very good when only 1 surrogate was used, but substantially worse when both surrogates were used (Table 5).
Table 4. Prevalence of the original (real) variables compared with the surrogate variables at 5 years among 636 subjects*
|Original variable||Surrogate variable||Prevalence of real variable, %||Prevalence of surrogate variable, %||P||Kappa†|
|Erosions||MCP joint swelling||46||78||0.0001||0.25|
|RF positivity||Wrist swelling||40||58||0.0001||0.13|
Table 5. Effect of using surrogate variables on the tree definition of rheumatoid arthritis (RA) at 5 years*
|Surrogates used||Prevalence of RA, %||Expected agreement, %||Observed agreement, %||Kappa†|
|Wrist swelling for RF||67||57||90||0.77|
|MCP joint swelling for erosions||78||61||91||0.77|
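The kappa values in Table 5 follow from the observed and expected agreement columns through the standard definition of Cohen's kappa, (observed − expected) / (1 − expected). A minimal sketch follows; the function name is illustrative, and the inputs are the rounded percentages from the table, so small rounding differences from published values are possible.

```python
def kappa(observed: float, expected: float) -> float:
    """Cohen's kappa from observed and chance-expected agreement (as proportions)."""
    if not 0.0 <= expected < 1.0:
        raise ValueError("expected agreement must be in [0, 1)")
    return (observed - expected) / (1.0 - expected)


# Rows of Table 5, with percentages converted to proportions.
print(round(kappa(0.90, 0.57), 2))  # wrist swelling for RF -> 0.77
print(round(kappa(0.91, 0.61), 2))  # MCP joint swelling for erosions -> 0.77
```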
The final test of the performance of the surrogates was to assess the agreement in the classification of RA at 5 years using the list and the tree approaches, with the latter calculated using no surrogates, either surrogate, or both surrogates. The agreement between the list approach and these different variations on the tree approach is shown in Table 6. The use of wrist swelling as a surrogate for RF positivity led to a weakening of agreement, although interestingly, the agreement between approaches was very good when both surrogates were used or when just MCP joint swelling was used as a substitute for erosions.
Table 6. Effect of using surrogate variables on agreement between the tree definition and list definition of rheumatoid arthritis (RA) at 5 years*
|Surrogates used||Expected agreement, %†||Observed agreement, %†||Kappa†|
|Wrist swelling for RF||58||84||0.61|
|MCP joint swelling for erosions||63||89||0.72|
The ACR 1987 criteria for RA have been widely used and accepted since their publication, but the performance of the decision tree approach has never been fully investigated. The conceptual advantage of the tree approach is that it gives different weights to the different constituent variables, reflecting their relative discriminatory performance. The other advantage is that, by allowing surrogate variables for missing laboratory data, it reflects the real-life situation, particularly in studies that rely on retrospective chart review. The tree approach, however, has not been as popular in use, given the greater ease of applying the 4 of 7 list rule. The application of criteria is probably most useful clinically in early disease, and thus, a close examination of the relative merits of the 2 formats is best carried out in an inception cohort of patients with IP, as has been undertaken in the present study.
The first key finding was that the prevalence of RA using the tree approach was substantially higher at presentation, probably because of the option to use surrogates. However, in the subsequent 5 years, the RA prevalence estimates using the 2 approaches converged. Nevertheless, the individuals classified as having RA by the 2 methods were not the same, although the agreement was, according to a standard interpretation (8), good. A key question is, which of the 2 methods produces a closer approximation to “real” RA? The gold standard of physician opinion is probably too influenced by knowledge of the criteria to be of value.
An alternative approach is to consider how subjects classified by one or both of the approaches vary in their prognosis (construct validity). For this purpose, the use of erosions is limited because of the relative contribution of erosions to disease classification. As an indicator of construct validity, we therefore analyzed the mean Health Assessment Questionnaire (HAQ) scores (9) at 5 years in those subjects who were RA positive only by the list, only by the decision tree, and by both the list and the decision tree at presentation. The results for the 3 groups were mean HAQ scores of 1.3, 0.9, and 1.3, respectively, suggesting a lower specificity for severe disease among those subjects who were classified by the tree approach only. Reassuringly, those subjects who, at baseline, were RA negative by both the tree and the list approaches had a very low 5-year mean HAQ score of 0.31. Thus, the overall conclusion is that the 2 methods generate similar RA prevalence estimates, although use of the list may identify, albeit modestly, a more severely affected group. The erosions data, although less reliable for the reasons given earlier, produced a similar picture, with RA prevalences (determined using the presence of erosions at 5 years) of 25%, 56%, 45%, and 55% in subjects who were RA positive at baseline by neither definition, the list definition only, the tree definition only, and both definitions, respectively.
The second key finding was that the use of surrogates for erosions and RF positivity, albeit agreeing weakly with the real results, did not have a major impact on the overall RA prevalence estimates. Indeed, the substitution of wrist swelling for RF positivity actually produced a modest decrease in RA prevalence, whereas the substitution of MCP joint swelling for erosions had the opposite and numerically greater effect. The interesting finding was that the agreement between the list and tree approaches was the same regardless of whether one or both surrogates were used. It is reassuring that the original surrogates performed well in this completely different cohort.
The use of surrogates may be most useful at presentation, since clinical studies requiring contemporary recruitment of subjects later during the disease course clearly do not require the use of surrogates because the real results can be readily and easily obtained. In contrast, the observations from these analyses are of value to those undertaking studies that rely on retrospective chart review, in which laboratory and radiographic data are frequently missing. Indeed, this type of experience was the rationale for their inclusion in the original ACR study (1).
One question not previously addressed is whether the surrogates have any value in the more commonly used list approach. The issue is somewhat problematic, since the features used in both approaches are not identical and the use of surrogates in the list approach, as currently described, could theoretically allow some variables to be counted twice; for example, MCP joint swelling would be both an indicator of hand arthritis and a surrogate for erosion. However, our analysis of the performance of the list approach was undertaken using the surrogates, and our results showed that there were 98 additional subjects who would have been RA positive by the list at baseline (an increase in prevalence of 15%). Interestingly, of these subjects, 71 (72%) could be classified as having RA by 5 years based on the use of real data at that time point. Thus, the surrogates suggested for the tree approach have useful predictive validity for RA.
There are some limitations to be considered. First, as suggested above, the lower prevalence of RA at baseline using the list definition was expected because radiographs were not obtained until year 1. In contrast, the tree approach allowed for missing radiographic data, as mentioned above. Thus, at the baseline visit, classification using the list was based on only 6 variables. Baseline radiographs were not obtained because ethical concerns were expressed at the start of the study about exposing the patients to unnecessary radiation, on the assumption that most patients would have been negative for erosions at baseline. This assumption, however, is unlikely to be valid, although robust data from comparable patient populations are limited. The Leiden Early Arthritis Cohort comprised 524 subjects with inflammatory arthritis affecting at least 1 joint. A total of 15% of patients in that cohort had erosions at presentation (10), although, unlike the current cohort, that cohort was not recruited from a primary care setting. Within our cohort, erosions were likely to be less common at presentation, and some of those with erosions were likely to satisfy either ≥4 other criteria or ≤2 other criteria, and therefore, their RA status would not have been affected if a radiograph had been available. Thus, the difference in RA prevalence between the list and tree definitions is likely to be due partly to an underestimate of prevalence using the list definition, because of the missing data on erosions, and due partly to an overestimate using the tree definition, because of the use of the surrogate variable.
Second, comparing the kappa statistics over time is difficult, since the prevalence of RA (and therefore the expected agreement by chance) increases over time. Thus, the same absolute agreement at a later time will give a lower kappa statistic.
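The dependence of kappa on prevalence can be made concrete with a short sketch. It assumes the usual chance-agreement formula for 2 binary classifications, p1·p2 + (1 − p1)(1 − p2), where p1 and p2 are the proportions classified as positive by each definition. The prevalences and the observed agreement below are hypothetical values chosen for illustration, not study data, and the effect shown applies when prevalence rises above 50%.

```python
def expected_agreement(p1: float, p2: float) -> float:
    """Chance agreement between 2 binary classifications with prevalences p1 and p2."""
    return p1 * p2 + (1.0 - p1) * (1.0 - p2)


def kappa_from_prevalence(observed: float, p1: float, p2: float) -> float:
    """Kappa for a given observed agreement and the 2 marginal prevalences."""
    pe = expected_agreement(p1, p2)
    return (observed - pe) / (1.0 - pe)


# Hypothetical: the same 90% observed agreement at 2 prevalence levels.
early = kappa_from_prevalence(0.90, 0.50, 0.50)  # chance agreement 0.50
late = kappa_from_prevalence(0.90, 0.70, 0.70)   # chance agreement 0.58
print(round(early, 2), round(late, 2))
```

With these illustrative inputs, the later time point gives the lower kappa even though the observed agreement is unchanged, because more of that agreement is attributable to chance.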
In conclusion, the 2 formulations of the ACR 1987 criteria for RA perform equally well at 5 years after presentation with IP, and the use of surrogates for missing RF and erosion data does not cause an unacceptably high level of misclassification, compared either with the use of real data or with the use of the traditional list approach. The use of the decision tree at presentation will, with its surrogate option, yield a higher initial prevalence of RA, but the arthritis identified may be milder.