The 28-joint Disease Activity Score (DAS28) (1–4) is one of two currently used methods for describing the results of randomized clinical trials (RCTs), the other being the American College of Rheumatology (ACR) improvement criteria (5–7), and both have been shown to identify similar responder groups. The ACR criteria measure only change, while the DAS measures change as well as the level of rheumatoid arthritis (RA) activity.
There has been an increasing trend toward measuring disease activity in clinical practice (8–11). This is largely the result of attempts by governments and insurance companies to regulate prescription of expensive biologic agents, particularly anti–tumor necrosis factor (anti-TNF) therapy (12). They seek to have anti-TNF therapy prescribed to RA patients with severe or high levels of disease activity, and then to have the therapy continued only in those for whom there is a sufficient degree of response. The ability of the DAS to measure disease activity has led to its adoption in a number of countries. In the UK, for example, anti-TNF therapy is restricted to patients with DAS scores of >5.1 (12, 13). A DAS score of <3.2 is considered to indicate low disease activity, 3.2–5.1 moderate disease activity, and >5.1 high disease activity. The DAS has also been used to titrate the dosage of anti-TNF therapy (14).
The DAS originated in the clinic (1–3). Patients in whom therapy was being changed because of worsening RA activity were studied, and a series of variables that best predicted change in therapy was identified. In an elegant series of analyses, a scale was constructed that reflected the distribution of 4 of these variables: the tender joint count, swollen joint count, erythrocyte sedimentation rate (ESR), and patient's estimate of global health. The Health Assessment Questionnaire (HAQ) (15) was not tested in this study, and neither pain nor physician global assessment is part of the index.
In the clinical setting, the physician determines the level of RA activity but most often does not record it. To make this assessment, the physician may use a variety of data instead of, or in addition to, the measures contained in the DAS. Such data might include degree of pain, pain sensitivity, functional ability, appearance, concomitant fibromyalgia, grip strength, response to therapy, other joints not included in the DAS28, extraarticular features, radiographic changes, and other laboratory results. Such assessments are weighted by the physician and produce the physician's measure of global RA activity. In clinical care this is the central determination that leads to prescription of therapy. Among the weaknesses of a physician's global evaluation is that each physician may rate these composite underlying activity variables differently or may not consider all of them. In addition, regulators may worry that such a single measure can easily be influenced so that treatment can be obtained, that it is not “objective,” and that it does not sufficiently document RA status. The DAS is intended to overcome these problems and has been adopted for this purpose in a number of countries.
There are a number of reasons the DAS might not work well in the clinical setting. First, in the development of the measure, the valuation of the elements of RA activity was based on a single clinic in The Netherlands. It is possible that other physicians in other centers and countries might value these elements differently. Second, it is possible that physicians' estimates of RA global activity might not correlate strongly with the DAS. Third, classification into severity groups, while clearly acceptable and reliable for groups of patients, might not be accurate or reliable at the level of the individual patient. In this study we evaluated the use of the DAS as a clinic assessment tool by determining the concordance between the DAS score and the physician's assessment of RA activity. We then investigated factors relating to discrepancies and assessed the suitability of using the DAS in individual patients. However, our purpose was larger in that the DAS is representative of clinic assessment instruments in general. We hope the results of this study might be relevant when other assessment tools are considered for use in making regulatory determinations.
As part of a preliminary study to aid in the development of RA improvement criteria, we recruited Canadian and American academic and community physicians and asked them each to examine ∼10 consecutive RA patients in their clinics. Of 718 patients for whom data on the ESR were available, 27 had missing data on the physician assessment of RA activity and 22 had missing data on the patient global assessment. The results reported herein are from the remaining 669 patients, for whom complete data were available on tender and swollen joint counts, ESR, patient global assessment, and physician rating of disease activity. Patient demographic and clinical characteristics are shown in Table 1. Data on these patients came from 61 physicians (mean ± SD 11.0 ± 6.9 patients each [median 9.0]).
Table 1. Demographic and clinical characteristics of the 669 RA patients*
| Characteristic | Value |
|---|---|
| Age, years | 58.0 ± 13.5 |
| Ethnicity, % | |
|   Non-Hispanic white | 80.3 |
|   Asian origin | 3.2 |
|   Native North American | 1.9 |
| Years of education, % | |
| Disease duration, years | 12.5 ± 10.5 |
| DAS28, 0–10 | 4.2 ± 1.7 |
| Tender joint count, 0–28 | 6.1 ± 7.3 |
| Swollen joint count, 0–28 | 5.1 ± 6.0 |
| ESR, mm/hour | 27.0 ± 23.0 |
| Patient global assessment, 0–10 | 4.0 ± 2.7 |
| Physician-assessed RA activity, 0–10 | 3.7 ± 2.5 |
| HAQ, 0–3 | 1.06 ± 0.75 |
| HAQ-II, 0–3 | 0.92 ± 0.69 |
| M-HAQ, 0–3 | 0.49 ± 0.51 |
| MD-HAQ, 0–3 | 0.73 ± 0.57 |
| VAS fatigue, 0–10 | 4.5 ± 2.9 |
| SF-36 vitality, 0–100 | 45.7 ± 23.9 |
| Regional Pain Scale, 0–19 | 6.3 ± 5.6 |
| Fibromyalgia (survey criteria), % | 20.8 |
| EuroQol utility, 0–1 | 0.61 ± 0.30 |
| Patient self-reported joint count, 0–14 | 7.4 ± 4.3 |
| Treatment, % | |
|   MTX ever | 78.9 |
|   HCQ ever | 56.2 |
|   Prednisone ever | 65.6 |
|   Prednisone current | 30.9 |
|   DMARDs or biologic agents current | 88.8 |
|   Biologic agents current | 28.3 |
Physicians completed a 28-joint count of tender and swollen joints (4, 16) and a physician rating scale of RA disease activity. The scale consisted of 11 check boxes numbered from 0 to 10. Beneath the values 0, 1–3, 4–7, and 8–10 were brackets labeled “none,” “mild,” “moderate,” and “severe,” respectively. Protocol instructions to the physician indicated “In making your evaluation you may take the patients' questionnaire responses into consideration or you may ignore them according to how you usually evaluate patients. You may ask any additional questions you wish or perform any examinations you would ordinarily do. That is, use your usual method of evaluation to determine the ‘Physician’s estimate of disease activity’ … Disease activity does not mean structural damage. In addition, pulmonary fibrosis, pleuro-pericarditis and vasculitis does not necessarily mean disease activity, as these conditions may occur in patients with low levels of disease activity or without disease activity.” Physicians were asked if they considered extraarticular disease in making their assessments, and 19.6% indicated that they did.
Patients completed the HAQ (15), HAQ-II (17), multidimensional HAQ (18), modified HAQ (19), visual analog scales (VAS) for pain, global disease severity, and fatigue (20), a self-reported joint count (21), a count of nonarticular affected regions (the Regional Pain Scale) (22), and the Medical Outcomes Short Form 36 health survey (23, 24) from which the vitality scale was calculated.
The DAS was calculated from the tender and swollen joint count, the ESR, and the patient's global assessment, according to the instructions of the developers of the DAS.
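For readers who want to reproduce the score, the following is a minimal sketch of the widely published DAS28 (ESR) formula, combined with the cut points quoted in the introduction (3.2 and 5.1). The function names are hypothetical, and the rescaling of a 0–10 patient global rating to the 0–100 scale expected by the formula is an assumption made for illustration; the text above does not state how that step was handled in this study.

```python
import math


def das28_esr(tender28, swollen28, esr_mm_per_hr, global_health_0_100):
    """DAS28 (ESR version) from its four components.

    global_health_0_100 is the patient's global assessment on a 0-100 scale.
    """
    if esr_mm_per_hr <= 0:
        raise ValueError("ESR must be positive to take its logarithm")
    return (
        0.56 * math.sqrt(tender28)        # tender joint count, 0-28
        + 0.28 * math.sqrt(swollen28)     # swollen joint count, 0-28
        + 0.70 * math.log(esr_mm_per_hr)  # natural log of ESR
        + 0.014 * global_health_0_100     # patient global health
    )


def das28_category(score):
    """Map a DAS28 score to the activity levels quoted in the text."""
    if score < 3.2:
        return "low"
    if score <= 5.1:
        return "moderate"
    return "high"


# Hypothetical patient; the 0-10 global rating is rescaled (assumption).
patient_global_0_10 = 4.0
score = das28_esr(6, 5, 27.0, patient_global_0_10 * 10)
print(round(score, 2), das28_category(score))
```

Because the joint counts enter through square roots and the ESR through a logarithm, equal absolute changes in a component move the score more at low values than at high values.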
Kendall's tau-a and associated confidence limits were calculated using the Somers' D package (25), adjusted for clustering within referring rheumatologists. Questionnaire agreement was also assessed using the Bland-Altman limits of agreement procedure (26) and Lin's coefficient of concordance (27). Other analyses were performed using Stata version 8.2 (Stata, College Station, TX). P values less than 0.05 were considered significant.
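The three agreement statistics named above can be illustrated with a short, dependency-free sketch. This is not the Stata code used in the study: it omits confidence limits and the adjustment for clustering within rheumatologists, and the function names and sample values are hypothetical. The Bland-Altman limits and Lin's coefficient assume both measures are expressed on the same scale.

```python
import math
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs."""
    n = len(x)
    total = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        total += (prod > 0) - (prod < 0)  # +1, -1, or 0 for ties
    return total / (n * (n - 1) / 2)


def bland_altman_limits(x, y):
    """95% limits of agreement: mean difference +/- 1.96 SD of differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    return mean - 1.96 * sd, mean + 1.96 * sd


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient (1/n moment estimates)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical paired ratings on a common 0-10 scale.
das_scaled = [2.1, 3.4, 4.0, 5.6, 6.3, 7.2]
physician = [1.0, 2.0, 5.0, 4.0, 7.0, 8.0]
print(kendall_tau_a(das_scaled, physician))
print(bland_altman_limits(das_scaled, physician))
print(lin_ccc(das_scaled, physician))
```

Tau-a measures only whether pairs of patients are ranked in the same order by the two measures, whereas the Bland-Altman limits and Lin's coefficient also penalize systematic offsets between them. A constant shift therefore leaves tau-a at 1 but lowers Lin's coefficient.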
The data obtained in this study show that RA variables are not approached, valued, and weighted to the same extent with the DAS and physician-assessed RA activity scales. The most obvious differences are in the tender joint count (tau-a 0.63 versus 0.42), ESR (0.41 versus 0.17), and patient global assessment (0.41 versus 0.32). We believe it is a fair conclusion that North American physicians' assessments and DAS variables do not lead to similar conclusions regarding RA disease activity. It should be understood that our data do not address which method of evaluating activity is more correct, only that they are different.
This difference results in distributions of scores that are substantially different for the 2 scales (Figures 1–3). We would go so far as to say that the scales are incompatible with regard to cut points and designations of mild, moderate, and severe activity (physician RA activity assessment) and low, moderate, and high activity (DAS). This is one reason we believe that application of scales to the assessment of the individual patient is tenuous. We also have concerns regarding the distribution of values for the DAS (Figure 1). This distribution is, in part, dependent upon the nonlinear transformation of the DAS variables since normally distributed disease activity is not what is seen in the clinic, where distributions of the HAQ score, pain on VAS, and physician-assessed RA activity (Figure 2) all have similar appearances. The DAS cut points of 3.2 and 5.1 result in low, moderate, and high disease activity being designated in 29.0%, 42.2%, and 28.9% of patients, respectively. This almost normal distribution does not parallel findings obtained with other scales.
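The effect of a nonlinear transformation on the shape of a distribution can be shown with a small sketch. The joint counts below are invented for illustration and are not study data; the example only demonstrates that a square-root transformation of the kind applied to the DAS joint counts pulls in the right tail of a skewed variable.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment / variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed tender joint counts (many low, few high).
tender_counts = [0, 0, 0, 1, 1, 2, 2, 3, 4, 6, 9, 14, 22]
transformed = [math.sqrt(c) for c in tender_counts]

raw_skew = skewness(tender_counts)
sqrt_skew = skewness(transformed)
print(f"raw skewness:  {raw_skew:.2f}")
print(f"sqrt skewness: {sqrt_skew:.2f}")
```

The transformed counts are less skewed than the raw counts, which is consistent with the observation that the DAS distribution appears closer to normal than the distributions of its untransformed clinical inputs.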
Any scale that relies on fixed coefficients has other potential limitations as well. ESR, pain, and patient global assessment differ according to sex, yet fixed coefficients weight them identically in men and women. A physician's global assessment might (but might not) take into consideration other factors such as pain, pain sensitivity, functional ability, appearance, concomitant fibromyalgia, grip strength, response to therapy, other joints not included in the DAS 28-joint count (for example, joints in the feet), extraarticular features, radiographic changes, and other laboratory results. The physician's global evaluation might also account for a person who always scores a certain way on the tender joint examination or the ESR, for example. Because we seem to be criticizing the DAS, we want to reemphasize that these criticisms apply only to the use of the DAS as a sole clinical measurement tool and do not apply to RCTs. Whether the DAS is the best activity scale for use in RCTs was not evaluated in this study.
The second major conclusion from our findings is that relying on any one scale to make a regulatory decision is not appropriate. Figures 3 and 4 (right) show that there is too much variability in the DAS to enable reliance on it for important decision making. This is not just a function of the DAS; it is also true of scales such as the HAQ, pain assessments, or the various global assessments. As alluded to above, the degree of variability that makes the scales inappropriate as sole indicators in clinical care does not limit their effective use in RCTs and observational studies.
In summary, North American physicians' assessments and DAS variables do not lead to similar conclusions regarding RA disease activity. RA variables are approached, valued, and weighted differently with the 2 methods, and levels of concordance between the 2 scales are not acceptable. There is too much inherent variability in RA scales for them to be recommended as sole determinants of RA activity for clinical or regulatory purposes.