Assessor-based disease activity measures such as the Disease Activity Score in 28 joints (DAS28), although widely used in rheumatoid arthritis (RA), have high interobserver variability. We developed and validated a patient-based disease activity score (PDAS) as an alternative assessment.
Patients' assessments of swollen or tender joints, visual analog scales for pain and general health, the Health Assessment Questionnaire, and erythrocyte sedimentation rate (ESR) were used to develop the PDAS. In a developmental cohort (204 patients), regression analyses determined the best fit with the DAS28. A validation cohort (322 patients) subsequently evaluated criterion and construct validity against a range of outcome measures, including the Nottingham Health Profile (NHP) and Short Form 36 (SF-36). Sensitivity to change was assessed in 56 patients after 6 months of treatment with disease-modifying antirheumatic drugs or biologics.
In the developmental cohort, the PDAS with ESR (PDAS1) and without ESR (PDAS2) achieved excellent fit with the DAS28 (r = 0.88 and 0.74, respectively). In the validation cohort, the PDAS showed high criterion validity by correlation with the DAS28 (PDAS1: r = 0.89, PDAS2: r = 0.76). Construct validity was demonstrated by high correlations with a range of disease activity measures (r ≥ 0.45), whereas low correlations (r < 0.45) with mental and social components of the SF-36 and NHP indicated divergent validity. The PDAS and DAS28 had similar sensitivity to change, determined using effect sizes (DAS28 = 1.03, PDAS1 = 1.02, PDAS2 = 0.77) or standardized response means (DAS28 = 0.79, PDAS1 = 0.77, PDAS2 = 0.73).
The PDAS1 and PDAS2 are valid and sensitive tools to assess disease activity in RA. They appear suitable for clinical decision making, epidemiologic research, and clinical trials.
Joint counts undertaken by physicians, nurses, or therapists are one cornerstone in the conventional assessment of disease activity in rheumatoid arthritis (RA). Substantial interobserver variation is a major practical disadvantage of such joint counts, and this variability persists despite training (1–5). One limitation arising from this variability is the need to ensure that the same assessor carries out disease activity assessments in each patient. This is generally stipulated in the protocols of clinical trials and attempted as far as possible in routine clinical practice. However, prolonged followup in trials and everyday care makes it impractical for long-term observations to be made by a single clinician. A second limitation is that joint counts by clinicians appear to be relatively insensitive for separating the effects of active therapy from placebo treatment compared with subjective patient-based measures such as the Health Assessment Questionnaire (HAQ) (6, 7). The probable explanation for this latter finding is that clinicians overestimate placebo effects compared with patients.
One approach to limiting the impact of these problems is to replace joint counts by clinicians with patient self-assessed joint counts. However, studies of self-assessed joint counts show that, despite providing useful information, these joint counts cannot directly substitute for joint counts by clinicians (8–18). More information is gained by combining self-assessed joint counts with health status assessments in instruments such as the RA Disease Activity Index (RADAI) (19–21). Although such measures provide much useful data, they are not directly comparable with existing integrated assessments of disease activity such as the Disease Activity Score in 28 joints (DAS28) (22, 23) and, as a consequence, their value is restricted in situations in which a score comparable with the DAS28 is needed. One example is identifying patients who may benefit from therapy with biologic treatments, for which some regulatory bodies require specific DAS28 scores.
Previous work from our unit has demonstrated that self-assessed joint counts can be used to generate patient-based disease activity scores (24). However, this earlier approach was simplistic and did not involve a formal evaluation of the optimal combination of measures to reproduce the DAS28. We have therefore extended this approach by developing and validating a patient-based disease activity score (PDAS), which is comparable with the clinician-based DAS28 using measures within the internationally agreed core data set for RA. Our goal was to design a valid, reliable, sensitive, and feasible alternative to conventional assessment by clinicians for determining individual clinical disease activity and responses to therapy with antirheumatic drugs.
PATIENTS AND METHODS
We studied current outpatients attending specialist rheumatology clinics in southeast London who met the 1987 American College of Rheumatology (ACR; formerly the American Rheumatism Association) criteria for RA (25). Three cohorts of patients were studied (Table 1).
Table 1. Details of patients in the different assessment groups*
Columns: Study 1: model data (n = 204); Study 2: validation (n = 322); Study 3: changes with treatment (n = 56).
Rows: Age, mean (range) years; Disease duration, mean (range); Rheumatoid factor positive. [Cell values are not preserved in this text.]
* Values are the number (percentage) unless otherwise indicated. DMARD = disease-modifying antirheumatic drug; IM = intramuscular; NSAIDs = nonsteroidal antiinflammatory drugs.
The developmental cohort comprised 204 consecutive patients with RA who completed the patient self-assessments. The initial 20 patients in this cohort were also involved in testing face validity. These 20 patients found that rating tender joints and swollen joints verbatim was confusing and preferred performing these assessments using a mannequin without grading. This is a key difference between the RADAI and PDAS. In addition, the test–retest reliability of the questionnaire was evaluated in 46 of the 204 patients who were asked to complete the questionnaire 24 hours after their initial assessment.
The validation cohort comprised a different group of 322 consecutive patients with RA, who completed the patient self-assessments and also had standard measures of disease activity assessed.
The responsiveness cohort comprised 56 patients who had started disease-modifying antirheumatic drugs (DMARDs) or biologic agents and were seen 6 months apart to assess responsiveness to change. Six patients were going to start biologic agents (infliximab or etanercept), 33 were going to start methotrexate, and 17 were going to start other DMARDs including combination therapies.
The South Thames Multicentre Research Ethics Committee approved the study. All patients who were enrolled gave written informed consent.
An initial systematic review of the literature identified the most relevant patient-based disease activity assessments. These included pain score (0–100-mm visual analog scale [VAS]), patient global assessment of disease activity (PGA; 0–100-mm VAS), fatigue score (0–100-mm VAS), early morning stiffness score (0–5 scale), patient self-assessed tender joint counts and swollen joint counts for up to 50 joints, HAQ, Short Form 36 (SF-36), Nottingham Health Profile (NHP), and EuroQol. Erythrocyte sedimentation rate (ESR) was measured on the same day as these assessments. Patient self-assessed joint counts were recorded on a self-administered questionnaire completed without specific verbal assistance; patients were asked to indicate all the joints that were painful at present using one mannequin that displayed individual joints (for tender joints) and all the joints that were swollen at present using a second mannequin of an identical design (for swollen joints).
Conventional disease outcome assessments were also performed, including tender and swollen joint counts (for 28 joints), and were used to calculate the conventional DAS28 (for 28 joints).
Dispersion and distribution of the data in the self-assessment questionnaire were examined and when necessary transformed into a Gaussian distribution. Many self-assessed outcome measures showed skewed distributions; one notable exception was the HAQ, which had a Gaussian distribution. These self-assessed measures were logarithmically transformed prior to multiple regression analysis.
Modeling of the PDAS was established by performing forward stepwise regression analyses. Patient-derived variables, coupled with HAQ scores and ESR results, were entered into SPSS software, version 10 (SPSS, Chicago, IL) to generate the best-fit models with the DAS28. Two models were developed: PDAS1, which included the ESR, and PDAS2, which did not include the ESR. The internal consistency and test–retest reliability of the PDAS were tested using Cronbach's alpha and intraclass correlation coefficients, respectively.
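As an illustration of the forward stepwise approach described above, the following is a minimal sketch in plain Python, not the SPSS procedure used in the study. The function names, the toy data, and the entry rule (a fixed minimum gain in R²) are assumptions made for the example; statistical packages generally use a significance test to decide entry.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(candidates, y, min_gain=0.01):
    """Greedily add the predictor with the largest R^2 gain until the gain is below min_gain.

    Returns a list of (predictor name, additional variance explained) in order of entry.
    """
    chosen, steps, current = [], [], 0.0
    remaining = dict(candidates)
    while remaining:
        best_name, best_r2 = None, current
        for name, col in remaining.items():
            r2 = r_squared([candidates[c] for c in chosen] + [col], y)
            if r2 > best_r2:
                best_name, best_r2 = name, r2
        if best_name is None or best_r2 - current < min_gain:
            break
        chosen.append(best_name)
        steps.append((best_name, best_r2 - current))
        current = best_r2
        del remaining[best_name]
    return steps


# Toy data (hypothetical): y depends on x1 and x2 only, x3 is unrelated.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [3, 1, 4, 1, 5, 9, 2, 6]
x3 = [2, 7, 1, 8, 2, 8, 1, 8]
y = [2 * a + b for a, b in zip(x1, x2)]
for name, gain in forward_stepwise({"x1": x1, "x2": x2, "x3": x3}, y):
    print(f"{name}: +{gain:.1%} of variance explained")
```

Each step reports the additional share of variance explained by the newly entered variable, which is the same form in which the contributions of the PDAS components are reported.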
Validation of the PDAS.
Criterion validity for the PDAS1 and PDAS2 developed in the first cohort of patients was confirmed by correlation with the DAS28 and Clinical Disease Activity Index (CDAI) (26) in the second validation cohort of patients. Construct validity was assessed by correlation with individual components of the internationally agreed core data set for RA, SF-36, NHP, and EuroQol-5D based on assumptions that patients with active RA have more symptoms, more disability, and reduced physical function with relatively little direct impact on mental health. In assessing construct validity, given that the PDAS is a measure of disease activity, correlation with other measures of disease activity should be higher (convergent validity) than correlation with other measures such as quality of life (divergent validity).
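The convergent/divergent comparison can be sketched as a Spearman rank correlation followed by a threshold check. This is an illustrative sketch only: the function names are hypothetical, the 0.45 cutoff is the one quoted with the reported results, and taking the absolute value (so that inversely scored instruments are treated alike) is an assumption.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def validity_type(rho, cutoff=0.45):
    """Label a correlation as convergent (|rho| >= cutoff) or divergent."""
    return "convergent" if abs(rho) >= cutoff else "divergent"
```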
Responsiveness and sensitivity to change of the PDAS.
Patients who took part in the responsiveness/sensitivity to change study were consecutive patients who were seen twice and had started a DMARD or biologic agent. These patients were asked to return to the clinic after a period of 6 months to complete the same set of questionnaires and an assessment as detailed in study 2. The responsiveness/sensitivity to change of the PDAS1 and PDAS2 was assessed by calculating effect sizes and standardized response means. Effect size was calculated as the difference between the mean baseline and mean followup scores on the measure, divided by the standard deviation of the baseline scores. Standardized response mean was calculated by dividing the mean observed change by the standard deviation of the change.
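The two responsiveness statistics defined above can be written out directly. The sketch below follows those definitions; the function names and the use of the sample (n − 1) standard deviation are assumptions, since the choice of denominator is not stated.

```python
from statistics import mean, stdev


def effect_size(baseline, followup):
    """(mean baseline - mean followup) divided by the SD of the baseline scores."""
    return (mean(baseline) - mean(followup)) / stdev(baseline)


def standardized_response_mean(baseline, followup):
    """Mean of the paired changes divided by the SD of those changes."""
    changes = [b - f for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)


# Hypothetical scores for four patients before and after 6 months of treatment.
baseline = [6.0, 5.0, 7.0, 6.0]
followup = [5.0, 4.5, 5.0, 5.5]
print(round(effect_size(baseline, followup), 2))
print(round(standardized_response_mean(baseline, followup), 2))
```

The effect size is scaled by the spread of scores at baseline, whereas the standardized response mean is scaled by the spread of the individual changes, so the two can differ for the same data.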
Patients were also asked to assess their responses to biologic agents and DMARDs at 6 months in terms of whether or not there was a response. Changes in DAS28, PDAS1, and PDAS2 were evaluated from this perspective.
RESULTS
Development of the PDAS.
Face validity of the patient self-assessments, incorporated into a questionnaire, was evaluated by showing the questionnaire to 20 patients. Patients preferred self-assessing joint tenderness and swelling using a mannequin rather than using verbatim assessment. Many patients found grading of joint tenderness and swelling to be too complicated and time consuming, and therefore this was omitted. The questionnaire was consequently revised for use in the subsequent definitive studies; this revised format was usually completed in 7 minutes.
The test–retest reliability of each item in the finalized questionnaire was assessed in 46 patients. This was graded as excellent with intraclass correlation coefficients ranging from 0.76 to 0.88.
The PDAS was then devised by stepwise multiple regression analysis in the full cohort of 204 patients after appropriate transformations of the various clinical variables. This analysis showed that 4 measures in the patient self-assessment questionnaire explained 79% of the variance in DAS28 scores (r = 0.89). PGA explained 44% of the variance in DAS28, logarithmically transformed ESR explained a further 28%, logarithmically transformed numbers of patient-assessed tender joints (50 joints) explained a further 5%, and the HAQ explained a final 1%. Because the HAQ added relatively little to the variation in PDAS1, it could have been omitted, but a decision was made to retain it to ensure maximal comparison with the DAS28. The regression equation for the PDAS1 (including ESR) was as follows:
where 50 TJC = tender joint count of 50 joints. Because ESR results may not be readily available in all clinical situations, a second model, the PDAS2 (without ESR), was also developed using a similar regression analysis. In this model 4 measures explained 55% of the variation in DAS28 (r = 0.74). PGA accounted for 44% of the variance and addition of the HAQ, patient self-assessed swollen joint count (for 28 joints), and early morning stiffness (EMS) score added a further 5%, 4%, and 1%, respectively. Because EMS added little to the total variation in PDAS2, it could have been omitted, but a decision was made to retain it to ensure maximal comparison with the DAS28. The regression equation for the PDAS2 (excluding ESR) was as follows:
where SJC = swollen joint count. The internal consistency of the 8 items included in the questionnaire was high; this was shown using Cronbach's alpha, which gave a value of 0.72. For the 4 items in the PDAS1 and PDAS2, Cronbach's alpha was 0.5 and 0.4, respectively.
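For reference, Cronbach's alpha for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below computes the raw (unstandardized) coefficient; the function name and the example data are hypothetical, and whether the study used raw or standardized item scores is not stated.

```python
from statistics import variance


def cronbach_alpha(items):
    """Raw Cronbach's alpha.

    items: list of k lists, each holding one item's scores for the same respondents
    in the same order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical example: three items scored by four respondents.
print(round(cronbach_alpha([[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 3, 5]]), 3))
```

Alpha rises with the number of items as well as with their intercorrelation, which is one reason a 4-item subset can show a lower value than the full 8-item questionnaire.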
The maximum number of joints in the patient-assessed joint counts was 50 for both tender and swollen joints. We explored the possibility of reducing this to the 28 joints used in the DAS28 score to reduce the demand on patients, but found that for the tender joint count, 50 joints performed better than 28 joints.
The PDAS1 (with ESR) and PDAS2 (without ESR) had distributions similar to the DAS28, with some minor variations (Figure 1). Both were less sensitive for detecting low disease activity with the appearance of a floor effect. This floor effect was more marked with the PDAS2. Using the DAS28, 54 (17%) patients had scores <3.1, and 29 (9%) had scores <2.6. By comparison, with the PDAS1 only 28 (9%) had scores <3.1 and 9 (3%) had scores <2.6, and with the PDAS2 there were 25 (8%) and zero, respectively, with scores in these lower ranges. The PDAS1 and PDAS2 both correlated highly with the DAS28, with Spearman's rank correlation coefficients of 0.89 and 0.76, respectively. The correlations of the PDAS1 and PDAS2 with the DAS28 are shown in Figure 2. The PDAS1 and PDAS2 also correlated highly with the CDAI, with correlation coefficients of 0.69 (P < 0.0001) and 0.73 (P < 0.0001), respectively.
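The floor effect can be made concrete by counting scores below each cutoff. The following sketch uses hypothetical helper names; the counts are those reported above for the 322 patients in the validation cohort, and the percentages are recomputed from them.

```python
def count_below(scores, cutoff):
    """Number of scores strictly below the cutoff."""
    return sum(1 for s in scores if s < cutoff)


def percent(count, total):
    """Count as a whole-number percentage of the total."""
    return round(100 * count / total)


# Reported numbers of patients with scores below 3.1 and below 2.6 (n = 322).
reported = {
    "DAS28": {"<3.1": 54, "<2.6": 29},
    "PDAS1": {"<3.1": 28, "<2.6": 9},
    "PDAS2": {"<3.1": 25, "<2.6": 0},
}
for measure, counts in reported.items():
    shares = {cutoff: f"{percent(n, 322)}%" for cutoff, n in counts.items()}
    print(measure, shares)
```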
The PDAS1 and PDAS2 showed convergent and divergent validity. Both showed relatively high correlations with other measures of disease activity and quality of life measures that capture arthritis symptoms such as pain and disability. These include assessor 28 tender joint counts and 28 swollen joint counts, VAS fatigue scores, VAS assessor global scores, VAS pain scores, C-reactive protein level, SF-36 physical component scores, NHP physical domain scores, NHP pain scores, and EuroQol scores. With the PDAS1, these correlations varied from 0.45 for assessor 28 swollen joint counts to 0.72 for VAS pain scores, and with the PDAS2, the correlations ranged from 0.37 for C-reactive protein level to 0.83 for VAS pain scores (Table 2). The main differences of both the PDAS1 and the PDAS2 compared with the DAS28 were higher correlations with VAS pain scores and VAS fatigue scores. In contrast, both the PDAS1 and PDAS2 showed lower correlations with measures of generic health such as sleep and social function. These correlations were <0.37 with the PDAS1 and <0.44 with the PDAS2 (Table 2).
Table 2. Convergent and divergent validity of the PDAS1 and PDAS2 compared with the DAS28: Spearman's rank correlation coefficients with disease activity and generic health measures in 322 patients in the validation study (study 2)*
[Table body is not preserved in this text.]
* There is convergent validity with other measures of disease activity and divergent validity with other measures, specifically quality of life measures. PDAS1 = patient-based disease activity score with erythrocyte sedimentation rate; PDAS2 = patient-based disease activity score without erythrocyte sedimentation rate; DAS28 = Disease Activity Score in 28 joints; CDAI = Clinical Disease Activity Index; 28TJ = 28 tender joint count; 28SJ = 28 swollen joint count; VAS = visual analog scale; SF-36 = Short Form 36; PCS = physical component summary; NHP = Nottingham Health Profile; MCS = mental component summary.
Responsiveness to change.
The sensitivity to change of the PDAS1 and PDAS2 was evaluated in 56 patients starting a new DMARD or biologic agent who were followed up 6 months later. Effect sizes and standardized response means for the PDAS1 and PDAS2 (Table 3) showed that the PDAS1 and DAS28 had similar effect sizes (1.02 and 1.03, respectively), with the PDAS2 showing a smaller effect size (0.8). The standardized response means were similar (0.70–0.79). The effect size of CDAI was 0.7. Changes in the PDAS1 and PDAS2 showed correlations with other assessments of change similar to those seen with the DAS28. There were high Spearman's correlations with changes in VAS assessor global and VAS pain scores (≥0.59), moderate correlations with changes in assessor tender joint count (≥0.51), and no correlations with changes in SF-36 physical component summary and EuroQol scores.
Table 3. Effect size and standardized response mean of the PDAS1 and PDAS2 in 56 patients in the changes with treatment study (study 3)*
Mean ± SD change: 1.00 ± 1.30; 0.73 ± 1.10; 1.20 ± 1.50; 9.7 ± 13.9. [Column labels and the remaining rows, including the standardized response mean, are not preserved in this text.]
* Effect size was calculated as the difference between mean baseline scores and followup scores divided by the standard deviation of baseline scores. Standardized response mean was calculated as the mean observed change divided by the standard deviation of the change. See Table 2 for definitions.
Patient self-assessment of response comprised 37 (66%) responders and 19 (34%) nonresponders. At baseline the mean DAS28 score was 6.26 for nonresponders and 5.90 for responders. At 6 months the sample mean was 5.78 for nonresponders and 4.27 for responders. The PDAS1 including the ESR produced baseline sample means of 6.2 and 5.87 for nonresponders and responders, respectively. At the 6-month assessment the sample means were 5.77 for nonresponders and 4.56 for responders. A comparison of 6-month scores for responders and nonresponders is shown in Figure 3.
DISCUSSION
The PDAS defines an individual patient's disease activity on the day of his or her assessment, making it a useful measure to assess both symptom impact and changes in activity. The PDAS1 and PDAS2 have good psychometric properties and both meet the requirements of the Outcome Measures in Rheumatology Clinical Trials (OMERACT) filters as they are true (valid), show discrimination (sensitivity to change), and are feasible. They are tools that could be adopted in future clinical trials, epidemiologic research, and routine practice. Although both measures could be further simplified, for example, by removing the HAQ from the PDAS1 because it contributes only minimally to the overall variance, there is little benefit to such omissions because the HAQ is included in the OMERACT and European League Against Rheumatism (EULAR) core data set. For the PDAS2, EMS score can be omitted without significantly affecting the validity and sensitivity of the instrument. Use of the HAQ to assess disease activity could be criticized because HAQ scores are also influenced by structural damage; however, there is good evidence from secondary evaluations of clinical trial data that HAQ scores are sensitive indicators of disease activity (6) and we consider these scores suitable for use in this context.
It is interesting that the patient-derived joint counts were not entirely equivalent to the clinician-derived counts in the modeling of the PDAS; instead, the HAQ seemed to be a more important indicator, particularly in the PDAS2 model without the ESR. Our initial assumption was that patient-derived joint counts could simply substitute for those made by clinicians (24), but this proved incorrect. However, we have demonstrated that, using a combination of self-assessment items, it is possible to measure disease activity in a manner that is as efficient as the DAS28. As with all clinical measures, there are likely to be substantial differences in the judgments of individual clinicians and patients about whether or not active disease is present; this has been previously studied in detail by Kirwan and colleagues using “paper patients” to evaluate clinicians' views (27).
The use of laboratory measures in assessing disease activity in RA is complex. We developed 2 PDAS models, one with ESR and the other without ESR. On a superficial level, leaving out the ESR may be relatively disadvantageous for clinical trials and epidemiologic studies because its inclusion provides a more representative reflection of the conventional DAS28. However, there is considerable evidence that patient-derived measures provide a better assessment of clinical outcomes than laboratory measures (7, 26, 28, 29). The balance of evidence indicates that a pooled index of patient self-report questionnaire measures is as informative as ACR 20% improvement criteria (ACR20) responses (30), DAS28 scores, and pooled indices of all measures and of assessor-derived measures in the core data set for RA in distinguishing active treatment from placebo. The PDAS instruments we have developed reflect the benefits of patient self-assessments. It is interesting that the PDAS1 uses patient-derived tender joint counts in preference to patient-derived swollen joint counts, because a recent study of 82 patients with RA demonstrated that within-patient and patient-physician correlations for joint tenderness counts were high whereas patient-physician correlations for joint swelling counts, although significant, were much lower (31). This study, together with our own findings, implies that patient-assessed joint tenderness is the key measure.
There is debate about the value of summated assessment measures in clinical practice and the levels of activity they should represent (32). The DAS28 is widely used in much of Europe, although there are simplified alternatives, including the Simplified Disease Activity Index reported by Smolen et al (33), the Patient Activity Scales (PAS and PAS-II) reported by Wolfe et al (34), the CDAI (26), and the patient self-report questionnaire Routine Assessment of Patient Index Data (RAPID) score reported by Pincus et al (35). Some of these scales, such as the RAPID score, involve patient self-assessment. Interestingly, when Gulfe and colleagues (36) compared ACR20 responses, DAS28 responses, and RAPID to identify individual responses in 184 outpatients, they found good agreement at the ACR20 level but poor agreement at the ACR50 level. They recommended that this discordance should be taken into account when using response criteria to guide clinical decisions. Similar concerns have been expressed about using the DAS28 to define the need for tumor necrosis factor inhibitors (37, 38). Despite these concerns, the DAS28 is recommended for use in clinical practice in the UK (39, 40) and the balance of evidence indicates its use is beneficial for routine practice (41–43).
There is no doubt that patient-based assessments, which have been championed for many years by Pincus and Sokka (44, 45), are of key importance. In this context it is relevant to consider the relative merits of using existing self-assessed measures with joint counts (such as the RADAI), without joint counts (such as the patient activity scale), and the PDAS score we have developed. Wolfe and colleagues have argued that the simplicity of instruments such as the PAS makes them particularly useful in the clinic (34). Interestingly, there is also evidence that the HAQ alone performs well in aiding clinical assessments of disease activity (6). On balance we consider there is no single strong reason to prefer one measure to another. They are all likely to have benefits and drawbacks. Instead, we suggest that their use must reflect the circumstances in which patients are being assessed. In a clinical environment in which a physician-based composite measure such as the DAS28 has been widely used, it would be possible to replace it with the PDAS without a major change in the nature of the data being collected. One specific difficulty with patient-based measures, highlighted by Kievit and colleagues (46), is the concept of response shift. Kievit et al studied 624 newly diagnosed patients with RA who had completed 3 years of followup and found that although the DAS28 and the VAS assessment for global health were significantly associated, the explained variance was low (6.7%). Longitudinal regression modeling showed that VAS assessment for global health improved during the course of RA, independent of change in DAS28 score, and this was in keeping with a change in patients' perceptions of the disease. They consider that this type of response shift militates against using patient-generated scores.
One important benefit of replacing the DAS28 with the PDAS is that the PDAS will involve patients far more directly in assessing their disease, which is widely considered to be important in optimizing care (47, 48). Measures such as the PDAS could also be used in Web-based recording of disease activity, which may well become of growing importance in future years. Interactive technology including touch-screen programs and Internet access to questionnaires may also facilitate the assessment of disease activity. Greenwood and colleagues have demonstrated that touch-screen computer systems can be used in rheumatology clinics as a means of collecting reliable, user-friendly outcome data from patients (49). Athale et al (50) showed that Web-based computer health assessment surveys could be undertaken by patients with RA and that they provided information comparable with paper versions. The PDAS could readily be adopted in such a Web-based system for patient assessment. We recognize that in some specific circumstances, such as the recognition of near remission, other patient-generated assessments, such as the RAPID score proposed by Pincus et al (35), may have advantages, particularly as we have not evaluated the ability of the PDAS to detect remission or near remission in RA.
We believe the PDAS is a suitable clinical tool to highlight individual patient concerns and help monitor progress, including the effectiveness of treatments over time. It could also be used in epidemiologic studies, and may even be converted into an economic utility tool. Single-handed practitioners and clinicians working in an environment in which resources are limited could adopt patient-derived measures of disease activity such as the PDAS. Overall, the PDAS shows good reliability, validity, and responsiveness. The main area in which this type of assessment appears to be of limited use is in determining the presence of low disease states or remission. Further work is needed to establish the smallest detectable difference and minimal clinically important difference, low disease activity, and disease remission of the PDAS, as well as cross-cultural validation. It may also be important to elicit patients' views on completing such questionnaires, and whether or not they believe such an approach can enhance their involvement in managing their own disease.
AUTHOR CONTRIBUTIONS
Dr. Choy had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study design. Choy, MacGregor, Scott.
Acquisition of data. Choy, Khoshaba.
Analysis and interpretation of data. Choy, Khoshaba.
Manuscript preparation. Choy, Khoshaba, Scott.
Statistical analysis. Choy, Cooper, MacGregor.
We thank Professor G. S. Panayi and Dr. B. Kirkham (Guy's Hospital, London) and Dr. N. Chung (Queen Mary Hospital, Sidcup) for their help.