Measurement of appropriate outcomes in health is of central importance when trying to understand the influence of interventions upon disease or the impact of disease upon patients and their families (1). Enhanced quality of life (QOL) seems to be an outcome that is appropriate for many chronic diseases (2). Yet the meaning of quality of life and the most appropriate way to measure this construct remain uncertain. Indeed, some have suggested that the conceptual limitations of many measures purporting to address QOL seriously restrict their value (3, 4).
One approach that has attempted to overcome such problems is that taken by the World Health Organization (WHO) (5). Quality of life is defined here as “individuals' perceptions of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns.” An instrument based on this conceptual framework was concurrently developed across several countries and cultures while retaining similar psychometric properties and structure. The WHOQOL-100 is available in several culture-specific and language-specific versions (6). A 26-item, short form of this instrument (WHOQOL-BREF) was developed for pragmatic reasons and has been shown to have similar psychometric properties to the WHOQOL-100 (7).
Quality of life for people with rheumatoid arthritis (RA) is probably reduced compared with the general population (8). However, many studies purporting to measure QOL in RA use instruments that measure self-perceived functional limitations, such as the Health Assessment Questionnaire (HAQ). Disease-specific measures of QOL (e.g., RAQoL) (9) are clearly useful within RA populations but require advanced measurement techniques, such as Rasch analysis and item banks with overlapping items, for use in the general population or in other disease groups. The WHOQOL-BREF may be a useful generic measure of QOL in people with RA and would permit comparison of QOL among people with different diseases. There are advantages and disadvantages of disease-specific measures compared with generic measures. The particular advantage of generic measures is that they permit valid comparison of QOL among different groups of people (for example, those with different diseases or without disease). Comparison with a general population sample is clearly helpful to decide whether QOL is actually different from normal. Comparison with other disease groups may be of interest when considering priorities for funding interventions (for example, quality-adjusted life year applications) (10). On the other hand, when considering the effect of a specific intervention on a particular disease, specific measures are more responsive and relevant to the population of interest (11). Even so, there may be an important role for generic measures alongside specific measures within clinical intervention trials to facilitate informed resource allocations.
There is limited information regarding the validity or psychometric properties of WHOQOL-BREF in people with rheumatoid arthritis, although the WHOQOL-100 has been used in this population. In a small study by Wirnsberger et al of outpatients with RA, only the overall, physical health, and independence domains were reported fully and were found to be impaired, although it was stated that the RA group did not have decreased psychological health (12).
The purpose of the current study is to determine the psychometric properties of the WHOQOL-BREF in people with RA (test-retest reliability, internal consistency, concurrent validity, factor structure, and change sensitivity) and to compare their QOL with the general population, as measured by this instrument.
Of the 324 patients randomly selected from the disease register for sample 1, 190 (59%) agreed to participate and were mailed questionnaire booklets. Reasons for nonparticipation included inability to locate patient (17%), patient deceased (16%), and patient declined or was otherwise unable to participate (9%). Of the 190 potential participants, 142 (75%) returned their questionnaires. Fifteen patients (7.9%) withdrew their consent and 33 (17.4%) failed to return the questionnaires despite reminders. During the recruitment period for the inpatient sample, 99 patients with RA were admitted to the inpatient unit, of whom 72 (73%) were recruited. The main reasons for not participating were declining to consent, failure to contact the patient prior to the day of admission, and residence beyond the regional catchment area. Only 4 patients were unavailable at 2 weeks following discharge.
The demographic characteristics of the 2 study samples are shown in Table 2. We compared the WHOQOL-BREF domain scores from sample 1 (outpatients) and sample 2 (inpatients) with scores obtained from a randomly selected community sample of 396 Victorian (Australia) residents (21, 22). This is shown in Figure 1. There appear to be differences between outpatient RA patients, inpatient RA patients, and the general population with respect to most domains, but a formal statistical test was not performed because of the different sampling methods between the different studies.
Table 2. Main demographic and disease characteristics of the study subjects
| ||Sample 1 (randomly selected outpatients)||Sample 2 (consecutive inpatients)|
| ||n = 142||n = 72|
|Age, mean ± SD years||60.7 ± 14.4||60.6 ± 12.7|
|Sex, female, %||71.8||71.6|
|No. of current and previous disease-modifying drugs, median (interquartile range)||3 (0.25–5.75)||4 (2–5)|
|Duration of disease since diagnosis, median (interquartile range) years||12.6 (6.8–25.1)||10.7 (5.0–24.4)|
Figure 1. The mean (95% confidence interval) of each World Health Organization Quality of Life instrument, short form, domain for the outpatient sample with rheumatoid arthritis (▒), the inpatient sample with rheumatoid arthritis (□), and the Australian general population (▪). Data from reference 21.
Test-retest reliability for sample 2 indicated reasonable stability (i.e., intraclass correlation coefficient [ICC] >0.70) with the following ICCs: physical health 0.79, psychological 0.86, social relationships 0.91, and environment 0.72. The mean time between preadmission and admission was 3.5 days. Normal plots of the residuals from the ANOVA models demonstrated adequate fit to a normal distribution. Internal consistency was reasonably adequate (i.e., α > 0.80) for each domain except social relationships with the following values for Cronbach's alpha: physical health 0.87, psychological 0.82, social relationships 0.64, and environment 0.82.
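To illustrate how an internal consistency coefficient of this kind is obtained, the following minimal Python sketch computes Cronbach's alpha from item-level responses using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the summed score). The function name and the response data are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(item_scores[0])  # number of items in the domain
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each respondent's summed score
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses (5 respondents x 3 items), not study data
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

With only 3 items, as in the social relationships domain, alpha tends to be lower than for longer scales with comparable inter-item correlations, which may partly account for the value observed for that domain.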
A correlation matrix for the individual questionnaire items was very similar whether rank correlation or Pearson's correlation was used (data not shown). We therefore used Pearson's correlation coefficients to determine the factor structure. Bartlett's test of sphericity was significant (χ² = 1,525; 276 degrees of freedom; P < 0.001), indicating that the data were appropriate for a factor analysis. Factor structure was fairly similar to that previously reported. The scree plot (Figure 2) suggested 4 meaningful factors, so the model was constrained to 4 factors. These largely corresponded to previous work (Tables 3 and 4) with slight differences. Item 11 (bodily appearance) and item 14 (opportunity for leisure activity) loaded onto the physical health factor rather than the psychological domain (item 11) or environment domain (item 14). Item 10 (energy) loaded onto both the physical and the psychological domains. Item 13 (availability of information) loaded onto both the environment and social relationships domains. Item 19 (self satisfaction) loaded onto both physical and psychological domains.
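The 276 degrees of freedom reported for Bartlett's test follow from the number of items analysed: for p items, df = p(p − 1)/2, so 24 items give 276. The sketch below, in plain Python, computes the usual form of the statistic, χ² = −[(n − 1) − (2p + 5)/6] ln|R|, from a correlation matrix R. The function name and the small example matrix are hypothetical, and this is not necessarily the software routine used in the study.

```python
import math

def bartlett_sphericity(corr, n):
    """Return (chi_square, df) for a p x p correlation matrix from n subjects."""
    p = len(corr)
    df = p * (p - 1) // 2
    m = [list(row) for row in corr]  # work on a copy
    det = 1.0
    # Determinant by Gaussian elimination with partial pivoting
    for i in range(p):
        pivot = max(range(i, p), key=lambda r: abs(m[r][i]))
        if m[pivot][i] == 0.0:
            det = 0.0
            break
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det
        det *= m[i][i]
        for r in range(i + 1, p):
            factor = m[r][i] / m[i][i]
            for c in range(i, p):
                m[r][c] -= factor * m[i][c]
    if det <= 0.0:
        return math.inf, df  # singular matrix: statistic is unbounded
    chi_square = -((n - 1) - (2 * p + 5) / 6) * math.log(det)
    return chi_square, df

# Hypothetical 3-item correlation matrix from 100 subjects (not study data)
example_corr = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
chi_square, df = bartlett_sphericity(example_corr, 100)

# Degrees of freedom for the 24 items analysed in the text
df_24_items = 24 * 23 // 2
```

A determinant close to 1 (items nearly uncorrelated) gives a statistic near 0, whereas strongly intercorrelated items give a small determinant and a large statistic, which is the situation in which factor analysis is considered worthwhile.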
Table 3. Principal components factor analysis with varimax rotation (forced to have 4 factors)*
| ||Items that loaded on this factor (correlation coefficient > 0.4)||Eigenvalue||Percentage of variance explained|
|Factor 1, physical||3, 4, 10, 11, 14, 15, 16, 17, 18||7.65||17.9|
|Factor 2, environment||8, 9, 12, 13, 23, 24, 25||2.21||13.2|
|Factor 3, psychological||3, 5, 6, 7, 19, 26||1.67||13.2|
|Factor 4, social||13, 20, 21, 22||1.28||9.1|
Table 4. Item loading for the rotated factor solution (forced to have only 4 factors)
|Item||Factor 1, physical||Factor 2, environment||Factor 3, psychological||Factor 4, social|
|11 bodily appearance||0.41||0.27||0.25||0.21|
|14 leisure activities||0.40||0.26||0.19||0.40|
|15 physical mobility||0.76||0.23||0.00||0.22|
|17 activities of daily living||0.73||0.16||0.35||0.12|
|4 medical treatment||0.69||−0.12||0.00||0.00|
|23 living conditions||0.00||0.71||0.25||−0.14|
|24 access to health care||−0.11||0.64||0.12||0.00|
|9 health of environment||0.12||0.64||0.18||0.15|
|19 self satisfaction||0.47||0.23||0.54||0.12|
|26 negative feelings||0.22||0.17||0.67||0.00|
|5 enjoyment of life||0.23||0.10||0.75||0.20|
|6 life meaning||0.13||0.00||0.75||0.29|
|20 personal relationships||0.00||0.29||0.17||0.66|
|22 support from friends||−0.12||0.16||0.38||0.46|
Using the original domain structure, only the psychological domain (beta weight 0.300, P = 0.001) and the environmental domain (beta weight 0.299, P < 0.001) contributed significantly to a multiple regression model of overall QOL (as measured by the overall QLP score, which is calculated as the average of the 9 QLP subscales). This model explained only 42.9% of the variance in overall QOL. Using the 4-factor model derived from the observed data and calculating new domain scores based entirely on these factors and weighting by the factor loading, 45.0% of the variance in overall QOL could be explained, but now all domains contributed significantly: physical health (beta weight 0.297, P < 0.001), environment (beta weight 0.372, P < 0.001), psychological (beta weight 0.379, P < 0.001), and social (beta weight 0.282, P < 0.001).
There was significant correlation between the WHOQOL-BREF domains and the HAQ disability index, in a pattern consistent with the predicted relationship between QOL and physical disability. Although the structures of the QLP and the WHOQOL-BREF are not the same, the relationships between the domains of each instrument are similar to what was predicted, especially with regard to the physical health and psychological domains (Table 5).
Table 5. Correlation matrix between WHOQOL-BREF domains and other measures of disability (HAQ or QOL Profile)*
| ||Physical health||Psychological||Social relationships||Environment||P|
|HAQ disability index||−0.65†||−0.39†||−0.29†||−0.40||0.002|
|Quality of Life Profile|| || || || || |
| Being|| || || || || |
| Belonging|| || || || || |
| Becoming|| || || || || |
|P for a difference between each set of correlation coefficients||<0.001||<0.001||0.31||0.91|| |
For the subjects who were admitted to hospital for treatment of their arthritis, 43% obtained an improvement in their disease activity as judged by the ACR-20 criterion. There was very little agreement between patients' perception of change in either overall QOL or physical health (as measured using the 5-point Likert scale) and the ACR-20 criterion of change, with kappa values of only 0.094 and 0.173, respectively. The responsiveness indices of the domains other than physical health were marginally adequate (Table 6), whereas the physical health scale showed excellent responsiveness. Change scores were normally distributed for all instruments and domains. The HAQ instrument did not have better responsiveness compared with the WHOQOL-BREF using a statistical test of difference between the AUC for summated domain scores (not shown) and HAQ.
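The kappa statistic quoted above corrects the observed agreement between two classifications for the agreement expected by chance. As a minimal sketch, assuming both the patient-perceived change and the ACR-20 criterion have been reduced to binary improved/not improved labels (an assumption made here for illustration), Cohen's kappa can be computed as follows. The function name and the data are hypothetical.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of category labels."""
    n = len(a)
    categories = set(a) | set(b)
    # Proportion of subjects on whom the two classifications agree
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from the marginal proportions
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical data for 10 patients (1 = improved, 0 = not improved)
perceived_improvement = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
acr20_response = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
kappa = cohen_kappa(perceived_improvement, acr20_response)
print(round(kappa, 2))
```

In this example the raw agreement is 70%, but half of that would be expected by chance, so kappa is considerably lower than the raw figure. Kappa is undefined when the expected agreement equals 1 (for example, when every subject falls in a single category).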
Table 6. Responsiveness indices for WHOQOL-BREF domain scores and HAQ between admission to the inpatient treatment unit and followup 2 weeks later*
| ||Mean change||Effect size||Standardized response mean||AUC (95% CI)†|
|Physical health||18.2||1.13||1.05||0.78 (0.65–0.91)|
|Social relationships||5.6||0.31||0.46||0.69 (0.52–0.85)|
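The effect size and standardized response mean in Table 6 differ only in their denominators. The sketch below assumes the conventional definitions (effect size = mean change divided by the SD of baseline scores; standardized response mean = mean change divided by the SD of the change scores), which are not restated in this section. The function name and the scores are hypothetical.

```python
from statistics import mean, stdev

def responsiveness(baseline, followup):
    """Return (mean change, effect size, standardized response mean)."""
    changes = [f - b for b, f in zip(baseline, followup)]
    mean_change = mean(changes)
    effect_size = mean_change / stdev(baseline)  # scaled by baseline spread
    srm = mean_change / stdev(changes)           # scaled by spread of change
    return mean_change, effect_size, srm

# Hypothetical domain scores for 5 patients at admission and followup
admission = [40, 50, 45, 55, 60]
followup = [55, 60, 65, 70, 65]
mean_change, effect_size, srm = responsiveness(admission, followup)
print(mean_change, round(effect_size, 2), round(srm, 2))
```

Because the two indices use different denominators, the same mean change can yield an effect size above or below the standardized response mean, depending on whether patients vary more at baseline or in their amount of change.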
This study set out to examine the psychometric properties of the WHOQOL-BREF to assess its potential as a measure of outcome in people with RA. In general, the measure appears to have adequate test-retest reliability sufficient for evaluating groups of patients (but according to recommended standards, not for evaluating individual patients) (23), concurrent validity with other health status instruments, and discriminative validity when comparing different disease or nondisease groups. However, its factor structure is slightly different in this population from that which was originally reported, and the social relationships domain has poor internal consistency and lacks sensitivity to change. We consider reasons for the differences and similarities in these findings compared with prior studies of WHOQOL-BREF.
First, there are limitations in the quality of the data and methodology of the current study. A significant proportion of the outpatient sample had missing data that prevented valid score calculations for all subjects. Although such data may be missing at random, it is possible that subjects who failed to return their survey forms or provided incomplete answers had different QOL from those whose data were complete. The response rate of 75% for the postal survey is highly satisfactory for such surveys. Furthermore, the indices of reliability and change sensitivity were calculated from the inpatient sample, which had a near complete response rate.
Second, the intervention that we hypothesized would alter QOL (inpatient treatment of active arthritis disease) could have been predicted to affect mainly physical health. It may not be reasonable to expect other aspects of QOL to show significant change with such an intervention. Nevertheless, by using patients' own perception of improvement in QOL, we were able to show that there was significant correspondence between this and each domain on the WHOQOL-BREF, suggesting that when scores in these domains do change, it represents meaningful change.
The relatively small sample size may also prevent valid determination of a factor structure using principal components analysis. We had 169 subjects for analyzing 24 items, or a little more than 7 subjects per item. There is no absolute consensus on what represents a valid sample size for this kind of analysis, but some authors regard 5–10 subjects per item to be adequate (24).
There are limited data from other studies regarding the factor structure of WHOQOL-BREF in specific disease populations. The original study used confirmatory factor analysis with structural equation modeling techniques to test the 4-domain structure (based on the WHOQOL-100) in a mixed disease/healthy population from several countries. The comparative fit index (CFI) was >0.9, suggesting a good fit (7). However, these techniques do not actually show that this structure is optimal, simply that the observed data were consistent with the proposed model. Other models might presumably be better.
In a study of the Taiwanese version of WHOQOL-BREF, although the confirmatory factor analysis showed the original factor structure to fit the data (CFI = 0.886), principal components factor analysis described quite a different factor structure with 2 factors (rather than 1) concerning the environment and a single factor encompassing psychological and social items (25). On the other hand, the Australian validation study of WHOQOL-BREF in a community sample found a structure identical to that originally reported, using principal components analysis, but also noted that the social relationships domain had a relatively low eigenvalue of 1.85 (21). It is difficult to know how important the small differences in factor structure are, as these likely reflect technical differences between different factor analyses rather than a true difference in factor structure.
Because there is no clear consensus regarding the most appropriate method of determining responsiveness, we used 3 methods. The ES and SRM are useful indicators of change following an effective intervention (such as inpatient treatment of RA). Retrospective assessment of change and correlational methods, such as using ROC curves, have been shown by Norman et al (26) to relate only weakly to the size of a treatment effect. In this study, we were able to compare the AUC of the HAQ and WHOQOL-BREF in a meaningful way because all subjects underwent the same intervention. We found no significant differences between the AUC of the domain scores and HAQ, indicating that each correlates to the same extent with a retrospective assessment of change.
The SRM values for WHOQOL-BREF following liver transplantation were somewhat similar to those found in our study (27). In particular, the social relationships domain was also found to be less sensitive to change following liver transplantation, with an SRM of 0.43, and this was significantly worse than the social relationships domain on the original WHOQOL-100. This is in close agreement with the value of 0.46 that we found. However, we found smaller environment and psychological SRM values (0.50 and 0.59) compared with the liver transplantation study (0.74 and 0.91). On the other hand, the physical SRM was greater in our study (1.05) compared with 0.92 in the liver transplantation study. This is probably due to disease-specific factors and the impact of definitive and life-saving treatment (liver transplantation) compared with a predominantly musculoskeletal disease with symptom control and physical function as primary outcomes of treatment, rather than cure.
Although it has often been claimed (28) and sometimes found (29) that disease-specific measures are more responsive to change, we did not observe this. Using patients' perceptions of improvement in QOL as an external criterion standard of improvement, we were able to show that there was no difference between the WHOQOL-BREF domains and the HAQ disability index in terms of concordance between changes in these measures and patients' perception of improvement. Furthermore, the physical domain of WHOQOL-BREF actually showed a greater SRM than HAQ (1.05 compared with 0.85). One consequence of this may actually be increased statistical power in detecting change following intervention for patients with RA, if the physical health domain of WHOQOL-BREF is used as an outcome measure rather than HAQ. Comparison between different disorders would also be greatly facilitated by the use of such a generic instrument. Other investigators have found the SRM for HAQ to vary greatly, depending on the context of measurement, with values ranging from 0.23 or 0.46 following joint arthroplasty (12 months or 3 months postsurgery) to 1.4 in intervention trials for RA (30, 31). Furthermore, confidence intervals for the SRM of the HAQ, where calculated appropriately (32, 33), are broad and suggest that sampling error also contributes significantly to the variation in SRM across different studies. In support of our findings, Tugwell et al (34) found that the standardized treatment effect for the physical function subscale and physical component score of the Short Form 36 (a generic health status measure) was not different from the standardized treatment effect for the HAQ disability index. Similarly, the RAQoL instrument (Swedish version) was not found to be more sensitive to change than the generic Nottingham Health Profile (35).
Sensitivity does not entirely characterize the responsiveness of a measurement instrument. The extent to which “true” change occurs is analogous to diagnostic specificity and was evaluated in this study using ROC curves. The area under an ROC curve (AUC) represents the probability that a randomly chosen patient who reports improvement is correctly classified as improved according to the change in instrument score (36). Each WHOQOL-BREF domain score was better than chance (probability of 0.5) in doing this. In addition, we found no differences in the AUC between the HAQ score and a summated WHOQOL-BREF score using a method to compare 2 or more AUCs calculated from the same sample (18).
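The probabilistic reading of the AUC can be made concrete by comparing every patient who reports improvement with every patient who does not. The following minimal sketch computes the AUC as the proportion of such pairs in which the improved patient has the larger change score, counting ties as one half (equivalent to the Mann-Whitney statistic divided by the number of pairs). The function name and the change scores are hypothetical, and this is not the method used to compare AUCs in the study.

```python
def auc_concordance(improved_changes, not_improved_changes):
    """AUC as the probability that an improved patient outscores a non-improved one."""
    wins = 0.0
    for x in improved_changes:
        for y in not_improved_changes:
            if x > y:
                wins += 1.0   # improved patient has the larger change score
            elif x == y:
                wins += 0.5   # ties count as half
    return wins / (len(improved_changes) * len(not_improved_changes))

# Hypothetical change scores, grouped by patient-reported improvement
improved = [20, 15, 10, 5]
not_improved = [8, 5, 0]
auc = auc_concordance(improved, not_improved)
print(auc)
```

A value of 0.5 corresponds to chance, which is why a confidence interval for the AUC that excludes 0.5 indicates that change in the domain score discriminates between the two groups.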
In conclusion, our findings indicate that WHOQOL-BREF has adequate psychometric properties in patients with RA and may be a useful means of assessing treatment effects across different disease states.