To assess a generic measure of health-related quality of life (HRQOL) as an outcome measure in granulomatosis with polyangiitis (Wegener's) (GPA).
Subjects were participants in the Wegener's Granulomatosis Etanercept Trial (WGET) or the Vasculitis Clinical Research Consortium Longitudinal Study (VCRC-LS). HRQOL was assessed with the Short Form 36 (SF-36) health survey that includes physical and mental component summary scores (PCS and MCS, respectively). Disease activity was assessed with the Birmingham Vasculitis Activity Score for Wegener's Granulomatosis (BVAS/WG).
The data from 180 subjects in the WGET (median followup 2.3 years, mean number of visits 10) and 237 subjects in the VCRC-LS (median followup 2.0 years, mean number of visits 8) were analyzed. A 1 unit increase in the BVAS/WG corresponded to a 1.15 unit (95% confidence interval [95% CI] 1.02, 1.29) decrease for the PCS and a 0.93 (95% CI 0.78, 1.07) decrease for the MCS in the WGET, and to a 1.16 unit decrease for the PCS (95% CI 0.94, 1.39) and a 0.79 unit decrease for the MCS (95% CI 0.51, 1.39) in the VCRC-LS. In both arms of the WGET study, SF-36 measures improved rapidly during the first 6 weeks of treatment followed by gradual improvement among patients achieving sustained remission (0.5 improvement in PCS per 3 months), but worsened slightly (0.03 decrease in PCS every 3 months) among patients not achieving sustained remission (P = 0.005).
HRQOL, as measured by the SF-36, is reduced among patients with GPA. SF-36 measures are modestly associated with other disease outcomes and discriminate between disease states of importance in GPA.
Granulomatosis with polyangiitis (Wegener's) (GPA) is an organ- and life-threatening multisystemic disease usually treated initially with high-dose glucocorticoids in combination with an additional immunosuppressive drug. For patients with GPA, the cumulative burden of disease and treatment-related adverse effects can be substantial (1, 2).
In randomized clinical trials (RCTs) of GPA and the closely related form of vasculitis, microscopic polyangiitis (MPA), treatment efficacy of experimental agents has almost exclusively been determined by physician-based measures of disease activity (3–7). These outcome measurement tools were developed by consensus expert opinion panels and then validated for use in trials (8–11). Patients rank the relative importance of disease manifestations of vasculitis differently than do physicians (1, 12), and patient-reported outcomes might capture elements of treatment response that are missed by currently used activity measures. In the study of other rheumatic diseases, there is an increased awareness that outcomes such as fatigue (13) and health-related quality of life (HRQOL) (14) can discriminate between disease states of importance and are useful as outcome measures for clinical trials. Several observational studies have found that patients with systemic vasculitis have reduced HRQOL (15–20). Use of instruments directly originating from patients as outcome measures in clinical trials of vasculitis is essential to understanding how different treatments impact the disease from the patients' perspective.
The objective of this study was to explore whether HRQOL, as measured by the Short Form 36 (SF-36) health survey, has an additive role as an outcome measure tool for use in RCTs in GPA beyond currently used outcome measures; specifically, 1) describing SF-36 scores among patients with GPA, 2) exploring the association of SF-36 scores with physician-based and other patient-based outcome measures, and 3) determining whether SF-36 scores discriminate between disease states of importance in GPA. This study is part of a larger effort to develop valid outcome measures for use in clinical trials of vasculitis (11, 21).
Health-related quality of life, as measured by the Short Form 36 (SF-36) health survey, is reduced among patients with granulomatosis with polyangiitis (Wegener's) (GPA), and should be assessed in clinical trials of vasculitis.
SF-36 measurements help discriminate among disease states of importance in GPA.
These data demonstrate the usefulness of including SF-36 as an outcome measure for clinical trials of GPA.
Study subjects were participants in the Wegener's Granulomatosis Etanercept Trial (WGET) (22, 23) or they were subjects with GPA enrolled in the Vasculitis Clinical Research Consortium (VCRC) Longitudinal Study (LS) of GPA. The WGET was a randomized, double-blind, placebo-controlled trial of standard therapy with the addition of etanercept or placebo for patients with GPA conducted at 8 clinical centers in North America. Subjects were enrolled at a time of active vasculitis and had study visits at baseline, 6 weeks, and every 3 months thereafter. The VCRC-LS is an ongoing longitudinal observational cohort in which subjects are followed on either annual or quarterly schedules. Data from all subjects in the WGET and from those subjects with GPA in the VCRC-LS who attended at least 3 study visits were included in this analysis.
HRQOL was assessed with the SF-36 at all study visits in the VCRC-LS observational cohort and at every other visit in the WGET. In both cohorts, this information was collected on paper in the presence of a study administrator who provided no coaching or suggestions regarding the content of the questionnaire. The SF-36 contains 36 items that assess HRQOL in 8 health dimensions: physical functioning, role physical, bodily pain, general health, vitality, social functioning, role emotional, and mental health (24, 25). Scores for each dimension/subscale range from 0–100, with higher scores indicating better HRQOL. Two summary scores are derived from the 8 subscales: the physical component summary (PCS) score and the mental component summary (MCS) score, both of which are norm-based scores standardized to the US general population and transformed to have a mean of 50 and SD of 10 in the referent population.
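As a minimal illustration of what norm-based scoring means, the sketch below rescales a raw score so that a reference population has a mean of 50 and an SD of 10. The function name and the reference mean and SD are placeholders chosen for illustration, not the published US norms, and the actual PCS and MCS derivation also weights the 8 subscales before this step, which is not reproduced here.

```python
def norm_based_score(raw, ref_mean, ref_sd):
    """Rescale a raw score so the referent population has mean 50 and SD 10."""
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    z = (raw - ref_mean) / ref_sd  # standardize against the reference population
    return 50.0 + 10.0 * z


# Placeholder reference values (assumed for illustration, not published norms).
REF_MEAN = 80.0
REF_SD = 20.0

print(norm_based_score(80.0, REF_MEAN, REF_SD))  # 50.0 (equal to the reference mean)
print(norm_based_score(60.0, REF_MEAN, REF_SD))  # 40.0 (one SD below the reference mean)
```

On this scale, a summary score of 40 lies one reference SD below the general population mean, which makes the PCS and MCS values reported here directly interpretable.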
Disease activity was assessed with the Birmingham Vasculitis Activity Score for Wegener's Granulomatosis (BVAS/WG) (10), which measures activity in 34 items categorized into 9 groups. The BVAS/WG takes values from 0–63, with higher scores representing more manifestations of active disease. Active disease was defined as a BVAS/WG of >0 and inactive disease as a BVAS/WG of 0. Sustained remission was defined as a BVAS/WG of 0 lasting for 6 months.
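The disease-state definitions above can be sketched in code. The visit representation (day of visit, BVAS/WG score) and the 180-day threshold standing in for "6 months" are assumptions made for this illustration; the text does not specify how the 6-month window was operationalized between scheduled visits.

```python
def is_active(bvas_wg):
    """Active disease: BVAS/WG greater than 0."""
    return bvas_wg > 0


def achieved_sustained_remission(visits, min_days=180):
    """Return True if an unbroken run of visits with BVAS/WG = 0 spans min_days.

    visits: list of (day_since_baseline, bvas_wg) tuples.
    min_days: assumed stand-in for "6 months" (hypothetical threshold).
    """
    run_start = None
    for day, score in sorted(visits):
        if is_active(score):
            run_start = None  # any active visit breaks the remission run
            continue
        if run_start is None:
            run_start = day
        if day - run_start >= min_days:
            return True
    return False


# Hypothetical visit schedules: baseline, 6 weeks, then roughly every 3 months.
remitted = [(0, 6), (42, 0), (132, 0), (222, 0), (312, 2)]
relapsed = [(0, 6), (42, 0), (132, 3), (222, 0)]
print(achieved_sustained_remission(remitted))  # True
print(achieved_sustained_remission(relapsed))  # False
```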
Disease damage was assessed with the Vasculitis Damage Index (VDI) (9), a 64-item catalog of damage with all items equally weighted and with scores ranging from 0–64, where higher scores indicate more disease-related damage. The VDI was measured at baseline and every 6 months thereafter in both the WGET and the VCRC-LS.
Patient-reported disease severity was assessed on a visual analog scale of 0–100 in the WGET and on an 11-point scale (0–10) in the VCRC-LS cohort, where the data were transformed to a scale of 0–100 to allow for comparable data for both cohorts.
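A minimal sketch of the rescaling step, assuming a simple linear transformation (the text does not state the exact method used); the function name is hypothetical.

```python
def rescale_to_100(score, old_max=10):
    """Linearly map a score on a 0..old_max scale onto a 0..100 scale."""
    if not 0 <= score <= old_max:
        raise ValueError("score outside the expected range")
    return score * 100.0 / old_max


print(rescale_to_100(7))  # 70.0
```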
The distribution of baseline factors is described with means and SDs for normally-distributed variables and medians and interquartile ranges (IQR) for non–normally distributed variables. For both WGET and VCRC data, the association of SF-36 summary scores (dependent variables) with other outcome measures (independent variables) was explored with mixed linear models with a random intercept to account for within-individual correlation of outcomes. Bivariate association between traditional outcome measures with SF-36 summary scores was the primary analysis, followed by multivariable analysis exploring the association of traditionally used outcome measures with the SF-36 summary scores adjusted for other outcome measures. The proportion in variability of the SF-36 measures that was explained by other outcome measures was estimated by the relative reduction in the residual variance by comparing the residual variance of the model that included BVAS/WG, VDI, patient-reported severity, and the random effect for each subject with the residual variance of a model that only included the random effect for each patient (intercept only model). Analyses were done separately for each cohort.
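One way to write down the random-intercept model and the variance-explained calculation described above is sketched here. The notation is an assumption introduced for illustration and does not come from the original analysis code; Ptglobal denotes patient-reported disease severity.

```latex
% Random-intercept model for subject i at visit j (notation assumed)
Y_{ij} = \beta_0 + \beta_1\,\mathrm{BVAS/WG}_{ij} + \beta_2\,\mathrm{VDI}_{ij}
       + \beta_3\,\mathrm{Ptglobal}_{ij} + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)

% Proportion of variability explained: relative reduction in residual variance
% between the intercept-only model (null) and the model with all 3 measures (full)
\text{proportion explained}
  = \frac{\hat{\sigma}^2_{\mathrm{null}} - \hat{\sigma}^2_{\mathrm{full}}}
         {\hat{\sigma}^2_{\mathrm{null}}}
```

Here $Y_{ij}$ is the PCS or MCS. With purely illustrative residual variances of 100 for the intercept-only model and 80 for the full model, the proportion explained would be 20%.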
For subjects in the WGET only, the longitudinal trajectory of mean summary SF-36 scores was assessed with a mixed linear model with a random intercept to account for the within-individual correlation of the SF-36 measures and linear splines to allow for different rate of change in HRQOL with time. After visual inspection of data, a knot was placed at the time of 6 weeks of followup.
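The linear spline with a knot at 6 weeks can be illustrated as follows. This is a sketch of the fixed-effect part only: the coefficient values are hypothetical, time is assumed to be measured in weeks, and the random intercept is omitted.

```python
KNOT_WEEKS = 6.0  # knot placed at 6 weeks of followup


def spline_terms(weeks, knot=KNOT_WEEKS):
    """Split followup time into time accrued before and after the knot."""
    early = min(weeks, knot)
    late = max(weeks - knot, 0.0)
    return early, late


def predicted_mean(weeks, intercept, slope_early, slope_late, knot=KNOT_WEEKS):
    """Piecewise-linear mean trajectory with separate slopes on each side of the knot."""
    early, late = spline_terms(weeks, knot)
    return intercept + slope_early * early + slope_late * late


# Hypothetical coefficients: fast early improvement, slower change afterwards.
print(predicted_mean(6.0, intercept=30.0, slope_early=1.0, slope_late=0.25))   # 36.0
print(predicted_mean(18.0, intercept=30.0, slope_early=1.0, slope_late=0.25))  # 39.0
```

Parameterizing the spline this way lets each slope be read directly as the rate of change within its own period.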
To determine if the SF-36 longitudinal trajectories are associated with sustained remission beyond what would be due to their association with disease activity, a simulation-based analysis was used. For this analysis, the null hypothesis was that there is no further association between the SF-36 measures and sustained remission than that which is induced by their association with the BVAS/WG. To test this hypothesis, the mean difference between subjects with sustained remission and those without in the PCS and MCS over time in the WGET were compared to a 95% confidence band for the expected difference in these changes over time under the null hypothesis. If the observed data went outside these confidence bands, that would indicate that the null hypothesis should be rejected at the 0.05 significance level.
Simulations were used to create the 95% confidence bands under the null hypothesis. In the simulations, random number generators were used to create values for the PCS and MCS at all time points in 200 artificial data sets that mimic the WGET data but that also follow the null hypothesis. These artificial data sets mimic the WGET in the following ways: 1) the same numbers of subjects, 2) the same BVAS/WG scores, 3) the same linear association between the BVAS/WG and PCS and MCS, and 4) the same PCS and MCS means, SDs, and within-subject correlations. Differences in the mean trajectories between subjects who achieved sustained remission and those who did not were calculated for each data set at each time point. The 2.5th and 97.5th percentiles of these differences were used to create the 95% confidence band. All statistical analyses were done using SAS software, version 9.1.
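The percentile step of this procedure can be sketched as below. This is not the original SAS analysis: the toy simulations draw independent standard normal differences rather than data conditioned on the observed BVAS/WG scores, and the function names, the interpolation rule for percentiles, and the "observed" values are all assumptions made for illustration.

```python
import random


def percentile(sorted_values, pct):
    """Percentile of an already sorted list, using linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def confidence_band(sim_diffs, lower=2.5, upper=97.5):
    """Pointwise band from simulated trajectories of between-group differences.

    sim_diffs: one list of differences (one per time point) per simulated data set.
    Returns a list of (low, high) tuples, one per time point.
    """
    band = []
    for values_at_t in zip(*sim_diffs):
        ordered = sorted(values_at_t)
        band.append((percentile(ordered, lower), percentile(ordered, upper)))
    return band


def escapes_band(observed, band):
    """Flag each time point where the observed difference lies outside the band."""
    return [not (low <= obs <= high) for obs, (low, high) in zip(observed, band)]


# Toy stand-in for 200 data sets simulated under the null hypothesis.
random.seed(0)
N_SIMS, N_TIMES = 200, 5
sims = [[random.gauss(0.0, 1.0) for _ in range(N_TIMES)] for _ in range(N_SIMS)]
band = confidence_band(sims)

observed = [0.0, 10.0, 0.0, -10.0, 0.0]  # hypothetical observed differences
print(escapes_band(observed, band))
```

Any time point flagged as outside the band corresponds to an observed difference that would be unusual under the null hypothesis.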
Data from 180 subjects in the WGET and 228 subjects in the VCRC-LS cohort were used for analysis. Baseline demographic and disease characteristics are summarized in Table 1. Baseline visit SF-36 scores were reduced from those expected in the general population, and more so in the WGET cohort where all subjects had active disease at baseline (Table 2).
Table 1. Baseline demographic and disease characteristics
|Characteristic||WGET (n = 180)||VCRC-LS (n = 228)|
|Age, mean ± SD years||49.9 ± 15.4||49.1 ± 16.7|
|Disease duration, median (IQR) years||0.4 (0.1–2.9)||4.0 (1.9–8.4)|
|Followup, median (IQR) years||2.3 (1.3–2.8)||2.3 (1.9–2.9)|
|No. of visits, median (IQR)||10 (7–13)||7 (4–10)|
|BVAS/WG at baseline, median (IQR) (range)||6 (4–9) (2–18)||0 (0–1) (0–9)|
|VDI at baseline, mean ± SD||1.8 ± 1.9||2.2 ± 2.1|
Table 2. SF-36 scores at the baseline visit, mean ± SD
|SF-36 measure||WGET (n = 180)||VCRC-LS (n = 228)|
|Physical functioning||55.8 ± 28.1||71.7 ± 25.4|
|Role physical||16.6 ± 31.6||62.8 ± 31.5|
|Bodily pain||47.7 ± 28.3||65.4 ± 27.0|
|General health||46.0 ± 20.2||52.0 ± 22.6|
|Vitality||35.6 ± 21.6||49.9 ± 23.0|
|Social functioning||49.5 ± 28.5||69.7 ± 27.9|
|Role emotional||53.5 ± 44.8||76.0 ± 28.6|
|Mental health||53.5 ± 44.8||71.2 ± 19.3|
|PCS||33.5 ± 9.7||43.6 ± 10.3|
|MCS||44.2 ± 11.6||46.5 ± 11.7|
In both study cohorts, the SF-36 summary measures were inversely associated with disease activity, i.e., a 1-unit increase in the BVAS/WG corresponded to a decrease in PCS by 1.15 (95% confidence interval [95% CI] 1.02, 1.29) in the WGET and by 1.06 (95% CI 0.82, 1.31) in the VCRC-LS. A 1-unit increase in BVAS/WG corresponded to a decrease in MCS by 0.93 (95% CI 0.78, 1.07) in the WGET and by 0.89 (95% CI 0.58, 1.20) in the VCRC-LS. In both study cohorts, SF-36 summary measures were inversely associated with patient-reported disease severity (Table 3). In the bivariate analysis, the VDI was significantly associated with PCS in the VCRC-LS only (Table 3) and not associated with MCS in either cohort. However, in an analysis adjusted for disease activity (BVAS/WG), there was a statistically significant association between damage and PCS in the WGET, with a 1-unit increase in VDI corresponding to a decrease in PCS by 0.91 (95% CI 0.44, 1.38). In the WGET, the BVAS/WG, the VDI, and patient-reported disease severity together explained 23% and 15% of the variance in PCS and MCS, respectively. In the VCRC-LS, these 3 outcome measures together explained 16% and 5% of the variance in PCS and MCS, respectively (Table 3).
Table 3. Association of SF-36 summary scores with other outcome measures, estimate (95% CI)
|WGET||PCS||MCS|
|BVAS/WG||−1.15 (−1.29, −1.02)||−0.93 (−1.07, −0.78)|
|VDI||−0.28 (−0.77, 0.22)||−0.16 (−0.32, 0.064)|
|Ptglobal||−0.14 (−0.15, −0.13)||−0.12 (−0.13, −0.10)|
|Variance in SF-36 measures explained by BVAS/WG, VDI, and Ptglobal, %||23||15|
|VCRC-LS||PCS||MCS|
|BVAS/WG||−1.06 (−1.31, −0.82)||−0.89 (−1.20, −0.58)|
|VDI||−0.40 (−0.73, −0.07)||−0.06 (−0.45, 0.33)|
|Ptglobal||−0.14 (−0.16, −0.13)||−0.11 (−0.13, −0.09)|
|Variance in SF-36 measures explained by BVAS/WG, VDI, and Ptglobal, %||16||5|
Ptglobal = patient-reported disease severity.
In the WGET cohort, where all subjects had active disease at baseline, 126 out of 180 subjects (70%) achieved sustained remission. The SF-36 summary scores improved rapidly during the first 6 weeks for all patients. However, after 6 weeks there was more gradual improvement (0.5 in PCS per each 3-month interval) among those who achieved sustained remission versus a slight worsening (0.03 in PCS) among those who did not achieve sustained remission (P = 0.005 for the difference between those that achieved sustained remission and those who did not achieve sustained remission). Results were similar for MCS, with improvement by 0.4 points per each 3-month interval among those who achieved sustained remission versus slight worsening by 0.1 among those who did not achieve remission (P = 0.005 for difference between those that achieved sustained remission and those who did not achieve sustained remission) (Figure 1).
Figure 2 shows the observed difference in PCS and MCS over time between subjects with sustained remission and those without, compared to the expected difference and its 95% confidence band under the null hypothesis of no association beyond that induced by the BVAS/WG. The observed data escape the confidence band for both the PCS and MCS, indicating rejection of this null hypothesis. For the PCS, the observed difference escapes the confidence band almost immediately, indicating that the null hypothesis and the observed data are quite discordant. For the MCS, the observed difference is slower to leave the confidence band and stays somewhat closer to it, indicating that the discrepancy is likely smaller.
The findings of this study are consistent with previous reports that HRQOL is reduced among patients with GPA (15–20) and provide evidence that SF-36 is a valid outcome measure for use in clinical trials of GPA. HRQOL, as measured by the SF-36, is especially reduced during periods of active disease in GPA and rapidly improves with induction treatment. SF-36 summary scores are associated with both physician-based measures of disease and patient-reported disease severity. The SF-36 is a generic tool for assessing HRQOL and its utility has been validated in multiple chronic inflammatory diseases (26–28). The SF-36 is designed to capture elements of HRQOL across different diseases and cultural groups. BVAS/WG and VDI only explain a modest proportion in the variability of the SF-36 measures, and SF-36 measures discriminate between disease states beyond their association with BVAS/WG.
Outcome measurement tools for determining efficacy of potential therapeutic agents for GPA and MPA are under active development by the Outcome Measures in Rheumatology Vasculitis Working Group (11, 21). Systemic vasculitides are rare diseases, recruitment for clinical trials is challenging, and compared to most other diseases, clinical trials of vasculitis involve relatively small sample sizes. The success of RCTs in providing high-level evidence for therapeutic approaches in vasculitis is dependent on the successful design of outcome measures with good discriminatory power between effective and ineffective therapies. This goal can sometimes be achieved by combining individual components that modestly correlate with each other into a composite measure more sensitive to change than its individual components (29, 30). Our findings suggest that HRQOL could represent one disease domain contributing to such a composite index in GPA.
This study has several important strengths. The data sources are 2 large, well-defined patient cohorts with GPA, each followed for more than 2 years. One data source represents a trial population for which the utility of SF-36 measures was under study, and the other source is an observational cohort, adding to the generalizability of our findings to other patient populations with GPA. The longitudinal nature of the data provides repeated SF-36 measurements, allowing for an evaluation of longitudinal changes during the disease course. Furthermore, along with measurements of HRQOL, other established outcome measures for GPA were obtained, all with standardized methods, allowing for the exploration of how SF-36 measures compare with other currently used outcome measures in GPA.
This study also has some limitations to consider. Disease activity (and other outcome measures) was measured at scheduled study visits only. Since the BVAS/WG reflects disease activity during the 28 days prior to the study visit, periods of disease activity between visits could be missed. Furthermore, the VCRC-LS data might have been subject to a selection bias with differential followup between those with periods of active disease and those with inactive disease. It is possible that the SF-36 captures effects of nonvasculitic conditions on HRQOL. However, data on comorbidities were not available for our analysis. Despite this lack of specificity, the SF-36 still serves as an important outcome measure and could differentiate between active treatment and placebo with respect to medication effects that affect HRQOL and are of importance to patients. Additionally, neither of the study cohorts is an inception cohort, and therefore patients who present with rapidly fatal initial disease are underrepresented in the current study.
The lack of association between the SF-36 and disease-related damage is notable and is in contrast to a previous analysis from the WGET cohort in which a statistically significant correlation was found between PCS (but not MCS) and VDI (2). The reason for these discrepant findings is that the 2005 analysis was cross-sectional with all data from a single time point (at 1 year of followup), whereas the current analysis utilizes the longitudinal nature of the data, using all available study visits up to 2 years from enrollment of the last subject. The results of the primary analysis demonstrating no association between disease-related damage and PCS may represent a spurious finding, while the secondary analysis, adjusted for the BVAS/WG, is more appropriate. While all subjects in the WGET had active disease at baseline, 44% of the subjects had new disease at enrollment and could not have accumulated disease-related damage; these subjects also had an initial VDI score of 0, creating an inverse association between disease activity and disease damage. Therefore, the analysis adjusting for disease activity better captures the true association between disease damage and PCS in the WGET cohort and is consistent with the findings in the separate VCRC-LS cohort.
This study demonstrates that HRQOL in GPA, as measured by the SF-36, is associated with vasculitis disease activity and discriminates between disease states of importance. Therefore, the SF-36 could capture data on a disease domain not represented by other currently used outcome measures. The SF-36 should be included in the core set of outcome measures in GPA (11). Future research on using the SF-36 in trials of vasculitis should include determination of minimally clinically important changes in SF-36 scores, how the SF-36 measures discriminate between effective and ineffective (or less effective) treatments, and whether the SF-36 could contribute to a composite outcome measure.
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be submitted for publication. Dr. Merkel had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Tomasson, Walsh, Davis, Hoffman, McCune, Spiera, Stone, Merkel.
Acquisition of data. Carette, Davis, Hoffman, Langford, McAlear, McCune, Monach, Seo, Specks, Spiera, St.Clair, Stone, Ytterberg, Merkel.
Analysis and interpretation of data. Tomasson, Boers, Walsh, LaValley, Cuthbertson, Khalidi, Langford, McCune, Spiera, St.Clair, Merkel.