Outcomes after total hip replacement based on patients' baseline status: What results can be expected?

Abstract

Objective

We evaluated patient satisfaction with total hip replacement (THR) to establish cut points of sufficient improvement based on the patient acceptable symptom state (PASS) and receiver operating characteristic (ROC) curves, and compared them with measures derived from the minimum clinically important difference (MCID), taking into account patients' baseline status.

Methods

Two cohorts of prospectively recruited patients on waiting lists for THR were studied. Sociodemographic data and comorbidities were recorded. Patients completed the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) and other patient-reported outcomes questionnaires before THR and 6 months afterward. Cut points of sufficient improvement were established by the PASS, ROC, and MCID and were compared.

Results

Patients satisfied with THR had, by preintervention WOMAC tertiles, gains of 19.4, 34.1, and 49.3 in the WOMAC pain domain and 17.8, 30.8, and 41.4 in the WOMAC functional limitation domain. The PASS cut points determined were 20, 25, and 25 for postintervention WOMAC pain and 28, 35, and 42 for functional limitation. ROC cut points were 19, 25, and 25 for postintervention pain and 26.4, 39, and 40 for functional limitation. Agreement among cut points classifying patients as responders to THR was 1.0 for pain with both PASS and ROC, and 0.85 for functional limitation; 0.6 for pain between MCID and PASS or ROC, and 0.58 and 0.60 for functional limitation.

Conclusion

Cut points of expected gain after THR can help clinicians, researchers, and managers to identify suitable candidates for THR, although such measures must be used with caution.

INTRODUCTION

The prevalence of hip osteoarthritis (OA) is expected to increase substantially over the next 2 decades, especially in developed countries with aging populations (1–4). Along with this will come an increase in the number of total hip replacement (THR) operations, an effective treatment for patients with severe hip pain or functional limitation due to advanced OA and other rheumatologic and orthopedic problems (5, 6).

In recent years, researchers and clinicians have started incorporating measures of effectiveness derived from patients' perceptions of the success or failure of interventions such as THR using patient-reported outcomes (PROs). These are generally obtained through answers to structured, validated questionnaires. In the case of hip OA, questionnaires such as the generic Short Form 36 (7) or the OA-specific Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) (8) are commonly used (9). An important problem with these tools is translating the results into a clinically meaningful interpretation. For example, what does a change of “x” units on a particular PRO instrument mean after a particular surgical intervention, or how much change on a PRO instrument is needed to consider that intervention to have been successful? In an effort to answer such questions, 2 parameters have been defined: the minimum clinically important difference (MCID), defined as the smallest change in measurement that signifies an important difference or improvement in a symptom, and the minimal detectable change (MDC), defined as an estimate of the smallest change in score that can be detected objectively for a patient (10–12).

A new approach to measuring patients' responses to treatment is the patient acceptable symptom state (PASS) (13, 14). As described by Tubach et al (13), the PASS is based on a yes/no answer to a single question: “Considering all the different ways your disease is affecting you, if you would stay in this state for the next months, do you consider that your current state is satisfactory?” A positive PASS indicates that a patient feels well and is therefore a simple way to determine whether a patient has achieved therapeutic success in a clinical trial or in clinical practice. However, the PASS has not yet been validated for patients undergoing THR due to OA. In addition, it is unclear if or how much the PASS or MCID values are influenced by a patient's baseline values.

Our study had 4 goals: 1) to study the relationship of patients' satisfaction with PROs, as measured by the WOMAC, taking into account patient status at baseline; 2) to establish cut points that identify sufficient gain after THR intervention based on the PASS considering baseline status; 3) to establish cut points that identify significant improvement in symptoms measured by the MCID, again taking into account patient status at baseline; and 4) to compare cut points obtained by the PASS, receiver operating characteristic (ROC) curve, and MCID methods.

Significance & Innovations

  • This study provides cut points that identify sufficient gain, as measured by the Western Ontario and McMaster Universities Osteoarthritis Index as a specific patient-reported outcome instrument, after total hip replacement (THR) intervention in patients with hip osteoarthritis taking into account patient status at baseline.

  • Several methods are used and compared to establish those cut points, providing a range of expected gain values after the intervention for different patients.

  • Those cut points of expected gain after THR can help clinicians, researchers, and managers to identify suitable candidates for THR and to inform patients.

PATIENTS AND METHODS

Study design and population.

The present study is based on data from 2 prospective cohorts recruited independently from 7 teaching hospitals of the Basque Health Service (Galdakao-Usansolo, Basurto, Cruces, San Eloy, Donostia, Santiago, and Txagorritxu). These hospitals serve a catchment area of ∼2 million people and provide free, unrestricted care to nearly 100% of the population.

Consecutive patients scheduled to undergo THR between March 1999 and March 2000 were eligible for cohort 1, which we called the derivation cohort. Consecutive patients scheduled to undergo THR between September 2003 and September 2004 were eligible for cohort 2, which we called the validation cohort. In both cohorts, patients were excluded if they were undergoing THR for any condition other than hip OA; refused to participate; had cancer or other severe organic (dementia, cerebrovascular accident, sensory problems) or psychiatric conditions that prevented them from participating in the study or completing the questionnaires; or did not undergo THR for any reason (death, intervention at another hospital, refusal to undergo the intervention) within 1 year after inclusion in the study. Physicians in each hospital were blinded to the study goals. Each hospital's ethics review board approved the study.

Of the 1,499 patients in cohort 1, 678 were excluded (326 did not have OA, 69 had severe diseases, 109 did not have the intervention during the study period, and 174 refused to participate). Of the remaining 821 patients, in 788, we took the first THR intervention with an accessible medical record and, of these 788 patients, 558 completed the WOMAC both at baseline and 6 months after the intervention. In cohort 2, of 634 patients, 558 fulfilled the selection criteria (71 were excluded because they did not have OA and 5 refused to participate). Of the remaining patients, in 445, this was the first THR intervention with an accessible medical record and, of these 445 patients, 339 completed the WOMAC both at baseline and 6 months after the intervention.

Instruments and data collection.

The data collection process and methodology were the same for both cohorts. All patients on the waiting list for THR were sent a letter that described the study and requested voluntary participation. The mailing also included measurement instruments (described below) and a questionnaire for sociodemographic and clinical data. A reminder letter was sent to patients who had not replied after 15 days. Those who had still not responded after another 15 days were sent the mailing again and were contacted by telephone. Approximately 6 months after the intervention, patients were sent a mailing with a similar reminder procedure.

The WOMAC was included in both preintervention and postintervention mailings. The WOMAC is an arthritis-specific, self-administered questionnaire made up of 24 items grouped into 3 dimensions: pain (5 items), stiffness (2 items), and physical function (17 items). We used the Likert version of the WOMAC, with 5 responses for each item representing different degrees of intensity that were scored from 0 (none) to 4 (extreme) (8). The WOMAC has been translated into Spanish and validated in Spain (15). For the purpose of this study, we concentrated on the pain and functional limitation domains.

In the postintervention mailing to both cohorts, patients were asked a PASS question about the results of their THR: “If you had to live the rest of your life with the hip symptoms you have now, how would you feel?” There were 4 possible choices: totally satisfied, slightly satisfied, not satisfied, and not at all satisfied.

In the postintervention mailing to the derivation cohort only, patients were asked a transitional question about the improvement after the THR intervention: “Compared with your status before you had the hip prosthesis, how would you rate the status of your hip right now?” There were 7 choices, ranging from “a great deal better” to “a great deal worse.”

Statistical analysis.

The unit of study was the patient. If a patient underwent 2 interventions during the recruitment period or study, we selected the first intervention performed. We used means and SDs, frequencies, and percentages to describe the samples. Sociodemographic and clinical data, as well as WOMAC domains at baseline, were compared between the cohorts using chi-square or Fisher's exact tests for categorical variables and the t-test or, when necessary, Wilcoxon's nonparametric test for continuous variables. In a similar way, differences in sociodemographic, clinical, and WOMAC scores at baseline between responders (patients with WOMAC scores at followup) and nonresponders were assessed.

Baseline WOMAC pain and functional limitation scores were categorized into tertiles. Means and SDs were calculated for the preintervention and postintervention WOMAC pain and functional limitation scores, as well as the mean change (difference between preintervention and postintervention domains) in the WOMAC domains, according to patient satisfaction status. For the purpose of this analysis, of the 4 patient satisfaction status categories, the 2 worst levels “not at all satisfied” and “not satisfied” were merged into 1 level called “not satisfied,” leaving 3 satisfaction categories overall.

Analysis of variance was used to assess the relationship of the preintervention and postintervention WOMAC scores and the mean change in WOMAC domains with the satisfaction status in the derivation cohort. Scheffe's post hoc test was performed to evaluate differences among the different satisfaction levels. These analyses were performed for cohort 1 and by tertiles. The Kruskal-Wallis nonparametric test was used to assess the relationship of the mean change in WOMAC domains with the transitional question in the derivation cohort stratified by the estimated tertiles.

We computed the PASS value to identify the postintervention WOMAC pain and functional limitation scores beyond which patients considered themselves well. Patient satisfaction state after THR was chosen as the anchoring variable. For the purpose of this analysis, the 4 satisfaction response categories were recategorized into 2 categories: being satisfied (totally satisfied and slightly satisfied) or not satisfied. An empirical cumulative probability curve was constructed as a function of the postintervention WOMAC pain and functional limitation scores for patients in the satisfied group. The PASS was defined as the 75th percentile of the obtained curve.
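As an illustration of the PASS estimation described above, the following minimal Python sketch takes the postintervention scores of the satisfied patients in one stratum and returns the score at which their empirical cumulative distribution reaches 75%. The function name, the sample data, and the percentile convention (the smallest observed score whose cumulative proportion is at least 0.75, with no interpolation) are assumptions made for illustration; the study used SAS and does not state which percentile definition was applied.

```python
import math


def pass_cut_point(post_scores, satisfied, percentile=0.75):
    """Return the score at which the empirical cumulative distribution of
    satisfied patients' postintervention scores reaches `percentile`.

    Assumption: no interpolation; the smallest observed score whose
    cumulative proportion is >= percentile is returned.
    """
    scores = sorted(s for s, ok in zip(post_scores, satisfied) if ok)
    if not scores:
        raise ValueError("no satisfied patients in this stratum")
    # Position (0-based) of the first ordered score reaching the percentile.
    k = math.ceil(percentile * len(scores)) - 1
    return scores[max(k, 0)]


# Hypothetical postintervention WOMAC pain scores (0-100, higher = worse).
post = [0, 5, 10, 10, 15, 20, 25, 40, 55, 70]
sat = [True, True, True, True, True, True, True, True, False, False]
cut = pass_cut_point(post, sat)
print(cut)
```

In the study this calculation would be repeated within each baseline tertile and for each WOMAC domain, giving one PASS value per stratum.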

We developed logistic regression models to determine the optimal cut points for the postintervention scores of the ROC curves. Patient satisfaction was selected as the outcome variable and the postintervention WOMAC pain and functional limitation domains were selected as independent variables. The upper-left point of the ROC curve was chosen as the optimal value. The area under the curve (AUC) for the obtained value with the 95% confidence intervals (95% CIs) was assessed. AUCs for both samples were compared (16).
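The ROC step can be sketched in Python as follows. This is a simplified illustration, not the SAS procedure used in the study: it works directly on the postintervention score rather than on fitted logistic probabilities (with a single predictor both give the same ROC curve), and it interprets "the upper-left point" as the threshold closest to the corner (0, 1), which is an assumption. Function names and data are hypothetical.

```python
import math


def roc_upper_left(post_scores, satisfied):
    """Return (cut_point, sensitivity, specificity) for the threshold whose
    ROC point lies closest to the upper-left corner.

    A patient is predicted 'satisfied' when the score is <= the threshold.
    """
    pos = [s for s, ok in zip(post_scores, satisfied) if ok]
    neg = [s for s, ok in zip(post_scores, satisfied) if not ok]
    if not pos or not neg:
        raise ValueError("both satisfied and not satisfied patients are needed")
    best = None
    for t in sorted(set(post_scores)):
        sens = sum(s <= t for s in pos) / len(pos)
        spec = sum(s > t for s in neg) / len(neg)
        dist = math.hypot(1 - sens, 1 - spec)  # distance to the corner (0, 1)
        if best is None or dist < best[0]:
            best = (dist, t, sens, spec)
    return best[1], best[2], best[3]


def auc(post_scores, satisfied):
    """Area under the ROC curve: probability that a satisfied patient has a
    lower score than a not satisfied patient (ties count one half)."""
    pos = [s for s, ok in zip(post_scores, satisfied) if ok]
    neg = [s for s, ok in zip(post_scores, satisfied) if not ok]
    wins = sum((p < n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical postintervention scores and satisfaction answers.
post = [0, 5, 10, 15, 20, 25, 30, 45, 60, 75]
sat = [True, True, True, True, True, False, True, False, False, False]
cut, sens, spec = roc_upper_left(post, sat)
area = auc(post, sat)
print(cut, round(sens, 3), spec, round(area, 3))
```

Other optimality criteria, such as maximizing the Youden index, may select a different threshold on the same curve.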

PASS values and optimal cut points were estimated for the sample stratified by tertiles of the baseline preintervention WOMAC pain (≤45, 45–60, >60) and functional limitation (≤57.3, 57.4–73.4, >73.4) scores. We compared the PASS values of both WOMAC domains obtained in the 2 cohorts in the following way: we divided each studied sample into 2 groups, according to whether scores of WOMAC domains were lower than or equal to the estimated PASS values, and compared the groups using the chi-square test. This was done for the entire sample and stratified by the corresponding baseline tertiles.
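The stratified grouping described above can be sketched in Python as a lookup: a patient's baseline score selects the tertile, and the tertile selects the cut point against which the postintervention score is compared. The tertile bounds are those stated in the text, and the PASS values are those reported in the Results for the derivation cohort; the function and variable names are hypothetical, and the treatment of scores falling exactly on a bound is an assumption.

```python
# Upper bounds of the 1st and 2nd baseline tertiles, as stated in the text.
PAIN_TERTILE_BOUNDS = (45.0, 60.0)
FL_TERTILE_BOUNDS = (57.3, 73.4)

# PASS cut points by baseline tertile (derivation cohort, from the Results).
PASS_PAIN = {1: 20, 2: 25, 3: 25}
PASS_FL = {1: 28, 2: 35, 3: 42}


def tertile(baseline, bounds):
    """Map a baseline score to tertile 1, 2, or 3.

    Assumption: a score equal to a bound belongs to the lower tertile.
    """
    low, high = bounds
    if baseline <= low:
        return 1
    if baseline <= high:
        return 2
    return 3


def at_or_below_pass(baseline, post, bounds, cut_points):
    """True when the postintervention score is lower than or equal to the
    PASS value of the patient's baseline tertile."""
    return post <= cut_points[tertile(baseline, bounds)]


# Hypothetical patient: baseline pain 40 (1st tertile), 6-month pain 20.
print(at_or_below_pass(40, 20, PAIN_TERTILE_BOUNDS, PASS_PAIN))
```

The resulting yes/no grouping per cohort is what a 2 × 2 chi-square comparison would then be run on.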

The mean change score for patients whose response to the transitional question was “somewhat better” was used to estimate the MCID for WOMAC pain and functional limitation scores in the derivation cohort (17). The transitional question was not included in the mailings to the validation cohort; therefore, it was not possible to estimate the MCID in this group.
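A minimal Python sketch of this anchor-based MCID estimate follows: it averages the preintervention minus postintervention change among patients who chose the anchor response to the transitional question. The function name and data are hypothetical, and the anchor label is a parameter; the default used here follows the "a little better" wording of the response options in Table 3.

```python
from statistics import mean


def mcid(pre_scores, post_scores, transition, anchor="a little better"):
    """Mean change (pre minus post, so positive = improvement) among
    patients whose transitional answer equals `anchor`."""
    changes = [
        pre - post
        for pre, post, answer in zip(pre_scores, post_scores, transition)
        if answer == anchor
    ]
    if not changes:
        raise ValueError("no patients chose the anchor response")
    return mean(changes)


# Hypothetical patients from a single baseline tertile.
pre = [40, 45, 30, 35, 44]
post = [10, 30, 20, 35, 24]
answers = [
    "a great deal better",
    "a little better",
    "a little better",
    "same",
    "a little better",
]
estimate = mcid(pre, post, answers)
print(estimate)
```

A patient would then count as an MCID responder when his or her own change exceeds the estimate for the corresponding baseline tertile.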

We established 3 types of responders to THR: those whose scores were lower than or equal to the computed PASS estimate thresholds in each of the WOMAC domains assessed, those whose scores were lower than or equal to the optimal cut points obtained by the ROC curves (18), and those who surpassed the mean change value of the improvement in each scale defined by those patients who answered feeling “a little better” 6 months after THR (10). Agreement among responders was assessed by means of the kappa coefficient and 95% CIs in both the derivation and validation cohorts, and stratified by the baseline tertiles of scores in each scale. A kappa coefficient >0.70 was considered as good reproducibility (19).
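For reference, the kappa coefficient for 2 responder definitions can be computed from their 2 × 2 cross-classification, as in the Python sketch below. The function name is hypothetical and the 95% CI is not computed. The example counts are the cohort 1 frequencies reported in Table 4 (MCID versus PASS, pain domain) and Table 5 (ROC versus PASS, functional limitation domain).

```python
def cohen_kappa(both_yes, a_only, b_only, both_no):
    """Cohen's kappa for two binary classifications A and B.

    both_yes: responders by A and B; a_only: responders by A only;
    b_only: responders by B only; both_no: nonresponders by both.
    """
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = both_yes + a_only
    b_yes = both_yes + b_only
    # Agreement expected by chance from the marginal totals.
    expected = (a_yes * b_yes + (n - a_yes) * (n - b_yes)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Table 4, cohort 1, pain: A = MCID responders, B = PASS responders.
kappa_pain_mcid_pass = cohen_kappa(394, 70, 17, 97)
# Table 5, cohort 1, functional limitation: A = ROC, B = PASS.
kappa_fl_roc_pass = cohen_kappa(387, 21, 14, 152)
print(round(kappa_pain_mcid_pass, 2), round(kappa_fl_roc_pass, 2))
```

Both values round to the agreement figures shown in the tables (0.60 and 0.85), which supports reading the tabulated agreement as an unweighted kappa.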

Statistical analyses were performed using SAS for Windows statistical software, version 9.2 (SAS Institute). Figures were developed with R release 2.11 statistical software.

RESULTS

The 2 cohorts in our study included 573 and 333 patients undergoing THR, with a response rate at 6 months above 70%. No differences in sociodemographic or clinical variables were observed between the cohorts, except for the WOMAC scales, with poorer results in cohort 2 (mean ± SD preintervention pain score 54.37 ± 18.65 in cohort 1 versus 57.79 ± 19.58 in cohort 2; P < 0.001, and mean ± SD preintervention functional limitation score 64.53 ± 16.86 in cohort 1 versus 68.00 ± 16.98 in cohort 2; P = 0.001).

When comparing responders and nonresponders to the WOMAC questionnaire at 6 months by patient sociodemographic characteristics, body mass index, Charlson Comorbidity Index, presence of contralateral OA, and baseline WOMAC scores, we did not find any difference between the 2 groups in either cohort. Descriptive sociodemographic and clinical data about both cohorts are included in Table 1.

Table 1. Descriptive statistics of the studied samples*

| | Total (n = 906) | Cohort 1 (n = 573) | Cohort 2 (n = 333) | P |
| --- | --- | --- | --- | --- |
| Sex | | | | 0.30 |
|  Male | 449 (49.56) | 276 (48.17) | 173 (51.95) | |
|  Female | 457 (50.44) | 297 (51.83) | 160 (48.05) | |
| Age, years | | | | 0.95 |
|  >70 | 439 (48.62) | 278 (48.52) | 161 (48.79) | |
|  ≤70 | 464 (51.38) | 295 (51.48) | 169 (51.21) | |
| Marital status | | | | 0.67 |
|  Married | 632 (70.93) | 402 (71.28) | 230 (70.34) | |
|  Divorced | 7 (0.79) | 4 (0.71) | 3 (0.92) | |
|  Widow | 172 (19.30) | 112 (19.86) | 60 (18.35) | |
|  Single | 80 (8.98) | 46 (8.16) | 34 (10.40) | |
| Body mass index, kg/m2 | | | | 0.37 |
|  <25 | 164 (18.79) | 98 (17.66) | 66 (20.75) | |
|  25–30 | 427 (48.91) | 270 (48.65) | 157 (49.37) | |
|  >30 | 282 (32.30) | 187 (33.69) | 95 (29.87) | |
| Charlson Comorbidity Index | | | | 0.12 |
|  0 | 529 (58.45) | 324 (56.54) | 205 (61.75) | |
|  1–2 | 345 (38.12) | 232 (40.49) | 113 (34.04) | |
|  >2 | 31 (3.43) | 17 (2.97) | 14 (4.22) | |
| Social support | | 509 (91.22) | 279 (86.65) | 0.04 |
| Contralateral hip OA | 236 (26.05) | 231 (40.31) | 5 (1.50) | < 0.001 |
| Age, mean ± SD years | 68.93 ± 9.12 | 69.35 ± 8.55 | 68.36 ± 9.97 | 0.55 |
| Baseline WOMAC pain domain, mean ± SD | 55.62 ± 19.05 | 54.37 ± 18.65 | 57.79 ± 19.58 | < 0.001 |
| Baseline WOMAC FL domain, mean ± SD | 65.80 ± 16.97 | 64.53 ± 16.86 | 68.00 ± 16.98 | < 0.001 |
| Time to intervention, median (25th, 75th percentiles) months | 96 (8, 177) | 141 (92, 207) | 5.71 (3.62, 7.55) | < 0.01 |

  • * Values are the frequencies (percentages) unless otherwise indicated. OA = osteoarthritis; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; FL = functional limitation.

Patients in the derivation cohort had similar preintervention scores in the 2 WOMAC domains studied across the 3 satisfaction categories (Table 2). Nevertheless, satisfaction increased with greater improvement in either WOMAC domain, both overall and within each preintervention tertile. That is, patients who were worse at baseline and gained more were more satisfied, as were those who gained >15 points in any baseline tertile.

Table 2. WOMAC pain and functional limitation scores in the derivation cohort at baseline and 6 months after total hip replacement according to patient satisfaction after the intervention*

| | Total (n = 573) | Totally satisfied (n = 346) | Slightly satisfied (n = 144) | Not satisfied (n = 83) | P |
| --- | --- | --- | --- | --- | --- |
| WOMAC pain domain | | | | | |
|  Preintervention | 54.30 ± 18.57 | 53.60 ± 18.43 | 53.90 ± 18.49 | 57.93 ± 19.14 | 0.19 |
|  Postintervention | 15.45 ± 16.76 | 9.83 ± 12.67 | 20.78 ± 16.00 | 29.63 ± 21.28 | < 0.001 |
|  Mean change | 38.85 ± 21.96 | 43.76 ± 20.48 | 33.11 ± 21.62§ | 28.30 ± 22.69§ | < 0.001 |
|  Preintervention in tertiles, mean ± SD change | | | | | |
|   1st tertile | 24.68 ± 14.69 | 28.83 ± 12.30 | 19.4 ± 13.11§ | 15.67 ± 20.80§ | < 0.001 |
|   2nd tertile | 38.92 ± 16.99 | 43.15 ± 14.96 | 34.11 ± 16.35§ | 26.96 ± 19.63§ | < 0.001 |
|   3rd tertile | 55.67 ± 21.35 | 63.77 ± 16.84 | 49.27 ± 22.27§ | 39.53 ± 20.99§ | < 0.001 |
| WOMAC functional limitation domain | | | | | |
|  Preintervention | 64.39 ± 16.86 | 63.79 ± 16.61 | 64.66 ± 16.76 | 66.45 ± 18.09 | 0.40 |
|  Postintervention | 26.97 ± 18.73 | 20.20 ± 15.25 | 34.33 ± 16.84 | 42.83 ± 21.03 | < 0.001 |
|  Mean change | 37.42 ± 21.56 | 43.59 ± 19.31 | 30.33 ± 19.85 | 23.62 ± 23.55 | < 0.001 |
|  Preintervention in tertiles, mean ± SD change | | | | | |
|   1st tertile | 24.13 ± 16.90 | 30.09 ± 14.01 | 17.84 ± 15.49# | 7.99 ± 17.39** | < 0.001 |
|   2nd tertile | 38.42 ± 17.92 | 44.29 ± 15.88 | 30.79 ± 14.26§ | 21.74 ± 20.28§ | < 0.001 |
|   3rd tertile | 49.74 ± 21.14 | 57.73 ± 17.09 | 41.43 ± 21.06§ | 36.63 ± 22.12§ | < 0.001 |

  • * Values are the mean ± SD unless otherwise indicated. The satisfaction question was “If you had to live the rest of your life with the hip symptoms you have now, how would you feel?” The responses “not at all satisfied” and “not satisfied” were grouped as “not satisfied.” All 573 patients provided full answers to this question. WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index.

  • Significant score mean differences with the slightly satisfied and not satisfied groups. The post hoc Scheffe's test was used.

  • Mean change: the difference between the preintervention and postintervention domains. The higher the score, the better the health status in each domain.

  • § Significant score mean differences with the totally satisfied group. The post hoc Scheffe's test was used.

  • Tertiles: the tertiles for the preintervention WOMAC pain domain are: 1st tertile ≤45 points, 2nd tertile 45–60 points, and 3rd tertile >60 points. Tertiles for the preintervention WOMAC functional limitation domain are: 1st tertile ≤57.3 points, 2nd tertile 57.4–73.4 points, and 3rd tertile >73.4 points.

  • # Significant score mean differences with the totally satisfied and not satisfied groups. The post hoc Scheffe's test was used.

  • ** Significant score mean differences with the totally satisfied and slightly satisfied groups. The post hoc Scheffe's test was used.

Next, we estimated different cutoff points for the WOMAC change, or postintervention expected values, after a THR by the 3 different methods, taking into account the baseline preintervention values. We first made the estimation in cohort 1 and then in cohort 2 so that results for a particular parameter and method can be compared.

Estimated cut points by the PASS method.

The cumulative distribution of patients who reported being satisfied with THR in relation to their postintervention WOMAC pain and functional limitation scores, stratified by preintervention tertiles, is shown in Figure 1. In the derivation cohort, PASS cut points, defined as the 75th percentile of the obtained curve, for the WOMAC pain domain were 20, 25, and 25 for the 1st, 2nd, and 3rd tertiles, respectively. For the WOMAC functional limitation domain, the PASS cut points were 28, 35, and 42. In the validation cohort, PASS cut points were 20, 25, and 25 for pain, and 32, 32, and 40 for functional limitation. When the results of the 2 cohorts were compared, no statistically significant differences were found except in the 1st tertile of the WOMAC pain domain (P = 0.04).

Figure 1.

Cumulative distribution of patients who reported being satisfied with the outcome, in relation to the postintervention Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain and functional limitation (FL) domains, after total hip replacement stratified by preintervention tertiles. The 4 satisfaction response categories were recategorized into 2 categories: being satisfied or not satisfied. Optimal patient acceptable symptom state (PASS) cutoff points are shown.

Optimal cut points by the ROC curve estimation method.

Figure 2 shows the ROC curves of patients who reported being satisfied with their THR, according to the postintervention WOMAC pain and functional limitation domains, stratified by preintervention tertiles. In the derivation cohort, the optimal cut points for the WOMAC pain domain were 19, 25, and 25 for the 1st, 2nd, and 3rd tertiles, respectively. The optimal cut points for the WOMAC functional limitation domain were 26.4, 39, and 40. In the validation cohort, ROC cut points were 20, 25, and 25 for pain, and 25, 39, and 40 for functional limitation. In both cohorts, AUC values ranged from 0.70 to 0.85 for the WOMAC pain domain and from 0.70 to 0.83 for the WOMAC functional limitation domain. No differences in AUC values were observed between the derivation and validation cohorts.

Figure 2.

Receiver operating characteristic (ROC) curves of patients who reported being satisfied with the outcome of total hip replacement according to postintervention Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain and functional limitation (FL) domains, stratified by preintervention tertiles. Black stars indicate the optimal cut points in the ROC curves. Areas under the curves (AUCs) and optimal ROC cut points are indicated.

Estimated cut points by the MCID method.

We established the MCID for the 2 WOMAC domains according to their respective baseline tertiles (Table 3). Using patients who responded that they were feeling “a little better” after the intervention than before the intervention as a reference MCID group, the MCID cut point values for the WOMAC pain domain in the derivation cohort were 15, 23, and 36 for the 1st, 2nd, and 3rd baseline tertile scores, respectively. MCID cut points in the derivation cohort for the WOMAC functional limitation domain were 9, 22, and 31.

Table 3. Changes in WOMAC pain and functional limitation scores in the derivation cohort from baseline to 6 months after total hip replacement by baseline pain and functional limitation tertiles and the response to the transitional question*

Columns 2 through 6 are the transitional question response options at 6 months after the intervention.

| | A great deal better (n = 484) | A little better (n = 72) | Same (n = 10) | A little worse (n = 5) | A great deal worse (n = 5) | P |
| --- | --- | --- | --- | --- | --- | --- |
| Preintervention WOMAC pain, mean ± SD change | | | | | | |
|  1st tertile | 27.36 ± 12.85 | 14.66 ± 12.55 | 10 ± 10.86 | 0 ± – | −10.50 ± 39.86 | < 0.001 |
|  2nd tertile | 41.61 ± 15.24 | 23.29 ± 16.99 | 25 ± 49.49 | 20 ± 13.54 | NA | < 0.001 |
|  3rd tertile | 59.75 ± 19.22 | 35.80 ± 18.35 | 26.25 ± 27.20 | NA | NA | < 0.001 |
| Preintervention WOMAC functional limitation, mean ± SD change | | | | | | |
|  1st tertile | 27.55 ± 14.70 | 9.42 ± 16.28 | 12.05 ± 20.43 | 10.29 ± 8.32 | −8.42 ± 26.19 | < 0.001 |
|  2nd tertile | 42.49 ± 15.02 | 21.79 ± 9.91 | −5.15 ± 1.03 | 2.94 ± 6.23 | −15.48 ± 9.29 | < 0.001 |
|  3rd tertile | 53.36 ± 18.43 | 30.53 ± 22.36 | 28.42 ± 39.34 | 5.93 ± – | NA | < 0.001 |

  • * WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; NA = not available, no patients found in that category.

  • The transitional question about improvement following total hip replacement was “Compared with your status before you had the hip prosthesis, how would you rate the status of your hip right now?” There were 7 choices for responses, ranging from “a great deal better” to “a great deal worse.” Patients whose response to the transitional question was “somewhat better” were used to estimate the minimum clinically important difference. The tertiles for the preintervention WOMAC pain domain are: 1st tertile ≤45 points, 2nd tertile 45–60 points, and 3rd tertile >60 points. Tertiles for the preintervention WOMAC functional limitation domain are: 1st tertile ≤57.3 points, 2nd tertile 57.4–73.4 points, and 3rd tertile >73.4 points.

Comparison of the different methods.

Cut points from the PASS and ROC methods showed a high level of agreement, with better results for the WOMAC pain domain (agreement between 0.95 and 1) (Table 4) than for the WOMAC functional limitation domain (agreement between 0.81 and 0.85) (Table 5). Agreement of the PASS and ROC criteria with the MCID criterion (based on those who responded with the option “a little better than before the intervention”) was lower: as measured by the kappa statistic, it was 0.60 for the WOMAC pain domain and between 0.58 and 0.60 for the WOMAC functional limitation domain. As mentioned previously, the MCID was evaluated only in the derivation cohort.

Table 4. Agreement among different responder definition criteria by the WOMAC pain domain*

| Criteria | Cohort 1: PASS responders (n1 = 411) | Cohort 1: PASS nonresponders (n1 = 167) | Cohort 1: agreement (95% CI) | Cohort 1: ROC responders (n1 = 411) | Cohort 1: ROC nonresponders (n1 = 167) | Cohort 1: agreement (95% CI) | Cohort 2: PASS responders (n2 = 223) | Cohort 2: PASS nonresponders (n2 = 111) | Cohort 2: agreement (95% CI) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ROC responders | | | 1 | | | | | | 0.95 (0.91–0.98) |
|  Yes (n1 = 411/n2 = 231) | 411 (100) | 0 (0) | | | | | 223 (96.54) | 8 (3.46) | |
|  No (n1 = 167/n2 = 103) | 0 (0) | 167 (100) | | | | | 0 (0) | 103 (100) | |
| MCID responders | | | 0.60 (0.52–0.67) | | | 0.60 (0.52–0.67) | | | |
|  Yes (n1 = 464) | 394 (84.91) | 70 (15.09) | | 394 (84.91) | 70 (15.09) | | | | |
|  No (n1 = 114) | 17 (14.91) | 97 (85.09) | | 17 (14.91) | 97 (85.09) | | | | |

  • * Values are the frequencies (row percentage) unless otherwise indicated. There were 3 types of responders to total hip replacement: those who surpassed the patient acceptable symptom state (PASS) estimate, those who surpassed the optimal cut points obtained by the receiver operating characteristic (ROC) curves, and those who surpassed the mean change value of the improvement in each scale defined by those patients who answered feeling “a little better” 6 months after total hip replacement. WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; n1 = sample in the derivation set; agreement (95% CI) = agreement measured by kappa statistics at 95% confidence level; n2 = sample in the validation set; MCID = minimum clinically important difference.

  • Criteria: by baseline status.

  • MCID responders: can only be computed in the derivation set.

Table 5. Agreement among different responder definition criteria by the WOMAC functional limitation domain*

| Criteria | Cohort 1: PASS responders (n1 = 401) | Cohort 1: PASS nonresponders (n1 = 173) | Cohort 1: agreement (95% CI) | Cohort 1: ROC responders (n1 = 408) | Cohort 1: ROC nonresponders (n1 = 166) | Cohort 1: agreement (95% CI) | Cohort 2: PASS responders (n2 = 219) | Cohort 2: PASS nonresponders (n2 = 114) | Cohort 2: agreement (95% CI) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ROC responders | | | 0.85 (0.81–0.90) | | | | | | 0.81 (0.75–0.88) |
|  Yes (n1 = 408/n2 = 235) | 387 (94.85) | 21 (5.15) | | | | | 206 (93.21) | 15 (6.79) | |
|  No (n1 = 166/n2 = 103) | 14 (8.43) | 152 (91.57) | | | | | 13 (11.61) | 99 (88.39) | |
| MCID responders | | | 0.58 (0.50–0.65) | | | 0.60 (0.52–0.67) | | | |
|  Yes (n1 = 475) | 393 (82.74) | 82 (17.26) | | 399 (84) | 76 (16) | | | | |
|  No (n1 = 99) | 8 (8.08) | 91 (91.92) | | 9 (9.09) | 90 (90.91) | | | | |

  • * Values are the frequencies (row percentage) unless otherwise indicated. There were 3 types of responders to total hip replacement: those who surpassed the patient acceptable symptom state (PASS) estimate, those who surpassed the optimal cut points obtained by the receiver operating characteristic (ROC) curves, and those who surpassed the mean change value of the improvement in each scale defined by those patients who answered feeling “a little better” 6 months after total hip replacement. WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; n1 = sample in the derivation set; agreement (95% CI) = agreement measured by kappa statistics at 95% confidence level; n2 = sample in the validation set; MCID = minimum clinically important difference.

  • Criteria: by baseline status.

  • MCID responders: can only be computed in the derivation set.

DISCUSSION

Various approaches such as the MCID, the MDC, and the PASS (11–12, 17) have been used to establish cut points for clinically relevant improvements or postintervention changes derived from patient answers to PRO instruments. In arthritis research, these techniques have primarily been applied to data from patients receiving pharmacologic treatment or rehabilitation procedures (18, 20, 21). To our knowledge, these approaches have not been applied to THR, a very common intervention among patients with OA. In addition, few prior studies have taken into account patients' baseline status when defining cut points. In a large cohort of patients with hip OA undergoing THR, we demonstrated a clear relationship between satisfaction and changes in PRO parameters and defined cut points and the influence of patients' baseline status on them. These results were validated in a second large cohort.

Our results show a clear relationship between patient satisfaction with THR and PROs measured by the WOMAC. Patients who were more satisfied with THR were also more likely to report achieving greater gains from the intervention. However, patient satisfaction and gain were both dependent on baseline status. The greater the impact that hip OA had on a patient at baseline, the greater his or her satisfaction was with the intervention and the greater the gain reported. It is possible this is due to the floor/ceiling effect that is quite common with PRO measures that are unable to reach very low or high values. It is equally likely that the poorer a patient's health-related quality of life prior to an intervention such as THR, the more he or she can gain from the intervention.

In 2 independent cohorts, we observed similar cut points of sufficient gain after THR based on the PASS and ROC estimations, even after taking into account patient status at baseline. To the best of our knowledge, no studies have focused on establishing cut points of success based on PASS for patients with a hip replacement.

We also established the MCID in the derivation cohort. When we established the MCID using the answer feeling “a little better” after THR to the transitional question, the MCID cut points for the baseline tertiles were always lower than the PASS values. The PASS and ROC values we observed were closer to those found among patients who responded feeling “much better” to the transitional question. Our data also show that the baseline WOMAC status conditioned the MCID estimations. Although a few studies have estimated the MCID for these kinds of patients (13, 14, 20, 22), none took into account the baseline status of the patient.

When we compared whether patients were classified as responders or nonresponders to THR based on cut points established by the PASS, ROC, and MCID criteria, taking baseline status into account, high agreement was observed between the PASS and ROC criteria, with kappa coefficients of 0.81 or higher in both cohorts (0.95 or higher for the pain domain). When we compared them with the MCID criteria, the agreement was far lower because more patients were considered responders based on the MCID criteria, while some of these patients would not be considered responders by the PASS or ROC criteria. Previous studies have reported that the MCID and PASS measure different things (12, 23). Our data suggest that PASS and ROC cut points represent an upper limit of what patients can be expected to gain after THR, while the MCID represents a lower limit.

As with any prospective cohort study, a clear limitation of our study is missing data. Nevertheless, the response rates obtained in both cohorts were similar to those in other studies, and no important differences were observed between responders and nonresponders. Second, there was a difference between our study and the original article about PASS. In the original, the question about patient opinion had different wording and was answered “yes” or “no.” In our case, the question, as cited above, was also focused on symptoms but was not as specific as in the original. In addition, the answers were rated using a Likert scale, which was dichotomized as responders (totally satisfied/satisfied) and nonresponders (somewhat dissatisfied/totally dissatisfied). In spite of this, we think both versions of the question are about adaptation to symptoms and satisfaction. It is also possible that assessing the impact of THR at 6 months biased the results. Although several studies have demonstrated that improvement can be seen clearly at 6 months (6, 24–26), some studies suggest a longer followup period (27) to establish the highest recovery level for these patients.

Patients' satisfaction can be an important parameter to define minimum or acceptable changes after a surgical intervention such as THR. Nevertheless, patients' satisfaction is conditioned by several other parameters, some of which are not yet fully defined; among them are patients' expectations before the intervention (28, 29). Some studies have already shown a clear relationship between patients' expectations and their satisfaction after an intervention (30, 31). To define these estimators more precisely, future studies should focus on the interrelationship of all these parameters. Finally, we must also note that the MCID and PASS parameters must be applied to clinical practice cautiously (12, 23, 32). These reference values are for groups, not individual patients. We should then ask ourselves what the practical value of these results is. Our findings can be used to support the clinician in the clinical decision-making process. The clinician has to take them in the context of a proper evaluation of the particular patient, taking the patient's opinion into consideration while basing the decision on clinical experience and expertise. Only once all these aspects have been taken into account can our estimations be incorporated into the final decision-making process.

In summary, our study provides reference cut points for THR that may be useful for clinicians, patients, and researchers. Clinicians can use these cut points to judge the appropriateness of THR based on the expected results or to provide more realistic information about the likely results of the intervention to patients and their families. These instruments are also useful for managers or administrators in quality improvement programs and for researchers in health services research studies. Although our study provided estimates of change in pain and functional limitation following THR, as measured by the WOMAC, such estimates must be considered in light of individual patient characteristics, especially the patient's baseline status, given our finding that postintervention improvement was inversely associated with baseline status. Moreover, the low agreement between the MCID and PASS/ROC criteria calls for prudence when using these cut points in individual cases or even for legal or managerial control of THR. Instead, these cut points should be seen more broadly as CIs representing what a patient is likely to gain following THR.

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Quintana had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study conception and design. Quintana, Escobar.

Acquisition of data. Orive, Garcia.

Analysis and interpretation of data. Aguirre, Barrio.
