To determine whether provider volume is associated with early failures following total hip replacement (THR) requiring revision.
Claims data were analyzed to follow a cohort of 57,488 Medicare beneficiaries who underwent elective primary THR in 1995–1996 in 3,044 hospitals in the US. Patients were followed through the end of 1999. Failure of primary THR was defined as a subsequent revision THR, as determined by International Classification of Diseases, Ninth Revision codes in hospital claims. Hospitals were stratified into 4 volume groups: low (≤25 THRs/year), medium (26–50 and 51–100 THRs/year), and high (>100 THRs/year). Low-volume surgeons were defined as those surgeons performing <12 elective primary THRs annually in the Medicare population. Associations between the rates of revision and surgeon volume were determined by hazard ratios from a proportional hazards model, with adjustment for hospital volume, patient age, poverty status, sex, and comorbidities. We also examined whether the effect of surgeon volume on revision rates differed between the first 18 months postoperatively and later time periods.
Among 57,488 patients who had elective primary THR in 1995–1996, 2,537 (4.4%) had at least 1 revision THR by the end of 1999, with 1,437 (56.6%) of these revisions occurring within the first 18 months after the index primary THR. Median followup time was 47 months (range 0–54). Patients of high-volume surgeons were less likely to have revision THRs than patients of low-volume surgeons, regardless of hospital volume stratum. Further analysis revealed that the effect of surgeon volume on revisions was striking in the first 18 months after surgery but was not evident in the subsequent years.
Patients of low-volume surgeons have higher rates of revision THR than patients of high-volume surgeons, particularly within the first 18 months postoperatively. Referring clinicians should consider including surgeon volume among the factors influencing their choice of surgeon for elective THR.
Total hip replacement (THR) is performed frequently to relieve pain and improve function in patients with advanced hip joint destruction. More than 3,000 centers in the US perform elective THRs in Medicare beneficiaries, and ∼75% of these centers perform <25 THRs/year in the Medicare population. In fact, approximately one-third of all THRs in the US Medicare population are performed in centers with a case load of ≤25 THRs/year (1). Hospital and surgeon THR volume is inversely associated with rates of postoperative mortality and complications (1–3). However, hospital and surgeon volumes do not appear to be associated with patient functional status and satisfaction following surgery (4).
Hip prostheses function well for up to 20 years in >80% of patients, with failure rates of ∼1% per year (5, 6). Studies of prosthesis survival have been performed primarily in single surgical practices or large-volume centers (7, 8). The question of whether failure rates vary by surgeon and hospital volumes has not been studied in detail. One Canadian study that examined the association between the surgeon volume and THR revision rates during the first 3 postoperative years failed to show an association between volume and outcome. However, the cutoff for low surgeon volume in that study was 40 THRs/year (9), which is relatively high by US standards (1).
The goal of this analysis was to examine the association between surgeon volume of THR and revision rates during the first 4 postoperative years in a national sample of US Medicare beneficiaries who underwent primary THR in 1995–1996. We hypothesized that patients operated upon by high-volume surgeons may have lower revision rates compared with patients of low-volume surgeons. Since technical problems are likely to become manifest soon after surgery, we also hypothesized that this association would be more striking during the first 18 months after the index procedure.
Our analysis was based on a national sample of US Medicare beneficiaries who underwent a primary THR between July 1, 1995, and June 30, 1996. Medicare claims data were used to identify recipients of elective primary THR. We excluded patients <65 years of age and health maintenance organization enrollees (since they often lack the data necessary for the analysis). We also excluded THRs associated with infection or cancer involving the hip region, hip fracture, and hemiarthroplasty, because these cases are generally not elective and have worse prognoses than elective THR. Algorithms for case identification have been published elsewhere (1). We followed this cohort of primary THR recipients through Medicare claims for 4 years.
Failure of primary THR was defined as a revision THR, documented by hospital-based (Part A) claims (International Classification of Diseases, Ninth Revision, code 81.53), regardless of the position in which the code appeared on the claim.
Surgeon volume was calculated as the total number of elective primary and revision THRs performed by the surgeon in 1995 in the Medicare population. Approximately two-thirds of all elective THRs are performed on Medicare beneficiaries (10). We designated surgeons performing <12 Medicare THRs as low-volume surgeons. This cutoff value (12 THRs/year) was suggested by clinicians as meaningful (low-volume surgeons are those surgeons performing <1 THR/month, on average, among Medicare beneficiaries). The cutoff also corresponded to a median distributional split, since ∼50% of patients in our study population were operated upon by low-volume surgeons by this definition. Hospital volume was calculated in an analogous manner. Four hospital volume strata were used in our analysis: 1–25, 26–50, 51–100, and >100 THRs/year in the Medicare population. This stratification was consistent with previous work done in this area and permitted balanced distribution of patients across the hospital volume strata.
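The volume definitions above can be sketched as two small classification functions. This is a minimal illustration, not the authors' code: the function names and string labels are hypothetical, and only the cutoffs (<12 THRs/year for low-volume surgeons; 1–25, 26–50, 51–100, and >100 THRs/year for hospitals) come from the text.

```python
def surgeon_volume_group(annual_medicare_thrs: int) -> str:
    """Classify a surgeon by annual Medicare THR count.

    Cutoff from the text: fewer than 12 THRs/year is low volume.
    The labels "low" and "high" are illustrative.
    """
    if annual_medicare_thrs < 1:
        raise ValueError("a surgeon in the cohort performed at least 1 THR")
    return "low" if annual_medicare_thrs < 12 else "high"


def hospital_volume_stratum(annual_medicare_thrs: int) -> str:
    """Map a hospital's annual Medicare THR count to one of the 4 strata."""
    if annual_medicare_thrs < 1:
        raise ValueError("a hospital in the cohort performed at least 1 THR")
    if annual_medicare_thrs <= 25:
        return "1-25"
    if annual_medicare_thrs <= 50:
        return "26-50"
    if annual_medicare_thrs <= 100:
        return "51-100"
    return ">100"
```

Under this sketch a surgeon with 11 Medicare THRs in the year is low volume and one with 12 is high volume, which matches the "<1 THR/month, on average" reading given above.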
The time window for this analysis extended from the index primary THR to the end of 1999. In the regression analyses, patients were followed until the earliest of the following: 1) December 31, 1999; 2) a revision hip arthroplasty; 3) death; or 4) a contralateral THR (because of the resulting confusion regarding the laterality of any subsequent revision). In addition to performing the analysis over the entire time window, we also created 2 time intervals: 1) the first 18 months after index primary THR and 2) the remaining followup time extending from the 19th month after the index procedure. We hypothesized that volume effect would be most apparent in the first 18 months after surgery.
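The censoring rule and the two time intervals can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, ties on the same date are resolved in favor of the revision (the text does not say how ties were handled), and follow-up time is assumed to be already expressed in months when it is split.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_END = date(1999, 12, 31)  # administrative end of follow-up, from the text


def follow_up(index_date: date,
              revision: Optional[date] = None,
              death: Optional[date] = None,
              contralateral: Optional[date] = None) -> Tuple[date, bool]:
    """Return (end_of_follow_up, revised) for one patient.

    Follow-up ends at the earliest of: study end, revision, death,
    or contralateral THR. Only a revision counts as a failure.
    """
    candidates = [(STUDY_END, False)]
    if revision is not None:
        candidates.append((revision, True))
    for censoring_date in (death, contralateral):
        if censoring_date is not None:
            candidates.append((censoring_date, False))
    if any(d < index_date for d, _ in candidates):
        raise ValueError("events cannot precede the index procedure")
    # Assumption: on a tie, the revision is taken as the outcome.
    return min(candidates, key=lambda c: (c[0], not c[1]))


def split_follow_up(months: float, revised: bool, cut: float = 18.0):
    """Split follow-up at `cut` months into (early, late) pieces.

    Each piece is (time_at_risk, revised_in_piece); `late` is None
    when follow-up ended within the early interval.
    """
    if months <= cut:
        return (months, revised), None
    return (cut, False), (months - cut, revised)
```

A patient revised at 30 months therefore contributes 18 revision-free months to the early interval and 12 months ending in a revision to the later interval.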
Medicare denominator files were used to obtain data on patient age, race, sex, and date of death. Age was ascertained at the time of the index primary procedure. Medicare denominator files for the years 1995–1999 were used to ensure accuracy and completeness of death dates. Comorbidities were assessed with a claims-based modification of the Charlson comorbidity index (11). We included comorbidities documented in claims within 6 months preceding the index primary THR. After the comorbidity index was constructed, we distinguished patients with no comorbidities, a single comorbidity, and 2 or more comorbidities. Other patient characteristics included the clinical indication for primary THR: osteoarthritis, avascular necrosis, rheumatoid arthritis, and other. The Medicaid eligibility indicator in Medicare claims was used as a proxy for poverty status.
A wide range of sensitivity analyses were performed to examine the influence of several assumptions on our conclusions, including the definition of volume strata, duration of the early postoperative period, and censoring at the time of second primary THR.
Time-to-event (revision, in this case) analysis was used as the main analytical tool for this investigation. The Kaplan-Meier method was used to estimate failure-free survival, and the extension of the Cox proportional hazards model was used to evaluate effects of time-dependent covariates and to adjust for patient characteristics and hospital volume. The association between early revision and surgeon volume was described by hazard ratios (HRs) and 95% confidence intervals (95% CIs) comparing revision rates for patients operated upon by high-volume (at least 12 annually) surgeons with revision rates for patients operated upon by low-volume surgeons. The association between hospital volume and early revision rates was described similarly, by comparing revision rates for patients operated upon in hospitals in the higher volume strata with revision rates for patients operated upon in hospitals with an annual THR volume of ≤25 cases in the Medicare population. All analyses were performed using SAS 8.02 (SAS Institute, Cary, NC). P values less than 0.05 were considered significant.
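For readers unfamiliar with the Kaplan-Meier method named above, the product-limit estimate of revision-free survival can be sketched in a few lines. The analyses in this study were run in SAS; the Python function below is only an illustration, and its name and input layout are assumptions.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate of revision-free survival.

    times  : follow-up duration for each patient
    events : True if follow-up ended in a revision, False if censored
    Returns (time, survival) pairs at each distinct revision time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    revisions = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = revisions.get(t, 0)
        if d:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve
```

The cumulative revision rate at a given time is one minus the survival estimate at that time.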
A total of 76,627 Medicare beneficiaries underwent a primary THR between July 1, 1995, and June 30, 1996. Of these, 58,521 (76%) met our definition of elective primary THR. For 57,488 (98%) of these patients, claims data contained information on surgeon volume; these patients comprised the study sample used for this investigation. Median followup time was 47 months (range 0–54). Approximately half of the sample was older than 75 years and approximately two-thirds were women. The vast majority were white and had osteoarthritis as the clinical indication for primary THR (Table 1). Twelve percent of the cohort died over the course of the followup. Eleven percent had contralateral primary THR during the study period. More than 50% of the cohort had their primary THR performed by a surgeon with an annual volume of <12 elective THRs/year in the Medicare population. Only ∼10% of the procedures were performed in centers with an annual volume of >100 THRs/year by surgeons with a THR load of at least 12 THRs/year (Figure 1). A more detailed description of the distribution of THRs by hospital and surgeon volumes is published elsewhere (1).
|Patient characteristics||No. (%) of patients||Percentage revised by the end of 1999||P|
Overall, 2,537 patients (4.4%) had revision THRs by the end of 1999. The cumulative 4-year revision rates varied from 3.3% for patients operated upon by high-volume surgeons in centers with caseloads of >100 THRs/year to 4.9% for patients operated upon by low-volume surgeons in centers with caseloads of ≤25 THRs/year (HR 0.65, 95% CI 0.54–0.78). Across all hospital volume strata, patients operated upon by high-volume surgeons had consistently lower failure rates compared with patients operated upon by low-volume surgeons (Table 2). These multivariate analyses also included age, sex, race, comorbidity, and poverty status. Results showed that older patients (HR 0.986/year of age, P < 0.0001), women (HR 0.83, P < 0.0001), and patients with <2 comorbidities (HR 0.85, P = 0.0065) were less likely to have revisions when hospital and surgeon volumes also were included in the model.
|Hospital volume (THRs/year)||Surgeon volume (THRs/year)||No. of patients||Failure rate, mean ± SEM %*||Hazard ratio (95% CI)†|
|1–25||1–11||19,713||4.9 ± 0.2||1.00 (referent)|
|1–25||≥12||939||4.0 ± 0.7||0.80 (0.57–1.12)|
|26–50||1–11||12,815||4.0 ± 0.2||0.81 (0.73–0.91)|
|26–50||≥12||3,494||3.9 ± 0.3||0.78 (0.65–0.94)|
|51–100||1–11||8,197||4.4 ± 0.2||0.89 (0.78–1.00)|
|51–100||≥12||6,519||3.3 ± 0.2||0.66 (0.57–0.77)|
|>100||1–11||1,657||4.9 ± 0.6||0.96 (0.76–1.21)|
|>100||≥12||4,154||3.3 ± 0.3||0.65 (0.54–0.78)|
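As a consistency check, the patient counts in Table 2 can be summed and compared with the cohort size reported in the text. The snippet below is a sketch: the assignment of rows to hospital strata assumes the table lists them in ascending order (1–25, 26–50, 51–100, >100), which agrees with the referent and >100 rates quoted in the text, and the dictionary layout and function names are illustrative.

```python
# (hospital stratum, surgeon stratum) -> number of patients, from Table 2.
# Assumption: rows appear in ascending order of hospital volume.
TABLE2_COUNTS = {
    ("1-25", "1-11"): 19713,
    ("1-25", ">=12"): 939,
    ("26-50", "1-11"): 12815,
    ("26-50", ">=12"): 3494,
    ("51-100", "1-11"): 8197,
    ("51-100", ">=12"): 6519,
    (">100", "1-11"): 1657,
    (">100", ">=12"): 4154,
}


def total_patients(counts: dict) -> int:
    """Total number of patients across all table cells."""
    return sum(counts.values())


def share_by_surgeon_group(counts: dict, surgeon_group: str) -> float:
    """Fraction of patients whose surgeon falls in `surgeon_group`."""
    in_group = sum(n for (_, s), n in counts.items() if s == surgeon_group)
    return in_group / total_patients(counts)


print(total_patients(TABLE2_COUNTS))  # 57488, the reported cohort size
print(share_by_surgeon_group(TABLE2_COUNTS, "1-11") > 0.5)  # True
```

The cell counts add up to the 57,488 patients in the study sample, and patients of low-volume surgeons make up more than half of the cohort, as stated in the text.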
Figure 2 shows the cumulative revision rates for patients operated upon by low- and high-volume surgeons. Within the first 18 months, the curves diverge markedly; subsequently, they are approximately parallel, with the high-volume surgeons having consistently lower revision rates than the low-volume surgeons (P < 0.0001 by log rank test). We performed a more detailed analysis of the effect of surgeon and hospital volumes in the first 18 postoperative months as compared with months 19–48 (Figure 3). Within the first 18 months following primary THR, patients operated upon by high-volume surgeons had significantly lower revision rates than patients operated upon by low-volume surgeons. The magnitude of the relative difference ranged from a 20% decrease in low-volume hospitals to an ∼50% decrease in high-volume centers (Figure 3). These results were adjusted for patient age, sex, poverty status, comorbidity, and clinical indication for primary THR. In contrast to the first postoperative period, within the second period (between 19 and 48 months), we saw no association between surgeon volume and revision rates across all hospital volume strata (Figure 3).
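One way to express an interval-specific surgeon volume effect of this kind is a Cox model in which the volume coefficient is allowed to differ before and after 18 months. The formulation below is an illustrative assumption, not a specification taken from the analysis: x is 1 for a high-volume surgeon and 0 otherwise, t is months since the index THR, z is the vector of adjustment covariates, and h_0(t) is the unspecified baseline hazard.

```latex
h(t \mid x, \mathbf{z}) = h_0(t)\,
  \exp\!\bigl( \beta_{\mathrm{early}}\, x\, \mathbb{1}[t \le 18]
             + \beta_{\mathrm{late}}\, x\, \mathbb{1}[t > 18]
             + \boldsymbol{\gamma}^{\top} \mathbf{z} \bigr),
\qquad
\mathrm{HR}_{\mathrm{early}} = e^{\beta_{\mathrm{early}}}, \quad
\mathrm{HR}_{\mathrm{late}} = e^{\beta_{\mathrm{late}}}.
```

Under this reading, the finding reported above corresponds to an early hazard ratio clearly below 1 and a late hazard ratio close to 1.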
In further analyses, we examined the independent effect of hospital procedure volume on failures separately for patients operated upon by low- and high-volume surgeons (Figure 4). For low-volume surgeons, we found no association between the risk of THR failure and hospital volume. For high-volume surgeons, hospitals with annual caseloads of >25 Medicare THRs had ∼20% lower failure rates than hospitals with an annual caseload of ≤25 Medicare THRs. These results were also adjusted for patient age, sex, comorbidity, and underlying arthritis diagnosis at the time of primary THR.
We performed a series of sensitivity analyses in which we considered hospital and surgeon volume variables as ordinal and conducted a test for linear trend. Results were indicative of an inverse linear trend, with increased revision rates associated with decreased hospital (and surgeon) volume.
We also conducted a series of analyses by changing the time of the early postoperative period from the first 18 months to the first 12 months. The results did not change substantially, although results from the lowest and highest hospital volume groups lacked statistical power due to the smaller number of cases.
A final set of sensitivity analyses was performed in which we did not censor the 11% of the cohort that had a second primary THR following the index procedure, but instead followed them either to death or to the end of 1999, whichever came first. The findings of this analysis did not alter the conclusions derived from primary analyses.
The principal finding of this study is that patients operated upon by low-volume surgeons are considerably more likely to undergo subsequent revision of the index THR than patients operated upon by high-volume surgeons. Further, the association between surgeon volume and failure is seen primarily during the first 18 months after surgery, suggesting technical error as the mechanism of early failure. Finally, among patients operated upon by high-volume surgeons, higher hospital volume is independently associated with lower early failure rates.
These results are consistent with those of a previous study (12). Espehaug and colleagues found that in Norway, patients operated upon in university hospitals (which had median procedure volumes of 11 THRs/year) had higher cumulative revision rates after 4 years than those operated upon in the central (18 THRs/year) and local (27 THRs/year) hospitals (5.5% failure rate in the university versus 3.5% in the central and local centers) (12). The volume effect was seen primarily in patients who received uncemented prostheses. Our study differed from that of Espehaug et al in that in our claims-based study we could not analyze the effect of cemented versus uncemented prostheses, while the Norwegian study did not have accurate data on surgeon volume.
In contrast to our findings, a Canadian study did not show any statistically significant association between surgeon volume and revision rates 1 and 3 years postoperatively (9). The Canadian investigators defined 40 procedures a year as low surgeon volume, as compared with 12 in our study. Since more than two-thirds of elective THRs are performed in the Medicare population (10), even accounting for non-Medicare THRs, our volume threshold was lower than that used in the Canadian study. Hence, the Canadian investigators may simply have missed associations with physician volume because they did not focus on truly low-volume surgeons. In Canada, as in Norway, THRs are concentrated in a smaller number of centers, therefore resulting in a larger overall annual THR caseload per surgeon.
The strengths of our study include the population-based nature of the sample and the flexibility of the analysis to examine the effect of both surgeon and hospital volume during clinically distinct time periods. To the best of our knowledge, ours is the first US-based study examining the association between failure rates and surgeon and hospital volumes of THRs.
Our study has several limitations. Our definition of THR failure was simply the occurrence of a revision THR. Clearly, patients can have a poor outcome of THR yet not receive revision surgery, either because they prefer not to have surgery or because they are poor surgical candidates (13). In addition, Medicare hospital (Part A) claims data, used for this analysis, lack surgical details and reasons for revision. Claims data also do not specify the side (left or right) of the revision or the index primary THR. Consequently, it is not possible to ascertain whether the revision occurred on the index or contralateral hip. However, the magnitude of misclassification is likely to be slight. Our (unpublished) medical record data on a population-based cohort of patients who had primary or revision THR document that only 17% of patients undergoing revision THR had a contralateral primary THR performed prior to the index procedure (for details of the cohort composition, see ref. 4). Thus, the opportunity for misclassification is at most 17%. Furthermore, these data indicated that having another primary THR before the index THR procedure was not associated with subsequent revision. Patients with a prior THR were equally distributed across hospital and surgeon volume strata and were as likely to have revisions in the earlier (1–18 months) as in the later (>18 months) period. Taken together, this information suggests that appreciable bias in the associations we examined is unlikely.
Restricting the analysis to the Medicare population is unlikely to introduce bias. THRs performed in the Medicare population represent at least 67% of the total number of THRs performed in the US (10). Furthermore, our unpublished data show that the correlation between overall THR hospital and surgeon volume and the volume of THRs performed on Medicare beneficiaries exceeds 0.95.
In conclusion, the majority of THRs performed on Medicare beneficiaries in the US in 1995–1996 were performed by surgeons with annual procedure volumes of <12 THRs/year in the Medicare population. Higher surgeon volume is associated with lower failure rates. This association is most striking during the first 18 postoperative months. Further population-based studies are needed to better understand the role of implant type in this association. In addition, studies are needed to determine whether volume influences later failure (e.g., after 10 years of followup). Clinicians should consider surgeon volume among the factors influencing their referrals for elective THR.