www.eurodrg.eu/EuroDRG_group.pdf. Funding: The results presented here were produced within the research project ‘EuroDRG – Diagnosis Related Groups in Europe: towards efficiency and quality’, which was funded by the European Commission under the Seventh Framework Programme. Research area: HEALTH-2007-3.2-8 European System of Diagnosis-Related Groups, Project reference: 223300.
Most European countries have adopted diagnosis-related group (DRG) systems to increase the transparency of the services delivered in hospitals and as a basis for paying hospitals in relation to their activity (Wiley, 2011; Geissler et al., 2011). DRGs classify patients with similar diagnoses and/or procedures into groups that are characterized by comparable resource consumption (Kobel et al., 2011). DRG-based payments incentivize hospitals to cut costs per patient, for example, by reducing the length of stay (LoS) or the intensity of services (Cots et al., 2011). These behavioural responses may be intended (e.g. a reduction of resource intensity via the introduction of efficient clinical pathways) but may also take unintended forms such as inappropriate early discharge, skimping, cream-skimming or dumping (Ellis, 1998; Newhouse and Byrne, 1988; Levaggi and Montefiori, 2003; Martinussen and Hagen, 2009). To avoid these unintended consequences, DRGs need to be resource homogeneous so that DRG-based payments reflect the treatment costs of a given patient as precisely as possible.
However, beyond descriptions of how DRG systems are applied (Kimberly et al., 2008; Busse et al., 2011), little is known about whether these systems adequately account for treatment costs. Therefore, we assess the ability of national DRGs in 10 European countries (Austria, England, Estonia, Finland, France, Germany, Ireland, Poland, Spain and Sweden) to account for patient-level variation in hospital resource consumption against a standard set of patient characteristics, treatment and quality variables. By way of example, we analysed the DRG classification of hip replacement patients, as they account for a major epidemiological and economic burden in all European health systems (Stargardt, 2008).
Replacement of the hip joint by an artificial implant can provide effective relief for patients with osteoarthritis of the hip (coxarthrosis), which is estimated to become the fourth leading cause of disability by 2020 (Woolf and Pfleger, 2003; Kurtz et al., 2007; United Nations, UN et al., 2007; Bitton, 2009). Hip replacement is provided with increasing frequency but at varying rates throughout Europe. In 2009, the rate ranged from 296 per 100 000 inhabitants in Germany to 44 per 100 000 inhabitants in Poland (OECD Health Data, 2011).
Diagnosis-related group classification and payment for hip replacement differ substantially across Europe. For example, the number of DRGs that individually account for at least 1% of hip replacement patients ranges from two in Estonia and Sweden to 10 in France and 14 in England. Similarly, the number and type of patient and treatment characteristics that each DRG system uses as DRG split variables differ greatly (Table 1). For example, all DRG systems distinguish between cases with total hip replacement and the revision of an implant, but only four systems (Austria, England, Germany and Poland) have specific DRGs for partial replacements. Moreover, main and secondary diagnoses are used to classify patients only in England, France, Germany, Ireland and Spain. Age and LoS are used to group hip replacement patients only in France and Germany.
Table 1. Number of diagnosis-related groups and split variables used to classify hip replacement cases in 10 European countries
[Table body not recovered. Surviving row labels: Number of DRGs; DRG split variables, including length of stay and revision of replacement.]
Notes: X variable is used to classify patients; — variable is not used to classify patients;
DRGs for multiple/bilateral procedures are part of the DRG system but were not considered in the analysis because they contain fewer than 1% of hip patients in the available data sources.
Researchers from 10 European countries performed quantitative analyses on national routine patient-level data samples from 2008, as these were the latest data samples available across all countries (Street et al., 2012).
We identified hip replacement cases using ICD-9-CM procedure codes 81.51–81.53 (total, partial and revision of hip replacement), 00.7 (other hip procedures), 00.85–00.87 (hip resurfacing) or equivalent codes in national coding systems, which were mostly available from the Hospital Data Project (Magee, 2003). Patients in ambulatory settings and infants aged less than 1 year were excluded from the analysis.
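The inclusion rule described above can be sketched in a few lines. This is an illustration only, not the authors' extraction code: the record layout (`procedures`, `age_years`, `ambulatory`) and the function names are hypothetical, and prefix matching on the ICD-9-CM codes is an assumption. National code systems would need their own mapping.

```python
# Illustrative sketch: field names and helper names are hypothetical.
# ICD-9-CM procedure code prefixes named in the text:
# 81.51-81.53, 00.7 (other hip procedures), 00.85-00.87 (resurfacing).
HIP_CODE_PREFIXES = ("81.51", "81.52", "81.53", "00.7", "00.85", "00.86", "00.87")


def is_hip_replacement(procedure_codes):
    """True if any recorded procedure code starts with a hip replacement prefix."""
    return any(code.strip().startswith(HIP_CODE_PREFIXES) for code in procedure_codes)


def include_case(case):
    """Apply the exclusions from the text: no ambulatory cases, no infants under 1 year."""
    if case["ambulatory"] or case["age_years"] < 1:
        return False
    return is_hip_replacement(case["procedures"])
```

For example, an inpatient aged 70 with procedure code `81.51` is included, whereas the same record flagged as ambulatory is not.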
In addition to the standard set of patient-level variables, such as age categories in quintiles, gender and Charlson index co-morbidities (Street et al., 2012), we defined three dummy variables to control for severity and case-mix differences specific to hip replacement patients. First, hip replacement is increasingly required for patients with severe hip fractures, and we expect their costs to be higher than those of replacements in elective settings (Gullberg et al., 1997; Burge et al., 2007). Hence, patients with fractures were identified via a dummy variable that captures recoded ICD-10 codes S72, M84, M960 or S324. Second, there are different treatment options for hip replacement: it can be performed as total replacement, replacing both the acetabulum (hip socket) and the femoral head, or as partial (hemi) replacement, replacing just the femoral head using press or cement-fit implants (Wülker, 2010). As we expect patients with partial hip replacements to have lower costs and shorter lengths of stay because of a less complex surgical intervention, we account for this with a dummy variable defined via ICD-9-CM procedure codes 81.52, 00.86, 00.87 or equivalent. Finally, artificial hip joints have to be revised some years after implantation, for example, because of aseptic loosening, instability or osteolysis (Clohisy et al., 2004), and we expect related procedures to be more costly than the reference case (Braithwaite et al., 2003; Bozic et al., 2005). Hence, we define a dummy variable for revisions via ICD-9-CM procedure codes 81.53, 00.70–00.73. Some surgical techniques, such as ‘press-fit’ and ‘cemented’, could not be differentiated because the relevant procedures cannot be identified consistently across countries.
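A minimal sketch of how the three hip-specific dummy variables might be derived from coded diagnoses and procedures follows. The function and constant names are hypothetical. Treating the ICD-10 fracture codes as prefixes (so that `S72.0` counts as `S72`) is an assumption, since the text only lists the recoded codes.

```python
# Illustrative sketch: names are hypothetical, prefix matching is an assumption.
FRACTURE_ICD10 = ("S72", "M84", "M960", "S324")      # fracture diagnoses
PARTIAL_ICD9 = ("81.52", "00.86", "00.87")           # partial (hemi) replacement
REVISION_ICD9 = ("81.53", "00.70", "00.71", "00.72", "00.73")  # revisions


def _normalise_icd10(code):
    """Drop the dot and upper-case, so 'S72.0' becomes 'S720'."""
    return code.replace(".", "").strip().upper()


def hip_dummies(diagnoses, procedures):
    """Return the fracture, partial and revision indicators as 0/1 values."""
    diag = [_normalise_icd10(c) for c in diagnoses]
    proc = [c.strip() for c in procedures]
    return {
        "fracture": int(any(d.startswith(FRACTURE_ICD10) for d in diag)),
        "partial": int(any(p in PARTIAL_ICD9 for p in proc)),
        "revision": int(any(p in REVISION_ICD9 for p in proc)),
    }
```

A patient with diagnosis `S72.0` and procedure `81.52` would be flagged as a fracture case with a partial replacement and no revision.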
A two-stage multilevel regression approach was adopted to identify the effects of patient and hospital variables. Seven of the 10 countries (England, Estonia, Finland, France, Germany, Spain and Sweden) provided cost data. Because of data restrictions, three countries (Austria, Ireland and Poland) had to rely on LoS data, which were used as a proxy for treatment costs (Street et al., 2012). Cost and LoS data were collected from hospital admission to the discharge of the patient to either another institution (e.g. rehabilitation unit) or home (ibid.). The cost analyses were performed using OLS fixed-effects models with log costs as the dependent variable. For the LoS analysis, negative binomial (NegBin) models were used, as overdispersion was detected during exploratory data analyses.
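To make the estimation idea concrete, the sketch below shows the within (hospital-demeaning) transformation that underlies a fixed-effects OLS on log costs, reduced to a single regressor for brevity, together with a crude variance-to-mean check for overdispersion in LoS counts. Both functions are illustrative assumptions: the source does not state which software or which overdispersion diagnostic was used, and the actual models contain many regressors.

```python
import math
from collections import defaultdict
from statistics import mean, variance


def within_slope(costs, x, hospital_ids):
    """Fixed-effects (within) slope of log(cost) on one regressor x.

    Demeaning by hospital removes the hospital fixed effects before OLS.
    """
    log_c = [math.log(c) for c in costs]
    groups = defaultdict(list)
    for i, h in enumerate(hospital_ids):
        groups[h].append(i)
    y_dm = [0.0] * len(costs)
    x_dm = [0.0] * len(costs)
    for idx in groups.values():
        y_bar = mean(log_c[i] for i in idx)
        x_bar = mean(x[i] for i in idx)
        for i in idx:
            y_dm[i] = log_c[i] - y_bar
            x_dm[i] = x[i] - x_bar
    sxx = sum(v * v for v in x_dm)
    sxy = sum(a * b for a, b in zip(x_dm, y_dm))
    return sxy / sxx


def dispersion_ratio(los_days):
    """Sample variance divided by mean; values well above 1 hint at overdispersion."""
    return variance(los_days) / mean(los_days)
```

A ratio well above one is the situation in which a Poisson model would understate uncertainty, which is the usual motivation for a negative binomial specification.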
For the first stage, three models were designed to assess the determinants of variation in costs and LoS. First, MD contains only DRG dummy variables to test the ability of the relevant DRGs to reflect resource consumption. Second, MP uses the standard set of patient characteristics, such as age, treatment variables, quality indicators and co-morbidities, to predict variation in costs or LoS. Finally, the fully specified model MF includes both sets of variables, that is, the DRGs and the variables considered in MP. A comparison of the model results reveals whether the set of patient characteristic and treatment variables has a greater ability to predict the costs or LoS of a given patient than the national DRGs, or vice versa. In the second stage, we analyse the estimated hospital fixed effects to understand why some hospitals have higher average costs or LoS than others after taking patient characteristics and treatment variables into account. A set of hospital-specific variables, such as teaching and ownership status, volume of activity and specialization, was used to explain hospital-level differences in costs and LoS. The methodological approach and a detailed description of the core variables are presented in Street et al. (2012) as part of this freely available supplement.
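For orientation, the three first-stage models and the second stage can be written schematically for the cost analysis as follows. The notation is an assumption introduced here, not taken from the source: $c_{ik}$ denotes the cost of patient $i$ in hospital $k$, $\mathbf{d}_{ik}$ the DRG dummies, $\mathbf{x}_{ik}$ the patient and treatment variables, $u_k$ the hospital fixed effect and $\mathbf{z}_k$ the hospital-specific variables. The exact specification is given in Street et al. (2012).

```latex
\begin{align}
M_D:\quad \ln c_{ik} &= \alpha + \mathbf{d}_{ik}'\boldsymbol{\beta}_D + u_k + \varepsilon_{ik} \\
M_P:\quad \ln c_{ik} &= \alpha + \mathbf{x}_{ik}'\boldsymbol{\beta}_P + u_k + \varepsilon_{ik} \\
M_F:\quad \ln c_{ik} &= \alpha + \mathbf{d}_{ik}'\boldsymbol{\beta}_D + \mathbf{x}_{ik}'\boldsymbol{\beta}_P + u_k + \varepsilon_{ik} \\
\text{Stage 2}:\quad \hat{u}_k &= \gamma_0 + \mathbf{z}_k'\boldsymbol{\gamma} + \eta_k
\end{align}
```

For the LoS analysis, the same sets of regressors would enter a negative binomial model for the number of days rather than a linear model for log costs.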
3.1 Descriptive statistics
Table 1 provides an overview of the DRGs and underlying split variables used to group hip replacement cases across European DRG systems. It illustrates that the common origins of DRG systems are reflected in the selection of classification variables. For example, the Estonian, Finnish and Swedish DRG systems are very similar, as they all operate under the framework of the NordDRG system (Linna and Virtanen, 2011).
Table 2 provides descriptive statistics of the data used for the analysis. The sample size differs widely across countries from 86 090 cases in England to 2941 cases in Spain. Additionally, cases were clustered in a varying number of hospitals - from 277 hospitals in Poland to 5 hospitals in Finland.
Table 2. Descriptive patient-level and hospital-level statistics of hip replacement samples in 10 European countries
NA = not available.
*DRGs ordered by ascending DRG weights (DRGs vary by country).
In all countries except Poland, England and France, more than 70% of patients are grouped into one DRG. In England, the country with the highest number of DRGs, just 31% of cases are grouped into the most populated DRG, whereas in Estonia, 92% of cases are grouped into one DRG. In Table 2, the DRGs are ordered by ascending DRG weights. This ordering indicates that in seven countries, the DRG with the highest volume is also the least expensive.
The share of male patients varies from 34% (Poland) to 45% (Ireland). The mean age ranges from 67.8 years in Poland to 72.9 years in England.
The LoS in hospitals varies remarkably, from 4.7 days in Finland to 15.2 days in Germany. These large differences can be explained by different treatment patterns and care organization, including the arrangements for rehabilitation after replacement (its availability and place). In Finland, patients are kept in acute care wards just for the hip surgery and are transferred very quickly to specialized rehabilitation units, whereas in Germany, a longer period of observation in hospital is the norm before a suitable rehabilitation setting is chosen. In any case, rehabilitation is not part of the DRG-based payment in any of the countries studied here and is reimbursed separately.
Germany has the highest mean number of different coded diagnoses (6.6 against 1.2 in Poland or Finland). Similarly, the average number of different procedures differs widely from 5.2 in Poland to 1.0 in Estonia.
In Spain, almost 40% of patients have a recorded fracture, whereas in Austria, the share of fracture patients is less than 20%. The proportion of revisions ranges from 6% (Sweden) to 13% (France and Spain). We find the highest rate of partial replacements in Finland (35%) and the lowest in Poland (less than 17%).
Ireland has the highest rate of recorded wound infection during the hospital stay (2%), and Germany has the highest rate of recorded urinary tract infections (7%). Adverse events were most frequent, at 2%, in France (mostly because of pulmonary embolism and deep vein thrombosis) and Spain (mostly because of sepsis). The highest mortality rate was recorded in England, with 3% of patients deceased after being admitted for a hip surgery procedure. The English in-hospital mortality rate is notable as it is at least twice as high as in any other country.
The highest frequency of Charlson co-morbidities was in England and Germany. One third of English patients have at least one coded co-morbidity, which is mostly chronic pulmonary disease, diabetes and dementia. In Germany, 41% of patients are frequently affected by diabetes, congestive heart failure, chronic pulmonary disease and dementia.
3.2 Regression results
3.2.1 Stage 1
Tables 3 and 4 present the results of the first-stage regressions. In model MD, where DRGs are ordered by ascending DRG weights, most DRG coefficients are greater than one (LoS analysis) or positive (cost analysis) and significant, which shows that patients who are grouped into these DRGs have higher costs or longer LoS than patients grouped into the most populated DRG, which is our reference case. Two exceptions (DRG5 in Poland; DRG6 in England) stand out. Patients within these groups have shorter LoS and lower costs than patients within the reference group but higher DRG weights, which means higher payments to hospitals.
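The two benchmarks in this paragraph ("greater than one" for LoS, "positive" for costs) suggest that the LoS coefficients are reported in exponentiated form, as rate ratios, while the cost coefficients are on the log scale. Under that assumption, both can be read as approximate percentage differences relative to the reference DRG. The helper names below are hypothetical, and the numbers in the usage note are illustrative, not results from the study.

```python
import math


def pct_change_from_log_coef(beta):
    """Percentage difference in costs implied by a coefficient in a log-cost model."""
    return (math.exp(beta) - 1.0) * 100.0


def pct_change_from_rate_ratio(ratio):
    """Percentage difference in expected LoS implied by an exponentiated coefficient."""
    return (ratio - 1.0) * 100.0
```

For instance, a hypothetical rate ratio of 1.25 would correspond to a 25% longer expected stay than in the reference DRG, and a log-cost coefficient of `math.log(1.25)` would correspond to 25% higher costs.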
MP estimates the impact of a set of patient characteristics on the LoS and costs of hip replacement. We find a significant positive relationship between age and LoS. In the cost data, this relationship is less clear, as it is only observed for England. In Estonia, Germany and Sweden, patients younger than 61 years are significantly more costly than the reference group (aged between 61 and 70 years). In Finland and Estonia, older patients (age categories 3, 4 and 5) have significantly lower resource consumption. In France and Spain, we find no significant impact of age on costs.
Gender significantly affects LoS: hospital stays tend to be shorter for male patients. The relationship between gender and costs is significant at the 5% level in only three countries (Estonia, Spain and Sweden), indicating that male patients tend to be more costly.
The number of coded diagnoses and procedures is highly significant across countries, except for Estonia. Regardless of different coding standards, this reflects that patients with more coded diagnoses and procedures have higher costs and longer LoS.
Patients who were transferred to a hospital from another institution tend to have significantly lower costs in England, Estonia, France and Sweden and a shorter LoS in Ireland. A transfer out of the hospital to another institution, such as a rehabilitation unit, seems to reduce LoS in Poland and costs in Estonia and Sweden. In contrast, in England, France and Spain, cases with a transfer out of the hospital appear to have higher costs. Interestingly, in Finland, the country with the lowest LoS, where patients receive rehabilitation care at home or in institutions other than hospitals, neither a transfer into nor a transfer out of a hospital had a significant effect on costs.
An emergency admission of patients is associated with significantly longer LoS in Austria, Ireland and Poland and higher costs in Finland and Sweden, whereas in England, France and Germany, emergency patients have significantly lower costs.
Deceased patients have significantly shorter LoS and significantly lower costs in France, Germany and Sweden.
Co-morbidities assessed via the Charlson index (Street et al., 2012) are rarely significant, and where they are, they show inconsistent influences on costs and LoS across countries. The weak explanatory power of the Charlson index variables might be caused by the fact that both the Charlson index and the total number of diagnoses are used to measure patients' morbidity.
Patients with fractures have significantly longer LoS in Poland and higher costs in England, Germany and Sweden. Patients who receive a partial replacement have significantly shorter LoS in Austria and Poland and lower costs in Estonia, France, Spain and Sweden. The revision of a hip implant leads to significantly longer LoS and higher costs in all countries.
In all countries except Austria, Estonia, Finland and Sweden, patients who suffer from urinary tract or wound infection have significantly longer LoS and higher costs. Adverse events are not significant in explaining differences in costs and LoS across countries, except in England.
In some countries, certain patient and treatment variables included in MP are also used to define DRGs (see Table 1). If DRGs adequately reflect differences in resource consumption, the patient and treatment variables should have less explanatory power when they are added to the DRG variables, as is done in the full model (MF). Therefore, the coefficients of these variables are expected to be lower in MF than in MP or to become insignificant in MF. This applies to the majority of significant coefficients, such as age categories, diagnoses, procedures, emergency and revision. In contrast, the coefficients are mostly not significant for the Charlson index, adverse events, urinary tract and wound infection variables. The revision variable stands out because in five countries the coefficient becomes insignificant in MF.
The degree to which DRGs and patient/treatment variables are able to explain cost differences among patients varies across countries. The adjusted R² statistics indicate that for the patient characteristic models (MP), the explained variance in resource consumption ranges from 17% in Finland up to 61% in Estonia, whereas the remaining countries lie in a range of 30% to 50%. In most countries, except for England and Finland, the MP models are superior to the DRG models in explaining variation in costs. The R² difference between MD and MP is approximately 10% in most countries, with the exception of Estonia, where the difference is substantially higher at 40%.
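Because the three models contain different numbers of regressors, they are compared on adjusted rather than raw R². The textbook adjustment is sketched below. How the hospital fixed effects were counted among the parameters in the study is not stated in the source, so `n_params` is left to the caller and the function name is hypothetical.

```python
def adjusted_r2(r2, n_obs, n_params):
    """Adjusted R-squared for a model with n_params regressors (excluding the intercept)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_params - 1)
```

The penalty grows with the number of regressors, so a model with many DRG dummies or patient variables must explain correspondingly more variance to achieve the same adjusted value.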
3.2.2 Stage 2
After controlling for different patient/treatment characteristics and DRGs, Figure 1 shows the unexplained variance in costs or LoS across hospitals within each country. Hospitals are ranked from left to right by their deviation from the national sample mean costs or LoS in ascending order. Figure 1 illustrates that even after taking different patient characteristics and treatment variables into account, the variation of unexplained LoS and costs from one hospital to another remains large. Countries with a larger hospital sample (e.g. Austria, Poland, England and France) show an S-shaped distribution of hospitals with remarkable outlier hospitals at the extremes. The confidence intervals reflect the distribution of the dependent variable. Wide intervals are mostly because of a low number of cases with highly different costs or LoS within one hospital and because of low or high cost outlier cases at the ends of the distribution.
A second-stage analysis of hospital-specific variables and their impact on LoS and costs was conducted for Austria, Ireland, Poland, England, France, Germany and Spain (Street et al., 2012). Patients in Austrian private hospitals stay significantly longer than those in public or non-profit hospitals. In France, private for-profit ownership leads to significantly lower treatment costs. In Poland and England, LoS and costs increase slightly with the size of the hospital, that is, we find modest support for diseconomies of scale. In addition, in England, specialized hospitals, which were identified using an adapted Gini index (Street et al., 2012), appear to treat hip patients at significantly higher costs. None of the other explanatory variables (i.e. total volume of hospital cases, share of hip surgery cases in total hospital volume, teaching status, urbanity and adverse events) was a significant predictor of resource consumption in the second-stage analyses.
4 DISCUSSION AND CONCLUSIONS
The ability of European DRG systems to explain the variation in resource use was compared against a standard set of patient characteristic and treatment-specific variables based on patient-level data. However, cross-country comparisons should be made with caution because of the differences in data and hospital samples. First, coding practices, especially for secondary diagnoses, vary from one country to another, depending on national (or even regional) regulations and on the coding incentives induced by the given DRG-based payment methodology. Therefore, conflicting results across countries for any given explanatory variable may partially be the result of coding differences. Second, patients' costs are reported differently in each country and may include a varying number of cost categories because of nation-specific costing and reporting standards (Tan et al., 2011). Third, the need to define a common set of variables that reflect patient characteristics and treatments consistently across countries prevented the inclusion of some potentially important treatment variables, such as different surgical techniques (e.g. press-fit versus cemented), as information about these was not available in all countries. In addition, in countries where add-on payments for high-cost implants or rehabilitation exist, DRGs might differ systematically from those in countries where they do not, because, for example, different split variables are needed to account for variation in resource consumption or the shape of the cost distribution differs. Lastly, when defining DRGs, regulators may be motivated by concerns other than resource homogeneity that we did not consider in our analysis, such as low transaction costs or the fast incorporation of technological innovation.
Nevertheless, in six countries, the set of patient characteristics and treatment variables proposed in our model appears to explain the variation in costs or LoS better than the DRG variables. Given the relatively weak ability of some DRG models to explain cost variation, these systems may benefit from integrating selected patient characteristics into the definition of the relevant groups. For example, although all countries differentiate between primary and revision hip replacement, just four countries have DRGs for partial replacements. Our results suggest that incorporating this variable into the classification can improve the capacity of DRG systems to account for resource use.
For other classification variables, we find that their operationalization seems to matter less than expected. For example, the mechanism to account for secondary diagnoses differs widely across countries, and much attention is paid to the definition of these measures in many national policy discourses (Busse et al., 2011). However, some countries, such as the NordDRG countries and Austria, do not account for secondary diagnoses and, nevertheless, do not systematically perform worse than the other countries in accounting for variation in resource consumption. Our analysis suggests that co-morbidities play only a minor role in the variation of resource consumption among hip replacement patients, as, for example, Charlson index co-morbidities and adverse events are not significant predictors of hospital costs. Moreover, co-morbidity measures that take into account the number and type of diagnoses, such as the patient cumulative complexity level used in Germany or Ireland, should, theoretically, perform better than simple measures. However, our results do not confirm this hypothesis because the total number of diagnoses remains significant in the full model (MF) in all countries except Finland.
It is also interesting to note that countries with a larger number of DRGs for grouping hip replacement cases do not necessarily perform better than those with fewer groups. Considering the cost analyses, the adjusted R² of MD for Sweden (two DRGs) is in the same range of approximately 30% as the adjusted R² for Germany with eight DRGs. Despite this, the English HRG system with 14 DRGs has the greatest ability to explain differences in resource consumption (adjusted R² of MD: 50%). Similarly, the adjusted R² of MD for the LoS analyses shows that Ireland, with three DRGs, has an adjusted R² similar to that of Poland, with six DRGs. This suggests that the number of DRGs for hip replacement is not per se a good predictor of the ability of DRG systems to explain differences in resource consumption.
In some countries (e.g. Austria, Finland, Ireland and Sweden), DRGs are used for measuring hospital activity or for adjusting global hospital budgets rather than for directly determining payment rates (Geissler et al., 2011). In these contexts, it might be less relevant for stakeholders whether DRGs precisely account for patient-level cost variation. However, especially in countries that perform poorly in our comparative analyses and where hospital revenue is directly linked to DRG-based payments, policymakers should consider revisiting their grouping algorithms for hip replacement patients.
CONFLICT OF INTEREST
The authors declare no conflicts of interest.
The authors are grateful to Zeynep Or and Mikko Peltola for their comments on earlier drafts. Moreover, the authors thank the journal's referees for their constructive support in improving this article.