HOW WELL DO DIAGNOSIS-RELATED GROUPS EXPLAIN VARIATIONS IN COSTS OR LENGTH OF STAY AMONG PATIENTS AND ACROSS HOSPITALS? METHODS FOR ANALYSING ROUTINE PATIENT DATA

Authors


Centre for Health Economics, University of York, York YO10 5DD, UK. E-mail: andrew.street@york.ac.uk

ABSTRACT

We set out an analytical strategy to examine variations in resource use, whether cost or length of stay, of patients hospitalised with different conditions. The methods are designed to evaluate (i) how well diagnosis-related groups (DRGs) capture variation in resource use relative to other patient characteristics and (ii) what influence the hospital has on their resource use. In a first step, we examine the influence of variables that describe each individual patient, including the DRG to which the patients are assigned and a range of personal and treatment-related characteristics. In a second step, we explore the influence that hospitals have on the average cost or length of stay of their patients, purged of the influence of the variables accounted for in the first stage. We provide a rationale for the variables used in both stages of the analysis and detail how each is defined. The analytical strategy allows us (i) to identify those factors that explain variation in resource use across patients, (ii) to assess the explanatory power of DRGs relative to other patient and treatment characteristics and (iii) to assess relative hospital performance in managing resources and the characteristics of hospitals that explain this performance. Copyright © 2012 John Wiley & Sons, Ltd.

1 INTRODUCTION

Since the development of the first classification of diagnosis-related groups (DRGs) in the 1970s (Fetter et al., 1980), the number of DRG systems has proliferated, with many countries developing their own versions, which are periodically overhauled (Kobel et al., 2011). The original intention was that DRGs would classify patients into a manageable number of resource homogenous groups (Fetter et al., 1980), and this remains the fundamental basis for classification. However, herein lies a puzzle: is variation in medical practice and resource use so great across countries that each requires its own patient classification system? Or are some DRG systems better than others at categorising patients into resource homogenous groups?

To address these questions, we consider patients admitted to hospitals in 10 European countries¹ for one of 10 conditions, which we define as episodes of care (EoCs). The EoCs are listed in Table 1. The number and construction of DRGs used to categorise patients in each EoC vary markedly across these 10 countries.

Table 1. Diagnostic and procedural codes used to identify patients to each EoC

| EoC | Main diagnosis code (ICD-10) | Procedure code (ICD-9-CM) |
|---|---|---|
| Acute myocardial infarction | I21, I22 | exclude 36.1 (a) |
| Appendectomy | K35–K38 | 47.0 |
| Breast cancer surgery | C50, D05 | 85.20–85.23, 85.33–85.36, 85.4 |
| Childbirth (b) | Z37, O80–O84 | 72, 73, 74 |
| Cholecystectomy | K80 | 51.2 |
| Coronary artery bypass graft | | 36.1 |
| Inguinal hernia surgery | K40 | 17.1, 17.2, 53.0, 53.1 |
| Hip replacement | | 00.7, 81.51–81.53 |
| Knee replacement | | 00.8, 81.54–81.55 |
| Stroke | I61, I63, I64 | |

(a) All cases that receive a bypass are to be excluded.
(b) In each country, the definition that best matches national coding habits should be used. The aim is to include all cases of childbirth.

The empirical articles in this special issue of Health Economics apply a common analytical strategy to examine the DRG classifications used in each of these countries in terms of how well they explain variations in resource use for patients having the same EoC. In this article, we set out the strategy adopted to examine why resource requirements might vary among patients. Our approach builds on the literature that examines hospital costs using patient-level data (McClellan, 1997; Kessler and McClellan, 2002; Dormont and Milcent, 2004; Olsen and Street, 2008; Laudicella et al., 2010; Dormont and Milcent, 2005; Bradford et al., 2001).

We have two types of dependent variables, either cost ($y^{c}$) or length of stay (LoS; $y^{s}$), both of which can be considered as measures of resource use (Martin and Smith, 1996; Gilman, 2000; Norton et al., 2002). Although the analysis of cost variation is preferred, patient-level cost data are not available for all countries. Where cost data are absent, variation in LoS is examined instead.

LoS has the advantage that it is defined in a straightforward manner, calculated as the difference between the date of discharge and the date of admission. The disadvantage, of course, is that LoS is an imperfect indicator of resource use, particularly for surgical patients. In theory, costs should better reflect actual incurred resource use than LoS, but their accuracy depends on the process used to allocate costs to individual patients (Tan et al., 2011). In those countries where patient-level costs are available, general guidance is set out detailing how hospitals should calculate costs, but this guidance differs in important respects across countries, as summarised in Table 2. These differences render it illegitimate to pool cost data across countries. Instead, all analyses are performed separately for each country.

Table 2. Summary of approaches to costing
The last five columns show whether each cost item is included in or excluded from patient costs.

| Country | Overhead cost allocation to medical departments | Direct cost allocation to patients | Investments and buildings | Interest | Taxes/VAT-related costs | Research and teaching | User charges |
|---|---|---|---|---|---|---|---|
| England | Step down | Top–down microcosting to DRG level | Included | Included | Excluded | Excluded | Excluded |
| Estonia | Direct | Bottom–up microcosting | Excluded | Excluded | Excluded | Excluded | Excluded |
| Finland | Direct | Bottom–up microcosting | Included | Included | Excluded | Excluded | Excluded |
| France | Step down | Mixture of top–down and bottom–up microcosting | Included | Included | Included | Excluded | |
| Germany | Step down (preferably) | Bottom–up microcosting | Excluded | Excluded (b) | Excluded | Excluded | |
| Spain (Catalonia) | Top–down with reciprocal imputation | Bottom–up (activity based costing) | Included | Excluded | Included | Excluded | |
| Sweden | Direct | Bottom–up microcosting | Included (a) | Included (a) | Excluded | Excluded | Included |

(a) All indirect costs except research, teaching, external projects, ambulances and counties' politicians and their staff are included.
(b) If not related to capital loans.

Our analysis involves assessing the extent to which variation in costs or LoS of patients in the same EoC is explained by the following:

  • The DRGs to which the patients are allocated. As noted earlier, there is considerable variation across the 10 countries in the number and construction of DRGs used to describe patients in each EoC.
  • Other variables that describe each patient including their demographic characteristics, diagnoses, treatment and quality of care. We specify a set of variables that can be constructed from routine patient-level data in all 10 countries.
  • The hospital in which treatment took place. Our analytical models allow us to explore the extent to which the relative influence that each hospital has on the costs or LoS of its patients is driven by hospital-level characteristics.

For six countries, the analytical sample for each EoC consists of all hospital patients that have one of the specific diagnostic or procedural codes as detailed in Table 1 recorded in their medical record. The analyses for Finland, France (for cost), Germany and Spain (Catalonia) are based on all patients admitted to a sample of hospitals. Data are for 2007/2008, with the exception of France and Poland where data relate to 2007 and 2009, respectively. All admitted patients are included whether or not they were treated on a day-case basis, as DRGs are designed to capture the full caseload of admitted patients, irrespective of their LoS (Fetter et al., 1980). Indeed, in some countries, DRG payments have been designed with the expressed intention to incentivise hospitals to shift from overnight to day-case treatment (Street and Maynard, 2007). Patients attending outpatient or ambulatory clinics, whether before or after hospital treatment, are excluded as different classification systems are used to describe this activity (Berlowitz et al., 1995; Goldfield et al., 2008).

In what follows, we first specify in general terms the models used to estimate variation in costs or LoS for these patients. We describe the construction of the variables to be included in these models and then provide a guide to comparative interpretation of the models. We conclude with a summary of how we build on the existing literature, the analytical choices made given the constraining features of the EuroDRG project and the suggested directions for future research.

2 EMPIRICAL FRAMEWORK

2.1 Overview

An analysis of why resource use differs among patients can be undertaken by specifying a multilevel (or hierarchical) model, which recognises that patients (level 1) are clustered within hospitals (level 2). We explore variations at each level of the hierarchy by undertaking the analysis as a two-stage process (Saxonhouse, 1976; Lewis and Linzer, 2005; Jusko and Shively, 2005; Laudicella et al., 2010). In the first stage, we assess the influence on costs or LoS of a set of variables defined for each individual patient. After controlling for these patient-level factors, we assess the influence that hospitals have on costs and LoS in the second stage.

2.2 Analysis of patient-level costs

When cost data are available, in the first stage, we analyse the influence of patient characteristics on cost. The model takes the following general form:

$$ \ln y^{c}_{ik} = x_{ik}'\beta + u_k + \varepsilon_{ik} $$

where $\ln y^{c}_{ik}$ is the (logarithmic) cost of patient i in hospital k, $x_{ik}$ is a vector of characteristics of patient i in hospital k, which will be discussed in detail in Section 3.1 ('Patient-level variables'), and $\beta$ is the associated vector of coefficients. For characteristics that enter as dummy variables, their proportionate influence is calculated as $\exp(\hat{\beta}) - 1$, where $\hat{\beta}$ is the estimated coefficient on the dummy (Halvorsen and Palmquist, 1980). $u_k$ captures the hospital influence on costs over and above the patient characteristics, whereas $\varepsilon_{ik}$ is the standard disturbance.
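As an illustration of the Halvorsen and Palmquist (1980) transformation, the following minimal Python sketch converts the coefficient on a dummy variable in a log-cost equation into a proportionate effect on cost. The coefficient value and the variable name are hypothetical and chosen only for illustration.

```python
import math


def proportionate_effect(beta: float) -> float:
    """Proportionate change in cost implied by a dummy-variable
    coefficient in a log-cost equation: exp(beta) - 1."""
    return math.exp(beta) - 1.0


# Hypothetical coefficient on an 'emergency admission' dummy.
beta_emergency = 0.20
effect = proportionate_effect(beta_emergency)
print(f"{100 * effect:.1f}% higher cost")  # 22.1% higher cost
```

Reading the coefficient directly as a 20% effect would understate the implied difference; the gap between the two readings widens as the coefficient grows.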

Typically, the distribution of patient costs is positively skewed. Although we exclude outliers from the analysis,² we also prevent any remaining high-cost patients from exerting undue influence on the estimated relationships by converting costs into logarithmic form, which serves to normalise the overall distribution of costs.

With only two levels to the hierarchy (patients clustered in hospitals), the equation above can be estimated using log-linear models with fixed effects, which are appropriate given our aim to make inferences about hospitals themselves (Rice and Jones, 1997).³ These fixed effects, $u_k$, can be interpreted as a measure of hospital performance, with higher values implying that the hospital's costs are above average after taking into account the characteristics of the patients being treated (Dormont and Milcent, 2004; Dormont and Milcent, 2005).
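The logic of the fixed-effects estimation can be sketched for the simplest case of a single patient characteristic. The following plain Python within-estimator is an illustration under simplifying assumptions (one regressor, noise-free hypothetical data, hospitals labelled A and B); it is not the estimation code used in the empirical articles, and the recovered hospital effects here include the overall constant.

```python
import math
from collections import defaultdict


def fixed_effects_fit(rows):
    """rows: iterable of (hospital_id, x, cost).

    Fits ln(cost) = beta * x + u_k + e by demeaning x and ln(cost)
    within each hospital, then recovers each hospital effect u_k.
    Returns (beta, {hospital_id: u_k}).
    """
    by_hospital = defaultdict(list)
    for hospital, x, cost in rows:
        by_hospital[hospital].append((x, math.log(cost)))

    # Hospital-specific means of x and ln(cost).
    means = {
        h: (sum(x for x, _ in obs) / len(obs),
            sum(y for _, y in obs) / len(obs))
        for h, obs in by_hospital.items()
    }

    # Slope from within-hospital deviations only.
    sxy = sxx = 0.0
    for h, obs in by_hospital.items():
        mean_x, mean_y = means[h]
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    beta = sxy / sxx

    effects = {h: mean_y - beta * mean_x for h, (mean_x, mean_y) in means.items()}
    return beta, effects


# Hypothetical data: x is a 0/1 patient characteristic that raises
# log cost by 0.5; hospital B is costlier than hospital A by 0.3.
rows = [
    ("A", 0, math.exp(7.0)), ("A", 1, math.exp(7.5)), ("A", 1, math.exp(7.5)),
    ("B", 0, math.exp(7.3)), ("B", 0, math.exp(7.3)), ("B", 1, math.exp(7.8)),
]
beta, effects = fixed_effects_fit(rows)
print(round(beta, 3), {h: round(u, 3) for h, u in effects.items()})
```

With several patient characteristics the same idea applies, but the slope is obtained from a multiple regression on the demeaned variables, or equivalently by including one dummy variable per hospital.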

2.3 Analysis of patient-level length of stay

For those countries where patient-level cost data are unavailable, LoS is analysed as a proxy for resource use. As LoS is count data, the model outlined above has to be adjusted to incorporate this.

The standard model for count data analysis is Poisson regression. The underlying assumption is equidispersion, that is, the conditional mean equals the conditional variance, $E[y^{s}_{ik} \mid x_{ik}, h_k] = \mathrm{Var}[y^{s}_{ik} \mid x_{ik}, h_k] = \mu_{ik}$. The probability density function is as follows:

$$ f(y^{s}_{ik} \mid x_{ik}, h_k) = \frac{e^{-\mu_{ik}}\, \mu_{ik}^{\,y^{s}_{ik}}}{y^{s}_{ik}!} $$

where $y^{s}_{ik}$ is the LoS (number of days) of patient i in hospital k and $h_k$ represents the hospital effects we want to capture in the model.

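For some of our EoCs, the assumption of equidispersion might be too restrictive because the conditional variance exceeds the mean. The traditional generalisation is to use negative binomial (NB) models. In the standard NB model (NB2 in Cameron and Trivedi, 1998), the variance $\omega_{ik}$ is assumed to be a quadratic function of the mean, with dispersion parameter $\alpha$:

$$ \omega_{ik} = \mu_{ik} + \alpha\, \mu_{ik}^{2} $$

Then, the probability density function is as follows:

$$ f(y^{s}_{ik} \mid \mu_{ik}, \alpha) = \frac{\Gamma(y^{s}_{ik} + \alpha^{-1})}{\Gamma(y^{s}_{ik} + 1)\, \Gamma(\alpha^{-1})} \left( \frac{\alpha^{-1}}{\alpha^{-1} + \mu_{ik}} \right)^{\alpha^{-1}} \left( \frac{\mu_{ik}}{\alpha^{-1} + \mu_{ik}} \right)^{y^{s}_{ik}} $$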

However, NB models are required only if overdispersion is present. A formal test for overdispersion has been proposed to choose between the Poisson and the NB models (Lee, 1986; Cameron and Trivedi, 1986). Therefore, on the basis of this test, we analyse LoS with either Poisson or NB models, but consistently across countries for each EoC.⁴
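The difference between the two count models can be checked numerically. The sketch below evaluates the standard Poisson and NB2 probability functions (the NB2 parameterisation follows the textbook form with mean mu and variance mu + alpha * mu^2) and recovers their means and variances by summation. The values mu = 5 and alpha = 0.5 are hypothetical, and the sketch does not implement the formal overdispersion test cited above.

```python
import math


def poisson_pmf(y: int, mu: float) -> float:
    """Poisson probability of y events with mean mu (computed in logs)."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def nb2_pmf(y: int, mu: float, alpha: float) -> float:
    """NB2 probability with mean mu and variance mu + alpha * mu**2."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
             + r * math.log(r / (r + mu))
             + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def moments(pmf, upper: int = 500):
    """Mean and variance obtained by summing over 0..upper-1."""
    mean = sum(y * pmf(y) for y in range(upper))
    var = sum((y - mean) ** 2 * pmf(y) for y in range(upper))
    return mean, var


mu, alpha = 5.0, 0.5  # hypothetical values
p_mean, p_var = moments(lambda y: poisson_pmf(y, mu))
n_mean, n_var = moments(lambda y: nb2_pmf(y, mu, alpha))
print(f"Poisson: mean {p_mean:.2f}, variance {p_var:.2f}")  # mean 5.00, variance 5.00
print(f"NB2:     mean {n_mean:.2f}, variance {n_var:.2f}")  # mean 5.00, variance 17.50
```

Both distributions share the same mean, but the NB2 variance exceeds it by alpha * mu^2, which is what allows the NB model to accommodate overdispersed LoS data.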

The way that these count data models account for the clustering of patients within hospitals is not quite equivalent to the process applied in the cost model. This is because the count data fixed effects differ in theory (Hausman et al., 1984; Cameron and Trivedi, 1998) and in implementation (see xtpoisson and xtnbreg routines in StataCorp., 2009) from log-linear models with fixed effects. Instead, we apply unconditional Poisson/NB regression estimators with $h_k$ representing the deviation of hospital k from the grand mean, each of which captures the hospital's effect on LoS. This gives us:

$$ \mu_{ik} = \exp\left( x_{ik}'\beta + u_k \right) $$

Here, $u_k$ is the estimated hospital effect, which is analogous to the fixed effect in the cost models. It has been demonstrated that the direct estimation of the fixed effects in count data models can be achieved by introducing a dummy variable for each hospital (Cameron and Trivedi, 1998; Allison and Waterman, 2002). $u_k$ can then be interpreted in the same way as for the cost model.

2.4 Analysis of hospital-level characteristics

In the second stage, the estimated hospital effects, $\hat{u}_k$, are analysed to explore reasons why some hospitals have higher average costs or lengths of stay than others. We shall consider what these characteristics are in Section 3.2 ('Hospital-level variables'), but for the moment, we summarise them as a vector $z_k$ of variables measuring hospital-level characteristics in a regression of the form:

$$ \hat{u}_k = z_k'\gamma + \nu_k $$

where $\gamma$ is a vector of coefficients and $\nu_k$ is an error term.

Estimated dependent variables suffer both sampling and random error (Saxonhouse, 1976; Lewis and Linzer, 2005). Here, the sampling error is the difference between the true value of the hospital effect and its estimated value from the first stage. Random error would arise even if the second-stage dependent variable were observed directly rather than estimated. Consequently, we need to account for both sources of error, a task further complicated when the size of the measurement errors is unknown. As we want observations with smaller variance to carry larger weight in the regression, we estimate a GLS regression with weights proportional to the inverse of the squared standard errors⁵ and use Efron robust standard errors to correct for potential heteroscedasticity. These are more accurate than standard Huber–White corrections when dealing with small samples (Davidson and MacKinnon, 1983; Lewis and Linzer, 2005).
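The weighting step can be illustrated with a minimal weighted least-squares sketch in plain Python, in which each hospital effect is weighted by the inverse of its squared standard error. The single covariate (a teaching-status dummy), the hospital effects and their standard errors are hypothetical, and the Efron robust standard errors are not implemented here.

```python
def weighted_least_squares(z, u_hat, se):
    """Regress u_hat on a constant and one covariate z,
    weighting each hospital by 1 / se**2.
    Returns (intercept, slope)."""
    w = [1.0 / s ** 2 for s in se]
    total_w = sum(w)
    z_bar = sum(wi * zi for wi, zi in zip(w, z)) / total_w
    u_bar = sum(wi * ui for wi, ui in zip(w, u_hat)) / total_w
    sxy = sum(wi * (zi - z_bar) * (ui - u_bar)
              for wi, zi, ui in zip(w, z, u_hat))
    sxx = sum(wi * (zi - z_bar) ** 2 for wi, zi in zip(w, z))
    slope = sxy / sxx
    return u_bar - slope * z_bar, slope


# Hypothetical first-stage output for four hospitals.
teaching = [0, 0, 1, 1]           # 1 = teaching hospital
u_hat = [0.0, 0.2, 0.5, 0.1]      # estimated hospital effects
se = [0.1, 0.2, 0.1, 0.1]         # their standard errors

intercept, slope = weighted_least_squares(teaching, u_hat, se)
print(round(intercept, 3), round(slope, 3))  # 0.04 0.26
```

The imprecisely estimated second hospital (standard error 0.2) receives a quarter of the weight of the others, so the non-teaching average is pulled towards the first hospital's effect rather than sitting midway between the two.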

2.5 Graphical analyses

We interpret the hospital effects derived from the cost and LoS equations as measures of relative hospital performance (Laudicella et al., 2010). These capture the average cost or LoS of the hospital's patients, purged of the influence of the explanatory variables included in the first-stage equations, which are assumed to be uncorrelated with the hospital effects (Dormont and Milcent, 2005).

A visual comparison of the effects across hospitals requires standardisation of the graphical representation. For the effects derived from the cost analyses, we plot the hospital intercept $\hat{u}_k$ standardised by the sample average to derive the relative performance of each hospital in comparison with the mean. Consequently, having conditioned on the variables accounted for in the first stage, each ratio corresponds to the mean cost for the hospital in question relative to the overall mean for all hospitals in the country. For LoS, we adapt this method to our unconditional Poisson/NB regression with predicted fixed effects. Each hospital's ratio corresponds to its relative LoS compared with the national average.

The 95% confidence intervals of the hospital effects from the cost and LoS analyses assume a normal distribution of these fixed effects. Some hospitals with very wide confidence intervals are omitted from the graphs. These were omitted if, for the EoC in question, they either had very low volumes of activity or a handful of patients with markedly different costs or LoS to most other patients in the same hospital. In both cases, the hospital's effect will be estimated imprecisely.

The interpretation of the ratios and graphs is straightforward. A value of 1.3 means that patients in this hospital have 30% higher costs or LoS compared with the average for all hospitals in the country that deliver this type of care. Hospitals are ordered (from left to right) from those with the lowest mean effects to the highest. Patients in hospitals to the left have lower cost or LoS, on average, than those in other hospitals (this not being due to the explanatory variables accounted for in the first stage of the analysis). We interpret the position of each hospital relative to its national peers as a measure of its relative efficiency in managing resources.

3 EXPLANATORY FACTORS

3.1 Patient-level variables

The data used to define the variables included in the models are drawn from routine sources in each country, as summarised in Table 3. Our vector, $x_{ik}$, comprises two groups of explanatory factors, namely, those that capture the DRG to which each patient is allocated, $x^{D}_{ik}$, and a set of patient-level characteristics, $x^{P}_{ik}$, including demographic, diagnostic, treatment and quality information, such that

$$ x_{ik} = \left( x^{D}_{ik},\; x^{P}_{ik} \right) $$

Table 3. Databases by country

| Country | Data sources | Year |
|---|---|---|
| Austria | Performance-oriented Hospital Financing Framework Database | 2008 |
| Austria | Private Hospitals Financing Fund Database | 2008 |
| England | Hospital Episode Statistics | 2007/2008 |
| England | National Health Service Reference Costs | 2007/2008 |
| Estonia | Estonian Health Insurance Fund database | 2008 |
| Finland | Hospital Discharge Register, hospitals of Helsinki and Uusimaa | 2008 |
| France | National hospital cost study (ENCC; representative sample of voluntary hospitals) | 2007 |
| France | Hospital inpatient activity database (PMSI MCO; exhaustive sample of all hospitals) | 2008 |
| Germany | Research database based on patient-level data according to §21 Hospital Remuneration Act (KHEntgG) | 2008 |
| Germany | National G-DRG cost accounting standards by the Institute for the Hospital Remuneration System (InEK) | 2008 |
| Ireland | Hospital In-Patient Enquiry | 2008 |
| Poland | Central register of healthcare services and reimbursements | 2009 |
| Spain (Catalonia) | Public Hospital Network of Catalonia | 2008 |
| Spain (Catalonia) | Spanish Network of Hospital Costs | 2008/2009 |
| Sweden | National Case Costing Database | 2008 |

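Our initial specification ($M_D$) explores the extent to which DRGs explain variation in costs or LoS of patients having the same EoC:

$$ M_D:\quad y_{ik} = f\left( x^{D}_{ik},\, u_k \right) $$

where $y_{ik}$ denotes either cost or LoS and $f$ the corresponding first-stage specification set out in Section 2.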

Each patient is allocated to a single DRG, and a dummy variable is used to identify which DRG. The number of DRGs to which patients are allocated varies according to the EoC and country in question. For the 10 countries in the study, on average, 6 DRGs are used to describe the vast majority of patients having a knee replacement whereas up to 16 DRGs are used to describe patients admitted for acute myocardial infarction (AMI). The number of DRGs used to describe patients in the same EoC also varies markedly across countries because DRG classifications differ (Kobel et al., 2011). Consequently, the set of DRG dummy variables varies across both EoCs and countries.

For comparative purposes, we adopted the following procedure in constructing and ordering each country's DRG dummy variables. First, the reference DRG (which is omitted from the model) is that which captures most of the country's patients in that EoC. Second, we include a dummy variable for any other DRG if it accounts for at least 1% of patients in that EoC. In most cases, these are reported in ascending order of their associated reimbursement rate. Third, a dummy variable captures all other patients in the EoC who are not assigned to the identified DRGs.
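The three-step procedure for constructing the DRG dummy variables can be sketched as follows. This is a minimal illustration in plain Python; the DRG labels and patient counts are hypothetical, and the kept DRGs are sorted alphabetically here because reimbursement rates are not part of the example.

```python
from collections import Counter


def drg_dummy_scheme(drgs, min_share=0.01):
    """Build the dummy scheme for one country's EoC sample.

    reference: the most frequent DRG (omitted from the model);
    kept: other DRGs covering at least min_share of patients;
    to_category: maps a patient's DRG to the dummy it falls under,
                 with 'OTHER' as the residual category.
    """
    counts = Counter(drgs)
    n = len(drgs)
    reference = counts.most_common(1)[0][0]
    kept = sorted(d for d, c in counts.items()
                  if d != reference and c / n >= min_share)

    def to_category(drg):
        return drg if drg == reference or drg in kept else "OTHER"

    return reference, kept, to_category


# Hypothetical sample of 200 patients in one EoC.
sample = ["DRG_A"] * 150 + ["DRG_B"] * 45 + ["DRG_C"] * 4 + ["DRG_D"] * 1
reference, kept, to_category = drg_dummy_scheme(sample)
print(reference)             # DRG_A
print(kept)                  # ['DRG_B', 'DRG_C']
print(to_category("DRG_D"))  # OTHER
```

DRG_D accounts for 0.5% of patients, below the 1% threshold, so its patient is absorbed into the residual dummy.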

Our second specification ($M_P$) considers a set of explanatory characteristics other than DRGs to explain variation in resource use among patients. Thus, we consider the vector $x^{P}_{ik}$ in the following specification:

$$ M_P:\quad y_{ik} = f\left( x^{P}_{ik},\, u_k \right) $$

These variables describe patients according to their demographic characteristics, diagnoses, treatment and quality of care.

We construct age categories based on quintiles chosen according to the observed distribution of age for the EoC in question, with the second age category generally being the reference group. A dummy variable identifies whether the patient was male.

We consider the impact on resource use of the number of diagnoses and procedures performed, whether the patient was transferred from or to another institution as part of their care pathway, whether they were admitted through the emergency department and whether the patient died in a hospital. We also specify various variables that capture common diagnostic characteristics and procedural techniques that are specific to the EoC in question. Details of these variables are provided in the empirical articles.

We take account of the comorbidities used in the construction of the Charlson index (Charlson et al., 1987; Quan et al., 2005). Rather than using the index itself, we define three distinct patient groups based on their Charlson comorbidities. The first involves specifying six of the 17 Charlson comorbidities as ‘severe’, these being hemiplegia/paraplegia, renal disease, cancer, moderate or severe liver disease, metastatic solid tumour and AIDS/HIV (Charlson et al., 1987). The other comorbidities are designated ‘non-severe’. We then define a dummy variable indicating whether the patients suffered a single non-severe comorbidity and another dummy variable indicating at least one severe or two non-severe comorbidities; all other patients suffered no comorbidity (and form the reference group).

We modify this process if the diagnoses included in the Charlson categories are directly related to the EoC analysed (in which case they are not comorbidities). The disregarded comorbidities are myocardial infarction for the analysis of AMI and coronary-artery bypass graft (CABG), cancer and metastatic solid tumour for breast cancer and cerebrovascular disease and hemiplegia/paraplegia for stroke.
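The grouping rule, including the EoC-specific exclusions, can be sketched as follows. The sketch assumes that each patient's Charlson comorbidities have already been derived from diagnosis codes (for example with the coding algorithms of Quan et al., 2005); the category strings, EoC labels and group names are hypothetical labels chosen for illustration.

```python
# The six Charlson comorbidities designated 'severe'.
SEVERE = {
    "hemiplegia/paraplegia", "renal disease", "cancer",
    "moderate or severe liver disease", "metastatic solid tumour", "AIDS/HIV",
}

# Charlson categories disregarded because they relate directly to the EoC.
DISREGARDED = {
    "AMI": {"myocardial infarction"},
    "CABG": {"myocardial infarction"},
    "breast cancer": {"cancer", "metastatic solid tumour"},
    "stroke": {"cerebrovascular disease", "hemiplegia/paraplegia"},
}


def comorbidity_group(comorbidities, eoc):
    """Assign a patient to one of three groups.

    comorbidities: iterable of Charlson category names for the patient
                   (anything not in SEVERE is treated as non-severe);
    eoc: label of the episode of care being analysed.
    """
    present = set(comorbidities) - DISREGARDED.get(eoc, set())
    n_severe = len(present & SEVERE)
    n_non_severe = len(present - SEVERE)
    if n_severe >= 1 or n_non_severe >= 2:
        return "severe_or_multiple"
    if n_non_severe == 1:
        return "single_non_severe"
    return "none"  # reference group


print(comorbidity_group(["diabetes"], "stroke"))           # single_non_severe
print(comorbidity_group(["cancer"], "breast cancer"))      # none
print(comorbidity_group(["diabetes", "dementia"], "AMI"))  # severe_or_multiple
```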

We also include a set of variables that capture adverse events (Loke et al., 2008). We use some of the patient safety indicators (PSIs) that have been developed for use with routine hospital administrative data (Quan et al., 2008). A dummy variable indicates whether any one of the following occurred: PSI5 (foreign body left in during procedure), PSI7 (infection and inflammatory reaction due to other vascular device, implant, etc.), PSI12 (pulmonary embolism/deep vein thrombosis), PSI13 (sepsis) and PSI15 (accidental cut, puncture, perforation or haemorrhage during medical care). For childbirth, we take account of PSI18, PSI19 and PSI20 (obstetric trauma) rather than the other indicators.

Finally, we define two dummy variables measuring whether a urinary tract infection (International Classification of Diseases, 10th Revision (ICD-10) codes N30.x, N39.0, O23.x and O86.2) or a postoperative surgical infection (T81.4) was suffered during hospitalisation.
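The two infection dummies can be derived from a patient's recorded diagnosis codes by simple prefix matching. The sketch below is illustrative only: the function name and the input format (a list of ICD-10 code strings) are assumptions, and routine data may not reveal whether a coded infection was acquired during the stay or was already present on admission.

```python
# ICD-10 codes with dots removed; 'N30' and 'O23' match any subcode (N30.x, O23.x).
UTI_PREFIXES = ("N30", "N390", "O23", "O862")
WOUND_PREFIXES = ("T814",)


def infection_flags(diagnoses):
    """Return (urinary_tract_infection, postoperative_infection) as 0/1.

    diagnoses: iterable of ICD-10 code strings, with or without dots.
    """
    codes = [c.replace(".", "").strip().upper() for c in diagnoses]
    uti = any(c.startswith(UTI_PREFIXES) for c in codes)
    wound = any(c.startswith(WOUND_PREFIXES) for c in codes)
    return int(uti), int(wound)


print(infection_flags(["I21.0", "N39.0"]))  # (1, 0)
print(infection_flags(["K35.8", "T81.4"]))  # (0, 1)
print(infection_flags(["N39.1"]))           # (0, 0)
```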

The third model is our ‘full’ specification ($M_F$) that includes both the vectors of DRG variables and patient characteristics:

$$ M_F:\quad y_{ik} = f\left( x^{D}_{ik},\, x^{P}_{ik},\, u_k \right) $$

The hospital effects from the full model are those used for the graphical presentation and for the second-stage analysis, designed to identify hospital characteristics that may drive average costs or LoS. We now turn to the discussion of these characteristics.

3.2 Hospital-level variables

Our second-stage analysis is designed to explore the explanatory power of various hospital characteristics, incorporated in the vector $z_k$, on costs or LoS. This analysis is conducted only for countries that have a sufficient number of hospitals for the EoC to be included in the regression; as a rule of thumb, Montenegro (2001) suggests a minimum of 10 observations per parameter.

The characteristics considered include the hospital's teaching or ownership status, the amount and range of activity undertaken and the quality of care. We discuss the reasons why these characteristics may be related to resource use in detail elsewhere (Street et al., 2010) and offer a brief summary in Table 4.

Table 4. Hospital-level variables

| Variable | Description/rationale for consideration |
|---|---|
| Teaching status | The funding of teaching and research may not accurately reflect the full cost of these activities, opening up the possibility of cross-subsidisation with patient care. Teaching hospitals might systematically attract more severe patients within each DRG. |
| Ownership status | Public and private or for-profit and not-for-profit status may confer different incentives on staff and may also mean that hospitals are subject to different regulatory constraints (Duckett, 2001; Mason et al., 2009). |
| Volume of activity | Hospitals might benefit from economies of scale, experiencing decreasing average costs as volume increases. Beyond a particular size, diseconomies of scale may arise. Measured by (i) the size of the hospital (number of patients treated annually, in thousands) and (ii) the share of the hospital's patients that fall under the EoC in question (in percent). |
| Specialisation | If there are economies of scope, increasing the range of activity leads to lower costs (Panzar and Willig, 1981). However, hospitals that specialise may have lower costs than general hospitals (Dranove, 1987). We enhance the measure of concentration of activity across Major Diagnostic Categories (MDCs) proposed by Daidone and D'Amico (2009), measuring specialisation as the deviation from the national average hospital morbidity across MDCs. |
| Adverse events | We consider the rate per 1,000 of infections (rate 1) and of PSIs for all patients (rate 2) of the hospital. For childbirth, we consider obstetric trauma instead. |

The observed relationships between these characteristics and our second-stage dependent variables are summarised in each of the empirical articles that follow. Tables of results are not reported but are available from www.eurodrg.eu.

4 MODEL INTERPRETATIONS AND COMPARISONS

In the empirical analyses for each EoC, we estimate and compare our models to gain insight into three broad sets of issues.

First, we identify what factors explain variation in cost and LoS and assess whether these explanatory factors are generally consistent across cost and LoS equations. As is usual practice, this involves the consideration of each of the variables included in the vector $x_{ik}$ and of overall explanatory power as given by the adjusted R² statistics for the cost equations. For the LoS equations, we use the deviance R² proposed by Cameron and Windmeijer (1996) with adjustment for the number of parameters. The deviance R² is based on the sum of squared deviance residuals and generalises one of the many ways to express the usual R² statistic (Kvålseth, 1985). In addition, it can be expressed as the log-likelihood increase achieved by the fitted model relative to the maximal possible log-likelihood increase, that of a saturated model (Cameron and Windmeijer, 1996).⁶
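For the Poisson case, the deviance R² of Cameron and Windmeijer (1996) is one minus the ratio of the fitted model's deviance to the deviance of an intercept-only model. The following sketch computes it in plain Python for hypothetical observed and fitted LoS values; the adjustment for the number of parameters is omitted, and the NB version, which uses a different deviance, is not shown.

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum[y * ln(y / mu) - (y - mu)],
    with y * ln(y / mu) taken as 0 when y == 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


def deviance_r2(y, mu_hat):
    """Unadjusted deviance R-squared for a Poisson model."""
    y_bar = sum(y) / len(y)
    null_deviance = poisson_deviance(y, [y_bar] * len(y))
    return 1.0 - poisson_deviance(y, mu_hat) / null_deviance


# Hypothetical observed LoS (days) and fitted means from some model.
los = [0, 1, 2, 4, 8]
fitted = [1.0, 1.0, 2.0, 4.0, 7.0]
r2 = deviance_r2(los, fitted)
print(round(r2, 3))
```

A model that predicts the sample mean for every patient scores 0, and a model that reproduces every observation scores 1, which mirrors the behaviour of the usual R² in the cost equations.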

Second, we assess the explanatory power of the set of DRGs to which each patient is allocated, $x^{D}_{ik}$, relative to the set of characteristics included in the vector $x^{P}_{ik}$ and in relation to the fully specified model. We evaluate the explanatory power of model $M_D$ relative to model $M_F$ by measuring $a = (R^{2}_{F} - R^{2}_{D}) / R^{2}_{F}$. Similarly, the explanatory power of model $M_P$ relative to model $M_F$ is defined as $b = (R^{2}_{F} - R^{2}_{P}) / R^{2}_{F}$.

Using these measures a and b, we categorise countries according to the explanatory power of their DRG system, $x^{D}_{ik}$, relative to the set of patient characteristics included in the vector $x^{P}_{ik}$. The assessment and the interpretation depend on which category a country appears in. Generally, we distinguish between four categories.

  • Both a and b have relatively small values. In this case, we could conclude that $x^{D}_{ik}$ and $x^{P}_{ik}$ are equally successful at explaining variation, appear to play a similar function and can be considered interchangeable.
  • Both a and b have relatively large values. Hence, $x^{D}_{ik}$ and $x^{P}_{ik}$ are each only able to capture a small proportion of what the full model $M_F$ is able to explain. Possibly, the DRG system could be refined to take into account $x^{P}_{ik}$.
  • If a is larger than b, $x^{P}_{ik}$ is better than $x^{D}_{ik}$ at explaining variation. This suggests that the country's DRG system lacks specific criteria that could be used for grouping purposes. The patient characteristics included in $x^{P}_{ik}$ include these criteria.
  • If a is smaller than b, we could conclude that $x^{D}_{ik}$ performs better than $x^{P}_{ik}$. This might be because the local DRG system, using country-specific information (such as operation codes), has valuable explanatory power.
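The categorisation can be sketched in code under two explicit assumptions: that a and b measure the shortfall in R² of the DRG-only and patient-characteristics-only models relative to the full model, expressed as a share of the full model's R², and that 'relatively small' and 'relatively large' are operationalised by cut-offs. The cut-off values (0.1 and 0.5), the function names and the R² values below are hypothetical; the text does not prescribe numerical thresholds.

```python
def relative_shortfall(r2_partial: float, r2_full: float) -> float:
    """Share of the full model's R-squared not captured by a partial model."""
    return (r2_full - r2_partial) / r2_full


def categorise(r2_drg, r2_patient, r2_full, small=0.1, large=0.5):
    """Return (a, b, label). 'small' and 'large' are illustrative cut-offs."""
    a = relative_shortfall(r2_drg, r2_full)      # DRG-only model
    b = relative_shortfall(r2_patient, r2_full)  # patient-characteristics model
    if a <= small and b <= small:
        label = "DRGs and patient characteristics are interchangeable"
    elif a >= large and b >= large:
        label = "neither set alone captures much of the full model"
    elif a > b:
        label = "patient characteristics explain more than DRGs"
    elif a < b:
        label = "DRGs explain more than patient characteristics"
    else:
        label = "DRGs and patient characteristics perform equally"
    return a, b, label


# Hypothetical R-squared values for one country and EoC.
a, b, label = categorise(r2_drg=0.40, r2_patient=0.20, r2_full=0.50)
print(round(a, 2), round(b, 2), label)
```

In this example the DRG-only model recovers 80% of the full model's explanatory power against 40% for the patient characteristics, so the country falls in the last of the four categories.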

Finally, we assess the hospital effects derived from model $M_F$. These effects capture the average cost or LoS of patients in each hospital, purged of the influence on resource use of the explanatory variables in $x_{ik}$. As such, the effects can be considered measures of relative hospital performance, the relative position of each hospital not being due to differences in the characteristics of their patients or to the DRGs to which patients are allocated, which have been accounted for in the first-stage analysis. There may be other identifiable characteristics of hospitals that influence their relative performance. We are able to evaluate these characteristics statistically in those countries with a sufficient number of hospitals.

5 DISCUSSION AND CONCLUSIONS

This article builds on the econometric literature that analyses reasons why costs vary among hospital patients. We contribute to this literature in four important respects.

First, we set out methods for analysing variations in LoS that are equivalent to those used in analysing costs. There are advantages to analysing LoS, notably because information is more readily available and less subject to discretionary measurement than costs. Analysis based on LoS rather than cost may also prove more powerful at fostering behaviour change if it prompts clinicians to ask why their patients are staying longer in hospitals than are those treated elsewhere. LoS is, of course, only a partial measure of resource use, and our preference is for the analysis of cost data where these are available. When both measures are available, it is straightforward to run both analyses and compare the results. We have reserved these comparisons for country-level reports (Gaughan et al., 2012).

Second, we have set out an econometric approach to evaluating the performance of DRG systems in explaining variations in resource use among patients undergoing the same EoC. Our approach is intended to complement rather than substitute for traditional means of constructing and evaluating DRGs. Typically, these involve either (i) constructing DRGs from scratch by assessing potential grouping variables, often using Classification and Regression Tree (CART) analysis and measuring reduction in variance (Lemon et al., 2003) or (ii) comparing existing grouping systems by running the same set of patient data through them (Reid et al., 2000; Reid and Sutch, 2008). Before applying CART, it may be useful to perform econometric analysis, which can indicate the size of the influence on costs or LoS of each explanatory variable and therefore provide a good starting point for identifying candidate grouping variables.

Third, unlike those studies of hospital efficiency that rely on aggregate or average hospital costs (Hollingsworth, 2008), analysis based on patient-level data, in which hospital effects are purged of the influence of patient characteristics, is both more robust and more insightful (Laudicella et al., 2010). Although early studies based on patient-level data explored the average impact of hospital characteristics on costs (McClellan, 1997; Dormont and Milcent, 2004), here we also focus on the purged effects in their own right, interpreting them as measures of relative hospital efficiency in managing resource use. Although this has been performed previously in analysing costs (Laudicella et al., 2010; Dormont and Milcent, 2005), we extend the approach to the analysis of LoS.

Fourth, we have outlined an analytical strategy for making cross-country comparisons that does not require the pooling of data or the use of purchasing power parities to rebase costs. A key feature of the empirical analyses is that they are based on routinely available data. We have constructed a set of explanatory variables that are defined consistently across all 10 countries. However, there may be systematic differences across countries in how costs are calculated and in coding practice. For example, in Finland, it is uncommon to code beyond the primary diagnosis, whereas coding depth is greater elsewhere, such as in Germany. Differential practice prohibits the pooling of data and also means that caution needs to be exercised in making comparisons across countries. Such cautions appear in the accompanying empirical articles.

Various methodological choices reflect compromises made to ensure consistency across EoCs and countries, given the objectives of the EuroDRG project. Consistency was sought particularly in three areas: (i) to ensure comparability between those countries with cost data and those with LoS data, (ii) to ensure a common cost or LoS functional form across all countries for each EoC and (iii) to ensure a common set of explanatory variables that could be constructed from routine patient-level data in all countries and across all EoCs. In research contexts that are not subject to such overarching constraints, alternative choices might be made.

For instance, we adopted a fixed effects specification to ensure the comparability of the cost and LoS analyses. This decision was somewhat costly, requiring us to undertake a second-stage analysis to assess whether the fixed effects are driven by provider characteristics. If cost is the only variable of interest, a random effects model might be preferable, allowing the consideration of provider characteristics in a single specification (Dormont and Milcent, 2004). That said, when considerable information is available about the first-level effects, the gains from estimating both levels in a single step are modest (Lewis and Linzer, 2005).
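The two-stage logic can be illustrated with a minimal sketch. It is not the specification used in the empirical articles: it assumes a single patient-level covariate `x`, a single hypothetical hospital characteristic `hospital_z`, made-up noise-free data, and an unweighted second stage (the weighting options discussed by Lewis and Linzer, 2005, are omitted). All names are illustrative, and the estimated hospital effects here absorb the regression constant.

```python
from collections import defaultdict
from statistics import mean


def within_estimator(records):
    """First stage: records are (hospital, x, log_cost) tuples.

    Returns the slope on x and one estimated fixed effect per hospital.
    """
    by_hospital = defaultdict(list)
    for hospital, x, y in records:
        by_hospital[hospital].append((x, y))
    # Hospital-specific means of the covariate and the outcome.
    means = {
        h: (mean(x for x, _ in obs), mean(y for _, y in obs))
        for h, obs in by_hospital.items()
    }
    # Slope from deviations around hospital means (the within transformation).
    num = sum((x - means[h][0]) * (y - means[h][1]) for h, x, y in records)
    den = sum((x - means[h][0]) ** 2 for h, x, _ in records)
    beta = num / den
    # Fixed effect: hospital mean outcome net of the patient covariate.
    effects = {h: y_bar - beta * x_bar for h, (x_bar, y_bar) in means.items()}
    return beta, effects


def simple_ols(z, u):
    """Second stage: intercept and slope from regressing u on z."""
    z_bar, u_bar = mean(z), mean(u)
    slope = sum((zi - z_bar) * (ui - u_bar) for zi, ui in zip(z, u)) / sum(
        (zi - z_bar) ** 2 for zi in z
    )
    return u_bar - slope * z_bar, slope


# Made-up data: log cost = hospital effect + 0.5 * x, with no noise.
records = [
    ("A", 0, 1.0), ("A", 1, 1.5), ("A", 2, 2.0),
    ("B", 1, 1.7), ("B", 2, 2.2), ("B", 3, 2.7),
    ("C", 0, 1.4), ("C", 2, 2.4), ("C", 4, 3.4),
]
hospital_z = {"A": 0, "B": 1, "C": 2}  # hypothetical hospital characteristic

beta, effects = within_estimator(records)
hospitals = sorted(effects)
intercept, slope = simple_ols(
    [hospital_z[h] for h in hospitals], [effects[h] for h in hospitals]
)
```

In this toy example, the first stage recovers the slope on `x` and the three hospital effects, and the second stage relates those effects to `hospital_z`. A random effects model would instead estimate both levels in one specification.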

Similarly, we chose to model log-transformed costs because this allows an easy comparison of the proportionate influence of explanatory variables on costs across all countries, irrespective of their currencies. Where cross-country comparisons are not being made, and depending on the distribution of costs, analysis of costs in linear or gamma forms may in some situations be preferable (Manning and Mullahy, 2001). Note though that the empirical gains may not be substantial for either cost or LoS analyses; for instance, Austin et al. (2002, 2003) compared several regression models and found that they have similar ability to predict costs or LoS after CABG surgery.
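A short sketch may help show why the log form makes effects comparable across currencies. It assumes a single binary explanatory variable, in which case the regression coefficient on log cost equals the difference in mean log cost between the two groups. The cost figures and the conversion `rate` are made up for illustration.

```python
import math
from statistics import mean


def log_difference(costs_with, costs_without):
    """Coefficient on a single dummy in a log-cost regression."""
    return mean(math.log(c) for c in costs_with) - mean(
        math.log(c) for c in costs_without
    )


# Hypothetical costs for patients with and without some characteristic.
costs_with = [1200.0, 1500.0, 1800.0]
costs_without = [1000.0, 1250.0, 1500.0]
rate = 7.5  # hypothetical currency conversion factor

beta = log_difference(costs_with, costs_without)
# Rescaling every cost adds log(rate) to both group means, so it cancels.
beta_rescaled = log_difference(
    [c * rate for c in costs_with], [c * rate for c in costs_without]
)
# Proportionate effect on cost implied by the coefficient.
effect = math.exp(beta) - 1.0
```

Because a change of currency multiplies every cost by the same factor, it shifts only the constant term, and the coefficient (and hence the proportionate effect `exp(beta) - 1`) is unchanged.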

Finally, the full specification, MF, used to estimate hospital effects comprises both the vector of DRG variables and the vector of patient-level characteristics. There may be multicollinearity between variables in these vectors, particularly as DRGs can be a function of patient characteristics. In the empirical analyses, variables that were highly collinear (r > 0.8) with DRG variables were excluded. In any case, including irrelevant variables in the model does not bias the estimates of the hospital effects (Wooldridge, 2002).
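The collinearity screen could be implemented along the following lines. This is a sketch under stated assumptions rather than the procedure actually used: it takes r to be the pairwise Pearson correlation, applies the threshold to its absolute value, and uses made-up variable names and data.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant variable has no defined correlation
    return cov / sqrt(var_a * var_b)


def screen(patient_vars, drg_dummies, threshold=0.8):
    """Keep patient variables whose |r| with every DRG dummy is <= threshold."""
    kept = {}
    for name, values in patient_vars.items():
        worst = max(abs(pearson(values, d)) for d in drg_dummies.values())
        if worst <= threshold:
            kept[name] = values
    return kept


# Made-up example: one DRG dummy and two candidate patient variables.
drg_dummies = {"drg_a": [1, 1, 1, 1, 0, 0, 0, 0]}
patient_vars = {
    # Identical to the DRG dummy, as if the DRG were defined by it.
    "severe_comorbidity": [1, 1, 1, 1, 0, 0, 0, 0],
    # Unrelated to DRG assignment in this toy data.
    "emergency_admission": [1, 0, 1, 0, 1, 0, 1, 0],
}
kept = screen(patient_vars, drg_dummies)
```

Here `severe_comorbidity` is excluded because it duplicates the DRG dummy, whereas `emergency_admission` is retained.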

In summary, we have set out an analytical strategy to examine variations in resource use, whether cost or LoS, of patients hospitalised with different conditions. This allows us (i) to identify those factors that explain variation in resource use across patients, (ii) to assess the explanatory power of DRGs relative to other patient and treatment characteristics and (iii) to assess relative hospital performance in managing resources and the characteristics of hospitals that explain this performance. Although the approach has been tailored to the requirements of the EuroDRG project to support the empirical articles that appear in this special issue, the methods should be of general interest to researchers evaluating variation in resource use among patients and across healthcare providers and countries.

CONFLICT OF INTEREST

The authors declare no conflict of interest.

ACKNOWLEDGEMENTS

The authors thank all members of the EuroDRG project team and, particularly, Reinhard Busse, Nils Gutacker, Unto Häkkinen, Jacqueline O'Reilly, Gunnar Rosenqvist and Susanne Strohmaier for their advice and guidance. They are also grateful to the journal's referees for their constructive comments. The usual caveat applies.

  • 1

    Austria, England, Estonia, Finland, France, Germany, Ireland, Poland, Sweden and Spain (Catalonia).

  • 2

    Newborns (younger than 1 year) were dropped from the sample, as were hospitals where records were available for fewer than five patients. Cost outliers are identified with a bilateral trim based on three times the standard deviation of the cost distribution. LoS distributions are highly skewed, so only the right-tailed length of stay outliers are identified on a log-transformation of LoS with an upper trim also based on three times the standard deviation threshold. Note that hospital-level variables have been calculated on the full data set (before dropping any outlier).

  • 3

    Random effects can also be estimated, and for some episodes of care and for some countries, the Hausman (1978) test might indicate that these are to be preferred. If our interest was solely in costs, a random effects model would be preferable (provided x_ik and u_k are uncorrelated), as it would allow the consideration of both patient- and hospital-level characteristics in a single model. We adopted a fixed effects specification though to ensure comparability with the LoS equations and because we seek inferences about specific hospitals rather than the population from which they might be drawn.

  • 4

    This test usually gave consistent results across countries about whether it is appropriate to run a Poisson or an NB model, providing assurance about the appropriateness of the chosen specification for each episode of care.

  • 5

    See the Stata Manual (StataCorp, 2009) and Lewis and Linzer (2005) for a more detailed discussion about the different options.

  • 6

    In addition to the deviance R², we have evaluated the model performance of LoS models using other measures of fit (i.e. AIC, BIC, pseudo-R² and correlation R²; Cameron and Windmeijer, 1996) to ensure that the conclusions are not conditional upon the measure used.
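The outlier rules described in note 2 can be sketched as follows. The note does not state whether the standard deviation is computed for the whole sample or within groups, so this sketch assumes a single pooled sample, the sample standard deviation and a LoS of at least one day (so that the logarithm is defined). The data are made up.

```python
import math
from statistics import mean, stdev


def trim_costs(costs, k=3.0):
    """Bilateral trim: keep costs within k standard deviations of the mean."""
    m, s = mean(costs), stdev(costs)
    return [c for c in costs if m - k * s <= c <= m + k * s]


def trim_los(stays, k=3.0):
    """Upper trim only, applied on the log scale; assumes every stay >= 1."""
    logs = [math.log(d) for d in stays]
    upper = mean(logs) + k * stdev(logs)
    return [d for d, v in zip(stays, logs) if v <= upper]


# Made-up samples, each containing one extreme value.
costs = list(range(900, 1090, 10)) + [100000]
stays = [3, 4, 5, 6] * 5 + [400]

kept_costs = trim_costs(costs)
kept_stays = trim_los(stays)
```

In both toy samples, only the single extreme value lies beyond the three standard deviation threshold and is dropped.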
