Commentary: A Systematic Review of Health Care Efficiency Measures


  • Leah F. Binder,

  • Barbara Rudolph

Frustration with high costs and poor quality in American health care led private sector purchasers of health care benefits to form our organization, The Leapfrog Group, in 2000. Purchasers use Leapfrog to translate and then deploy the most important evidence about the levers that improve quality and reduce costs in hospital care. Leapfrog bridges the gap between the researchers and experts studying the value of health care services and business leaders writing the checks for those services. Our Leapfrog Hospital Survey is a publicly available dashboard of the National Quality Forum (NQF)-endorsed, evidence-based measures that research suggests have the greatest impact on quality, safety, and cost-effectiveness of hospital care. Employers and other purchasers use Leapfrog data to encourage beneficiaries to use the highest performing hospitals.

Given our organization's role mediating between purchasers and researchers, we eagerly anticipated this review of the literature on health care efficiency (Hussey et al. 2009). Few issues are more critical to purchasers than efficiency; with rising insurance premiums eating away at profits in a down economy, purchasers look to us every day to plumb the literature for new strategies for reducing costs. But it is also important to purchasers that cost reductions not compromise quality. Investment of any magnitude in employee benefits is only worthwhile if employees are relatively satisfied and care is good. Dissatisfied, unhealthy employees, victims of poor quality care, contradict the very purpose of having health benefits in the first place. If purchasers invest in benefits, they want to see a benefit.

We are thus disappointed to learn from this review that most of the literature on health care efficiency does not account for quality outcomes. Nearly all of the studies it examined defined efficiency by the resources and costs involved in particular functions within health care, without regard to the quality and outcome of those functions. While the number of workers and widgets needed to accomplish a task is certainly a factor in understanding efficiency, so is verification that the task was necessary in the first place, and that the task led to a worthwhile accomplishment. A task that was done for no purpose or botched is inefficient no matter how many resources were used to undertake it. Health care services that are clinically unnecessary or lead to poor outcomes should not be compared on equal terms with services of higher quality.

Our disappointment in the literature compounds when we consider the unique link between quality and costs in health care—a more profound and complicated link than in other industries. Arguably, quality problems are the single most significant source of high costs in health care, so efficiency studies without quality considerations leave out a critical variable. Worse than leaving knowledge gaps, however, is the risk of producing irrelevant or even spurious conclusions about our health care system at a time when there is such a strong need for good answers. Studies that measure only resource utilization for procedures need to make the following assumptions about the procedures they select for investigation: (1) that the studied procedures are needed and/or appropriate, because if they were unnecessary they were inefficient by definition, (2) that the outcomes of services are generally uniform across settings and populations, because studies do not control for variation in outcomes, and (3) that the larger set of comparison data on the efficiency of similar providers or hospitals, used in stochastic frontier analysis (SFA) and data envelopment analysis (DEA), comes from necessary and appropriate procedures that led to a generally uniform and predictable set of outcomes.
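The effect of the first two assumptions can be illustrated with a minimal sketch. It uses two hypothetical hospitals with invented figures, and it assumes that the share of appropriate procedures and the share with good outcomes are known, which is rarely true in practice. The names `Hospital`, `cost_per_procedure`, and `cost_per_good_outcome` are illustrative and do not come from the review.

```python
from dataclasses import dataclass


@dataclass
class Hospital:
    name: str
    total_cost: float         # total spending on the procedure (hypothetical)
    procedures: int           # number of procedures performed
    appropriate_share: float  # fraction clinically appropriate (assumed known)
    success_share: float      # fraction of appropriate procedures with a good outcome


def cost_per_procedure(h: Hospital) -> float:
    """Resource-only view: every procedure counts as useful output."""
    return h.total_cost / h.procedures


def cost_per_good_outcome(h: Hospital) -> float:
    """Quality-aware view: only appropriate, successful procedures count."""
    good_outcomes = h.procedures * h.appropriate_share * h.success_share
    return h.total_cost / good_outcomes


# Invented numbers, chosen only to show that the two views can disagree.
hospitals = [
    Hospital("A", total_cost=1_000_000, procedures=100,
             appropriate_share=0.95, success_share=0.95),
    Hospital("B", total_cost=800_000, procedures=100,
             appropriate_share=0.70, success_share=0.80),
]

cheapest_by_resources = min(hospitals, key=cost_per_procedure).name
cheapest_by_outcomes = min(hospitals, key=cost_per_good_outcome).name
print(cheapest_by_resources, cheapest_by_outcomes)
```

In this invented example, hospital B looks more efficient when only resources per procedure are counted, while hospital A looks more efficient once unnecessary and unsuccessful procedures are excluded from the output.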

The first assumption, that services are needed or appropriate, is challenged by a considerable body of literature on the overuse of certain procedures as well as the regional variation in utilization of key services. Studies show unexplained variation in the intensity of medical services among geographic areas, with estimates attributing as much as 30 percent of total health care spending to this variation (Wennberg, Fisher, and Skinner 2002). Evidence of overuse is also abundant. According to a review of the quality literature by Schuster and colleagues, “The dominant finding in our review is that there are large gaps between the care people should receive and the care they do receive. This is true for all three types of care—preventive, acute, and chronic—whether one goes for a check-up, a sore throat, or diabetic care … It is true for all age groups, from children to the elderly. And it is true whether one is looking at the whole country or a single city” (Schuster, McGlynn, and Brook 1998). Studies show such costly overuse as:

The second assumption, that the outcomes of services are generally predictable and uniform across settings and populations, also contradicts evidence. First, considerable evidence suggests that adverse events and infections compromise outcomes and add considerable costs. For example, studies from the Harvard Medical School estimate that adverse events account for 5 percent or more of total health care spending or $100 billion a year (Thomas et al. 1999). Hospital-acquired infections may affect 5–10 percent of all patients admitted to hospitals each year, resulting in an estimated 90,000 deaths each year and waste totaling $4.5 to $5.7 billion (Burke 2003).

In addition, the Leapfrog Hospital Survey reports sometimes-dramatic variation among hospitals in risk-adjusted mortality rates and adherence to best clinical practices for certain procedures. More than 50 percent of the 1,276 reporting hospitals in 2008 failed to meet Leapfrog standards for each of the procedures in our study, and for some procedures only a handful of hospitals met standards. These procedures and the percentage of reporting hospitals meeting standards in 2008 were:

  • Coronary artery bypass graft (43 percent met standards),
  • Percutaneous coronary interventions (35 percent),
  • High-risk deliveries (32 percent),
  • Pancreatic resection (23 percent),
  • Bariatric surgery (16 percent),
  • Esophagectomy (15 percent),
  • Aortic valve replacement (7 percent), and
  • Abdominal aortic aneurysm repair (5 percent).

Research points to outcomes differentials among hospitals associated with surgical volumes as well as observed hospital mortality. Applying the literature on this correlation, John Birkmeyer, MD, of the University of Michigan, and Justin Dimick, MD, of the Veterans Administration, devised a survival predictor for estimating hospital mortality on the Leapfrog Survey (Birkmeyer and Dimick 2008).

The final assumption in the efficiency research seems to be that it is possible to develop a meaningful dataset of providers or hospitals that can be used as the standard-bearer for comparing resource utilization for a given service or procedure. Although the research approaches are different, both SFA and DEA depend on the identification of such a database. Yet the literature on quality cautions against assuming any uniformity in the performance of the health care system nationally and suggests that variation is more the rule than the exception. A database of procedures needs to differentiate between those that are clinically appropriate and those that are not. Moreover, the literature suggests the database will be different depending on which area of the country it comes from, since utilization patterns vary. Finally, a database of procedures ought to differentiate between those that were successful and those that were not. We know from the literature that procedures performed in one hospital may generally lead to a poor outcome, while in another hospital they may generally lead to positive outcomes.

Even if researchers found a way to isolate a dataset that demonstrated less variability, it is unlikely the dataset would serve as a beacon of what is possible for a high performing health care system. Unfortunately, the United States as a whole performs poorly compared with other industrialized nations in a number of population health indicators, while spending far more per capita on health care services. On a macro level, the American health care system is neither efficient nor high quality; thus, a dataset that relies on data from the U.S. health care system cannot be presumed optimal. Indeed, although resources limited the scope of these researchers' methodology, we hope there will be a follow-up review of the international studies of efficiency that were excluded from this review. Those studies may open new horizons for measuring and promoting efficiency and raise the bar on what we consider optimal measures of efficiency.

The Leapfrog Group defines efficiency as the AQA Alliance and NQF do: a combination of resource utilization and outcomes. In our survey, we rate efficiency of four procedures/conditions: coronary artery bypass graft (CABG), percutaneous coronary interventions (PCI), acute myocardial infarction (AMI), and pneumonia. To assess resource utilization, Leapfrog measures severity-adjusted average length of stay inflated by readmission rate. For outcomes, we consider risk-adjusted mortality rates. The results suggest that some hospitals use few resources to achieve the highest quality outcomes, some use many resources to achieve poor outcomes, and some are in between. The data pose more questions than they answer, and we encourage researchers to investigate the factors that lead to these variations in hospital quality and efficiency.
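As a rough illustration of how a two-part measure of this kind can sort hospitals, the sketch below combines a resource score with an outcome score. It is not Leapfrog's published methodology, which is described on our website. In particular, it assumes that "inflated by readmission rate" means multiplying length of stay by one plus the readmission rate, and it assumes median cut points; both choices, along with the names `HospitalResult`, `resource_use`, and `classify` and all of the figures, are hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class HospitalResult:
    name: str
    adjusted_alos: float       # severity-adjusted average length of stay (days)
    readmission_rate: float    # fraction of discharges readmitted
    adjusted_mortality: float  # risk-adjusted mortality rate


def resource_use(h: HospitalResult) -> float:
    # ASSUMPTION: "inflated by readmission rate" is modeled as ALOS * (1 + rate).
    return h.adjusted_alos * (1.0 + h.readmission_rate)


def classify(hospitals: list[HospitalResult]) -> dict[str, str]:
    # ASSUMPTION: hospitals are split at the median on each dimension.
    resource_cut = median(resource_use(h) for h in hospitals)
    mortality_cut = median(h.adjusted_mortality for h in hospitals)
    labels = {}
    for h in hospitals:
        low_resources = resource_use(h) <= resource_cut
        low_mortality = h.adjusted_mortality <= mortality_cut
        if low_resources and low_mortality:
            labels[h.name] = "few resources, good outcomes"
        elif not low_resources and not low_mortality:
            labels[h.name] = "many resources, poor outcomes"
        else:
            labels[h.name] = "in between"
    return labels


# Invented figures for four hypothetical hospitals.
sample = [
    HospitalResult("H1", adjusted_alos=4.0, readmission_rate=0.10, adjusted_mortality=0.02),
    HospitalResult("H2", adjusted_alos=6.0, readmission_rate=0.20, adjusted_mortality=0.05),
    HospitalResult("H3", adjusted_alos=4.5, readmission_rate=0.10, adjusted_mortality=0.05),
    HospitalResult("H4", adjusted_alos=6.5, readmission_rate=0.15, adjusted_mortality=0.02),
]

labels = classify(sample)
for name, label in labels.items():
    print(name, label)
```

The point of the sketch is only that resource use and outcomes are scored separately and then read together, so a hospital with short stays and high mortality does not count as efficient.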

Leapfrog is committed to full transparency, so our definition and criteria for developing efficiency data are described in detail on our website. Again, we encourage researchers to consider these data or construct studies that consider both quality outcomes and resource utilization in evaluating efficiency. From our perspective, the value of the data is that purchasers can use them to understand the factors that are inflating costs and driving down quality. The nation is looking to its research community for advice and guidance for health care reform amid economic crisis; we must begin to understand better what leads to increased costs.

Given the variability in necessity and outcomes of care, it is not enough to simply report on resources used for individual services. Doing so does nothing to pinpoint the major drivers of excess costs and waste. Moreover, an inventory of costs without a link to quality does not give purchasers or elected officials a way to reduce costs while earning the support of their constituencies. The American ethos places special value on human life and dignity and rejects the notion that cost-cutting is acceptable if doing so damages the health of a patient; thus, the public expects cost-cutting to improve quality as well. Unless researchers report the impact of a cost reduction on the quality of care, they fail to supply the evidence needed to make the case for change. While researchers should not be led in their inquiry by politicians or employers, they ought to shine the light of good evidence on the pathways of the leaders. Efficiency data are critical to these leaders.