A typical patient of mine will have a better outcome if his or her liver transplant is performed at a higher volume program. That fact has been known for a number of years, and it seems to make sense to most people. After all, practice makes perfect, right? Unfortunately, it is not so simple in the real world. Not all patients are typical, very few donor organs are truly average in quality, and the performance of all transplant programs with a certain volume of activity is not uniform.
Donor quality, an elusive concept, is partially quantified by the donor risk index (DRI) for livers recovered from deceased donors.1 The DRI is a composite measure that considers the following: the donor's age, race/ethnicity, and height; the cause and type of death (eg, donation after cardiac death); the type of graft (whole versus partial); the donor's location with respect to the local organ procurement organization service area; and the cold ischemia time. On average, for the typical patient, and at the typical liver transplant program, a higher DRI is associated with worse outcomes, and a lower DRI is associated with better outcomes.
Lately, we have been experiencing a form of Lake Wobegon syndrome: all donated organs seem to be worse than average. Of course, this is not mathematically possible. What is true, however, is that the average organ is getting worse, at least according to DRI measurements. This is not surprising because donor age is a prominent DRI component, and the donor population is aging along with the rest of us.
How well are we doing with these higher DRI organs, and are higher volume programs particularly good at getting better results with them? In this issue of Liver Transplantation, Ozhathil et al.2 examine these questions and offer their take on the answers. They analyzed data from a cohort of US deceased donor liver transplants (2002-2008). They split the cohort according to the calculated DRIs: half had a DRI greater than 1.90, and the other half had a DRI less than or equal to 1.90. The heart of the analysis focused on the half-cohort of 15,668 so-called high-DRI transplants, which perhaps would have been better termed higher DRI transplants.
There were not large differences between the high-, medium-, and low-volume programs. The average unadjusted probabilities of a functioning liver transplant after 5 years were 60.3% at low-volume programs, 60.6% at medium-volume programs, and 62.6% at high-volume programs. The corresponding probabilities of being alive were 64.7%, 66.8%, and 68.3%. In all likelihood, the ranges of these results overlapped across the 3 program volume tertiles, although these data are not reported. Moreover, it is likely that some individual low- or medium-volume programs achieved better results than certain high-volume programs did, as previously reported by Axelrod et al.3 Programs should not be judged purely on the basis of their volume.
Multivariate statistical models were used to adjust for differences in patient and program characteristics that might have affected the outcomes of interest and to gauge relative outcomes. This is the “compared to what” question. A higher DRI within the higher DRI half-cohort was associated with significantly higher risks of graft failure and patient death in comparison with a lower DRI (again within the higher DRI half-cohort). A higher annual center volume was also associated with significantly lower risks of graft failure and death. Unfortunately, some of the factors in the final models were not significant, and the goodness of fit was not reported. Although this limits our ability to interpret the results, we can still accept that higher volume programs on average achieve better results than lower volume programs, even when the focus is on the 50% of donor organs that are in the upper half of the DRI distribution.
The key question, however, is whether higher volume programs do an especially good job with truly high-DRI organs in comparison with medium- and low-volume programs. This critically important issue is not addressed by Ozhathil et al.2 There are methods that could have been applied; they could have started with a statistical test of the interaction between the program volume and the DRI. In other words, are the effects of the program volume and the DRI more than simply additive and independent? An affirmative answer would have suggested that higher volume programs have identifiable attributes or practice patterns that (1) mitigate the adverse effects of higher DRI organs and (2) distinguish them from lower volume programs in ways unrelated to their volume.
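One way to write down the interaction test described above is sketched here. It assumes a proportional hazards model, which is an assumption made for illustration; the model family used by Ozhathil et al. is not specified here. The symbols (h_0 for the baseline hazard, Vol for annual program volume, z for the remaining adjustment covariates, and the coefficients) are likewise illustrative names.

```latex
h(t \mid \mathrm{DRI}, \mathrm{Vol}, z)
  = h_0(t)\,\exp\bigl(\beta_1\,\mathrm{DRI}
      + \beta_2\,\mathrm{Vol}
      + \beta_3\,(\mathrm{DRI}\times\mathrm{Vol})
      + \gamma^{\top} z\bigr),
\qquad H_0\colon \beta_3 = 0
```

Under this sketch, the log hazard change per unit of DRI is \beta_1 + \beta_3 Vol. Rejecting H_0 with \beta_3 < 0 would indicate that the penalty attached to a higher DRI shrinks as program volume rises, which is to say that the two effects are more than simply additive.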
The authors further posit that greater experience with higher DRI organs in higher volume programs might lead to improved outcomes by a mass effect.2 This kind of learning curve is well established for living donor liver transplantation4 and could be involved here. A learning curve effect could be evaluated by the assignment of a high-DRI case experience number to each patient. The first patient receiving a donor liver with a DRI > 1.90 at a given program would be labeled high-DRI case 1, the second patient would be labeled high-DRI case 2, and subsequent patients would be labeled in the same fashion. The high-DRI case number would then be tested as a covariate in the predictive model. A significant result would support the hypothesis that increasing experience with high-DRI livers is associated with better outcomes independently of the overall program volume.
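The labeling scheme described above might be sketched as follows. The record layout (the keys patient, program, date, and dri) and the function name are hypothetical and chosen only for illustration; the 1.90 threshold is the cohort split reported in the study.

```python
from collections import defaultdict
from datetime import date

HIGH_DRI_THRESHOLD = 1.90  # median split reported for the study cohort


def assign_high_dri_case_numbers(transplants):
    """Label each high-DRI transplant with its running count within its program.

    `transplants` is a list of dicts with hypothetical keys
    'patient', 'program', 'date', and 'dri'. Transplants at or below the
    threshold receive None rather than a case number.
    """
    counters = defaultdict(int)  # high-DRI cases seen so far, per program
    labeled = []
    # Process each program's transplants in chronological order.
    for rec in sorted(transplants, key=lambda r: (r["program"], r["date"])):
        out = dict(rec)
        if rec["dri"] > HIGH_DRI_THRESHOLD:
            counters[rec["program"]] += 1
            out["high_dri_case"] = counters[rec["program"]]
        else:
            out["high_dri_case"] = None
        labeled.append(out)
    return labeled


# Made-up records for illustration only.
sample = [
    {"patient": "p1", "program": "A", "date": date(2003, 1, 5), "dri": 2.10},
    {"patient": "p2", "program": "A", "date": date(2003, 3, 9), "dri": 1.40},
    {"patient": "p3", "program": "A", "date": date(2004, 7, 1), "dri": 1.95},
    {"patient": "p4", "program": "B", "date": date(2002, 11, 20), "dri": 2.30},
]
labeled = assign_high_dri_case_numbers(sample)
case_numbers = {r["patient"]: r["high_dri_case"] for r in labeled}
```

The resulting high_dri_case value is what would enter the model as a covariate alongside the overall program volume.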
Yet another approach would test whether all the patients at a given program benefit if the program has a higher median DRI. With an instrumental variable approach,5 each program's median DRI would be assigned to each of the program's patients in the analysis. If the median DRI of a program is a useful instrument, it should be significant in the model and trump each patient's individual DRI.
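The data preparation step for this approach, in which each patient record carries its program's median DRI, could look roughly like the sketch below. The record keys and the function name are hypothetical, and the sketch covers only the construction of the covariate, not the model fitting or any check of whether the median is a valid instrument.

```python
from collections import defaultdict
from statistics import median


def attach_program_median_dri(transplants):
    """Return copies of the records with each program's median DRI attached.

    `transplants` is a list of dicts with hypothetical keys
    'patient', 'program', and 'dri'.
    """
    dri_by_program = defaultdict(list)
    for rec in transplants:
        dri_by_program[rec["program"]].append(rec["dri"])

    # One median per program, computed over all of that program's transplants.
    program_median = {p: median(v) for p, v in dri_by_program.items()}

    # Every patient inherits the median of the program that performed the transplant.
    return [
        dict(rec, program_median_dri=program_median[rec["program"]])
        for rec in transplants
    ]


# Made-up records for illustration only.
sample = [
    {"patient": "p1", "program": "A", "dri": 1.5},
    {"patient": "p2", "program": "A", "dri": 2.1},
    {"patient": "p3", "program": "A", "dri": 2.4},
    {"patient": "p4", "program": "B", "dri": 1.25},
    {"patient": "p5", "program": "B", "dri": 1.75},
]
with_median = attach_program_median_dri(sample)
medians = {r["patient"]: r["program_median_dri"] for r in with_median}
```

Both the individual dri and the program_median_dri would then be entered into the model so that their contributions can be compared.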
In an era of inexorably rising DRIs, we have a pressing responsibility to look for ways to improve the results of liver transplantation. Rather than just examining an ever-longer list of patient and donor characteristics, we must start to look at the health care delivery system surrounding the patient. The Dialysis Outcomes and Practice Patterns Study is a good example of this approach.6 Dialysis facilities are scrupulously surveyed to identify practices that may portend better or worse outcomes. When significant associations between facility practices and outcomes are identified, this kind of study serves a useful hypothesis-generating function and permits the formulation of quantitatively developed markers of best practices.
In liver transplantation, we know a lot about the patients and their donors. It is now time for us to deconstruct the center effect and focus on our programs and our practices as determinants of outcomes after liver transplantation.