This work was funded by intramural grant support from the Department of Anesthesiology of Columbia University College of Physicians and Surgeons (New York, NY).
Address reprint requests to Gebhard Wagener, M.D., Department of Anesthesiology, College of Physicians and Surgeons, Columbia University, 630 West 168th Street, P&S Box 46 (PH-5), New York, NY 10032-3784. Telephone: 212-305-6494; FAX: 212-305-2182; E-mail: firstname.lastname@example.org
Early allograft dysfunction (EAD) is a serious but ill-defined complication after liver transplantation (LT). Many different factors can affect outcomes after LT. Recipient and donor characteristics as well as intraoperative and postoperative complications may precipitate graft dysfunction, which may ultimately progress to graft failure and possibly death.[1-3] An early evaluation of graft function is important for identifying patients who may benefit from timely relisting and possibly from retransplantation. Early relisting is especially crucial when donor availability is scarce.[4-6]
Multiple previous definitions of EAD have been suggested to help in identifying patients at risk for graft loss.[7-10] However, these definitions are based on arbitrarily chosen laboratory cutoff values or subjective clinical variables such as encephalopathy. Although these definitions correlate with short-term outcomes, they may not represent the best possible definition, which should maximize accuracy and predictive power.
The aim of the present study was to determine the ability of various postoperative laboratory values and the postoperative physiological Model for End-Stage Liver Disease (MELD) score to predict early graft loss and mortality. We further compared these results to 2 previously published definitions of EAD[7, 8] and devised a definition that maximizes sensitivity and specificity for predicting early graft loss and mortality.
PATIENTS AND METHODS
All adult patients undergoing cadaveric LT at Columbia University Medical Center were eligible for inclusion in this study. The study was approved by the institutional review board of Columbia University, and the requirement to obtain informed consent was waived for the retrospective analysis of existing data.
Basic demographic data were collected. The international normalized ratio (INR) and the total bilirubin, aspartate aminotransferase (AST), alanine aminotransferase (ALT), and serum albumin levels were routinely determined before transplantation and then at least daily postoperatively for the duration of the hospital stay. Physiological MELD scores were calculated daily on the basis of these laboratory values. The peak values of the INR, total bilirubin level, AST level, and MELD score and the nadir serum albumin levels on postoperative days 0 to 7 were compared.
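The daily score calculation can be sketched as follows. This is a minimal illustration that assumes the standard MELD formula and the usual UNOS conventions (values below 1.0 set to 1.0, creatinine capped at 4.0 mg/dL); the text does not state which bounds were applied to the physiological score, and the function names and sample values are hypothetical.

```python
import math


def meld_score(bilirubin_mg_dl, inr, creatinine_mg_dl, on_dialysis=False):
    """Standard MELD formula; the bounds below are assumptions, not from the study."""
    # Assumed UNOS conventions: floor every input at 1.0, cap creatinine at
    # 4.0 mg/dL, and assign 4.0 mg/dL to patients on dialysis.
    bili = max(bilirubin_mg_dl, 1.0)
    inr_bounded = max(inr, 1.0)
    crea = 4.0 if on_dialysis else min(max(creatinine_mg_dl, 1.0), 4.0)
    return (3.78 * math.log(bili)
            + 11.2 * math.log(inr_bounded)
            + 9.57 * math.log(crea)
            + 6.43)


def peak_meld(daily_labs):
    """Peak MELD over (bilirubin, INR, creatinine) tuples, one tuple per day."""
    return max(meld_score(b, i, c) for b, i, c in daily_labs)


# Hypothetical values for postoperative days 0 to 2 (not study data).
labs = [(6.0, 2.1, 1.4), (5.2, 1.8, 1.3), (4.1, 1.5, 1.1)]
print(round(peak_meld(labs), 1))
```

The score is left unrounded here so that a cutoff with one decimal place can be applied to it directly.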
Early Graft Failure (EGF)
Our primary endpoint was EGF, which was defined as mortality or retransplantation within 90 days of transplantation. Mortality data were collected from the Social Security Death Index database, which was provided by Clinical Information Systems at New York–Presbyterian Hospital, and other databases.
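As an illustration, the endpoint can be expressed as a simple rule. The sketch below assumes that day 90 itself counts as within the window, which the text does not specify; the function and parameter names are hypothetical.

```python
from datetime import date


def early_graft_failure(transplant, death=None, retransplant=None, window_days=90):
    """True if death or retransplantation occurred within window_days of transplant."""
    events = [d for d in (death, retransplant) if d is not None]
    # Assumption: the window is inclusive of day 90.
    return any(0 <= (d - transplant).days <= window_days for d in events)


# Hypothetical patient who died 43 days after transplantation.
print(early_graft_failure(date(2005, 1, 1), death=date(2005, 2, 13)))
```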
We compared our results to previously used definitions of EAD:
Either a total bilirubin level > 10 mg/dL or an INR > 1.6 on postoperative day 7 or an AST or ALT level > 2000 IU/L within the first 7 days (Olthoff et al.).
A peak total bilirubin level > 10 mg/dL on postoperative days 2 to 7. We removed 2 categories that Deschênes et al. used: encephalopathy and a prothrombin time greater than 17 seconds. Encephalopathy was not defined by Deschênes et al. and could, therefore, not be reproduced. The prothrombin time needs to be corrected for the laboratory norm and expressed as INR, and these data also were not provided by Deschênes et al.
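The two comparison definitions, as applied here, reduce to threshold rules and might be coded as shown below. This is a sketch only: the function names and input layout are hypothetical, and the strict inequalities follow the wording above.

```python
def ead_olthoff(bilirubin_day7, inr_day7, peak_ast_days1_7, peak_alt_days1_7):
    """EAD by Olthoff et al.: any one of the three criteria is sufficient."""
    return (bilirubin_day7 > 10.0          # total bilirubin in mg/dL on day 7
            or inr_day7 > 1.6              # INR on day 7
            or peak_ast_days1_7 > 2000     # AST in IU/L within the first 7 days
            or peak_alt_days1_7 > 2000)    # ALT in IU/L within the first 7 days


def ead_bilirubin_only(bilirubin_by_day):
    """Reduced second definition: peak total bilirubin > 10 mg/dL on days 2 to 7.

    bilirubin_by_day maps postoperative day -> total bilirubin (mg/dL).
    """
    return any(v > 10.0 for day, v in bilirubin_by_day.items() if 2 <= day <= 7)


# Hypothetical patient (not study data).
print(ead_olthoff(4.0, 1.2, 2500, 800), ead_bilirubin_only({1: 12.0, 2: 8.0}))
```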
Comparisons and correlations between groups were made with an unpaired t test or Pearson's test for correlations between values with a Gaussian distribution and with the Mann-Whitney (Wilcoxon rank sum) test or Spearman's test for correlations between continuous variables without a normal distribution. A Gaussian distribution was determined with the Kolmogorov-Smirnov test. Categorical data were compared with the χ² test or Fisher's exact test. P values were 2-tailed, and P < 0.05 was considered significant.
Receiver operating characteristic (ROC) curves were plotted, and the point on the ROC curve closest to sensitivity = specificity = 1 was considered the best cutoff value. The areas under the curves (AUCs) of the ROC curves were estimated with the Mann-Whitney statistic with associated Wald 95% confidence intervals (CIs). The 2 largest AUCs of the ROC curves on each postoperative day were compared with the method described by Hanley and McNeil.
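The AUC estimate and the cutoff rule can be sketched in a few lines. This is an illustrative implementation, not the SAS or SPSS procedure used in the study: the AUC is the Mann-Whitney probability that a patient with the event has a higher value than a patient without it (ties count one half), and the cutoff is the threshold whose sensitivity and specificity lie closest to 1. It assumes that values above the threshold are classified as positive; all names and sample values are hypothetical.

```python
def auc_mann_whitney(pos, neg):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(pos, neg):
    """Threshold t (positive if value > t) closest to sensitivity = specificity = 1."""
    best = None
    for t in sorted(set(pos) | set(neg)):
        sens = sum(p > t for p in pos) / len(pos)
        spec = sum(n <= t for n in neg) / len(neg)
        dist = ((1 - sens) ** 2 + (1 - spec) ** 2) ** 0.5
        if best is None or dist < best[0]:
            best = (dist, t, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical scores for patients with and without the event (not study data).
with_event = [22, 25, 19, 30, 17]
without_event = [10, 12, 15, 18, 20, 11, 9]
print(auc_mann_whitney(with_event, without_event))
print(best_cutoff(with_event, without_event))
```

Only observed values are tried as thresholds in this sketch, so a reported cutoff that falls between two observed values would need a midpoint convention on top of it.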
SAS 9.1 (SAS, Inc., Cary, NC), SPSS 11.0.4 (SPSS, Inc., Chicago, IL), and GraphPad Prism 4.0 (GraphPad Software, San Diego, CA) were used for the statistical analysis.
RESULTS
Patient and Graft Survival
LT was performed 987 times between January 2001 and May 2009 at Columbia University Medical Center. One hundred eighteen pediatric patients and 218 patients with incomplete records were excluded, which left 651 patients. Seventy-eight cases of living donor LT (12.0%) and 1 domino transplant were also excluded. Therefore, 572 cadaveric LT patients were enrolled in this study. The main cause of liver failure was hepatitis C cirrhosis, which accounted for 297 LT recipients (51.9%).
Thirty-eight patients (6.6% of all transplants) had EGF, which was defined as either retransplantation (n = 4) or death (n = 34) within 90 days of transplantation. Retransplantation for 4 patients occurred 12 ± 16 days after the primary transplant and was due to primary graft failure in all cases. Thirty-four patients died 43 ± 23 days after the primary transplant, and the cause of death in the majority of these patients was sepsis and multiorgan system failure. One patient died because of a large intracranial hemorrhage associated with graft failure 39 days after transplantation.
Variables Associated With Graft Failure or Death
Patients with EGF were more likely to have autoimmune hepatitis as an indication for transplantation. These patients had higher preoperative total bilirubin levels, INRs, and MELD scores, but they did not have higher preoperative serum creatinine, AST, or ALT levels. Patients with EGF were also more likely to have undergone LT previously or to have received a graft from older donors (Table 1). Donor risk indices, however, were not different between the groups. Eight of the 38 patients (21.1%) with EGF required continuous venovenous hemodialysis postoperatively, whereas 34 of the 534 patients (6.4%) without EGF did (P = 0.003).
Table 1. Demographics
Values are mean ± SD or n (%). To convert serum creatinine from mg/dL to µmol/L, multiply by 88.4.
Variable | All patients (n = 572) | Graft failure (n = 38, 6.6%) | No graft failure (n = 534, 93.4%)
(row label missing) | 54.3 ± 10.2 | 56.1 ± 9.0 | 54.2 ± 10.3
Sex = female (%) | | |
Body mass index (kg/m2) | 27.9 ± 5.3 | 28.9 ± 5.1 | 27.9 ± 5.4
Acute liver failure | | |
Dialysis prior to transplant | | |
(row label missing) | 2.25 ± 1.67 | 2.76 ± 2.03 | 2.21 ± 1.64
Preoperative serum creatinine (mg/dL) | 1.37 ± 0.97 | 1.65 ± 0.84 | 1.35 ± 0.98
Preoperative total bilirubin (mg/dL) | 8.18 ± 9.64 | 14.89 ± 11.66 | 7.79 ± 9.36
Preoperative MELD score | 21.8 ± 11.0 | 28.4 ± 12.1 | 21.3 ± 10.8
Cold ischemic time (hours) | 8.2 ± 3.4 | 7.7 ± 2.7 | 8.2 ± 3.5
Warm ischemic time (minutes) | 43.9 ± 10.6 | 42.5 ± 9.0 | 44.0 ± 10.7
Donor age (years) | 46.4 ± 17.7 | 52.5 ± 16.3 | 46.0 ± 17.7
Donor weight (kg) | 78.4 ± 21.7 | 76.7 ± 18.6 | 78.4 ± 21.9
Donor height (cm) | 170.8 ± 11.2 | 169.3 ± 9.6 | 170.8 ± 11.3
(row label missing) | 35.1 ± 201.9 | 26.5 ± 5.1 | 35.8 ± 26.7
Donor risk index | 1.69 ± 0.49 | 1.75 ± 0.48 | 1.68 ± 0.49
Dialysis at hospital discharge | | |
Re-transplant within 90 days | | |
Mortality or Re-Tx within 90 days | | |
Mortality within 1 year | | |
Re-transplant within 1 year | | |
Mortality or Re-Tx within 1 year | | |
Postoperative Laboratory Values
The MELD score increased on postoperative day 0 and then decreased continuously until postoperative day 7 for all patients (Fig. 1). Patients with EGF had higher total bilirubin levels, INRs, and MELD scores at all time points after surgery. AST was significantly higher from postoperative days 1 to 5 in patients with EGF, and serum creatinine levels were higher from postoperative days 3 to 7. There was no difference in the serum albumin levels at any time point before or after surgery between patients with EGF and patients without EGF (Fig. 2).
We plotted an ROC curve for each postoperative day and each laboratory value (peak values of the INR, total bilirubin level, serum creatinine level, AST level, and MELD score and nadir serum albumin levels) to evaluate the predictive power for EGF. Figure 3 depicts the AUCs and their 95% CIs for each postoperative day. The nadir albumin level was not predictive of EGF at any time point. AST was able to predict EGF only on postoperative days 1 to 5. The remaining laboratory values were all significant predictors of EGF at all time points after LT (as were the preoperative laboratory values).
The best predictor was the MELD score on postoperative day 5 with an AUC of 0.812 (95% CI = 0.739-0.886, P < 0.001).
Using the method described by Hanley and McNeil, we tested the difference between the AUCs of the ROC curves of the 2 variables with the highest AUCs on each day after surgery. There were no statistically significant differences between the 2 largest AUCs on any day after surgery. Therefore, although the MELD score on day 5 had the largest AUC of the ROC curve, its advantage over the other variables was not statistically significant (Table 2).
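A comparison of this kind can be sketched as follows. The standard error is the closed-form approximation published by Hanley and McNeil, and the z statistic allows for a correlation r between the two AUCs because both are measured in the same patients. The value of r is not reported in the text, so it is a caller-supplied assumption here (r must be below 1), and the function names are hypothetical. The resulting intervals will not exactly reproduce the Wald CIs reported above.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(var)


def compare_aucs(auc1, auc2, n_pos, n_neg, r=0.0):
    """z statistic and two-tailed P value for the difference between two AUCs."""
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)
    se2 = hanley_mcneil_se(auc2, n_pos, n_neg)
    z = (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed P from the normal distribution
    return z, p


# Day 2 AUCs from the text (total bilirubin 0.809, MELD score 0.777) with the
# cohort sizes 38 and 534; r = 0.0 is an illustrative assumption.
z, p = compare_aucs(0.809, 0.777, n_pos=38, n_neg=534, r=0.0)
print(round(z, 2), round(p, 2))
```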
Table 2. Comparison of the two largest areas under the curve (AUCs) of the receiver operating characteristic (ROC) curves on each postoperative day for predicting graft failure or death within 90 days
The best cutoff value for the MELD score on postoperative day 5, which was defined as the point on the ROC curve closest to sensitivity = specificity = 1, was 18.9 (Fig. 4).
Comparison of the MELD Score With Other Definitions of EAD
Thirty of the 176 patients with a MELD score > 18.9 on postoperative day 5 either were dead or required retransplantation (17.0%), whereas this was true for only 8 of the 396 patients (2.0%) with a MELD score ≤ 18.9 on postoperative day 5. Figure 5 illustrates Kaplan-Meier survival curves for patients with MELD scores ≤ and > 18.9 on postoperative day 5.
This corresponds to 79.0% sensitivity (95% CI = 62.7%-90.5%) and 72.7% specificity (95% CI = 68.7%-76.4%) for the ability of a MELD score > 18.9 on postoperative day 5 to predict graft failure with a likelihood ratio [sensitivity/(1 − specificity)] of 2.88. A MELD score > 18.9 on day 5 had a higher predictive power than either of the other definitions of early graft dysfunction: Olthoff et al.'s definition (either a total bilirubin level > 10 mg/dL or an INR > 1.6 on postoperative day 7 or an AST or ALT level > 2000 IU/L within the first 7 days) and a peak total bilirubin level > 10 mg/dL on postoperative days 2 to 7[7] (Table 3).
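These figures can be checked from the 2 × 2 counts given in the text. The short sketch below derives the counts from the numbers reported above (30 of 176 patients above the cutoff and 8 of 396 at or below it had graft failure or death); the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2 x 2 accuracy measures; assumes specificity is below 1."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sens / (1 - spec),
    }


# MELD score > 18.9 on day 5: 30 events among 176 patients above the cutoff,
# 8 events among 396 patients at or below it.
m = diagnostic_metrics(tp=30, fp=176 - 30, fn=8, tn=396 - 8)
for name, value in m.items():
    print(name, round(value, 3))
```

With these counts, the sensitivity is 30/38 (79.0%), the specificity is 388/534 (72.7%), the positive predictive value is 30/176 (17.0%), the negative predictive value is 388/396 (98.0%), and the likelihood ratio is approximately 2.89, in line with the reported 2.88.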
Table 3. Comparison of the best cutoff of MELD score on postoperative day 5 with other definitions of early allograft dysfunction
MELD Score >18.9 on Day 5
Total Bilirubin > 10 mg/dL on Days 2–7
Olthoff: definition of early allograft dysfunction by Olthoff et al.: either total bilirubin > 10 mg/dL or INR > 1.6 on postoperative day 7 or AST or ALT > 2000 IU/L within the first 7 days.
Graft failure or death / 90 days
No graft failure or death / 90 days
Chi2 / P value
0.627 to 0.905
0.569 to 0.866
0.514 to 0.825
0.687 to 0.764
0.573 to 0.658
0.706 to 0.782
Positive Predictive Value
0.118 to 0.234
0.081 to 0.169
0.108 to 0.226
Negative Predictive Value
0.961 to 0.991
0.946 to 0.986
0.949 to 0.985
Changes in Laboratory Values and Graft Failure
The postoperative changes in the AST, ALT, serum creatinine, total bilirubin, and albumin levels, INR, and MELD score (in comparison with preoperative values) at any time point were less predictive of graft failure than the absolute values. Of these, the change in the MELD score on postoperative day 0 had the highest predictive value with an AUC of the ROC curve of 0.715 (95% CI = 0.587-0.842, P < 0.001). Other changes in laboratory values (from the preoperative period to the postoperative period) that were significant predictors were the ALT levels on days 1 to 4 (AUC of the ROC curve = 0.652-0.686) and the delta serum creatinine value on day 0 (AUC of the ROC curve = 0.625).
Predicting Graft Failure Early After Transplantation Within the First 3 Days
Between postoperative days 0 and 2, the total bilirubin level on day 2 was the best predictor of graft failure. The AUC of the ROC curve was 0.809 (95% CI = 0.742-0.877, P < 0.001). The best cutoff for total bilirubin on day 2 was 6.55 mg/dL with a sensitivity of 72.5% and a specificity of 70.4%. The total bilirubin level on day 1 (AUC of the ROC curve = 0.804, 95% CI = 0.734-0.873, P < 0.001) was the second best predictor, and this was followed by the MELD score on day 2 (AUC of the ROC curve = 0.777, 95% CI = 0.694-0.859, P < 0.001).
DISCUSSION
This study demonstrates that 90-day graft and patient survival can be predicted with good accuracy from laboratory and calculated values within the first week after LT. The physiological MELD score on the fifth postoperative day was the best predictor of 90-day death or graft failure and outperformed 2 commonly used definitions of EAD. Within the first 3 days after transplantation, total bilirubin levels were a better predictor of 90-day death or graft failure than the MELD score.
The MELD score was introduced as a predictor of survival after transjugular intrahepatic portosystemic shunts and is now almost universally used as a tool for allocating LT grafts.[11, 17-19] A pioneering risk assessment tool, it was rapidly adopted because it contains all relevant variables of liver disease: total bilirubin and INR assess the metabolic and synthetic function of the liver, respectively, and serum creatinine represents extrahepatic manifestations of liver disease.
The MELD score is extensively used preoperatively, but little is known about the course of the MELD score after LT. We have demonstrated that the MELD score decreases rapidly within days after a small increase immediately after transplantation. Postoperative liver dysfunction, indicated by persistently elevated MELD scores almost a week after LT, was closely associated with 90-day graft and patient survival, and this association was stronger than that of any of the previously used definitions of early graft dysfunction. Causes of EAD are often multifactorial, and graft loss may be due to a combination of patient comorbidities, poor graft quality, technical problems, and early postoperative complications such as infection and renal failure.[10, 20] The MELD score has the unique ability to incorporate many of these problems and reflect the overall status of the graft and the patient. EGF and mortality are likely caused by either poor graft function (which will result in elevations of the total bilirubin level and INR) or infectious complications. These are commonly interrelated because sepsis worsens liver function and liver failure increases the susceptibility to infection and sepsis. Both liver failure and sepsis can cause systemic inflammatory response syndrome and result in acute kidney injury. Any of these conditions will result in an increase in the postoperative MELD score. We did not include sepsis as a predictive variable because we wanted to restrict this analysis to objective laboratory values. Infectious laboratory variables such as white blood cell counts were not included because they are too nonspecific in immunosuppressed patients. Death after LT is almost always multifactorial, and the MELD score includes not only graft-specific variables but also serum creatinine as an extrahepatic variable.
The ability of the MELD score to predict early death and graft loss may, therefore, be more evidence that the MELD score is a marker of the severity of disease and not necessarily specific to graft dysfunction.
Automatic calculation of MELD scores by, for example, the hospital laboratory system could, therefore, provide important and pertinent additional information beyond any single laboratory variable and facilitate postoperative risk assessment and management. Clearly, elevations of the MELD score should not be the only variable that determines whether a patient should be relisted for transplantation, but an increased postoperative MELD score could alert clinicians that a graft might be at risk. Larger multicenter studies will be required before the postoperative MELD score can be integrated into clinical practice. The close association of the postoperative MELD score and graft/patient survival may also justify its use as a substitute endpoint in clinical trials.
Surprisingly, changes in laboratory results in comparison with preoperative values (delta laboratory values) were much weaker predictors of outcomes than absolute postoperative values. We expected an improvement in laboratory values to better reflect the recovery of the patient than absolute values. For example, persistently high postoperative levels of total bilirubin are often considered innocuous as long as they are decreasing. Our data do not support this stance: high absolute laboratory values were much better predictors than the trend of these values. With adequate graft function, laboratory values of liver and renal function, therefore, should recover rapidly after transplantation, and high absolute levels only a few days after surgery should be a cause for concern.
Variables that reflect hepatic injury such as AST and ALT were only weak predictors even early after transplantation. Although very high transaminase levels in the immediate postoperative period are frequently a cause for apprehension,[21, 22] our data do not support drastic action such as early relisting for isolated elevations of AST or ALT levels because these were only weak predictors for graft loss. Serum albumin, which is included in some older definitions of early graft dysfunction as part of the Child-Pugh score, was not predictive at all and should probably not be included in postoperative risk assessment.
The MELD score on postoperative day 5 was the best predictor for graft loss; however, a much earlier assessment is required when one is weighing the decision to list a transplant recipient for retransplantation. During those crucial 1 to 3 days after transplantation, total bilirubin outperformed the MELD score as a predictor of graft failure and patient mortality. Early high levels of total bilirubin should, therefore, be a cause for concern, and listing for retransplantation may need to be considered.
The AUC of the ROC curve for the MELD score on postoperative day 5 was the largest; however, its difference from the next largest AUC was not statistically significant. There are few established statistical tests for comparing AUCs of ROC curves; frequently, the AUCs of different predictive tests are not compared with a statistical test, and the test with the larger AUC is considered the better test.
We compared the estimates of the AUCs of the ROC curves regardless of the overlapping of CIs or statistical tests of the differences between the AUCs. Although there are statistical tests that compare the estimates of AUCs of different ROC curves, these tests are probably not appropriate when nested models are used. Because the MELD score includes total bilirubin, serum creatinine, and INR, using the MELD score with the other variables would require a nested model comparison. A direct comparison of the overlap of CIs may be an easier method; however, this technique has frequently been criticized as too conservative, and conventionally, the test with the higher estimate of the AUC of the ROC curve is considered the test with the higher accuracy. We, therefore, consider the MELD score on day 5 as the test with the highest accuracy in comparison with the other variables.
We limited the number of variables in this study to those that were included in other definitions of EAD. The addition of other laboratory results that are not directly associated with liver function or injury would likely not change the results of this study. A comparison with general intensive care unit predictors of mortality such as the Acute Physiology and Chronic Health Evaluation score and the Sequential Organ Failure Assessment score could be very interesting but is beyond the scope of this study. We also did not include preoperative or donor variables. The inclusion of these variables may improve the ability of this model to predict graft and patient loss. However, the aim of the present study was to derive a statistically more powerful (laboratory) definition of early graft dysfunction that is easily calculated and can be used regardless of the center or diagnosis. Future studies may be able to use this definition (in combination with preoperative or donor variables) as a substitute endpoint for hard outcomes such as graft and patient loss.
Our study was further limited by a broad primary endpoint. We used 90-day graft or patient survival as an endpoint and did not specify the cause of death or graft loss. Liver failure is a systemic disease, and patients often die of multiorgan failure or infection that is ultimately related to a dysfunctional liver. By the time of death or retransplantation, extrahepatic organ failure may be dominant but still attributable to a failing liver. We arbitrarily chose 90 days as a cutoff because we estimated that graft dysfunction would lead to death or retransplantation within this time period. Later graft failure may be more closely related to other complications such as rejection.
We have been able to demonstrate the utility of the MELD score beyond the preoperative period. The MELD score after LT is a useful marker and may help us to detect patients at risk for graft failure. The close correlation between the postoperative MELD score and hard endpoints such as graft failure will further allow us to use the postoperative MELD score as a substitute endpoint in clinical studies. Larger multicenter studies, however, are required to confirm the results of our study before the MELD score can be used routinely as a clinical tool. We, therefore, recommend calculating MELD scores for all patients after LT.