Predicting Survival among Patients Listed for Liver Transplantation: An Assessment of Serial MELD Measurements

Authors


*Corresponding author: W. Ray Kim, kim.woong@mayo.edu

Abstract

We examined whether consideration of repeated model for end-stage liver disease (MELD) measurements for patients listed for liver transplantation improves predictive value beyond current MELD alone. Clinical data were extracted for all adult primary liver transplantation candidates from our institution who were listed with the United Network for Organ Sharing (UNOS) between 1990 and 1999. Serum creatinine, bilirubin, and international normalized ratio (INR) were obtained from an institutional laboratory database. Cox models were constructed using current MELD, change in MELD (Delta), and number of MELD scores to predict survival on the waiting list. Eight hundred and sixty-one patients met inclusion criteria, 639 underwent transplantation, and 80 died while waiting. A one-unit increment in current MELD imparted significant hazard ratios ranging from 1.12 to 1.19 in all models. Delta MELD was predictive of mortality univariately, but less predictive when current MELD was included, and not predictive when considered with both current and number of MELD scores. Overall, current MELD is the single most important determinant of mortality risk on the waiting list. Delta MELD is predictive of death only within 4 d of the event; however, part of this correlates with the dying process itself, thus limiting Delta MELD's utility in survival prediction models.

Introduction

In February 2002, the United Network for Organ Sharing (UNOS) adopted a new policy in which the model for end-stage liver disease (MELD) score replaced the Child–Turcotte–Pugh classification as the disease severity index to determine priorities in donor liver allocation (1,2). Currently, the most recent MELD score available for each waiting list patient is used to prioritize organs (3). One of the policy principles under which this change was introduced was to allocate livers to patients who are most likely to benefit based on liver disease severity, while avoiding futile transplantation in the most severely ill patients with predicted poor prognoses. It was also expected that further refinement and changes in the current MELD system based on continued data accrual and analyses may help improve the outcome of patients on the waiting list for liver transplantation.

Thus, research focused on monitoring waiting list mortality and refinement of the MELD scoring system is ongoing. We have demonstrated, in a cohort of adult patients with chronic liver disease who were added to the UNOS liver waiting list as status 2A or 2B between 1999 and 2001, that the MELD scoring system is able to accurately predict 3-month mortality on the waiting list (4). Additionally, Merion et al. have evaluated the accuracy of the MELD scoring system in predicting survival of patients with end-stage liver disease who have been listed for liver transplantation at their medical institution (5). Using their institutional liver transplantation database of serial MELD measurements for each patient, these researchers found that the most recent MELD score for a patient awaiting liver transplantation was significantly associated with waiting list mortality. It was also found that an increasing MELD score, estimated by the slope of the line representing the changes of MELD scores over the 30-d period preceding the most recent MELD, conferred an increased mortality risk on the waiting list, while decreasing MELD may be associated with a decrease in mortality (5).

While these results may make some intuitive sense, at least two methodological issues need to be addressed. First, an increasing MELD score may simply represent an intrinsic, irreversible component of the death process, rather than being predictive of death in the future. For example, patients in the terminal phase of their disease may be expected to have increasing daily MELD scores during the last few days of life due to progressive organ failure. In such a case, the utility of the change in MELD as a survival prediction tool for the purpose of organ allocation is very limited. Second, Merion et al. used retrospectively collected laboratory data for calculation of the MELD scores and, thus, the potential for detection bias exists. For example, patients who present with acute illness, regardless of the status of their liver disease, will undergo frequent laboratory tests producing multiple observations of MELD scores. This may confound the significance of the change in MELD, as patients with more MELD scores are more likely to have had clinical issues leading to the increased monitoring of their MELD status. Therefore, the frequency of laboratory testing may carry more information than the MELD scores per se, such as the clinical presentation and the healthcare provider's overall clinical judgment. The goal of our investigation is to determine whether the change in MELD (Delta MELD) improves the survival prediction afforded by the current MELD score alone, while addressing the methodological issues in previous analyses.

Methods

Waiting list registrants and data collection

With the approval of the Mayo Foundation Institutional Review Board, all adult patients (≥16 years of age) with chronic liver disease who were registered onto the UNOS waiting list at Mayo Clinic, Rochester between February 1990 and August 1999, were identified. Patients placed on the transplantation list for fulminant hepatic failure or hepatic malignancies were excluded from the study. Patients waiting for re-transplantation were also excluded from this investigation. Identification of patients meeting the study criteria was accomplished using both the UNOS registration records and the Mayo Clinic liver transplantation database, which has been maintained and continually updated since the initiation of the institution's liver transplantation program. Patient medical records were used to extract information on each transplant candidate including demographic data, liver disease etiology, date of transplant listing, and clinical outcome while on the waiting list. Laboratory data necessary to calculate MELD scores were electronically extracted from the computerized database at Mayo laboratories. All available values of bilirubin, creatinine, and international normalized ratio (INR) for prothrombin time in study subjects were obtained. The outcomes of interest were death, liver transplantation, or removal from the list. Patients were followed from the time of transplant listing until one of these three outcomes occurred or until study closure in March 2003.

MELD score calculations

MELD scores were calculated using serum creatinine, serum total bilirubin, and the INR according to the following formula currently in use by UNOS:

MELD = 9.57 × ln(creatinine, mg/dL) + 3.78 × ln(bilirubin, mg/dL) + 11.2 × ln(INR) + 6.43

The MELD score was calculated at multiple time points for each patient, including time of listing for liver transplantation and on each date that the patient had a new set of all three MELD laboratory parameters drawn simultaneously.
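A minimal Python sketch of the MELD calculation, using the commonly cited UNOS coefficients (9.57, 3.78, 11.2, and a constant of 6.43). The function name `meld_score`, the flooring of laboratory values at 1.0, and the creatinine cap of 4.0 mg/dL are assumptions based on the usual UNOS conventions; the text does not state which bounds or rounding were applied in this study.

```python
import math


def meld_score(creatinine_mg_dl, bilirubin_mg_dl, inr,
               floor=1.0, creatinine_cap=4.0):
    """Sketch of the commonly cited UNOS MELD equation.

    Assumptions (not stated in the text): laboratory values below `floor`
    are set to `floor`, and creatinine is capped at `creatinine_cap` mg/dL.
    The result is left unrounded.
    """
    creatinine = min(max(creatinine_mg_dl, floor), creatinine_cap)
    bilirubin = max(bilirubin_mg_dl, floor)
    inr = max(inr, floor)
    return (9.57 * math.log(creatinine)
            + 3.78 * math.log(bilirubin)
            + 11.2 * math.log(inr)
            + 6.43)


# Illustrative call with the median listing laboratory values from Table 1.
example = meld_score(0.9, 3.4, 1.4)
```

Under these assumptions, the median listing values in Table 1 (creatinine 0.9 mg/dL, bilirubin 3.4 mg/dL, INR 1.4) give a score of approximately 14.8.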

Definitions

A description of the variables used in our modeling scenarios is given below and graphically displayed in Figure 1.

Figure 1.

Illustration of time-lagged Cox regression model design. The time lag represents a period of 0–14 d immediately preceding the end of a patient's period of observation (event). The most recent MELD score preceding the time lag was designated the Current MELD. The Serial MELD scores were collected over the 30-d period that preceded the date of Current MELD. Delta MELD is the difference between Current MELD and the lowest Serial MELD score measured in the preceding 30 d. The MELD Number represents the number of Serial MELD scores measured in the 30-d window.

Current MELD. The most recent MELD score available for each patient. The timing of the Current MELD score depends upon the time lag used in the model.

Time lag. Time lags were included in the modeling scenarios as a means of distinguishing whether observed changes in MELD occurring very near the time of death are measurements of the agonal event itself (i.e. correlation), rather than predictive of it. As shown in Figure 1, the time lag is a period of 0, 1, 4, 7, or 14 d.

Serial MELD. All MELD scores observed for each patient during the 30-d period preceding the date of the Current MELD score.

Delta MELD. The calculated difference between the Current MELD score and the lowest of all Serial MELD scores in the preceding 30-d window. Thus, Delta MELD was defined as the maximum change in MELD over the 30-d period.

Delta MELD > 5. A dichotomous variable (i.e. yes or no) indicating whether or not the Delta MELD was greater than five points (i.e. the slope of the change in MELD being greater than 5 points over 30 d).

MELD Number. The number of Serial MELD scores obtained on each patient within the 30-d period. This was included as a surrogate marker for clinical judgment of level of overall sickness, which would contribute to detection bias.
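The definitions above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function name `meld_variables`, the inclusive handling of boundary dates, the exclusion of the Current MELD from the Serial MELD set, and a Delta MELD of 0 when no Serial MELD exists are all hypothetical choices that the text does not specify.

```python
from datetime import date, timedelta


def meld_variables(scores, event_date, lag_days=0, window_days=30):
    """Derive Current MELD, Delta MELD, Delta MELD > 5, and MELD Number.

    scores: list of (date, meld) tuples for one patient.
    Assumptions: scores dated on the lag cutoff are eligible; Serial MELD
    excludes the Current MELD itself; Delta MELD is 0 with no Serial MELD.
    """
    cutoff = event_date - timedelta(days=lag_days)
    eligible = sorted((d, m) for d, m in scores if d <= cutoff)
    if not eligible:
        return None  # no MELD score available before the time lag
    current_date, current = eligible[-1]
    window_start = current_date - timedelta(days=window_days)
    # Serial MELD: scores in the 30-d window preceding the Current MELD.
    serial = [m for d, m in eligible[:-1] if d >= window_start]
    delta = current - min(serial) if serial else 0.0
    return {
        "current_meld": current,
        "delta_meld": delta,
        "delta_gt5": delta > 5,
        "meld_number": len(serial),
    }


# Hypothetical patient who dies on 31 January 2000.
scores = [
    (date(2000, 1, 1), 15),
    (date(2000, 1, 20), 18),
    (date(2000, 1, 28), 24),
    (date(2000, 1, 30), 31),
]
no_lag = meld_variables(scores, date(2000, 1, 31), lag_days=0)
lag_4d = meld_variables(scores, date(2000, 1, 31), lag_days=4)
```

In this hypothetical example, the 0-d lag model sees a Current MELD of 31 with a Delta MELD of 16, whereas the 4-d lag model sees a Current MELD of 18 with a Delta MELD of 3, because the steep rise in the final days falls inside the lag and is discarded.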

Statistical analysis

Univariate and multivariate, time-dependent Cox regression analyses were conducted to predict survival of patients on the liver transplant waiting list. The relative risk of death on the waiting list estimated from these models is expressed as a hazard ratio (HR). All Cox models were adjusted for age, gender, race, and year of placement on the transplant waiting list. Also, disease etiology, dichotomized as cholestatic versus non-cholestatic, was considered in the analyses. The following variables were used for modeling: (1) Current MELD; (2) Delta MELD; (3) Delta MELD > 5; and (4) MELD Number.

Additionally, we hypothesized that observed changes in MELD score occurring very near the time of death may be measurements of the agonal event itself, rather than predictive of it. To test this hypothesis, the Cox models were constructed using five different time lags (0, 1, 4, 7, and 14-d) between the Current MELD score and death (Figure 1). Thus, the models with 1, 4, 7, and 14-d time lags attempt to predict waiting list deaths occurring 1, 4, 7, and 14-d or more in the future, respectively. Furthermore, models were constructed which addressed the issue of how the hazard of death is affected when a given increase in the Delta MELD occurs at a lower Serial MELD compared to a higher Serial MELD score. In other words, for example, does a given increase in MELD score confer the same risk in a patient with a Serial MELD score of 15 as in a patient with a Serial MELD score of 30?

In the Cox models, duration of follow-up extended from the date of activation for liver transplantation until death, removal from the waiting list, date of liver transplantation, or study closure. Deaths occurring within 90 d of removal from the transplant waiting list were treated as deaths on the waiting list. Surviving patients who were still awaiting liver transplantation were censored at study closure in March 2003. Patients who were removed from the waiting list during the study period and survived beyond 90 d were censored on the date of removal. Patients who underwent liver transplantation were censored on the date of transplantation. In the Cox regression models, when a death occurred on the waiting list, the at-risk patient population used for comparison included all liver transplantation candidates on the waiting list on the calendar date that the death occurred.
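The calendar-date risk set described above, and the contribution of a single death to a Cox partial likelihood, can be sketched as follows. This is a simplified single-covariate illustration, not the authors' analysis code: the names `risk_set` and `partial_likelihood_term`, the inclusive date boundaries, and the three candidates are hypothetical, and each candidate's Current MELD is held fixed here, whereas in a time-dependent model it would be updated at each death date.

```python
import math
from datetime import date


def risk_set(candidates, death_date):
    """Candidates on the waiting list on the calendar date of a death.

    Assumption: listing and exit dates are treated as inclusive.
    """
    return [c for c in candidates
            if c["listed"] <= death_date <= c["exit"]]


def partial_likelihood_term(beta, meld_of_death, melds_at_risk):
    """exp(beta * x_death) divided by the sum of exp(beta * x_j) over the risk set."""
    denominator = sum(math.exp(beta * m) for m in melds_at_risk)
    return math.exp(beta * meld_of_death) / denominator


# Hypothetical candidates; A dies on 1 June 1995, before C is listed.
candidates = [
    {"id": "A", "listed": date(1995, 1, 1), "exit": date(1995, 6, 1), "current_meld": 30},
    {"id": "B", "listed": date(1995, 3, 1), "exit": date(1996, 1, 1), "current_meld": 20},
    {"id": "C", "listed": date(1995, 7, 1), "exit": date(1996, 1, 1), "current_meld": 25},
]
at_risk = risk_set(candidates, date(1995, 6, 1))
term = partial_likelihood_term(
    math.log(1.19), 30, [c["current_meld"] for c in at_risk]
)
```

Fitting the model amounts to choosing the coefficient that maximizes the product of such terms over all observed deaths; the value log(1.19) is used here only to show the calculation.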

Results

Patient characteristics

Eight hundred and sixty-one patients were listed for liver transplantation at Mayo Clinic between February 1990 and August 1999 who met study inclusion criteria. Demographic characteristics of the patient population are given in Table 1. The mean age (± standard deviation, SD) of the study cohort was 50.2 (± 10.4) years, with 55% of the patients being male. The majority of patients (97%) were Caucasian. Cholestatic liver disease (35%) was the most common primary diagnosis for end-stage liver disease, followed by viral hepatitis (24%), and alcoholic liver disease (14%).

Table 1.  Characteristics of candidates on the liver transplant waiting list at Mayo Clinic, Rochester

Characteristic                      Value
Registration (n)                    861
Transplanted (n)                    639
Age (years)                         50.2 ± 10.4
Gender (% males)                    55
Race (%)
 Caucasian                          97
 Hispanic                           1
 African American                   <1
 Native American                    1
Underlying liver disease (%)
 Cholestatic liver disease          35
 Viral hepatitis                    24
 Alcoholic liver disease            14
 Other                              27
Laboratory values
 Serum creatinine (mg/dL)           0.9 [0.8–1.2]
 Serum bilirubin (mg/dL)            3.4 [2.0–7.6]
 INR                                1.4 [1.2–1.6]
MELD score at time of listing       15.7 ± 7.5

Note: Age and MELD score are given as mean (± standard deviation); laboratory values are median [interquartile range]. Values reported are summaries of patients at their activation date.

The 861 study subjects generated 6505 sets of concurrent laboratory data for the calculation of MELD scores, with a mean of approximately 7.6 MELD score determinations per patient. The mean MELD score (± SD) at the time of listing for liver transplantation was 15.7 ± 7.5. Of the 861 patients, 639 (74%) had undergone liver transplantation as of the closure of the study. Sixty-seven (8%) died on the waiting list, 53 (6%) were withdrawn from the list, and 102 (12%) were still alive on the waiting list at last follow-up. Of the 53 patients withdrawn from the list, 13 died within 3 months of the withdrawal. These 13 patients were considered to represent patients ‘too sick to transplant’ and were classified as deaths, giving a total of 80 deaths.

Survival analysis

Uni- and multivariate Cox models were created based on the four variables of interest: Current MELD, Delta MELD, Delta MELD > 5, and MELD Number. Each of the models was run with 0, 1, 4, 7, and 14-d time lags. The model results are summarized in Figures 2–4.

Figure 2.

Hazard ratios (HR) generated from uni- and multivariate Cox regression models with 0-day time lag (i.e. no time lag). The HR for Current MELD remains statistically significant and relatively unchanged when modeled alone or with Delta MELD and MELD Number. The magnitude and significance of Delta MELD diminish when it is modeled with Current MELD and MELD Number.

Figure 3.

Hazard ratios (HR) generated from five different time-lagged multivariate Cox regression analyses modeling Current MELD and Delta MELD. The magnitude and significance of Current MELD is relatively stable in all time-lagged models whereas the significance of Delta MELD decreases with increasing time lag.

Cox models with 0-d lag

The models with a 0-d lag period utilized all MELD scores available for each patient up to the time of death, transplantation, removal from the waiting list, or date of last follow-up. First, univariate models were created evaluating the impact of the Current MELD, Delta MELD, Delta MELD > 5, and MELD Number, separately. As shown in Figure 2, Current MELD, Delta MELD, and MELD Number were significantly associated with a risk of death on the transplantation waiting list. A one-unit increase in Current MELD was associated with a HR = 1.19 [95% confidence interval (CI), 1.16–1.22, p < 0.01]. A one-unit increase in Delta MELD had a HR = 1.29 [95% CI, 1.24–1.34, p < 0.01]. MELD number conferred a HR = 1.57 [95% CI, 1.46–1.70, p < 0.01]. Delta MELD > 5 was also significantly associated with an increased risk of death on the waiting list, HR = 28.0 [95% CI, 17.0–46.2, p < 0.01].
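To put the per-unit hazard ratios on a clinical scale: under the log-linear form assumed by the Cox model, a k-point difference in a covariate multiplies the hazard by the per-unit HR raised to the power k. The worked figures below use the univariate Current MELD estimate of 1.19 and are illustrative arithmetic only, not additional results from the study.

```latex
\mathrm{HR}(k) = e^{\beta k} = \left(e^{\beta}\right)^{k},
\qquad 1.19^{5} \approx 2.39,
\qquad 1.19^{10} \approx 5.69
```

Read this way, a 5-point higher Current MELD corresponds to roughly a 2.4-fold hazard of death, and a 10-point higher Current MELD to roughly a 5.7-fold hazard, provided the proportional hazards assumption holds across that range.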

Next, as all four of these variables were significant, multivariate models were constructed, evaluating combinations of these variables. First, we considered the effect of adding Delta MELD to Current MELD, thus asking whether knowing how the patient reached the current state is more informative than the current status alone. As shown in Figure 2, the HR associated with each unit increase in Current MELD was 1.15 [95% CI, 1.11–1.19], similar to the univariate HR of 1.19. On the other hand, the HR for Delta MELD was 1.10 [95% CI, 1.04–1.16], considerably smaller than its univariate counterpart, HR = 1.29, indicating that part of the effect of the Delta MELD in the univariate analysis was attributable to the Current MELD. Namely, patients who had a large increase in MELD also ended with a high Current MELD. Next, we examined the effect of adding a third variable, namely MELD Number, to the existing two (Current MELD and Delta MELD) (Figure 2). The Current MELD remained strongly significant with a HR of 1.12 [95% CI, 1.07–1.16]. The next significant variable was the MELD Number, which had a HR of 1.23 [95% CI, 1.13–1.35]. Interestingly, however, once the Current MELD and the MELD Number were in the model, the effect of Delta MELD became smaller and was no longer statistically significant, with a HR of 1.06 [95% CI, 0.99–1.13, p = 0.06]. A similar phenomenon occurred with the variable Delta MELD > 5 when it was included in a model with both Current MELD and MELD Number (HR = 2.15 [95% CI, 0.95–4.88], 1.12 [95% CI, 1.08–1.16], and 1.23 [95% CI, 1.13–1.34], respectively). These observations indicate that patients who had frequent labs taken were at risk of death, regardless of the direction of changes in their MELD scores.

Cox models with 1, 4, 7, and 14-d lags

These uni- and multivariate models were modified with 1, 4, 7, and 14-d time lags in order to distinguish whether these MELD variables are predictive of survival or whether they merely reflect the death process (correlation). These time lags did not have a dramatic impact on the univariate models (data not shown).

The effect of the time lags on the two-variable model (including Current MELD and Delta MELD) is shown in Figure 3. Regardless of the duration of the time lag, Current MELD remained a strong predictor of death. A one-unit increase in Current MELD was associated with a HR of 1.15 [95% CI, 1.11–1.19], 1.16 [95% CI, 1.12–1.19], 1.16 [95% CI, 1.12–1.19], and 1.17 [95% CI, 1.13–1.21], corresponding to the 1, 4, 7, and 14-d time lags, respectively. However, Delta MELD showed a trend toward decreasing significance with increasing time lag, with a HR of 1.07 [95% CI, 1.02–1.13], 1.05 [95% CI, 0.99–1.11], 1.03 [95% CI, 0.98–1.09], and 1.07 [95% CI, 1.01–1.14] for the 1, 4, 7, and 14-d time lags, respectively.

In Figure 4, the effect of the time lags on the three-variable model (including Current MELD, Delta MELD, and MELD Number) is shown. Both the Current MELD and the MELD Number retained statistical significance, irrespective of the duration of the time lag. A one-unit increase in Current MELD was associated with a HR of 1.12 [95% CI, 1.08–1.16], 1.13 [95% CI, 1.09–1.17], 1.14 [95% CI, 1.10–1.17], and 1.15 [95% CI, 1.11–1.19], corresponding to the 1, 4, 7, and 14-d time lags, respectively. The HR associated with an increasing MELD Number was 1.30 [95% CI, 1.19–1.42], 1.34 [95% CI, 1.21–1.47], 1.35 [95% CI, 1.22–1.49], and 1.37 [95% CI, 1.21–1.56] in the 1, 4, 7, and 14-d time lag models, respectively. The effect of Delta MELD, however, became progressively lower and less significant as the time lag increased. The HR was 1.03 [95% CI, 0.97–1.09] with a 1-d lag, 1.01 [95% CI, 0.95–1.07] with a 4-d lag, 0.99 [95% CI, 0.94–1.05] with a 7-d lag, and 1.02 [95% CI, 0.96–1.09] with a 14-d lag. Similarly, the effect of Delta MELD > 5 on survival was not statistically significant in any of the multivariate time-lagged models which included the Current MELD and MELD Number (data not shown).

Figure 4.

Hazard ratios (HR) generated from five different time-lagged multivariate Cox regression analyses modeling Current MELD, Delta MELD, and MELD Number. The magnitude and significance of Current MELD is relatively stable in all time-lagged models whereas the significance of Delta MELD decreases with increasing time lag.

All Cox regression models were then repeated with adjustments made for disease etiology (cholestatic vs. non-cholestatic) and revealed no substantial changes overall in either magnitude or significance of the HRs for Current MELD, Delta MELD, or MELD Number. Finally, we addressed the question of whether a given value of Delta MELD confers the same risk of death when it occurs in the setting of a lower Serial MELD score (<30) compared to a higher Serial MELD score (≥30). In these analyses, it did not appear as though there was a decreased hazard of death associated with Delta MELD in patients with lower Serial MELD scores [HR = 1.13 (95% CI, 1.10 to 1.16) and HR = 1.01 (95% CI, 0.93 to 1.10) for Current MELD and Delta MELD, respectively in multivariate modeling of patients with Serial MELD<30, with a 7-day time lag].

Discussion

According to the current UNOS policy, the most recent MELD score available is used to prioritize donor organ allocation for patients with end-stage liver disease awaiting liver transplantation (3). In this analysis, we demonstrate that there is insufficient evidence to support the incorporation of Delta MELD in the transplant allocation policy as a predictor of waiting list mortality, as has been previously proposed (6). As discussed throughout the paper, there are at least two important issues that lead us to this conclusion.

The first is the issue of distinguishing correlation from prediction. Given that an increasing MELD score would be expected in any patient who is dying, the question arises as to how much lead time is necessary for the terminal process to be reversed by liver transplantation. This was examined by our Cox models that incorporated time lags, ranging from 1 to 14-d, between the date of death and the date of the Current MELD.

The results of the models with no time lag (0-d time lag) are instructive in comparison to previous analyses. In univariate models of Current MELD and Delta MELD, both are significantly associated with survival on the waiting list. We also examined the effect of Delta MELD > 5 and found a large effect (HR = 28). Thus, results of our univariate analyses are consistent with the findings of Merion et al. (5). When modeled together, both Current MELD and Delta MELD were significantly predictive of waiting list survival. However, the importance of Delta MELD (HR = 1.10) was diminished compared with that in the univariate consideration (HR = 1.29). In contrast, the effect of Current MELD was preserved, indicating that part of the predictive value of the Delta MELD is accounted for in the Current MELD. Thus, a large increase in Delta MELD is associated with a high Current MELD.

The time-lagged multivariate models including Current MELD, Delta MELD, and MELD Number showed that the significance of Delta MELD decreases with an increasing time lag, which supports our hypothesis that part of the apparent effect of Delta MELD on predicting waiting list mortality is actually due to correlation of the Delta MELD with the death process, rather than Delta MELD being predictive of death in the future. Furthermore, when Delta MELD > 5 is modeled with Current MELD and MELD number using time lags, Delta MELD > 5 is not significantly associated with survival in any of the scenarios. It is notable that the time-lagged models using Current MELD and Delta MELD together revealed that Delta MELD is able to aid in the prediction of deaths that will occur in less than 4-d, but not further in the future. Thus, when one considers the potential of reversibility of the terminal process in the last few days of life and the length of time usually necessary to receive an organ, the very short time horizon of the information provided by Delta MELD limits its practical utility in organ allocation.

The second issue relates to the fact that this and previous analyses were based on retrospective data obtained on waiting list patients cared for in a clinical practice setting and that the timing of monitoring of MELD scores for each patient could not be controlled. The potential for a detection bias in this setting is explained in more detail in Figure 5, which describes two hypothetical patients with the same baseline MELD who are under the care of the same physician. The case in the left panel develops a fatal complication (e.g. severe sepsis), whereas the one on the right experiences less severe illness and subsequently recovers. In the first patient, the physician, sensing the gravity of the problem, obtains multiple laboratory data in the course of the illness, generating a number of observations of MELD scores. In the second patient, however, the clinical constellation is less severe and laboratory data are obtained with much less frequency. Now, when a subsequent retrospective analysis is done on clinically obtained data, the first patient will have a high Delta MELD observed, which is correlated with subsequent demise of the patient. On the other hand, the identical Delta MELD in the second patient remains undetected and undocumented. Thus, the correlation between the clinician's decision to order more labs and the eventual outcome introduces a bias that will increase the apparent utility of Delta MELD. In this work, we used the number of MELD observations in a 30-d window (‘MELD Number’) as a surrogate marker for the clinician's judgment of acuity of illness.

Figure 5.

Hypothetical examples that illustrate the potential for detection bias in retrospective analyses. The eventual outcome is correlated with the probability of detecting meaningful increases in MELD (‘delta’).

Our data demonstrate that when the Delta MELD is modeled with both the Current MELD and the MELD Number, the effect of Delta MELD is no longer significant. The MELD Number is an imperfect measure of the aspects of clinical severity that are not captured by the components of the Current MELD. Nevertheless, this observation lends strong support to our hypothesis that in our retrospective analysis, Delta MELD is confounded by clinical judgment influencing the monitoring of MELD scores for individual patients.

Additionally, liver disease etiology is a relevant consideration in these analyses of the predictive value of Serial MELD scores for patients on the liver transplant waiting list inasmuch as disease etiology influences a patient's clinical course and may therefore have subtle influences on the clinician's approach to the patient. Thus in our patient population, with a disproportionate representation of cholestatic liver disease (35%), it may be suggested that the lack of significance associated with Delta MELD was due to this skew in liver disease etiology. However, when we compared our patients without cholestatic disease to the entire patient cohort, we found that the change in the hazard of death associated with Delta MELD was minimal. Therefore, we conclude that although our patient population contains an overrepresentation of cholestatic patients, this demographic issue was not a confounding factor in the analysis and does not explain the lack of strong association between Delta MELD and survival on the liver transplant waiting list.

Finally, another interesting question with regard to the survival prediction of Serial MELD scores is whether the hazard of death associated with Delta MELD is dependent upon the absolute level of MELD from which the change occurred. In other words, one might speculate that a patient whose MELD score increases 5-points from 15 to 20 would have a lower hazard of death compared to a patient with a similar MELD increase, but whose starting MELD score is higher. However, our analysis did not convincingly support such a hypothesis.

In summary, the results of the present study, based on a longitudinal database of serial MELD scores and our best effort to control for potential confounders in this retrospective data, suggest that it is premature to use Delta MELD in the organ allocation decisions. We demonstrate that overall, the Current MELD score is still the most significant parameter predictive of mortality on the liver transplantation waiting list. The effect of Current MELD remains significant after controlling for detection bias and is a useful predictor of waiting list mortality at least 14-d in advance. Therefore, our findings validate the current UNOS practice of allocating organs based on the most recent MELD score for patients awaiting liver transplantation. Although our investigation suggests that the predictive value of Delta MELD may be limited, further studies based on prospectively collected laboratory data in which the frequency of MELD measurements can be controlled may be able to address this issue more definitively.

Acknowledgments

This work was supported by a grant from the National Institutes of Health (DK-34238).
