Assessing severity of disease in patients with alcoholic hepatitis (AH) is useful for predicting mortality, guiding treatment decisions, and stratifying patients for therapeutic trials. The traditional disease-specific prognostic model used for this purpose is the Maddrey discriminant function (DF). The model for end-stage liver disease (MELD) is a more recently developed scoring system that has been validated as an independent predictor of patient survival in candidates for liver transplantation. The aim of the present study was to examine the ability of MELD to predict mortality in patients with AH. A retrospective cohort study of 73 patients diagnosed with AH between 1995 and 2001 was performed at the Mayo Clinic in Rochester, Minnesota. MELD was the only independent predictor of mortality in patients with AH. MELD was comparable to DF in predicting 30-day mortality (c-statistic and 95% CI: 0.83 [0.71-0.96] and 0.74 [0.62-0.87] for MELD and DF, respectively, not significant) and 90-day mortality (c-statistic and 95% CI: 0.86 [0.77-0.96] and 0.83 [0.74-0.92] for MELD and DF, respectively, not significant). A MELD score of 21 had a sensitivity of 75% and a specificity of 75% in predicting 90-day mortality in AH. In conclusion, MELD is useful for predicting 30-day and 90-day mortality in patients with AH and maintains some practical and statistical advantages over DF in predicting mortality rate in these patients. MELD is a useful clinical tool for gauging mortality and guiding treatment decisions in patients with AH, particularly those complicated by ascites and/or encephalopathy. (HEPATOLOGY 2005;41:353–358.)
Alcoholic hepatitis (AH) is an acute, inflammatory syndrome associated with significant morbidity and mortality that occurs in a subset of patients that consume excessive amounts of alcohol.1 Mild forms of AH improve with conservative management, whereas more severe AH is associated with substantial mortality.2 Pharmacological therapies including corticosteroids and pentoxifylline have been proposed for patients with more severe disease.3–6 Consequently, the a priori identification of subsets of patients with significant disease who will not improve with conservative therapy is an area of active investigation.
The Maddrey discriminant function (DF) (DF = 4.6 × [prothrombin time (PT) in seconds − control PT] + serum bilirubin in mg/dL) was introduced in 1978 as a tool for predicting risk for mortality in AH and thereby identifying a subset of patients that may benefit from intervention with corticosteroids.7 Based on these analyses, corticosteroid treatment is advocated by many clinicians for patients with a DF score of more than 32, because these patients appear to have a mortality rate exceeding 50% in the absence of pharmacological intervention.4, 8 This criterion was also used in a recent clinical trial evaluating the potential clinical efficacy of pentoxifylline in AH.5 However, use of a DF score of more than 32 for intervention has some drawbacks: (1) DF uses the PT, a variable that is poorly standardized across different laboratories; (2) initial validation of the relationship of DF to mortality is based on patient cohorts from several decades past7; and (3) patients with a DF score of less than 32 may still have a notable risk of death of up to 17%.9–11 In addition to the DF, other studies have identified clinical variables, most notably hepatic encephalopathy, as risk factors for mortality in AH. However, these models rely heavily on subjective parameters, which render them less optimal.3, 12
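As an illustration of the DF calculation described above, the following minimal Python sketch applies the formula as stated in the text. The function names and the patient values are hypothetical and are not taken from the study; the threshold of 32 is the conventional cut point cited above.

```python
def maddrey_df(pt_seconds: float, control_pt_seconds: float,
               bilirubin_mg_dl: float) -> float:
    """Maddrey discriminant function: 4.6 x (PT - control PT) + bilirubin."""
    return 4.6 * (pt_seconds - control_pt_seconds) + bilirubin_mg_dl


def exceeds_df_threshold(df_score: float, threshold: float = 32.0) -> bool:
    """True when the DF score is above the conventional treatment cut point."""
    return df_score > threshold


# Hypothetical patient (illustrative values only, not study data):
# PT 18 s, control PT 12 s, serum bilirubin 10 mg/dL.
score = maddrey_df(18.0, 12.0, 10.0)
print(f"DF = {score:.1f}, above 32: {exceeds_df_threshold(score)}")
```

Because the PT enters the formula in seconds, the same patient could receive a different DF score in laboratories that use different thromboplastin reagents.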
The model for end-stage liver disease (MELD) is a survival model based on a composite of three laboratory variables: serum creatinine, serum bilirubin, and international normalized ratio (INR) for PT. The model was originally derived from a cohort of 231 patients to assess the short-term prognosis of patients with cirrhosis undergoing elective transjugular intrahepatic portosystemic shunt at four centers in the United States.13 This model was subsequently validated as an independent predictor of survival in several independent cohorts of patients with cirrhosis.13–16 Because of the ability of MELD to accurately stratify patients according to mortality risk, it has now replaced the Child-Turcotte-Pugh score to prioritize and rank organ allocation of cadaveric livers for transplantation on the United Network for Organ Sharing liver waiting list.14–16 The variables in MELD include PT and bilirubin, which are also included in the DF score; however, the bilirubin and PT are weighted appropriately based on extensive validation studies and are expressed as logarithm values to prevent extreme values from unduly influencing the results. Moreover, MELD, but not DF, includes serum creatinine as a variable. Elevated serum creatinine has been shown to be associated with poor outcome in patients with AH.17 Because of the demonstrated capacity of MELD to discriminate patients with cirrhosis, and in conjunction with the aforementioned limitations in current disease-specific models used to predict mortality in patients with AH, the aim of the current study was to develop, characterize, and validate a MELD-based strategy to predict mortality in AH.
Abbreviations: AH, alcoholic hepatitis; DF, Maddrey discriminant function; PT, prothrombin time; MELD, model for end-stage liver disease; INR, international normalized ratio; c-statistic, concordance statistic.
Patients and Methods
This was a Mayo Clinic Institutional Review Board–approved retrospective cohort study of patients with a diagnosis of AH (ICD-9 code 571.1) who were seen at the Mayo Clinic Rochester inpatient or outpatient facilities between January 1, 1995, and December 31, 2001. In all potential cases, inpatient and outpatient records were reviewed and demographic, clinical, and laboratory data were extracted. The presence of acute AH was confirmed via clinical and laboratory criteria that included the following: (1) alcohol consumption within 2 months and exceeding 40 g/d for male and 20 g/d for female patients; (2) an aspartate/alanine aminotransferase ratio above 1.5 with an aspartate aminotransferase level above 45 U/L; (3) a total bilirubin level above 2 mg/dL; and (4) absence of an alternative primary cause of liver disease based on clinical history and serological studies (11 patients with AH had underlying viral hepatitis but were not excluded because the clinical basis of the admission/visit was due to AH). These threshold levels allowed for the inclusion of patients with both mild and severe AH. A total of 182 patients were diagnosed with AH based on ICD coding. After reviewing the medical record, 98 of these patients met the entry criteria of the study as outlined above. Of these 98 patients, 73 had all the requisite laboratory data (PT, INR, serum creatinine, and serum total bilirubin within 24 hours of presentation) and comprised the final patient sample. The first available laboratory tests within 24 hours of presentation were used to calculate baseline MELD and DF scores. Laboratory data to calculate MELD on day 7 following admission were available in 27 patients. The change in MELD—day 7 MELD − MELD on admission—was defined as ΔMELD. For patients with more than one episode of AH in the time period, only the initial episode was included. 
Presence or absence of ascites and encephalopathy was based on physical examination findings described in the chart by the primary care physician. Survival was verified using hospital records and the Social Security Death Index.
The statistical end point was death due to any cause within 90 days of the hospital admission. Univariate logistic regression was used to screen the variables reported in Table 1 for associations with 90-day mortality. Variables that were statistically significant formed a pool of potential independent predictors that are reported in Table 2. These predictors were entered into a backward elimination variable selection procedure (logistic regression); the criterion for retaining predictors was a P value less than .05. This variable selection process was repeated using a forward variable selection procedure. Concordance (range, 0.0-1.0) is equivalent to the area under the receiver operating characteristic curve and quantifies the prognostic validity of a variable. The concordance (c) statistics of MELD and DF were compared using DeLong's test.18 Thirty-day c-statistics of MELD and DF were also calculated for comparison, because this was the time point for which the DF score was originally derived.7 The statistical program used was SAS version 8.0 (Cary, NC). MELD was calculated using the following formula16: $\mathrm{MELD} = 9.57 \times \ln(\text{creatinine, mg/dL}) + 3.78 \times \ln(\text{bilirubin, mg/dL}) + 11.20 \times \ln(\mathrm{INR}) + 6.43$. The probability of 90-day mortality in AH was calibrated using the data from logistic regression: $P = e^{-4.3 + 0.16 \times \mathrm{MELD}} / [1 + e^{-4.3 + 0.16 \times \mathrm{MELD}}]$.
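The two formulas above can be sketched in Python as follows. This is a minimal illustration that applies the equations exactly as printed, with no rounding or bounding of the laboratory inputs; the function names are hypothetical, and the sketch is not the calculator referenced elsewhere in this article.

```python
import math


def meld_score(creatinine_mg_dl: float, bilirubin_mg_dl: float,
               inr: float) -> float:
    """MELD as printed in the text, using natural logarithms."""
    return (9.57 * math.log(creatinine_mg_dl)
            + 3.78 * math.log(bilirubin_mg_dl)
            + 11.20 * math.log(inr)
            + 6.43)


def mortality_90d(meld: float) -> float:
    """Logistic calibration from the text: e^z / (1 + e^z), z = -4.3 + 0.16 x MELD."""
    z = -4.3 + 0.16 * meld
    return math.exp(z) / (1.0 + math.exp(z))


# Hypothetical laboratory values (illustrative only, not study data).
meld = meld_score(creatinine_mg_dl=1.5, bilirubin_mg_dl=12.0, inr=1.8)
print(f"MELD = {meld:.1f}, estimated 90-day mortality = {mortality_90d(meld):.2f}")
```

Because each laboratory value enters through its logarithm, a value of exactly 1.0 contributes nothing to the score, and the estimated probability rises monotonically with MELD.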
Table 1. Demographic, Clinical, and Laboratory Variables Analyzed With Univariate Logistic Regression
History of Present Illness
Past Medical History
Age (47 ± 10 years)
Amount of alcohol consumption (139.8 ± 141.1 g/d)
Fever (13.4% ≥38°C)
Total bilirubin (12.1 ± 12.0 mg/dL)
Sex (31.5% female)
Coronary artery disease (8.2%)
Creatinine (1.1 ± 1.0 mg/dL)
BMI (28.1 ± 6.0)
Renal insufficiency (8.2%)
ALT (68.6 ± 61.9 U/L)
Diabetes mellitus (2.7%)
AST (244.3 ± 218.7 U/L)
Abdominal pain (27.4%)
Concomitant viral hepatitis (15.0%)
Albumin (6.6 ± 11.6 g/dL)
Weight loss (28.8%)
Spider angioma (50.7%)
Platelet count (161.0 ± 97.2 × 10^9/L)
Palmar erythema (17.8%)
WBC count (12.3 ± 13.7 × 10^9/L)
Gastrointestinal bleeding (21.9%)
Hemoglobin (11.4 ± 2.4 g/dL)
Concurrent acetaminophen use >1 g/d (4.1%)
Abdominal veins (5.5%)
INR (1.5 ± 0.4)
Admission MELD (17.0 ± 9.1)
Admission DF (34.5 ± 28.4)
7-day MELD (23.6 ± 9.5)
ΔMELD (0.77 ± 5.7)
NOTE. Values are the percentage of patients with the finding or the mean ± standard deviation.
Abbreviations: BMI, body mass index; ALT, alanine aminotransferase; AST, aspartate aminotransferase; WBC, white blood cell.
Table 2. Risk Factors Associated With 90-Day Mortality
Results
Patient Demographics and Variables Associated With Mortality on Univariate and Multivariate Analysis.
Demographic, clinical, and laboratory variables from the initial patient presentation retrieved from the patient record and analyzed with univariate logistic regression are outlined in Table 1. The patient cohort consisted of 73 patients with a median age of 47 years (range, 24-65), with 16 deaths occurring in this group by 90 days. Thirty-three patients had cirrhosis based on histology or imaging. All biopsies were consistent with alcoholic hepatitis. Eleven patients had liver-relevant diagnoses concomitant to AH, including hepatitis C virus, hepatitis B virus, and recent acetaminophen consumption. These patients were not excluded because the reason for their current illness, based on the hospital record and the retrospective chart review, was solely due to alcoholic hepatitis. The vast majority of the patient cohort did not receive active pharmacotherapy, consistent with standard practice at our institution; however, 12 patients did receive active therapies including corticosteroids7 and etanercept in the context of a clinical trial.19 Table 2 depicts the variables that were significantly associated with 90-day mortality on univariate analysis, which consisted largely of components that comprise the MELD and DF (INR, creatinine, bilirubin), in addition to specific physical examination findings (ascites, edema, encephalopathy) and a history of gastrointestinal bleeding or renal insufficiency. Notable variables that were not associated with mortality on univariate analysis included ΔMELD, underlying viral hepatitis, and presence of cirrhosis. Treatment with an active medication did not influence survival when corrected for MELD. Because INR, creatinine, and bilirubin are components of both DF and MELD, they were excluded from the subsequent multivariate analysis. 
In multivariate logistic regression, both backward elimination and forward selection procedures selected MELD as the only independent predictor of 90-day mortality, and no additional variables increased its accuracy. Table 2 shows that when the six remaining variables were added to MELD one at a time using forward selection, all six variables lost significance, and the combined c-statistics did not significantly improve compared with MELD alone.
Validation and Calibration of MELD as an Independent Predictor of Mortality.
Because MELD emerged as the only independent predictor of mortality in the multivariate analysis, we next sought to further validate and calibrate MELD to predict mortality in AH. First, the probability of 90-day mortality in AH was calibrated using the data from logistic regression: $P = e^{-4.3 + 0.16 \times \mathrm{MELD}} / [1 + e^{-4.3 + 0.16 \times \mathrm{MELD}}]$. Figure 1 depicts the plotted curve demonstrating the estimated mortality for a given MELD score using these data. Next, receiver operating characteristic curves were generated to compare MELD with DF for 30-day mortality, the time point for which the DF was originally derived, as well as to examine the prognostic use of MELD for predicting an extended 90-day mortality. For 30-day mortality (Fig. 2A), the c-statistic was 0.83 (95% CI: 0.71-0.96) for MELD and 0.74 (95% CI: 0.62-0.87) for DF, with the optimal cut points of 22 and 41 for MELD and DF, respectively. For 30-day mortality using the optimal cut point, MELD had a sensitivity of 0.75 and a specificity of 0.75, while DF had a sensitivity of 0.75 and a specificity of 0.69. For 90-day mortality (Fig. 2B), the c-statistic was 0.86 (95% CI: 0.77-0.96) for MELD and 0.83 (95% CI: 0.74-0.92) for DF, with the optimal cut points of 21 and 37 for MELD and DF, respectively. For 90-day mortality using the optimal cut point, MELD had a sensitivity of 0.75 and a specificity of 0.75, while DF had a sensitivity of 0.88 and a specificity of 0.65. The differences in c-statistic between MELD and DF were not statistically significant for 30-day or 90-day mortality. Furthermore, the addition of other variables on physical examination and biochemical testing that emerged from the univariate analysis did not significantly increase the c-statistic achieved by MELD alone in predicting mortality (Table 2).
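To make the quantities reported above concrete, the following Python sketch computes a c-statistic as the proportion of concordant death/survivor pairs (which equals the area under the receiver operating characteristic curve) and the sensitivity and specificity at a given cut point. The scores and outcomes are made up for illustration and are not study data; treating a score at or above the cut point as a predicted death is an assumption, since the text does not state which side of the cut point is inclusive.

```python
from typing import Sequence, Tuple


def c_statistic(scores: Sequence[float], died: Sequence[bool]) -> float:
    """Share of (death, survivor) pairs where the death scored higher; ties count 0.5."""
    dead = [s for s, d in zip(scores, died) if d]
    alive = [s for s, d in zip(scores, died) if not d]
    if not dead or not alive:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for s_dead in dead:
        for s_alive in alive:
            if s_dead > s_alive:
                concordant += 1.0
            elif s_dead == s_alive:
                concordant += 0.5
    return concordant / (len(dead) * len(alive))


def sensitivity_specificity(scores: Sequence[float], died: Sequence[bool],
                            cut_point: float) -> Tuple[float, float]:
    """Assumes score >= cut_point predicts death and both outcomes are present."""
    pairs = list(zip(scores, died))
    true_pos = sum(1 for s, d in pairs if d and s >= cut_point)
    false_neg = sum(1 for s, d in pairs if d and s < cut_point)
    true_neg = sum(1 for s, d in pairs if not d and s < cut_point)
    false_pos = sum(1 for s, d in pairs if not d and s >= cut_point)
    return (true_pos / (true_pos + false_neg),
            true_neg / (true_neg + false_pos))


# Made-up MELD scores and 90-day outcomes (illustrative only).
example_scores = [8, 12, 15, 19, 22, 25, 30, 34]
example_died = [False, False, False, True, False, True, True, True]
print(c_statistic(example_scores, example_died))
print(sensitivity_specificity(example_scores, example_died, cut_point=21))
```

Confidence intervals and a formal comparison of two c-statistics, such as the test used in this study, require additional variance estimates that are not shown in this sketch.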
These analyses suggest an equivalency of MELD to DF rather than a substantive advantage of one of the parameters. To further address the question of whether MELD scores may diverge from DF in some select patients, we plotted MELD and DF scores in patients who survived and patients who died at 30 days (Fig. 3). Visual inspection of this plot demonstrates that while DF largely correlates with MELD at lower values, at higher values many patients have disproportionately higher MELD scores compared with DF. Among these patients, deaths appear to track more closely with MELD than with DF.
Ascites and Encephalopathy as Predictors of Mortality.
Because the physical examination findings of ascites and encephalopathy were significantly associated with mortality in the univariate analysis and are routinely assessed in patients with AH, these two variables were further examined as predictors of 90-day mortality. Table 3 shows the 90-day mortality of patients who evidenced presence and absence of ascites and/or encephalopathy. Notably, patients that lacked both encephalopathy and ascites had 100% survival at 90 days. However, survival was highly variable in the majority of patients that evidenced the presence of one or both of the physical examination findings of ascites and encephalopathy, highlighting the need for prognostic models such as MELD in this patient population.
Table 3. Encephalopathy and Ascites as Predictors of 90-Day Mortality
90-Day Mortality/Total Number (%)
Mean ± SD (Range of MELD)
29 ± 9 (17–38)
17 ± 8 (6–36)
21 ± 8 (7–33)
10 ± 4 (0–18)
Discussion
This study was undertaken to examine the accuracy of MELD in predicting mortality in patients with AH and thereby optimize prognostic strategies for these patients. The traditional disease-specific formula used to predict 30-day mortality in patients with AH is the DF.7 In this study, the c-statistics calculated to compare prognostic validity of MELD and DF in AH were comparable for 30-day as well as 90-day mortality. Interestingly, in our multivariate logistic regression, MELD, not DF, emerged as the only independent predictor of 90-day mortality. This may be due to the incorporation of serum creatinine in MELD. Serum creatinine correlates with survival in a number of disease states, including AH.10, 17 Although a recent study demonstrated that temporal changes in serum bilirubin during the course of AH are effective in predicting subsequent patient mortality,20 in our study the baseline MELD was the best independent predictor of subsequent mortality, while interval changes in MELD (ΔMELD)21 did not emerge as a predictor of mortality in the univariate analysis. Indeed, no other variables were significantly associated with mortality after correction for MELD; furthermore, no other variables augmented the capacity of MELD to predict mortality. The optimal cut point of 21 was obtained for MELD in the 90-day receiver operating characteristic mortality analyses, with a c-statistic that was similar to that of the DF. A series published by Sheth et al.22 recently provided initial evidence that MELD may be useful in estimating 30-day prognosis in patients with AH; a more recent analysis of MELD conducted by Said et al.23 across a broad spectrum of liver disease, which included a cohort of patients with AH, also supports this concept. The current study, performed in a large, well-characterized cohort, establishes the ability of MELD to rank patients with AH by risk of death and provides evidence that MELD accurately predicts mortality up to 90 days.
Although prospective validation would be required to definitively prove or disprove that MELD exceeds DF in predicting mortality in AH, some practical points favoring the use of MELD in this setting should be considered. In the current era, with laboratories using INR as a standardized analysis of coagulation status rather than PT expressed in seconds, there are tangible benefits in using predictive formulas that use INR, such as MELD. The PT expressed as INR is comparable across all laboratories, because the calculation accounts for the sensitivity of the thromboplastin reagent used in the test. In contrast, PT expressed in seconds is highly dependent on the sensitivity of the thromboplastin used. Therefore, the same patient may have markedly variable values for PT expressed in seconds from laboratory to laboratory if varying sensitivities of thromboplastin are used.21 For instance, a patient with a PT of 12.6 seconds with a thromboplastin sensitivity of 3 will have a PT of 14 seconds with a thromboplastin sensitivity of 2, and a PT of 20 seconds if thromboplastin of sensitivity 1 is used. Furthermore, DF was developed several decades ago,7 whereas MELD has been prospectively and retrospectively validated in heterogeneous cohorts of patients derived from the current era.13–16 With the increasing use of MELD in many applications, most prominently in the replacement of the Child-Turcotte-Pugh score for allocation of liver allografts,15 many convenient approaches to calculate MELD have arisen, including the maintenance of the formula on personal handheld computers or, alternatively, the use of a common website calculator. Our analysis suggests a use for this calculation to estimate mortality in AH patients, and a website is available to estimate MELD-based mortality in patients with AH (http://www.mayoclinic.org/gi-rst/mayomodel7.html).
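The PT example above can be reproduced with the standard definition of the INR, INR = (PT / control PT) raised to the power of the thromboplastin sensitivity index. The Python sketch below is illustrative only: the control PT of 10 seconds and the INR of 2.0 are assumptions chosen because they reproduce the PT values quoted in the text, and the function names are hypothetical.

```python
def inr(pt_seconds: float, control_pt_seconds: float,
        sensitivity_index: float) -> float:
    """INR = (PT / control PT) ** sensitivity index of the thromboplastin."""
    return (pt_seconds / control_pt_seconds) ** sensitivity_index


def pt_for_inr(target_inr: float, control_pt_seconds: float,
               sensitivity_index: float) -> float:
    """PT in seconds that a laboratory would report for a given INR."""
    return control_pt_seconds * target_inr ** (1.0 / sensitivity_index)


# Assumed values: one patient with INR 2.0, control PT 10 s in every laboratory.
for index in (3.0, 2.0, 1.0):
    seconds = pt_for_inr(2.0, 10.0, index)
    print(f"sensitivity {index:.0f}: PT = {seconds:.1f} s")
# prints 12.6 s, 14.1 s, and 20.0 s for sensitivities 3, 2, and 1
```

The same coagulation status therefore maps to PT values ranging from about 12.6 to 20 seconds, which changes any score that uses PT in seconds, whereas the INR is unchanged.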
An important consideration in the use of prognostic models in patients with AH is to allow for the determination of which patients should undergo therapy with biologically active medications versus which patients should be managed supportively with the anticipation of spontaneous improvement. In this regard, a DF score of more than 32 predicts a mortality exceeding 50%3, 7, 8 and has traditionally been used as an indication for corticosteroid therapy.3 This criterion has been frequently used as an inclusion criterion in new treatment trials of patients with AH, as well.5 However, a significant minority of patients (10%-17%) with a DF score below 32 may still die from AH.9–11 Because the rationale for the cut point of 50% mortality as a trigger for treatment was based on the risk–benefit ratio specific to corticosteroids,7 if an effective therapy was available with a lower adverse event profile than corticosteroids, the therapeutic intervention could be initiated at a cut point of mortality lower than 50%. In this regard, the present study identifies a MELD score of 21 as having the highest sensitivity and specificity to predict mortality with an estimated 90-day mortality of 20% for patients with this score. Thus, patients with AH and a MELD score of more than 21 could be considered for entry into studies addressing the use of potential therapeutic agents, although the specific MELD cut point used may depend on the degree of treatment-related mortality.
Interestingly, our analyses suggest that physical examination signs of encephalopathy and ascites may be useful as a quick bedside screening test for mortality in AH. In our patient sample, 90-day mortality approached 0 in the absence of both encephalopathy and ascites, thereby reducing the need for detailed prognostic calculations to predict mortality in such patients. Although the absence of encephalopathy and ascites was a useful screening test in this context, many patients with AH do have one or both of these findings.4 Furthermore, the clinical diagnoses of encephalopathy and ascites can be subjective, as evidenced by large interobserver variability.24 Indeed, these limitations in accurately detecting the presence or absence of these physical examination parameters were a major reason that MELD was adopted as a nonbiased approach to prioritize and rank organ allocation of cadaveric livers for transplantation on the United Network for Organ Sharing liver waiting list.14–16
In summary, MELD is useful for predicting 30-day and 90-day mortality in patients with AH and maintains some practical and statistical advantages over DF in predicting mortality rate in these patients. These data suggest that in patients with AH—particularly those complicated by ascites, encephalopathy, or both—MELD is a useful clinical tool with which to gauge mortality and guide treatment decisions.