Serum bilirubin levels and mortality after myeloablative allogeneic hematopoietic cell transplantation


  • Ted A. Gooley,
  • Pankaj Rajvanshi,
  • H. Gary Schoch,
  • George B. McDonald (corresponding author)

    1. Sections of Clinical Statistics and Gastroenterology/Hepatology, Clinical Research Division, Fred Hutchinson Cancer Research Center, and the University of Washington School of Medicine, Seattle, WA

    Corresponding author address:
    • Gastroenterology/Hepatology Section (D2-1900), Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, P.O. Box 19024, Seattle, WA 98109-1024
    • Fax: 206-667-6519

  • Conflict of interest: Nothing to report.


Many patients who undergo hematopoietic cell transplantation experience liver injury. We examined the association of serum bilirubin levels with nonrelapse mortality by day +200, testing the hypothesis that the duration of jaundice up to a given point in time provides more prognostic information than either the maximum bilirubin value or the value at that point in time. We studied 1,419 consecutive patients transplanted from allogeneic donors. Total serum bilirubin values up to day +100, death, or relapse were retrieved, and nonrelapse mortality by day +200 was analyzed as the outcome measure using Cox regression models with each bilirubin measure modeled as a time-dependent covariate. The bilirubin value at a particular point in time provided the best fit to the model for mortality. With bilirubin at a point in time modeled as an 8th-degree polynomial, an increase in bilirubin from 1 to 3 mg/dL is associated with a mortality hazard ratio of 6.42. An increase from 4 to 6 mg/dL yields a hazard ratio of 2.05, and an increase from 10 to 12 mg/dL yields a hazard ratio of 1.17. Among patients who were deeply jaundiced, survival was related to the absence of multiorgan failure and to higher platelet counts. In conclusion, the value of total serum bilirubin at a particular point in time after transplant carries more prognostic information than does the maximum or average value up to that point in time. The increase in mortality for a given increase in bilirubin value is larger when the starting value is lower. (HEPATOLOGY 2005;41:345–352.)

During the process of undergoing hematopoietic cell transplantation, many patients experience hepatobiliary complications,1 including: sinusoidal obstruction syndrome (veno-occlusive disease)2; several cholestatic disorders (e.g., graft-versus-host disease [GVHD], cyclosporine cholestasis, cholangitis lenta, hepatotoxic drugs)3; hepatocellular injury caused by viruses, GVHD, drugs, and ischemia4–9; iron overload10; biliary obstruction11; and infiltration of the liver by fungi or tumor cells.12, 13

We analyzed a cohort of 1,419 consecutive patients who received allogeneic transplants to assess the relationship of the level of jaundice to mortality after transplant. We made no effort to make precise diagnoses, but rather used total serum bilirubin as an index of the severity of underlying liver damage from all cumulative causes. Previous studies have clearly shown an association of jaundice with poor outcome in the setting of hematopoietic cell transplantation.1 However, these reports have generally considered only a single measure of bilirubin, and this measure has been modeled in a relatively restrictive manner using either dichotomous or linear variables. Because clinical experience had suggested that both the degree of jaundice and its duration were associated with increased mortality, we studied the relation of three bilirubin parameters—maximum value by a specified point in time, average value up to that point in time, and daily value at that time—to the hazard of day +200 nonrelapse mortality. We have not modeled any factors other than those associated with bilirubin, because consideration of nonbilirubin variables could confound the bilirubin effect that is of interest. We also analyzed patients who were deeply jaundiced to see if we could identify clinical and laboratory factors that predicted survival despite extreme hyperbilirubinemia.


Abbreviations: GVHD, graft-versus-host disease; NRM, nonrelapse mortality; AIC, Akaike Information Criterion.

Patients and Methods

Hematopoietic Cell Transplantation

All patients undergoing allogeneic transplantation received a myeloablative regimen followed by infusion of donor cells. The day of infusion was day 0, by convention. Graft recipients received prophylaxis against acute GVHD with immunosuppressive drugs—usually cyclosporine or tacrolimus plus methotrexate. Prophylaxis for infections included acyclovir, trimethoprim/sulfamethoxazole, oral fluconazole, and ganciclovir.14 Serum samples were collected for determination of total serum bilirubin at regular intervals through day +100. This retrospective analysis was performed under a protocol approved by our institutional review office.

Patient Selection

All recipients of allogeneic hematopoietic cells from 1993 through 1997 were evaluated.

End Point and Bilirubin Parameter Definitions

Nonrelapse Mortality.

Day +200 nonrelapse mortality (NRM) was taken as the primary end point of this study. An event for this end point was defined as any death within 200 days posttransplant that was not preceded by a relapse of the original malignancy.

Total Serum Bilirubin Parameters.

Maximum serum bilirubin was defined as the highest observed level by a given point in time. The average bilirubin level was the arithmetic mean of the bilirubin values up to the day in question. The actual bilirubin level was the level on the day in question. For each of these bilirubin summary measures, all values from day of transplant to day +100 following transplant were recorded, along with the date of relapse of underlying malignancy or date of death. On days that no measurement was taken, the last measured bilirubin value was imputed.
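The three summary measures defined above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `bilirubin_summaries` and the example values are hypothetical, and it assumes that the running average is taken over the daily series after imputation (the text does not say whether imputed days enter the mean).

```python
from typing import Dict, List, Optional


def bilirubin_summaries(measured: Dict[int, float], last_day: int) -> List[dict]:
    """Return, for each day, the daily, maximum, and average bilirubin.

    `measured` maps day after transplant -> total serum bilirubin (mg/dL).
    Days without a measurement reuse the last measured value, as in the text.
    Days before the first measurement are skipped (nothing to carry forward).
    """
    rows = []
    last_value: Optional[float] = None
    running_max = float("-inf")
    running_sum = 0.0
    n_days = 0
    for day in range(0, last_day + 1):
        if day in measured:
            last_value = measured[day]
        if last_value is None:
            continue
        running_max = max(running_max, last_value)
        # Assumption: imputed days count toward the running mean.
        running_sum += last_value
        n_days += 1
        rows.append(
            {
                "day": day,
                "daily": last_value,
                "maximum": running_max,
                "average": running_sum / n_days,
            }
        )
    return rows


# Hypothetical patient measured on days 0, 2, and 3 only.
example = bilirubin_summaries({0: 1.0, 2: 3.0, 3: 2.0}, last_day=4)
```

In this made-up series the day +4 row has a daily value of 2.0, a maximum of 3.0, and an average of 1.8, which shows how the three measures can diverge for the same patient on the same day.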

Statistical Methods

To assess the association of each bilirubin measure with day +200 NRM, Cox regression models were fit with each bilirubin parameter as an explanatory variable. Patients who relapsed before day +200 were censored at time of relapse in the regression models. To assess the fit of each model, the Akaike Information Criterion (AIC) was calculated for the three models, where a smaller AIC implies a better fit to the data.15 The AIC was defined as [−2logL + 3p], where logL represents the log likelihood and p represents the number of parameters contained in the regression model. The motivation behind this statistic is that if the only difference between two models is that one contains unnecessary covariates, the values of −2logL for the two models will not be very different (the value of −2logL never increases as additional covariates are added to the model), and as a result the value of the AIC will increase when unnecessary terms are added to the model (because the increase in the penalty term 3p will exceed the decrease in −2logL). The choice of the parameter 3 in the AIC corresponds roughly to using a 5% significance level in comparing two nested models that differ by 1 to 3 parameters. Each bilirubin parameter was modeled as a time-dependent covariate. In addition to comparing the AICs for models containing only one of the bilirubin measures, models additionally containing a particular measure were compared to the model not containing this measure using the likelihood ratio test. These comparisons allow one to assess the additional contribution of each bilirubin parameter after other parameters have already been considered.

In addition to assessing the association of bilirubin measures with outcome, many other clinical measurements were recorded among patients who were deeply jaundiced (total serum bilirubin > 10 mg/dL) at days +20 and +50. The association of these clinical variables with day +200 NRM was assessed in an effort to identify parameters associated with survival despite hyperbilirubinemia.
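The penalized criterion and the nested-model comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function names and the log-likelihood values are hypothetical, and the likelihood ratio helper handles only models that differ by exactly one parameter (chi-square with 1 degree of freedom).

```python
import math


def aic(log_likelihood: float, n_params: int, penalty: float = 3.0) -> float:
    """AIC as defined in the text: -2 logL + 3p (penalty of 3 per parameter)."""
    return -2.0 * log_likelihood + penalty * n_params


def lrt_p_value_1df(loglik_small: float, loglik_large: float) -> float:
    """Likelihood ratio test for nested models differing by one parameter."""
    statistic = 2.0 * (loglik_large - loglik_small)
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))


# Hypothetical log likelihoods for a 5-parameter and a 6-parameter model.
aic_small = aic(-2500.0, 5)
aic_large = aic(-2499.5, 6)
p_value = lrt_p_value_1df(-2500.0, -2499.5)

# P value at which the AIC is indifferent to one extra parameter
# (the test statistic equals the penalty of 3).
p_at_threshold = lrt_p_value_1df(0.0, 1.5)
```

With these made-up numbers the extra parameter lowers −2logL by only 1 while adding 3 to the penalty, so the AIC rises from 5015 to 5017 and the smaller model is preferred. For a single added parameter the break-even P value is about .08, in line with the "roughly 5%" description in the text.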


Results

Description of the Study Cohort.

During the study period, 1,419 patients received an allogeneic transplant; their demographics are given in Table 1. Among these patients, 419 (30%) died before day +200 without a prior relapse of disease. Fourteen of these deaths occurred between day 0 and day +10; the remaining 405 occurred between days +10 and +200.

Table 1. Demographics of the Study Population

Demographic                               n (%)
  Acute myeloid leukemia                  347 (24.5)
  Myelodysplastic syndrome                164 (11.6)
  Chronic myeloid leukemia                463 (32.6)
  Acute lymphocytic leukemia              190 (13.4)
  Lymphomas                               91 (6.4)
  Other malignant disorders               55 (3.9)
  Aplastic anemia                         46 (3.2)
  Paroxysmal nocturnal hemoglobinuria     6 (0.4)
Age                                       35.3 ± 15.2 yr (median 37.5, range 0.6–67.8)
  Female                                  563 (39.7)
  Male                                    856 (60.3)
Donor human leukocyte antigen match
  Matched family member                   625 (44)
  Mismatched family member                201 (14)
  Unrelated donor                         593 (42)

Modeling Bilirubin Parameters as Time-Dependent Covariates With Regard to Day +200 NRM.

Table 2 displays the proportion of patients who died before day +200 without a prior relapse among those who were alive without relapse on days 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100 as a function of the total serum bilirubin on these specific days. There is a substantial increase in NRM as the bilirubin level increases, and a relatively large increase in mortality as the bilirubin level rises above 4 mg/dL. Each of the three bilirubin measures was statistically significantly associated with the hazard of day +200 NRM, and the association of each appeared to be monotone-nondecreasing (data not shown). That is, increases in each of the parameters were associated with an increase in the hazard of NRM, although this increase levels off as the value of the parameter increases. In addition, these associations did not appear to depend on time after transplant (data not shown).

Table 3 shows the results of modeling each measure as an nth-degree polynomial, where n was chosen to be the value that led to the lowest AIC for a particular measure. A linear model for a particular bilirubin parameter assumes that the association for a particular increase in the measure is the same for all increases of this magnitude, regardless of the initial value for the measure. Use of a polynomial function in modeling allows nonlinear features to be captured; this is the motivation for the use of an nth-degree polynomial. As seen in Table 3, the model in which bilirubin was modeled as the actual value at each day provided the best fit to the data.

In addition to modeling each summary measure as a continuous variable, we also categorized each bilirubin measure into the following groups: total serum bilirubin 0 to 1 mg/dL; 1 to 4 mg/dL; 4 to 7 mg/dL; 7 to 10 mg/dL; and >10 mg/dL, as in Table 2. The models using continuous variables (see Table 3) led to a lower AIC for each parameter when compared with the relevant models using bilirubin groups (data not shown). Thus, we have chosen to model each bilirubin measure as an nth-degree polynomial rather than using categorical variables, because the polynomials provide better fits to the data.
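The difference between a linear and a polynomial term can be made concrete with a short sketch. In a Cox model the hazard ratio for moving from one bilirubin value to another is the exponential of the difference in the fitted log-hazard function. The coefficients below are hypothetical values chosen only to show the shape of the effect; they are not the fitted coefficients from this study, and a second-degree polynomial stands in for the higher-degree fits.

```python
import math
from typing import Sequence


def log_hazard(bilirubin: float, coefs: Sequence[float]) -> float:
    """Polynomial log relative hazard: coefs[0]*x + coefs[1]*x**2 + ...

    There is no intercept because it is absorbed by the Cox baseline hazard.
    """
    return sum(c * bilirubin ** (k + 1) for k, c in enumerate(coefs))


def hazard_ratio(start: float, increase: float, coefs: Sequence[float]) -> float:
    """Hazard ratio for a rise from `start` to `start + increase` (mg/dL)."""
    return math.exp(log_hazard(start + increase, coefs) - log_hazard(start, coefs))


LINEAR = [0.20]             # hypothetical linear model
QUADRATIC = [0.50, -0.02]   # hypothetical concave polynomial

linear_low = hazard_ratio(1.0, 2.0, LINEAR)
linear_high = hazard_ratio(10.0, 2.0, LINEAR)
poly_low = hazard_ratio(1.0, 2.0, QUADRATIC)
poly_mid = hazard_ratio(4.0, 2.0, QUADRATIC)
poly_high = hazard_ratio(10.0, 2.0, QUADRATIC)
```

Under the linear model a 2 mg/dL rise gives the same hazard ratio at every starting value, whereas the concave polynomial gives a ratio that shrinks toward 1 as the starting value increases while staying above 1 over this range.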

Table 2. Frequency of NRM at Day +200 After Allogeneic Hematopoietic Cell Transplant as a Function of Total Serum Bilirubin Values at 10-Day Intervals (N = 1,419)

Proportion of Patient Deaths Due to Causes Other Than Relapse of Malignancy*

Total Serum Bilirubin (mg/dL)  Day 10      Day 20      Day 30      Day 40      Day 50      Day 60      Day 70      Day 80      Day 90      Day 100
13–16                          5/7 (71%)   8/12 (67%)  11/14 (79%) 5/9 (56%)   3/4 (75%)   3/4 (75%)   2/3 (67%)   -           1/2 (50%)   0/1 (0%)
19–22                          4/5 (80%)   4/4 (100%)  1/1 (100%)  2/2 (100%)  -           3/3 (100%)  1/1 (100%)  3/3 (100%)  1/1 (100%)  1/1 (100%)
25–28                          2/2 (100%)  -           2/2 (100%)  1/1 (100%)  1/1 (100%)  -           1/1 (100%)  -           -           -
28–31                          -           5/5 (100%)  -           3/4 (75%)   2/2 (100%)  1/1 (100%)  -           -           2/2 (100%)  -
34–37                          -           2/3         3/4         1/1         -           -           -           -           -           -
>37                            -           1/1         8/9         5/5         4/4         3/3         1/1         1/1         -           -

  • NOTE. Boldface data indicate cumulative summaries. A hyphen marks an empty cell.

  • * Values represent the number of patients who died of nonrelapse causes before day +200, divided by the number of patients who were alive without relapse at the time noted (that is, days 10 through 100 after transplant), thus giving the percentage of patients at risk who later died from nonrelapse causes.

Table 3. Total Serum Bilirubin Parameters as nth-Degree Polynomials and Their Fit to Day +200 NRM as Determined by the AIC

Summary Measure                    n    AIC      P Value
Daily total serum bilirubin        5    4,737
Daily total serum bilirubin*       8    4,697    < .00001, 5th-degree vs. 8th-degree
Daily total serum bilirubin        9    4,699    .52, 8th-degree vs. 9th-degree
Maximum total serum bilirubin*     5    5,007
Maximum total serum bilirubin      6    5,008    .17, 5th-degree vs. 6th-degree
Average total serum bilirubin*     5    5,114
Average total serum bilirubin      6    5,117    .79, 5th-degree vs. 6th-degree

  • * Best model for each summary measure.

Figure 1 shows hazard ratios for relative increases in total serum bilirubin values of 1, 2, and 3 mg/dL when the daily bilirubin value is modeled as a continuous 8th-degree polynomial. For example, Fig. 1A shows that under this model, as the bilirubin value increases from 1 mg/dL to 2 mg/dL, there is an accompanying increase in the hazard of day +200 NRM of 190% (corresponding to a hazard ratio of 2.90). Note that for a specified starting bilirubin value, the hazard ratios increase when the relative increases in bilirubin get larger. For example, if a starting bilirubin value of 2 mg/dL is isolated on Fig. 1A (reflecting a change from 2 mg/dL to 3 mg/dL), Fig. 1B (a change from 2 mg/dL to 4 mg/dL), and Fig. 1C (a change from 2 mg/dL to 5 mg/dL), the corresponding hazard ratios increase from 2.21 to 3.95 to 6.01, respectively. For a specified magnitude of increase (e.g., an increase of 2 mg/dL [Fig. 1B]), the hazard ratios decrease as the starting bilirubin level increases. In other words, the hazard of NRM is increased less and less (although still increased, as demonstrated by a hazard ratio greater than 1.0 throughout) for specified rises in bilirubin as the starting level gets larger. For example, an increase in bilirubin from 1 mg/dL to 3 mg/dL is associated with a hazard ratio of 6.42, while increases in bilirubin from 4 mg/dL to 6 mg/dL and from 10 mg/dL to 12 mg/dL are associated with hazard ratios of 2.05 and 1.17, respectively.
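The hazard ratios quoted above can be cross-checked against one another. Writing f for the fitted polynomial in the log-hazard (the symbol f is introduced here for illustration and is not the paper's notation), the hazard ratio for a rise from a to b depends only on the difference f(b) − f(a), so ratios over adjacent intervals multiply:

```latex
\mathrm{HR}(a \to b) = \exp\{ f(b) - f(a) \}
\quad\Longrightarrow\quad
\mathrm{HR}(1 \to 3) = \mathrm{HR}(1 \to 2)\,\mathrm{HR}(2 \to 3)
= 2.90 \times 2.21 \approx 6.41
```

This agrees with the reported value of 6.42 for an increase from 1 mg/dL to 3 mg/dL, up to rounding of the two component ratios.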

Figure 1.

The hazard ratio of day +200 NRM when daily total serum bilirubin is modeled as an 8th-degree polynomial. Solid lines show the hazard ratio associated with an increase in total serum bilirubin of (A) 1 mg/dL, (B) 2 mg/dL, or (C) 3 mg/dL relative to the starting bilirubin value shown on the x-axis. The dotted line represents the pointwise lower 95% confidence limit.

Although this modeling indicates that the daily bilirubin measure fits the data better than the other two measures, these other measures may still add important information to the model that contains only the daily value. Table 4 summarizes the results from comparing various models via the likelihood ratio test. These results suggest that what matters most for predicting subsequent outcome is the actual bilirubin value at a particular point in time, and the route through which this value was reached is less important.

Table 4. Results From Likelihood Ratio Test Comparing Nested Models

Models Compared                                                    P Value
5th-degree average + nth-degree daily vs. 5th-degree average       < .00001 (n ≥ 1)
5th-degree maximum + nth-degree daily vs. 5th-degree maximum       < .00001 (n ≥ 1)
8th-degree daily + nth-degree maximum vs. 8th-degree daily         .87 (n = 1), .65 (n = 5)
8th-degree daily + nth-degree average vs. 8th-degree daily         .05 (n = 1), .08 (n = 2)

The association of daily bilirubin values with day +200 NRM appeared to be relatively independent of the average bilirubin value and the maximum value. When a linear term for the average value was added to the 8th-degree polynomial for daily value and an interaction between the linear terms was fit, the interaction term was not significant (P = .28). The same held when a linear term for the maximum value was added (P = .41). Moreover, the association of the daily bilirubin value with outcome was not demonstrably different according to donor type, source of stem cells, severity of disease, or age at transplant (data not shown).

Patients With Extreme Hyperbilirubinemia (>10 mg/dL).

Among the 1,419 patients studied, there were 292 (21%) whose total serum bilirubin exceeded 10 mg/dL before day +100. The NRM rate by day +200 among these patients was 230 (79%) of 292, compared with 189 (17%) of 1,127 among patients whose bilirubin never exceeded 10 mg/dL. The distribution of the day on which the serum bilirubin level first exceeded 10 mg/dL for these patients is shown in Fig. 2. This threshold was reached before day +20 in 156 patients (53%), between days +20 and +40 in 75 patients (26%), and after day +40 in 61 patients (21%). Sinusoidal obstruction syndrome (formerly known as veno-occlusive disease of the liver) was the cause for most of the cases of extreme hyperbilirubinemia that occurred before day +20, and acute GVHD was the most common cause after day +40.
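The counts in this paragraph can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only numbers quoted in the text; the helper name `percent` and the variable names are illustrative.

```python
def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


TOTAL = 1419             # patients in the cohort
DEEPLY_JAUNDICED = 292   # bilirubin exceeded 10 mg/dL before day +100
NRM_JAUNDICED = 230      # nonrelapse deaths by day +200 among those 292
NRM_OTHER = 189          # nonrelapse deaths among the remaining patients
FIRST_EXCEEDED = {
    "before day +20": 156,
    "days +20 to +40": 75,
    "after day +40": 61,
}

share_jaundiced = percent(DEEPLY_JAUNDICED, TOTAL)
nrm_jaundiced = percent(NRM_JAUNDICED, DEEPLY_JAUNDICED)
nrm_other = percent(NRM_OTHER, TOTAL - DEEPLY_JAUNDICED)
timing = {label: percent(n, DEEPLY_JAUNDICED) for label, n in FIRST_EXCEEDED.items()}
```

The three timing groups add up to the 292 deeply jaundiced patients, and the two NRM counts add up to the 419 nonrelapse deaths reported for the whole cohort.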

Figure 2.

Dot plot showing the first day that total serum bilirubin exceeded 10 mg/dL for the 292 patients who reached such a level. Open circles (○) represent patients who became deeply jaundiced but survived to day +200. Closed circles (●) represent patients who died without relapse before day +200. Daggers (†) represent patients who relapsed before day +200.

Although most patients who were deeply jaundiced died from nonrelapse causes before day +200, there were some who survived. We examined clinical parameters during a 10-day window from day +10 to +20 among patients who were deeply jaundiced at day +20 in an effort to identify a cohort of individuals that was at decreased risk of dying from nonrelapse causes by day +200. Table 5 displays the clinical factors at or before day +20 that were statistically significantly different among 14 survivors compared with 48 patients who died. Patients who survived had a lower frequency of multiorgan failure and were more likely to be platelet transfusion–independent. There were no statistically significant differences in a large number of other clinical factors between survivors and nonsurvivors, including bilirubin parameters, the underlying hematological malignancy, conditioning regimen, HLA match, GVHD prophylaxis, engraftment, documented infection, presence of GVHD, hematocrit, or any other laboratory test (data not shown). Analysis of 41 patients whose total bilirubin level was greater than 10 mg/dL at day +50 yielded similar results (data not shown); there were only 9 survivors from this cohort.

Table 5. Clinical Variables That Carry Prognostic Value in Patients Who Were Extremely Jaundiced at Day +20 Posttransplant

Variable                                                Patients Who Had Survived or Relapsed by Day +200 (n = 14)    Patients Who Died Before Day +200 (n = 48)    P Value*
Average serum bilirubin up to day +20 (mg/dL)           7.8 ± 3.3                                                     9.1 ± 3.7                                     .56
Serum bilirubin maximum up to day +20 (mg/dL)           17.42 ± 8.1                                                   21.9 ± 7.9                                    .26
Serum bilirubin at day +20 (mg/dL)                      14.4 ± 6.6                                                    19.9 ± 7.9                                    .06
Maximum serum creatinine in last 10 days (mg/dL)        1.96 ± 1.46                                                   2.79 ± 1.60                                   .02
Blood urea nitrogen maximum in last 10 days (mg/dL)     68 ± 27                                                       107 ± 37                                      <.0001
Hemodialysis in last 10 days (n)                        1/14                                                          19/48                                         .007
Number of days with fever (>38.5°C) in last 10 days     2.86 ± 2.44                                                   4.48 ± 2.95                                   .04
Pulmonary infiltrate on X ray in last 10 days (n)       3/14                                                          27/48                                         .04
Nasal oxygen use in last 10 days (n)                    5/14                                                          40/48                                         .001
Intravenous antibiotics in last 10 days (n)             10/14                                                         48/48                                         .009
Platelet transfusion independent by day +20 (n)         3/14                                                          0/48                                          .02

  • * P value obtained from the log rank test.


Discussion

Total serum bilirubin has previously been shown to be associated with increased mortality following hematopoietic cell transplant.16, 17 Clinical experience has suggested that the totality of liver injury as measured by the duration of jaundice up to a point in time would prove to be a better indicator of subsequent outcome than only the bilirubin value at that particular time or the maximum value achieved by that time. We formally tested this hypothesis by using Cox regression models to represent each of these summary measures as a time-dependent covariate. In addition, each summary measure was modeled in a flexible manner, allowing for nonlinear associations between the measure and outcome.

Each of the summary measures that we examined was positively correlated with the hazard of day +200 NRM in the time-dependent models. The total serum bilirubin value at a particular point in time was more important in predicting subsequent outcome than either the average value or maximum value up to that point in time. Moreover, once the bilirubin level at a particular point in time was known, additional knowledge of either the maximum or the average bilirubin value from transplant up to that point in time did not lead to large improvements in the model. Thus, the route by which a patient reaches a specific serum bilirubin level is far less important than what the level is at that time. Relatively small increases in bilirubin are associated with a relatively large increase in mortality when the starting bilirubin is relatively low. Although these latter findings may be intuitively apparent, quantification of these associations provides clinicians with guidelines for determining when additional therapies should be attempted, and for defining futility of further treatment.18

In many situations, development of jaundice in very ill patients with liver disease is well recognized as a sign of a poor prognosis.19, 20 The question of how hyperbilirubinemia confers an adverse prognosis after hematopoietic cell transplant cannot be completely addressed by these data. The simplest explanation is that total serum bilirubin is a marker for several serious conditions that are known to have an adverse outcome, such as sinusoidal obstruction syndrome, GVHD, sepsis syndrome, and renal failure.16, 17, 21 An alternative but not mutually exclusive explanation is that liver dysfunction per se leads to morbidity. There is compelling evidence in both animals and man that liver dysfunction causes neurological, renal, cardiovascular, and pulmonary dysfunction.22–30 In the setting of hematopoietic cell transplant, pulmonary and renal dysfunction are statistically more frequent among patients with liver injury and commonly follow development of jaundice by days to weeks.21, 31, 32 Our data show a relatively constant relation between levels of jaundice and mortality across time, even though the causes of jaundice differ over time, suggesting that the cause of jaundice is not as important in prognosis as its degree.

These findings have several implications for the care of patients undergoing allogeneic transplantation. Because treatment of transplant patients who become deeply jaundiced can be futile,18 emphasis should be on early recognition and treatment of liver injury and on modifying the transplant process to prevent liver injury. New methods for preparing patients for infusion of stem cells should result in less regimen-related liver damage than with the standard regimens used in this cohort of patients. For example, in myeloablative preparative regimens that contain busulfan, this drug is now dosed to a metabolic end point or given as an intravenous formulation.33, 34 A study of the pharmacokinetics of cyclophosphamide has shown that its metabolism is highly variable and that liver toxicity is related to increased exposure to toxins of cyclophosphamide metabolism as well as the dose of irradiation.35 Liver toxicity might be reduced by dosing cyclophosphamide according to its metabolism or by replacing cyclophosphamide with another drug that has minimal liver toxicity (e.g., fludarabine),36 and by limiting the dose of total body irradiation. There is minimal regimen-related liver toxicity from nonmyeloablative conditioning regimens (e.g., fludarabine plus total body irradiation 200 cGy).37 Prophylactic use of oral ursodiol at 10 to 15 mg/kg/d can reduce the frequency of liver injury in patients undergoing transplantation, presumably by preventing hepatocyte damage that results from prolonged cholestasis.38, 39 Prophylaxis with ursodiol may also confer a survival benefit.39, 40

An aggressive approach to diagnosis and treatment should be undertaken when the total serum bilirubin level rises above normal. Early treatment of sinusoidal obstruction syndrome may alter the prognosis of this disease.2, 41, 42 Patients with evidence of cholestatic liver injury who are not receiving ursodiol should be started on this drug.39 When the differential diagnosis of liver disease includes infections that can be treated—such as hepatitis B virus, varicella zoster virus, herpes simplex virus, adenovirus, cytomegalovirus, or a fungal or mycobacterial process1—transvenous liver biopsy should be performed to ascertain the cause.43, 44 In acute GVHD, evidence of T-cell–mediated apoptosis of small bile ducts and ductopenia are diagnostic features that should lead to treatment.45, 46 Some drugs in common use in the transplant setting are potential liver toxins (e.g., trimethoprim-sulfamethoxazole, itraconazole, voriconazole, and fluconazole).1 Cyclosporine and tacrolimus may elevate serum bilirubin through a pharmacological effect on bilirubin transport, particularly when blood levels are outside of the therapeutic range.47, 48 Our hypothesis that prevention, early diagnosis, and treatment of liver disease will lead to improved outcomes is unproven, however.

In summary, this study shows that NRM is more strongly associated with the level of total serum bilirubin on any given day than with the manner in which this level was reached. Small increases in bilirubin are accompanied by substantial increases in NRM when the starting bilirubin level is relatively low. The exact cause of jaundice may not be as important as the level of jaundice, because the risk of mortality conferred by deeper levels of jaundice is relatively constant across time. NRM among patients whose total serum bilirubin exceeds 10 mg/dL is 79%. Patients who survive despite this level of jaundice at days +20 and +50 are more likely to be without renal or pulmonary failure. Although early diagnosis and treatment of the underlying causes of jaundice after transplant are important, future research should focus on prevention of liver injury.