Liver tests and outcomes in heart failure with reduced ejection fraction: findings from DAPA‐HF

ABSTRACT Aims Reflecting both increased venous pressure and reduced cardiac output, abnormal liver tests are common in patients with severe heart failure and are associated with adverse clinical outcomes. We aimed to investigate the prognostic significance of abnormal liver tests in ambulatory patients with heart failure with reduced ejection fraction (HFrEF), explore any treatment interaction between bilirubin and sodium–glucose cotransporter 2 (SGLT2) inhibitors and examine change in liver tests with SGLT2 inhibitor treatment. Methods and results We explored these objectives in the Dapagliflozin And Prevention of Adverse outcomes in Heart Failure (DAPA‐HF) trial, with focus on bilirubin. We calculated the incidence of cardiovascular death or worsening heart failure by bilirubin tertile. Secondary cardiovascular outcomes were examined, along with the change in liver tests at the end‐of‐study visit. Baseline bilirubin was available in 4720 patients (99.5%). Participants in the highest bilirubin tertile (T3) had more severe HFrEF (lower left ventricular ejection fraction, higher N‐terminal pro‐B‐type natriuretic peptide [NT‐proBNP] and worse New York Heart Association class) and a greater burden of atrial fibrillation, but less diabetes. Higher bilirubin (T3 vs. T1) was associated with worse outcomes even after adjustment for other predictive variables, including NT‐proBNP and troponin T (adjusted hazard ratio for the primary outcome 1.73 [95% confidence interval 1.37–2.17], p < 0.001; and 1.52 [1.12–2.07], p = 0.01 for cardiovascular death). Baseline bilirubin did not modify the benefits of dapagliflozin. During follow‐up, dapagliflozin had no effect on liver tests. Conclusion Bilirubin concentration was an independent predictor of worse outcomes but did not modify the benefits of dapagliflozin in HFrEF. Dapagliflozin was not associated with change in liver tests. Clinical Trial Registration: ClinicalTrials.gov NCT03036124.

Abnormal liver tests are common in patients with severe heart failure (HF) and are associated with adverse clinical outcomes, including death. [1][2][3][4][5][6] Abnormal liver tests are thought to reflect both increased venous pressure and reduced cardiac output. [7][8][9] Much less is known about the prevalence or the prognostic significance of abnormal liver tests in ambulatory patients with HF, especially in such individuals receiving contemporary treatments. 10,11 In the Candesartan in Heart Failure: Assessment of Reduction in Mortality and Morbidity (CHARM) program, alkaline phosphatase (ALP) was elevated in 14.0% of patients, total bilirubin in 13.0%, alanine aminotransferase (ALT) in 3.1%, and aspartate aminotransferase (AST) in 4.1% of patients. 10 In CHARM, bilirubin was the most powerful prognostic liver test and remained an independent predictor in a multivariable model, although that model did not include a natriuretic peptide. More recently, in the Prospective Comparison of ARNI (Angiotensin Receptor-Neprilysin Inhibitor) with ACEI (Angiotensin-Converting Enzyme Inhibitor) to Determine Impact on Global Mortality and Morbidity in Heart Failure (PARADIGM-HF) trial, 11.6% of patients with HF with reduced ejection fraction (HFrEF) were found to have elevated bilirubin at baseline; bilirubin was again the most predictive liver test and remained independently so in a multivariable model including N-terminal pro-B-type natriuretic peptide (NT-proBNP). 11 Whether bilirubin remains predictive when additional prognostic biomarkers such as high-sensitivity troponin T are included is not known. 12
Recently, sodium-glucose cotransporter 2 (SGLT2) inhibitors have been introduced as a treatment for HFrEF. 13,14 Due to their effect on proximal renal tubular reabsorption of glucose, coupled with sodium, these agents cause an initial osmotic diuresis and natriuresis which might relieve hepatic congestion and improve liver tests. [15][16][17] It has also been suggested that SGLT2 inhibitors might reduce liver fat in patients with type 2 diabetes, a condition linked to obesity and frequently associated with non-alcoholic fatty liver disease. [18][19][20] We have investigated the prevalence and predictive importance of abnormal liver tests in the Dapagliflozin And Prevention of Adverse outcomes in Heart Failure (DAPA-HF) trial and the effect of dapagliflozin on liver tests in this trial. 13,21

Methods
DAPA-HF was a randomized, double-blind, placebo-controlled trial in patients with HFrEF, which evaluated the efficacy and safety of dapagliflozin 10 mg once daily, added to standard care. 13,21 Ethics Committees at each of the 410 participating institutions in 20 countries approved the protocol, all patients provided written informed consent and the study complied with the Declaration of Helsinki.

Study patients
Men and women aged ≥18 years, in New York Heart Association (NYHA) functional class II-IV, with a left ventricular ejection fraction (LVEF) ≤40%, and an elevated NT-proBNP level, were eligible provided they were receiving optimal pharmacological and device therapy in the opinion of the investigator. 13,21 The main exclusion criteria included type 1 diabetes mellitus, symptomatic hypotension, systolic blood pressure <95 mmHg and estimated glomerular filtration rate (eGFR) <30 ml/min/1.73 m². Patients with an AST or ALT more than three times the upper limit of normal, or total bilirubin more than two times the upper limit of normal, were also excluded, as were patients judged to have a life expectancy of <2 years due to a condition other than HF. 13

Measurement of liver tests
Alkaline phosphatase, ALT, AST, and total bilirubin were measured at enrolment and at the end-of-study visit. Samples were processed in a central laboratory. Our main analysis of change in liver tests used the end-of-study measurement, which could fall at 12, 16, 20, or 24 months. Although not required, some patients had measurements taken at other scheduled visits, at the investigator's discretion, or at unscheduled visits. In an exploratory analysis, we included the results from this unplanned sampling, allocating them to the nearest scheduled visit.

Pre-specified trial outcomes
The primary outcome of DAPA-HF was the composite of worsening HF (HF hospitalization or urgent visit for HF requiring intravenous therapy) or cardiovascular (CV) death, whichever occurred first. Pre-specified secondary endpoints included HF hospitalization or CV death; and HF hospitalizations (first and recurrent) and CV deaths. The change from baseline to 8 months in Kansas City Cardiomyopathy Questionnaire total symptom score (KCCQ-TSS) was an additional secondary endpoint, with the proportion having a 5-point or more increase or decrease in their score at 8 months determined as previously described. 13,21 There was also a pre-specified secondary renal composite outcome, but this was not evaluated further in this study because of the small number of events.
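The 5-point KCCQ-TSS threshold described above can be illustrated with a minimal sketch. This is not the trial's analysis code: the function name `classify_kccq_change` is hypothetical, and the sketch assumes that both scores are available and ignores how deaths or missing questionnaires were handled in the published method.

```python
def classify_kccq_change(baseline: float, month8: float, threshold: float = 5.0) -> str:
    """Label a patient's change in KCCQ-TSS from baseline to 8 months.

    A rise of at least `threshold` points counts as improvement and a fall of
    at least `threshold` points as deterioration (5 points in the text).
    """
    change = month8 - baseline
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "deteriorated"
    return "unchanged"


# Hypothetical scores, for illustration only.
print(classify_kccq_change(60.0, 68.0))
print(classify_kccq_change(60.0, 54.0))
```

The proportions compared between treatment groups would then be the share of patients in each category.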

Definition of elevated liver tests
The upper limits of normal were 35 IU/L for AST and ALT, 1.0 mg/dl for bilirubin, and 120 IU/L for ALP. 22 Given the evidence that bilirubin is the most prognostically important liver test, this was the focus of our analysis of the association with subsequent clinical outcomes, as described in the statistical analysis section below.
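As an illustration of these definitions, the following sketch flags an elevated result and checks the liver-related exclusion thresholds given under Study patients. The function names are hypothetical, and treating "elevated" as strictly greater than the upper limit of normal is an assumption rather than something stated in the protocol.

```python
# Upper limits of normal (ULN) as stated in the text:
# IU/L for AST, ALT and ALP; mg/dl for bilirubin.
ULN = {"AST": 35.0, "ALT": 35.0, "bilirubin": 1.0, "ALP": 120.0}

# Exclusion thresholds as multiples of the ULN (AST or ALT > 3x, bilirubin > 2x).
EXCLUSION_MULTIPLE = {"AST": 3.0, "ALT": 3.0, "bilirubin": 2.0}


def is_elevated(test: str, value: float) -> bool:
    """Return True when a value is above the ULN (assumed strict inequality)."""
    return value > ULN[test]


def meets_liver_exclusion(values: dict) -> bool:
    """Return True when any supplied value exceeds its exclusion threshold."""
    return any(
        value > EXCLUSION_MULTIPLE[test] * ULN[test]
        for test, value in values.items()
        if test in EXCLUSION_MULTIPLE
    )


# Hypothetical patient values, for illustration only.
patient = {"AST": 40.0, "ALT": 30.0, "bilirubin": 1.4, "ALP": 95.0}
print({test: is_elevated(test, value) for test, value in patient.items()})
print(meets_liver_exclusion(patient))
```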

Statistical analysis
Patients were grouped by baseline bilirubin measurement into tertiles and baseline characteristics were summarized as means (standard deviations), medians (interquartile ranges [IQR]), or percentages. Logistic regression was used to explore associations with elevated bilirubin at baseline: candidate variables were first examined in univariable models, and those with a p-value <0.2 were then entered into a stepwise logistic regression model.
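Grouping by tertile can be sketched as follows. The cut-point method (`statistics.quantiles` with the inclusive method) and the handling of values that fall exactly on a cut point are assumptions; the text does not state how the trial analysis defined the boundaries, and the values shown are invented.

```python
from statistics import quantiles


def assign_tertiles(values: list) -> list:
    """Label each value T1, T2 or T3 using the two tertile cut points."""
    # quantiles(..., n=3) returns the two cut points dividing the data in three.
    low_cut, high_cut = quantiles(values, n=3, method="inclusive")
    labels = []
    for value in values:
        if value <= low_cut:
            labels.append("T1")
        elif value <= high_cut:
            labels.append("T2")
        else:
            labels.append("T3")
    return labels


# Hypothetical baseline bilirubin values in mg/dl, for illustration only.
bilirubin = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
print(assign_tertiles(bilirubin))
```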
Kaplan-Meier estimates and Cox proportional-hazards models, stratified by diabetes status, and adjusted for treatment-group assignment and history of HF hospitalization (except for all-cause death) were used to examine the primary and secondary outcomes across bilirubin tertiles, with further models adjusted for known predictors of risk of HF endpoints (age, sex, race, region, systolic blood pressure, heart rate, LVEF, eGFR, NT-proBNP [log-transformed], NYHA class, hypertension, previous stroke, previous myocardial infarction, atrial fibrillation, and HF aetiology). A second adjusted model included the same listed variables and the addition of high-sensitivity troponin T (log-transformed). A semi-parametric proportional-rates model was used to evaluate recurrent HF hospitalizations and CV death. 23 Each liver test was considered as a continuous variable in Cox regression models for the same outcomes and adjustments after being log-transformed to normalize distribution. The relationship between continuous liver tests and each outcome was further explored using restricted cubic splines to examine for a non-linear relationship.
As NT-proBNP and troponin T are established powerful predictors of outcome in HF, the incremental predictive value of bilirubin added to these biomarkers was examined. Rates of the primary outcome were assessed in groups defined by tertile of both bilirubin and either NT-proBNP or troponin T to evaluate the effect of elevation of both markers on the occurrence of the primary outcome. Groups defined by combinations of tertiles of both biomarkers were compared in a Cox model.
The effect of randomized treatment on outcomes within each tertile of bilirubin was evaluated and modification of treatment effects by baseline bilirubin tertile was assessed using a global interaction test. The difference between treatment groups in the proportion of patients with a clinically significant (≥5 points) improvement or deterioration in KCCQ-TSS at 8 months was analysed using the methods described previously and presented as an odds ratio for each baseline bilirubin category. 13,21 The effect of dapagliflozin compared with placebo on each endpoint was examined across the range of baseline bilirubin as a continuous variable using restricted cubic splines. This was repeated for the other liver tests for the primary endpoint.
Change in liver tests was analysed using a least-square means regression and by the ratio of geometric means between baseline and end-of-study visit. To use additional unplanned samples, recordings at 12 or 16 months of follow-up were combined (a single recording at 12 or 16 months or, if both were present, the mean of the two recorded values), as were recordings at 20 and 24 months of follow-up, and these were analysed in the same manner.
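The two calculations described in this paragraph can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names are hypothetical, and the geometric mean ratio is computed directly from raw values, whereas the trial analysis may have derived it from a regression model on log-transformed values.

```python
import math
from typing import Optional


def combine_visits(first: Optional[float], second: Optional[float]) -> Optional[float]:
    """Combine two optional recordings (for example at 12 and 16 months).

    Returns the mean if both are present, the single value if only one is
    present, and None if neither was recorded.
    """
    present = [x for x in (first, second) if x is not None]
    if not present:
        return None
    return sum(present) / len(present)


def geometric_mean(values: list) -> float:
    """Geometric mean, computed as the exponential of the mean of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geometric_mean_ratio(baseline: list, follow_up: list) -> float:
    """Ratio of the follow-up geometric mean to the baseline geometric mean."""
    return geometric_mean(follow_up) / geometric_mean(baseline)


# Hypothetical bilirubin values in mg/dl, for illustration only.
print(combine_visits(0.6, 0.8))
print(geometric_mean_ratio([0.5, 0.8, 1.1], [0.6, 0.8, 1.0]))
```

A ratio of 1 corresponds to no change from baseline; values above or below 1 indicate a proportional increase or decrease.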
Safety analyses were performed in randomized patients who had received at least one dose of dapagliflozin or placebo. The interaction between baseline bilirubin tertile and randomized treatment on the occurrence of the pre-specified safety outcomes was tested in a logistic regression model.
All analyses were conducted using Stata version 17.0 (StataCorp, College Station, TX, USA) and SAS version 9.4 (SAS Institute, Cary, NC, USA). A p-value <0.05 was considered statistically significant.

Baseline characteristics
There were many significant differences according to baseline bilirubin level (Table 1). Each of ALP, ALT and AST was higher in participants with higher bilirubin. Participants with higher bilirubin were more likely to be male (T3 85.3% vs. T1 67.6%), to have lower systolic blood pressure (120.0 ± 15.6 vs. 123.8 ± 16.9 mmHg), lower LVEF (30.5 ± 7.0% vs. 31 14.5% with baseline electrocardiogram in atrial fibrillation or flutter) (Table 1). Patients in the highest bilirubin tertile were more likely than those in the lowest tertile to be treated with digoxin, an oral anticoagulant or a mineralocorticoid receptor antagonist, and were less often treated with an antiplatelet agent and a statin. Among patients with diabetes, fewer in the highest bilirubin tertile were prescribed treatments for diabetes, although glycated haemoglobin was similar across baseline bilirubin tertiles. The baseline characteristics independently associated with bilirubin, identified through stepwise logistic regression, are shown in online supplementary Table S1. Higher NT-proBNP, higher haemoglobin, atrial fibrillation, higher (worse) NYHA class, male sex, lower pulse pressure, higher ALP, higher AST and lower KCCQ-TSS were associated with bilirubin above the normal range at baseline.

Cardiovascular outcomes according to baseline bilirubin

Primary and secondary trial outcomes related to bilirubin level
Incidence rates for the primary and secondary outcomes of the trial were substantially higher in patients in bilirubin T3, compared to T1 (Table 2, Figure 1). The elevated risk associated with higher bilirubin persisted after comprehensive adjustment for other predictors of worse outcomes, including LVEF, NT-proBNP and troponin T, with a fully adjusted hazard ratio (aHR) in bilirubin T3 versus T1 for the primary outcome of 1.73 (95% confidence interval [CI] 1.37-2.17, p < 0.001). The aHR for CV death (T3 vs. T1) was 1.52 (1.12-2.07; p = 0.01). Given that more patients in the highest bilirubin tertile were male, this analysis was repeated in male patients only, with consistent results (online supplementary Table S2).
Analyses using baseline bilirubin concentration as a continuous variable showed an essentially linear relationship between event rates and bilirubin level (Figure 2). For each unit increase in log-transformed total bilirubin, in adjusted Cox models, the aHR for the primary endpoint was 1.66 (1.39-1.98; p < 0.001); for hospitalization or urgent visit for HF 1.94 (1.56-2.40; p < 0.001); for death from CV causes 1.46 (1.16-1.85; p = 0.001); and for death from any cause 1.33 (1.07-1.64; p = 0.01) (online supplementary Table S3).
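A hazard ratio expressed per unit of log-transformed bilirubin can be rescaled to a more familiar contrast, such as a doubling of bilirubin. The sketch below assumes that the natural logarithm was used for the transformation, which the text does not specify; the function name is hypothetical. Under that assumption, a one-unit increase on the log scale corresponds to an approximately 2.72-fold increase in bilirubin.

```python
import math


def hazard_ratio_for_fold_change(hr_per_log_unit: float, fold_change: float) -> float:
    """Rescale a hazard ratio per natural-log unit to a given fold change.

    In a Cox model with ln(bilirubin) as covariate, the log hazard ratio for a
    k-fold increase is the per-unit coefficient multiplied by ln(k).
    """
    coefficient = math.log(hr_per_log_unit)
    return math.exp(coefficient * math.log(fold_change))


# aHR for the primary endpoint reported in the text: 1.66 per log unit.
doubling_hr = hazard_ratio_for_fold_change(1.66, 2.0)
print(round(doubling_hr, 2))
```

If a different logarithm base was used in the original models, the rescaled value would differ accordingly.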
Analysed as a continuous variable, higher levels of ALP were associated with a higher risk of the primary outcome, components of the primary outcome, death from CV causes and from any cause, and recurrent HF hospitalization and CV death. In the adjusted model including NT-proBNP and high-sensitivity troponin T, the relationship remained significant for death from any cause only. Neither AST nor ALT level was associated with the risk of any outcome in either adjusted or unadjusted models (online supplementary Table S3). supplementary Figure S2). The lack of relationship between AST and ALT and any outcome was confirmed.
Allocating patients to groups defined by tertiles of bilirubin and troponin T showed a marked increase in the rate of the primary outcome when both bilirubin and troponin T were in the highest tertile (Graphical Abstract). A similar outcome was seen with tertiles of bilirubin and NT-proBNP (Graphical Abstract).

Effect of dapagliflozin on primary and secondary trial outcomes
The efficacy of dapagliflozin in preventing the primary outcome of CV death or worsening HF did not differ across bilirubin tertiles (p for interaction = 0.07). The efficacy of dapagliflozin in preventing CV death, worsening HF events and all-cause death also did not differ by bilirubin tertile (Table 3, Graphical Abstract). The results were similar when bilirubin was treated as a continuous variable (online supplementary Figure S3). The proportion of patients with a 5-point or more decrease in KCCQ-TSS (worsening) was smaller in those randomized to dapagliflozin, and the proportion of patients with a 5-point or more increase in KCCQ-TSS (improvement) was higher in those randomized to dapagliflozin, irrespective of baseline bilirubin tertile (Table 3). There was no significant interaction between ALP, AST or ALT and randomized treatment on the occurrence of the primary outcome when the liver tests were modelled as continuous variables (online supplementary Figure S4).

Effect of dapagliflozin on liver tests
End-of-study visit samples were spread as follows: 13.5% at 12 months, 25.9% at 16 months, 33.5% at 20 months, 23.9% at 24 months, and 3.2% at 28 months. Although this analysis suggested a small increase in bilirubin at the end-of-study visit in patients assigned to dapagliflozin (Table 4), the supplementary analysis using additional results from unplanned samples showed no change in bilirubin or any other liver tests with dapagliflozin (online supplementary Table S4). There was no difference in change in liver tests when diabetic patients were analysed separately (data not shown).

Safety and adverse events
Each of the adverse events of interest was uncommon. A similar proportion of patients experienced adverse events across bilirubin tertiles (online supplementary Table S5). The rate of adverse events did not differ notably between patients assigned to placebo and those assigned to dapagliflozin, in any bilirubin tertile (online supplementary Table S5).

Discussion
In a contemporary, well-treated ambulatory cohort of patients with HFrEF, most of whom had mild symptoms, the prevalence of abnormal liver tests was low (ranging from 8% to 15% for the various liver tests measured), although patients with significant hepatic disease were not enrolled in DAPA-HF. Bilirubin was the most frequently elevated liver test, and it remained an independent predictor of outcomes, despite adjustment for other prognostic variables, including NT-proBNP and high-sensitivity troponin T, a finding we believe has not been reported before. ALP was also independently predictive of outcome. The benefit of dapagliflozin was consistent across the range of bilirubin concentrations measured at baseline. (11.6%) and in the patients with reduced ejection fraction (15.8%) recruited approximately 20 years ago in CHARM when background therapy was markedly different. 10,11 In all three studies, bilirubin was associated with both the composite outcome of CV death or worsening HF and all-cause mortality in predictive models including an array of clinical and routinely measured biochemical variables. In PARADIGM-HF, bilirubin was an independent predictor of each outcome in multivariable models including these variables and NT-proBNP, which is the single most powerful predictor of outcomes in HFrEF (NT-proBNP was not measured in CHARM). 11,24 High-sensitivity troponin has emerged as one of the few additional biomarkers to consistently add prognostic information when added to NT-proBNP. 12,25,26 In DAPA-HF we tested whether bilirubin retained its independent predictive value even in models containing both NT-proBNP and troponin, in addition to clinical variables. We found that bilirubin provided incremental prognostic information even when added to these other biomarkers, and the fully adjusted hazard ratio for each outcome related to bilirubin level was not attenuated to any significant extent compared to the unadjusted hazard ratio.
The interesting question is what aspect of HF pathophysiology is measured by bilirubin. Bilirubin is associated with high central venous/right atrial pressure in patients with HF and this finding raises the possibility that bilirubin provides different information about central haemodynamics than NT-proBNP, which may be more reflective of left- than right-sided pressures. [7][8][9] Because bilirubin is associated with high central venous/right atrial pressure, we anticipated that dapagliflozin would reduce bilirubin as SGLT2 inhibitors have a diuretic action that might alleviate hepatic congestion. [15][16][17] However, we did not find this, possibly because the diuretic action of SGLT2 inhibitors is short-lived and might not have led to a sustained decrease in bilirubin (there were small numbers of bilirubin measurements before 12 months after randomization, therefore early change could not be assessed). [15][16][17] Moreover, trials examining haemodynamic measurements have reported inconsistent findings. In a recent placebo-controlled invasive haemodynamic study, 3 months of treatment with empagliflozin did not reduce right-sided pressures. 27 significantly reduced pulmonary artery end-diastolic pressure after 1 week, with a difference of 1.7 (95% CI 0.3-3.2) mmHg, compared with placebo (p = 0.02), by 12 weeks. 28 This lack of effect on bilirubin contrasts with the observation that sacubitril/valsartan, when compared with enalapril, did reduce bilirubin significantly in PARADIGM-HF. 11 It is not clear why these two trials differed in this respect. Although there is no randomized controlled trial of the central haemodynamic effects of sacubitril/valsartan, this agent may have greater effects on preload and afterload than SGLT2 inhibitors as indirectly suggested by the much larger reduction in NT-proBNP with sacubitril/valsartan compared to SGLT2 inhibitors. [29][30][31][32]
Transaminase levels are not increased as often as bilirubin in patients with HF and transaminase levels may reflect a decrease in cardiac index and liver blood flow more than the elevation of central venous pressure. [7][8][9] In DAPA-HF dapagliflozin treatment was not associated with a reduction in transaminases. By contrast, sacubitril/valsartan did reduce AST and ALT in PARADIGM-HF, suggesting either a specific effect of neprilysin inhibition on transaminase activity or, more likely, a greater effect of sacubitril/valsartan on preload and afterload, compared with an SGLT2 inhibitor. 11 Consistent with this hypothesis, one recent study showed that the improvement in cardiac index with intravenous vasodilator and inotropic therapy in patients with advanced HF was maintained following a switch to sacubitril/valsartan (although this was not a controlled trial). 33 SGLT2 inhibitors do decrease transaminases in patients with type 2 diabetes, but this is not thought to be haemodynamically mediated and, instead, probably reflects a reduction in visceral fat, accumulation of which is not thought to be a feature of HFrEF. [18][19][20] Baseline bilirubin level did not modify the effect of dapagliflozin, as was also observed with sacubitril/valsartan (but was not examined with candesartan). This is important because these treatments are beneficial even in patients at high risk related to elevated bilirubin levels and the absolute benefits in such patients are large.

Study limitations
This was not a pre-specified analysis. The inclusion and exclusion criteria applied may have limited the generalizability of our findings. Specifically, patients with an AST or ALT more than three times the upper limit of normal (or bilirubin greater than twice the upper limit of normal) were excluded. We did not collect data on history of liver disease or alcohol intake. Other measures reflecting hepatic function, including albumin, platelet count and international normalized ratio, were not measured in DAPA-HF. We did not collect information on right-sided filling pressures or right ventricular function. Regional variation in prevalence of sub-hepatitis may account for some of the variation in abnormal liver tests, as patients were not screened for viral hepatitis. 34 Scheduled sampling of liver tests occurred only at baseline and the end-of-study visit (between 12 and 28 months after randomization). Including only end-of-study visits may have introduced survivor bias, and our supplementary analysis, including unscheduled visits, may not be a representative sample as there was an indication for additional investigation.

Conclusion
Baseline bilirubin concentration was an independent predictor of worse outcomes but did not modify the benefits of dapagliflozin on morbidity and mortality in HFrEF.