Prognostic value of multiparametric magnetic resonance imaging, transient elastography and blood‐based fibrosis markers in patients with chronic liver disease

Liver cT1, liver T1, transient elastography (TE) and blood‐based biomarkers have independently been shown to predict clinical outcomes but have not been directly compared in a single cohort of patients. Our aim was to compare these tests’ prognostic value in a cohort of patients with compensated chronic liver disease.


| BACKGROUND AND AIMS
Chronic liver diseases (CLDs) affect an estimated 1.5 billion people worldwide, 1 with the predominant causes being non-alcoholic fatty liver disease (NAFLD), 2 chronic viral hepatitis (VH) B and C 3 and alcohol. 4 Liver fibrosis is the final common pathway of injury in CLDs. Once fibrosis progresses to cirrhosis, each year approximately 5%-7% will become decompensated 5 (develop variceal bleeding, ascites and hepatic encephalopathy). Cirrhosis is the 11th most common cause of mortality worldwide 6 and liver cancer is reported as the 4th most common cause of cancer-related mortality. 7 In the face of this public health epidemic there is an unmet clinical need to stratify disease severity in patients with CLD and to flag patients at risk of decompensation as early as possible. Liver biopsy is used routinely in clinical practice to assess fibrosis stage and inform prognosis, 8 but is unsuitable for routine longitudinal follow-up of patients because of its invasive nature, associated risk and cost.
The magnetic resonance imaging (MRI) parameter T1, when corrected for iron content (cT1), 9 can quantify extracellular water content, which rises with fibrosis and inflammation. Liver cT1 correlates with liver fibrosis 10 and severity of steatohepatitis, 11 has excellent repeatability and reproducibility, 12 and is used in the UK Biobank population health study as the reference for liver fibroinflammatory disease. 13 Liver cT1, when stratified into groups according to a liver inflammation and fibrosis (LIF) score, was found to predict clinical outcomes in a general hepatology outpatient setting by Pavlides et al. 14 Liver cT1 is no longer converted into the LIF score, and refinement of the algorithm used to generate liver cT1 now enables it to be standardised across MRI scanner field strengths and vendors. Liver T1 has also been shown to be associated with development of clinical outcomes in other cohorts. 15,16 Transient elastography (TE) correlates with fibrosis 17,18 and can predict clinical outcomes. 19-21 Fibrosis-4 (FIB-4), aspartate aminotransferase to platelet ratio index (APRI) and AST/ALT ratio are commonly used markers of liver fibrosis/inflammation and have also been shown to predict clinical outcomes in patients with CLD. 22-25 To date there has been no study evaluating the prognostic significance of MRI, TE and blood-based markers alongside histology as a reference.
The primary aim of this study was to assess the prognostic value of liver cT1 in a larger cohort of patients with CLD, and over longer follow-up, than in the study by Pavlides et al. Secondary aims were to assess the relative prognostic value of other liver-related biomarkers, including liver fat measured by MR proton spectroscopy (1H-MRS), liver iron measured by T2*, transient elastography (TE), histological fibrosis stage and blood-based composite scores (FIB-4, APRI and AST/ALT ratio).

| Patient selection
This study follows up patients from two cohort studies primarily assessing the role of MRI in liver disease assessment (details in Table S1). Patients referred for a clinically indicated liver biopsy, or with a known diagnosis of liver cirrhosis, were eligible to take part. At baseline, patients underwent multiparametric MRI scans, TE and blood sampling following a minimum of 4 hours fasting.

| MR Analysis
Spectroscopic data were fitted using the OXSA toolbox 26 implementation of the AMARES algorithm 27 with an in-house MATLAB (The Mathworks, Natick, MA, USA) script. 28 Liver fat fraction was calculated as the fat signal divided by the total fat and water signal.
LiverMultiScan™ is a software product, developed specifically to measure T1, T2* (iron content) and cT1 (liver fibroinflammation), and was used for the image analysis in this study. Images were analysed by trained analysts blinded to the clinical and histological data (for details, see Supplementary methods; for example images, see Figure S1).
In previous publications, 14 cT1 was tri-linearly mapped to a discrete scoring system graded from 0 to 4, named the LIF score, in an attempt to align with histological assessment gradations. The LIF score is no longer supported by LiverMultiScan™, and here we report cT1 values.

Lay summary
Liver disease affects people in different ways with some more likely to get worse than others. Liver biopsy involves passing a needle into the body to collect a sample of the liver. Liver biopsy helps assess how severe the liver disease is and the likelihood it will get worse in the future. Liver biopsies, however, are expensive and carry a risk of bleeding and other complications. There is, therefore, a need for alternative non-invasive tests that can inform clinical decision making as a liver biopsy would, but without the risks.
In this study, we assessed whether 3 types of non-invasive test could predict which patients' liver disease would worsen in the future. We found that magnetic resonance imaging (MRI) scans, measurement of liver stiffness (called transient elastography (TE)) and blood-based tests can all help identify patients who are more likely to get worse liver disease in the future.

| Histological assessment
All liver biopsies were performed as part of the patients' clinical care using 18G needles for percutaneous biopsies and 19G needles for transjugular biopsies (quality and timing reported in Supplementary Information). All biopsies were included in the analysis as they were used to inform clinical care irrespective of the core length or number of portal tracts. All biopsies were assessed for the Ishak stage by a specialist liver histopathologist blinded to the imaging data. Fibrosis severity was defined as mild (Ishak F0-2), moderate (Ishak F3-4) and severe (Ishak F5-6).
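The three severity groups defined above can be written as a small lookup. This is only an illustrative sketch; the function name `fibrosis_severity` is hypothetical and does not come from the study's own analysis code.

```python
def fibrosis_severity(ishak_stage: int) -> str:
    """Map an Ishak fibrosis stage (0-6) to the study's three severity groups."""
    if not 0 <= ishak_stage <= 6:
        raise ValueError("Ishak stage must be an integer from 0 to 6")
    if ishak_stage <= 2:
        return "mild"      # Ishak F0-2
    if ishak_stage <= 4:
        return "moderate"  # Ishak F3-4
    return "severe"        # Ishak F5-6
```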

| Blood samples
Blood samples were taken on the same day as the MRI scan. Simple scores for serum-based fibrosis markers Fibrosis-4 (FIB-4), 29 aspartate aminotransferase/alanine aminotransferase (AST/ALT) ratio and AST to platelet ratio index (APRI) 30 were calculated.
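For readers unfamiliar with these scores, the commonly published formulas can be sketched as follows. The function and parameter names are illustrative, and the AST upper limit of normal (40 U/L here) is an assumption: it is laboratory specific and is not stated in this paper.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, platelets_10e9_l, ast_uln_u_l=40.0):
    """APRI = (AST / upper limit of normal AST) x 100 / platelets.

    The default upper limit of normal (40 U/L) is an assumed value.
    """
    return (ast_u_l / ast_uln_u_l) * 100.0 / platelets_10e9_l


def ast_alt_ratio(ast_u_l, alt_u_l):
    """AST/ALT ratio."""
    return ast_u_l / alt_u_l
```

As a made-up example, a 60-year-old patient with AST 80 U/L, ALT 64 U/L and platelets 150 × 10^9/L would have FIB-4 = 4.0, APRI of about 1.33 and an AST/ALT ratio of 1.25.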

| Transient elastography (liver stiffness)
Transient elastography (TE) measurements of liver stiffness were performed using Fibroscan® (Echosens, Paris) by trained operators with no prior experience. The M probe was used first and, if unable to obtain a measurement, the XL probe was then used. For a successful measurement, 10 valid readings were required and, as per recommended guidelines, unreliable readings were defined as having an IQR/median > 0.3, or success rate < 60%. Failure was defined as no measurement obtained using either M or XL probes.
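The reliability rules above can be expressed as a short classifier. This is a sketch under stated assumptions: the name `classify_te` and its inputs are hypothetical, and treating an examination with fewer than 10 valid readings as "unreliable" (rather than as a failure) is an interpretation, since the text defines failure only as no measurement being obtained.

```python
def classify_te(median_kpa, iqr_kpa, n_valid, n_attempts):
    """Classify a TE examination as 'failed', 'unreliable' or 'reliable'."""
    if n_valid == 0:
        # No measurement obtained with either probe.
        return "failed"
    success_rate = n_valid / n_attempts
    if (
        n_valid < 10                      # assumption: too few valid readings
        or iqr_kpa / median_kpa > 0.3     # IQR/median > 0.3
        or success_rate < 0.60            # success rate < 60%
    ):
        return "unreliable"
    return "reliable"
```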

| Clinical follow up
Outcome data were extracted from the patients' electronic medical records. The primary clinical endpoint was a composite endpoint comprising ascites, variceal bleeding, hepatic encephalopathy, hepatocellular carcinoma (HCC), liver transplantation and mortality. A secondary endpoint was all-cause mortality. Patient records were reviewed by two researchers (blinded for review), both blinded to the patients' MRI and liver-related biomarker results. Records were reviewed up to the patient's latest medical evaluation or death, and follow-up was censored at a maximum of 72 months. Disagreements were adjudicated by a senior clinician (blinded for review). Patients were considered "lost to follow-up" if they did not return for any clinical follow-up after their baseline assessments. Where multiple events occurred, only the first event was counted in analysis.
Patients were excluded if they had decompensated liver disease at baseline as a non-invasive liver test is unlikely to be needed in these patients.
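The follow-up rules in this section (first event only, censoring at the latest review or at 72 months) can be sketched as a function that turns a patient's record into a time-to-event pair. The function name and the representation of times as months since baseline are assumptions made for illustration.

```python
MAX_FOLLOW_UP_MONTHS = 72.0


def time_to_first_event(event_months, last_review_months):
    """Return (time_in_months, event_observed) for one patient.

    event_months: months from baseline to each composite-endpoint event.
    last_review_months: months from baseline to the latest record review.
    """
    # Follow-up ends at the latest review or at 72 months, whichever is first.
    cap = min(last_review_months, MAX_FOLLOW_UP_MONTHS)
    in_window = [t for t in event_months if t <= cap]
    if in_window:
        # Only the first event is counted.
        return min(in_window), True
    # No event within follow-up: the patient is censored.
    return cap, False
```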

| Statistical analysis
Baseline statistics are described as mean ± standard deviation (SD) for normally distributed variables and median (interquartile range (IQR)) for non-normally distributed variables. The primary outcome was survival free of the composite endpoint, hereafter termed a 'clinical event'. All-cause mortality was a secondary endpoint. Survival analysis was carried out on all the available data for each outcome with each biomarker; missing results were excluded from analysis and no imputation was performed. Kaplan-Meier curves were compared using the log-rank test to evaluate the significance of survival differences for binary and grouped cut-offs. All P values quoted for binary cut-offs were calculated using the log-rank test. Cox proportional hazards analysis was used to calculate hazard ratios (HRs).
The primary variable of interest was liver cT1. Cut-off values were predefined, with cT1 > 825 ms corresponding to the previously reported LIF score ≥ 2, 14 and the best (Youden index) cut-off was also assessed. Liver cT1 was also grouped using the cut-offs corresponding to 90% sensitivity and 90% specificity for identification of clinical events. To determine the importance of iron correction, we also assessed uncorrected liver T1 at 825 ms and at its best cut-off.
Given the spatial heterogeneity of disease often seen in autoimmune hepatitis (AIH) and biliary liver disease, which could affect the reliability of MRI and TE through sampling error, subgroup analysis was conducted for event-free survival in patients with only the 3 main liver disease aetiologies (NAFLD, ArLD and VH), where disease distribution is more homogeneous.
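A best cut-off by the Youden index is the threshold that maximises J = sensitivity + specificity - 1. The sketch below shows one simple way to find it, assuming a test is positive when the biomarker value is strictly above the cut-off. Function names are hypothetical and this is not the authors' R code.

```python
def sens_spec(values, events, cutoff):
    """Sensitivity and specificity when 'value > cutoff' is a positive test."""
    tp = sum(1 for v, e in zip(values, events) if e and v > cutoff)
    fn = sum(1 for v, e in zip(values, events) if e and v <= cutoff)
    tn = sum(1 for v, e in zip(values, events) if not e and v <= cutoff)
    fp = sum(1 for v, e in zip(values, events) if not e and v > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(values, events):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    Every observed value is tried as a candidate cut-off; ties keep the
    lowest candidate.
    """
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(values)):
        sens, spec = sens_spec(values, events, c)
        j = sens + spec - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j
```

This kind of cut-off analysis treats each patient as simply having or not having an event, so it does not account for differing follow-up times.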
To address the influence of technical rate of failure/unreliability of tests, separate intention to diagnose (ITD) analysis was conducted for event-free survival taking into account all attempted measurements, including technical failures and unreliable results. Unreliable results were included, and failed attempts were assigned either as false positive or false negative depending on the patient's outcome, generating 3 x 2 tables (Table S2). 31 Non-invasive tests were assessed for event-free survival by multivariate Cox proportional hazards analysis in variables with sufficient data available.
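The ITD bookkeeping for failed tests described above can be sketched as follows. A failed test in a patient who went on to have an event can only count against sensitivity (false negative), and a failed test in a patient without an event can only count against specificity (false positive). Names are illustrative, and results are assumed to be coded as `True` (positive), `False` (negative) or `None` (failed).

```python
def itd_counts(results, events):
    """Tally a 2 x 2 table in which failed tests count as misclassifications."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for result, event in zip(results, events):
        if result is None:
            # Failed test: assigned against the test according to outcome.
            key = "fn" if event else "fp"
        elif result:
            key = "tp" if event else "fp"
        else:
            key = "fn" if event else "tn"
        counts[key] += 1
    return counts


def itd_sensitivity_specificity(counts):
    """Sensitivity and specificity from the ITD counts."""
    sensitivity = counts["tp"] / (counts["tp"] + counts["fn"])
    specificity = counts["tn"] / (counts["tn"] + counts["fp"])
    return sensitivity, specificity
```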
Statistical significance was set at P < .05. Analysis was performed and plots were generated using R statistical software. 32
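The study's analysis was run in R. Purely to illustrate what the Kaplan-Meier estimate behind the survival curves computes, here is a minimal Python sketch of the product-limit estimator. It is not the authors' code, and the function name is hypothetical.

```python
def kaplan_meier(times, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up time for each patient.
    observed: True if the event occurred at that time, False if censored.
    """
    pairs = sorted(zip(times, observed))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events_at_t = 0
        removed = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                events_at_t += 1
            removed += 1
            i += 1
        if events_at_t:
            survival *= 1 - events_at_t / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve
```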

| Cohort characteristics
Two-hundred-and-thirty-five patients were included, of whom 17 (7%) were lost to follow-up and 21 (9%) were excluded for having decompensated liver disease at baseline (Figure 1). In total, 197 patients were followed up for a total of 693 patient-years with a median (IQR) follow-up of 43 (26-58) months.
Baseline characteristics for the whole cohort and for those who had/did not have clinical events are shown in Table 1. The three most common aetiologies of chronic liver disease were NAFLD (n = 85, 43%), ArLD (n = 22, 11%) and VH (n = 50, 25%). Overall, 178 (90%) patients had liver biopsy, in whom fibrosis was mild, moderate and severe in 95 (48%), 28 (14%) and 55 (28%) respectively. Fifty-nine patients had their underlying liver disease treated during the follow-up period; these included patients with chronic hepatitis C achieving a sustained virological response (SVR) (n = 29), patients with chronic hepatitis B achieving viral suppression (n = 8), patients regularly consuming harmful amounts of alcohol achieving complete abstinence (n = 3) and patients undergoing bariatric surgery who lost > 10% body weight (n = 9).
There were 14 new clinical events. Mortality occurred in 11 patients of which five patients died from non-liver-related causes (Table S3). Concordance of clinical event reporting between adjudicators was 93% (see Supplementary information for details).
Patients who had clinical events were older and had a higher prevalence of ArLD, more severe fibrosis by histology and non-invasive biomarkers (serum scores, liver stiffness and liver cT1), higher AST, ALP, GGT and bilirubin, and lower albumin and platelets than patients without clinical events (Table 1).

| Whole cohort
MR scanning was successful in 182 of 197 (92%) patients, but failed in 15 (3 had claustrophobia, 1 did not fit in the scanner and 11 scans were of poor quality).

FIGURE 1 Flow diagram of study
Liver cT1, when stratified into groups, showed a significant increase in risk of clinical events with increasing cT1 thresholds (P < .001, Figure 2D, Table S6).
In ITD analysis, liver cT1 > 825 ms could predict event-free survival (P = .009), identifying all events. […] for all-cause mortality.

FIGURE 2
In ITD analysis (n = 197, 14 events), neither liver T1 > 825 ms nor T1 > 868 ms could predict event-free survival (P = .2 and .07 respectively).

| Liver iron and liver fat
Neither liver iron nor liver fat was predictive of clinical outcomes in either the whole cohort or the subset of NAFLD/ArLD/VH (Tables S4, S5, S7 and S8).

| Whole cohort
Biopsies were performed in 178/197 (90%) patients. In those with biopsy there were 13 clinical events and 11 deaths.
When grouped into mild, moderate and severe fibrosis, Ishak stage showed an increase in risk of clinical events (P = .002, Table S10, Figure S2). Patients with moderate fibrosis showed no greater risk of clinical events than those with mild fibrosis.
When stratified into groups, TE showed a significant increase in risk of liver events with increasing TE thresholds (P < .001, Table S12). […] Figure S4).
ITD analysis results were exactly the same as there were no failed tests. The details of the intention to diagnose analysis for all biomarkers are included in the supplement (Tables S13-S19).

| Traditional risk factors
Age was predictive of event-free survival (P = .008, HR: 1.073, 95% […] nor treatment for underlying liver disease was predictive of event-free survival or all-cause mortality.

| Multivariate analysis
TE was excluded as a result of insufficient quantity of data. No more than two variables were assessed together because of the low number of events per variable (EPV).

| Continuous variables
Liver cT1 showed a trend towards being predictive of event-free survival independently of liver T1. Liver cT1 was predictive of event-free survival independently of APRI score. Liver cT1 was not predictive of event-free survival independently of either FIB-4 or AST/ALT ratio (Table 2). Liver cT1 was predictive of event-free survival independently of age (Table S20).
All multivariate analysis results should be taken with caution as EPV varied between 6 and 7 rather than the recommended 10.
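The events-per-variable constraint is simple arithmetic: the number of events divided by the number of variables in the model. A minimal sketch, with a hypothetical function name, using the cohort's 14 clinical events and the two-variable limit stated above:

```python
def events_per_variable(n_events, n_variables):
    """Events per variable (EPV) for a regression model."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables


# With 14 events and two variables per model, EPV is 7.
epv_two_variables = events_per_variable(14, 2)

# At the recommended EPV of 10, 14 events would support only one variable.
max_variables_at_epv_10 = 14 // 10
```

Models fitted on patients with fewer recorded events (for example, where a biomarker was missing) would have a correspondingly lower EPV.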

| DISCUSSION
We have shown further evidence, with 693 patient-years of follow-up, that liver cT1 can predict all-cause mortality and event-free survival. The prognostic values of uncorrected liver T1, TE, Ishak fibrosis stage and blood-based markers of liver disease severity were also assessed in the same cohort. Liver cT1 > 825 ms correctly identified 12/13 (92%) of clinical events and performed even better when the patient cohort was restricted to NAFLD, ArLD and VH, identifying all liver events.
Higher liver T1 has been shown in a previous study to be associated with the development of liver-related clinical outcomes, albeit in a cohort of patients with CLD with normal liver iron content. 15 In our study, two patients who developed liver events were correctly identified by liver cT1 > 825 ms but missed by uncorrected T1 > 825 ms. This reduced the correct classification percentage from 92% to 77%. The best T1 cut-off was > 868 ms, which could not predict clinical events in ITD analysis, whereas the best cT1 cut-off of > 840 ms could. As liver iron content itself (as measured by MRI T2*) was not predictive of clinical outcomes, these results underscore the utility of the iron correction method.
This is especially important given the prevalence of elevated liver iron content in the general population. 33 Liver cT1 is likely to have a greater impact earlier rather than later in the screening process, for the detection of early liver fibroinflammatory disease, and to help in monitoring these patients non-invasively.
Studies of even larger cohorts may bring out these differences in the future.
We have shown liver cT1 performs especially well in the most prevalent CLDs, namely, NAFLD, ArLD and VH, with cT1 cut-offs correctly identifying all liver events. These particular diseases represent a significant financial cost to the UK's social and healthcare systems. 45 Accurate, cost-effective stratification of these patients with non-invasive biomarkers would allow significant savings in time and money and an increase in patient comfort. There is also an unmet need for reliable non-invasive endpoints in NASH drug development. Liver cT1 can differentiate simple steatosis from NASH, 11 and the prognostic value of liver cT1 shown in this study provides further evidence to support the use of liver cT1 for prognostic enrichment in NASH drug trials.

TABLE 3 Cox proportional hazards multivariate analysis for event-free survival: binary cut-offs
We conducted multivariate analysis, which indicated liver cT1 to perform as well as T1, better than APRI, but worse than FIB-4 and AST/ALT. However, analysis also showed that at the prespecified cut-offs, liver cT1 performed better than T1 and FIB-4 and similarly to AST/ALT and APRI. Meaningful cut-offs are desirable in clinical use, indicating liver cT1 has independent clinical prognostic relevance. These multivariate results should be taken with caution as the number of events per variable was lower than the minimum recommended threshold 46 and therefore may yield results with a high margin of error.
Limitations of this study included no fixed follow-up time point. This means that the AUC analyses, sensitivities, specificities, PPVs and NPVs generated in cut-off analysis should be taken with caution because, unlike in Cox proportional hazards and Kaplan-Meier analysis, the effect of patients being censored at different time points is not taken into account. Our patient cohort included a wide range of liver disease severity and aetiologies, from patients with mild fibrosis to those with cirrhosis. Future studies should examine the prognostic value of liver cT1 alongside other non-invasive tests in prespecified liver disease severity groups.
In conclusion, liver iron-corrected T1 (cT1), TE and serum-based blood biomarkers can identify patients at risk of developing clinical outcomes in a cohort with mixed CLD aetiologies, typical of general hepatology cohorts, but when taking into account technical failures of MRI and TE, MRI and blood markers perform better than TE. Further multicentre studies should be carried out to validate our results.