The Combination of MR Elastography and Proton Density Fat Fraction Improves Diagnosis of Nonalcoholic Steatohepatitis

Nonalcoholic fatty liver disease (NAFLD) is rapidly increasing worldwide. It is subdivided into nonalcoholic fatty liver (NAFL) and the more aggressive form, nonalcoholic steatohepatitis (NASH), which carries a higher risk of developing fibrosis and cirrhosis. There is currently no reliable non‐invasive method for differentiating NASH from NAFL.

(AUROC = 0.74). A model combining MRE with AST improved the diagnosis of F2-4 (AUROC = 0.83). The ICC for repeatability was 0.94 and 0.99 for MRE and PDFF, respectively.
Nonalcoholic fatty liver disease (NAFLD) is the most rapidly growing cause of chronic liver disease worldwide, affecting about 25% of the global adult population. 1 NAFLD is a disease spectrum whose mild form, nonalcoholic fatty liver (NAFL), is defined as the presence of hepatic steatosis without any secondary cause of hepatic fat accumulation such as excessive alcohol consumption, long-term use of a steatogenic medication, or other liver disease etiologies. 2 NAFLD is associated with metabolic syndrome, obesity, diabetes mellitus, dyslipidemia, and cardiovascular disease. 3,4 Nonalcoholic steatohepatitis (NASH) is a more aggressive form of NAFLD, characterized by the presence of inflammatory features and degenerative hepatocellular changes in addition to steatosis. 5,6 The overall prevalence of NASH in the general population is estimated between 1.5% and 6.45%. 1
The dynamic nature of NAFLD has been described in many studies. 7,8 Patients with NAFLD, especially with uncontrolled metabolic disease and diabetes, suffer an increased risk of developing fibrosis with eventual progression to cirrhosis and end-stage liver disease. The presence of inflammation in NASH triggers fibrogenesis and causes progression into higher stages of fibrosis and cirrhosis. 9,10 NASH is also related to increased incidence of hepatocellular carcinoma 11,12 and liver transplantation. 13 Higher stages of fibrosis are associated with increased overall and liver-related mortality. 14,15
Liver biopsy has been the reference standard for diagnosing NAFLD, including identifying NASH and staging fibrosis. 2 However, biopsy has several limitations such as cost, sampling and inter-observer variability, and risk of discomfort and complications. 16,17 Thus, developing non-invasive imaging and biochemical markers for diagnosing and grading NAFLD has been the subject of extensive research in the last decade. 18,19 MRI techniques for the quantification of liver fat and the measurement of liver stiffness are widely studied. 18,19 Magnetic resonance proton density fat fraction (MR-PDFF) and magnetic resonance elastography (MRE) have high diagnostic accuracy for detecting and grading steatosis 20 and staging fibrosis, 21,22 respectively. Both techniques have higher diagnostic performance than non-MRI-based techniques, such as transient elastography (TE) and TE-based controlled attenuation parameter. 22-26 However, differentiating NASH from NAFL is still challenging.
The primary aim of this study was to investigate the ability of multiple MRI biomarkers (MRE, PDFF, R2* mapping, T1 mapping, and diffusion-weighted imaging [DWI]), either as single measures or in combination with each other or with biochemical markers, to differentiate between NASH and NAFL, and between lower and higher stages of liver fibrosis in adults with clinically suspected NAFLD. The reliability of a biomarker is not only determined by its diagnostic performance, but also by its repeatability. Hence, a secondary aim was to measure the repeatability of the MRI biomarkers.

Study Population
After approval from the regional ethical review board, a prospective study was conducted at our hospital between March 2017 and December 2019. Written informed consent was obtained from all study participants. One hundred and thirty-four individuals, recruited from the Department of Gastroenterology and Hepatology and from the Swedish CArdioPulmonary BioImage Study "SCAPIS", 27 were invited to a screening visit, where data on demographics, medical history, and concomitant medication were collected. Blood sampling was also performed at screening visit to measure cytokeratin-18 (CK18) M30 and liver function tests including alanine transaminase (ALT) and aspartate transaminase (AST).
Eligibility included: individuals aged 18-70 with clinically suspected NAFLD and at least one of the following: imaging indicative of NAFLD, 19 ALT more than 1.5 × upper limit of normal (upper limit being 1.1 μkat/liter for men and 0.75 μkat/liter for women), CK18 M30 more than 180 U/liter, and/or biopsy showing NAFLD within 3 months prior to screening visit. Individuals with a past or present alcohol consumption of more than 30 g alcohol per day for men and 20 g for women, drug abuse, other liver diseases, corticosteroid or immunosuppressive therapy within 10 weeks, pregnancy/breastfeeding, and/or contraindication for MRI or liver biopsy were excluded. Seventy-five individuals fulfilled the inclusion and exclusion criteria.
Individuals with no available liver biopsy within 3 months underwent liver biopsy 1-4 weeks after the screening visit. Three out of the 75 persons were excluded since the liver biopsy did not show any steatosis. One of the included persons discontinued the study voluntarily before MRI examination. Thus, 71 individuals were referred to MRI. Of these, three were excluded because of claustrophobia. Consequently, the study population consisted of 68 participants.
Thirty participants out of the study population (11 NAFL and 19 NASH determined from liver biopsy) underwent a second MRI in order to assess repeatability. Those participants were selected to represent various histopathological groups, i.e., including participants with both NAFL and NASH and with different stages of fibrosis.

Histopathological Analysis
Biopsies were evaluated, individually and in consensus, by two liver pathologists (AW) with more than 30 years of experience, blinded to clinical, biochemical, and radiological data. The steatosis-activity-fibrosis (SAF) histological scoring system was used, 5 grading steatosis 0-3, activity 0-4, and fibrosis 0-4. The activity score was calculated as the sum of hepatocyte ballooning (0-2) and lobular inflammation (0-2), thus ranging from 0 to 4. All cases with at least grade 1 steatosis were diagnosed as NAFLD independently of other criteria. When each of the three features (steatosis, ballooning, and lobular inflammation) was classified as at least grade 1, the biopsy was categorized as NASH. For analysis of fibrosis, two groups were formed according to the severity and clinical relevance of the fibrosis, i.e., F0-1 (no or mild fibrosis) and F2-4 (moderate to advanced fibrosis).
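As a minimal sketch of the classification rules described above (not the authors' software), the SAF-based grouping could be expressed as follows. The class and function names are hypothetical, and labeling steatotic biopsies that do not meet the NASH criteria as NAFL is an assumption inferred from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafScore:
    """Hypothetical container for one biopsy's SAF grades."""
    steatosis: int             # 0-3
    ballooning: int            # 0-2
    lobular_inflammation: int  # 0-2
    fibrosis: int              # 0-4

    @property
    def activity(self) -> int:
        # Activity is the sum of ballooning and lobular inflammation (0-4).
        return self.ballooning + self.lobular_inflammation


def classify_diagnosis(score: SafScore) -> str:
    """Return 'NASH', 'NAFL', or 'no NAFLD' from the SAF grades."""
    if score.steatosis < 1:
        # No steatosis: the biopsy does not show NAFLD.
        return "no NAFLD"
    if score.ballooning >= 1 and score.lobular_inflammation >= 1:
        # All three features are at least grade 1.
        return "NASH"
    # Assumption: any other steatotic biopsy is treated as NAFL.
    return "NAFL"


def fibrosis_group(score: SafScore) -> str:
    """Dichotomize fibrosis into F0-1 versus F2-4."""
    return "F0-1" if score.fibrosis <= 1 else "F2-4"
```

For example, a biopsy with steatosis 2, ballooning 1, lobular inflammation 1, and fibrosis 3 would be grouped as NASH with F2-4 under these rules.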

Transient Elastography
TE was performed prior to liver biopsy by one of two experienced specialist nurses, blinded to all other data. Examinations were performed using the FibroScan 402 system (Echosens, Paris, France), and either the M probe or the XL probe based on the computer-guided recommendation. Patients were asked to fast for at least 6 hours before the examination. TE was performed with the participant in supine position. The median value of TE-measured liver stiffness (TE-LS) in kilopascals (kPa) of at least 10 valid measurements was calculated. The examination was considered invalid if the interquartile range/median value exceeded 30%. 28

Magnetic Resonance Imaging
MRI was performed 4-8 weeks after biopsy to allow for healing. The participants were asked to fast for at least 6 hours before the examination. A 3.0-T scanner (Signa PET/MR, General Electric Healthcare, Waukesha, WI) with a 16-channel body coil was used. The 30 participants in the repeatability group underwent a second MRI within 2-4 weeks of the first scan.
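Returning to the transient elastography criteria stated earlier (median of at least 10 valid measurements, examination invalid if the interquartile range/median exceeds 30%), the rule could be sketched as below. The function name is hypothetical, and the quartile convention is an assumption: the FibroScan device may compute the interquartile range differently.

```python
from statistics import median, quantiles


def te_liver_stiffness(measurements_kpa, min_valid=10, max_iqr_ratio=0.30):
    """Return the median TE-LS in kPa, or None if the examination is invalid."""
    if len(measurements_kpa) < min_valid:
        # Fewer than 10 valid measurements: no result is reported.
        return None
    med = median(measurements_kpa)
    # Assumption: quartiles use the 'inclusive' interpolation method.
    q1, _, q3 = quantiles(measurements_kpa, n=4, method="inclusive")
    if (q3 - q1) / med > max_iqr_ratio:
        # Interquartile range/median above 30%: examination is invalid.
        return None
    return med
```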

Magnetic Resonance Elastography
MRE was performed as previously described, 29 using a commercially available acoustic driver system (Resoundant, Rochester, MN) generating 60-Hz shear waves which were transmitted using a passive driver placed against the abdominal wall anterior to the liver. A spin-echo echo-planar imaging (SE-EPI) pulse sequence with motion-encoding gradients was used. 30 The acquisition parameters are listed in the Supplemental Material. Quantitative liver stiffness maps and confidence maps (elastograms) were generated on the scanner.

MR-PDFF and R2* Mapping
PDFF was performed using Iterative Decomposition of water and fat with Echo Asymmetry and Least squares estimation (IDEAL-IQ), a commercially available multi-echo 3D gradient-echo sequence which has the ability to limit the confounding effects of T1 and T2* and implements multi-peak fat model to account for the multiple resonant peaks of triglycerides. 31 The acquisition parameters are listed in the Supplemental Material. PDFF maps and R2* maps (relaxation rate = 1/T2*) were generated with IDEAL-IQ.

T1 Mapping
Saturation Method using Adaptive Recovery Times for T1 Mapping (SMART1Map) has been described elsewhere 32 as a method for T1 mapping in cardiac applications. It applies a single-point saturation-recovery FIESTA technique with the ability to measure true T1. The acquisition parameters are listed in the Supplemental Material.

Diffusion-Weighted Imaging
DWI was performed using a conventional SE-EPI sequence with b-values of 150 seconds/mm², 400 seconds/mm², and 800 seconds/mm². The acquisition parameters are listed in the Supplemental Material. Apparent diffusion coefficient (ADC) maps were generated automatically.

Liver Volume Measurement
A commercially available 3D gradient-echo T1-weighted sequence with two-point Dixon technique (LAVA-Flex) was used to acquire 32 axial slices through the liver in a single full-expiration breath-hold. SmartPaint software (version 1.0, Centre for Image Analysis, Uppsala University, Uppsala, Sweden) was used for post-processing the generated water-images and measuring liver volume (cm³).

Image Analysis
An image analyst (AH) with 5 years of experience in quantitative liver MRI, blinded to histopathological and biochemical results, performed the quantitative MRI analysis using ImageJ software (version 1.50i, National Institutes of Health, Bethesda, MD). In accordance with the Quantitative Imaging Biomarker Alliance (QIBA) MRE protocol, 33 a free-hand region of interest (ROI) was drawn separately on each acquired slice of MRE elastograms excluding large blood vessels, the edge of the liver, fissures, and masked regions on the confidence maps. Slices with less than 500 pixels in the ROI were excluded. The ROI was cloned between the elastograms and the related anatomic/magnitude images to ensure a good anatomic correlation. The mean liver stiffness (kPa) and the ROI size (mm²) were used to calculate the overall mean MRE-measured liver stiffness (MRE-LS) in kPa, weighted by ROI size. A free-hand ROI was drawn separately on each acquired slice of the PDFF, R2*, T1, and ADC maps using the same approach as for MRE. The mean values of all the acquired slices were obtained for PDFF (%), R2* (second⁻¹), and ADC (10⁻⁶ mm²/second). The median value was obtained from the T1 maps (single slice) and used to calculate R1 (relaxation rate = 1/T1, second⁻¹).
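The ROI-size-weighted mean stiffness and the R1 conversion described above can be illustrated with a short sketch. The function names and the dictionary keys are hypothetical, and the T1 value is assumed to be given in milliseconds, which the text does not state.

```python
def weighted_mean_stiffness(slices, min_pixels=500):
    """Overall MRE-LS in kPa, weighted by ROI area.

    Each slice is a dict with hypothetical keys:
    'mean_kpa' (mean stiffness), 'area_mm2' (ROI size), 'pixels' (ROI pixel count).
    """
    # Slices with fewer than 500 ROI pixels are excluded.
    kept = [s for s in slices if s["pixels"] >= min_pixels]
    total_area = sum(s["area_mm2"] for s in kept)
    if total_area == 0:
        raise ValueError("no slice has a usable ROI")
    return sum(s["mean_kpa"] * s["area_mm2"] for s in kept) / total_area


def r1_from_t1(t1_ms):
    """Relaxation rate R1 in second^-1, assuming T1 is given in milliseconds."""
    return 1000.0 / t1_ms
```

Weighting by ROI area means that a slice with a large measurable region contributes more to the overall value than a slice where most of the liver is masked out.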
A second reader (SA), a radiologist with 5 years of experience in general and abdominal radiology, blinded to histopathological and biochemical results and to the first reader's measurements, performed the analysis of MRE separately using the same approach mentioned above in order to evaluate the inter-rater reliability.

Statistical Analysis
All statistical analyses were done using SAS software (version 9.4, SAS Institute Inc., Cary, NC) and IBM SPSS Statistics for Windows (version 27, IBM Corp., Armonk, NY). The study population was initially divided into two groups by diagnosis (NASH/NAFL). For baseline characteristics, independent samples t-test was used to compare continuous variables and Pearson's chi-squared test was used to compare categorical variables between the two groups. Descriptive statistics of the studied biomarkers were summarized as mean, SD, and median, and grouped according to the diagnosis (NASH/NAFL) and the dichotomized fibrosis stages (F0-1/F2-4) from the histopathology analysis. Univariate logistic regression analysis was performed on all the biomarkers as independent variables, first with NASH/NAFL and then with F0-1/F2-4 as the dependent variable. Using logistic regression analysis, the best performing bivariate models were identified. Receiver operating characteristic (ROC) curves were used to determine the diagnostic accuracy of the univariate biomarkers and the bivariate models by calculating the area under the ROC (AUROC) and thus identifying the optimal cutoffs and the corresponding sensitivity and specificity. Spearman's correlation was used to analyze the correlation between the imaging biomarkers and the grades of activity, ballooning, lobular inflammation, and fibrosis. Repeatability of imaging biomarkers was analyzed by intra-individual coefficient of variation (CV) and intraclass correlation coefficient (ICC). ICC was also used to analyze the inter-rater reliability between the two readers who performed MRE analysis. Statistical significance was set at P < 0.05.
Table footnotes: Youden cutoff, i.e., the maximal vertical distance between the reference line and the ROC curve. Alternative cutoff, i.e., the minimum distance between the ROC curve and the highest point on the Y-axis.
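The AUROC and the two cutoff rules named in the table footnotes (the Youden cutoff and the minimum distance to the highest point on the Y-axis) can be sketched in plain Python. This is an illustration under stated assumptions, not the SAS/SPSS procedure used in the study: the function names are hypothetical, a case is called positive when its value is at or above the threshold, and the demonstration values are invented.

```python
def roc_points(values, labels):
    """Return (threshold, sensitivity, specificity) for every observed value."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    points = []
    for t in sorted(set(values)):
        sens = sum(v >= t for v in pos) / len(pos)
        spec = sum(v < t for v in neg) / len(neg)
        points.append((t, sens, spec))
    return points


def auroc(values, labels):
    """Probability that a random positive exceeds a random negative (ties count 0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(points):
    """Threshold maximizing sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)[0]


def closest_to_corner_cutoff(points):
    """Threshold whose ROC point is nearest to (false-positive rate 0, sensitivity 1)."""
    return min(points, key=lambda p: (1 - p[1]) ** 2 + (1 - p[2]) ** 2)[0]


if __name__ == "__main__":
    # Invented stiffness values (kPa) and labels (1 = NASH, 0 = NAFL), for illustration only.
    demo_values = [1.8, 2.0, 2.2, 2.4, 2.5, 2.7, 2.9, 3.1, 3.3, 3.6]
    demo_labels = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]
    pts = roc_points(demo_values, demo_labels)
    print(auroc(demo_values, demo_labels), youden_cutoff(pts), closest_to_corner_cutoff(pts))
```

The two rules need not agree: the Youden rule favors the largest combined gain in sensitivity and specificity, while the distance rule favors a more balanced operating point.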

Baseline Characteristics
The study population consisted of 68 individuals with biopsy-proven NAFLD (40 men, 28 women) with a mean age of 54.5 years and a mean body mass index of 30.8 kg/m². NASH diagnosis was established in 53 participants and 15 were diagnosed as NAFL based on the histopathological assessment. Baseline characteristics and the distribution of different steatosis grades, activity grades, and fibrosis stages are presented in Table 1. There were no statistically significant differences between the groups, except for the frequency of type 2 diabetes which was significantly higher in the NASH group. TE, MRE, PDFF, R2* mapping, R1 mapping, ADC, and liver volume measurement could be obtained in 66, 64, 68, 68, 64, 65, and 67 participants, respectively. In the second MRI examination, PDFF, R2* mapping, R1 mapping, ADC, and liver volume measurement were obtained for all 30 participants, while MRE could be assessed in 29 participants.
The ICC of MRE analysis by two readers (inter-rater reliability) was 0.98.

Differentiation Between NASH and NAFL
Summarized descriptive statistics for imaging and biochemical markers grouped by NASH/NAFL are presented in Table 2.
Univariate logistic regression analysis showed significant differences between the groups in TE-LS, MRE-LS, CK18 M30, and ALT. In bivariate logistic regression analysis, both MRE and PDFF contributed significantly to a bivariate model for diagnosing NASH (AUROC = 0.84) (Table 3 and Fig. 1).
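To illustrate how a bivariate logistic regression model combines two biomarkers into a single predicted probability, a minimal sketch follows. The function name is hypothetical and the coefficients are parameters that must be supplied by the caller; the fitted values from this study are not reproduced here.

```python
import math


def nash_probability(mre_kpa, pdff_percent, intercept, beta_mre, beta_pdff):
    """Predicted probability from a bivariate logistic model.

    intercept, beta_mre, and beta_pdff are placeholders for fitted
    coefficients; none of them are taken from the study.
    """
    logit = intercept + beta_mre * mre_kpa + beta_pdff * pdff_percent
    return 1.0 / (1.0 + math.exp(-logit))
```

With positive coefficients for both biomarkers, a higher stiffness or a higher fat fraction each raises the predicted probability, which is how the second biomarker can add discriminative information to the first.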
Combining MRE with biochemical markers did not improve the diagnostic accuracy: adding CK18 M30, ALT, or AST to MRE-LS in bivariate logistic regression models yielded no significant improvement (P = 0.51, 0.64, and 0.158, respectively). Likewise, adding CK18 M30, ALT, or AST to TE-LS in bivariate logistic regression models yielded no significant improvement (P = 0.122, 0.211, and 0.529, respectively). MR elastograms, PDFF, R2*, T1, and ADC maps for three participants with different histopathological findings, i.e., both NAFL and NASH and with different stages of fibrosis and steatosis, are shown in Figs. 2 and 3.

Repeatability of Imaging Biomarkers
The mean interval between the first and second MRI was 23.5 (range 14-48) days. The CV and ICC values of the imaging biomarkers are listed in Table 5. The ICC of MRE, PDFF, R2*, and liver volume was higher than 0.9. MR images demonstrating the repeatability in two different participants are displayed in Figs. S3 and S4 in the Supplemental Material.
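The intra-individual coefficient of variation for a test-retest design can be sketched as follows. The text does not specify how the per-participant values were pooled, so the root-mean-square pooling used here is an assumption, and the function name is hypothetical. The ICC is omitted because its exact form is not stated.

```python
import math
from statistics import mean, stdev


def within_subject_cv(pairs):
    """Pooled within-subject CV in percent for (first_scan, second_scan) pairs.

    Assumption: per-participant CVs (SD / mean) are pooled as a root mean square.
    """
    squared_cvs = []
    for first, second in pairs:
        subject_mean = mean((first, second))
        subject_sd = stdev((first, second))  # sample SD of the two scans
        squared_cvs.append((subject_sd / subject_mean) ** 2)
    return 100.0 * math.sqrt(mean(squared_cvs))
```

A participant measured at 9 and 11 units on the two scans has a mean of 10 and a sample SD of about 1.41, which corresponds to a CV of about 14%.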

Discussion
In our study, liver stiffness measured by both TE and MRE could differentiate NASH from NAFL. Prior prospective studies which tested the ability of MRE to differentiate between NASH and NAFL have reported cutoff values between 2.53 and 3.26 (AUROC 0.70-0.79). 24,34,35 In the present study, a cutoff value of 2.74 kPa (2.5 kPa if the minimum distance between the ROC curve and the highest point on the Y-axis was used) could be identified with an AUROC of 0.74. The fact that cutoff values vary between studies might be explained by demographic differences and by differences in study design and in the technique used to acquire and analyze MRE. Furthermore, the diagnosis of NASH is complicated by the heterogeneity of the histopathological findings, the variations in biopsy sampling and interpretation, and the variations in the available histopathological scoring systems. 6,16,17 A major difference between our study and the above-mentioned studies 24,34,35 is the histopathological scoring system used. Those studies used the NASH Clinical Research Network scoring system, 36 while the present study used the newer SAF scoring system, 5 which offers a more distinct definition of NASH. Technically, obtaining seven slices for MRE through the liver in full-inspiration instead of the routinely used technique of four slices in full-expiration might cause differences when comparing the MRE results of the present study with previous studies. Full-inspiration was preferred since abdominal obesity is common in NAFLD. Thus, in full-expiration, the liver might be positioned too cranially to permit adequate transfer of the acoustic waves into the liver.
A bivariate logistic regression model combining MRE-LS with PDFF showed a better performance in diagnosing NASH than MRE-LS alone. This might be partly explained by the observation that MRE-LS was correlated to lobular inflammation and PDFF was correlated to ballooning, which are the two components used for NASH diagnosis in the SAF scoring system. A similar model combining gradient-echo MRE and PDFF showed an AUROC of 0.87 in a recently published study. 37 In another study, 38 a multiparametric MR index, combining MRE, PDFF (determined by magnetic resonance spectroscopy), and T1 mapping, could diagnose NASH with AUROC of 0.883. In the present study, R1, calculated from T1 mapping, could not diagnose NASH or F2-4. However, there is yet no reference standard method for liver T1 mapping and the method used in our study (SMART1Map applying a single-point saturation-recovery technique) differs from the method used in the other study (Look-Locker inversion recovery technique). Likewise, the DWI sequence used might not be optimal for diagnosing NASH.
In addition to its known role in accurately quantifying steatosis, 39 PDFF showed a potential role in improving the performance of MRE in diagnosing NASH in the present study. Even though the PDFF cutoff value for NASH of 10.4% was not statistically significant in the present study, it differs from the reported cutoff between normal and steatotic liver parenchyma of 5.2%. 23 Moreover, the MRE cutoff for NASH (2.5 kPa or 2.74 kPa), which was statistically significant in the present cohort of individuals with confirmed NAFLD, did not differ substantially from the cutoff reported in another study in individuals with or without NAFLD (2.49). 35 This suggests that MRE and PDFF might need to be assessed in combination to diagnose NASH.
TE and MRE could differentiate between F2-4 (moderate to advanced fibrosis) and F0-1 (no or mild fibrosis). In a recently published meta-analysis 40 including 12 studies, MRE was found to have a pooled AUROC of 0.93 and cutoff values ranging from 2.38 kPa to 5.37 kPa to diagnose F2-4. In the present study, the cutoff was 2.82. Another meta-analysis 22 concluded that MRE has a higher diagnostic accuracy in grading fibrosis than TE, with an AUROC of 0.92 and 0.87 for MRE and TE, respectively, for diagnosing F2-4. A recent prospective study 25 stated the same conclusion with an AUROC of 0.85 and 0.75, while another recent prospective study 26 showed an AUROC of 0.85 and 0.77, but with no statistically significant difference between the AUROC values for MRE and TE, respectively, for diagnosing F2-4. In the present study, TE had a higher AUROC than MRE for diagnosing F2-4. This might be influenced by the fact that TE acquired data from the same liver lobe where biopsy was performed. Furthermore, TE was performed on the same day as liver biopsy in most participants (those with no recent biopsy available), while MRI was performed 4-8 weeks after biopsy.
Two bivariate logistic regression models combining MRE with AST and MRE with ALT showed better performance in diagnosing F2-4 than the single univariate biomarkers and improved the AUROC. However, the diagnostic performance of ALT as well as CK18 M30, as univariate biomarkers or in the model combining MRE with ALT for diagnosing F2-4, has to be interpreted cautiously, as both were considered as optional inclusion criteria.
It is important to point out that the results presented in this study, including different cut-off values and bivariate models to differentiate between NASH and NAFL and between F2-4 and F0-1, were determined in a cohort of adult participants with biopsy-proven NAFLD. Thus, these results are considered applicable only when there is clinical suspicion of NAFLD and the presence of hepatic steatosis without any secondary cause of hepatic fat accumulation has been confirmed. MRI-based techniques, primarily PDFF, can readily confirm and grade hepatic steatosis 20 which is the first hallmark in the diagnosis of NAFLD.
A strength of the present study was the wide range of imaging biomarkers that were compared in the same population. Another strength was the excellent repeatability of most of the studied imaging biomarkers, including MRE and PDFF, as well as the high inter-rater reliability of MRE. The results of this study emphasize the potential role of quantitative MRI techniques in diagnosing and grading diffuse liver diseases when included as a part of a multiparametric MRI liver protocol in clinical practice, thereby reducing the need for liver biopsy.

Limitations
A limitation of the present study was the small number of participants, particularly participants having advanced fibrosis (F3-4) (N = 8), limiting the possibility to study the differentiation between the individual fibrosis stages. The skewed populations of NASH vs. NAFL were another limitation.

Conclusions
Our study demonstrated that liver stiffness measurement by both TE and MRE could identify individuals with NASH and differentiate between those with no or mild fibrosis and those with moderate and higher stages of fibrosis. Combining MRE with PDFF improved the diagnosis of NASH, implying that specific imaging parameters might reflect specific histological criteria.