Dr Jeremy F L Cobbold, Hepatology and Gastroenterology Section, Department of Medicine, Imperial College London, Liver Unit, 10th Floor, QEQM building, St Mary's Hospital Campus, London W2 1NY, UK. Email: firstname.lastname@example.org
Non-alcoholic fatty liver disease (NAFLD) is a burgeoning global health problem, and the assessment of disease severity remains a clinical challenge. Conventional imaging and clinical blood tests are frequently unable to determine disease activity (the degree of inflammatory change) and fibrotic severity, while the applicability of histological examination of liver biopsy is limited. Imaging platforms provide liver-specific structural information, while newer applications of these technologies non-invasively exploit the physical and chemical characteristics of liver tissue in health and disease. In this review, conventional and newer imaging-based techniques for the assessment of inflammation and fibrosis in NAFLD are discussed in terms of diagnostic accuracy, radio-pathological correlations, and practical considerations. In particular, recent clinical studies of ultrasound (US)-based and magnetic resonance elastography techniques are evaluated, while the potential of contrast-enhanced US and magnetic resonance spectroscopy techniques is discussed.
The development and application of these techniques is starting to reduce the clinical need for liver biopsy, to produce surrogate end-points for interventional and observational clinical studies, and through this, to provide new insights into the natural history of NAFLD.
Non-alcoholic fatty liver disease (NAFLD) represents a range of liver disorders characterized by hepatic steatosis in the absence of excessive alcohol consumption, viral, or drug-related etiologies. It is the most common cause of liver disease in the absence of viral hepatitis or alcohol excess in the industrialized world, affecting approximately 15–50% of the adult population across various countries, depending on the population studied and how the cases are defined (e.g. by liver biopsy, ultrasonography, or elevated liver enzymes).1 The prevalence of NAFLD rises to approximately 70% in patients with type 2 diabetes, and as high as 95% in morbidly obese patients, as discussed in a recent systematic review.1,2 Central obesity, type 2 diabetes or impaired fasting glucose, hypertension, and dyslipidemia, comprise the metabolic syndrome, of which NAFLD is considered to be the hepatic manifestation.
The pathogenesis of NAFLD combines both genetic predisposition and environmental factors and has been reviewed elsewhere.3 Where there is an imbalance of free fatty acid import and synthesis to export and catabolism in hepatocytes, triglyceride accumulates. Oxidative stress and immune activation, associated with hepatic insulin resistance, can lead to chronic liver damage and fibrogenesis by pathways common to other liver diseases.4 Polymorphisms in patatin-like phospholipase domain-containing protein-3 might account for much of the racial difference in hepatic steatosis evident in the USA,5 and also the presence of steatohepatitis,6 while in a Chinese cohort, variation in the glucokinase regulator gene was associated with NAFLD.7 Physical activity and diet are potentially modifiable environmental factors,8 with contributions from the intestinal microbiota gaining prominence as a contributory factor.9
Why is NAFLD important?
The histological spectrum of NAFLD ranges from simple steatosis (SS) (presence of intrahepatocellular triglyceride droplets), through to the presence of liver cell injury and inflammation (non-alcoholic steatohepatitis [NASH]), fibrosis, and cirrhosis. Lesser degrees of inflammation without hepatocyte injury or fibrosis might be combined with SS in the term “not NASH”. Hepatocellular carcinoma (HCC) can develop in patients with NAFLD, particularly, but not exclusively, at the stage of cirrhosis.10 Furthermore, the proportion of patients requiring a liver transplant due to NAFLD in the USA has increased from 1.2% in 1997–2003 to 7.4% in 2010, while this exceeds 12% if cases of cryptogenic cirrhosis are added.11
Two large population-based studies using magnetic resonance spectroscopy (MRS) have been published. The first showed that over one-third of people in a large, urban US population had hepatic steatosis (hepatic triglyceride [TG] content > 5.5%);12 the second found 29% of an adult Hong Kong Chinese population had hepatic TG > 5%.13 Although the natural history of the disease has not yet been fully elucidated, a community-based study showed patients with NAFLD to have a higher mortality than the general population, with liver-related deaths ranking as the third most common cause of death.14 Increased mortality is associated with increased disease severity of NAFLD, particularly the development of cirrhosis or HCC,15,16 and NASH is the hallmark of potentially progressive disease.17 The presence of fibrosis of any grade is an independent predictor of liver-related mortality, but advanced stages of fibrosis are more strongly related to a worse outcome.18,19 This makes the detection of liver injury and inflammation (i.e. NASH) and the staging of fibrosis central to the management of patients with NAFLD.
Approaches to the assessment of NAFLD
Histology. Liver biopsy is still considered the gold standard for the diagnosis and staging of NAFLD. Steatosis and steatohepatitis are histologically defined, and histology provides a semiquantitative assessment of disease severity.19,20 However, it is limited by its invasive nature, rendering it unsuitable for serial assessment. There is also significant associated inter- and intra-observer variability, and sampling error, as < 1/50 000 of the liver is assessed. This is important, as NAFLD can have a heterogeneous disease distribution.21 These factors have driven the development of alternative techniques, including clinical algorithms, serum markers, and non-invasive imaging modalities, as a means to diagnose and gauge the severity of NAFLD.22
Blood markers and clinical algorithms. Non-invasive serum markers of liver disease might be direct or indirect, and can be used singly or in combination with other markers, or with anthropometric data. Numerous markers have been developed, which correlate with fibrosis stage, and have been reviewed elsewhere.23 Of these, a few have been validated extensively.
The NAFLD fibrosis score is an example of an indirect panel marker, which includes clinical, anthropometric, and blood-derived data, and can reliably exclude advanced fibrosis with a high negative predictive value (NPV).24 These findings were confirmed in a study assessing the aspartate aminotransferase/alanine aminotransferase ratio, BARD score, and FIB-4 score, compared to histology. While the NPV exceeded 92% for all scores, the positive predictive value (PPV) ranged from only 27% to 75%, indicating that these tests are more valuable in excluding advanced disease, rather than diagnosing it.25 Furthermore, these tests were not shown to be effective in distinguishing between patients with SS and those with NASH, which is a significant limitation, given that some inflammatory changes are predictive of eventual disease progression.17
Direct markers, such as hyaluronic acid, the amino-terminal propeptide of procollagen III, and the tissue inhibitor of metalloproteinase-1, which together comprise the enhanced liver fibrosis panel, reflect intermediates or metabolites that are produced during fibrogenesis.26 Levels of cytokeratin-18 fragments, a marker of apoptosis, have been shown to be significantly higher in patients with NASH, compared to those with SS in a multicentre study.27 These tests might reduce the need for liver biopsy, and their diagnostic accuracy might be enhanced with the addition of specific serum markers of fibrosis or non-invasive imaging modalities. However, important limitations remain. Such blood markers might be confounded by extrahepatic disease and contain large ranges of values with an indeterminate risk of (advanced) fibrosis; moreover, studies have yet to assess the prevalence of disease, which affects the pretest probability and thus the predictive values of the tests.23
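The dependence of predictive values on pretest probability can be illustrated with a short calculation. The following is a minimal sketch based on Bayes' theorem; the function name `predictive_values` and the sensitivity, specificity, and prevalence figures are hypothetical and are not drawn from any of the studies cited here.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via Bayes' theorem; all inputs are fractions in (0, 1)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics, held fixed while prevalence varies
for prevalence in (0.05, 0.20, 0.50):
    ppv, npv = predictive_values(0.80, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With the test characteristics held constant, the PPV falls and the NPV rises as prevalence decreases, which is consistent with such markers performing better at excluding advanced fibrosis than at diagnosing it in low-prevalence settings.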
Imaging techniques. Conventional radiological modalities, such as ultrasound (US), computed tomography (CT), and magnetic resonance (MR) techniques, provide a visual representation of macroscopic pathological changes, based on the physical properties of the tissues imaged and the techniques used. Hepatic steatosis can be detected and quantified using imaging modalities, as reviewed elsewhere.28 However, epidemiological studies demonstrating a lack of increased mortality in patients with SS compared to the normal population,15 and newer studies of pathophysiology29 suggest that the quantity of hepatic lipid might not be important for prognostic purposes and does not relate to the risk of disease progression. Conversely, structural changes associated with cirrhosis and portal hypertension, such as an irregular liver contour, splenomegaly, and the presence of intra-abdominal varices, might be visualized using conventional imaging techniques, providing evidence of cirrhosis with high specificity, but limited sensitivity.30
A limitation of the above techniques is that they do not allow accurate distinction between cases of SS (and “not NASH”) and NASH, or differing stages of fibrosis. There are also logistic issues. Thus, US is operator dependent; CT subjects the patient to ionizing radiation, and is therefore unsuitable for serial assessment; while MR is limited by high costs. Newer imaging-based techniques, such as transient elastography (TE), magnetic resonance elastography (MRE), acoustic radiation force impulse imaging (ARFI), contrast-enhanced US (CEUS), and MRS, show promise for the detection of liver fibrosis, and in some cases, the presence of steatohepatitis. Such imaging-based techniques have the advantage of being liver specific, whereas blood-based markers might be confounded by extrahepatic disease processes. Radio-pathological correlations, diagnostic accuracy, and the limitations of these imaging-based approaches are discussed in this review.
Conventional radiological modalities, such as US, CT, and magnetic resonance imaging (MRI), are able to detect gross macroscopic changes accompanying chronic liver disease, which can then be used to generate indices reflective of disease severity. Features of these techniques, as applied to NAFLD, are outlined in Table 1.
Table 1. Features of conventional and newer imaging modalities for the assessment of non-alcoholic fatty liver disease (NAFLD)
US is a widely-available imaging technique that assesses echogenicity, visible as tissue “brightness”. Using conventional B-mode US, hepatic steatosis results in a “bright liver” when compared to other internal organs, such as the renal cortex or the spleen, which are devoid of fat, and would be of similar echogenicity to a fat-free liver. Sonographic signs of cirrhosis include a nodular liver surface, parenchymal inhomogeneity, and altered intrahepatic and extrahepatic vascular architecture and blood flow.31 Indices based on calculating the ratio between the right lobe and the caudate lobe, which disproportionately hypertrophies in cirrhosis, can be used to diagnose cirrhosis, but are of limited value in routine practice.32 In a study of 300 cases of chronic liver disease (52% with chronic viral hepatitis, 6% with NAFLD) with paired ultrasonography and liver biopsy, surface nodularity had a specificity of up to 95% for severe fibrosis or cirrhosis (F3 and F4), but the sensitivity was just 54%.33 Other indices, such as flattened hepatic vein Doppler waveform and caudate lobe hypertrophy, resulted in sensitivities of 57% and 41%, and specificities of 76% and 57%, respectively. Combining all three indices increased specificity to 98%, but decreased sensitivity substantially.33
CT measures hepatic attenuation, which is dependent on tissue density. Attenuation can be expressed absolutely (as Hounsfield units), or relatively, the latter by comparing hepatic attenuation to that of an internal organ without fat, usually the spleen. This latter method has been used for the quantification of hepatic lipid.34 MRI is based on the difference in precessional frequency of protons in different chemico–physical environments, caused by the shielding effect of surrounding electrons, which partially counteract the applied magnetic field. MRI and MRS techniques have been developed to quantify steatosis,12,28 and as with both US and CT, anatomical differences in advanced disease might be observed. However, the use of MRI to evaluate tissue/liver inflammation and fibrosis in NAFLD has not been assessed extensively (Table 2).
Table 2. Evaluation of imaging modalities for assessment of inflammation
Measurement of liver stiffness/elasticity. Hepatic parenchymal pathology influences liver stiffness. While early studies stressed the correlation between the presence of fibrosis and liver stiffness,46 increasing evidence has demonstrated that inflammation, vascular factors, and possibly steatosis all contribute to stiffness measurements.47 These effects have been exploited in three platforms: TE (Fig. 1), ARFI48 (Fig. 2), and MRE49 (Fig. 3), and applied in NAFLD. While TE and ARFI are US-based elastography modalities, MRE similarly uses the differing visco-elastic properties of healthy and diseased liver, but is an MR-based technique, assessing a 2-D or 3-D region of interest. Although TE, marketed and commonly known as FibroScan (Echosens SA, Paris, France), has gained widespread clinical acceptance for gauging the severity of fibrosis, a frequently cited limitation is the difficulty in accurately assessing obese patients, leading to poorer success rates, accuracy, and reproducibility.50 Clearly, this is a major drawback in patients with NAFLD, as obesity is prevalent in this population. Studies have since validated the use of a new “XL” probe.51,52 The proportion of scan failures in a predominantly obese patient group was reduced from 16% with the M probe to 1.1% with the XL-probe, while the proportion of successful reliable measurements (≥ 10 valid measurements, ≥ 60% success rate and interquartile range ≤ 30% of median value) increased correspondingly, from 50% with the M probe to 73% with the XL-probe.
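The three reliability criteria quoted above can be expressed as a simple check on a single examination. This is an illustrative sketch only: the function name `te_exam_is_reliable` and its arguments are hypothetical, and the quartile method (the default of Python's `statistics.quantiles`) is an assumption that might differ from the calculation performed by the device software.

```python
from statistics import median, quantiles


def te_exam_is_reliable(valid_kpa, total_shots):
    """Apply the three reliability criteria to one TE examination.

    valid_kpa   -- stiffness readings (kPa) from the valid shots
    total_shots -- number of shots attempted, valid or not
    """
    # Criterion 1: at least 10 valid measurements
    if len(valid_kpa) < 10 or total_shots <= 0:
        return False
    # Criterion 2: success rate of at least 60%
    success_rate = len(valid_kpa) / total_shots
    # Criterion 3: interquartile range no more than 30% of the median
    q1, _, q3 = quantiles(valid_kpa, n=4)  # assumed quartile definition
    return success_rate >= 0.60 and (q3 - q1) <= 0.30 * median(valid_kpa)


# Hypothetical examination: 10 valid readings from 12 attempted shots
readings = [5.1, 5.4, 5.0, 5.6, 5.3, 5.2, 5.5, 5.4, 5.3, 5.2]
print(te_exam_is_reliable(readings, total_shots=12))
```

An examination failing any one criterion is classed as unreliable, which is distinct from a scan failure, in which no valid measurement is obtained.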
CEUS. Microbubble US contrast agents are echo enhancers for both Doppler and gray-scale US imaging. Their use markedly improves the US signal, with a > 300-fold increase in echo strength, and also allows dynamic imaging of the liver. CEUS can be used as a kinetic tracer to assess the severity of liver disease, by exploiting the local and systemic hemodynamic changes that accompany chronic liver disease. “Transit time” curves are generated by tracking the passage of a peripherally-administered intravenous microbubble bolus through the circulation.53 Different measurements of transit times have been used to assess the severity of chronic liver disease, reflecting both the extrahepatic and intrahepatic hemodynamic changes that accompany both cirrhosis and precirrhotic liver disease.54 However, these times have not yet been independently validated in NAFLD. Analysis of the parenchymal phase of contrast enhancement has shown promise for the detection of NASH.44
MRS. Proton MRS is widely used for the quantification of hepatic lipid.12,28 In vitro MRS studies of oils55 and intact liver tissue56 have demonstrated that lipid resonances might be quantified to derive indices of lipid composition, including saturation and polyunsaturation. These compositional indices differed between obese patients with and without hepatic steatosis,55 but assignment of resonances was clarified by reference to better-resolved in vitro spectra at high field strength.57 Subsequently, indices of lipid composition using in vivo proton (1H) MRS at 1.5 Tesla have been shown to delineate the severity of fibrosis in patients with chronic hepatitis C (in whom hepatic steatosis is prevalent).58 Further improvement in the resolution of these smaller resonances is still required for application and validation in well-characterized cohorts of NAFLD.
Phosphorus-31 MRS (31P MRS) has also been applied to the assessment of chronic liver disease. The 31P MR spectrum contains resonances from intermediates of phospholipid membrane metabolism, including phosphomonoesters (PME) and phosphodiesters (PDE), with additional resonances from bioenergetic processes. The latter include inorganic phosphate, nucleotide triphosphates, and nicotinamide adenine dinucleotide phosphate (NADPH).59 The rationale is that resonance ratios could reflect hepatic inflammation and regeneration, and/or correspond to disease severity.60 However, studies in NAFLD are limited.
Conventional imaging modalities are limited by their inability to distinguish between patients with steatosis and those with histological NASH. Furthermore, they cannot be used to assess the severity of steatohepatitis in affected patients. CT was shown to quantify hepatic steatosis in patients with biopsy-proven NAFLD using a liver–spleen attenuation difference, but there was no correlation with histological inflammation or fibrosis.61 However, newer imaging techniques show promise for detecting inflammation, as outlined in Table 2.
Studies in NAFLD, using transient elastography, demonstrated a significant correlation between liver stiffness values and increasing necroinflammatory activity, but not independently of fibrosis.62 However, in chronic viral hepatitis, necroinflammation independently influences liver stiffness.63 Furthermore, Arena and colleagues investigated the effect of inflammation on the use of TE in a cohort of 18 patients with acute viral hepatitis. They concluded that the extent of necroinflammatory activity should be carefully considered, particularly in patients with mild fibrosis (F0–2), as there is a risk of over-estimating fibrotic severity.64
As the histological criteria for diagnosis of NASH include the presence of inflammatory cellular infiltrates and hepatocyte ballooning,65 it might be surmised that inflammation would affect liver stiffness. However, to show an independent effect on multivariate regression analysis would require a small range of fibrosis stages (the dominant factor) and a wide range of severity of inflammation in a large number of patients; for these logistic reasons, such an analysis has not yet been performed in NAFLD cohorts. Furthermore, the temporal gap between biopsy and imaging might affect the correlation with inflammation (and fat), as these fluctuate more rapidly than fibrosis.
Although ARFI measures liver stiffness in a manner similar to TE, a stepwise increase in median velocity/stiffness with greater histological necroinflammatory activity has not been seen.41 Furthermore, Palmeri and colleagues did not report any significant effect of either lobular inflammation or hepatocyte ballooning on shear stiffness values.66 In comparison, MRE demonstrated liver stiffness values that were significantly higher among patients with inflammation compared to those with SS, and stiffness correlated with the grade of inflammatory activity.43 Furthermore, animal studies have shown increased liver stiffness at the stage of hepatocellular injury, preceding the development of fibrosis.67,68 MRE has a high diagnostic accuracy to discern NASH from steatosis, with an area under the receiver–operator characteristic curve (AUROC) value of 0.93. However, the sample size of patients with inflammation, but without fibrosis, was small (n = 7), and as an expected consequence, there was substantial overlap in stiffness values between inflammation and fibrosis groups.43 These findings must now be validated in larger, prospective studies.
A markedly different approach using CEUS was pursued by Iijima and colleagues. They examined the parenchymal phase of microbubble US enhancement, and observed decreased parenchymal signal in patients with NASH compared to those with NAFLD and healthy volunteers.44 The diagnostic accuracy for NASH in this cohort was high (80% at 5 min, 100% at 20 min), but data on correlation with degree of inflammation or fibrosis stage were not presented. In another small study, there was a correlation between decreased microbubble accumulation and centrilobular and pericellular fibrosis, although the data presented were limited.69 Moreover, the biological interpretation of these data is unclear; differential microbubble trapping with fibrosis pattern and Kupffer cell phagocytic activity were considered by the authors to be possible explanations, but further investigation is required.
Another approach has been the use of 31P MRS, which was recently applied to patients with NAFLD/NASH. Of the indices tested, NADPH/(PME + PDE) was significantly higher in patients with NASH (n = 13) than those with steatosis alone (n = 9). While this is an interesting observation, the numbers in each group were small, and the contributing effect of fibrosis stage within the NASH group was not explicitly examined.45 Both 31P MRS and CEUS require independent validation in larger cohorts.
It is clinically useful to distinguish mild fibrosis (F0–2) from advanced fibrosis (F3) and cirrhosis (F4), as the latter stages are associated with the risk of liver-related complications. The application of conventional techniques for the assessment of fibrosis in NAFLD has been the subject of only a few, limited studies. A retrospective study using both US and CT demonstrated that the sensitivity for the detection of advanced fibrosis was reduced substantially by the presence of severe hepatic steatosis. Conversely, sensitivity for the detection of histological steatosis was reduced in the presence of advanced fibrosis.70 Retrospective assessment of gadolinium-enhanced MR images, applying a semiquantitative scoring system for reticular patterning in 30 patients with NAFLD/NASH, revealed a moderate correlation with histological fibrosis (r = 0.6, P < 0.001); again, prospective validation of such scoring systems is required.36 However, the newer elastography-based imaging techniques have demonstrated high accuracy in predicting advanced disease (Table 3).
Table 3. Evaluation of elastography-based techniques for assessment of fibrosis
Many studies use the ROC as a basis for the measurement of diagnostic accuracy. The curve is plotted on axes representing the sensitivity and 1-specificity for all the observed cut-off values used to distinguish two disease states (for example, F0–2 vs F3–4). The AUROC is considered a convenient measure of diagnostic accuracy, where an area of 1.0 represents perfect accuracy, and 0.5 is what would be expected by chance. Cut-off values are often derived from where sensitivity and specificity are maximal, but might also be set to a predefined sensitivity or specificity as required.
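The construction described above can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions: the stiffness values and biopsy labels are invented, the function names are hypothetical, the area is estimated by the trapezoidal rule, and the cut-off is chosen by maximizing the sum of sensitivity and specificity (Youden's index), which is one common reading of "where sensitivity and specificity are maximal".

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity, cut-off) for every observed cut-off.

    A case is called positive when its score is >= the cut-off.
    labels: 1 = disease state present (e.g. F3-4), 0 = absent (e.g. F0-2).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0, float("inf"))]  # cut-off above every score
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / n_neg, tp / n_pos, cut))
    return points


def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def youden_cutoff(points):
    """Cut-off maximizing sensitivity + specificity - 1 (Youden's index)."""
    return max(points, key=lambda p: p[1] - p[0])[2]


# Hypothetical liver stiffness values (kPa) and biopsy labels, for illustration only
stiffness = [4.2, 5.1, 5.8, 6.3, 7.0, 7.9, 8.8, 9.6, 11.4, 14.0, 17.5]
advanced = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

pts = roc_points(stiffness, advanced)
print(f"AUROC = {auroc(pts):.2f}")
print(f"Cut-off by Youden's index = {youden_cutoff(pts)} kPa")
```

A cut-off fixed to a predefined sensitivity or specificity could instead be read directly from the list of points, at the cost of a lower combined sensitivity and specificity.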
A significant increase in liver stiffness values measured by TE was seen with increasing severity of fibrosis in patients with NAFLD.40,41,62,71–78 Diagnostic accuracy was high, with the AUROC ranging from 0.75 to 0.91 for the detection of mild fibrosis (F0–2) and 0.94–0.99 for cirrhosis (F4). However, although TE has a high diagnostic accuracy for the detection of severe and mild fibrosis (F4 and F1, respectively), the intermediate stages (F2–3) remain less easily discriminated.
With the development of the XL probe for use with obese patients, it should be noted that median liver stiffness values are lower than with the M probe,76 so different cut-off values are necessary. An important practical point follows: probes should not be used interchangeably in longitudinal assessments with repeated measures.
As histology of liver biopsy samples provides only a semiquantitative assessment of fibrosis, Mori and colleagues conducted a study to investigate whether liver stiffness measurements reflect collagen deposition and myofibroblast activity in the liver parenchyma in NAFLD. They demonstrated that liver stiffness values correlated significantly, not only with histological stage of hepatic fibrosis, but also with fibrosis area; serum levels of pro-fibrogenic molecules, such as procollagen III aminopeptide; type IV collagen 7S; and hyaluronic acid, providing further validation of the technique.72
ARFI demonstrated a stepwise increase in median velocity with increasing stages of fibrosis. The diagnostic accuracy for the detection of mild fibrosis (F0–2) was high (0.85), and improved for the detection of severe fibrosis (F3–4), with AUROC values ranging from 0.74 to 0.98.40,41,66,75
While MRE has been shown to delineate fibrosis stage in mixed cohorts of chronic liver disease,79,80 stratification of fibrosis stage has not yet been demonstrated in a cohort of patients with NAFLD, although liver stiffness has been noted to be higher in patients with NASH with fibrosis compared to NASH without fibrosis.43
TE has been evaluated in a large number of patients with NAFLD in separate trials over different geographic regions, but studies of other imaging techniques for the assessment of chronic liver diseases contain relatively small patient numbers. Furthermore, there is great variation in the reporting of classification of patients by histological severity in published studies, making direct comparison between studies or techniques difficult (Table 3). While the AUROC can be a useful summary figure, the cut-off value used, sensitivity, specificity, PPV, and NPV are required to evaluate performance fully, and a number of studies fail to report these figures. Indeed, for a clinically-relevant assessment, the pretest probability of a disease state and investigation failure rates should also be included. Studies of TE have led the way in examining reproducibility, reliability, and accuracy in well-defined cohorts of patients, and it is important that future studies of imaging techniques in NAFLD address these issues.
Nevertheless, the studies discussed in this review have demonstrated that different approaches might be conducted from the same platform to provide information on different aspects of the disease. Thus, there is potential for a more comprehensive picture of the disease state using a combination of techniques than with each technique alone. For example, ARFI or CEUS might be combined with B-mode US to provide information (using anatomical views) on steatosis, fibrosis stage, and potential inflammation. There is potential for MRE, MRS (31P and 1H), and MRI to be combined in a single scanning session to provide quantitative and multiparametric data. However, the utility of such approaches requires examination in NAFLD and other chronic liver disease cohorts. Already, however, imaging-based techniques have been combined to derive data on disease prevalence and severity in a manner not possible with liver biopsy in view of the costs and risks involved. A combination of 1H MR spectroscopy and transient elastography was applied to assess the prevalence of both significant steatosis and significant fibrosis in a population-based study in Hong Kong.13 The prevalence of hepatic steatosis, defined as ≥ 5% hepatic lipid, was 28.6% in 922 ethnic Chinese adults without viral hepatitis, and importantly, among these, 3.7% had a liver stiffness of ≥ 9.6 kPa, indicating likely advanced fibrosis, compared with 1.3% of the non-steatotic volunteers. This novel approach has indicated a high burden of advanced fibrosis associated with NAFLD in the population.
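The scale of these population figures can be made explicit with simple arithmetic. The sketch below uses only the percentages quoted above; the absolute counts it prints are approximations back-calculated from rounded percentages and should not be read as the numbers reported by the study itself.

```python
# Figures quoted in the text for the Hong Kong population-based study
n_total = 922
steatosis_prevalence = 0.286       # hepatic lipid >= 5%
stiff_given_steatosis = 0.037      # liver stiffness >= 9.6 kPa
stiff_given_no_steatosis = 0.013

# Approximate counts, back-calculated from the rounded percentages
n_steatosis = round(n_total * steatosis_prevalence)
n_no_steatosis = n_total - n_steatosis
n_stiff_steatosis = round(n_steatosis * stiff_given_steatosis)
n_stiff_no_steatosis = round(n_no_steatosis * stiff_given_no_steatosis)

# Ratio of the two proportions with raised liver stiffness
relative = stiff_given_steatosis / stiff_given_no_steatosis

print(f"With steatosis: ~{n_steatosis}, of whom ~{n_stiff_steatosis} had stiffness >= 9.6 kPa")
print(f"Without steatosis: ~{n_no_steatosis}, of whom ~{n_stiff_no_steatosis} had stiffness >= 9.6 kPa")
print(f"Ratio of proportions: ~{relative:.1f}")
```

Raised liver stiffness was thus almost three times as frequent in volunteers with steatosis as in those without, although the absolute number of individuals in each group was small.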
NAFLD is an increasing problem worldwide in terms of its prevalence and the resultant liver-related and cardiovascular complications in the subset of patients with NASH and progressive disease. In order to target therapy appropriately, accurate, robust, reliable, and cost-effective tools are required for diagnosis and staging. Conventional imaging techniques have a role in the detection of steatosis and signs of cirrhosis, but are inadequate for the assessment of inflammation or staging. Elastography techniques have been validated extensively in chronic liver diseases, including NAFLD, and are being adopted increasingly in clinical practice, but questions remain as to the relative contributions of inflammation and fibrosis to liver stiffness measurements. CEUS has had limited application in NAFLD/NASH, and the biological interpretation of the output is still unclear. As with non-invasive markers of other chronic liver diseases, concordant results from complementary techniques (including serum markers) might limit, but not remove, the need for liver biopsy. In the future, imaging-based techniques might be combined to provide a more complete description of the natural history of NAFLD/NASH, and there is need for further validation, singly and in combination, in prospective longitudinal studies with clinically-important end-points.
JFLC is an NIHR clinical lecturer. The authors thank the Imperial College Biomedical Research Centre for infrastructural support.