Noninvasive markers of fibrosis in nonalcoholic fatty liver disease: Validating the European Liver Fibrosis Panel and exploring simple markers

Abstract

The detection of fibrosis within nonalcoholic fatty liver disease (NAFLD) is important for ascertaining prognosis and for stratifying patients for emerging therapeutic interventions. We validated the Original European Liver Fibrosis panel (OELF) and a simplified algorithm not containing age, the Enhanced Liver Fibrosis panel (ELF), in an independent cohort of patients with NAFLD. Furthermore, we explored whether the addition of simple markers to the existing panel test could improve diagnostic performance. One hundred ninety-six consecutively recruited patients from 2 centers were included in the validation study. The diagnostic accuracy of the discriminant scores of the ELF panel, simple markers, and a combined panel were compared using receiver operating characteristic curves, predictive values, and a clinical utility model. The ELF panel had an area under the curve (AUC) of 0.90 for distinguishing severe fibrosis, 0.82 for moderate fibrosis, and 0.76 for no fibrosis. Simplification of the algorithm by removing age did not alter diagnostic performance. Addition of simple markers to the panel improved diagnostic performance with AUCs of 0.98, 0.93, and 0.84 for the detection of severe fibrosis, moderate fibrosis, and no fibrosis, respectively. The clinical utility model showed that 82% and 88% of liver biopsies could be potentially avoided for the diagnosis of severe fibrosis using ELF and the combined panel, respectively. The ELF panel has good diagnostic accuracy in an independent validation cohort of patients with NAFLD. The addition of established simple markers augments the diagnostic performance across different stages of fibrosis, which will potentially allow superior stratification of patients with NAFLD for emerging therapeutic strategies. (HEPATOLOGY 2007.)

Nonalcoholic fatty liver disease (NAFLD) is emerging as a major global cause of liver disease on the background of an increasing prevalence of obesity and type 2 diabetes. NAFLD encompasses a spectrum of disease from simple steatosis through nonalcoholic steatohepatitis, to fibrosis and ultimately cirrhosis and hepatocellular carcinoma. The identification of the minority of patients with fibrosis amongst those with NAFLD is critically important for prognosis1, 2 and therefore for the selection of patients who are candidates for existing and emerging therapeutic interventions. Furthermore, the identification of the subset of patients who have developed cirrhosis has clear importance for prophylaxis against variceal bleeding, surveillance for hepatocellular cancer, and the timing of transplantation. Currently, the differentiation of steatosis from steatohepatitis and fibrosis in NAFLD is dependent on histological examination of liver biopsies. However, liver biopsy is invasive and is limited by sampling error, diagnostic accuracy, and hazard to the patient.3, 4 In addition, the number of patients with NAFLD means that routine use of liver biopsy in their investigation is impractical both logistically and financially.

The Original European Liver Fibrosis (OELF) test is an example of a panel of markers that reflect matrix turnover. It consists of age, tissue inhibitor of matrix metalloproteinase 1 (TIMP1), hyaluronic acid (HA), and aminoterminal peptide of pro-collagen III (P3NP), and was developed for a variety of liver disorders.5 The initial study suggested the test performed particularly well in NAFLD, although the number of patients with this pathological condition was small. The OELF can be simplified by removing age without reducing diagnostic accuracy in other liver diseases (Parkes et al., unpublished observations), and therefore we wanted to validate this finding specifically in NAFLD.

A number of questions regarding noninvasive markers remain unanswered in the context of NAFLD. First, many of the published algorithms concentrate on detecting severe fibrosis, but there are clinical reasons why the detection of earlier stages of fibrosis may also be desirable. Second, there is emerging evidence from systematic reviews and original research that simple clinical and biochemical markers have a role in distinguishing severe fibrosis.6 We hypothesized that the combination of these simple markers with established markers of matrix turnover may improve the diagnostic performance of noninvasive markers in NAFLD.

This study had the following aims: (1) to validate the OELF panel and modified panel not containing age, the Enhanced Liver Fibrosis panel (ELF), as surrogate markers of fibrosis in an independent cohort exclusively containing patients with NAFLD; (2) to compare the performance of the ELF panel with an algorithm consisting of simple clinical and biochemical parameters7 and to ascertain whether this combination improves diagnostic performance.

Abbreviations

ALT, alanine aminotransferase; AST, aspartate aminotransferase; AUC, area under the curve; BMI, body mass index; DS, discriminant scores; ELF, Enhanced Liver Fibrosis panel; HA, hyaluronic acid; NAFLD, nonalcoholic fatty liver disease; OELF, Original European Liver Fibrosis panel; TIMP1, tissue inhibitor of matrix metalloproteinase 1.

Patients and Methods

Patients were recruited consecutively from 2 tertiary outpatient liver centers in the United Kingdom, Nottingham and Newcastle-upon-Tyne. The diagnosis of NAFLD was based on the following criteria: (1) elevated aminotransferases [aspartate aminotransferase (AST) or alanine aminotransferase (ALT)]; (2) appropriate exclusion of liver disease of other origin including alcohol-induced or drug-induced liver disease, autoimmune or viral hepatitis, or cholestatic or metabolic/genetic liver disease. These other liver diseases were excluded using specific clinical, biochemical, radiographic, or histological criteria. All patients had a history of ethanol consumption of less than 140 g in women and 210 g in men. In the Newcastle center, alcohol levels in urine were measured randomly to rule out patients who abused alcohol. Patients included in this study had consecutive liver biopsies at the individual centers, between October 3, 2002 and December 31, 2006, where the histology was consistent with NAFLD and serum samples were taken within 3 months of biopsy. The following anthropometric measurements were obtained: waist circumference, hip circumference, and body mass index (BMI). Serum samples were obtained for routine liver chemistry (including ALT, AST, gamma glutamyl transpeptidase, bilirubin, albumin, and alkaline phosphatase), full blood count, measures of insulin resistance (including fasting glucose, insulin and C peptide), ferritin, total cholesterol, high-density lipoprotein, low-density lipoprotein, and triglycerides. Serum samples were analyzed for levels of TIMP1, HA, and P3NP at an independent reference laboratory (iQur Limited, Southampton, UK). Results were entered into the established algorithm5 and expressed as discriminant scores (DS).

Liver Biopsy.

Liver biopsies were assessed by 2 hepatopathologists, one at each center. Biopsy specimens were scored for fibrosis using a 5-stage classification system for fibrosis that has recently been published by the National Institute of Diabetes and Digestive and Kidney Diseases8; stage 0 = absence of fibrosis, stage 1 = perisinusoidal or portal, stage 2 = perisinusoidal and portal/periportal, stage 3 = septal or bridging fibrosis, and stage 4 = cirrhosis.

Statistical Analysis.

Discriminant scores were compared with histological staging of liver biopsies from corresponding patients, and the sensitivity and specificity of the DS for detecting fibrosis were calculated. These results were then used to plot receiver operating characteristic curves, and the area under the curve (AUC) was calculated. Positive and negative predictive values for detecting different degrees of severity of fibrosis were also calculated in the cohort.
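The calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure used in the study: the function names and toy values are hypothetical, a score at or above the threshold is assumed to count as a positive test, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than by integrating a plotted curve.

```python
from math import nan


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else nan


def diagnostic_metrics(scores, diseased, threshold):
    """Sensitivity, specificity, PPV and NPV at one discriminant-score threshold.

    Assumption: a score >= threshold is treated as a positive test.
    """
    tp = fp = tn = fn = 0
    for score, has_disease in zip(scores, diseased):
        test_positive = score >= threshold
        if test_positive and has_disease:
            tp += 1
        elif test_positive:
            fp += 1
        elif has_disease:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auc(scores, diseased):
    """Area under the ROC curve as the probability that a diseased patient
    scores higher than a non-diseased patient (ties count as one half)."""
    positives = [s for s, d in zip(scores, diseased) if d]
    negatives = [s for s, d in zip(scores, diseased) if not d]
    if not positives or not negatives:
        return nan
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical toy data, not values from the study cohort.
    toy_scores = [-1.2, -0.5, 0.1, 0.4, 0.9, 1.5]
    toy_diseased = [False, False, False, True, False, True]
    print(diagnostic_metrics(toy_scores, toy_diseased, threshold=0.3))
    print(auc(toy_scores, toy_diseased))
```

Repeating `diagnostic_metrics` over a range of thresholds gives the sensitivity and specificity pairs from which a receiver operating characteristic curve is drawn.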

Validation of OELF and ELF Panels.

The OELF panel was validated using the original algorithm and an adjusted algorithm not containing age—the ELF panel (see Appendix 1 for algorithms). Three end points were chosen for the evaluation of fibrosis: (i) any fibrosis (stage 0 versus 1/2/3/4); (ii) moderate fibrosis (stage 0/1 versus 2/3/4); and (iii) severe fibrosis (stage 0/1/2 versus 3/4). The ELF panel was used for the remaining analyses.
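The three end points amount to dichotomizing the 5-stage histological score at different cut points. A small sketch of that mapping follows; the dictionary and function names are illustrative and not taken from the study.

```python
# Lowest fibrosis stage that counts as "positive" for each end point,
# following the definitions in the text (stages run from 0 to 4).
ENDPOINT_MIN_STAGE = {
    "any": 1,       # stage 0 versus 1/2/3/4
    "moderate": 2,  # stage 0/1 versus 2/3/4
    "severe": 3,    # stage 0/1/2 versus 3/4
}


def is_positive(stage, endpoint):
    """Return True when a biopsy stage meets the chosen fibrosis end point."""
    if stage not in (0, 1, 2, 3, 4):
        raise ValueError(f"fibrosis stage must be 0-4, got {stage!r}")
    return stage >= ENDPOINT_MIN_STAGE[endpoint]


if __name__ == "__main__":
    for stage in range(5):
        labels = {name: is_positive(stage, name) for name in ENDPOINT_MIN_STAGE}
        print(stage, labels)
```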

Comparison of the ELF with Simple Clinical and Biochemical Parameters.

To compare the ELF panel with a panel of simple markers, we used the same markers recently validated in an international, multicenter study by Angulo et al.7 This panel consisted of age, BMI, presence of diabetes or impaired fasting glucose, AST/ALT ratio, platelets, and albumin and herein will be referred to as the simple panel. The panel was originally developed for distinguishing severe fibrosis, and therefore the algorithm was modified using logistic regression to account for the other end points of fibrosis (see Appendix 1 for list of algorithms). The components of the simple panel were present only in a subgroup of the cohort (n = 91); therefore, further analysis of the ELF panel in comparison with the simple panel was performed in this subgroup. In addition, a combined algorithm consisting of the ELF panel and simple markers was produced to see whether diagnostic accuracy was improved (Appendix 1). The 3 algorithms (i) ELF panel, (ii) simple panel, and (iii) combined panel (ELF panel + simple) were tested on the 3 levels of fibrosis. Receiver operating characteristic curves were produced to compare diagnostic performance. All analyses were carried out using SPSS 14.

Results

Paired serum and histological data were available for 192 subjects. The baseline characteristics of these patients are shown in Table 1. The demographic data were similar for the 2 populations; 64% of subjects were male, the mean age in the study was 49 years, and 63% of subjects had evidence of metabolic syndrome.

Table 1. Baseline Patient Characteristics in Individual and Combined Cohorts

| Category | Nottingham centre | Newcastle centre | Entire cohort |
| --- | --- | --- | --- |
| Number | 88 | 104 | 192 |
| Age (years) | 50.4 ± 11.5 | 47.3 ± 11.1 | 48.7 ± 12.5 |
| Male subjects | 65% | 63% | 64% |
| BMI (kg/m^2) | 30 ± 4.5 | 34.4 ± 5.9 | 32.4 ± 5.7 |
| Waist (cm) | 104.5 ± 12.5 | 111.2 ± 12.7 | 107.8 ± 13 |
| Metabolic syndrome (yes) | 66% | 60% | 63% |
| Fasting glucose (mmol/L) | 6.0 ± 1.7 | 6.5 ± 3.3 | 6.3 ± 2.7 |
| Triglycerides (mmol/L) | 2.1 ± 1.6 | 2.8 ± 1.8 | 2.5 ± 1.8 |
| HDL (mmol/L) | 1.4 ± 0.42 | 1.1 ± 0.28 | 1.2 ± 0.4 |
| ALT (U/L) | 76.1 ± 48.9 | 78.4 ± 64.6 | 77.3 ± 57.8 |
| GGT (U/L) | 140 ± 135 | 104 ± 102 | 121 ± 119.5 |
| Platelets (×10^9/L) | 234 ± 71.6 | 243 ± 70.6 | 239 ± 71 |
| Albumin (g/L) | 43.7 ± 3.4 | 44.9 ± 4.9 | 44.3 ± 4.3 |
| Fibrosis stage 0 | 32% | 49% | 41% |
| Fibrosis stage 1 | 18% | 19% | 19% |
| Fibrosis stage 2 | 27% | 8% | 17% |
| Fibrosis stage 3 | 15% | 12% | 13% |
| Fibrosis stage 4 | 8% | 12% | 10% |

Values are given as mean ± standard deviation unless stated.

Both OELF and ELF had similar diagnostic performance as exemplified by the box plots in Fig. 1 for the identification of stages of fibrosis. Therefore, to simplify the algorithm, we omitted age and used the ELF panel for the remaining analysis. The ELF panel had excellent performance in distinguishing severe fibrosis (stage 3/4) with an AUC of 0.90 [confidence interval, 0.84–0.96]. A threshold of 0.3576 was associated with a sensitivity of 80%, a specificity of 90%, a positive predictive value of 71%, and a negative predictive value of 94%. In distinguishing moderate fibrosis, the overall AUC was 0.82 (confidence interval, 0.75–0.88). A threshold of −0.1068 was associated with a sensitivity of 70%, a specificity of 80%, a positive predictive value of 70%, and a negative predictive value of 80%. In distinguishing no fibrosis, the overall AUC was 0.76 (confidence interval, 0.69–0.83). A threshold of −0.2070 was associated with a sensitivity of 61%, a specificity of 80%, a positive predictive value of 81%, and a negative predictive value of 79%. See Table 2 for a full list of thresholds associated with the different end-points of fibrosis.
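The likelihood ratios reported alongside each threshold in Table 2 follow from sensitivity and specificity by the standard definitions. As a worked check using the rounded values for severe fibrosis at the 0.3576 threshold (sensitivity 80%, specificity 90%):

```latex
\mathrm{LR}^{+} = \frac{\text{Sens}}{1 - \text{Spec}} = \frac{0.80}{1 - 0.90} = 8.0,
\qquad
\mathrm{LR}^{-} = \frac{1 - \text{Sens}}{\text{Spec}} = \frac{1 - 0.80}{0.90} \approx 0.22
```

Table 2 lists 7.84 and 0.22 for this threshold. The small difference in the positive likelihood ratio is presumably because the tabulated ratios were computed from unrounded sensitivity and specificity; that explanation is an assumption, not something the text states.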

Figure 1.

Box plots for OELF (blue) and ELF (green) for discriminant score and fibrosis stage.

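Table 2. Diagnostic Performance of the Enhanced Liver Fibrosis Panel at Different Thresholds in the Combined Cohort

| Stage of Fibrosis | Discriminant Score Threshold | Sens | Spec | PPV | NPV | LR (+ve) | LR (−ve) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 versus 1/2/3/4 (any fibrosis) | −1.6533 | 100 | 4 | 60 | 100 | 1.04 | 0 |
| | −1.2009 | 95 | 19 | 63 | 71 | 1.17 | 0.28 |
| | −1.02813 | 90 | 27 | 64 | 66 | 1.23 | 0.36 |
| | −0.6415 | 80 | 56 | 72 | 66 | 1.80 | 0.37 |
| | −0.2070 | 61 | 80 | 81 | 79 | 3.01 | 0.49 |
| | 0.2112 | 45 | 90 | 86 | 53 | 4.47 | 0.61 |
| | 0.3272 | 43 | 95 | 93 | 54 | 8.67 | 0.59 |
| | 1.6454 | 13 | 100 | 100 | 45 | N/A | 0.87 |
| 0/1 versus 2/3/4 (moderate fibrosis) | −1.4217 | 100 | 7 | 42 | 100 | 1 | 0 |
| | −1.0691 | 95 | 22 | 45 | 86 | 1.21 | 0.24 |
| | −0.6746 | 90 | 50 | 54 | 88 | 1.78 | 0.21 |
| | −0.3625 | 80 | 67 | 62 | 84 | 2.44 | 0.29 |
| | −0.1068 | 70 | 80 | 70 | 80 | 3.51 | 0.37 |
| | 0.3145 | 56 | 90 | 78 | 75 | 5.35 | 0.49 |
| | 0.5734 | 45 | 95 | 85 | 72 | 8.71 | 0.57 |
| | 2.2859 | 9 | 100 | 100 | 62 | N/A | 0.91 |
| 0/1/2 versus 3/4 (severe fibrosis) | −1.2413 | 100 | 12 | 26 | 100 | 1.14 | 0 |
| | −0.7121 | 98 | 42 | 34 | 98 | 1.68 | 0.05 |
| | −0.4184 | 96 | 57 | 41 | 98 | 2.24 | 0.07 |
| | −0.1068 | 90 | 75 | 52 | 96 | 3.54 | 0.15 |
| | 0.3576 | 80 | 90 | 71 | 94 | 7.84 | 0.22 |
| | 0.8139 | 62 | 95 | 78 | 89 | 11.40 | 0.41 |
| | 1.6454 | 29 | 99 | 87 | 82 | 21.32 | 0.72 |
| | 2.2858 | 16 | 100 | 100 | 80 | N/A | 0.84 |

Abbreviations: Sens, sensitivity (%); Spec, specificity (%); PPV, positive predictive value (%); NPV, negative predictive value (%); LR, likelihood ratio; LR +ve, positive likelihood ratio; LR −ve, negative likelihood ratio.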

Clinical Utility of Noninvasive Markers in NAFLD.

If thresholds are used to “rule in” fibrosis (upper threshold with high specificity and hence positive predictive value) or “rule out” fibrosis (lower threshold with high sensitivity and hence negative predictive value) with a high degree of accuracy, the clinical utility of noninvasive markers can be appreciated.9 Using the ELF panel to distinguish severe fibrosis at the thresholds of −1.0281 and 0.2112 (sensitivity and specificity of 90%, respectively), 86% of this cohort would have avoided a liver biopsy, with 76% correctly classified; the remaining 14% would have had an indeterminate classification (in other words, values between these thresholds) and would have required a liver biopsy. For the detection of moderate fibrosis (using thresholds with a sensitivity and specificity of 90%, respectively), 62% would have avoided a liver biopsy, with 52% correctly classified, and 38% would have had an indeterminate classification. If ELF was used to delineate any fibrosis (using thresholds with a sensitivity and specificity of 90%, respectively), 48% would have avoided a liver biopsy, with 38% correctly classified, and 52% would have had an indeterminate classification. If a sensitivity and specificity of 80% is chosen for the detection of any fibrosis, 79% would have avoided a liver biopsy, with 59% correctly classified, and 21% would have had an indeterminate classification.
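The two-threshold logic of this clinical utility model can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name and toy data are hypothetical, scores at or below the lower threshold are treated as "ruled out", scores at or above the upper threshold as "ruled in", and all proportions are expressed relative to the whole cohort.

```python
def clinical_utility(scores, diseased, rule_out, rule_in):
    """Proportion of patients who avoid biopsy, are correctly classified,
    or remain indeterminate under a rule-out / rule-in threshold pair."""
    if rule_out >= rule_in:
        raise ValueError("rule_out threshold must be below rule_in threshold")
    n = len(scores)
    if n == 0:
        raise ValueError("at least one patient is required")

    avoided = 0   # classified without biopsy
    correct = 0   # classified without biopsy and matching histology
    for score, has_disease in zip(scores, diseased):
        if score <= rule_out:        # fibrosis ruled out
            avoided += 1
            if not has_disease:
                correct += 1
        elif score >= rule_in:       # fibrosis ruled in
            avoided += 1
            if has_disease:
                correct += 1
        # otherwise the score is indeterminate and a biopsy is still needed

    return {
        "avoided": avoided / n,
        "correct": correct / n,
        "indeterminate": (n - avoided) / n,
    }


if __name__ == "__main__":
    # Hypothetical toy data, not values from the study cohort.
    toy_scores = [-1.5, -1.1, -0.2, 0.0, 0.3, 0.8, 1.2, -1.3]
    toy_diseased = [False, False, False, True, True, True, False, True]
    print(clinical_utility(toy_scores, toy_diseased, rule_out=-1.0, rule_in=0.25))
```

Widening the gap between the two thresholds raises the accuracy of the classified group at the cost of a larger indeterminate group, which is the trade-off the percentages above describe.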

Simple Panel Markers.

The performance of the simple panel is shown in Table 3 in comparison with ELF and combining ELF with the simple markers. The best algorithm is the combination of simple markers and ELF panel. Figure 2 shows the translation of the improved AUC scores into number of biopsies avoided using the clinical utility model described (optimal thresholds for simple and combined panels are shown in Appendix 2).

Table 3. Performance of ELF Panel and Simple Markers Panel in Distinguishing Different Stages of Fibrosis as Measured by AUC Values With Confidence Intervals (n = 91)

| Panel | 0 Versus 1/2/3/4 Any Fibrosis | 0/1 Versus 2/3/4 Moderate Fibrosis | 0/1/2 Versus 3/4 Severe Fibrosis |
| --- | --- | --- | --- |
| Simple | 0.79 (0.69–0.88) | 0.86 (0.78–0.94) | 0.89 (0.81–0.97) |
| ELF | 0.82 (0.73–0.90) | 0.90 (0.84–0.96) | 0.93 (0.88–0.98) |
| Simple + ELF | 0.84 (0.76–0.92) | 0.93 (0.88–0.99) | 0.98 (0.96–1) |

Figure 2.

Clinical utility model of simple, ELF and combined panel in avoiding liver biopsies at different severities of liver fibrosis.

Discussion

The performance of the ELF panel in this cohort in distinguishing severe fibrosis is excellent, with an AUC of 0.90, and is comparable to the original cohort and other panel marker tests in this disease.7, 10, 11 Moreover, age is not required in this modified algorithm, simplifying the panel and allowing age to be used as an independent variable that could be useful in future prognosis studies.

The direct comparison of ELF and simple markers in the same cohort attempts to address the question of whether the inclusion of specific markers of matrix turnover confers any additional benefit in the diagnosis of liver fibrosis in comparison with simple clinical and biochemical parameters. There is a suggestion from the AUC values and clinical utility model that improvement of the algorithm can be made by combining the ELF and simple markers panel; larger studies are required to confirm these findings.

The long-term prognostic studies suggest that fibrosis, and in particular severe fibrosis, is the most important histological determinant for developing future disease.1, 2 However, there are a number of reasons why the identification of other severities of fibrosis could be beneficial. In the community setting, it will facilitate the identification of patients with any fibrosis so that dietary and lifestyle interventions can be implemented early as well as deciding which of the many patients with abnormal liver function tests require referral to secondary care. In the secondary care setting, serum marker tests could be used to identify suitable candidates for new and emerging pharmacological treatments for NAFLD and liver fibrosis, the selection of the test threshold being determined by the risk–benefit ratio of the therapy; for example, an effective but potentially toxic therapy will require a threshold to be chosen with a high specificity. Finally, the identification of severe fibrosis and cirrhosis will aid stratification of patients into those requiring careful surveillance of the complications of liver disease and those in greatest need of any emerging pharmacological therapy.

Common misperceptions about noninvasive diagnostic tests for liver fibrosis include the suggestion that they can obviate the need for liver biopsy and that they inadequately distinguish moderate stages of fibrosis. The idea that any noninvasive marker can completely replace liver biopsy as a diagnostic tool is simply unrealistic. The biopsy offers a wealth of information about the liver, including the severity of necroinflammation, presence of multiple pathological conditions, and architectural disturbance. However, to suggest that this detailed information is required for every patient presenting with abnormal liver function tests serves the best interests of neither the patients nor the funders of healthcare. Evidence presented in this study clearly demonstrates that noninvasive markers can be used to identify patients who have early fibrosis in NAFLD, making them suitable tests for screening the increasing number of patients presenting with abnormal liver function tests, obesity, and metabolic syndrome.

This study has a number of limitations. The study population has been selected from a tertiary care setting and represents a more severe disease spectrum. This is a criticism of the majority of noninvasive markers and is attributable to the requirement of the liver biopsy as a reference standard. Although these results cannot be extrapolated to other healthcare settings, with different prevalence of disease, modeling suggests that the diagnostic accuracy for the identification of severe fibrosis will improve in a community setting (unpublished data). Use of noninvasive tests may depend on simple practical considerations such as the necessity for a fasted sample and availability of specific panel components. Whereas a single ELF algorithm is used to evaluate all stages of fibrosis, the simple marker algorithm must be adjusted to achieve optimal performance. Moreover, the ELF algorithm does not require demographic or anthropometric data, potentially simplifying the acquisition of data. The economic cost of any commercial test will need to be balanced against any practical or diagnostic benefit gained.
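The dependence of predictive values on disease prevalence mentioned above can be made concrete with Bayes' rule. The sketch below is illustrative only: the function name is hypothetical, sensitivity and specificity are assumed to stay fixed across settings, and the 5% community prevalence is an arbitrary example value, not a figure from this study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' rule.

    All arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Severe fibrosis at the 0.3576 threshold: sensitivity 80%, specificity 90%.
    # 0.23 is the share of stage 3/4 in the entire cohort (13% + 10%, Table 1);
    # 0.05 is a purely hypothetical lower-prevalence setting.
    for prevalence in (0.23, 0.05):
        ppv, npv = predictive_values(0.80, 0.90, prevalence)
        print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

At the cohort prevalence the function returns values close to the 71% and 94% reported for this threshold. At lower prevalence the negative predictive value rises while the positive predictive value falls, so which aspect of accuracy changes depends on whether the test is used to rule fibrosis out or in.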

One of the intriguing aspects of the ELF panel is the variation of performance in different diseases as shown in the original study.5 Although it is tempting to think of fibrosis as a common pathway for all liver disease, differences in the distribution of fibrosis and mechanisms of fibrosis may account for the variation in diagnostic accuracy, exemplified by the periportal distribution of fibrosis in hepatitis C compared with the perisinusoidal distribution of fibrosis in NAFLD and confounding effects attributable to inflammation, necrosis, and apoptosis. Other possibilities include the direct effects of the disease origin or the influence of extrahepatic manifestations on the serum markers. Although these uncertainties do not detract from the diagnostic use of ELF, clarifying these issues may provide further insights into fibrogenesis.

The true potential of serum markers may not be realized until longitudinal studies measuring serum markers against clinical outcomes are published. The ability to measure disease progression, regression, and response to treatment by serial measurement of serum markers would give clinicians valuable information to aid management decisions. In this regard there is emerging evidence that serum markers can offer prognostic information, specifically in the context of primary biliary cirrhosis (PBC) (presented at the American Association for the Study of Liver Diseases, 2006), and this also may be applicable to other diseases, including NAFLD.

We have validated the performance of the ELF panel and have shown that it is able to distinguish patients with any fibrosis, moderate fibrosis, and severe fibrosis in an independent cohort of patients with NAFLD. The modification of the algorithm by removing age simplifies the panel without losing any diagnostic accuracy. The addition of established simple markers augments the diagnostic performance of the ELF panel, across different stages of fibrosis, which will potentially allow superior stratification of patients with NAFLD for emerging therapeutic strategies.

Appendix 1

OELF Algorithm

DS = −6.38 − (ln(age)*0.14) + (ln(HA)*0.616) + (ln(P3NP)*0.586) + (ln(TIMP1)*0.472).

ELF Algorithm

DS = −7.412 + (ln(HA)*0.681) + (ln(P3NP)*0.775) + (ln(TIMP1)*0.494).
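The two discriminant-score formulas above translate directly into code. The sketch below is a plain transcription of the published coefficients; the function names are illustrative, and the inputs are assumed to be in whatever units the original algorithm expects, since the appendix does not state them.

```python
import math


def oelf_score(age, ha, p3np, timp1):
    """Original European Liver Fibrosis (OELF) discriminant score.

    Assumption: age in years; HA, P3NP and TIMP1 in the units used by
    the original algorithm (not stated in the appendix).
    """
    return (
        -6.38
        - math.log(age) * 0.14
        + math.log(ha) * 0.616
        + math.log(p3np) * 0.586
        + math.log(timp1) * 0.472
    )


def elf_score(ha, p3np, timp1):
    """Enhanced Liver Fibrosis (ELF) discriminant score (no age term)."""
    return (
        -7.412
        + math.log(ha) * 0.681
        + math.log(p3np) * 0.775
        + math.log(timp1) * 0.494
    )


if __name__ == "__main__":
    # Hypothetical marker values chosen only to exercise the functions.
    print(elf_score(ha=40.0, p3np=6.0, timp1=700.0))
    print(oelf_score(age=50, ha=40.0, p3np=6.0, timp1=700.0))
```

Because every marker enters through a natural logarithm, all inputs must be strictly positive; `math.log` raises `ValueError` otherwise.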

Simple Panel for Distinguishing No Fibrosis

Score = 6.375 + 0.062*BMI (kg/m^2) + 1.745*diabetes/IFG (yes = 1, no = 0) − 1.103*AST/ALT ratio − 0.037*age (years) − 0.005*platelets (×10^9/L) − 0.093*alb (g/L)

Combined Panel for Distinguishing No Fibrosis

Score = −2.722 + 1.482*ELF (discriminant score) + 0.062*BMI (kg/m^2) + 1.241*diabetes/IFG (yes = 1, no = 0) − 0.590*AST/ALT ratio − 0.002*platelets (×10^9/L) − 0.043*alb (g/L)

Simple Panel for Distinguishing Moderate Fibrosis

Score = 1.224 + 0.01*age (years) + 0.105*BMI (kg/m^2) + 1.946*diabetes/IFG (yes = 1, no = 0) − 1.786*AST/ALT ratio − 0.01*platelets (×10^9/L) − 0.04*alb (g/L)

Combined Panel for Distinguishing Moderate Fibrosis

Score = −5.257 + 2.408*ELF (discriminant score) + 0.084*BMI (kg/m^2) + 1.848*diabetes/IFG (yes = 1, no = 0) − 1.839*AST/ALT ratio − 0.012*platelets (×10^9/L) + 0.141*alb (g/L)

Simple Panel for Distinguishing Severe Fibrosis

Score = −1.122 + 0.03*age (years) + 0.08*BMI (kg/m^2) + 2.494*diabetes/IFG (yes = 1, no = 0) − 1.661*AST/ALT ratio − 0.011*platelets (×10^9/L) − 0.015*alb (g/L)

Combined Panel for Distinguishing Severe Fibrosis

Score = −20.870 + 5.506*ELF (discriminant score) + 4.513*diabetes/IFG (yes = 1, no = 0) − 3.144*AST/ALT ratio − 0.058*BMI (kg/m^2) − 0.026*platelets (×10^9/L) + 0.639*alb (g/L)
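As an illustration of how these linear scores are evaluated, the sketch below transcribes the simple and combined panels for severe fibrosis. The function and parameter names are hypothetical, the patient values are invented for the example, and the panels for the other end points would follow the same pattern with their own coefficients.

```python
import math


def simple_severe_score(age, bmi, diabetes_or_ifg, ast_alt_ratio, platelets, albumin):
    """Simple panel score for severe fibrosis (coefficients from Appendix 1).

    Units as given in the appendix: age in years, BMI in kg/m^2,
    platelets in 10^9/L, albumin in g/L; diabetes_or_ifg is True/False.
    """
    return (
        -1.122
        + 0.03 * age
        + 0.08 * bmi
        + 2.494 * int(diabetes_or_ifg)
        - 1.661 * ast_alt_ratio
        - 0.011 * platelets
        - 0.015 * albumin
    )


def combined_severe_score(elf_ds, bmi, diabetes_or_ifg, ast_alt_ratio, platelets, albumin):
    """Combined panel (ELF discriminant score + simple markers) for severe fibrosis."""
    return (
        -20.870
        + 5.506 * elf_ds
        + 4.513 * int(diabetes_or_ifg)
        - 3.144 * ast_alt_ratio
        - 0.058 * bmi
        - 0.026 * platelets
        + 0.639 * albumin
    )


if __name__ == "__main__":
    # Hypothetical patient, not taken from the study cohort.
    patient = dict(bmi=30, diabetes_or_ifg=True, ast_alt_ratio=1.0,
                   platelets=200, albumin=40)
    print(simple_severe_score(age=50, **patient))
    print(combined_severe_score(elf_ds=0.5, **patient))
```

The resulting score would then be compared with the thresholds listed in Appendix 2 for the corresponding panel and end point.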

Appendix 2

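Table. Optimal Thresholds for Simple and Combined Algorithms

| Panel | Stage of Fibrosis | Score | Sens | Spec | NPV | PPV | LR +ve | LR −ve |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Simple panel | Any fibrosis | −0.7069 | 92 | 50 | 85 | 66 | 1.83 | 0.17 |
| | | 1.1397 | 32 | 91 | 56 | 79 | 3.51 | 0.75 |
| | Moderate fibrosis | −1.6326 | 89 | 57 | 92 | 48 | 2.08 | 0.19 |
| | | −0.1657 | 68 | 89 | 86 | 73 | 6.12 | 0.36 |
| | Severe fibrosis | −2.3824 | 91 | 59 | 95 | 42 | 2.24 | 0.15 |
| | | −0.8325 | 77 | 93 | 93 | 77 | 10.74 | 0.25 |
| Combined panel | Any fibrosis | −5.002 | 92 | 52 | 86 | 66 | 1.92 | 0.16 |
| | | −3.346 | 60 | 91 | 69 | 87 | 6.55 | 0.44 |
| | Moderate fibrosis | −0.995 | 89 | 86 | 94 | 75 | 6.24 | 0.12 |
| | | −0.016 | 79 | 91 | 90 | 81 | 8.27 | 0.24 |
| | Severe fibrosis | −0.2826 | 91 | 96 | 77 | 99 | 21.13 | 0.09 |
| | | 0.0033 | 86 | 99 | 70 | 100 | 61.71 | 0.14 |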
