Abstract

  1. Top of page
  2. Abstract
  3. Patients and Methods
  4. Results
  5. Discussion
  6. Acknowledgements
  7. References
  8. Supporting Information

Liver stiffness evaluation (LSE) is usually considered reliable when it fulfills all of the following criteria: ≥10 valid measurements, ≥60% success rate, and interquartile range / median ratio (IQR/M) ≤0.30. However, such reliable LSE have never been shown to be more accurate than unreliable LSE. Thus, we aimed to evaluate the relevance of the usual definition for LSE reliability, and to improve reliability by using diagnostic accuracy as a primary outcome in a large population. A total of 1,165 patients with chronic liver disease from 19 French centers were included. All patients had liver biopsy and LSE. Overall, 75.7% of LSE were reliable according to the usual definition. However, these reliable LSE were not significantly more accurate than unreliable LSE: 85.8% versus 81.5% well-classified patients, respectively, for the diagnosis of cirrhosis (P = 0.082). In multivariate analyses with different diagnostic targets, LSE median and IQR/M were independent predictors of fibrosis staging, with no significant influence of ≥10 valid measurements or LSE success rate. These two reliability criteria (LSE median and IQR/M) determined three LSE groups: “very reliable” (IQR/M ≤0.10), “reliable” (0.10< IQR/M ≤0.30, or IQR/M >0.30 with LSE median <7.1 kPa), and “poorly reliable” (IQR/M >0.30 with LSE median ≥7.1 kPa). The rates of well-classified patients for the diagnosis of cirrhosis were, respectively: 90.4%, 85.8%, and 69.5% (P < 10⁻³). According to these new reliability criteria, 9.1% of LSE were poorly reliable (versus 24.3% unreliable LSE with the usual definition, P < 10⁻³), 74.3% were reliable, and 16.6% were very reliable. Conclusion: The usual definition for LSE reliability is not relevant. LSE reliability depends on IQR/M according to liver stiffness median level, thus defining three reliability categories: very reliable, reliable, and poorly reliable LSE. (HEPATOLOGY 2013)

Liver stiffness evaluation (LSE) by Fibroscan is now widely used in several countries for the assessment of liver fibrosis in chronic liver diseases. According to the usual definition, all the following criteria have to be met to consider LSE as reliable: ≥10 valid measurements, LSE success rate ≥60%, and LSE interquartile range / median (IQR/M) ≤0.30.1-3 Reliability criteria for LSE are of great importance, first in clinical practice because a reliable LSE result is useful for patient management, and also in clinical research because unreliable LSE are very often excluded from statistical analyses. When the usual definition is applied in clinical practice, 15% of LSE are considered unreliable.4 However, the relevance of the usual definition for LSE reliability has never been demonstrated, as no study has yet shown that LSE with at least 10 valid measurements, success rate ≥60%, and IQR/M ≤0.30 provide better diagnostic accuracy than those not fulfilling these three criteria.

Two recent studies focused on determining the reliability criteria of LSE.5, 6 In the Lucidarme et al.5 study, including 254 patients with chronic hepatitis C (CHC), neither the number of valid measurements nor the LSE success rate were independent predictors of discrepancy between LSE median and fibrosis stages as determined on liver biopsy. Independent predictors were pathological fibrosis stage and IQR/M, with the most significantly discriminating cutoff value for IQR/M calculated at 0.21. In the Myers et al.6 study, including 251 patients with various causes of chronic liver disease, independent predictors of discrepancy between LSE median and liver biopsy were IQR/M, body mass index, and low pathological fibrosis stages, with no influence of LSE success rate or ≥10 valid measurements. The most discriminative IQR/M cutoff for discrepancy was ≥0.17.

However, those studies had several limitations. First, they included pathological predictors, which makes their reliability criteria for LSE inapplicable to clinical practice. Second, their main judgment criterion was discrepancy rate. To evaluate discrepancies between liver biopsy and LSE median, both studies categorized the latter into estimated Metavir F stages (called FFS stages in the present study) according to several diagnostic cutoffs provided by binary diagnoses such as significant fibrosis or cirrhosis. We have previously shown that the combination of such diagnostic cutoffs accumulates the diagnostic errors of each, resulting in a loss of accuracy.7 Consequently, the study of discrepancies between histological fibrosis stages and such poorly accurate LSE classifications seems inadequate and calls into question the relevance of the ensuing calculated cutoffs for IQR/M. This may explain why calculated cutoffs for IQR/M in the Lucidarme et al. and Myers et al. studies failed to identify subgroups of LSE with significantly different diagnostic accuracies. Third, the sample size might have been too small considering the low prevalence of putative discrepancies. Finally, to determine the reliability criteria for LSE, a better study outcome may be diagnostic accuracy rather than discrepancy rate.

The aims of the present study were to evaluate the diagnostic relevance of the usual definition for LSE reliability and to precisely determine the noninvasive reliability criteria of LSE by using diagnostic accuracy as a primary outcome in a large population.

Patients and Methods

Patients

Two populations with liver biopsy and LSE were included in the present study. The first population was composed of patients with chronic liver disease recruited in three French centers between 2004 and 2009 (Angers: n = 383; Bordeaux: n = 309; and Grenoble: n = 142). Patients included in the Angers and Bordeaux centers had various causes of chronic liver diseases, whereas those from Grenoble had CHC. CHC patients of the three centers (n = 467) have been included in previous studies.8, 9 The second population was that of the multicenter ANRS/HC/EP23 Fibrostar study promoted by the French National Agency for Research in AIDS and Hepatitis.3 The patients included in both populations were identified and ultimately grouped as a single observation for statistical analyses. All patients gave written informed consent. The study protocol conformed to the ethical guidelines of the current Declaration of Helsinki and received approval from the local Ethics Committees.

Histological Assessment

Liver fibrosis was evaluated according to Metavir fibrosis (FM) staging. Significant fibrosis was defined as Metavir FM≥2, severe fibrosis as Metavir FM≥3, and cirrhosis as Metavir FM4. In the first population, histological evaluations were performed in each center by blinded senior pathologists specialized in hepatology. In the Fibrostar study, histological lesions were centrally evaluated by two senior experts with a consensus reading in cases of discordance. Fibrosis staging was considered as reliable when the liver specimen length was ≥15 mm and/or portal tract number ≥8.10

Liver Stiffness Evaluation

Precise definitions are provided in the Glossary in the Supporting Material.

Examination Conditions.

LSE by Fibroscan (Echosens, Paris, France) was performed with the M probe and by an experienced observer (>50 examinations before the study), blinded for patient data. A time interval of ≤3 months between liver biopsy and LSE was considered acceptable for the purposes of the study. Examination conditions were those recommended by the manufacturer,11 with the objective of obtaining at least 10 valid measurements. Results were expressed as the median and the IQR (kPa) of all valid measurements. According to the usual definition, LSE was considered reliable when it included ≥10 valid measurements with a success rate ≥60% and IQR/M ≤0.30.
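
The usual definition can be written as a short check. The following is a minimal sketch, not the authors' software: the function names are hypothetical, the success rate is assumed to be the number of valid measurements divided by the total number of attempts (the precise definition is in the Glossary of the Supporting Material), and the quartile interpolation method is an assumption, since the text does not specify one.

```python
from statistics import median, quantiles


def summarize_lse(valid_kpa, total_attempts):
    """Summarize one examination from its valid measurements (kPa).

    Assumptions (not from the source): success rate = valid / attempts,
    and quartiles use the 'inclusive' interpolation method.
    """
    m = median(valid_kpa)
    q1, _, q3 = quantiles(valid_kpa, n=4, method="inclusive")
    return {
        "n_valid": len(valid_kpa),
        "success_rate": len(valid_kpa) / total_attempts,
        "median": m,
        "iqr": q3 - q1,
        "iqr_over_m": (q3 - q1) / m,
    }


def is_reliable_usual(summary):
    """Usual definition: all three criteria must be met."""
    return (
        summary["n_valid"] >= 10
        and summary["success_rate"] >= 0.60
        and summary["iqr_over_m"] <= 0.30
    )


# Hypothetical examination: 10 valid measurements out of 12 attempts.
example = summarize_lse(
    [6.0, 6.5, 7.0, 7.2, 7.5, 7.8, 8.0, 8.3, 8.8, 9.5], total_attempts=12
)
print(example, is_reliable_usual(example))
```

The measurement values in the example are invented for illustration only.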

Interpretation of LSE Result.

LSE median was interpreted according to the diagnostic cutoffs published in previous studies. As CHC was the main cause of liver disease in our study population (68%), we tested the cutoffs published by Castera et al.12: ≥7.1 kPa for FM≥2 and ≥12.5 kPa for FM4, those by Ziol et al.13: ≥8.8 kPa for FM≥2 and ≥14.6 kPa for FM4, and those specifically calculated for CHC in the meta-analysis of Stebbing et al.14: ≥8.5 kPa for FM≥2 and ≥16.2 kPa for FM4. As there were various causes of chronic liver disease in our study population, we also tested the cutoff published in the meta-analysis of Friedrich-Rust et al.15: ≥7.7 kPa for FM≥2 and ≥13.1 kPa for FM4. By using the diagnostic cutoffs, LSE median was categorized into estimated FFS stages according to the most probable Metavir F stage(s). This approach provided the following LSE classification: LSE result <cutoff for FM≥2: FFS0/1; ≥cutoff for FM≥2 and <cutoff for FM4: FFS2/3; ≥cutoff for FM4: FFS4.
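
The LSE classification described above can be sketched as follows. This is an illustrative sketch only: the function and dictionary names are hypothetical, while the cutoff values are those quoted in the preceding paragraph.

```python
# Published cutoffs (kPa) quoted in the text: (cutoff for FM>=2, cutoff for FM4).
PUBLISHED_CUTOFFS = {
    "Castera": (7.1, 12.5),
    "Ziol": (8.8, 14.6),
    "Stebbing": (8.5, 16.2),
    "Friedrich-Rust": (7.7, 13.1),
}


def ffs_class(lse_median_kpa, cutoffs=PUBLISHED_CUTOFFS["Castera"]):
    """Map an LSE median (kPa) to the estimated FFS class."""
    cutoff_fm2, cutoff_fm4 = cutoffs
    if lse_median_kpa >= cutoff_fm4:
        return "FFS4"
    if lse_median_kpa >= cutoff_fm2:
        return "FFS2/3"
    return "FFS0/1"


print(ffs_class(5.0), ffs_class(9.0), ffs_class(12.5))
print(ffs_class(12.5, PUBLISHED_CUTOFFS["Ziol"]))
```

The same median can fall into different classes depending on the cutoff set, which is why several published sets were tested.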

Statistical Analysis

Because distribution was skewed for most quantitative variables, they were expressed as median with 1st and 3rd quartiles in brackets. Diagnostic accuracy was mainly expressed as area under the receiver operating characteristic (AUROC) (for binary diagnoses of significant fibrosis, severe fibrosis, or cirrhosis) or the rate of well-classified patients by the LSE classification. AUROCs were compared according to Delong et al.16 for paired groups, and Hanley and McNeil17 for unpaired groups.

To identify the factors influencing LSE accuracy, we determined the variables independently associated with the following diagnostic targets: significant fibrosis, severe fibrosis, or cirrhosis, by stepwise forward binary logistic regression. Indeed, by definition, each variable selected by a multivariate analysis is an independent predictor of the diagnostic target studied. In other words, when selected with LSE median, an independent predictor influences the outcome (diagnostic target) for each fixed level of liver stiffness. Consequently, the multivariate analysis allowed for the identification of the predictors influencing LSE accuracy. LSE median was tested together with the following candidate variables: age, sex, body mass index, cause of chronic liver disease (CHC versus other), ≥10 LSE valid measurements, LSE success rate, IQR/M, and biopsy length as a putative confounding variable. Statistical analyses were performed using SPSS v. 18.0 software (IBM, Armonk, NY) and SAS 9.1 (SAS Institute, Cary, NC).

Results

Patients

The main characteristics of the 1,165 patients included in the study are presented in Table 1. The cause of chronic liver disease was CHC in 68.5% of patients, hepatitis B monoinfection: 5.7%, alcohol: 12.4%, nonalcoholic fatty liver disease (NAFLD): 3.3%, and other: 10.1%. Overweight status (body mass index ≥25.0 kg/m2) was present in 44.0% of patients. Liver biopsies were considered reliable in 92.0% of the cases. The prevalence of significant fibrosis, severe fibrosis, and cirrhosis was, respectively, 63.3%, 38.9%, and 21.0%.

Table 1. Characteristics at Inclusion of the 1,165 Patients

| Characteristic | All | CHC | Other | P* |
|---|---|---|---|---|
| Patients (n) | 1165 | 798 | 367 | |
| Age (years) | 51.1 (43.9-60.5) | 50.1 (43.9-59.7) | 54.2 (43.9-63.3) | 0.084 |
| Male (%) | 65.2 | 62.9 | 70.0 | 0.018 |
| Body mass index (kg/m2) | 24.5 (22.2-27.6) | 24.2 (22.1-26.7) | 25.1 (22.5-29.4) | <10⁻³ |
| Body mass index ≥25 kg/m2 (%) | 44.0 | 40.1 | 50.9 | 10⁻³ |
| Metavir FM stage (%): | | | | <10⁻³ |
| - 0 | 5.6 | 3.5 | 10.3 | |
| - 1 | 31.0 | 37.1 | 17.6 | |
| - 2 | 24.5 | 27.4 | 17.9 | |
| - 3 | 17.9 | 17.5 | 18.7 | |
| - 4 | 21.0 | 14.5 | 35.5 | |
| Biopsy length (mm) | 25 (18-30) | 24 (18-30) | 25 (17-32) | 0.093 |
| Reliable biopsy (%) | 92.0 | 93.8 | 88.0 | 10⁻³ |
| LSE median (kPa) | 8.1 (5.8-14.0) | 7.8 (5.6-11.1) | 11.0 (6.6-25.1) | <10⁻³ |
| Valid measurements (n) | 9.8 ± 1.5 | 9.8 ± 1.3 | 9.7 ± 1.9 | 0.227 |
| ≥10 LSE valid measurements (%) | 92.8 | 93.3 | 91.6 | 0.291 |
| LSE success rate (%) | 100 (83-100) | 100 (83-100) | 91 (77-100) | 10⁻³ |
| LSE success rate ≥60% (%) | 89.8 | 91.9 | 85.1 | <10⁻³ |
| IQR/M | 0.17 (0.12-0.25) | 0.17 (0.12-0.24) | 0.18 (0.11-0.25) | 0.211 |
| IQR/M ≤0.30 (%) | 85.5 | 86.1 | 84.3 | 0.416 |
| Reliable LSE (%) | 75.7 | 77.6 | 71.6 | 0.027 |

  • The All, CHC, and Other columns are grouped by cause of liver disease. CHC: chronic hepatitis C monoinfection, IQR/M: LSE interquartile range/LSE median.

  • * Between CHC and other causes of liver disease.

  • Valid measurements (n): mean ± standard deviation are shown here since median, 1st and 3rd quartiles were all equal to 10 (92.8% of LSE had ≥10 valid measurements).

  • Reliable LSE (%): according to the usual definition for LSE reliability (≥10 valid measurements and ≥60% success rate and IQR/M ≤0.30).

LSE Accuracy

The AUROCs (±standard deviation [SD]) of LSE for the diagnosis of significant fibrosis, severe fibrosis, and cirrhosis were, respectively, 0.822 ± 0.012, 0.872 ± 0.010, and 0.910 ± 0.011 (Table 2). AUROCs of LSE in unreliable biopsies were not significantly different from those in reliable biopsies (details not shown). The rates of well-classified patients according to the various diagnostic cutoffs tested are presented in Table S1 in the Supporting Material. Cutoffs published by Castera et al.12 provided the highest accuracy for significant fibrosis and LSE classification, and were thus used for further statistical analysis.

Table 2. AUROC of LSE as a Function of LSE Reliability by the Usual Definition, and Cause of Liver Disease

| Cause of Liver Disease | Diagnostic Target | LSE: All | LSE: Reliable* | LSE: Unreliable | P |
|---|---|---|---|---|---|
| All | FM≥2 | 0.822 ± 0.012 | 0.835 ± 0.014 | 0.794 ± 0.026 | 0.165 |
| | FM≥3 | 0.872 ± 0.010 | 0.881 ± 0.012 | 0.856 ± 0.023 | 0.344 |
| | FM4 | 0.910 ± 0.011 | 0.913 ± 0.012 | 0.906 ± 0.022 | 0.780 |
| CHC | FM≥2 | 0.787 ± 0.016 | 0.805 ± 0.018 | 0.733 ± 0.037 | 0.080 |
| | FM≥3 | 0.843 ± 0.015 | 0.856 ± 0.016 | 0.811 ± 0.035 | 0.242 |
| | FM4 | 0.897 ± 0.016 | 0.900 ± 0.018 | 0.918 ± 0.038 | 0.669 |
| Other | FM≥2 | 0.883 ± 0.019 | 0.888 ± 0.024 § | 0.889 ± 0.032 | 0.980 |
| | FM≥3 | 0.905 ± 0.016 § | 0.913 ± 0.018 | 0.888 ± 0.034 | 0.516 |
| | FM4 | 0.908 ± 0.016 | 0.920 ± 0.018 | 0.862 ± 0.037 | 0.159 |

  • CHC: chronic hepatitis C monoinfection.

  • * According to the usual definition for LSE reliability (LSE with ≥10 valid measurements and ≥60% success rate and LSE interquartile range/LSE median ≤0.30).

  • P: between reliable and unreliable LSE.

  • P ≤ 10⁻³ vs. CHC patients.

  • § P ≤ 0.010 vs. CHC patients.

  • P ≤ 0.05 vs. CHC patients.

Usual Definition for LSE Reliability

92.8% of LSE included at least 10 valid measurements, 89.8% achieved a ≥60% success rate, and 85.5% had an IQR/M ≤0.30 (Table 1). None of these conditions led to a significant increase in LSE AUROC (Table S2).

75.7% of LSE fulfilled these three criteria; they were consequently considered as reliable according to the usual definition for LSE reliability. AUROCs for significant fibrosis, severe fibrosis, or cirrhosis were not significantly different between reliable and unreliable LSE (Table 2). By using Castera et al.12 cutoffs (≥7.1 kPa for FM≥2 and ≥12.5 kPa for FM4), LSE accuracy was not significantly different between reliable and unreliable LSE for the diagnosis of significant fibrosis (respectively: 75.5% versus 72.1%, P = 0.255) or cirrhosis (85.8% versus 81.5%, P = 0.082). Similarly, the rate of well-classified patients by the LSE classification (FFS0/1, FFS2/3, FFS4) derived from Castera et al. cutoffs was not significantly different between reliable and unreliable LSE (respectively: 63.5% versus 57.2%, P = 0.064).
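
As an illustration of how AUROCs from two unpaired groups can be compared, the sketch below applies a z-test on the difference of two independent AUROCs, in the spirit of the Hanley and McNeil approach cited in the Statistical Analysis section. It is a simplified sketch, not the authors' code: the function name is hypothetical, and the ± values reported in Table 2 are assumed to act as standard errors of the AUROCs.

```python
from math import erfc, sqrt


def compare_unpaired_aurocs(auc1, se1, auc2, se2):
    """Two-sided z-test for the difference between two independent AUROCs.

    Assumption: se1 and se2 are standard errors of the AUROCs and the two
    groups share no patients, so the variances simply add.
    """
    z = (auc1 - auc2) / sqrt(se1 ** 2 + se2 ** 2)
    p_two_sided = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Reliable vs. unreliable LSE, all patients, FM>=2 (values from Table 2).
z, p = compare_unpaired_aurocs(0.835, 0.014, 0.794, 0.026)
print(f"z = {z:.2f}, P = {p:.3f}")
```

With the Table 2 values for significant fibrosis in all patients, this calculation gives a P value close to the 0.165 reported in the table.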

Independent Predictors of Fibrosis Staging

Independent predictors of significant fibrosis, severe fibrosis, or cirrhosis are detailed in Table 3. Briefly, in addition to LSE median, IQR/M was the only LSE characteristic independently associated with the three diagnostic targets of fibrosis, with no significant influence of the number of LSE valid measurements, LSE success rate, or the cause of liver disease. There was no colinearity between LSE median and IQR/M (Spearman coefficient correlation = 0.047, P = 0.109). Independent predictors were the same when variables were introduced as dichotomous results (IQR/M ≤0.30, LSE success rate ≥60%, reliable versus unreliable biopsy) in the multivariate analyses (details not shown).

Table 3. Variables Independently Associated with Each Diagnostic Target of Fibrosis Staging by Stepwise Forward Binary Logistic Regression

| Diagnostic Target | Step | Variable | P | Odds Ratio (95% CI) |
|---|---|---|---|---|
| FM≥2 | 1st | LSE median | <10⁻³ | 1.323 (1.262-1.387) |
| | 2nd | Age | <10⁻³ | 1.023 (1.011-1.035) |
| | 3rd | IQR/M | 0.002 | 0.197 (0.072-0.543) |
| FM≥3 | 1st | LSE median | <10⁻³ | 1.278 (1.234-1.324) |
| | 2nd | IQR/M | 10⁻³ | 0.121 (0.034-0.433) |
| | 3rd | Age | 0.007 | 1.017 (1.005-1.030) |
| FM4 | 1st | LSE median | <10⁻³ | 1.201 (1.168-1.234) |
| | 2nd | Biopsy length | 0.002 | 0.965 (0.944-0.987) |
| | 3rd | IQR/M | 0.005 | 0.070 (0.011-0.442) |

Classification of LSE Accuracy

We develop here a classification using the preceding independent predictors of accuracy.

IQR/M.

LSE accuracy as a function of increasing intervals of IQR/M is depicted in Table S3. Briefly, LSE accuracy decreased when IQR/M increased and three subgroups of LSE were identified: IQR/M ≤0.10 (16.6% of patients); 0.10< IQR/M ≤0.30 (69.0%); IQR/M >0.30 (14.5%). LSE with IQR/M ≤0.10 had significantly higher accuracy than LSE with IQR/M >0.10 (Table 4). LSE with 0.10< IQR/M ≤0.30 had higher accuracy than LSE with IQR/M >0.30, but the difference did not reach statistical significance.

Table 4. Accuracy of LSE Median as a Function of IQR/M

| | AUROC: FM≥2 | AUROC: FM≥3 | AUROC: FM4 | Diagnostic Accuracy (%)*: FM≥2 | Diagnostic Accuracy (%)*: FM4 | Diagnostic Accuracy (%)*: LSE Classification |
|---|---|---|---|---|---|---|
| IQR/M ≤0.10 | 0.886 ± 0.024 | 0.937 ± 0.018 | 0.970 ± 0.011 | 77.1 | 90.4 | 69.1 |
| IQR/M 0.10< and ≤0.30 | 0.822 ± 0.015 | 0.868 ± 0.013 | 0.895 ± 0.015 | 75.6 | 84.7 | 62.6 |
| IQR/M >0.30 | 0.785 ± 0.035 | 0.842 ± 0.032 | 0.898 ± 0.031 | 69.1 | 80.6 | 53.9 |
| Comparison (P): | | | | | | |
| ≤0.10 vs. 0.10< and ≤0.30 | 0.024 | 0.002 | <10⁻³ | 0.661 | 0.043 | 0.092 |
| ≤0.10 vs. >0.30 | 0.017 | 0.010 | 0.029 | 0.088 | 0.008 | 0.003 |
| 0.10< and ≤0.30 vs. >0.30 | 0.331 | 0.451 | 0.931 | 0.081 | 0.196 | 0.039 |
| Linear trend | | | | 0.091 | 0.009 | 0.003 |

  • * Rate of well-classified patients using 7.1 kPa as the LSE cutoff for the diagnosis of significant fibrosis (FM≥2), 12.5 kPa for the diagnosis of cirrhosis (FM4), or LSE classification (FFS0/1, FFS2/3, FFS4) derived from the 2 previous diagnostic cutoffs (12).

  • Linear trend: P for linear trend of diagnostic accuracy across the 3 subgroups of IQR/M.

LSE Median.

By using 7.1 kPa as a diagnostic cutoff,12 the rate of well-classified patients for significant fibrosis was very good in LSE medians ≥7.1 kPa, but only fair in LSE medians <7.1 kPa: 81.5% versus 64.5%, respectively (P < 10⁻³). By using 12.5 kPa as a diagnostic cutoff,12 the rate of well-classified patients for cirrhosis was excellent in LSE medians <12.5 kPa, but only fair in LSE medians ≥12.5 kPa: 94.3% versus 60.4%, respectively (P < 10⁻³). LSE thus demonstrated excellent negative predictive value for cirrhosis and very good positive predictive value for significant fibrosis. Conversely, it had insufficient positive predictive value for cirrhosis and insufficient negative predictive value for significant fibrosis. Finally, the rate of well-classified patients by the LSE classification derived from Castera et al. cutoffs was not significantly different among its three classes, FFS0/1: 64.5%, FFS2/3: 60.4%, and FFS4: 60.4% (P = 0.379).

IQR/M and LSE Median.

In patients with LSE median <7.1 kPa, the diagnostic accuracy of the LSE classification derived from Castera et al. cutoffs was not significantly different among the three IQR/M subgroups (P = 0.458; Fig. 1). Conversely, in patients with LSE median ≥7.1 kPa the diagnostic accuracy of the LSE classification was significantly lower in LSE with IQR/M >0.30 compared to LSE with IQR/M ≤0.30 (43.8% versus 64.1%, P < 10⁻³; Fig. 1). The rates of well-classified patients for the binary diagnoses of significant fibrosis or cirrhosis as a function of IQR/M and LSE median are detailed in Supporting Fig. S1. Briefly, in patients with LSE median ≥7.1 kPa, LSE with IQR/M >0.30 had lower accuracy for significant fibrosis than LSE with IQR/M ≤0.30 (67.6% versus 84.3%, P < 10⁻³). In patients with LSE median ≥12.5 kPa, LSE with IQR/M >0.30 had lower accuracy for cirrhosis than LSE with IQR/M ≤0.30 (45.1% versus 64.0%, P = 0.011).

Figure 1. Rate of well-classified patients by the LSE classification derived from Castera et al.12 cutoffs, as a function of the three classes of the classification and IQR/M.

Proposal for New Reliability Criteria in LSE

The previous findings led us to develop new criteria for the interpretation of LSE results (Table 5). LSE accuracy in the subgroup of LSE with IQR/M ≤0.10 was higher than in the whole population (Table 6). LSEs in this subgroup were thus considered “very reliable.” LSE with 0.10< IQR/M ≤0.30 or with IQR/M >0.30 and LSE median <7.1 kPa provided accuracy similar to that of the whole population and were thus considered “reliable.” Finally, LSE with IQR/M >0.30 and LSE median ≥7.1 kPa provided accuracy lower than that of the whole population and were thus considered “poorly reliable.”
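
The three categories described above can be expressed as a small decision rule. This is a minimal sketch with a hypothetical function name; the thresholds (IQR/M of 0.10 and 0.30, LSE median of 7.1 kPa) are those stated in the text.

```python
def new_reliability_category(iqr_over_m, lse_median_kpa):
    """Return the reliability category from IQR/M and LSE median (kPa)."""
    if iqr_over_m <= 0.10:
        return "very reliable"
    if iqr_over_m <= 0.30 or lse_median_kpa < 7.1:
        # Either moderate variability, or high variability at low stiffness.
        return "reliable"
    return "poorly reliable"


for ratio, med in [(0.08, 14.0), (0.20, 9.0), (0.35, 5.5), (0.35, 9.0)]:
    print(ratio, med, new_reliability_category(ratio, med))
```

Note that, unlike the usual definition, neither the number of valid measurements nor the success rate enters this rule.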

Table 5. New Reliability Criteria for LSE and Ensuing Interpretation as Very Reliable (white), Reliable (gray), and Poorly Reliable (dark gray) LSE
[Table 5 was provided as an image and is not reproduced in this text version.]

Table 6. Accuracy of LSE as a Function of LSE Reliability Defined by the New Criteria

| | AUROC: FM≥2 | AUROC: FM≥3 | AUROC: FM4 | Diagnostic Accuracy (%)*: FM≥2 | Diagnostic Accuracy (%)*: FM4 | Diagnostic Accuracy (%)*: LSE Classification |
|---|---|---|---|---|---|---|
| LSE: All | 0.822 ± 0.012 | 0.872 ± 0.010 | 0.910 ± 0.011 | 74.9 | 85.0 | 62.4 |
| LSE: Very reliable | 0.886 ± 0.024 | 0.937 ± 0.018 | 0.970 ± 0.011 | 77.1 | 90.4 | 69.1 |
| LSE: Reliable | 0.823 ± 0.014 | 0.876 ± 0.012 | 0.904 ± 0.014 | 75.3 | 85.8 | 63.2 |
| LSE: Poorly reliable | 0.773 ± 0.045 | 0.745 ± 0.049 | 0.819 ± 0.052 | 67.6 | 69.5 | 43.8 |
| Comparison (P): | | | | | | |
| Very reliable vs. reliable | 0.023 | 0.005 | <10⁻³ | 0.603 | 0.090 | 0.125 |
| Very reliable vs. poorly reliable | 0.027 | <10⁻³ | 0.004 | 0.076 | <10⁻³ | <10⁻³ |
| Reliable vs. poorly reliable | 0.289 | 0.009 | 0.115 | 0.088 | <10⁻³ | <10⁻³ |
| Linear trend | | | | 0.107 | <10⁻³ | <10⁻³ |

  • * Rate of well-classified patients using 7.1 kPa as the LSE cutoff for the diagnosis of significant fibrosis (FM≥2), 12.5 kPa for the diagnosis of cirrhosis (FM4), or LSE classification (FFS0/1, FFS2/3, FFS4) derived from the 2 previous diagnostic cutoffs (12).

  • LSE: All: this result, already presented in Table 2, is provided here for comparison with subgroups.

  • Linear trend: P for linear trend of diagnostic accuracy across the 3 subgroups of LSE.

According to these new criteria, 16.6% of LSE were considered “very reliable,” 74.3% “reliable,” and 9.1% “poorly reliable.” Importantly, LSE AUROCs and diagnostic accuracies were significantly different among these three subgroups (Table 6). Finally, the rate of poorly reliable LSE according to the new criteria was significantly lower than that of unreliable LSE according to the usual definition (9.1% versus 24.3%, P < 10⁻³).

Sensitivity Analysis

We evaluated our new criteria for LSE reliability as a function of several potential influencing characteristics: cause of liver disease (CHC versus others), diagnostic indexes (AUROC, binary diagnosis of significant fibrosis or cirrhosis, LSE classification), and diagnostic cutoffs published by Ziol et al.,13 Stebbing et al.,14 and Friedrich-Rust et al.15 The detailed results are presented in Tables S4 and S5. Briefly, whatever the potential influencing factor, a decrease in LSE reliability, according to our new criteria, was associated with a decrease in LSE accuracy. Body mass index (<25 versus ≥25 kg/m2) did not influence LSE accuracy in any of the three new categories of LSE reliability (details not shown). Because of the small number of patients with hepatitis B, alcohol abuse, or NAFLD, it was not possible to perform a sensitivity analysis for these causes of chronic liver disease.

Discussion

There is currently a critical need in clinical practice and in clinical research to precisely define the reliability criteria of LSE. Indeed, Fibroscan is now widely used and physicians must determine daily whether LSEs are reliable and permit an accurate diagnosis. Moreover, in clinical research the reliability criteria of LSEs directly influence the results of studies because unreliable LSEs are usually excluded from statistical analyses.

Relevance of the Usual Definition for LSE Reliability

To our knowledge, the present study is the first to evaluate the relevance of the usual definition for LSE reliability. The strengths of our work include the large number of included patients, the high rate of reliable liver biopsy (92.0%), and a thorough analysis of accuracy including either global indexes of performance such as AUROC, or useful indexes for daily clinical practice such as the rate of well-classified patients. Our results clearly show that LSE considered as reliable according to the usual definition have higher diagnostic accuracy than unreliable LSE, but this difference is slight and not statistically significant (Table 2). The usual definition for LSE reliability, including the number of valid measurements, LSE success rate, and IQR/M, is thus not relevant for clinical practice or clinical research.

New Reliability Criteria for LSE

Multivariate analyses showed that liver fibrosis staging was independently linked to IQR/M, with no influence of the number of LSE valid measurements or LSE success rate (Table 3). These results confirm the key role of IQR/M, as suggested in the Lucidarme et al. and Myers et al. studies.5, 6 However, these two studies were based on a discrepancy analysis between FM stages by liver biopsy and FFS stages (defined by LSE median categorized into equivalent Metavir fibrosis stages). IQR/M cutoffs were thus calculated to predict the discrepancy, but they failed to delineate subgroups of LSE where accuracies for liver fibrosis diagnosis were significantly different. In the present study, we used diagnosis of fibrosis stages as the main outcome. This allowed us to determine the thresholds of IQR/M that define subgroups of LSE with significantly different diagnostic accuracies, and thus the precise reliability criteria for LSE.

IQR/M ≤0.10.

LSE with IQR/M ≤0.10 (i.e., with minimal signal variability) provided significantly higher AUROCs, a higher rate of well-classified patients for the diagnosis of cirrhosis, and a higher rate of well-classified patients by LSE classification (Table 4). LSE with IQR/M ≤0.10 may thus be considered “very reliable,” especially when the LSE median is ≥12.5 kPa (Fig. 1).

IQR/M >0.30.

LSE with IQR/M >0.30 (i.e., with large variability) provided lower AUROCs and a lower rate of well-classified patients when compared to LSE with 0.10< IQR/M ≤0.30, but the difference was not statistically significant (Table 4). Because multivariate analyses showed a significant interaction between IQR/M and LSE median, we evaluated the influence of IQR/M according to LSE median. The deleterious effect of IQR/M >0.30 on LSE accuracy was amplified by the liver stiffness level: the diagnostic accuracy for cirrhosis decreased even more in patients with LSE median ≥12.5 kPa, and accuracy for significant fibrosis significantly decreased in patients with LSE median ≥7.1 kPa. Finally, LSE with IQR/M >0.30 may be considered “poorly reliable” in patients with LSE median ≥7.1 kPa and “reliable” in patients with LSE median <7.1 kPa (Fig. 1).

The interaction between IQR/M and liver stiffness level is not surprising: IQR corresponds to the interval around the LSE median containing 50% of the valid measurements between the 25th and 75th percentiles, and is usually expressed as the ratio IQR/M. A high IQR/M implies a large distribution of LSE valid measurements and thus a higher risk of an aberrant LSE median. However, by definition, a high IQR/M also implies a smaller interval in low liver stiffness levels (compared to high stiffness levels). For example, an IQR/M at 0.30 represents a 1.5 kPa interval when liver stiffness is 5.0 kPa, but a 4.5 kPa interval when liver stiffness is 15.0 kPa. Consequently, IQR/M has little impact on LSE median in low liver stiffness levels, thus explaining why LSE with IQR/M >0.30 may be considered “reliable” when LSE median is <7.1 kPa (Fig. 1). Because increasing liver stiffness amplifies the deleterious effect of IQR/M >0.30 with a significant decrease in diagnostic accuracy, LSE with IQR/M >0.30 and median ≥7.1 kPa may be considered “poorly reliable” (Table 6; Fig. 1). Finally, by inverting the same reasoning, one can explain why LSE with IQR/M ≤0.10 are very accurate in high liver stiffness values (Fig. 1).
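
The arithmetic behind this example is simply the ratio multiplied by the median. A minimal sketch follows; the function name is hypothetical.

```python
def iqr_width_kpa(iqr_over_m, lse_median_kpa):
    """Absolute interquartile range (kPa) implied by a given IQR/M ratio."""
    return iqr_over_m * lse_median_kpa


# Same ratio, very different absolute spread of the valid measurements.
for med in (5.0, 15.0):
    print(f"median {med} kPa -> IQR {iqr_width_kpa(0.30, med):.1f} kPa")
```

At a fixed ratio, the absolute spread grows in proportion to the median, so a high ratio at low stiffness still leaves the measurements within a narrow band in kPa.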

0.10< IQR/M ≤0.30.

The intermediate category, LSE with 0.10< IQR/M ≤0.30, may be considered “reliable” (Table 4; Fig. 1).

Finally, our results permitted the establishment of new reliability criteria identifying three LSE subgroups according to IQR/M and liver stiffness level (Table 5). The accuracy of LSE for fibrosis staging was significantly different between these three subgroups, thus demonstrating the relevance of these new criteria (Table 6). Moreover, the rate of poorly reliable LSE according to the new criteria (9.1%) was significantly lower than “unreliable” LSE as defined in the previous usual criteria (24.3%).

How Many Valid Measurements Are Needed for LSE?

In our study, as in those of Lucidarme et al. and Myers et al.,5, 6 the ≥10 valid measurements variable had no influence on LSE accuracy (Table 3). This leads to the question: How many valid measurements are required for LSE? Kettaneh et al.18 have shown in 935 patients with CHC that AUROCs for the diagnosis of significant fibrosis or cirrhosis barely differed across LSE median values obtained from the three first, five first, and ten first valid measurements. We found similar results in our population (details not shown). However, the analysis by Kettaneh et al. was performed in a subgroup of patients with LSE including at least 10 valid measurements; their results probably do not reflect the accuracy of LSE for which only three or five valid measurements are genuinely available because of examination difficulties. In our study, 92.8% of LSE had at least 10 valid measurements and this rate was 96.9% in the large series of Castera et al.4 Considering the current state of knowledge, and because LSE is a quick and easy procedure, the pragmatic goal of operators should be to obtain 10 valid measurements, whatever the success rate.19

Key Role of IQR/M in LSE

Several recent longitudinal studies have shown that LSE median was linked to clinical events such as liver decompensation,20, 21 hepatocellular carcinoma,22, 23 or death.24 This suggests that liver stiffness may be used as a prognostic index in chronic liver diseases. Reliability criteria of LSE are thus important to correctly compare LSE repeated over time and accurately evaluate the course of liver stiffness in patients. We have previously shown that interobserver reproducibility of LSE median depends on IQR/M and liver stiffness level.25, 26 Interobserver agreement decreased in LSE with IQR/M >0.25,25 confirming the key role of this index for the interpretation of LSE median in the management of patients with chronic liver diseases.

Sensitivity Analysis

Our results suggest that LSE is less accurate in CHC patients than in patients with other causes of chronic liver disease (Table 2). However, the cause of liver disease was not an independent predictor of fibrosis (Table 3). Moreover, the characteristics of CHC and non-CHC patients were significantly different, especially for FM stages, with a significantly higher prevalence of FM≥2, FM≥3, and FM4 in non-CHC patients (Table 1). It has been previously shown that a higher prevalence of the diagnostic target is associated with an increase in the accuracy of fibrosis tests.27 The difference in LSE accuracy observed between CHC and non-CHC patients is therefore probably explained by the significantly different characteristics of these two subgroups.

LSE diagnostic cutoffs calculated in published studies are very heterogeneous.28 We tested several cutoffs, some calculated for CHC12-14 and others determined in a large meta-analysis including patients with various causes of chronic liver disease.15 Interestingly, we found significant but slight differences in diagnostic accuracy, either in CHC patients or in patients with other causes of chronic liver disease. This supports the value of evaluating the influence of the cause of chronic liver disease on LSE accuracy and on the determination of diagnostic cutoffs in well-matched populations of alcoholic, NAFLD, CHC, or chronic hepatitis B patients.

Finally, we evaluated in a sensitivity analysis the influence of several characteristics on our new criteria for LSE reliability. Regardless of the characteristic tested (cause of chronic liver disease, diagnostic cutoffs used, diagnostic index, body mass index), a decrease in LSE reliability according to our new criteria was associated with a decrease in LSE accuracy, reinforcing the relevance of these new criteria for the interpretation of LSE results in daily clinical practice.

Relevance of the New Reliability Criteria for LSE

Our new reliability criteria for LSE represent a significant improvement for the interpretation of LSE in clinical practice. First, we have shown that the usual definition of LSE reliability is not relevant and that the criterion “success rate ≥60%” is unnecessary. Second, we have defined a new category of “very reliable LSE,” which provides very good positive predictive value for the diagnosis of cirrhosis. As a complement to diagnostic accuracy, which is useful for individual diagnosis in clinical practice, AUROC, based on sensitivity and specificity, is another important index, especially for fibrosis screening in the general population.29 In this setting, “very reliable” LSE provided the highest AUROC, which was significantly different from those of the other two new reliability classes. Third, we have refined the usual definition of unreliable LSE (IQR/M >0.30) by restricting it to patients with LSE median ≥7.1 kPa. Consequently, the rate of “poorly reliable” LSE, as defined by our new reliability criteria, was 3 times lower than the rate of LSE considered unreliable according to the usual definition. Compared to “reliable” LSE, “poorly reliable” LSE have a significantly lower diagnostic accuracy for cirrhosis and for LSE classification. For the diagnosis of significant fibrosis, the difference in accuracy reached borderline significance in the whole population and was significant in the subgroup of CHC patients, in whom accuracy was lower.

It is now well documented that several conditions influence LSE accuracy for the noninvasive evaluation of liver fibrosis: liver inflammation,30 cholestasis,31 central venous pressure,32 food intake,33 and probably liver steatosis.34 Our results show that an intrinsic characteristic of LSE (IQR/M) also influences its accuracy. Our new reliability criteria are thus an additional characteristic that physicians must take into account for an accurate evaluation of liver fibrosis by LSE.

In conclusion, the usual definition for LSE reliability is not relevant. LSE median must be interpreted according to IQR/M and liver stiffness level. Using these two characteristics, we defined new reliability criteria for LSE resulting in three categories: “very reliable,” “reliable,” and “poorly reliable” with significantly different diagnostic accuracies.
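
As an illustration only, the three categories can be written as a short decision rule. The sketch below uses the thresholds stated in the abstract (IQR/M ≤0.10; IQR/M ≤0.30; LSE median 7.1 kPa); the function name `classify_lse_reliability` and its parameters are hypothetical and do not come from the study or from any device software.

```python
from typing import Literal

Reliability = Literal["very reliable", "reliable", "poorly reliable"]


def classify_lse_reliability(median_kpa: float, iqr_kpa: float) -> Reliability:
    """Classify one LSE from its median and interquartile range (both in kPa).

    Hypothetical helper: the thresholds are those reported in the abstract,
    the name and signature are assumptions made for this sketch.
    """
    if median_kpa <= 0:
        raise ValueError("LSE median must be positive")
    if iqr_kpa < 0:
        raise ValueError("IQR cannot be negative")

    iqr_over_median = iqr_kpa / median_kpa

    # "Very reliable": IQR/M <= 0.10, whatever the stiffness level.
    if iqr_over_median <= 0.10:
        return "very reliable"

    # "Reliable": 0.10 < IQR/M <= 0.30, or IQR/M > 0.30 with median < 7.1 kPa.
    if iqr_over_median <= 0.30 or median_kpa < 7.1:
        return "reliable"

    # "Poorly reliable": IQR/M > 0.30 with median >= 7.1 kPa.
    return "poorly reliable"


if __name__ == "__main__":
    # Illustrative values only, not patient data.
    for median, iqr in [(12.0, 0.6), (12.0, 2.4), (6.0, 3.0), (12.0, 6.0)]:
        print(median, iqr, classify_lse_reliability(median, iqr))
```

Note that the number of valid measurements and the success rate do not appear as inputs, in line with the multivariate results showing no significant influence of these two criteria.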

Acknowledgements

Angers: Sophie Michalak, Anselme Konaté, Catherine Ternisien, Alain Chevailler, Françoise Lunel, Wael Mansour; Grenoble: Vincent Leroy, Marie-Noelle Hilleret, Patrice Faure, Jean-Charles Renversez, Francoise Morel, Candice Trocme; Bordeaux: Juliette Foucher, Laurent Castéra, Patrice Couzigou, Pierre-Henri Bernard, Wassil Merrouche, Paulette Bioulac-Sage. FIBROSTAR study: Hepatologists: R. Poupon, A. Poujol, Saint-Antoine, Paris; A. Abergel, Clermont-Ferrand; J.P. Bronowicki, Nancy; J.P. Vinel, S. Metivier, Toulouse; V. De Ledinghen, J. Foucher, Bordeaux; O. Goria, Rouen; M. Maynard-Muet, C. Trepo, Lyon; Ph. Mathurin, Lille; D. Guyader, H. Danielou, Rennes; O. Rogeaux, Chambéry; S. Pol, Ph. Sogni, Cochin, Paris; A. Tran, Nice; P. Calès, Angers; P. Marcellin, T. Asselah, Clichy; M. Bourliere, V. Oulès, Saint Joseph, Marseille; D. Larrey, Montpellier; F. Habersetzer, Strasbourg; M. Beaugrand, Bondy; V Leroy, MN Hilleret, Grenoble. Biologists: R-C. Boisson, Lyon Sud; M-C. Gelineau, B. Poggi, Hôtel Dieu, Lyon; J-C. Renversez, Candice Trocmé, Grenoble; J. Guéchot, R. Lasnier, M. Vaubourdolle, Paris; H. Voitot, Beaujon, Paris; A. Vassault, Necker, Paris; A. Rosenthal-Allieri, Nice; A. Lavoinne, F. Ziegler, Rouen; M. Bartoli, C. Lebrun, Chambéry; A. Myara, Paris Saint-Joseph; F. Guerber, A. Pottier, Elibio, Vizille. Pathologists: E-S. Zafrani, Créteil; N. Sturm, Grenoble. Methodologists: A. Bechet, J-L Bosson, A. Paris, S. Royannais, CIC, Grenoble; A. Plages, Grenoble. We also thank the following contributors: Gilles Hunault, Pascal Veillon, Gwenaëlle Soulard; and Kevin L. Erwin (for English proofreading).

References

  • 1
    Degos F, Perez P, Roche B, Mahmoudi A, Asselineau J, Voitot H, et al. Diagnostic accuracy of FibroScan and comparison to liver fibrosis biomarkers in chronic viral hepatitis: a multicenter prospective study (the FIBROSTIC study). J Hepatol 2010; 53: 1013-1021.
  • 2
    Nobili V, Vizzutti F, Arena U, Abraldes JG, Marra F, Pietrobattista A, et al. Accuracy and reproducibility of transient elastography for the diagnosis of fibrosis in pediatric nonalcoholic steatohepatitis. HEPATOLOGY 2008; 48: 442-448.
  • 3
    Zarski JP, Sturm N, Guechot J, Paris A, Zafrani ES, Asselah T, et al. Comparison of 9 blood tests and transient elastography for liver fibrosis in chronic hepatitis C: the ANRS HCEP-23 study. J Hepatol 2012; 56: 55-62.
  • 4
    Castera L, Foucher J, Bernard PH, Carvalho F, Allaix D, Merrouche W, et al. Pitfalls of liver stiffness measurement: a 5-year prospective study of 13,369 examinations. HEPATOLOGY 2010; 51: 828-835.
  • 5
    Lucidarme D, Foucher J, Le Bail B, Vergniol J, Castera L, Duburque C, et al. Factors of accuracy of transient elastography (fibroscan) for the diagnosis of liver fibrosis in chronic hepatitis C. HEPATOLOGY 2009; 49: 1083-1089.
  • 6
    Myers RP, Crotty P, Pomier-Layrargues G, Ma M, Urbanski SJ, Elkashab M. Prevalence, risk factors and causes of discordance in fibrosis staging by transient elastography and liver biopsy. Liver Int 2010; 30: 1471-1480.
  • 7
    Boursier J, Bertrais S, Oberti F, Gallois Y, Fouchard-Hubert I, Rousselet MC, et al. Comparison of accuracy of fibrosis degree classifications by liver biopsy and non invasive tests in chronic hepatitis C. BMC Gastroenterol 2011; 30: 132.
  • 8
    Boursier J, De Ledinghen V, Zarski J, Rousselet MC, Sturm N, Foucher J, et al. A new combination of blood test and Fibroscan for accurate non-invasive diagnosis of liver fibrosis stages in chronic hepatitis C. Am J Gastroenterol 2011; 106: 1255-1263.
  • 9
    Boursier J, de Ledinghen V, Zarski JP, Fouchard-Hubert I, Gallois Y, Oberti F, et al. Comparison of 8 diagnostic algorithms for liver fibrosis in hepatitis C: new algorithms are more precise and entirely non-invasive. HEPATOLOGY 2012; 55: 58-67.
  • 10
    Nousbaum JB, Cadranel JF, Bonnemaison G, Bourliere M, Chiche L, Chor H, et al. Clinical practice guidelines on the use of liver biopsy. Gastroenterol Clin Biol 2002; 26: 848-878.
  • 11
    Castera L, Forns X, Alberti A. Non-invasive evaluation of liver fibrosis using transient elastography. J Hepatol 2008; 48: 835-847.
  • 12
    Castera L, Vergniol J, Foucher J, Le Bail B, Chanteloup E, Haaser M, et al. Prospective comparison of transient elastography, Fibrotest, APRI, and liver biopsy for the assessment of fibrosis in chronic hepatitis C. Gastroenterology 2005; 128: 343-350.
  • 13
    Ziol M, Handra-Luca A, Kettaneh A, Christidis C, Mal F, Kazemi F, et al. Noninvasive assessment of liver fibrosis by measurement of stiffness in patients with chronic hepatitis C. HEPATOLOGY 2005; 41: 48-54.
  • 14
    Stebbing J, Farouk L, Panos G, Anderson M, Jiao LR, Mandalia S, et al. A meta-analysis of transient elastography for the detection of hepatic fibrosis. J Clin Gastroenterol 2010; 44: 214-219.
  • 15
    Friedrich-Rust M, Ong MF, Martens S, Sarrazin C, Bojunga J, Zeuzem S, et al. Performance of transient elastography for the staging of liver fibrosis: a meta-analysis. Gastroenterology 2008; 134: 960-974.
  • 16
    DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988; 44: 837-845.
  • 17
    Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982; 143: 29-36.
  • 18
    Kettaneh A, Marcellin P, Douvin C, Poupon R, Ziol M, Beaugrand M, et al. Features associated with success rate and performance of fibroscan measurements for the diagnosis of cirrhosis in HCV patients: a prospective study of 935 patients. J Hepatol 2007; 46: 628-634.
  • 19
    Rigamonti C, Fraquelli M. Do not trivialize the Fibroscan examination, value its accuracy. J Hepatol 2007; 46: 1149.
  • 20
    Forestier J, Dumortier J, Guillaud O, Ecochard M, Roman S, Boillot O, et al. Noninvasive diagnosis and prognosis of liver cirrhosis: a comparison of biological scores, elastometry, and metabolic liver function tests. Eur J Gastroenterol Hepatol 2010; 22: 532-540.
  • 21
    Robic MA, Procopet B, Metivier S, Peron JM, Selves J, Vinel JP, et al. Liver stiffness accurately predicts portal hypertension related complications in patients with chronic liver disease: a prospective study. J Hepatol 2011; 55: 1017-1024.
  • 22
    Jung KS, Kim SU, Ahn SH, Park YN, Kim do Y, Park JY, et al. Risk assessment of hepatitis B virus-related hepatocellular carcinoma development using liver stiffness measurement (FibroScan). HEPATOLOGY 2011; 53: 885-894.
  • 23
    Masuzaki R, Tateishi R, Yoshida H, Goto E, Sato T, Ohki T, et al. Prospective risk assessment for hepatocellular carcinoma development in patients with chronic hepatitis C by transient elastography. HEPATOLOGY 2009; 49: 1954-1961.
  • 24
    Vergniol J, Foucher J, Terrebonne E, Bernard PH, le Bail B, Merrouche W, et al. Noninvasive tests for fibrosis and liver stiffness predict 5-year outcomes of patients with chronic hepatitis C. Gastroenterology 2011; 140: 1970-1979.
  • 25
    Boursier J, Konate A, Gorea G, Reaud S, Quemener E, Oberti F, et al. Reproducibility of liver stiffness measurement by ultrasonographic elastometry. Clin Gastroenterol Hepatol 2008; 6: 1263-1269.
  • 26
    Boursier J, Konate A, Guilluy M, Gorea G, Sawadogo A, Quemener E, et al. Learning curve and interobserver reproducibility evaluation of liver stiffness measurement by transient elastography. Eur J Gastroenterol Hepatol 2008; 20: 693-701.
  • 27
    Poynard T, Halfon P, Castera L, Munteanu M, Imbert-Bismut F, Ratziu V, et al. Standardization of ROC curve areas for diagnostic evaluation of liver fibrosis markers based on prevalences of fibrosis stages. Clin Chem 2007; 53: 1615-1622.
  • 28
    Tsochatzis EA, Gurusamy KS, Ntaoula S, Cholongitas E, Davidson BR, Burroughs AK. Elastography for the diagnosis of severity of fibrosis in chronic liver disease: a meta-analysis of diagnostic accuracy. J Hepatol 2011; 54: 650-659.
  • 29
    Roulot D, Costes JL, Buyck JF, Warzocha U, Gambier N, Czernichow S, et al. Transient elastography as a screening tool for liver fibrosis and cirrhosis in a community-based population aged over 45 years. Gut 2011; 60: 977-984.
  • 30
    Coco B, Oliveri F, Maina AM, Ciccorossi P, Sacco R, Colombatto P, et al. Transient elastography: a new surrogate marker of liver fibrosis influenced by major changes of transaminases. J Viral Hepat 2007; 14: 360-369.
  • 31
    Millonig G, Reimann FM, Friedrich S, Fonouni H, Mehrabi A, Buchler MW, et al. Extrahepatic cholestasis increases liver stiffness (FibroScan) irrespective of fibrosis. HEPATOLOGY 2008; 48: 1718-1723.
  • 32
    Millonig G, Friedrich S, Adolf S, Fonouni H, Golriz M, Mehrabi A, et al. Liver stiffness is directly influenced by central venous pressure. J Hepatol 2010; 52: 206-210.
  • 33
    Mederacke I, Wursthorn K, Kirschner J, Rifai K, Manns MP, Wedemeyer H, et al. Food intake increases liver stiffness in patients with chronic or resolved hepatitis C virus infection. Liver Int 2009; 29: 1500-1506.
  • 34
    Ziol M, Kettaneh A, Ganne-Carrie N, Barget N, Tengher-Barna I, Beaugrand M. Relationships between fibrosis amounts assessed by morphometry and liver stiffness measurements in chronic hepatitis or steatohepatitis. Eur J Gastroenterol Hepatol 2009; 21: 1261-1268.

Supporting Information

Additional Supporting Information may be found in the online version of this article.

Filename: HEP_25993_sm_SuppInfo.doc
Size: 220K
Description: Supporting Information
