Chronic hepatitis C virus (HCV) has become the global epidemic of the new millennium.1 Currently, liver transplantation (LT) is performed approximately 5000 times annually. However, a major challenge facing LT recipients and their physicians is HCV recurrence after LT.2 HCV recurrence is universal in candidates who are HCV RNA–positive at the time of LT.1, 3 The deposition of fibrotic tissue takes place more rapidly in the new liver versus the native liver, and this results in the rapid development of cirrhosis and graft failure.2, 4 The early recognition of recipients with progressive, recurrent HCV after LT is the only practical approach for improving the clinical outcomes of these patients.5
Currently, liver biopsy remains the gold standard for assessing the severity and progression of acute and chronic liver injury.1, 6 With liver histology, the degree of necroinflammation and the stage of fibrosis can be directly assessed.1 The precise staging of liver fibrosis is also important for determining the timing of antiviral therapy among eligible patients with chronic HCV infections.7 Therefore, it has been argued that protocol biopsies should be continued in all patients undergoing transplantation for HCV unless cirrhosis has been identified.1
However, the limitations associated with liver biopsy for native liver diseases also apply in the post-LT setting.8 Even with adequate biopsy samples (≥15 mm in length with 5 or more portal tracts), the presence of cirrhosis can still be understaged in 10% to 30% of cases.8 The understaging of liver fibrosis due to recurrent HCV after LT may have even greater consequences because of the narrow window for early intervention with antiviral therapy for the prevention of graft failure.7, 8
In the last few years, noninvasive imaging techniques have evolved to better estimate the severity of fibrosis.4 One such noninvasive imaging modality that appears to be a clinically useful test for detecting cirrhosis is ultrasound-based transient elastography (TE).9 TE is a rapid, painless, noninvasive, and reproducible method that has been proposed for the assessment of liver fibrosis through the measurement of liver stiffness.7, 10 TE involves the use of an ultrasound transducer to transmit mild-amplitude and low-frequency (50-Hz) vibrations, which induce a 1-dimensional shear wave that propagates through the liver tissue.7 Pulse echo ultrasound acquisition is used to follow the propagation of the shear wave and to measure its velocity.7 TE measures the liver stiffness in a volume that is 1 cm × 4 cm, which is at least 100 times bigger than a biopsy sample.7
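The link between the measured shear wave velocity and the reported stiffness value can be sketched with the standard relation for a soft, nearly incompressible elastic medium. This relation is not stated in the text above and is given here as an assumption; the symbols E (stiffness as Young's modulus), ρ (tissue density, taken as roughly that of water), and v_s (shear wave velocity) are introduced only for illustration, and the velocity in the worked line is a made-up value.

```latex
% Assumed relation between shear wave velocity and stiffness
E = 3\,\rho\,v_s^{2}

% Illustrative arithmetic with a hypothetical velocity of 1.5 m/s
% and an assumed density of 1000 kg/m^3:
E = 3 \times 1000\ \mathrm{kg/m^{3}} \times (1.5\ \mathrm{m/s})^{2}
  = 6750\ \mathrm{Pa} = 6.75\ \mathrm{kPa}
```

Under this assumption, stiffness grows with the square of the velocity, so a stiffer (more fibrotic) liver corresponds to a faster shear wave.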
In addition to the data supporting TE as an accurate technique for detecting hepatic fibrosis in the native liver, a number of investigations examining TE after LT in different clinical settings have recently been published. Therefore, we conducted a systematic review and meta-analysis to characterize the diagnostic performance of TE versus liver biopsy for the detection of hepatic fibrosis in patients with recurrent HCV after LT.
AUC, area under the curve; AUROC, area under the receiver operating characteristic curve; BMI, body mass index; CI, confidence interval; df, degrees of freedom; HCV, hepatitis C virus; HTN, hypertension; HVPG, hepatic venous pressure gradient; I2, inconsistency index; kPa, kilopascals; LT, liver transplantation; QUADAS, Quality Assessment of Diagnostic Accuracy Studies; ROC, receiver operating characteristic; SE, standard error; SROC, summary receiver operating characteristic; TE, transient elastography.
MATERIALS AND METHODS
A computer-aided, systematic evaluation of the literature on TE for the assessment of fibrosis due to recurrent HCV after LT was performed with the following: MEDLINE/PubMed, Embase, Ovid, Cochrane Library, American College of Physicians Journal Club, Google Scholar, Database of Abstracts of Reviews of Effects, and Web of Science (from the inception of the database to October 12, 2010). An initial search strategy using free-text words (transient elastography, transplant, hepatitis C, and fibrosis) was conducted in all languages. Two authors (C.O.A. and J.A.T.) identified 104 articles. A manual search of the reference lists of the primary studies was then performed to locate any potential studies missed by the electronic search strategies. Published abstracts from annual meetings of the American Association for the Study of Liver Diseases, the European Association for the Study of the Liver, and Digestive Disease Week between October 2004 and October 2010 were also reviewed to identify potential studies.
Two independent reviewers (C.O.A. and J.A.T.) read all candidate articles (including abstracts), and they retrieved the full texts of published articles that could not be evaluated with the title and the abstract alone. Primary studies that reported data required for the meta-analysis were identified and included. We identified 10 full articles and 8 abstracts in which liver biopsy was listed as the reference for the assessment of TE for fibrosis in patients with recurrent HCV infections after LT.
The inclusion criteria for primary studies were as follows: a detailed description of the human subjects under study, a description of ultrasound-based TE as the index test, a description of liver biopsy as the reference standard, and the status of HCV-infected patients after LT. The inclusion of non–English language studies was allowed. Studies in which TE was compared to other noninvasive methods of hepatic fibrosis (ie, serum markers) were allowed if discrete information on TE alone could be extracted from the data. Studies including recipients with other etiologies of liver disease were included if the data for HCV-infected patients could be extracted. Special populations of HCV patients (eg, renal transplant recipients) and patients with hepatitis B virus or human immunodeficiency virus coinfections were excluded. We defined significant fibrosis as a fibrosis stage ≥2 for studies using grading systems with 5 stages (F0-F4; ie, the METAVIR, Knodell, Scheuer, and Desmet systems) or as a fibrosis stage ≥3 for studies using the Ishak scoring system (S0-S6).11 For grading systems using 5 stages and for the Ishak scoring system, cirrhosis was defined as a fibrosis stage of 4 or ≥5, respectively.11 For duplicate publications of a primary study, the updated article was chosen if the relevant data for the meta-analysis were available.
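The stage definitions above can be sketched as a small lookup. This is a minimal illustration, not the authors' procedure: the function name `classify_fibrosis` and the dictionary keys are hypothetical, and the cirrhosis cutoff for the 5-stage systems is read as stage 4, the top of the F0-F4 scale.

```python
# Five-stage (F0-F4) systems named in the inclusion criteria.
FIVE_STAGE_SYSTEMS = {"metavir", "knodell", "scheuer", "desmet"}


def classify_fibrosis(system, stage):
    """Map a histological stage to the two binary outcomes of the review.

    Hypothetical helper: significant fibrosis is stage >= 2 (5-stage systems)
    or >= 3 (Ishak); cirrhosis is stage 4 (5-stage systems) or >= 5 (Ishak).
    """
    name = system.lower()
    if name == "ishak":
        if not 0 <= stage <= 6:
            raise ValueError("Ishak stages run from 0 to 6")
        return {"significant_fibrosis": stage >= 3, "cirrhosis": stage >= 5}
    if name in FIVE_STAGE_SYSTEMS:
        if not 0 <= stage <= 4:
            raise ValueError("5-stage systems run from 0 to 4")
        return {"significant_fibrosis": stage >= 2, "cirrhosis": stage == 4}
    raise ValueError(f"unrecognized scoring system: {system}")


print(classify_fibrosis("METAVIR", 2))
print(classify_fibrosis("Ishak", 5))
```

Mapping every scoring system onto the same two binary outcomes is what allows studies using different histological scales to be pooled.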
Quality Assessment of the Primary Studies
Each of the studies meeting the inclusion criteria was analyzed by 2 independent reviewers (C.O.A. and J.A.T.) for quality with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) checklist. This tool is a 14-item instrument that allows for the identification of important design elements in diagnostic accuracy studies, such as the patient spectrum, the presence or absence of observer blinding and verification bias, the handling of indeterminate results, and the reporting of the loss of patients to follow-up evaluations. Discrepancies in results were handled by a consensus review.
Two reviewers independently extracted the required information from primary studies. Data elements that were prespecified for collection included the age, sex, and body mass index (BMI) of the patients; the sample size; the area under the receiver operating characteristic curve (AUROC); the median liver stiffness; the average liver biopsy size; the number of portal tracts; the histological score; the time between LT and liver biopsy; the HCV genotype; and the histological fibrosis stage. Other variables that were sought included the diagnostic threshold (or cutoff) values used for detecting hepatic fibrosis and the test performance characteristics.
The primary outcome for analysis was the diagnostic test performance of ultrasound-based TE versus the reference standard of liver biopsy for the detection of cirrhosis (stage 4 fibrosis) in patients with recurrent HCV infections after LT. The sensitivity values, specificity values, likelihood ratios, and diagnostic odds ratios with 95% confidence intervals (CIs) were reported for individual studies. The diagnostic odds ratio was defined as the odds of having a positive test result in patients with disease versus the odds of a positive test result in patients without disease. If zero cells were identified in the calculation of likelihood ratios, then a value of 0.5 was added to all cells to facilitate the analysis.
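The per-study measures described above can be sketched from a 2 × 2 table as follows. This is a minimal illustration under stated assumptions, not the software's implementation: the function name `diagnostic_metrics` and the example counts are hypothetical, and the 0.5 correction is applied here only to the likelihood ratios and the diagnostic odds ratio.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, likelihood ratios, and diagnostic odds ratio.

    tp, fp, fn, tn are the cells of a 2x2 table (index test vs biopsy).
    """
    # Sensitivity and specificity come from the raw counts.
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Continuity correction: if any cell is zero, add 0.5 to all cells
    # before computing the ratio measures.
    a, b, c, d = tp, fp, fn, tn
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))

    sens_c = a / (a + c)
    spec_c = d / (d + b)
    lr_pos = sens_c / (1 - spec_c)
    lr_neg = (1 - sens_c) / spec_c
    dor = (a * d) / (b * c)  # algebraically equal to lr_pos / lr_neg

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Hypothetical counts, not taken from any of the primary studies.
print(diagnostic_metrics(tp=18, fp=10, fn=2, tn=70))
```

Without the correction, a study with no false negatives would give a negative likelihood ratio of zero and an undefined diagnostic odds ratio.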
The heterogeneity of all diagnostic test parameters was evaluated initially with a graphic examination of forest plots for each parameter. A statistical assessment was then performed with a χ2 test of homogeneity and the inconsistency index (I2). The I2 statistic was defined as the percentage of variability due to heterogeneity beyond that from chance; values greater than 50% represented the possibility of substantial heterogeneity. The pooled summary statistics for the sensitivities, specificities, likelihood ratios, and diagnostic odds ratios of the individual studies were reported. Analyses were conducted to include diagnostic threshold values corresponding to the maximum sensitivity and specificity values from a receiver operating characteristic (ROC) curve analysis. Because of a priori assumptions about the likelihood of heterogeneity between the primary studies, the DerSimonian-Laird random effects model was used for the pooled analyses.
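The heterogeneity statistics and the random effects pooling described above can be sketched as follows. This is a generic illustration of the DerSimonian-Laird estimator, assuming each study contributes an effect on an additive scale (for example, a log diagnostic odds ratio) with its within-study variance; the function name and the input values are hypothetical, and the statistical software may differ in detail.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird random effects model.

    effects: per-study estimates on an additive scale (e.g. log odds ratios)
    variances: per-study within-study variances
    """
    w = [1.0 / v for v in variances]  # fixed effect (inverse variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # I2: percentage of variability beyond that expected from chance.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Method-of-moments estimate of the between-study variance.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau2 to every within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)

    return {"pooled": pooled, "se": se, "ci": ci, "q": q, "i2": i2, "tau2": tau2}


# Hypothetical log diagnostic odds ratios and variances for 5 studies.
result = dersimonian_laird([3.1, 2.4, 4.0, 3.6, 2.9], [0.40, 0.25, 0.90, 0.55, 0.30])
print(result)
```

When Q does not exceed its degrees of freedom, the between-study variance is set to zero and the random effects estimate coincides with the fixed effect estimate.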
Summary receiver operating characteristic (SROC) curves were also constructed to express the test parameter results as diagnostic odds ratios. These curves were also used to assess the presence of a diagnostic threshold bias as a cause of between-study heterogeneity. Analyses were performed with Meta-Disc 1.1.1 statistical software (Ramón y Cajal Hospital, Madrid, Spain).
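One common way to construct an SROC curve is the Moses-Littenberg linear model, sketched below. Whether this exact model and these options were used is an assumption; the function names and the input pairs are hypothetical. Each study's sensitivity and specificity are converted to D (the log diagnostic odds ratio) and S (a proxy for the diagnostic threshold), and D is regressed on S; a slope near zero suggests that accuracy does not depend on the threshold.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def moses_littenberg(pairs):
    """Fit D = a + b*S by ordinary least squares.

    pairs: list of (sensitivity, specificity), each strictly between 0 and 1.
    D = logit(TPR) - logit(FPR) is the log diagnostic odds ratio;
    S = logit(TPR) + logit(FPR) is a proxy for the diagnostic threshold.
    """
    d = [logit(se) - logit(1 - sp) for se, sp in pairs]
    s = [logit(se) + logit(1 - sp) for se, sp in pairs]
    n = len(pairs)
    s_mean = sum(s) / n
    d_mean = sum(d) / n
    sxx = sum((si - s_mean) ** 2 for si in s)
    sxy = sum((si - s_mean) * (di - d_mean) for si, di in zip(s, d))
    b = sxy / sxx if sxx > 0 else 0.0
    a = d_mean - b * s_mean
    return a, b


def sroc_sensitivity(a, b, fpr):
    """Sensitivity on the fitted SROC curve at a given false positive rate.

    Solving D = a + b*S for logit(TPR) gives
    logit(TPR) = (a + (1 + b) * logit(FPR)) / (1 - b).
    """
    y = (a + (1 + b) * logit(fpr)) / (1 - b)
    return 1 / (1 + math.exp(-y))


# Hypothetical (sensitivity, specificity) pairs for 5 studies.
a, b = moses_littenberg([(0.95, 0.70), (0.90, 0.85), (0.98, 0.80), (0.92, 0.90), (0.96, 0.75)])
print(a, b, sroc_sensitivity(a, b, 0.2))
```

The area under the fitted curve can then be approximated by evaluating `sroc_sensitivity` over a grid of false positive rates.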
RESULTS
The abstracts and titles of 104 primary studies were identified for an initial review with the aforementioned search strategies. A full-text review was required for 12 studies to determine study eligibility,7, 12-22 and 6 investigations were identified for inclusion in this study12, 14, 16, 18, 19, 21 (Table 1). Five studies provided data for the analysis of significant fibrosis,14, 16, 18, 19, 21 and 5 studies provided data for the analysis of cirrhosis12, 14, 18, 19, 21. The investigation by Corradi et al.16 provided only sensitivity and specificity data for the performance of TE in the detection of significant fibrosis; conversely, the study by Beckebaum et al.12 provided only sensitivity and specificity data for assessing the performance of TE in the detection of cirrhosis.
Table 1. Published Studies of TE for the Assessment of Fibrosis in HCV-Infected LT Recipients
The 6 studies, which were published as full articles, had very good quality scores according to the QUADAS criteria (ie, they fulfilled more than 10 of the 14 QUADAS items describing methodological quality).12, 14, 16, 18, 19, 21
Five studies reported quality criteria for liver biopsy specimens.12, 16, 18, 19, 21 Two investigations reported a minimum length of 15 mm,12, 19 two reported a range of lengths (all at least 16 mm),16, 21 and one reported a median length of 15 mm.18 One of the studies did not provide the length of the biopsy samples.14 Information on the number of portal tracts per liver biopsy was provided in only 2 studies16, 21 (Table 1). Four studies provided data on the HCV genotype14, 18, 19, 21 (Table 2). The Ishak histological scoring system was used in 1 study,21 the Scheuer system was used in 3 studies,14, 18, 19 the METAVIR system was used in 1 study,16 and the Batts-Ludwig system was used in 1 study12 (Table 2).
Table 2. Test Characteristics and Histological Scores From Published Studies of TE for the Assessment of Fibrosis in HCV-Infected LT Recipients
The demographic and clinical features of the patients in the analyzed studies are listed in Table 1. The median sample size of the studies assessing the presence of significant fibrosis was 90 (range = 56-124), and the median sample size of the studies assessing cirrhosis was also 90 (range = 50-124).12, 14, 16, 18, 19, 21 The median age of the patients with significant fibrosis was 58 years (range = 51.7-63.1 years); the median percentage of men was 66% (range = 54%-84%).14, 16, 18, 19, 21 In the studies that evaluated cirrhosis, the median age was 57.5 years (range = 51.7-63.1 years); the median percentage of men was 66% (range = 54%-81%).12, 14, 18, 19, 21 In the publications that discussed significant fibrosis and cirrhosis, the median BMIs for the subjects were 24.9 kg/m2 (range = 23.9-25 kg/m2) and 25.05 kg/m2 (range = 23.9-26.7 kg/m2), respectively.12, 14, 16, 18, 19, 21
In the analyzed studies, the median proportion of individuals with cirrhosis was 10% (range = 9%-25%).12, 14, 18, 19, 21 The diagnostic cutoff values for significant fibrosis ranged from 7.1 to 10.1 kPa,14, 16, 18, 19, 21 whereas the diagnostic cutoff values for cirrhosis ranged from 10.5 to 26.5 kPa.12, 14, 18, 19, 21
Summary Estimates of Primary Studies
For studies examining the presence or absence of significant fibrosis on liver biopsy samples, there was no qualitative evidence for obvious heterogeneity between the reported sensitivities according to a forest plot inspection (Fig. 1). However, there was evidence for statistical heterogeneity between the sensitivity values, which ranged from 72% to 95% (P = 0.03, I2 = 63.6%; Fig. 1A). A lower degree of heterogeneity was observed between the specificity values, which ranged from 76% to 91% (P = 0.26, I2 = 24.1%; Fig. 1B).
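The reported I2 and P values can be cross-checked against each other, as sketched below. The sketch assumes the usual definition I2 = 100 × (Q − df)/Q, with df equal to the number of studies minus 1 (4 for 5 studies); the function names are hypothetical, and the closed-form tail probability used here is valid only for an even number of degrees of freedom.

```python
import math


def q_from_i2(i2_percent, n_studies):
    """Recover Cochran's Q from I2, assuming I2 = 100 * (Q - df) / Q and I2 > 0."""
    df = n_studies - 1
    return df / (1 - i2_percent / 100)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Sensitivity for significant fibrosis: I2 = 63.6% across 5 studies.
q_sens = q_from_i2(63.6, 5)
print(round(q_sens, 2), round(chi2_sf_even_df(q_sens, 4), 2))

# Specificity for significant fibrosis: I2 = 24.1% across 5 studies.
q_spec = q_from_i2(24.1, 5)
print(round(q_spec, 2), round(chi2_sf_even_df(q_spec, 4), 2))
```

Under these assumptions, the recovered P values round to 0.03 and 0.26, which agrees with the values reported for sensitivity and specificity.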
For the diagnosis of significant fibrosis by TE, the pooled estimate for sensitivity was 83% (95% CI = 77%-88%), and the pooled estimate for specificity was 83% (95% CI = 77%-88%). The pooled positive likelihood ratio was 4.95 (95% CI = 3.4-7.2) without a demonstration of heterogeneity (P = 0.21, I2 = 32%), and the pooled negative likelihood ratio was 0.17 (95% CI = 0.09-0.35) with evidence of heterogeneity (P = 0.01, I2 = 68.1%). The summary diagnostic odds ratio was 30.5 (95% CI = 12.8-72.4) with borderline statistical heterogeneity (P = 0.09, I2 = 50.2%).
Among the studies examining the presence or absence of cirrhosis on liver biopsy samples, there was no evidence for heterogeneity in the reported sensitivity values (P = 0.61, I2 = 0.0%; Fig. 2A). In contrast, heterogeneity was observed between the specificity values reported for cirrhosis, which ranged from 65% to 98% (P < 0.001, I2 = 89.1%; Fig. 2B).
For the diagnosis of cirrhosis by TE, the pooled estimate for sensitivity was 98% (95% CI = 90%-100%), and the pooled estimate for specificity was 84% (95% CI = 80%-88%). For the pooled positive likelihood ratio of 7 (95% CI = 2.8-17.3), there was evidence of statistical heterogeneity (I2 = 87.9%). In contrast, for the pooled negative likelihood ratio of 0.06 (95% CI = 0.02-0.19), there was no evidence of statistical heterogeneity (I2 = 0.0%). The summary diagnostic odds ratio for cirrhosis was 130 (95% CI = 36.5-462.1), and there was no evidence of statistical heterogeneity (P = 0.80, I2 = 0.0%).
Diagnostic Threshold Bias and Meta-Regression Assessment
To assess the diagnostic threshold bias as a cause of heterogeneity in test performance, we created an ROC plot of the sensitivity versus 1 − the specificity. Among the 5 primary studies providing data for the detection of cirrhosis, the resulting curve yielded an area under the curve (AUC) of 0.9795, and this suggested no effect of a diagnostic threshold bias on the results (Fig. 3). In contrast, an ROC plot of the sensitivity versus 1 − the specificity revealed evidence supporting the diagnostic threshold bias as a major cause of heterogeneity among the 5 studies of patients with significant fibrosis (Fig. 4).
DISCUSSION
The development and refinement of noninvasive techniques for the detection of liver fibrosis have been motivated by an increased awareness of the limitations of liver biopsy.4, 19 In the post-LT setting, studies have identified TE as a method that can accurately predict the severity of allograft fibrosis in the HCV-infected patient.12, 14, 16, 18, 19, 21 In this systematic review and meta-analysis, we identified and evaluated primary studies from the published literature comparing TE with liver biopsy for the detection of significant fibrosis (5 studies) and cirrhosis (5 studies) in patients with recurrent HCV after LT. The results yielded excellent summary estimates of the sensitivity and specificity for detecting cirrhosis and good estimates for detecting significant fibrosis. The magnitude of the summary positive and negative likelihood ratios for detecting cirrhosis was consistent with the values seen for tests considered to provide strong diagnostic evidence for clinical decision making. For both patient subgroups, the summary results were also associated with varying degrees of statistical heterogeneity between the primary studies.
Indeed, differences in study design methodology are well-recognized causes of heterogeneity in meta-analyses of diagnostic tests. However, it is likely that subtle variations in the technical performances of TE and liver biopsy may also contribute to between-study variations.9 The size of liver biopsy tissue cores may also influence the accuracy of liver fibrosis staging.21 Criteria for liver biopsy specimens (ie, ≥20 mm in length and/or ≥11 complete portal tracts) have been described.8, 23 In practice, however, samples meeting these criteria are rarely achieved.8, 23, 24 In our analysis, only Rigamonti et al.21 and Corradi et al.16 reported liver biopsy specimens within the optimal length range of 20 to 25 mm. Hence, the observed heterogeneity may be secondary to intrinsic errors of liver biopsy measurements that limit the diagnostic accuracy of noninvasive evaluations.24, 25
The discrepancies in the findings of the studies may also be related to the use of different histological scoring systems. Rigamonti et al.21 used the Ishak score, which includes stages 0 to 6 to describe fibrosis (the METAVIR system consists of stages 0 to 4). Indeed, histological staging complexity has been shown to be relevant for the assessment, follow-up, and definition of the rate of fibrosis progression.26 However, these scores are categorical in nature and do not represent continuous variables measuring fibrosis on a linear scale. The current reliance on histological staging using categorical scores for liver biopsy samples is recognized as suboptimal for assessing efficacy, and this may be a source of heterogeneity.27
Interestingly, the 2006 study by Carrión et al.14 found a stronger correlation between the hepatic venous pressure gradient (HVPG) and liver stiffness than between the histological stage on liver biopsy and liver stiffness. However, in the absence of details about the parameters of the liver biopsy specimens, it is difficult for us to discern whether there is an adequate histological reference standard for HVPG or TE.23 The value of HVPG measurement as a dynamic test for assessing the progression of liver disease in the precirrhotic stage has been established.28 That is, HVPG accurately reflects the portal pressure in patients with HCV-related cirrhosis, and when it is adequately measured, it has a very low variability.28 However, the measurement of hepatic hemodynamics is an invasive procedure with several limitations, including expense and morbidity.14 Although TE cannot provide histological details, its ability to capture early increases in portal venous pressures suggests that it provides more information on disease severity than liver biopsy alone.14 We strongly encourage investigators to include HVPG measurements with liver biopsy in future studies examining the clinical utility of TE in assessing patients after LT.
The reproducibility of TE is not well established in the post-LT setting so far, but it may be reasonable to assume that the results will be similar to those in the nontransplant setting once operator experience is substantial.29 Furthermore, none of the studies reported an assessment of reproducibility between 2 operators, and this certainly raises questions about the effect of operator experience on the test results of individual studies.29
The BMIs of individual subjects are another likely source of variation in the study results (Table 1).14, 18, 19 In the 2008 study of Harada et al.,18 the BMI value was 23.9 kg/m2, which was similar to the BMI value of Carrión et al.'s study14 (25 kg/m2); AUROC values of 0.99 and 0.98, respectively, were obtained for the detection of liver cirrhosis. However, the studies of Beckebaum et al.12 and Kamphues et al.19 yielded lower AUROC values for the evaluation of liver cirrhosis (Table 1). A possible explanation was provided by Kamphues et al., who found that the sensitivity, specificity, negative predictive value, and positive predictive value reached higher values in a group with lower BMIs. In their work, patients with a BMI < 25 kg/m2 had an AUROC value of 0.91 for the diagnosis of liver cirrhosis (F4), whereas patients with a BMI > 25 kg/m2 (n = 46) had an AUROC value of 0.83. An elevated BMI is recognized as a limiting factor for achieving a valid test result with TE.22, 30 Notably, Rigamonti et al.21 reported an AUROC of 0.85 for the evaluation of cirrhosis (Table 1) despite an average BMI value of 24.8 kg/m2. Therefore, even in select patients with low BMIs, the diagnostic value of TE can be improved.19
The AUROC technique is the most frequently used method for measuring the diagnostic accuracy of noninvasive fibrosis indices.31 However, when the efficacy of TE is examined across different study populations, its performance may vary with the pathological, clinical, and comorbid characteristics of the patients.27 The AUROC depends on the proportion of patients at each fibrosis stage in the study sample, and its use may result in a significant loss of information.26 Moreover, the predominance of early fibrosis (ie, stages 0-1) and advanced fibrosis (ie, stage 4) in cohort samples will cause the highest type I error rates when the diagnostic performance is being assessed with AUROCs.31 The comparison of different AUROCs based on samples with different stage distributions thus may be affected by this spectrum bias in fibrosis stage distribution. In the primary studies that we analyzed, a wide spectrum of fibrosis stages was noted, and this leads to the possibility of a spectrum bias contributing to the heterogeneity identified in our results.27, 31
In patients undergoing transplantation for HCV-related disease, TE appears to be a reliable diagnostic test for the exclusion of liver cirrhosis. A low TE value can reliably exclude cirrhosis in patients with recurrent HCV after LT, and liver biopsy might even be avoided in these situations.19 The major limitation of TE and other noninvasive tests is the interpretation of results corresponding to intermediate stages of fibrosis; in these cases, liver biopsy will be required for diagnostic confirmation.32 As such, further studies on the serial assessment of recurrent HCV patients with TE are warranted to determine whether this approach can reliably detect fibrosis progression that would prompt antiviral therapy.18 TE has the potential to become an important tool in clinical practice because additional studies of patients with recurrent HCV after LT are expected to further refine the initial results presented here.