Noninvasive tests for evaluation of fibrosis in HCV recurrence after liver transplantation: a systematic review

Authors


Andrew K. Burroughs FRCP, Professor of Hepatology, Liver Transplantation and Hepatobiliary Medicine, Royal Free Hospital, Pond Street, Hampstead, London NW3 2QG, UK. Tel.: +44 20 74726229; fax: +44 20 74726226; e-mail: andrew.burroughs@royalfree.nhs.uk

Abstract

Noninvasive tests (NIT) for evaluation of hepatic fibrosis have not been evaluated extensively in liver transplantation. We systematically reviewed the literature regarding NIT after liver transplantation. We identified 14 studies evaluating NIT based on serum markers and/or liver imaging techniques: 10 studies assessed NIT for fibrosis in recipients with recurrent HCV infection and four studies evaluated predictors of progression of fibrosis in recurrent HCV. Transient Elastography (TE) had good discrimination for significant fibrosis (median AUROC: 0.88). Among the serum NIT, APRI had good performance (median AUROC: 0.75). TE performed better than serum (direct and indirect) NIT for significant fibrosis with median AUROC 0.88 (vs. 0.66, P < 0.001), median sensitivity 0.86 (vs. 0.56, P = 0.002), median NPV 0.90 (vs. 0.74, P = 0.05) and median PPV 0.80 (vs. 0.63, P = 0.02). Compared to indirect serum NIT, TE had better performance, but it was not superior to the APRI score. Finally, direct and indirect NIT were not significantly different except for specificity (median: 0.83 vs. 0.69, respectively; P = 0.04). In conclusion, NIT could become an important tool in clinical management of liver transplant recipients, but whether they can improve clinical practice needs further evidence. Their optimal combination with liver biopsy and assessment of collagen content requires investigation.

Introduction

Histopathological examination of a liver biopsy is considered the gold standard for diagnosis and planning therapy in acute and chronic liver disease [1]. However, liver biopsy (LB) is an invasive procedure [2] and represents only 1/50 000 of liver mass [3], limiting the interpretation in diffuse liver disease [2]. Optimal size of biopsy, intra and inter-observer variation and sampling error are problems [2]. Insufficient size of biopsies is the major problem; for chronic viral hepatitis the LB should be ≥20–25 mm long and/or contain ≥11 complete portal tracts [4,5]. However, more than 1 pass with percutaneous LB is normally required to obtain optimal biopsy [2,6], increasing the risk of complications and costs [7–9]. In addition, LB is difficult to justify for repeated assessment.

Noninvasive tests (NIT) have been developed as substitutes for LB. They are patient friendly and simple procedures [10]. NIT evaluating fibrosis are of two types [11]. The first are serum markers: direct and indirect [10]. Indirect NIT comprise routine tests: age, platelet count, γGT and cholesterol (Forns’ index) [12], AST/platelet count (APRI index) [13], or α2-macroglobulin, haptoglobin, gamma globulin, apolipoprotein and bilirubin (Fibrotest) [14]. Direct NIT measure extracellular matrix components (glycoproteins, collagen IV, pro-collagen III) [15]. Some scores combine direct and indirect NIT.
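As an illustration of how simple the indirect indices are to compute, the sketch below implements APRI and FIB-4 (the latter is assessed in one of the reviewed studies) using their commonly published formulas. It is a minimal sketch, not taken from any of the reviewed papers: the function and parameter names are hypothetical, the upper limit of normal for AST is laboratory specific and must be supplied by the user, and the example patient values are invented.

```python
import math


def apri(ast_u_l, ast_upper_limit_u_l, platelets_10e9_l):
    """AST-to-platelet ratio index.

    APRI = (AST / upper limit of normal AST) / platelets (10^9/l) x 100.
    The upper limit of normal is an assumption supplied by the caller.
    """
    return 100 * (ast_u_l / ast_upper_limit_u_l) / platelets_10e9_l


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = (age x AST) / (platelets (10^9/l) x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


# Hypothetical recipient: AST 80 U/l (assumed upper limit 40 U/l),
# ALT 64 U/l, platelets 100 x 10^9/l, age 50 years.
print(apri(80, 40, 100))      # 2.0
print(fib4(50, 80, 64, 100))  # 5.0
```

Whether a given value indicates significant fibrosis depends on the cut-off chosen, which differed between the studies reviewed here.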

The second type of NIT derives from liver imaging [11]. Transient elastography (TE), based on ultrasound technology, measures hepatic elasticity, a surrogate of fibrosis [16,17]. Intra- and inter-observer agreement is 98% [18] in pre-LT settings.

There are few studies in liver transplant recipients, in whom evaluation of liver fibrosis during recurrent viral infection would be useful. Currently, serial liver biopsies are used to assess progression or severity of graft disease. However, the diagnostic accuracy of NIT could be reduced in transplant settings due to the possibility of multiple aetiologies for graft damage.

Methods

We systematically evaluated the literature regarding NIT assessing fibrosis due to recurrent HCV after LT. We searched the Medline/PubMed and Embase databases using the search terms ‘noninvasive’ test, ‘transient elastography’ and ‘liver transplantation’, in all languages. Two authors (EC, ET) identified 171 articles independently and conducted a manual search of reference lists and abstracts of Hepatology and Transplant congresses. We identified 14 full articles [19–32] and four abstracts [33–36] in which liver biopsy was used as the reference investigation for fibrosis assessment. Special populations of HCV patients (e.g. renal transplant recipients) were excluded. All studies published as full papers had very good scores on the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool for systematic reviews [37], while the four studies published in abstract form scored poorly [33–36]. Thus, we evaluated only the 14 full articles [19–32] (Table 1), excluding the four abstracts. However, we performed a sensitivity analysis in order to establish if the results would change if the abstracts were included. Studies including recipients with other aetiologies of liver disease were included if data for HCV infected patients could be extracted. EC and ET performed data abstraction and any conflicts were arbitrated by AKB. We defined significant fibrosis as fibrosis stage ≥2 for grading systems with 5 (F0–F4) stages (METAVIR, Knodell, Scheuer and Desmet scores) and fibrosis stage ≥3 for the Ishak score. The Mann–Whitney U-test was used to compare the performance of NIT. Significance testing was two sided and set at P < 0.05. We used the area under the receiver-operating characteristic curve (AUROC) for comparison of performance between different NIT. No selected studies were excluded.
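The Mann–Whitney comparison described above can be sketched as follows. This is a minimal illustration and not the analysis code used for this review: the function names and the example AUROC values are hypothetical, tied values receive average ranks, and the two-sided P value is obtained by exact enumeration of all possible group assignments, which is feasible only for small groups such as sets of study-level estimates.

```python
from itertools import combinations


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_exact(x, y):
    """Return (U for group x, exact two-sided permutation P value)."""
    ranks = midranks(list(x) + list(y))
    n1, n2 = len(x), len(y)

    def u_from(rank_sum):
        return rank_sum - n1 * (n1 + 1) / 2

    u_observed = u_from(sum(ranks[:n1]))
    centre = n1 * n2 / 2  # expected U under the null hypothesis
    extreme = total = 0
    # Enumerate every way of assigning n1 of the pooled ranks to group x.
    for chosen in combinations(range(n1 + n2), n1):
        u = u_from(sum(ranks[i] for i in chosen))
        total += 1
        if abs(u - centre) >= abs(u_observed - centre) - 1e-12:
            extreme += 1
    return u_observed, extreme / total


# Hypothetical study-level AUROC values for two types of test.
u, p = mann_whitney_exact([0.90, 0.92, 0.94], [0.60, 0.70, 0.80])
print(u, p)  # 9.0 0.1
```

With only three values per group, even complete separation cannot reach P < 0.05, which illustrates why comparisons based on a handful of studies should be interpreted cautiously.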

Table 1. Published studies of noninvasive tests (NIT) for assessment of fibrosis in HCV liver transplant recipients.

| Study | Patients, n | Liver imaging NIT | Serum NIT | Characteristics of LB | Histological score | Time interval between NIT-LB | HCV genotype | Fibrosis stage | Comments |
|---|---|---|---|---|---|---|---|---|---|
| Benlloch, 2005 [19]* | 188 | None | Benlloch index | ≥15 mm or ≥6 CPT (needle diameter: 1.2 mm) | Knodell | 60 days | NA | F0–F1: 77%, ≥F2: 23% | The first NIT derived from patients with recurrent HCV |
| Carrion J, 2006 [20] | 124 | TE | None | Trucut 14 G | Scheuer | 15 days | 1/2/3/4: 104/3/4/1 patients | F0–F1: 57%, ≥F2: 43% | Strong correlation between TE and HVPG. AUROC for HVPG ≥ 6 mmHg: 0.93 |
| Toniutto, 2006 [21] | 51 | None | APRI, AST/ALT, age/PLT, Forns, Bonacini | NA | Ishak | NA | NA | F0–F2: 76%, ≥F3: 24% | All five NIT performed better in female recipients |
| Harada, 2008 [22] | 56 | TE | APRI, ALT, hyaluronic acid, collagen IV | Percutaneously (14-G needle) | Scheuer | 2 weeks | 1/2/others: 45/6/5 patients | F0–F1: 62%, ≥F2: 38% | ALT had low AUROC (0.54) |
| Rigamonti, 2008 [23] | 90 | TE | APRI | Median: 32 mm. All ≥11 CPT. Menghini needle (16 G) | Ishak | Concurrent | Genotype 1: 68 patients | F0–F2: 75%, ≥F3: 25% | Stage, grade and G-GT >200 IU/l: independent factors of TE values |
| Corradi, 2008 [24] | 56 | TE | APRI, Forns, Fibrotest, Benlloch index | Median length 29 (16–55) mm. CPT: 9 to >20. Menghini needle (17–18 G) | METAVIR | 40 days | NA | F0–F1: 68%, ≥F2: 32% | TE was superior to the other serum NIT |
| Cross, 2010 [25] | 185 | None | LTC score, APRI | ≥10 mm or ≥10 CPT | Ishak | NA | NA | F0–F2: 46%, ≥F3: 54% | LTC score was superior to APRI |
| Kamphues, 2010 [26] | 94 | TE | APRI, FIB-4 | ≥15 mm | Scheuer | 2 days | 1/2/others/unknown: 62/1/7/24 | F0–F1: 32%, ≥F2: 68% | Higher AUROC values in patients with BMI <25 kg/m² |
| Pungpapong, 2008 [29] | 46 | None | Hyaluronic acid, YKL-40 | Length: 24 mm, CPT: 13.1 | Ishak | NA | Genotype 1: 80% | NA | Direct NIT had better performance compared to hepatic stellate cell activity |
| Carrion, 2009 [27] | 84 | TE | None | PLB (14 G) or TJLB (16 G) (>5 mm length) | Scheuer | Fibroscan at 6, 9 and 12 months; LB/HVPG at 12 months | 1/2/3/4: 73/1/3/1 patients | F0–F1: 55%, ≥F2: 45% | The AUROC for HVPG ≥6: 0.87 (estimation group) and 0.80 (validation group) |
| Carrion, 2009 [28] | 133 | None | Hyaluronic acid, PIIINP and TIMP-1 | PLB (length 17 mm, 14 G needle)/TJLB (length 15 mm, 16 G needle) | Scheuer | Serum markers at 3, 6 and 12 months; LB/HVPG at 12 months | 1/2/3/4: 117/2/7/3 patients | F0–F1: 62%, ≥F2: 38% | At 12 months, 3-M-ALG ≥ 2 identified most patients at risk of decompensation/death |
| Micheloud, 2009 [32] | 19 | None | Direct serum markers | NA | METAVIR | Serum markers at 3 months; LB/HVPG at 12 months | NA | NA | |

*This index was evaluated prospectively (as HULF index) in 86 patients with HCV recurrence [30] from the same centre. In this study [30], diameter of biopsy needle: 1.2 mm, histological score: Knodell, median length and portal tracts of biopsy specimens: 25 (3–50) mm and 12 (range: 6–16). The same study group developed an artificial neural network [31] based on serum cholesterol, AST, ALP, albumin, sodium, platelet count and prothrombin time.

NIT, noninvasive tests; TE, Transient Elastography; LB, liver biopsy; HVPG, hepatic venous pressure gradient; PIIINP, procollagen III N-terminal peptide; TIMP-1, tissue-inhibitors of metalloproteinases; CPT, complete portal tracts; LTC, London Transplant Centres; NA, not available.

Results

Ten studies assessed NIT for fibrosis due to recurrent HCV infection [19–26,30,31], but two studies [19,31] used the same patient cohort. One study [30] prospectively validated the original noninvasive index from the same group [19]. Four studies evaluated predictors of progression of fibrosis [27–29,32]. Ten studies reported quality criteria of biopsy specimens [19,23–31], but only three evaluated optimal LB [23,24,29]. Nine studies reported needle biopsy size [19,20,22–24,27,28,30,31], only six studies the number of complete portal tracts [19,23–25,29,30], and seven [20,22,23,26–29] documented HCV genotypes. Fibrosis stage was given in all but two [29,32] studies and histological score in all studies: Ishak in four [21,23,25,29], Knodell in three [19,30,31], Scheuer in five [20,22,26–28], and METAVIR in two [24,32]. Two studies [22,23] documented patients excluded due to technical problems, and 10 the interval between NIT and liver biopsy or LT [19,20,22–24,26–28,31,32] (Table 1).

NIT and severity of fibrosis

In four studies [22–24,26] the discriminative ability of TE was compared to serum NIT. In another two studies [21,25], two or more serum NIT were compared, while a further study compared TE to direct and indirect serum NIT [22].

NIT based on liver imaging techniques (transient elastography)

Transient elastography was evaluated in five studies [20,22–24,26] (Table 1) with 420 patients and mean age 55 ± 6 years. TE was compared to direct and/or indirect serum NIT in four studies [22–24,26], and to the hepatic venous pressure gradient (HVPG) in one [20] (Table 2).

Table 2. Diagnostic performance of transient elastography for significant fibrosis in recipients with recurrence of HCV after liver transplantation.

| Study | Carrion 2006 [20] | Harada 2008 [22] | Rigamonti 2008 [23] | Corradi 2008 [24] | Kamphues 2010 [26] |
|---|---|---|---|---|---|
| Number of patients | 124 | 56 | 90 | 56 | 94 |
| Non-invasive test | TE | TE | TE | TE | TE |
| Histological score | Scheuer | Scheuer | Ishak | METAVIR | Scheuer |
| Prevalence of significant fibrosis*, % | 43 | 38 | 25 | 32 | 68 |
| Proposed cut-offs (kPa) | 8.5 | 9.9 | 7.9 | 10.1 | 8.5 |
| Sensitivity | 0.90 | 0.90 | 0.81 | 0.94 | 0.72 |
| Specificity | 0.81 | 0.91 | 0.76 | 0.89 | 0.83 |
| Negative predictive value | 0.92 | 0.94 | 0.88 | 0.94 | 0.58 |
| Positive predictive value | 0.79 | 0.86 | 0.65 | 0.81 | 0.90 |
| Area under the ROC curve | 0.90 | 0.92 | 0.85 | 0.94 | 0.81 |

*Significant fibrosis was defined as the presence of fibrosis stage ≥2 for METAVIR or Scheuer scores and fibrosis stage ≥3 for Ishak score.

NA, not available; TE, transient elastography.

TE had good discriminative ability for significant fibrosis [F ≥ 2 for METAVIR [24], or Scheuer [20,22,26], and F ≥ 3 for Ishak [23]] (AUROC range: 0.81–0.94) with high sensitivity (range: 0.72–0.94) and specificity (range: 0.76–0.91). In three studies [20,22,24], the discriminative ability for significant fibrosis was higher (AUROC: 0.90, 0.92 and 0.94, respectively), compared to the other two studies [23,26] (AUROC: 0.85 and 0.81, respectively) (Table 2). These differences were not due to sizes of liver biopsies: median lengths were 32 mm [23] and 28 mm [24], although three studies [20,22,26] did not report length. Different scoring systems, patient numbers and different proportions with significant fibrosis may have contributed (Table 2). Nevertheless, all five studies [20,22–24,26] had high sensitivity and specificity for significant fibrosis. However, except in one study [26], NPV (range: 0.58–0.94) was superior to PPV (range: 0.65–0.90) (Table 2). Thus, similar to the pre-LT setting, TE was more useful for exclusion of significant fibrosis than for diagnosing the precise stage of fibrosis.

Only four studies [20,22,23,26] evaluated the diagnostic performance of TE for cirrhosis, defined as fibrosis stage 4 according to the Scheuer score [20,22,26] or fibrosis stage 5–6 according to the Ishak score [23]. TE performed very well (AUROC: 0.87–0.99, and NPV: 0.99–1). However, PPV was lower and varied widely (0.50 [20], 0.83 [22], 0.74 [23] and 0.23 [26]) (Table 3).

Table 3. Diagnostic performance of transient elastography for the diagnosis of cirrhosis in HCV patients after liver transplantation.

| Study | Carrion 2006 [20] | Harada 2008 [22] | Rigamonti 2008 [23] | Kamphues 2010 [26] |
|---|---|---|---|---|
| Number of patients | 124 | 56 | 90 | 94 |
| Non-invasive test | TE | TE | TE | TE |
| Prevalence of cirrhosis* (histological score) | 9% (Scheuer) | 9% (Scheuer) | 17% (Ishak) | 9.6% (Scheuer) |
| Proposed cut-offs (kPa) | 12.5 | 26.5 | 12 | 10.5 |
| Sensitivity | 1 | 1 | 0.93 | 1 |
| Specificity | 0.87 | 0.98 | 0.93 | 0.65 |
| Negative predictive value | 1 | 1 | 0.99 | 1 |
| Positive predictive value | 0.50 | 0.83 | 0.74 | 0.23 |
| Area under the ROC curve | 0.98 | 0.99 | NA | 0.87 |

NA, not available.

*Cirrhosis was defined as the presence of fibrosis stage 4 for Scheuer score and fibrosis stage 5–6 for Ishak score.
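The combination of near-perfect NPV and low PPV for cirrhosis in Table 3 follows largely from the low prevalence of cirrhosis in these cohorts. The sketch below applies Bayes' theorem to the sensitivity, specificity and prevalence reported for Kamphues et al. [26]; it is an illustration only, the function name is hypothetical, and the 40% prevalence in the second call is an invented value used for contrast.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' theorem.

    All three inputs are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values reported for cirrhosis by Kamphues et al. [26] in Table 3:
# sensitivity 1, specificity 0.65, prevalence 9.6%.
ppv, npv = predictive_values(1.0, 0.65, 0.096)
print(round(ppv, 2), round(npv, 2))  # 0.23 1.0

# Same test characteristics at a hypothetical prevalence of 40%.
ppv_high, _ = predictive_values(1.0, 0.65, 0.40)
print(round(ppv_high, 2))  # 0.66
```

The first result matches the PPV of 0.23 reported in Table 3, and the second shows how the same cut-off would appear far more useful for ruling in cirrhosis in a cohort where cirrhosis is common.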

NIT based on direct markers

There was only one study [22] evaluating collagen type IV and hyaluronic acid. The discriminative ability for fibrosis stage ≥2 (Scheuer score) was low (0.62 and 0.52, respectively), with relatively high specificity (0.83), but very low sensitivity (0.52 and 0.38), NPV (0.74 and 0.69) and PPV (0.65 and 0.57), making them insufficiently accurate for clinical use.

NIT based on indirect serum markers

Nine studies [19,21–26,30,31] evaluated indirect serum markers for significant fibrosis (fibrosis stage ≥2 METAVIR in 1 [24], Scheuer in 2 [22,26], and fibrosis stage ≥3 Ishak in 3 [21,23,25] and Knodell in 3 [19,30,31]) (Table 4), but only one [26] evaluated cirrhosis. All studies evaluated serum NIT, as used in nontransplanted patients, except the London Transplant Centres (LTC) score [25], Benlloch (or HULF) index [19,30] and an artificial neural network [31].

Table 4. Diagnostic performance of indirect serum noninvasive tests for significant fibrosis in recipients with recurrence of HCV after liver transplantation.

| Study* | Toniutto†, 2006 [21] | Benlloch, 2006 [19]‡ | Benlloch, 2009 [30] | Corradi, 2008 [24] | Harada, 2008 [22] | Kamphues, 2010 [26] | Kamphues, 2010 [26] | Corradi, 2008 [24] | Corradi, 2008 [24] | Corradi, 2008 [24] | Cross§, 2010 [25] |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of patients | 51 | 188 | 86 | 56 | 56 | 94 | 94 | 56 | 56 | 56 | 185 |
| Non-invasive test | APRI | Benlloch index‡ | HULF (Benlloch) index‡ | Benlloch index | APRI | APRI | FIB-4 | APRI | Forns’ | Fibrotest | LTC score§ |
| Prevalence of significant fibrosis¶ | 24% | 23% | 32% | 32% | 38% | 68% | 68% | 32% | 32% | 32% | 54% |
| Proposed cut-offs | 1.40.20.20.20.840.482.81.390.810.7 |
| Sensitivity | 0.76 | 0.74 | 0.90 | 0.67 | 0.73 | 0.70 | 0.44 | 0.59 | 0.62 | 0.56 | 0.84 |
| Specificity | 0.77 | 0.69 | 0.35 | 0.78 | 0.91 | 0.63 | 0.87 | 0.74 | 0.65 | 0.61 | 0.63 |
| Negative predictive value | 0.93 | 0.90 | 0.88 | 0.82 | 0.76 | 0.80 | 0.42 | 0.81 | 0.75 | 0.74 | 0.64 |
| Positive predictive value | 0.46 | 0.42 | 0.40 | 0.62 | 0.63 | 0.80 | 0.88 | 0.50 | 0.50 | 0.44 | 0.75 |
| Area under the ROC curve | 0.80 | 0.80 | 0.68 | 0.79 | 0.70 | 0.68 | 0.66 | 0.81 | 0.71 | 0.56 | 0.82 |

*Rigamonti et al. [23] also evaluated APRI (AUROC: 0.59).

†Toniutto et al. [21] also evaluated AST/ALT (AUROC: 0.749), age/PLT ratio (AUROC: 0.659), Forns’ index (AUROC: 0.723) and Bonacini (AUROC: 0.785) without further data.

‡Benlloch index: prothrombin time, albumin/total protein ratio, AST, and time since liver transplantation. This index (as HULF index) [30] was prospectively evaluated by the same study group [19], as was an artificial neural network [31] with higher AUROC compared to the Benlloch index (AUROC: 0.93 vs. 0.84, respectively).

§LTC score: INR, AST, PLT, and time from liver transplantation. In the same study [25], AUROC for APRI: 0.72.

¶Significant fibrosis was defined as the presence of fibrosis stage ≥3 for Ishak score and ≥2 for the other scores. NA: not available.

Six studies evaluated discriminative ability (AUROC) of APRI for significant fibrosis [21–26] with a range between 0.59 and 0.81 for cut-off values between 0.48 and 1.4. These different cut offs are possibly related to using different scores (Scheuer [22,26], METAVIR [24], and Ishak [21,23,25]). APRI had better specificity (range: 0.63–0.91), than sensitivity (range: 0.59–0.76), while, similar to TE, NPV (range: 0.76–0.93) was always better than PPV (range: 0.46–0.80) (Table 4).

Forns’ index was evaluated in two studies [21,24], with 51 and 56 patients, respectively, with similar discriminative ability for fibrosis stage ≥2 METAVIR [24] and fibrosis stage ≥3 Ishak [21] (AUROC: 0.71 and 0.72, respectively). Sensitivity, specificity, NPV and PPV were reported in one study [24] (Table 4). Fibrotest was evaluated in 56 patients [24], but its discriminative ability was not good (AUROC: 0.56) with poor PPV (44%) (Table 4). FIB-4 was assessed in one study [26]. The AST/ALT ratio, Bonacini index and age/PLT ratio were evaluated by Toniutto et al. [21], but only their discriminative ability for fibrosis stage ≥3 Ishak score was reported (Table 4).

The LTC score was derived from 185 patients with recurrent HCV based on INR, AST, PLT, and time from LT [25]. The Benlloch index was derived from patients with recurrent HCV after LT [19], and is based on prothrombin time, albumin/total protein ratio, AST, and time since LT. This index [19] had very good discriminative ability for fibrosis stage ≥3 Knodell with AUROC 0.80 (training set) and 0.84 (validation set). However, prospective validation in 86 patients (HULF index) gave an AUROC of 0.68 for fibrosis stage ≥3 Knodell [30], and its external validation [24] in 56 patients gave an AUROC of 0.80. An artificial neural network based on serum cholesterol, AST, ALP, albumin, sodium, platelet count and prothrombin time [31] was developed by the same group [19]. This had higher discriminative ability for significant fibrosis, compared to the Benlloch index (AUROC: 0.93 vs. 0.84).

Comparison between TE and serum NIT

In all studies, TE performed better than serum NIT (direct and indirect combined), with better median AUROC (0.88 vs. 0.66, P < 0.001), sensitivity (median: 0.86 vs. 0.56, P = 0.002), NPV (median: 0.90 vs. 0.74, P = 0.05) and PPV (median: 0.80 vs. 0.63, P = 0.02) (Fig. 1). Compared to indirect serum NIT, TE had significantly better performance with AUROC (0.88 vs. 0.70, P = 0.002), sensitivity (median: 0.86 vs. 0.59, P = 0.01) and PPV (median: 0.80 vs. 0.65, P = 0.05) (Fig. 2), but TE was not superior to APRI score. Direct NIT had similar performance to TE. Finally, there were no significant differences between direct and indirect NIT, regarding discriminative ability, sensitivity, NPV, PPV, but direct NIT had better specificity (median: 0.83 vs. 0.69, P = 0.04).

Figure 1.

 Box plots and comparison of discriminative ability (AUROC), sensitivity, negative predictive value (NPV) and positive predictive value (PPV) between transient elastography (TE) and serum noninvasive tests (NIT) (direct and indirect) for evaluation of significant fibrosis (≥2 for grading systems with five stages and fibrosis stage ≥3 for systems with six stages).

Figure 2.

 Box plots and comparison of discriminative ability (AUROC), sensitivity, negative predictive value (NPV) and positive predictive value (PPV) between transient elastography (TE) and indirect serum noninvasive tests (NIT) for evaluation of significant fibrosis (≥2 for grading systems with 5 stages and fibrosis stage ≥3 for systems with six stages) in patients with HCV recurrence after liver transplantation.

NIT to predict progression of fibrosis in HCV recurrence (Table 1)

Predictive ability of TE

Rigamonti et al. [23] evaluated 40 patients who had paired protocol LB and TE examinations separated by 6–21 months. Changes in fibrosis staging (Ishak), positively correlated with percentage changes in TE values (Spearman r = 0.71, P < 0.0001) with a high sensitivity and specificity in predicting increases in stage (86% and 92%, respectively). To date a confirmatory study has not been published.

One study [27] evaluated 84 patients between 3 and 12 months after LT with TE. Protocol liver biopsies were performed at 12 months after LT, together with hepatic venous pressure gradient (HVPG) measurements in 74 patients. Patients with fibrosis stage ≥2, compared to <2 (Scheuer), had significantly higher median liver stiffness values at 6 months (9.9 kPa vs. 6.9 kPa), 9 months (9.5 kPa vs. 7.5 kPa) and 12 months (12.1 kPa vs. 6.6 kPa) (P < 0.001 at all time points). Similarly, patients with HVPG ≥6 mmHg, compared to those with HVPG <6 mmHg, had significantly higher median liver stiffness values. In multivariate analysis, donor age, bilirubin, and TE were independent predictors of rapid fibrosis progression, with an AUROC at 6 months of 0.83 (training group) and 0.75 (validation group). The authors concluded that there were ‘rapid and slow fibrosers’, which could be easily separated with early and repeated TE measurements. This study is potentially important but needs confirmation.

Predictive ability of serum markers

The same study group [28] evaluated whether serum NIT could predict the evolution of HCV recurrence in 133 patients at 3, 6 and 12 months after LT using hyaluronic acid, procollagen III N-terminal peptide (PIIINP) and tissue-inhibitors of metalloproteinases (TIMP-1). Patients had protocol liver biopsies at 1 year, and 94 had concomitant measurement of HVPG. All three serum markers were individually significantly associated with fibrosis stage and HVPG. An algorithm combining them (3-M-ALG) had a discriminative ability (AUROC) at 3, 6 and 12 months of 0.67, 0.77 and 0.78, respectively, for predicting fibrosis stage ≥2 (Scheuer). For predicting an HVPG ≥6 mmHg at 12 months, the AUROC at 3, 6 and 12 months was 0.75, 0.87 and 0.90, respectively. However, the optimal cut off of 3-M-ALG at 6 months post-LT only correctly identified 21% with fibrosis score F ≥ 2 (PPV: 100%) and 44% with HVPG ≥ 6 mmHg (PPV: 89%) showing again that predicting the absence of significant fibrosis appears to be far better than predicting its presence.

Micheloud et al. [32] evaluated 37 consecutive patients transplanted for either HCV (n = 19) or alcohol related cirrhosis (n = 18). At 1 year post-LT, 12 (63%) of 19 patients with HCV had severe recurrence, defined as a METAVIR score F ≥ 2 and/or HVPG value ≥6 mmHg. Direct fibrosis indices measured at 3 months, the interferon-inducible protein (IP)-10 (cut off >59 pg/ml), vascular cell adhesion molecule (sVCAM) (cut off >1481 ng/ml) and hyaluronic acid (cut off >461 pg/ml) had the best ability to predict severe recurrence of HCV at 1-year post-LT (AUROC 0.74, 0.89 and 0.80, respectively).

Pungpapong et al. [29] evaluated 46 recipients. Fast fibrosis was defined as an increase in fibrosis score ≥2 from first to second biopsy (mean interval of 33 ± 6 months). Serum hyaluronic acid (HA) and YKL-40 at baseline were both significantly higher in rapid fibrosers, compared to slow fibrosers (HA: 367 μg/l vs. 71 μg/l, P = 0.007; YKL-40: 711 μg/l vs. 101 μg/l, P = 0.001). Both HA and YKL-40 predicted progression of fibrosis (AUROC: 0.89 and 0.92, respectively). However, similar to previous studies [32,33], no comparison with liver function tests (ALT/AST) was performed, and the clinical value of both serum markers remains unconfirmed.

When the above analyses were repeated by including the excluded abstracts in the appropriate categories, this did not change the interpretation of the combined data from the full papers.

Discussion

Although NIT have the potential to become an important tool in clinical practice [38], several aspects of the evaluation of NIT need further consideration. To date, the major limitation of NIT is the identification of intermediate stages of fibrosis [39,40]. In addition, NIT cannot discriminate between different pathologies coexisting in any patient. The latter is a particular challenge to the appropriateness of using NIT in the LT setting, because liver graft damage could be the result of multiple aetiologies, which may coexist. In this respect, abnormal liver function tests serve the same function, i.e. to indicate possible graft dysfunction of whatever cause. This could in part explain why the diagnostic performance of TE for fibrosis was lower in liver transplant recipients with recurrence of HCV, compared to nontransplanted patients with chronic HCV infection, as there may be other associated causes of inflammation and fibrosis causing graft dysfunction.

The accuracy of diagnostic tests depends on identifying correctly the abnormal versus normal (calibration) and identifying the abnormal result correctly in a range of severity (discrimination). Calibration thus compares the predicted stage with the actual stage across the spectrum of fibrosis, and it is considered the best approach to evaluate the diagnostic performance of NIT [41]. However, this has never been done in liver transplant recipients. Indeed, most studies have only evaluated the discriminative ability (AUROC) of NIT. This is expressed as a plot of sensitivity versus 1-specificity, but its accuracy is related to the relative prevalence of the stages of fibrosis present in the cohort under evaluation [42]. In the LT setting, NIT have only been evaluated in a relatively low number of patients with significant fibrosis. Thus, the ‘true’ discriminative ability has not been tested adequately, as has been the case in the nontransplant setting [43].
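For readers less familiar with the AUROC, the sketch below shows one way it can be computed directly: as the proportion of all (patient with significant fibrosis, patient without) pairs in which the patient with fibrosis has the higher test value, with ties counted as one half. This is an illustration under stated assumptions, not code from any reviewed study: the function names, the liver stiffness values and the 8.5 kPa cut-off applied to them are all hypothetical.

```python
def auroc(with_fibrosis, without_fibrosis):
    """AUROC as the probability that a diseased patient scores higher.

    Ties contribute one half. Equivalent to the area under the plot of
    sensitivity against 1 - specificity across all possible cut-offs.
    """
    wins = 0.0
    for d in with_fibrosis:
        for h in without_fibrosis:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(with_fibrosis) * len(without_fibrosis))


def sensitivity_specificity(with_fibrosis, without_fibrosis, cutoff):
    """Sensitivity and specificity when values >= cutoff count as positive."""
    sens = sum(v >= cutoff for v in with_fibrosis) / len(with_fibrosis)
    spec = sum(v < cutoff for v in without_fibrosis) / len(without_fibrosis)
    return sens, spec


# Hypothetical liver stiffness values (kPa).
fibrosis = [12.1, 9.9, 8.0]
no_fibrosis = [6.9, 7.5, 8.5, 6.6]

print(round(auroc(fibrosis, no_fibrosis), 3))               # 0.917
print(sensitivity_specificity(fibrosis, no_fibrosis, 8.5))  # sensitivity 2/3, specificity 3/4
```

Because the calculation depends on which patients make up the two groups, a cohort dominated by extreme stages (for example F0 and F4) will tend to give a higher AUROC than one dominated by adjacent stages, which is the spectrum effect referred to above.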

In addition, NIT, all of which have continuous scores, have been correlated with categorical variables, i.e. the stage scores, which are only descriptive categories of fibrosis. These are not only different amongst the various histological scoring systems but also do not have an arithmetical progression, e.g. stage 2 fibrosis (F2) is neither twice the severity of stage 1 (F1) nor half the severity of stage 4 (F4) in METAVIR [44]. A methodologically more correct comparison would be between NIT scores and a quantitative measurement of liver fibrosis. However, clinical correlations with quantification of liver collagen have not been extensively evaluated. Calvaruso et al. [45], from our centre, evaluated 115 recipients with recurrence of HCV: collagen proportionate area was independently associated with the presence of HVPG ≥6 mmHg (odds ratio: 1.206; P < 0.001) or HVPG ≥10 mmHg (odds ratio: 1.105; P = 0.009), and not with the fibrosis stage according to Ishak score, while Isgro et al. [46] found that collagen proportionate area had a better correlation with TE than HVPG. Quantitative correlations with collagen content in biopsy specimens could help to validate NIT, and may help reduce errors, particularly when evaluating LB of suboptimal quality. A further problem is that some patients with recurrent HCV have sinusoidal fibrosis, particularly if cholestasis is a clinical feature. This type of fibrosis has not been evaluated extensively with TE and may not be adequately evaluated by it [47]. Lastly, even discounting the issue of a gold standard for the histological quantification of collagen, the problem of assessing NIT in both transplant and nontransplant settings is the quality of the current gold standard, i.e. accurate histological assessment of stage. This is another reason for discordant results between LB and NIT. Indeed, in the LT setting, only three studies evaluated optimal liver biopsies (liver samples of 20–25 mm length and/or containing ≥11 complete portal tracts) [23,24,29], suggesting this is likely to be a major source of error.

A common finding in the studies of HCV recurrence (Table 1) was that TE always had a significantly better performance, compared to serum NIT. Amongst the direct and indirect serum NIT, APRI was always superior to other serum NIT (e.g. Forns’ index, Bonacini’s score, Fibrotest). Similar to TE, APRI had high NPV (median: 81%), but its PPV was lower (median: 56%) (Table 4). Interestingly, no study of the NIT evaluated the distinction between no fibrosis and any fibrosis in the LT setting. The reproducibility of NIT is a major issue in clinical practice as already seen in the pre-transplant setting. There are different cut offs for NIT values reported for prediction of significant fibrosis or cirrhosis in the LT setting compared to the nontransplant setting [35]. Thus, validation and standardization studies are needed. In fact, only the LTC score [25] and the HULF (formerly Benlloch) index [30] were derived from patients with recurrent HCV. Only one index has been validated prospectively [30], showing inferior performance compared with the original study [19]. It was estimated that this index would have prevented 24% of the biopsy procedures performed, but of course it cannot identify whether this could benefit the specific patient for whom a biopsy is considered.

Our review provides some encouraging results regarding NIT for evaluation of significant fibrosis or cirrhosis after LT, but we identified several limitations in addition to the ones discussed above. Most studies in the LT setting had small cohorts, with no validation cohort (Table 1). In contrast to most studies evaluating NIT in the pre-LT setting [48], only two studies [21,22] evaluated the performance of liver function tests in comparison with other NIT in liver transplant recipients. Interestingly, in one of these [21], the AST/ALT ratio had much better discriminative ability (AUROC: 0.75) for significant fibrosis (fibrosis stage ≥3 Ishak). Importantly, only two studies [22,23] documented patients who were excluded due to unsuccessful examination with TE. Indeed, as metabolic syndrome is frequent in liver transplant recipients [49], and obesity can prevent obtaining reliable values, the applicability of TE after LT may be less than pre-transplant. For example, in one study [50], the overall success of TE in liver transplant recipients was 82.7%, but among patients with BMI >30 kg/m² it was only 50%.

Similar to the pre-LT setting, further studies are necessary to elucidate the precise association between NIT and HVPG determination [51,52], as it is significantly correlated with fibrosis progression and early prediction of liver decompensation [53]. After LT, some studies have encouraging results for TE [20,27] and/or serum NIT [28,32] for the presence of portal hypertension or to predict its course in patients with recurrent HCV. In addition, further studies are needed to evaluate the impact of anti-viral therapy on NIT values, as this was assessed in only one study [54].

The optimal time for initiation of anti-viral therapy in patients with HCV recurrence remains controversial. We identified four studies using TE [27] or direct [28,29,32] serum NIT, evaluating the prediction of the course of recurrent HCV. However, these studies [27–29,32] evaluated different NIT, at different time points, after LT. Thus, their data cannot be analyzed further to assess which methodology might be best. Three studies [27,28,32] evaluated NIT in the presence of portal hypertension (HVPG ≥ 6 mmHg). Based on this limited data, NIT could prove useful for the noninvasive detection of portal hypertension. For example, the 3-M-ALG had excellent discriminative ability for diagnosis of HVPG ≥ 6 mmHg at 12 months after LT.

In conclusion, given their excellent acceptance and simplicity, NIT have the potential to become an important tool in liver transplant recipients [55] as they could reduce the need for protocol liver biopsies in the evaluation of fibrosis progression post-LT. However, the positive predictive value for the development of significant fibrosis needs to improve significantly. An initial diagnostic biopsy will still be needed, but follow up for fibrosis could be based on NIT. Further studies with better validation in larger cohorts are needed in order to establish the precise association of NIT values and cut off values with the corresponding histological lesions and collagen content of liver biopsies, using optimally sized biopsies as a reference standard. The correlation with abnormal liver function tests should be evaluated further. Ideally, NIT should be evaluated according to other features, including patient outcomes after LT, response following antiviral treatment and cost-effectiveness. A major shortcoming of the existing literature is the failure to demonstrate the cost-effectiveness of these measures. Studies comparing collagen content of liver biopsies, and using only optimal biopsies to provide the best reference standard, still need to be performed. In the future, new imaging techniques (e.g. magnetic resonance elastography) and novel serum markers may overcome some limitations of the existing NIT highlighted in this review.

Authorship

EC, AKB: participated in research design and in the writing of the paper. EC, ET: participated in the performance of the research; JG, AKB: participated in data analysis.
