Diagnostic accuracy of fibrosis tests in children with non‐alcoholic fatty liver disease: A systematic review

Abstract

Background & Aims: Non‐alcoholic fatty liver disease (NAFLD) has become the most common chronic liver disease in children. Even at a young age, it can progress to liver fibrosis. Given the drawbacks of liver biopsy, there is a need for non‐invasive methods to accurately stage liver fibrosis in this age group. In this systematic review, we evaluate the diagnostic accuracy of non‐invasive methods for staging liver fibrosis in children with NAFLD.

Methods: We searched MEDLINE, Embase, Web of Science and the Cochrane Library for studies that evaluated the performance of a blood‐based biomarker, prediction score or imaging technique in staging liver fibrosis in children with NAFLD, using liver biopsy as the reference standard.

Results: Twenty studies with a total of 1787 NAFLD subjects were included, which evaluated three prediction scores, five simple biomarkers, two combined biomarkers and six imaging techniques. Most studies lacked validation. Substantial heterogeneity of studies and limited available study data precluded a meta‐analysis of the few fibrosis tests evaluated in more than one study. The most consistent accuracy data were found for transient elastography by FibroScan®, ELF test and ultrasound elastography, with an area under the receiver operating characteristic curve varying between 0.92 and 1.00 for detecting significant fibrosis.

Conclusion: Due to the lack of validation, the accuracy and clinical utility of non‐invasive fibrosis tests in children with NAFLD remain uncertain. As studies have solely been performed in tertiary care settings, accuracy data cannot directly be translated to screening populations.


| INTRODUCTION
Due to the obesity epidemic, non-alcoholic fatty liver disease (NAFLD) has become the most common chronic liver disease in children and adults. 1 The pooled prevalence of NAFLD in children with obesity is 34% (95% CI: 27.8% to 41.2%). 2 Simple steatosis, or non-alcoholic fatty liver (NAFL), is the first stage of the NAFLD spectrum and is defined as fat accumulation in more than 5% of the hepatocytes in the biopsy specimen on histological evaluation.
A NAFLD subtype characterized by significant inflammation is categorized as non-alcoholic steatohepatitis (NASH) and can progress to severe stages of fibrosis and cirrhosis. 1 Although most children with NAFLD will have simple steatosis, advanced fibrosis is reported in up to 17% of children referred to liver centres after screening, 3,4 and some cases of NAFLD-related cirrhosis in children have been reported. 5,6 Evidence shows that fibrosis is the most important predictor of liver-related complications in adults, such as liver failure and hepatocellular carcinoma, and is associated with increased overall mortality. 7,8 Therefore, liver fibrosis represents the most clinically relevant determinant of long-term outcomes in this disorder. 7 The development of fibrosis at a young age is considered worrisome and, although long-term longitudinal studies to prove this are lacking, could be related to a higher risk of developing long-term liver and non-liver complications. Current paediatric guidelines recommend screening for fibrosis in children with NAFLD but do not specify which test should be used to assess fibrosis. 1,9 In addition, accurate tests could serve as surrogate endpoints in future paediatric therapeutic trials. 10
Liver biopsy is the current reference standard to determine the stage of liver fibrosis in patients with NAFLD. However, in addition to the risk of complications, the costly and invasive nature of this procedure makes it unsuitable for screening purposes or for monitoring disease progression in this highly prevalent disorder. 11 Moreover, the diagnostic accuracy of liver biopsy is not optimal due to sampling variability caused by the often patchy distribution of NAFLD in the liver and interobserver and intraobserver variability of the histological interpretation. 12 Therefore, there is an urgent need for accurate, safe and cost-effective alternatives for staging liver fibrosis in patients with NAFLD.
Over the past decade, many fibrosis tests have been developed, ranging from simple laboratory tests to more complex biomarkers or prediction scores as well as imaging techniques. 13 Although most of these tests were developed and validated in the adult population, several research groups have investigated their utility in the paediatric population. 14
Fatty Liver Disease, children, diagnosis and fibrosis. No date limit was applied to the search. The bibliographic reference lists of included articles and reviews were manually searched. Article selection was completed in April 2020.

| Selection criteria
Articles were included if they fulfilled the following criteria: (a) the study included patients with biopsy-proven NAFLD/NASH/steatosis, and in case of inclusion of other causes of chronic liver disease, the study provided discrete data on the NAFLD population separately; (b) the study consisted of children up to 18 years of age, or reported separately on children if adults were included; (c) the study evaluated the performance of a blood-based biomarker, prediction score or imaging technique to detect different stages of liver fibrosis; (d) liver biopsy was used as the reference test; (e) the study included ≥60 participants or the diagnostic test was reported in ≥2 studies; and (f) the study provided enough data to construct a 2 × 2 table.
Studies were excluded if they did not meet the inclusion criteria, (a) had a case report, case series, conference abstract or commentary design or (b) were conducted in animal subjects. No language restriction was used.

| Data extraction and quality assessment
Two authors (L.D. and J.O.) independently screened the titles and abstracts to identify articles that met the inclusion criteria using Rayyan software (https://rayyan.qcri.org). Then, the full texts of the potentially eligible studies were screened independently by the two authors. Data extraction was performed independently by two authors (L.D. and S.Z.) using a predesigned data extraction form. For studies that also included adults or patients with other liver diseases and did not report paediatric or NAFLD data separately, the authors were contacted and asked to provide raw data. The study design, patient characteristics,

| Data analysis
MedCalc was used to calculate the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+) and negative likelihood ratio (LR−) of the tests. 16 Review Manager version 5.3 was used for quality assessment and creating figures. A meta-analysis could not be performed for any of the evaluated fibrosis tests: a hierarchical summary ROC (HSROC) curve and summary sensitivities and specificities could not be constructed because studies reported different thresholds and used different settings of magnetic resonance elastography (MRE) and ultrasound elastography.
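For reference, all of the accuracy measures listed above follow from the 2 × 2 table of index test result against liver biopsy. The sketch below is only an illustration and does not reproduce the statistical software used in this review; the function name and the example counts are hypothetical, and the function assumes that no row or column total of the table is zero.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures from a 2 x 2 table (index test vs. liver biopsy).

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives. Illustrative helper only.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    false_positive_rate = fp / (fp + tn)
    # LR+ is unbounded when no biopsy-negative child tests positive.
    lr_pos = sensitivity / false_positive_rate if fp > 0 else float("inf")
    # LR- is unbounded when no biopsy-negative child tests negative.
    lr_neg = (1 - sensitivity) / specificity if tn > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts, not taken from any included study.
example = diagnostic_accuracy(tp=18, fp=4, fn=2, tn=36)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on the prevalence of fibrosis in the study sample, values obtained this way in tertiary care cohorts would not carry over unchanged to screening populations.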

| Study characteristics
Characteristics of the 20 included studies are provided in Table 1.

| Patient characteristics
In total, 1787 subjects with NAFLD were included. The mean age of the NAFLD patients ranged from 8. 5

| Methodological quality assessment
The results of the methodological quality assessment of the individual studies using the QUADAS-2 tool are presented in Figure 2 and are summarized in Figure 3. Only three studies had a low risk of bias in all four domains. In all studies, there were concerns about applicability regarding patient selection because patients were recruited in tertiary hospitals and were selected to undergo liver biopsy on clinical grounds. In two studies, there were concerns about applicability regarding the index test. These studies evaluated time-harmonic elastography, a technique developed by Hudert et al. 19 that is not commercially available, and the S-probe of the FibroScan® (Echosens, France), which Alkhouri et al. 34 evaluated in children with overweight/obesity. This specific probe was developed for children with a chest circumference < 75 cm and is generally unsuitable for children with obesity.
Among the prediction scores and biomarkers, only the three studies using the ELF test alone or combined with PNFI showed an AUC greater than 0.90. However, the optimal threshold for the ELF test alone as reported by Nobili et al. (9.28) 24 could not be reproduced by Alkhouri et al., who reported a far lower optimal threshold (8.49). 17 All other evaluated biomarkers had a lower AUC and reported either a low sensitivity or a low specificity at their optimal threshold. The PNFI is the only test with accuracy data reported at a wide range of thresholds. Nobili et al. found that a PNFI score < 3 could be used to rule out fibrosis with a sensitivity of 96%, and a score of ≥9 could be used to rule in fibrosis with a specificity of 98%. 26 However, this left 56% of patients with an undetermined classification. Alkhouri et al. validated these thresholds with similar accuracy results and 52% of patients with an undetermined classification. 17 They subsequently combined the PNFI with the ELF test for patients with an undetermined classification, which allowed all patients to be classified with an overall sensitivity of 86% and specificity of 89%. 17
Most imaging techniques showed higher accuracy than prediction scores and biomarkers for detecting mild fibrosis. Near perfect accuracy was reported for TE by FibroScan® (AUC 0.98, 95% CI: 0.90-0.99), with a sensitivity and specificity both >90% at a threshold of 5.1 kPa in the only study available. 23
None of the studies validated their results externally. Three studies performed internal cross-validation using bootstrapping. 22 for the optimal threshold in the latter as described above.
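The two-threshold use of the PNFI described above (rule out below 3, rule in at 9 or higher, otherwise undetermined), followed by a second test for undetermined results only, can be sketched as follows. This is a minimal illustration and not the algorithm of the cited studies: the function names and the example scores are hypothetical, the ELF cut-off applied in the combined approach is not given in the text and must therefore be supplied by the caller, and treating a higher ELF value as indicating fibrosis is an assumption of this sketch.

```python
def classify_pnfi(pnfi, rule_out_below=3.0, rule_in_from=9.0):
    """Three-way classification using the PNFI thresholds quoted in the text."""
    if pnfi < rule_out_below:
        return "ruled out"
    if pnfi >= rule_in_from:
        return "ruled in"
    return "undetermined"


def classify_sequential(pnfi, elf, elf_threshold):
    """Apply a second test only when the PNFI result is undetermined.

    elf_threshold has no default on purpose: the cut-off used in the
    combined approach is not stated in the text (assumption required).
    """
    first = classify_pnfi(pnfi)
    if first != "undetermined":
        return first
    # Assumption: values at or above the cut-off indicate fibrosis.
    return "ruled in" if elf >= elf_threshold else "ruled out"


def undetermined_fraction(pnfi_scores):
    """Share of patients left unclassified by the PNFI alone."""
    labels = [classify_pnfi(score) for score in pnfi_scores]
    return labels.count("undetermined") / len(labels)


# Hypothetical PNFI values, not patient data.
scores = [1.2, 2.9, 3.0, 5.5, 8.9, 9.0, 9.7, 4.4]
print(undetermined_fraction(scores))
```

The sketch makes the trade-off visible: widening the gap between the two thresholds raises sensitivity for ruling out and specificity for ruling in, at the cost of a larger undetermined group that needs a second test.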

| Detecting significant fibrosis (≥F2)
Ten studies evaluated the accuracy of non-invasive methods to detect significant fibrosis (≥F2). These studies addressed the prediction score PNFI, the biomarkers HA, CK-18 and procollagen type III amino terminal peptide (PIIINP) and the imaging techniques MRE, TE by FibroScan®, time-harmonic elastography and SWE.
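The AUC for a given fibrosis cut-off, such as ≥F2, can be read as the probability that a randomly chosen child at or above the cut-off has a higher test value than a randomly chosen child below it, with ties counted as one half. A minimal sketch of that calculation is given below, assuming that higher test values indicate more fibrosis; the function name and the example values are hypothetical and are not study data.

```python
def auroc(test_values, fibrosis_stages, cutoff=2):
    """AUC of a continuous test for detecting fibrosis stage >= cutoff.

    Computed by pairwise comparison of biopsy-positive and
    biopsy-negative subjects; tied pairs count as 0.5.
    """
    positives = [v for v, s in zip(test_values, fibrosis_stages) if s >= cutoff]
    negatives = [v for v, s in zip(test_values, fibrosis_stages) if s < cutoff]
    if not positives or not negatives:
        raise ValueError("both outcome groups must be present")
    score = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                score += 1.0
            elif p == n:
                score += 0.5
    return score / (len(positives) * len(negatives))


# Hypothetical stiffness values (kPa) and biopsy stages (F0-F4).
stiffness = [4.1, 4.8, 5.0, 6.2, 7.5, 9.0]
stages = [0, 1, 2, 1, 2, 3]
print(round(auroc(stiffness, stages, cutoff=2), 3))
```

Changing `cutoff` to 3 gives the corresponding value for advanced fibrosis on the same data, which is why one test can have different AUCs for different fibrosis stages.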

| Detecting advanced fibrosis (≥F3)
Nine studies evaluated the accuracy of non-invasive methods to detect advanced fibrosis (≥F3). These studies addressed the predic-
g Method for finding optimal threshold not reported.

REGISTRATION
The protocol of this systematic review is available in PROSPERO: CRD42019117504.

PATIENT CONSENT
Not applicable for systematic review.

ACKNOWLEDGEMENT
No funding was received for the purpose of this study.

CONFLICT OF INTEREST
None of the authors have any disclosures to report.

ETHICS APPROVAL
Not applicable for systematic review.