Accuracy of FGF‐21 and GDF‐15 for the diagnosis of mitochondrial disorders: A meta‐analysis

Abstract Objective Given their diverse phenotypes, mitochondrial diseases (MDs) are often difficult to diagnose. Fibroblast growth factor 21 (FGF‐21) and growth differentiation factor 15 (GDF‐15) represent promising biomarkers for MD diagnosis. Herein we conducted a meta‐analysis to compare their diagnostic accuracy for MDs. Methods We comprehensively searched PubMed, EMBASE, MEDLINE, the Web of Science, and Cochrane Library up to 1 January 2020. Data were analyzed by two independent reviewers. We obtained the sensitivity and specificity, positive and negative likelihood ratios (LR+ and LR‐), diagnostic odds ratios (DORs) and summary receiver operating characteristic (SROC) curves of each diagnostic method. Results Eight randomized controlled trials (RCTs) including 1563 participants (five encompassing 718 FGF‐21 assessments; seven encompassing 845 participants for GDF‐15) were included. The pooled sensitivity, specificity, DOR and area under the SROC curve of FGF‐21 were 0.71 (95% CI 0.53, 0.84), 0.88 (95% CI 0.82, 0.93), 18 (95% CI 6, 54) and 0.90 (95% CI 0.87, 0.92), respectively, all lower than the corresponding GDF‐15 values of 0.83 (95% CI 0.65, 0.92), 0.92 (95% CI 0.84, 0.96), 52 (95% CI 13, 205) and 0.94 (95% CI 0.92, 0.96). Interpretation FGF‐21 and GDF‐15 showed acceptable sensitivity and high specificity. Of the two biomarkers, GDF‐15 had the higher diagnostic accuracy.


Introduction
Mitochondrial diseases (MDs) are heritable multisystem metabolic disorders resulting from diverse genetic mutations in nuclear (nDNA) or mitochondrial DNA (mtDNA). 1,2 MD diagnosis remains challenging even for experienced clinicians because of the wide range of symptoms, particularly in children and the elderly. Effective diagnostics are also lacking, with current MD assessments based on clinical presentation, muscle biopsy, and next-generation sequencing (NGS). 3,4 However, these procedures are invasive and time-consuming. Historically, lactate, creatine kinase (CK), and pyruvate levels in the blood have been used for diagnosis, but these markers are nonspecific and lack sensitivity. 5 Given the complexity of the diagnostic process, more relevant mitochondrial biomarkers are needed in the clinic.
Fibroblast growth factor 21 (FGF-21) regulates lipid and glucose homeostasis. 6 It is secreted by the liver and functions by binding to cell-surface FGF receptors (FGFRs) and an essential coreceptor, β-klotho. 6,7 In 2005, 8 FGF-21 was revealed as a metabolic regulator. In 2011, 9 an analysis of 67 patients with MDs showed FGF-21 to be a biomarker. Since that first description, FGF-21 has attracted intense research attention. Salehi et al. 10 described it as an indicator to distinguish MDs from other diseases. Morovat et al. 11 suggested FGF-21 as a useful tool for MD examinations, particularly in patients with chronic progressive external ophthalmoplegia (CPEO). In 2019, Tsygankova et al. 12 concluded that FGF-21 levels are elevated in specific metabolic diseases, questioning its reliability as a diagnostic for MDs. The effectiveness of FGF-21 as an MD marker therefore remains uncertain.
Growth differentiation factor 15 (GDF-15) is a TGF-β family protein that is produced in response to inflammation and oxidative stress to maintain tissue homeostasis. 13,14 In 2014, GDF-15 was put forward as an MD diagnostic 15 in TK2-deficient human skeletal muscle. Similarly, in 2015, Yatsuga et al. 16 highlighted GDF-15 as a highly specific diagnostic in patients with suspected MDs. In 2016, Davis et al. 17 showed that GDF-15 outperformed FGF-21 as a predictor of MD. In 2019, Poulsen and colleagues 18 further showed the utility of serum GDF-15 from patients with mitochondrial myopathy for distinguishing MD from other myopathy-related diseases.
This meta-analysis was performed to analyze the effectiveness of current MD diagnostics. We comprehensively examined randomized controlled clinical trials to reinvestigate the diagnostic accuracy of FGF-21 and GDF-15 for MD patients.

Methods
The study was carried out following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy Studies (PRISMA), 19 Meta-analysis of Observational Studies in Epidemiology (MOOSE) 20 guidelines, and the Cochrane Handbook for Systematic Reviews of Interventions.

Database search
PubMed, EMBASE, MEDLINE, the Web of Science and Cochrane Library were searched for relevant studies. Searches were limited to trials published before 1 January 2020 and written in English. The following terms were used: ("mitochondrial disorders" OR "mitochondrial diseases" OR "mitochondrial myopathies" OR "oxidative phosphorylation deficiencies" OR "respiratory chain deficiency" OR "MDs") AND ("fibroblast growth factor 21" OR "FGF-21" OR "FGF21") AND ("growth differentiation factor 15" OR "GDF-15" OR "GDF15"). Reference lists were screened to identify other relevant studies.

Study inclusion/exclusion
The following inclusion criteria were used: (i) human studies; (ii) participants with MDs or mitochondrial related disease; (iii) FGF-21 or GDF-15 used as index tests, muscle biopsy (or genetic diagnosis) as reference standards; (iv) study design: randomized controlled trials (RCTs); (v) studies in which sufficient original data were provided. Specific exclusion criteria were as follows: (i) subjects who were not human beings or patients with MDs; (ii) literature published in the form of review, case report, letter, and commentary; (iii) articles not published in English language; (iv) duplicate publications.

Data extraction
Data were extracted by two independent researchers. In cases of disagreement, a third researcher reassessed the data to reach a consensus. For each study, the relevant information included: (1) first author and publication year; (2) number of patients; (3) patients' mean age and sex ratio; (4) diagnostic accuracy: sensitivity (Sn), specificity (Sp), positive and negative likelihood ratios (LR+ and LR-), true and false positives (TP and FP), false and true negatives (FN and TN). Authors were contacted for additional information when studies had incomplete data. If publications stemmed from overlapping sample data, those with the highest number of participants or the most detailed information were selected.
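The accuracy measures listed above all follow from a study's 2x2 table. The following is a minimal Python sketch of those standard definitions, not the authors' extraction procedure; the names `TwoByTwo` and `accuracy_measures` and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one study's 2x2 table (index test vs. reference standard)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return Sn, Sp, LR+, LR- and DOR for a single 2x2 table.

    Assumes every cell is non-zero; analyses often apply a continuity
    correction (for example 0.5 per cell) when a cell is empty.
    """
    sn = t.tp / (t.tp + t.fn)            # sensitivity
    sp = t.tn / (t.tn + t.fp)            # specificity
    lr_pos = sn / (1 - sp)               # positive likelihood ratio
    lr_neg = (1 - sn) / sp               # negative likelihood ratio
    dor = (t.tp * t.tn) / (t.fp * t.fn)  # diagnostic odds ratio
    return {"Sn": sn, "Sp": sp, "LR+": lr_pos, "LR-": lr_neg, "DOR": dor}


# Hypothetical counts for illustration only, not from any included study.
example = accuracy_measures(TwoByTwo(tp=40, fp=10, fn=10, tn=90))
```

Note that the DOR equals LR+ divided by LR-, so any one of the three can be checked against the other two when a study reports them inconsistently.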

Methodological quality assessment
We evaluated data quality and the risk of bias using QUADAS-2 (Quality Assessment Tool for Diagnostic Accuracy Studies-2). 21 Briefly, QUADAS-2 consists of four domains, including patient selection, index test, reference standard, and flow and timing. Each domain was assessed in terms of risk of bias (graded as low risk, high risk, or unclear risk), and the first three domains were also considered in terms of applicability (rated as low risk, high risk, or unclear risk). QUADAS-2 allowed for more objective rating of bias.

Statistical analysis
Heterogeneity between studies was investigated using the Cochran Q and I² statistics. I² values ≥ 50% were considered to indicate substantial heterogeneity, in which case a random-effects model was used. Otherwise, if I² was < 50% (indicating lower heterogeneity), a fixed-effects model was applied. 22 To construct 2x2 tables, TP, FP, TN, and FN were recalculated from the available parameters. The bivariate meta-analysis model was used to calculate pooled Sn, Sp, LR+, LR- and diagnostic odds ratio (DOR). 23 Based on the Sn and Sp for a single test threshold from each study, the summary receiver operator characteristic (SROC) curve was derived and the area under the curve (AUC) calculated. The primary outcome was the diagnostic accuracy of FGF-21 and GDF-15 for the diagnosis of MDs, expressed as Sn, Sp and AUC with corresponding 95% confidence intervals (CIs). Moreover, to explore the potential sources of heterogeneity, sensitivity analysis and subgroup analysis were carried out. Threshold effects were assessed by testing the correlation coefficient between sensitivity and specificity in the bivariate model; positive values indicated the possibility of heterogeneity. We also checked the significance of the beta coefficient in the hierarchical summary receiver operator characteristic (HSROC) model; P values < 0.05 represent high heterogeneity. For the assessment of publication bias, Deeks' funnel plot was performed. 24 P < 0.05 was the significance threshold. Data were analyzed using Review Manager 5.3 or STATA 15.0.
© 2020 The Authors. Annals of Clinical and Translational Neurology published by Wiley Periodicals LLC on behalf of American Neurological Association.
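The heterogeneity rule in the statistical analysis can be illustrated with a short Python sketch. It assumes the per-study effect is the log DOR with its usual approximate variance and inverse-variance weights; the article does not state which effect measure Q and I² were computed on, and the analyses were run in Review Manager and STATA, so the function names here are hypothetical.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its approximate variance (non-zero cells)."""
    effect = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return effect, variance


def cochran_q_i2(effects, variances):
    """Cochran's Q and I² (in percent) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I² = (Q - df) / Q, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


def choose_model(i2):
    """Apply the 50% rule stated in the text."""
    return "random-effects" if i2 >= 50 else "fixed-effects"


# Hypothetical 2x2 tables (tp, fp, fn, tn), for illustration only.
tables = [(40, 10, 10, 90), (25, 5, 20, 60), (30, 12, 8, 70)]
effects, variances = zip(*(log_dor(*t) for t in tables))
q, i2 = cochran_q_i2(effects, variances)
model = choose_model(i2)
```

This covers only the Q and I² step; the pooled Sn, Sp and SROC estimates in the article come from the bivariate and HSROC models, which this sketch does not attempt to reproduce.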

Data availability statement
The data used in this meta-analysis are available from the corresponding author to qualified investigators upon request.

Literature analysis
Databases were comprehensively searched up to 1 January 2020. Data were analyzed by two independent reviewers. Eight RCTs including 1563 participants (five encompassing 718 FGF-21 assessments; seven encompassing 845 participants for GDF-15) were included. The initial search yielded 673 references. After screening the titles, 381 were eliminated as duplicates. After assessment of the titles and abstracts, a further 244 studies were excluded as irrelevant records, basic experiments, reviews, case reports, comments, or articles not in English. Following full-text review of the 49 articles, 16 were excluded for lacking diagnostic accuracy assessments, 13 were excluded as they were not RCTs, five were excluded for overlapping participants, and seven were removed because 2x2 tables could not be constructed. Finally, eight studies were eligible for the meta-analysis (Fig. 1).

Patient characteristics
A total of four studies investigated both FGF-21 and GDF-15, a single study investigated FGF-21 alone, and three studies investigated GDF-15 alone. 9,12,16-18,25-27 Included studies were cohort studies published from 2011 to 2019. Individual studies included 49-194 cases. In total, FGF-21 and GDF-15 were measured via ELISA in 718 and 845 patients, respectively. These studies were carried out in Asia (China and Japan), Europe (Denmark, Finland, Netherlands, Russia, and Spain) and Oceania (Australia). All assays of FGF-21 and GDF-15 were performed in duplicate. The optimal cut-off values differed between studies and were based on the Youden index (sensitivity + specificity − 1). Basic characteristics are summarized in Table 1.
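Cut-off selection by the Youden index can be sketched as follows. This is a minimal illustration that assumes higher biomarker values indicate disease and that every observed value is tried as a candidate cut-off; the function name `youden_cutoff` and the sample values are hypothetical and do not come from the included studies.

```python
def youden_cutoff(patient_values, control_values):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    A value >= cutoff is called test-positive. Ties keep the lowest cutoff.
    """
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(patient_values) | set(control_values)):
        sn = sum(v >= c for v in patient_values) / len(patient_values)
        sp = sum(v < c for v in control_values) / len(control_values)
        j = sn + sp - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Hypothetical biomarker concentrations, for illustration only.
patients = [900, 1500, 2200, 3100]
controls = [300, 450, 600, 1000]
cutoff, j = youden_cutoff(patients, controls)
```

Because each study optimises its own cut-off in this way, the thresholds differ across studies, which is one reason threshold effects are examined in the pooled analysis.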

Study quality
Included studies had a low risk of bias. Six studies showed a low risk of bias in patient selection, six in the index test, and five in the reference standard. For flow and timing, one study was rated as unclear and the rest as low risk (Fig. 2).

Sensitivity analysis and subgroup analysis
Some heterogeneity was evident among the included studies. Goodness of fit and bivariate normality suggested that the bivariate random-effects model was suitable for the pooled analysis. Influence analysis and outlier detection identified two studies that may affect the robustness of the meta-analysis (Fig. 5). To evaluate the effect of each study on the summary results, we sequentially eliminated individual studies and performed exploratory subgroup analysis for the following factors: ethnicity, sample size, gender and age. Based on the results, we considered that ethnicity and sample size could be the main causes of heterogeneity. The exact values are given in Table 3.

Publication bias
Deeks' funnel plot asymmetry test was used to assess publication bias. The plots were symmetrical, indicating minimal publication bias in this meta-analysis (Fig. 6).

Discussion
To our knowledge, this is the first meta-analysis to summarize the current evidence on the diagnostic accuracy of FGF-21 and GDF-15 in detecting MDs. An important finding of this work is that only a small number of articles are available on this topic and that they were published relatively recently. This could partly reflect FGF-21 and GDF-15 being relatively new biomarkers for MDs. Based on the eight eligible studies, FGF-21 and GDF-15 were shown to be valid tools for MD diagnosis. Comparison between the two diagnostic indicators showed that GDF-15 was more sensitive and specific than FGF-21.
FGF-21 regulates glucose and lipid homeostasis. In 2000, Fgf-21 was documented as the 21st Fgf gene. 28 The function of FGF-21 was unknown until 2005, when it was identified as a metabolic regulator. 8 Eleven RCTs now address the diagnostic accuracy of FGF-21 in MDs, but only five were included in this meta-analysis. This was because we were unable to reconstruct the 2x2 tables in five studies 10,11,29-31 and eliminated a single study as it was a follow-up of one of the included articles. 32 Our meta-analysis showed medium sensitivity (0.71) and high specificity (0.88) for FGF-21. The AUC and DOR are overall measures of diagnostic accuracy: the higher the AUC and DOR, the higher the diagnostic accuracy. 33 Based on this, the diagnostic utility of FGF-21 was high (AUC = 0.90; DOR = 18). Other serum biomarkers such as lactate, pyruvate, CK, and the lactate-to-pyruvate ratio are considered nonspecific and lacking in sensitivity. According to previous reports, 9,34 the sensitivities of lactate and the lactate-to-pyruvate ratio were 63% and 44%, and the specificities were 93% and 100%, respectively. CK levels can be normal or mildly to moderately elevated, with poor sensitivity and specificity in MDs. There is also some evidence that plasma amino acids, urine organic acids (UOA), and acylcarnitines are useful for the diagnosis of MDs, but this evidence was mostly based on case studies. 35,36 Our meta-analysis showed that, compared with traditional serum biomarkers, FGF-21 was superior for discriminating between MDs and healthy controls.
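As a rough consistency check on the pooled FGF-21 estimates quoted above, the DOR can be approximated from the pooled sensitivity and specificity using the standard identity relating DOR to the likelihood ratios. This is only an approximation under the assumption that the point estimates can be combined directly; the pooled DOR in the article comes from the bivariate model, so small differences are expected.

```latex
\[
\mathrm{DOR} = \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}}
  = \frac{\mathrm{Sn}/(1-\mathrm{Sp})}{(1-\mathrm{Sn})/\mathrm{Sp}}
  = \frac{\mathrm{Sn}\cdot\mathrm{Sp}}{(1-\mathrm{Sn})(1-\mathrm{Sp})}
\]
\[
\mathrm{DOR}_{\text{FGF-21}} \approx \frac{0.71 \times 0.88}{0.29 \times 0.12}
  = \frac{0.6248}{0.0348} \approx 18.0
\]
```

The result agrees with the reported pooled DOR of 18 for FGF-21.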
GDF-15 is a promising diagnostic biomarker with high sensitivity and reproducibility for MDs and is a TGF-β cytokine expressed in the central and peripheral nervous systems. 36,37 Our meta-analysis summarizes the available evidence on the diagnostic accuracy of GDF-15 in MDs, including seven RCTs published from 2015 to 2019. The sensitivity, specificity, AUC and DOR for GDF-15 were 0.83, 0.92, 0.94 and 52, respectively.
Significant heterogeneity existed between the included studies for sensitivity and specificity. By analyzing the correlation coefficient between sensitivity and specificity in the bivariate model and the HSROC model, we considered the heterogeneity to be acceptable. Exploring the sources of heterogeneity is useful for understanding potential factors that affect diagnostic accuracy. We used a random-effects model and carried out exploratory subgroup analysis. The results showed that ethnicity and sample size might be potential sources of heterogeneity. However, due to the small number of studies and lack of …
FGF-21 and GDF-15 are highly sensitive and specific for MD diagnosis, particularly GDF-15. In addition to their diagnostic value, other advantages such as reproducibility, safety, cost-effectiveness, and time-efficiency must be considered to prove their clinical applicability. Regarding safety, compared with muscle biopsy, FGF-21 and GDF-15 are noninvasive and avoid complications (bleeding and infection). Regarding cost-effectiveness, genetic tests are the gold standard for MD diagnosis, but they are expensive. Less costly clinical evaluations should be used to assess the requirement for NGS analysis.
In recent years, it has been reported that GDF-15 can predict the therapeutic outcomes of mitochondrial treatment. In 2015, Tanaka et al. 40 used GDF-15 to assess the efficacy of pyruvate in 2SD hybrid cells (with MELAS-causing mutations) and control cells, identifying GDF-15 as a promising therapeutic indicator. In 2019, they enrolled 11 MD patients and confirmed these findings. 41
The major strengths of this meta-analysis are the rigorous protocols for the selection and assessment of eligible studies. Due to the relatively strict inclusion criteria, the included studies showed minimal bias and were of high quality. Some limitations should, however, be noted. Firstly, the small number of studies and the small sample sizes did not allow us to explore the data further. Secondly, during the selection process, a total of seven observational studies that focused on this topic were excluded due to incomplete data (2x2 tables could not be constructed). We attempted to contact the corresponding authors for their data, but without success. Thirdly, although we evaluated the diagnostic accuracy of FGF-21 and GDF-15 for MDs, we did not assess their relationship to clinical phenotype, severity, and progression given the limited number of RCTs and inadequate data. Finally, due to the small number of studies and incomplete data, the heterogeneity analysis in this meta-analysis has certain limitations, and the remaining unexplained heterogeneity may obscure our findings.
In summary, this meta-analysis was the first to assess the diagnostic value of FGF-21 and GDF-15 for MDs. Considering accuracy, safety, cost, availability, and efficiency, these two cellular factors may be viable biomarkers for MD diagnosis. GDF-15 seems to outperform FGF-21 as a diagnostic biomarker. GDF-15 should therefore be combined with FGF-21 as a first-line test when MDs are suspected and used to prioritize patients for invasive muscle biopsy. Further studies with larger samples are essential to establish the utility of these markers for guiding clinical management decisions and treatment selection.