Systematic evaluation, verification and comparison of tuberculosis‐related non‐coding RNA diagnostic panels

We systematically summarized tuberculosis (TB)‐related non‐coding RNA (ncRNA) diagnostic panels and validated and compared their performance. We searched for TB‐related ncRNA panels in PubMed, OVID and Web of Science up to 28 February 2020, and for available datasets in GEO, SRA and EBI ArrayExpress up to 1 March 2020. We rebuilt the models and synthesized the results of each model in validation sets using bivariate mixed models. Specificity at 90% sensitivity, area under the curve (AUC) and the inconsistency index (I²) were calculated. The biological functions of the ncRNAs were analysed. Nineteen models based on 18 ncRNA panels (miRNA, lncRNA, circRNA and snoRNA panels) and 18 datasets were included. Because few datasets were available, only miRNA panels could be evaluated further. Cui 2017 and Latorre 2015 exhibited specificity >70% at 90% sensitivity and AUC >80% in all validation sets. Cui 2017 showed higher specificity at 90% sensitivity (92%) and AUC (95%) and lower heterogeneity (I² = 0%) in aetiologically confirmed validation sets. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses indicated that most ncRNAs in the panels were involved in immune cell activation, oxidative stress, and the Wnt and MAPK signalling pathways. Cui 2017 outperformed the other models in both all available and aetiologically confirmed validation sets, meeting the criteria of the WHO target product profile. This work provides, to a certain extent, a basis for the clinical choice of TB‐related ncRNA diagnostic panels.

with improved diagnostic and predictive performance. 3,15 These panels serve different clinical purposes, including distinguishing patients with active TB from healthy controls (HCs) or individuals with latent TB infection (LTBI), and predicting TB progression. However, these panels are not widely applied in the clinic. Of note, different panels use different sample types (whole blood (WB), serum, etc), ncRNA types (miRNAs, lncRNAs, etc) and modelling methods (logistic regression, linear combination, etc). This diversity makes it difficult for clinicians to choose an optimal panel. Besides, most ncRNA panels were selected and validated in participants of the same ethnicity; thus, their robustness and generalizability are unclear, and it is hard to guarantee their performance in different settings.
Moreover, relevant studies have not provided proper approaches to tailor the structure of ncRNA panels to meet corresponding needs.
Furthermore, detecting multiple ncRNAs simultaneously in a panel is still technically challenging. 16 Both the number of ncRNAs in panels and diagnostic performance should be carefully considered.
TB-related host response gene diagnostic signatures (ie based on coding RNAs) have been systematically evaluated. 17 However, to our knowledge, no systematic assessment of ncRNA diagnostic panels in TB has been reported. Here, we (i) systematically evaluated

| MATERIALS AND METHODS
The design of this work is shown in Figure 1.

| Collecting ncRNA panels
To identify eligible ncRNA panels, we searched PubMed, OVID and Web of Science from database inception up to 28 February 2020. We limited the species to Homo sapiens, but not study type or language. The search terms included TB (tuberculosis) AND diagnosis ("diagnose" OR "diagnostic" OR "panel" OR "signature" OR "combination" OR "profile") AND ncRNA ("non-coding RNA" OR "miRNA" OR "microRNA" OR "lncRNA" OR "long non-coding RNA" OR "circular RNA" OR "circRNA" OR "PIWI-interacting RNA" OR "piRNA" OR "small nucleolar RNA" OR "snoRNA" OR "small nuclear RNA" OR "snRNA"). Reference lists of relevant studies and articles which cited relevant studies as references were also reviewed.
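As an illustration of how the three concept groups above combine, the following minimal sketch assembles them into a single Boolean string. The quoting and field syntax actually accepted by PubMed, OVID and Web of Science differ, so the helper `or_block` and the resulting string are assumptions for illustration, not the exact query submitted in this work.

```python
# Term groups copied from the search strategy described in the text.
TB_TERMS = ["tuberculosis"]
DIAGNOSIS_TERMS = ["diagnose", "diagnostic", "panel", "signature",
                   "combination", "profile"]
NCRNA_TERMS = ["non-coding RNA", "miRNA", "microRNA", "lncRNA",
               "long non-coding RNA", "circular RNA", "circRNA",
               "PIWI-interacting RNA", "piRNA", "small nucleolar RNA",
               "snoRNA", "small nuclear RNA", "snRNA"]


def or_block(terms):
    """Join the synonyms of one concept with OR (hypothetical helper)."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


# The concepts are intersected with AND; synonyms within a concept use OR.
query = " AND ".join(or_block(group)
                     for group in (TB_TERMS, DIAGNOSIS_TERMS, NCRNA_TERMS))
print(query)
```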
We only included articles that constructed ncRNA panels to diagnose TB based on peripheral blood or its components, and excluded studies focusing on the diagnostic performance of a single ncRNA (see Figure S1).

Two investigators (Lyu M and Cheng Y) independently performed the search, data extraction and assessment of modelling quality based on the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement, 18 and disagreements were resolved by discussion with a third investigator (Zhou J).

| Identifying eligible microarray and sequencing data
We searched public databases including NCBI GEO, NCBI Sequence Read Archive (SRA) and European Bioinformatics Institute (EBI) ArrayExpress on 1 March 2020, with the terms of TB or its full name AND non-coding RNA or its alternative terms, as described above.

TABLE 1 The characteristics of included panels
We did not restrict detection methods and platforms. We included studies using peripheral blood or its components, but excluded studies using cultured human blood cells infected with MTB in vitro such as GSE94007 and GSE145770. The expression profiles of ncR-

| Model rebuilding
Included models were rebuilt in R where necessary. To reproduce each model exactly, we used the same modelling method and parameters as the original article (Text S1). Where available, we trained the model on the original data or on similar datasets. We compared the diagnostic performance of the rebuilt and original models to confirm the accuracy of rebuilding (Table S1). The remaining available datasets were treated as validation sets. We excluded polymerase chain reaction (PCR) data to maintain consistency between the training set and validation sets and to keep sufficient dataset coverage.
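The rebuild-then-validate step can be sketched as follows for a logistic-regression panel. This is a Python illustration only (the models in this work were rebuilt in R), and the intercept, coefficients and expression values are hypothetical placeholders rather than parameters of any included panel.

```python
import numpy as np


def logistic_score(expression, intercept, coefficients):
    """Predicted probability from a logistic model with fixed coefficients."""
    linear = intercept + np.asarray(expression, dtype=float) @ np.asarray(
        coefficients, dtype=float)
    return 1.0 / (1.0 + np.exp(-linear))


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as one half.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cases, controls = scores[labels], scores[~labels]
    diff = cases[:, None] - controls[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


# Hypothetical two-miRNA panel: these numbers are NOT from any cited study.
INTERCEPT = -1.0
COEFFICIENTS = [0.8, -0.5]

# Hypothetical validation set: rows are samples, columns are the two miRNAs.
validation_expression = np.array([[3.0, 1.0], [2.5, 0.5],
                                  [1.0, 2.0], [0.5, 1.5]])
validation_labels = [1, 1, 0, 0]  # 1 = TB, 0 = control

probabilities = logistic_score(validation_expression, INTERCEPT, COEFFICIENTS)
validation_auc = auc(probabilities, validation_labels)
```

Comparing `validation_auc` on the original training data with the AUC reported in the original article is one way to check that a rebuilt model behaves like the published one.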

| Validation of model performance
To comprehensively assess the applicability of each model, each

| Bivariate meta-analysis
Results were pooled with bivariate mixed models 21 using midas in Stata 15.1. We pooled sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and AUC, each with a 95% confidence interval (CI). In line with the target product profile (TPP) criteria provided by WHO, 4 specificity at 90% sensitivity was also examined. Heterogeneity across datasets was quantified with Higgins' I², with I² > 50% denoting significant heterogeneity. 22 Meta-regression was conducted to explore the sources of heterogeneity and the application scope of each model.
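Two of the quantities named above can be made concrete with a short sketch. The first function reads specificity off at the strictest threshold that still reaches the target sensitivity; the second computes the standard univariate Higgins' I² from Cochran's Q with inverse-variance weights. Both are illustrative assumptions: they are not the bivariate computation performed by midas, and the choice of effect scale (for example logit-transformed sensitivity) is left to the caller.

```python
import numpy as np


def specificity_at_sensitivity(case_scores, control_scores,
                               target_sensitivity=0.90):
    """Specificity at the highest threshold reaching the target sensitivity.

    Assumes a higher score means "more TB-like"; score >= threshold is positive.
    """
    cases = np.sort(np.asarray(case_scores, dtype=float))[::-1]  # descending
    controls = np.asarray(control_scores, dtype=float)
    # Number of cases that must be called positive (epsilon guards rounding).
    needed = max(1, int(np.ceil(target_sensitivity * cases.size - 1e-9)))
    threshold = cases[needed - 1]
    sensitivity = float(np.mean(cases >= threshold))
    specificity = float(np.mean(controls < threshold))
    return threshold, sensitivity, specificity


def higgins_i2(estimates, variances):
    """Higgins' I2 (in percent) from Cochran's Q, inverse-variance weights."""
    est = np.asarray(estimates, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    pooled = np.sum(weights * est) / np.sum(weights)
    q = float(np.sum(weights * (est - pooled) ** 2))
    df = est.size - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

Under this convention, a returned I² above 50 would correspond to the "significant heterogeneity" cut-off used in the text.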
Altogether, 18 eligible datasets were selected (GSE70425 included two different miRNA expression matrices based on different cell types). MiRNA expression data were obtained from 14 datasets, whereas snoRNA expression data were obtained from one dataset.
One dataset provided lncRNA expression data, and three datasets offered circRNA-related data (Table 3).

| MiRNA diagnostic panels
Of all miRNA diagnostic models implemented on available vali-

| Other types of ncRNA diagnostic panels
No validation sets were available for one miRNA and snoRNA model, lncRNA model and circRNA model. Only two validation sets were available for three circRNA models, which was not sufficient for pooling. The performance of these models in the training sets is shown in Table S1.

| Meta-regression
We performed meta-regressions for 13 models implemented on 12 miRNA panels (Table 5). Further meta-regressions could not be conducted for lncRNA, circRNA and snoRNA panels because of the limited available datasets.

| Sample type
NcRNA expression profiles in WB, PBMC or other blood cells (non-cell-free samples) differ from those in serum, plasma or exosomes (cell-free samples). 39 We therefore performed meta-regression based on the consistency of sample type between the training set and validation sets.
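A minimal sketch of how such a consistency covariate could be coded is given below. It assumes that "consistent" means the training and validation sets fall in the same class (cell-free or non-cell-free); the text does not state whether an exact match of sample type was required, and the function names and label spellings are hypothetical.

```python
# Sample classes as described in the text; the label spellings are assumptions.
CELL_FREE = {"serum", "plasma", "exosome"}
NON_CELL_FREE = {"wb", "whole blood", "pbmc"}


def sample_class(sample_type):
    """Map a sample type label to 'cell-free' or 'non-cell-free'."""
    key = sample_type.strip().lower()
    if key in CELL_FREE:
        return "cell-free"
    if key in NON_CELL_FREE:
        return "non-cell-free"
    raise ValueError(f"Unrecognised sample type: {sample_type!r}")


def is_consistent(training_sample, validation_sample):
    """True when training and validation sets share the same sample class."""
    return sample_class(training_sample) == sample_class(validation_sample)
```

Each validation set would then carry a binary covariate (consistent or inconsistent with the training set) that splits it into the two subgroups compared in the meta-regression.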

| Ethnicity
Ethnicity was considered as a covariate based on the country of the training set of each model (African, Caucasian and Asian), given the reported differences in miRNA expression among ethnicities in TB. 40 In the consistent subgroup, Barry 2018 showed the highest sensitivity (90%), whereas Cui 2017 still had the highest specificity (97%).
A significant difference was also observed between these two subgroups in the sensitivity of Wang 2011 (P = .04).

| LncRNAs in related panels
GO analysis showed that the target genes of the lncRNAs were related to divalent inorganic cation transmembrane transporter activity, whereas KEGG analysis indicated that these genes were mainly involved in ferroptosis.

| CircRNAs in related panels
GO analysis showed that the parental genes of the circRNAs participated in GTPase activity, protein autophosphorylation and other biological processes. KEGG analysis suggested that these parental genes participated in many important pathways, including the Wnt and JAK-STAT signalling pathways, and in infections such as influenza A, HIV, cytomegalovirus and papillomavirus (Figure 5).

| DISCUSSION
For the other models with unsatisfactory performance, failure to reproduce the models exactly, methodological differences and heterogeneity across datasets may have contributed to this result. 44 In this paper, we rebuilt the models as closely as possible to the original papers and considered the impact of any remaining alterations to be negligible, a view also supported by other scholars. 17 Numerous models have now proliferated, yet most do not report detailed modelling parameters, which are of utmost importance for model performance. 45 Missing parameters impede model reproduction, external validation and further improvement. Therefore, we suggest that the details of modelling, including parameters, algorithm composition and even all coefficients, should be provided. Algorithm cheat sheets have been drawn up to help researchers identify a suitable algorithm for their own data. 46,47 Briefly, identifying the research purpose, the characteristics of the raw data and the required algorithm features can guide the choice of modelling method. 46

Heterogeneity across datasets is also a cause of fluctuations in panel performance. 48 Usually, the diagnostic abil- insensitive to ethnicities. It is difficult to judge which model is optimal: the one with better generalization ability or the one with better diagnostic performance in a specific population, the one with the highest sensitivity or the one with the highest specificity. We recommend that models with excellent sensitivity be selected for high TB burden areas, whereas those with outstanding specificity are better suited to low TB burden regions.
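The interplay between disease burden and test characteristics can be illustrated with the standard predictive-value formulas. In the sketch below, the 90% sensitivity and 92% specificity are the figures reported for Cui 2017 in aetiologically confirmed validation sets, whereas the two prevalence values are hypothetical and chosen only to contrast a low-burden with a high-burden setting.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical prevalences among tested individuals (assumptions, not data).
LOW_BURDEN_PREVALENCE = 0.01
HIGH_BURDEN_PREVALENCE = 0.20

ppv_low, npv_low = predictive_values(0.90, 0.92, LOW_BURDEN_PREVALENCE)
ppv_high, npv_high = predictive_values(0.90, 0.92, HIGH_BURDEN_PREVALENCE)
```

With these inputs the PPV is roughly 10% at 1% prevalence and roughly 74% at 20% prevalence, which shows why false positives, and hence specificity, weigh more heavily where TB is rare.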
Desirable sensitivity improves generalizability and the capacity to triage TB patients and also reduces the costs and requirements of using highly accurate detection tools, 52

TABLE 5 (Continued)
*The training set refers to the dataset used to train the model in this work; detailed information on the training set of each model is provided in Table S1.
† P1 is the P value for comparing the sensitivity of each model between the consistent and inconsistent subgroups.
‡ P2 is the P value for comparing the specificity of each model between the consistent and inconsistent subgroups.

do not suffer from TB is a more cost-effective approach for low TB burden areas; thus, high specificity ensures the reliability of the classification outcome. 52 In summary, the evaluation of a model ought to take study objectives, disease burden and prevalence, and socio-economic requirements into consideration. 54

GO and KEGG analyses demonstrated that most ncRNAs in the included panels were involved in the Wnt signalling pathway, oxidative stress, and immune cell activation and differentiation, which are closely related to the pathogenesis of TB. 55-57 MiR-150-5p was selected by 3/13 miRNA panels, which targeted different populations and used different sample types. MiR-150-5p is widely expressed in immune cells and is involved in the development of lymphocytes, lung cancer and acute lung injury. 58

wide application ranges, especially in limited-resource regions. 66-68 Benefiting from the rapid development of these technologies, ncRNAs can play an increasingly valuable role. Therefore, it is important to systematically assess reported TB-related ncRNA diagnostic panels and thus offer some support for clinical choice in diverse situations.

Inevitably, this work has some limitations. Missing parameters and modelling steps prevented us from completely reproducing these models. Moreover, the limited available datasets prevented us from further analysing lncRNA, circRNA and snoRNA panels and also weakened the evaluation of miRNA panels. The capacity of these panels to predict TB progression and diagnose paediatric TB could not be assessed owing to the lack of available datasets.
Large-scale prospective validation in diverse populations is a necessary step for the entry of these panels into the clinic.

| CONCLUSION
Cui 2017 showed strong generalization ability and outperformed in both all available validation sets and aetiological-confirmed

CONFLICT OF INTERESTS
The authors report no conflicts of interest in this work.

DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analysed in this study.