Combined analysis of RNA‐sequence and microarray data reveals effective metabolism‐based prognostic signature for neuroblastoma

Abstract The relationship between metabolism reprogramming and neuroblastoma (NB) is largely unknown. In this study, one RNA‐sequence data set (n = 153) was used as discovery cohort and two microarray data sets (n = 498 and n = 223) were used as validation cohorts. Differentially expressed metabolic genes were identified by comparing stage 4s and stage 4 NBs. Twelve metabolic genes were selected by LASSO regression analysis and integrated into the prognostic signature. The metabolic gene signature successfully stratifies NB patients into two risk groups and performs well in predicting survival of NB patients. The prognostic value of the metabolic gene signature is also independent with other clinical risk factors. Nine metabolism‐related long non‐coding RNAs (lncRNAs) were also identified and integrated into the metabolism‐related lncRNA signature. The lncRNA signature also performs well in predicting survival of NB patients. These results suggest that the metabolic signatures have the potential to be used for risk stratification of NB. Gene set enrichment analysis (GSEA) reveals that multiple metabolic processes (including oxidative phosphorylation and tricarboxylic acid cycle, both of which are emerging targets for cancer therapy) are enriched in the high‐risk NB group, and no metabolic process is enriched in the low‐risk NB group. This result indicates that metabolism reprogramming is associated with the progression of NB and targeting certain metabolic pathways might be a promising therapy for NB.

status, International Neuroblastoma Staging System (INSS) stage, histopathology and tumour cell ploidy. 3,4 Patients with high-risk NB still demonstrate poor survival outcome (<50%) despite multimodal aggressive therapy. 4 While NB patients with metastatic stage 4 disease are at high risk for death from refractory disease, patients with metastatic stage 4s disease generally had a surprisingly good prognosis and even underwent spontaneous regression without tumour-specific therapy. [4][5][6][7] The spontaneous regression of NB has been validated by mass screening programmes, [8][9][10][11] and it is most evident in patients with stage 4s disease. [12][13][14][15] Patients with stage 4s NB generally had a small primary tumour but with metastatic disease in the liver, skin and bone marrow, or any combination of these. 12 In fact, spontaneous regression is not restricted to stage 4s NB and can be seen in patients with any stage of NB. 5 Since spontaneous regression of NB is most evident in infants with stage 4s disease, investigators have focused on stage 4s tumours as a surrogate to investigate the mechanisms that underlie spontaneous regression. [13][14][15] However, the underlying mechanism responsible for the spontaneous regression of NB is still largely unknown.
In recent years, it is becoming evident that changes in cell metabolism contribute to cancer development and progression. [16][17][18][19][20] Converging evidence indicates that the well-known tumour suppressor p53 controls multiple metabolic pathways and plays a major role in metabolism of cancer cells. 20 Researchers have also found that MYCN oncogene is involved in metabolic processes: MYCN could enhance glutaminolysis and induce oxidative stress by producing reactive oxygen species (ROS), 21 and could also negatively regulate monocarboxylate transporter 4 (MCT4) in NB cells. 22 The association between metabolism and spontaneous regression of NB is unknown. We wish to know whether metabolism is involved in the process of spontaneous regression.
In this study, we use stage 4s tumours as a surrogate to those NBs underwent spontaneous regression. One RNA-sequence data sets (n = 153) and two microarray data sets (n = 498 and n = 223) were utilized in this study to perform the analyses. Differentially expressed metabolic genes were identified by comparing those subjects died in stage 4 NB group and those survived in stage 4s NB group. Excluding the dead cases in the stage 4s group would make it better to serve as surrogates to those NBs underwent spontaneous regression. Finally, twelve differentially expressed and survival-related metabolic genes were incorporated into the metabolic gene signature. Nine metabolism-related long non-coding RNAs (lncRNAs) were also identified and incorporated into a metabolism-related ln-cRNA signature. The metabolic gene signature and metabolism-related lncRNA signature performed well in predicting overall survival (OS) of NB patients. Gene set enrichment analysis (GSEA) identifies that multiple metabolic processes are significantly enriched in the high-risk group, while no metabolic gene set is identified in the lowrisk group. These results suggest that metabolism reprogramming might play important roles in promoting NB progression and thus inhibiting NB regression.

| Neuroblastoma data set processing
The RNA-sequence data sets (TARGET NBL, n = 153) were obtained from National Cancer Institute GDC Data Portal. The Agilent microarray data sets GSE49710 (n = 498) were obtained from Gene Expression Omnibus (GEO) database. The Agilent microarray data sets E-MTAB-8248 (n = 223) were obtained from ArrayExpress database. The clinical characteristics of the patients in each of the three cohorts are shown in Table S1. The RNA-sequence data set was used as the discovery cohort and termed as cohort 1. The microarray data sets GSE49710 and E-MTAB-8248 were used as the validation cohorts and termed as cohort 2 and cohort 3, respectively.
The Agilent microarray probes IDs were firstly annotated using the platform GPL16876 (Agilent-020382 Human Custom Microarray 44k). Then, to renew the annotation, the probes IDs were re-annotated according to their corresponding GenBank Accession number.
When multiple probes mapped to the same gene, the mean of the signal intensities will be used. The genomic alterations (mutation and copy-number alteration) of the identified genes were analysed on the open platform of cBio Cancer Genomics Portal (cBioPortal) (http://www.cbiop ortal.org/). 23

| Extraction of differentially expressed metabolic genes
The KEGG (Kyoto Encyclopedia of Genes and Genomes) gene sets were obtained from Molecular Signatures Database (MSigDB) on gene set enrichment analysis (GSEA) website. All genes included in the KEGG gene sets that associated with metabolism were extracted, a total of 910 genes were identified as involved in metabolism pathways. Differential expression analyses were performed by 'limma' package using the R software. Genes with FDR < 0.05 and |log2FoldChange| > 0.5 were extracted as differentially expressed genes. lncRNAs correlated (Pearson correlation, |r| ≥ 0.3) with metabolic genes were extracted as metabolism-related lncRNAs. Only those lncRNAs matched to GENCODE annotation of long non-coding RNA (release 31, GRCh38.p12) were selected.

| Construction of the prognostic metabolic gene signatures
Univariate Cox proportional hazards regression analyses were performed to identify those genes associated with overall survival (OS) in the entire cohort 1. A P-value of ≤.05 was considered statistically significant. Those survival-related genes were put into the least absolute shrinkage and selection operator (LASSO) penalty Cox regression analysis to eliminate false positives because of over-fitting. 24 Finally, the prognostic metabolic gene signature was constructed by weighting the Cox regression coefficients to calculate a risk score for each subject. The median value was used as the cut-off value, and the patients were classified as low-risk and high-risk group accordingly. The same formula and the same cut-off value were applied to the validation cohorts. Metabolism-related lncRNA signature was constructed by the same method.

| Function annotation and gene set enrichment analysis
The bubble plot and circle plot of Gene Ontology (GO) and KEGG pathway annotation were generated by R package 'ggplot2', 'enrichplot' and 'GOplot'. Functional annotation with a P-value < .05 was considered statistically significant. Gene set enrichment analysis (GSEA) comparing low-risk and high-risk groups was performed by GSEA software (version 4.0.03). 25 A nominal P-value < .05 and false discovery rate (FDR) q-value < .25 were considered statistically significant for GSEAs. Single sample gene set enrichment analysis (ssGSEA) was performed by R package 'GSVA' to assess the immune status of the tumour samples. 26,27 The immune-related gene sets used in this study have been reported previously. 28,29 Estimation of stromal and immune cells infiltrated into the tumour tissues was performed by R package 'ESTIMATE'. 30  were blinded to all clinical information. The AKR1C1 protein was found to be expressed primarily in the cytoplasm of tumour cells.

| Statistical analysis
The univariate and multivariate Cox proportional hazards regression analyses were performed by the R package 'survival'. The LASSO regression analyses were performed by the R package 'glmnet'. The Kaplan-Meier plots were constructed by R software or GraphPad Prism 5, and the statistical significance was assessed by two-sided log-rank test. The time-dependent receiver operating characteristic (ROC) curves and area under the curve (AUC) analyses were used to evaluate the predictive performance of the prognostic signatures and performed by the R package 'time ROC'. Correlation network was constructed by R package 'corrr' and 'dplyr'. Volcano plot was plotted by the R package 'ggplot2'. Heatmaps and unsupervised clustering were generated either by the R package 'pheatmap' or by TMeV software (Tigr MultiExperiment Viewer, version 4.9.0). The R software version 3.6.2 was utilized in this study for the statistical analyses. All statistical tests were two-sided, and a P-value < .05 was considered statistically significant.

| Identification of differentially expressed and survival-related metabolic genes
A flow chart of this study is shown in Figure 1. Since most of the subjects in the RNA-sequence data sets (TARGET NBL) belong to stage 4 or stage 4s disease (Table S1), we decided to first make a comparison between stage 4 and stage 4s NBs in order to find out whether metabolism is associated with spontaneous regression of NB. We excluded those cases who died during follow-up in stage 4s group to make it better to serve as surrogates to NBs that spontaneously regressed, and also excluded those cases who survived during follow-up in stage 4 group to make it better to represent those real 'high-risk' NBs. Finally, 19 cases who survived in stage 4s group and 73 cases who deceased in stage 4 group were used for comparison.
The average follow-up time (OS time) for those survived cases in the stage 4s group (6.04 ± 2.08 years) was relatively longer than that for those deceased cases in the stage 4 group (2.76 ± 1.87 years).
A total of 65 metabolic genes were found to be differentially expressed between these two groups in cohort 1. Thirty-six metabolic genes were up-regulated in stage 4 NBs, while 29 metabolic genes were up-regulated in stage 4s NBs (Figure 2A and B).
Univariate survival analyses revealed that 36 metabolic genes were significantly (P < .05) associated with OS in entire cohort 1 ( Figure S1A). Sixteen of them were up-regulated in stage 4s NBs and associated with good survival, while twenty were up-regulated in stage 4 NBs and associated with bad survival.
Unsupervised hierarchical clustering analyses revealed that the 36 survival-related metabolic genes stratified each of the three cohorts into two distinct clusters (Figures 2C, D and S1B): One cluster has good survival outcome and another cluster has bad survival outcome. In cohort 1, the OS rates at 13 years were 24.40% in cluster 1 compared with 68.78% in cluster 2 ( Figure 2E). In cohort 2, the OS rates at 18 years were 42.52% in cluster 1 compared with 92.11% in cluster 2 ( Figure 2F). In cohort 3, the OS rates at 19 years were 55.84% in cluster 1 compared with 89.68% in cluster 2 ( Figure S1C).

| Construction and validation of the prognostic metabolic gene signature
Twelve metabolic genes were selected by LASSO regression analysis and incorporated into the prognostic gene signature ( Figure S1D and E). The characteristics of the 12 metabolism-related genes are shown in Table S2 plot shows that patients in the high-risk group had a significantly poorer OS than those in the low-risk group in cohort 2 (54.84% vs 91.52%, P < .001) ( Figure 3F). The metabolic gene signature has good performance in predicting OS in cohort 2, while the AUC at 3, 5 and 8 years was 0.79, 0.82 and 0.79, respectively ( Figure 3G).
In cohort 2, the metabolic gene signature risk score is also significantly associated with OS both by univariate model (HR = 7.828; 95% CI: 5.277-1.611; P < .001; Figure S2C) and by multivariate model (HR = 2.360; 95% CI: 1.419-3.924; P < .001; Figure 3H). Consistent with cohort 1 and cohort 2, the validation in cohort 3 shows similar results ( Figure S2D-H). A nomogram including the metabolic gene signature risk score for OS prediction was constructed in cohort 2, which has the largest sample size and consists all different INSS stages (1, 2, 3, 4 and 4s) ( Figure 3I). The C-index for the nomogram is 0.87 (95%: 0.84-0.89), indicating high prediction accuracy. The calibration curve also revealed that the nomogram-predicted 5-year OS was very close to the actual 5-year OS ( Figure 3J).  Figure 4A and B), COG risk status (low and high; Figure 4C and D), age status (age <18 months and age >18 months; Figure 4E and F), histology subtype (favourable and unfavourable; Figure 4G and H), differentiation status (differentiating and poorly differentiated; Figure 4I Figure 4M and N), ploidy status (hyperdiploid and diploid; Figure 4O and P) and INSS stage (stage 4; Figure 4Q). Since only one patient is classified as stage 2, six patients as stage 3 and no patients as stage 1 in cohort 1, the subgroup analysis is not conducted for stage 1, stage 2 and stage 3.
All patients in stage 4s were classified as low-risk group; thus, the Kaplan-Meier plot is also not constructed.
Within each of these subgroups, patients were grouped into lowrisk and high-risk groups according to the same cut-off value as the entire cohort 1. The results show that patients with low-risk score tend to have a good survival than patients with high-risk score in each of the subgroup. The survival difference does not reach statistically significant in only three subgroups (MYCN-amplified subgroup, differentiating subgroup and ganglioneuroblastoma subgroup) with low case number.

| Function annotation and GSEA
The 36 survival-related metabolic genes were put into GO and KEGG functional annotation. The bubble plots show the top ranking GO enrichment ( Figure S3A) and KEGG pathway enrichment ( Figure S3B).
The circle plots reveal that most of the metabolic processes of GO ( Figure 5A) and metabolic pathways of KEGG ( Figure 5B Since metabolism is also associated with immune system, 32,33 we also investigated whether the high-risk group and the low-risk group have different status of immune activity. We found that the lowrisk group has a relatively higher level of immune cell infiltrations than the high-risk group ( Figure S3C). The immune score ( Figure 5D) and stromal score ( Figure 5E) were significantly higher in the lowrisk group than those in the high-risk group, which indicates a relatively high level of immune activity in the low-risk group. On the contrary, the level of tumour purity was significantly higher in the high-risk group than that in the low-risk group (Figure 5F), indicating a relatively low level of immune cell infiltration in the high-risk group.
The ssGSEAs also revealed that multiple immune-related KEGG pathways (eg graft-vs-host disease and intestinal immune network for IgA production) or GO biological processes (eg gamma-delta T-cell activation and interleukin-8 biosynthetic process) were significantly enriched in the low-risk group (Figure 5G and H; Tables S6   and S8). Thus, the results of our study also support the link between metabolism and immune activity.
The top 30 KEGG pathways and top 30 GO biological processes enriched in the high-risk group by the ssGSEAs are shown in Tables S5 and S7, respectively. The KEGG pathway of one-carbon pool by folate (hsa00670) and oxidative phosphorylation (hsa00190) was also found to be significantly enriched in the high-risk group by the ssGSEAs (Table S5), indicating that this two metabolism pathways may play important roles in the progression of NB.

| Identification of metabolism-related lncRNAs
Those lncRNAs whose expressions were correlated with the expression of the 12 metabolic genes in the metabolic gene signature were extracted as metabolism-related lncRNAs. A total of 47 metabolismrelated lncRNAs were shown to be significantly associated with OS by the univariate Cox survival analysis ( Figure S4A). Finally, nine survival-related lncRNAs were selected by LASSO regression analysis and incorporated into the metabolism-related lncRNA signature ( Figure S4B and C). The characteristics of the 9 metabolism-related lncRNAs are shown in Table S9.
The relationship (Pearson correlation) between these lncRNAs and metabolic genes is shown in Figure 6A

| Construction and validation of metabolismrelated lncRNA signature
The lncRNA signature risk scores were calculated for each sub-  Figure 7A). The heatmap shows that six metabolic lncRNAs were highly expressed in the high-risk group, while only three metabolic lncRNAs were highly expressed in the low-risk group ( Figure 7A).
The Kaplan-Meier plot shows that patients in the high-risk group had a significantly poorer OS than patients in the low-risk group (14.41% vs 66.52%, P < .001) ( Figure 7B). Time-dependent ROC curves reveal that the lncRNA signature has good performance in predicting OS in cohort 1, while the AUC at 3, 5 and 8 years was 0.76, 0.8 and 0.79, respectively ( Figure 7C). The lncRNA signature risk score is significantly associated with OS both by univariate survival analysis (HR = 6.950; 95% CI: 4.247-11.407; P < .001; Figure S5A) and by multivariate survival analyses adjusted by aforementioned clinical risk factors (HR = 13.173; 95% CI: 5.798-29.930, P < .001; Figure 7D). The ROC curve analyses reveal that the 5-year AUC value (0.807) of the lncRNA signature is higher than all the other risk factors ( Figure S5B).
The lncRNA signature was tested in cohort 2 and cohort 3 for validation. The risk distribution, survival status and gene expression pattern of cohort 2 are shown in Figure 7E. The Kaplan-Meier plot shows that patients in the high-risk group have a significantly poorer OS than those in the low-risk group (61.64% vs 91.03%, P < .001) ( Figure 7F). The AUC at 3, 5 and 8 years was 0.72, 0.74 and 0.68, respectively ( Figure 7G). In cohort 2, the lncRNA signature risk score is also significantly associated with OS both by univariate model (HR = 15.494; 95% CI: 7.793-30.805; P < .001; Figure S5C) and by multivariate model (HR = 3.528; 95% CI: 1.533-8.116; P = .003; Figure 7H). Consistent with cohort 1 and cohort 2, the validation in cohort 3 shows similar results ( Figure S5D-H). A nomogram including the lncRNA signature risk score for OS prediction was constructed in cohort 2, which has the largest sample size and consists all different INSS stages (1, 2, 3, 4 and 4s) ( Figure 7I). The C-index for this nomogram is 0.87 (95%: 0.85-0.89), which indicates high prediction accuracy. The calibration curve also revealed that the nomogram-predicted 5-year OS was very close to the actual 5-year OS ( Figure 7J).

| Genetic alterations of the genes in the prognostic signatures
A total of 755 NB cases with mutation data and 59 NB cases with gene copy-number data are available in cBioPortal platform. POLR2A is found to have missense mutation in 0.1% of NB cases. FADS2 had missense mutation in 0.1% of NB cases and amplification in 1.7% of NB cases ( Figure 8A and B). PGM2L1 had missense mutation in 0.1% of NB cases and deep deletion in 1.7% of NB cases ( Figure 8A and B). POLR2L, PLCB3 and NME2 had amplification in 3% of NB cases, respectively ( Figure 8B). MIF and PHPT1 had amplification in 1.7% of NB cases, respectively ( Figure 8B). AKR1C1 and LAP3 had deep deletion in 1.7% of NB cases, respectively ( Figure 8B). No gene alteration data were available for the lncRNA AC0022075.1. No mutation or somatic gene copy-number alteration was discovered for each of the other lncRNAs ( Figure 8C and D).

| Immunohistochemistry evaluation of AKR1C1 expression in NB tissues
Since AKR1C1 has the highest weighting coefficient (0.1016) in the metabolic gene prognostic signature, we then decided

| D ISCUSS I ON
The association between metabolism and NB progression is largely unknown. To our knowledge, our study is the first aimed at screening for metabolic genes that associated with survival of NB patients through mining of both RNA-sequence and microarray data. Most of the subjects in the RNA-sequence data sets (TARGET NBL) belong to stage 4 or stage 4s disease (one stage 2, six stage 3, 125 stage 4 and 21 stage 4s). Thus, we decided to make a comparison between stage 4 and stage 4s NBs. Since spontaneous regression of NB is most evident in patients with stage 4s disease, we hope that our research would also have some enlightening significance for the study of spontaneous regression. We excluded the dead cases in stage 4s to make it better to serve as surrogates for NBs underwent spontaneous regression. Actually, only two out of 21 stage 4s patients died in cohort 1, five out 54 stage 4s patients died in cohort 2, and one out 30 stage 4s patients died in cohort 3 during more than 10 years of follow-up.
In this study, a total of 36 metabolic genes were found to be differentially expressed and significantly associated with OS of NB in the high-risk group, while only three (PGM2L1, AKR1C1 and CDO1) of them are highly expressed in the low-risk group. Among these metabolic genes, only PLCB3 has been found to play roles in suppressing neuronal differentiation. 35 None of the other genes have been reported to be associated with NB. Since AKR1C1 (aldo-keto reductase family 1 member C1) has the largest weighting coefficient in the prognostic signature, we performed a simple experimental validation for AKR1C1. IHC staining of AKR1C1 protein in NB tissue specimens revealed that higher stage NBs tend to have lower AKR1C1 protein expression levels. This result is consistent with our bioinformatic analysis. AKR1C1 is a member of the aldo/ keto reductase (AKR) superfamily, which has also been reported to be involved with multiple malignancies. [36][37][38] Here, we for the first time identified AKR1C1 also associated with the development of NB, though its exact function in NB is unknown. Another drawback is that the survival outcomes of the patients providing these tumour specimens are unavailable, which hinder further evaluation of the prognostic role of AKR1C1 for NB patients. All in all, the exact roles of these metabolic genes in NB and their underlying mechanisms need to be investigated by further studies.
In recent years, lncRNAs are found to play important roles in the genesis and progression of multiple cancer types including NB. [39][40][41][42][43][44] In this study, nine metabolism-related and survival-related lncRNAs were selected by the LASSO regression analyses and incorporated into the metabolism-related lncRNA signature risk score. The ln-cRNA signature also successfully stratified each of the cohorts into two risk groups: The low-risk group has good survival outcome, and the high-risk group has bad survival outcome. The lncRNA signature performed well in the subgroup survival analyses and was also shown to be independent with other clinical risk factors.
Six (SLX1A-SULT1A3, LINC02076, SNHG22, ARRDC1-AS1, FOXN3-AS1 and URB1-AS1) of the nine lncRNAs are highly expressed in the high-risk group, while only three (AGPAT4-IT1, FAM13A-AS1 and AC022075.1) of them are highly expressed in the low-risk group. Overexpression of SNHG22 has been reported to be associated with poor prognosis of patients with epithelial ovarian carcinoma. 45 ARRDC1-AS1 was found to be associated with recurrence of breast cancer. 46 The function of the other lncRNAs and their relation with cancer have not been reported by the literature.
The relation of these lncRNAs with metabolism and their role in NB need to be clarified by further researches.
Investigation of 755 NB cases with mutation data and 59 NB cases with somatic gene copy-number data discovered that most of the metabolic genes had genetic alterations. It seems that genetic alterations somewhat contribute to their altered expression. However, considering the low level of genetic alterations, the altered expression of these metabolic genes seems largely controlled by other mechanisms, which need to be further clarified.
It is very interesting to find that the GO and KEGG function annotation reveal that most of the metabolism biological processes are up-regulated in stage 4 NB. In consistent with this finding, the GSEAs also reveal that certain metabolic gene sets are significantly enriched in the high-risk group. No metabolism gene set is enriched in the low-risk groups by the GSEA. These results suggest that metabolism reprogramming might contribute to the progression of NB and thus inhibit NB to undergo regression. Undoubtedly, the underlying mechanisms need to be clarified by further investigations.
Oxidative phosphorylation (OXPHOX), 47 pyrimidine metabolism, 48 one-carbon pool by folate, 49 citrate cycle TCA cycle 50 and   purine metabolism 51 have all been reported to be associated with cancers. Recent studies found that OXPHOX is up-regulated in certain cancers, promoting OXPHOS inhibitors to be used to target these cancer subtypes. 47 In this study, we found that OXPHOS is highly enriched in the high-risk NB group, which highlighting that OXPHOS inhibitor might also be used for this subtype of NB to improve prognosis. We also found that tricarboxylic acid (TCA) cycle is enriched in high-risk NBs, indicating that high-risk NBs rely on the TCA cycle for energy production and macromolecule synthesis.
Since TCA cycle is also an emerging target for cancer therapy, 50 targeting TCA cycle might also be a possible therapy for NB. Further studies are needed to clarify how these enriched metabolic processes contribute to NB progression.
There are also some limitations in our study. Firstly, we did not perform enough experimental studies to corroborate our findings.
The specific function of these identified metabolic genes and their underlying mechanisms in NB progression or regression need to be investigated by further experimental studies. Secondly, spontaneous regression of NB is not restricted to stage 4s and not all cases in stage 4s underwent spontaneous regression. However, many other investigators have used stage 4s tumours as a surrogate to investigate the mechanisms responsible for spontaneous regression.
Despite these drawbacks, the combination of RNA-sequence data and microarray data, the large sample size of the three cohorts, and the validation of the findings by two independent cohorts all provide a high level of confidence.
It also has to be mentioned that, during the peer-review process, one reviewer suggested to combine metabolism-related coding gene and lncRNA into one prediction model, we did not follow this suggestion due to the following three reasons. Firstly, we initially tried to combine coding gene and lncRNA together and selected out eight genes (MIF, LAP3, FADS2, SNHG22, CDO1, POLR2A, ARRDC1-AS1 and AC022075.1) to construct one prediction model. However, the prediction ability of the new model was not that good. The new prediction model was not independent with other risk factors when it was tested in both of the two validation cohorts. Secondly, in view of convenient clinical use, it would be better to not combine coding and non-coding genes together in one prediction model. For the coding genes, we can test them in the tumour tissues by immunohistochemistry, which is usually used in clinical practice. However, for non-coding genes, we have to resort to other tasting methods. Thirdly, as another reviewer pointed out, the selection of the metabolism-related lncRNAs is solely based on gene expression correlation with the metabolic coding genes. These lncRNAs are poorly characterized so far, and their actual association with metabolism processes is still need to be investigated and validated by future studies. However, we wish these metabolism-related lncRNA identified in our study would give some clues to future studies to investigate lncRNAs that are associated with metabolism in NB. Indeed, for a better clinical use, these prediction models also need to be validated and improved by further researches.
In conclusion, we find that metabolic genes are differentially expressed between the stage 4 and stage 4s NB samples. The metabolic gene signature has good performance in predicting OS of NB patients. The metabolism-related lncRNA signature also has good performance in predicting OS of NB patients. The prognostic value of both the metabolic gene signature and metabolism-related lncRNA signature is independent with other clinical risk factors. Therefore, the metabolic signatures have the potential to be used for risk stratification of NB. Multiple metabolic processes are enriched in high-risk NB, indicating that metabolic reprogramming is associated with the progression of NB. Targeting certain metabolic pathways might also be a promising therapy for NB. We appreciate the language help provided by Xinmei Nie.

CO N FLI C T O F I NTE R E S T
The authors confirm that there are no conflicts of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data set TARGET NBL is public available and can be found in