Role of alternative splicing signatures in the prognosis of glioblastoma

Abstract Background Increasing evidence has validated the crucial role of alternative splicing (AS) in tumors. However, comprehensive investigations on the entirety of AS and their clinical value in glioblastoma (GBM) are lacking. Methods The AS profiles and clinical survival data related to GBM were obtained from The Cancer Genome Atlas database. Univariate and multivariate Cox regression analyses were performed to identify survival‐associated AS events. A risk score was calculated, and prognostic signatures were constructed using seven different types of independent prognostic AS events, respectively. The Kaplan‐Meier estimator was used to display the survival of GBM patients. The receiver operating characteristic curve was applied to compare the predictive efficacy of each prognostic signature. Enrichment analysis and protein interactive networks were conducted using the gene symbols of the AS events to investigate important processes in GBM. A splicing network between splicing factors and AS events was constructed to display the potential regulatory mechanism in GBM. Results A total of 2355 survival‐associated AS events were identified. The splicing prognostic model revealed that patients in the high‐risk group have worse survival rates than those in the low‐risk group. The predictive efficacy of each prognostic model showed satisfactory performance; among these, the Alternate Terminator (AT) model showed the best performance at an area under the curve (AUC) of 0.906. Enrichment analysis uncovered that autophagy was the most enriched process of prognostic AS gene symbols in GBM. The protein network revealed that UBC, VHL, KCTD7, FBXL19, RNF7, and UBE2N were the core genes in GBM. The splicing network showed complex regulatory correlations, among which ELAVL2 and SYNE1_AT_78181 were the most correlated (r = −.506). Conclusions Applying the prognostic signatures constructed by independent AS events shows promise for predicting the survival of GBM patients. A splicing regulatory network might be the potential splicing mechanism in GBM.


| INTRODUCTION
Glioblastoma (GBM) remains the most aggressive and malignant form of primary central nervous system tumor in adults. Glioblastoma is defined as a grade IV tumor characterized by high heterogeneity and a dismal prognosis, according to the World Health Organization guidelines. 1 The current first-line treatment for newly diagnosed GBM is maximal safe resection followed by chemoradiation. [2][3][4][5][6] Despite aggressive interventions, the disease almost inevitably turns into recurrent GBM, for which no standard and effective treatment approach has been shown to prolong patient survival significantly. [7][8][9][10][11][12] The overall survival of recurrent GBM is generally no more than half a year. [13][14][15][16] Consequently, incurable GBM exerts challenging pressure on future work related to searching for novel treatment targets.
Alternative splicing (AS) is the primary driving force for generating diverse proteins, which is the basis for the remarkable and complex functional regulation seen in eukaryotic cells. Alternative splicing is a nearly ubiquitous process, occurring in about 95% transcripts. 17 Research over the past few decades has provided plenty of evidence concerning the role of AS in cancer. The aberrations of AS in cancers primarily include four categories: (a) AS alterations in oncogenes or tumor suppressor genes; (b) splicing factors mutations; (c) alterations in the upstream signaling pathways that decontrol splicing factors; and (d) aberrations in cancer-specific spliceosomal components. 18 Targeting aberrant splicing has provided novel perspectives for clinical therapy strategies. 19,20 Research on AS in GBM is continually emerging, increasing understanding about the role of AS in GBM. For example, Barbagallo et al found that circS-MARCA5 negatively regulated splicing of VEGFA through splicing factor SRSF1, which exerted antiangiogenic function. 21 Mogilevsky et al discovered that the manipulation of MKNK2 AS significantly suppressed the oncogenic properties of GBM cells and resensitized the cells to chemotherapy, which suggested a novel treatment strategy for clinical practice. 22 However, the literature primarily focuses on one AS and is lacking an exhaustive overview of all splicing events. In Sadeque's study, the prognostic value of alternative exon usage, a broad category of AS, was comprehensively investigated in GBM patients. Over 2400 alternative exon usage associated prognostic genes in GBM were identified, which provided great inspiration in mining prognostic biomarkers for GBM patients. 23 Nevertheless, constructing a survival prediction model and investigating the potential AS regulatory mechanism, which are important for both clinical practice and AS research, are lacking in Sadeque's study. Therefore, conducting a study to explore prognostic splicing models and possible AS regulatory mechanism in GBM is necessary.
The TCGA program provides a vast resource of genomic, epigenomic, transcriptomic, and proteomic data related to 33 different cancers. 24 Well-documented and freely accessible data make the analysis of AS in cancers possible. Many studies have used TCGA splicing data to investigate AS events and their clinical value related to cancers, such as lung cancer, 25 bladder cancer, 26 prostate cancer, 27 ovarian cancer, 28 and gastrointestinal adenocarcinomas. 29 However, no study has comprehensively investigated AS events and their prognostic value related to GBM.
Consequently, we aimed to identify survival-associated AS events and construct prognostic signatures to predict the survival of GBM patients using the splicing data of 155 samples in TCGA database. We also intended to investigate the possible regulatory mechanism of AS in GBM by constructing an interesting splicing network between AS events and splicing factors. This is the first study to systematically identify survival-associated AS events and to use splicing signatures for the prediction of survival of GBM patients . We hope this study will help scholars understand the spliceosome and provide new perspectives related to GBM-related clinical practice.

| Collecting AS events and GBM-related clinical data
The splicing profiles of seven types of AS, including alternate acceptor site (AA), alternate donor site (AD), alternate promoter (AP), alternate terminator (AT), exon skip (ES), mutually exclusive exons (ME), and retained intron (RI) were freely available in the TCGA database. TCGASpliceSeq (http://bioin forma tics.mdand erson.org/TCGAS pliceSeq), is a resource for investigation of mRNA AS patterns for 33 different tumor types in TCGA database, was used for downloading the splicing profiles of GBM. 30 The profile of the AS events was defined by a percent-spliced-in (PSI) value, which ranged from zero to one. Each AS event was presented as a combination of the gene symbol, splicing type, and splicing ID number. The clinical survival data of GBM patients were also downloaded from the TCGA database. Only those patients with an overall survival over 90 days were selected for the purpose of excluding the patient whose death is not because of tumor itself, but other factors such as surgical complications.

| Survival analysis and prognostic signatures for AS events
The AS events and survival information of GBM patients were matched using TCGA ID. For each type of AS event, univariate Cox regression analysis was achieved to screen | 7625 XIE Et al.
prognostic AS events (P < .05). The most significant AS events, namely for AA (P < .001), AD (P < .005), AP (P < .0005), AT (P < .001), ES (P < .0005), ME (P < .05), RI (P < .005) were collected. Moreover, we took seven types of AS events as a whole, namely seven combined-AS, to select the most significant AS events (P < .0001) for further analysis. Multivariate Cox regression was then applied to identify the independent prognostic AS events (P < .05). The prognostic signatures for each type of AS were constructed using the significantly independent AS events. A risk score for each splicing prognostic signature was achieved using the formula: risk score = ∑ n i PSIi × i, where β was the regression coefficient in multivariate Cox regression. 26,27 The median value of the risk score was employed as a threshold value to divide the patients into high-risk and low-risk groups. A Kaplan-Meier curve was plotted to determine the survival rate of GBM patients in the high-and low-risk groups. The 3-year survival rate of GBM patients was predicted in each splicing model using the receiver operator characteristic (ROC) curve. To compare the predictive efficacy of each prognostic signature in GBM, a survival package was applied to calculate the estimated area under the curve (AUC) of the ROC curve. 26,29,[31][32][33]

| Construction of the Upset plot and gene enrichment analysis
Because one gene may undergo multiple splicing, we wondered about the details concerning genes and splicing in GBM. As a result, we applied the UpsetR package to draw an Upset plot, which makes intersections in interactive sets more lucid. To acquire biologically enriched pathways, we compiled the top significant gene symbols of the prognostic AS events. The ClusterProfiles package in R software 34 was used to obtain the functionally enriched terms of genes (P < .05) in GBM. A P value and a q value both below .05 was regarded as significant. The top 20 significantly enriched terms were displayed as a bar plot and dot plot. A protein-protein interactive network of the gene assembles (P < .01) was achieved using the STRING database (https ://string-db.org/). 35 A confidence score over 0.9 was used to identify the most confident interactions.

| Differentially expressed splicing factors and the construction of a splicing network
The splicing factor is an important regulator in the AS process. Consequently, we gathered the human splicing factors in the SlpiceAid2 database (www.intro ni.it/splic eaid. html). 36 The mRNA profile data of the splicing factors in both GBM and normal tissues were obtained from the TCGA database. The Deseq 2 package 37 in R software was adopted to screen for differentially expressed splicing factors. A fold change value over 2 and below 0.5 was regarded as up-regulated and down-regulated, respectively. A P value less than .05 was regarded as statistically significant. Besides, the prognostic value of the differentially expressed splicing factors was also evaluated using Kaplan-Meier survival curve. Afterwards, the correlation between differentially expressed splicing factors and independent AS events was investigated. Spearman correlation analysis was utilized using SPSS version 25.0 (SPSS, Inc). Significant correlation (P < .05) was visualized as an interactive network using Cytoscape version 3.71. 38

| Details of AS events
In summary, 134 GBM patients with 45 610 AS events detected in 10 433 gene symbols were collected. These results comprised 3827 AAs in 2684 genes, 3269 ADs in 2270 genes, 8686 APs in 3476 genes, 8456 ATs 3695 genes, 18 360 ESs in 6934 genes, 184 MEs in 180 genes, and 2828 RIs in 1897 genes ( Figure 1). Exon skip was the most dominant splicing type, followed by the AP splicing type. The number of AS events is larger than the total number of genes, implying that one single gene may have undergone multiple splicing.

| Survival-associated AS events
A total of 134 GBM samples with overall patient survival over 90 days were included in the univariate Cox regression analysis. In the univariate Cox regression, 2355 survival-associated AS events, including 174 AAs, 160 ADs, 514 APs, 404 ATs, F I G U R E 1 A bar diagram displaying the number of alternative splicing events and gene symbols. AA, alternate acceptor site; AD, alternate donor site; AP, alternate promoter; AT, alternate terminator; ES, exon skip; ME, mutually exclusive exons; RI, retained intron 942 ESs, 10 MEs, and 141 RIs, were found. The most significant AS events-6 AAs, 24 ADs, 11 APs, 9 ATs, 17 ESs, 10 MEs, 21 RIs, and 14 seven combined events-were chosen for multivariate Cox regression. Finally, independent prognostic AS events, including 5 AAs, 5 ADs, 5 APs, 5 ATs, 8 ESs, 5 MEs, 8 RIs, and 7 seven combined events, were chosen to construct prognostic signatures. These independent survival-associated AS events are promising to act as prognostic biomarkers and therapy targets.
In order to display the overview of splicing, we visualized the intersecting sets of the seven types of survival-associated AS events. As shown in Figure 2, one gene may have up to three types of AS events. For example, COPZ1, YY1AP1, NFYC, PLEKHM1, TMUB2, ISCU, CHCHD3, ANKS3, VEZT, FYN, and CEP63 all had three types of AS events; ES events existed in all genes.

| Enrichment analysis and protein network
The ClusterProfiles R package was used to conduct enrichment analysis. Gene ontology enrichment consists of three categories: biological process, cellular component, and molecular function. In biological process, the top three enriched terms were establishment of protein localization to membrane, protein targeting to membrane, and vacuole organization ( Figure 3A,B). In cellular component, the top significant terms were focal adhesion, cell-substrate adherens junction, and cell-substrate junction ( Figure 3C,D). In molecular function, tau protein binding, cell adhesion molecule binding and cadherin binding were the first three significantly enriched terms ( Figure 4A,B). A Kyoto Encyclopedia of Genes and Genomes analysis revealed some important pathways, such as autophagy pathway, ubiquitin-mediated proteolysis, lysosome pathway, and apelin signaling pathway ( Figure 4C,D). Taken together, the AS genes mainly enriched in autophagy related processes. Autophagy has been widely reported in biology and pathogenesis of GBM. [39][40][41][42] These may indicate that splicing of some core genes in autophagy may be important in GBM. In Figure 5, the protein interactive network presents the core genes, like UBC, VHL, KCTD7, FBXL19, RNF7, and UBE2N. These core genes may play an important role in GBM, which is worthy of more focus. For example, VHL inactivation by ID2 protein has been discovered to be a mechanism of inhibition of GBM growth. 43

| Prognostic signatures of AS events
Independent prognostic AS events for each type of splicing were utilized to construct the prognostic predictors. The survival data of the included patients are displayed in Figure 6. Detailed information on the prognostic AS events revealed by the multivariate Cox regression is shown in Table 1. Figure  7 indicates that GBM patient survival in the high-risk group was significantly poorer than in the low-risk score group for all prognostic signatures. These indicated that more distinct molecular characteristics of AS events are adverse prognostic factors for GBM patients. To compare the predictive efficacy of each prognostic model, we performed survival ROC analysis. As displayed in Figure 8, the AUC of AA, AD, AP, AT, ES, ME, RI, and the seven combined-AS models was 0.740, 0.840, 0.659, 0.906, 0.854, 0.770, 0.719, and 0.811, respectively. The prognostic model of AT showed the best predictive performance, followed by the ES model. There is significant potential for the AT splicing model to predict the survival of GBM patients.

| Differentially expressed splicing factors and splicing network
A total of 67 splicing factors were gathered during the differential expression analysis. The expression of splicing factors in each GBM sample and normal tissue was visualized as a heatmap ( Figure 9A). Eleven significantly differentially expressed splicing factors were identified. Two (MIR4745 and YBX1) were up-regulated, and nine (RBFOX1, RBFOX2, ELAVL2, ELAVL3, ELAVL4, KHDRBS2, CELF2, NOVA2, and PTBP2) were downregulated ( Figure 9B). Regarding the prognostic value of the splicing factors in GBM, no statistical significance was observed in the survival analysis (data not shown). Spearman correlation analysis was performed to investigate the correlation between differentially expressed splicing factors and independent prognostic AS events. A total of 106 correlations, including 26 negative correlations and 80 positive correlations were identified. The positive regulations between splicing factors and AS events occurred more frequently than negative regulations. The correlation network was constructed to display the regulatory relationship ( Figure 10). The top correlations were displayed via linear correlation plots ( Figure 11).

| DISCUSSION
In this study, the entire prognostic AS events in GBM were identified using the TCGA splicing profile for the first time. Using the splicing prognostic signatures to predict the survival of GBM patients shows excellent potential for clinical practice. Splicing network constructed using AS events and splicing factors lays the foundation for future research on the splicing regulatory mechanism in GBM. The survival-associated splicing in this study shows promise for novel targeted therapy in GBM.
AS is a universal and pivotal mechanism in pre-mRNA processing, allowing for considerable proteomic diversity and complexity in a relatively limited human genome. 44 Research efforts over the last few decades have increased our knowledge on the alteration of AS in diseases, including cancer. The aberration of AS in cancers may present in various ways, which  45 The aberrant splicing of pre-mRNA contributes to various cell functions, such as proliferation, invasion, migration, metastasis, apoptosis, and drug resistance. 46 For example, AS of TCF-4 was found to inhibit the proliferation and metastasis of lung cancer cells. 47 Chen et al found that CD44 splicing was associated with invasion and migration in ovarian cancer. 48 The splicing variants of TP53, FAS, CASP9, and BCL2L1 have been associated with cancer cell apoptosis and survival. 18, 49 Calabretta's study showed that modulating the pyruvate kinase gene (PKM) splicing could promote gemcitabine resistance in pancreatic cancer cells. 50 Recently, studies have focused on investigating the potential therapeutic value of AS in cancers. Scholars have found that splicing pre-mRNA, such as human telomerase reverse transcriptase (hTERT) and epidermal growth factor receptor (EGFR), is correlated to cancer progression and prognosis. 51,52 Some researchers have begun to link AS to cancer subtypes, exploring their influence on prognosis. For example, Leivonen et al utilized AS to discriminate between the molecular subtypes and determine the prognostic impact in diffuse large B-cell lymphoma. 53 In 2017, Li et al pioneered an investigation of prognostic AS events to construct prognostic signatures for predicting the survival of nonsmall cell lung cancer patients using TCGA data. 25 Similar approaches were used to research other tumors, such as bladder cancer, 26 prostate cancer, 27 ovarian cancer, 28 esophageal carcinoma, 54 and gastrointestinal adenocarcinomas. 29 These studies have applied AS to clinical practice regarding cancer, providing predictive references for prognosis at a molecular level. However, no such study has been carried out for clinical practice regarding GBM. We conducted this study to provide survival-associated AS events and construct prognostic models for predicting the survival of GBM patients. Moreover, enrichment analysis, protein network, and splicing network were performed to investigate the potential molecular regulatory mechanism of AS in GBM.
In this study, 2355 survival-associated AS events in GBM were identified. Each independent survival-associated AS event, which could be a potential therapy target, is worth being focused in GBM-related research. Here, we focused on   55 Our prognostic model of AT splicing showed better predictive performance than the mRNA and miRNA models. Therefore, applying AS prognostic signatures to predict a GBM patient's survival is promising. We wondered what pathways the gene symbols of prognostic AS events would enrich. Therefore, we performed enrichment analysis and revealed important pathways. The autophagy-related processes were the most enriched regarding GBM. Autophagy is an intracellular degradative process, which exerts pivotal functions in maintaining metabolism and homeostasis. [56][57][58] In cancer biology, autophagy may exert dual effects in tumor promotion and suppression. [59][60][61][62] Targeting autophagy has been a focus in searching for novel therapies for cancers. [63][64][65][66] With GBM, scholars have previously noted the significant role of autophagy. For example, Peng et al discovered that TRIM28 could activate autophagy, thus promoting cell proliferation in GBM. 67 Regulating autophagy was the key process involved in the chemoresistance and radioresistance of GBM. 42 Sensitizing GBM cells to temozolomide via regulating autophagy has been widely reported by scholars. [68][69][70][71][72] With respect to the interplay between autophagy and AS, splicing autophagy-related genes has been reported to influence autophagy. 73,74 However, no study has reported the correlation between AS and autophagy in GBM. We speculated that the role of autophagy and its correlation with AS might be an interesting molecular mechanism in GBM worthy of future focus. The spliceosome, consisting of five small nuclear RNAs (U1, U2, U4, U5, and U6) and abundant protein factors, is the place where AS happens. 75 Splicing factors are a core protein in the spliceosome, playing pivotal roles in regulating the splicing process. Two well-studied splicing factor families, the serine-rich proteins and the heterogenous nuclear ribonucleoproteins, have been extensively reported in cancers. 76,77 In our study, we constructed an interesting splicing network to illuminate the potential splicing regulatory mechanism. Positive correlations were more common than negative correlations between splicing factors and AS events in GBM. A single splicing factor may play dual roles in the positive and negative regulation of different AS events. For a single AS event, different splicing factors generally exert a synergistic effect, but there are exceptions. For instance, MS4A6A_ RI_16057 was negatively regulated by YBX1, while it was positively regulated by PTBP2 and CELF2. These suggested a complex regulatory mechanism between splicing factors and AS events. The results provided a better understanding of the GBM spliceosome.
Several issues that arose in this study must be addressed. First, the study was based solely on online database sources. There is no cross validation for the results, which is certainly a limitation in this study. It is necessary to validate the results using other datasets and experiment in the future. Second, the survival analysis method in this study is based on Cox regression. However, in Shen's study, 78 a statistical method named SURVIV (Survival analysis of mRNA Isoform Variation) was reported, which is designed for analyzing mRNA isoform and patients' survival. The authors of the study suggested that SURVIV outperforms the conventional Cox regression survival analysis. Therefore, the statistical methods used in this study can be improved in future research. Besides, though we discovered that autophagy is an important process in GBM, its relationship with AS remains unclear. Thus, this requires further research. In short, we have established data concerning the entirety of survival-associated AS events in GBM. The prognostic signatures constructed using AS events were also found to show satisfactory predictive efficacy for the survival of GBM patients. The splicing regulatory network between the splicing factors and AS events provided a better understanding of the GBM spliceosome. The work achieved in this study underpins future splicing research and provides novel perspectives regarding potential GBM therapy.