Exostosin1 as a novel prognostic and predictive biomarker for squamous cell lung carcinoma: A study based on bioinformatics analysis

Abstract The exostosin (EXT) protein family is involved in diverse human diseases. However, the expression and prognostic value of EXT genes in human lung squamous cell carcinoma (LUSC) are not well understood. In this study, we analyzed the association between expression of EXT1 and EXT2 genes and survival in patients with LUSC using bioinformatics resources such as the Oncomine and The Cancer Genome Atlas (TCGA) databases, the Gene Expression Profiling Interactive Analysis (GEPIA) server and the Kaplan–Meier plotter. Furthermore, regulatory microRNAs (miRNAs) were predicted for EXT1 and used to establish a potential miRNA‐messenger RNA (mRNA) regulation network for LUSC using the ENCORI platform. We observed that EXT1 and EXT2 expression levels were higher in LUSC than in normal tissues. However, only EXT1 expression was significantly associated with poor overall survival (OS) in LUSC patients. Functional annotation enrichment analysis showed that genes co‐expressed with the EXT1 gene were enriched in biological processes such as cell adhesion and migration, and KEGG pathways such as extracellular matrix receptor interactions, complement and coagulation cascades, and cell death. Furthermore, three miRNAs, hsa‐mir‐190a‐5p, hsa‐mir‐195‐5p, and hsa‐mir‐490‐3p, were identified as potentially involved in the regulation of EXT1. In summary, we identified EXT1 expression as a novel potential prognostic marker for human LUSC, together with regulatory miRNAs that could possibly contribute to the prognosis of the disease.


| INTRODUCTION
Heparan sulfate proteoglycans (HSPGs) are ubiquitous components of the extracellular matrix and play an important role in tissue homeostasis. 1 Extensive research has demonstrated that heparan sulfate (HS) is essential for signal transduction in various processes such as cell survival, division, migration, differentiation, and cancer development. 2 The exostosin (EXT) family of glycosyltransferases, including EXT1 and EXT2, mediates the synthesis of HS. 3 Both genes that encode exostosin glycosyltransferases (EXT1 and EXT2) function as tumor suppressors, 4 although the molecular mechanisms and prognostic value of exostosins (EXTs) in cancer are still unclear.
The EXT1 gene, located on chromosome 8, encodes an endoplasmic reticulum-resident type II transmembrane glycosyltransferase involved in the chain elongation step of HS biosynthesis. Mutations in this gene cause the type I form of multiple exostoses. EXT1 is overexpressed in various cancers such as adult acute lymphoblastic leukemia (ALL), 5 hepatocellular carcinoma (HCC) 6 and breast cancer. 7 In addition, EXT1 expression has been reported to be a promising indicator of breast cancer metastasis risk 8 and has been shown to be associated with a poor prognosis in multiple myeloma. 9 Mutations in the EXT2 gene, located on chromosome 11, cause the type II form of multiple exostoses. Different isoforms encoded by alternatively spliced transcript variants are also known. EXT2 has been reported to be associated with type 2 diabetes mellitus (T2DM) in some populations 10 as well as with multiple osteochondromas, 11 not only in humans but also in zebrafish. 12
According to global cancer statistics for 2018, lung cancer has the highest incidence and mortality among all tumors. 13 Non-small cell lung cancer (NSCLC) is the most common pathological type, accounting for approximately 85% of all lung cancers. 14 Lung squamous cell carcinoma (LUSC) is the second most common type of NSCLC, with more than 400,000 new cases per year, and accounts for 20%-30% of NSCLCs. 15,16 Despite advances in treatment methods for LUSC, the 5-year overall survival (OS) rate of LUSC patients in clinical stages I and II is about 40%, and that of LUSC patients in clinical stages III and IV is less than 5%. 17 Therefore, the identification of new prognostic markers and therapeutic targets is important for the clinical treatment of LUSC.
In this study, we performed a series of bioinformatics analyses on EXT1 and EXT2 in LUSC, including transcriptional analysis, co-expression analysis, functional annotation enrichment analysis, protein-protein interaction (PPI) analysis, and survival analysis, and constructed a miRNA-EXT regulation network. We observed increased levels of EXT1 and EXT2 expression in LUSC, whereas only EXT1 was associated with poor OS in LUSC. Furthermore, we identified three regulatory miRNAs of EXT1, hsa-mir-190a-5p, hsa-mir-195-5p, and hsa-mir-490-3p, which could potentially be involved in the molecular mechanisms underlying the disease. Our results thus provide novel insights that may help improve the prognosis of LUSC patients.

| Bibliometric analysis
VOSviewer is a software tool primarily intended for analyzing bibliometric networks. 18 In the resulting visualization, the larger the number of items in the neighborhood of a point and the higher the weights of those items, the closer the color of the point is to red.

| Oncomine analysis
Oncomine (www.oncomine.org), a cancer microarray database and web-based data-mining platform, was used to analyze the transcription levels of EXT1 and EXT2 in different cancers. The mRNA expression of EXT1 and EXT2 in clinical cancer specimens was compared with that in normal controls using Student's t-test. A fold change >1.5 with a p-value <0.01 was considered statistically significant.
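The significance rule above amounts to a two-condition filter. The following is a minimal sketch of that filter only; the record type, dataset labels, and numbers are hypothetical and are not Oncomine output, and the p-values are assumed to have been computed beforehand by the t-test.

```python
from dataclasses import dataclass


@dataclass
class DatasetResult:
    """One tumor-vs-normal comparison (hypothetical record layout)."""
    dataset: str
    gene: str
    fold_change: float  # tumor / normal expression ratio
    p_value: float      # assumed to come from a Student's t-test


def is_significant(result, min_fold=1.5, max_p=0.01):
    """Apply the thresholds stated in the text: fold change > 1.5 and p < 0.01."""
    return result.fold_change > min_fold and result.p_value < max_p


# Made-up values for illustration only.
results = [
    DatasetResult("dataset_A", "EXT1", 2.1, 0.0004),
    DatasetResult("dataset_B", "EXT1", 1.3, 0.0020),  # fold change too small
    DatasetResult("dataset_C", "EXT2", 1.8, 0.0300),  # p-value too large
]

significant = [r.dataset for r in results if is_significant(r)]
print(significant)
```

Both conditions must hold, so a large fold change with a weak p-value (or the reverse) is excluded.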

| UALCAN analysis
To increase the credibility of the data, we further analyzed the transcriptional and clinical data for EXT1 and EXT2 from TCGA. The UALCAN platform (http://ualcan.path.uab.edu) allows users to examine relative expression levels of a query gene or gene set among specified tumor sub-groups. These pre-defined tumor sub-groups include cancer stage, tumor grade, race, or other clinicopathologic features. 19

| CCLE analysis
The Cancer Cell Line Encyclopedia (CCLE) 20 (www.broadinstitute.org/ccle) project is a collaboration between the Broad Institute, the Novartis Institutes for Biomedical Research and the Genomics Institute of the Novartis Research Foundation to conduct a detailed genetic and pharmacologic characterization of a large panel of human cancer models, to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns and to translate cell line integrative genomics into cancer patient stratification. 21 CCLE is a public database that supports genomic data analysis and visualization of about 1000 cell lines. EXT1 and EXT2 expression in cancer cell lines was verified using the CCLE datasets.

| RNA extraction and quantitative real-time PCR
Total RNA was extracted from cells using Trizol reagent (Sangon Biotech) according to the manufacturer's instructions. For mRNA quantification, RNA was reverse transcribed to cDNA using the PrimeScript™ RT reagent Kit with gDNA Eraser (Takara). Quantitative real-time PCR was performed on the cDNA using primers specific for the target mRNAs. All real-time PCR reactions were performed using Takara Bio's SYBR Premix Ex Taq™ II in the BIO-RAD CFX96 Real-Time PCR System. The PCR reaction conditions were as follows: 95°C for 10 min followed by 40 cycles of 95°C for 5 sec, 60°C for 30 sec and 72°C for 30 sec. The 2^−ΔΔCt method was used for quantification, and the fold change of target genes was normalized against the internal reference gene β-actin.
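As an illustration of the 2^−ΔΔCt calculation, the sketch below computes a fold change from replicate Ct values. The function name and all Ct values are hypothetical; the reference gene plays the role of β-actin and the control sample the role of HBE cells.

```python
from statistics import mean


def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    Each argument is a list of replicate Ct values:
    target gene and reference gene, in the sample and in the control.
    """
    # dCt: target Ct normalized to the reference gene within each condition
    dct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    dct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: sample dCt relative to control dCt
    ddct = dct_sample - dct_control
    return 2 ** (-ddct)


# Made-up Ct values for illustration only.
fold = fold_change_ddct(
    ct_target_sample=[24.0, 24.5, 23.5],   # target gene, cancer cell line
    ct_ref_sample=[18.0, 18.5, 17.5],      # reference gene, cancer cell line
    ct_target_control=[26.0, 26.5, 25.5],  # target gene, control cells
    ct_ref_control=[18.0, 18.0, 18.0],     # reference gene, control cells
)
print(fold)  # 4.0 with these made-up values (ddCt = -2)
```

A lower Ct means more template, so a negative ΔΔCt corresponds to higher expression in the sample than in the control.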

| Co-expressed genes
The top 100 genes co-expressed with EXT1 were selected from the co-expression datasets in the Oncomine database, based on a cut-off of p-value ≤0.01 and fold change ≥1.5.

| PPI networks
The STRING (Search Tool for the Retrieval of Interacting Genes) database (https://string-db.org, version 11.0) is a biological database designed for the construction of PPI networks of genes, based on known and predicted PPIs, and for analysis of the functional interactions between proteins. 23 Analysis of the functional interactions between proteins may provide insights into the mechanisms underlying the development of diseases. In this study, a PPI network of co-expressed genes was constructed using the STRING database, and interactions with a combined score >0.4 were considered significant. Cytoscape (version 3.7.2), 24 an open source bioinformatics software platform, was used for visualizing the molecular interaction networks.
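The combined-score cut-off can be pictured as a simple edge filter applied before visualization. This is a sketch under assumptions: the edge list, the placeholder partner names (GENE_A, GENE_B, GENE_C), and the scores are invented, and scores are taken to be on a 0-1 scale.

```python
from collections import Counter

# Hypothetical edges: (protein 1, protein 2, combined score on a 0-1 scale).
edges = [
    ("EXT1", "GENE_A", 0.72),
    ("EXT1", "GENE_B", 0.35),
    ("GENE_A", "GENE_C", 0.41),
    ("GENE_B", "GENE_C", 0.40),
]


def filter_edges(edge_list, threshold=0.4):
    """Keep interactions whose combined score is strictly above the threshold."""
    return [(a, b, score) for a, b, score in edge_list if score > threshold]


def node_degrees(edge_list):
    """Count how many retained interactions each protein takes part in."""
    degree = Counter()
    for a, b, _ in edge_list:
        degree[a] += 1
        degree[b] += 1
    return degree


kept = filter_edges(edges)
degree = node_degrees(kept)
print(kept)
print(degree.most_common())
```

Node degree in the filtered network is one simple way to rank proteins by connectivity once the network is loaded into a viewer such as Cytoscape.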

| GO annotation enrichment and KEGG pathway enrichment analysis
The gene ontology (GO) resource provides a platform for functional annotation and enrichment analysis of genes. 25 KEGG (Kyoto Encyclopedia of Genes and Genomes) is a comprehensive database of biological information designed to assist in the interpretation of large-scale molecular data sets. 26 p < 0.05 was considered statistically significant for GO annotation enrichment analysis and KEGG pathway enrichment analysis.
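The text does not state which statistical test produced the enrichment p-values. Over-representation analyses of this kind are commonly based on a hypergeometric test, so the sketch below assumes that test purely for illustration; the function name and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(overlap, background, annotated, selected):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of genes in the background set
    annotated:  number of background genes annotated to the term
    selected:   size of the gene list being tested
    overlap:    number of selected genes annotated to the term
    """
    upper = min(annotated, selected)
    total = comb(background, selected)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 100 co-expressed genes, a background of 20000 genes,
# a term with 200 annotated genes, 8 of which fall in the gene list.
p = enrichment_p_value(overlap=8, background=20000, annotated=200, selected=100)
print(p < 0.05)
```

In practice the raw p-values for many terms would usually also be corrected for multiple testing; that step is omitted here.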

| The Kaplan-Meier plotter
The prognostic significance of the expression of identified miRNAs in LUSC was evaluated using the Kaplan-Meier plotter (www.kmplot.com), an online tool for meta-analysis-based discovery and validation of survival biomarkers that draws on gene expression and clinical data from multiple sources. To assess the prognostic value of a specific miRNA, patient samples are divided into two cohorts according to the median expression of that miRNA (high vs. low). We obtained the Kaplan-Meier survival plots for the shortlisted miRNAs and assessed the association with OS in LUSC patients based on the number-at-risk values, log-rank p-value and hazard ratio (HR) with 95% confidence intervals available for each plot.
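The median split and log-rank comparison described above are performed by the Kaplan-Meier plotter itself. The sketch below only illustrates the idea with a standard-library implementation of a two-group log-rank test; the function names and patient data are hypothetical, and hazard ratios and confidence intervals are not computed.

```python
from math import erfc, sqrt
from statistics import median


def split_by_median(expression):
    """Return True for samples above the median (high group), else False (low)."""
    cut = median(expression)
    return [value > cut for value in expression]


def logrank_test(times, events, high):
    """Two-group log-rank test.

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if censored
    high:   True if the patient is in the high-expression group
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        at_risk_high = sum(1 for ti, g in zip(times, high) if ti >= t and g)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        deaths_high = sum(
            1 for ti, e, g in zip(times, events, high) if ti == t and e and g
        )
        frac_high = at_risk_high / at_risk
        observed_minus_expected += deaths_high - deaths * frac_high
        if at_risk > 1:
            variance += (deaths * frac_high * (1 - frac_high)
                         * (at_risk - deaths) / (at_risk - 1))
    if variance == 0:
        raise ValueError("groups cannot be compared (zero variance)")
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom
    return chi2, erfc(sqrt(chi2 / 2))


# Made-up cohort for illustration only.
expression = [2.1, 8.4, 3.3, 9.0, 1.7, 7.6, 2.9, 8.8]
months = [12, 40, 15, 55, 9, 48, 20, 60]
died = [1, 0, 1, 1, 1, 0, 1, 0]

groups = split_by_median(expression)
chi2, p_value = logrank_test(months, died, groups)
print(round(chi2, 3), round(p_value, 4))
```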

| Identification of candidate miRNAs and miRNA-mRNA regulation network
Although considerable progress has been made, identification of differentially expressed miRNAs involved in the regulation of mRNA is still critical for a complete understanding of the miRNA-mRNA regulation network in LUSC. We compared the transcriptional levels of miRNAs in LUSC with those in normal samples using the ENCORI database. Further, as EXT1 overexpression was associated with unfavorable prognosis in LUSC, we hypothesized that the miRNAs regulating EXT1 should ideally predict favorable prognosis. Predicted miRNA-mRNA regulation networks were visualized in Cytoscape.

| EXT1 and EXT2 expression in LUSC patients
A PubMed search with EXT as the keyword returned 239 relevant publications from 2010 to 2020. As shown in Figure S1, tumor biomarkers are a prominent focus of this research. Nevertheless, the expression and prognostic value of EXT genes in human LUSC are not well understood. We compared the mRNA expression of EXT1 and EXT2 in LUSC samples with that in normal samples in the Oncomine database (Figure 1A). The expression levels of EXT1 were significantly higher (p < 0.001) in LUSC samples than in normal samples in two datasets (Talbot Lung and Hou Lung) (Figure 1B). However, EXT2 expression levels were not significantly different between tumor and normal tissues (Figure 1C). Notably, the expression of both EXT1 and EXT2 in LUSC tissues was significantly higher than that in normal tissues in the UALCAN analysis of samples from the TCGA database (Figure 1D and E). Statistically significant differences were observed between tumor and normal samples grouped by clinical data such as age, tumor stage, lymph node metastasis, smoking habits, histological subtypes, and TP53-mutation status (Figure 2C-H), but there were no differences by race or gender (Figure 2A and B).

| EXT1 and EXT2 expression in NSCLC cell lines
We included data from the Cancer Cell Line Encyclopedia (CCLE) (www.broadinstitute.org/ccle) database to extend our analysis to preclinical human cancer models. We observed high expression of EXT1 and EXT2 in NSCLC cell lines (Figure 3A). To validate the findings from the analysis of microarray-based datasets, we measured the expression of EXT mRNA and protein in five NSCLC cell lines (A549, PC9, NCI-H1299, NCI-H460, and NCI-H23) and human bronchial epithelial (HBE) cells by qRT-PCR and western blot, respectively. The results confirmed that both EXT1 mRNA and EXT1 protein expression levels were significantly higher in NSCLC cell lines than in the control HBE cells (p < 0.01), consistent with the results of our analysis (Figure 3B-D). Similarly, EXT2 was significantly overexpressed in all NSCLC cell lines (p < 0.01) except NCI-H23 (Figure 3B-D). These results suggest that upregulation of EXT1 and EXT2 may be closely associated with the biological characteristics of malignant LUSC.

| Association of EXT1 and EXT2 expression with prognosis in LUSC patients
The association of EXT1 and EXT2 expression with OS and disease-free survival (DFS) in patients with LUSC was analyzed using the GEPIA server. As shown in Figure 4A, the OS rate of patients with high EXT1 expression was significantly lower than that of patients with low EXT1 expression (p = 0.027), but the association with DFS rate was not statistically significant (p = 0.35). The association of EXT2 expression with both OS rate and DFS rate of LUSC patients was not statistically significant (Figure 4B). Thus, survival analysis revealed that increased EXT1 mRNA levels were significantly associated with reduced OS in LUSC patients.

| Genes co-expressed with EXT1 and functional enrichment analysis
Based on the results of the expression and survival analysis described above, we selected EXT1 for further bioinformatics analysis. The top 100 genes co-expressed with the EXT1 gene in LUSC were screened from the Gemma Cell Line dataset of the Oncomine database (Figure 5). A protein-protein interaction (PPI) network was generated in the STRING protein interaction database (Figure 6A) and imported into the bioinformatics software platform Cytoscape (Version 3.7.1) for visualization (Figure 6B) and further analysis. Functional annotation enrichment analysis using Gene Ontology (GO) (Figure 7A) and KEGG pathway enrichment analysis (Figure 7B) showed that the co-expressed genes were significantly enriched in biological processes such as cell matrix adhesion, cell connectivity, regulation of inflammatory response, regulation of multi-organism processes, and regulation of NIK/NF-kappaB signaling; molecular functions such as cytokine binding, protein binding, receptor binding and matrix adhesion; and cellular components such as cell matrix junction, membrane microstructural domain, receptor complex, and adhesion spot. The most enriched KEGG pathways included extracellular matrix receptor interaction, proteoglycans in cancer, complement and coagulation cascade, tumor necrosis factor signaling pathway, and cell death, among others.

| Regulatory miRNAs and survival analysis
MiRNAs are short non-coding RNAs that induce mRNA silencing and destabilization by binding to specific target sites. 28 MiRNAs inversely regulate their target mRNAs, resulting in a negative correlation between miRNA and mRNA expression. 29 Therefore, potential regulatory miRNAs should meet the following two criteria: decreased expression in LUSC samples, and association of decreased expression with poor prognosis in LUSC patients. The ENCORI platform predicted a total of 42 miRNAs regulating EXT1 (Table 1). Among them, 22 miRNA-EXT1 pairs were negatively correlated. The Kaplan-Meier plotter was used to evaluate the prognostic value of these 22 miRNAs in LUSC. For nine of them, low expression was a significant predictor of poor prognosis in LUSC patients (Figure 8). The ENCORI pan-cancer analysis platform was used to compare the expression of these nine miRNAs in LUSC and normal samples. Three miRNAs (hsa-miR-190a-5p, hsa-miR-195-5p, and hsa-miR-490-3p) were found to be significantly downregulated in LUSC samples (Figure 9A-C).
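The selection described above is a three-step funnel (42 predicted, 22 negatively correlated, nine prognostic, three downregulated). The sketch below mirrors that funnel on invented records; the field names, placeholder miRNA names, and the 0.05 significance cut-off are assumptions, since the text does not state the thresholds used at each step.

```python
from dataclasses import dataclass


@dataclass
class MirnaRecord:
    """Hypothetical per-miRNA summary assembled from the separate tools."""
    name: str
    corr_with_ext1: float     # correlation between miRNA and EXT1 expression
    km_p_value: float         # log-rank p-value from the survival analysis
    hazard_ratio: float       # high vs. low expression; < 1: low expression is worse
    log2_fold_change: float   # tumor vs. normal; < 0: downregulated in tumor
    expression_p_value: float


def select_candidates(records, alpha=0.05):
    """Apply the three filters in the order described in the text."""
    negatively_correlated = [r for r in records if r.corr_with_ext1 < 0]
    prognostic = [
        r for r in negatively_correlated
        if r.km_p_value < alpha and r.hazard_ratio < 1
    ]
    downregulated = [
        r for r in prognostic
        if r.log2_fold_change < 0 and r.expression_p_value < alpha
    ]
    return negatively_correlated, prognostic, downregulated


# Made-up records for illustration only.
records = [
    MirnaRecord("miR-A", -0.30, 0.010, 0.60, -1.2, 0.001),
    MirnaRecord("miR-B", -0.20, 0.030, 0.70, 0.5, 0.010),   # not downregulated
    MirnaRecord("miR-C", -0.10, 0.400, 0.90, -0.8, 0.020),  # not prognostic
    MirnaRecord("miR-D", 0.20, 0.010, 0.50, -1.0, 0.001),   # positively correlated
]

step1, step2, step3 = select_candidates(records)
print(len(step1), len(step2), len(step3))
print([r.name for r in step3])
```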

| MiRNA-EXT1 regulation network
We established a potential miRNA-EXT1 regulation network based on the regulatory miRNAs of EXT1 identified by bioinformatics analysis using the ENCORI database and visualized it in Cytoscape (Figure 9D). The components of this potential miRNA-EXT1 regulatory network may serve as prognostic biomarkers and therapeutic targets.

| DISCUSSION
Dysregulation of the EXT1 gene has been reported in many cancers, including multiple osteochondroma (MO), 30 breast cancer, 7 ALL 31 and HCC. 6 To the best of our knowledge, the association of EXT1 expression with LUSC has not been reported. This is the first study to explore the prognostic value of EXT1 mRNA expression in LUSC. Our findings add to the current knowledge and may contribute towards improving treatment options and increasing the accuracy of prognosis for patients with LUSC. It is reported that 70% to 90% of MO cases are caused by pathogenic mutations in the EXT1 or EXT2 genes, and that EXT1 is more frequently mutated than EXT2. 32 Furthermore, EXT1 regulates the NOTCH pathway in an FBXW7-dependent manner in ALL. 5 Moreover, EXT1-dependent HS structure is involved in modifying tumor-stroma interactions through altering stromal TGF-ß1 expression in human A549 carcinoma cells. 33
Our study of transcriptional data from Oncomine, UALCAN, TCGA and CCLE revealed increased levels of EXT1 and EXT2 expression in LUSC samples and cell lines. There were significant differences between tumor and normal samples grouped by age, tumor stage, lymph node metastasis, smoking habits, histological subtypes, and TP53-mutation status. Notably, in the Talbot Lung and Hou Lung datasets, the difference in expression levels between cancer and adjacent normal tissues was statistically significant only for EXT1. Furthermore, EXT1 mRNA and protein were significantly overexpressed in the five NSCLC cell lines studied (A549, PC9, NCI-H1299, NCI-H460, NCI-H23) as compared with HBE cells, whereas EXT2 mRNA and protein were significantly overexpressed in all except the NCI-H23 cell line. Survival analysis showed that patients with high EXT1 expression had unfavorable OS. These results suggest that the overexpression of EXT1 could be a novel potential prognostic marker in LUSC.
We mapped the top 100 genes co-expressed with EXT1 into the STRING database and obtained the PPI network to identify the interactions between these genes. A functional enrichment analysis was carried out to further understand the role of genes co-expressed with EXT1 in LUSC. The GO enrichment analysis results indicated that these genes are primarily involved in biological processes such as cell adhesion and migration. Furthermore, KEGG pathway enrichment analysis revealed that the co-expressed genes were enriched in multiple pathways, including extracellular matrix-receptor interaction, proteoglycans in cancer, the complement and coagulation cascade, tumor necrosis factor signaling pathway, and cell death, among others. In particular, GO and pathway enrichment analysis indicated that the co-expressed genes were significantly enriched in focal adhesion. It is well documented that focal adhesion and cell adhesion play a key role in cancer invasion and metastasis. 34,35 Thus, our findings suggest that EXT1 may be involved in the invasion and metastasis of LUSC.
MicroRNAs (miRNAs) are short non-coding RNAs with regulatory functions in various biological processes including cell differentiation, development and oncogenic transformation. 36 Numerous studies have shown that miRNAs bind to the mRNA transcripts of protein-coding genes, inhibiting their translation or leading to mRNA degradation. We used the ENCORI platform to predict the miRNAs regulating EXT1 and found 42 miRNAs, listed in Table 1, of which 22 were negatively correlated with EXT1 expression in LUSC. Furthermore, we analyzed OS and DFS associated with the expression of these 22 miRNAs.
FIGURE 8 Kaplan-Meier plots for miRNAs negatively correlated with EXT1 expression in LUSC patients (Kaplan-Meier plotter). LUSC patients with low expression of miRNAs had a poor prognosis. (EXT, Exostosin; LUSC, lung squamous cell carcinoma; miRNA, microRNA)
Negatively regulated miRNA-mRNA pairs have been reported to contribute significantly to the initiation and development of different types of cancers. 37-39 We identified three significantly down-regulated miRNAs, hsa-miR-190a-5p, hsa-miR-195-5p, and hsa-miR-490-3p, with good prognostic value.
Functionally, hsa-miR-190a-5p has been reported to act as a tumor suppressor in multiple malignancies. miR-190a-5p expression levels are significantly decreased in cancer tissues compared with normal tissues, and overexpression of miR-190a-5p inhibits cell proliferation and invasion and promotes apoptosis in cancers such as cervical cancer, neuroblastoma, and breast tumors. 40-42 A recent study showed that smoking-induced dysregulation of hsa-miR-190a-5p was significantly associated with epithelial-mesenchymal transition (EMT) and carcinogenesis. 43 Furthermore, hsa-miR-195-5p has also been shown to act as a tumor suppressor in many human cancers, including renal cell carcinoma, gastric cancer, ovarian cancer, pancreatic cancer, melanoma, HCC, and colorectal cancer. 44-50 The expression of miR-195-5p is decreased in NSCLC tissues and cell lines, is significantly associated with TNM stage, tumor size and lymph node metastasis, and is correlated with poor prognosis in NSCLC patients. Functional analysis has revealed that overexpression of miR-195-5p significantly suppressed cell proliferation and promoted cell cycle arrest and apoptosis in NSCLC. 51 Several studies have also demonstrated similar behavior for hsa-miR-490-3p: decreased expression of this miRNA was significantly associated with tumorigenesis in human cancers such as ovarian carcinoma, 52 colorectal cancer, 53,54 glioma, 55 prostate cancer, 56 esophageal squamous cell carcinoma 57 and HCC, 58 whereas increased expression inhibited cellular growth and suppressed cellular migration and invasion.
Overall, our findings are consistent with previous studies and indicate that the three miRNAs identified in this study, hsa-miR-190a-5p, hsa-miR-195-5p and hsa-miR-490-3p, play an important role in the inhibition of malignant tumors. Thus, we have established a potential miRNAs-EXT1 regulation network that may be associated with prognosis in LUSC. In summary, based on the bioinformatics analyses presented in this study, we suggest EXT1 as a novel potential prognostic marker for LUSC and present the miRNAs regulating EXT1 which could be involved in carcinogenesis. We hope that our findings will benefit future studies and improve the prognosis of LUSC patients.