Identification of PSMD14 as a potential novel prognosis biomarker and therapeutic target for osteosarcoma

Abstract Background Osteosarcoma is the most common primary bone tumor. The survival rate of osteosarcoma patients has not significantly increased in the past decades. Uncovering the mechanisms of malignancy, progression, and metastasis will shed light on the development of new therapeutic targets and treatment for osteosarcoma. Aim The aim of this study is to identify potential osteosarcoma biomarker and/or therapeutic targets by using integrated bioinformatics analysis. Methods and results We utilized existing gene expression datasets to identify differential expressed genes (DEGs) that could serve as osteosarcoma biomarkers or even as therapeutic targets. We found 48 DEGs were overlapped in three datasets. Among these 48 DEGs, PSMD14 was on the top of the up‐regulated gene list. We further found that higher PSMD14 expression was correlated with higher risk group (younger age group, ≤20.83 years of age), metastasis within 5 years and higher grade of tumor. Higher PSMD14 expression in osteosarcoma had positive correlation with higher infiltration of CD8+ T cells, neutrophils and myeloid dendritic cells. Kaplan‐Myer survival data further revealed that higher expression of PSMD14 predicted significantly worse prognosis (p = .013). Gene set enrichment analysis was further performed for the DEGs related to PSMD14 in osteosarcoma. We found that lower PSMD14 expression group had more immune responses such as interferon γ, α responses, inflammation response etc. However, the higher PSMD14 expression group had more cell proliferation‐related biological processes, such as G2M checkpoints and Myc targets. Through establishing protein–protein interaction networks using PSMD14 related DEGs, we identified 10 hub genes that were all ribosomal proteins. These hub genes may play roles in osteosarcoma tumorigenesis, progression and/or metastasis. Conclusion We identified PSMD14 gene as a possible osteosarcoma biomarker, and/or a possible therapeutic target.


| BACKGROUND
Osteosarcoma is the most common primary bone tumor. 1 It mostly occurs in children and young adults. Although osteosarcoma can develop at any age, teenagers are the most commonly affected age group. 1 It occurs to more than 3.4 million people worldwide every year. 2 Great advances have been made in the treatment and management of osteosarcoma in the recent years, but a significant percentage of patients do not survive because of numerous reasons, such as metastasis. 3 The 10-year survival rate for patients with metastatic osteosarcoma is only 24.0% compared to 65.8% for the patients with local/regional disease. 3 Although by developing target therapies has significantly increased the survival rate of other cancers, the survival rate of osteosarcoma patients has remained the same in the recent a few decades. 4 Understanding the mechanisms of the malignancy, finding the prognosis markers and potential therapeutic target(s) are the keys to significantly increasing the survival rate after treatment.
Although the mechanisms of osteosarcoma malignancy is far from fully understood, alteration of gene expression has been known to play significant roles in the tumorigenesis of osteosarcoma. 4 Several gene alterations have been identified to be associated with osteosarcoma, such as TP53, [5][6][7] Rb, 8 Myc. 9 Recently gene expression profiling was used to uncover the pathophysiology. These gene expression datasets provided insight to the gene expression alteration in osteosarcoma. Comparing multiple gene expression datasets of osteosarcoma will provide further information for potential prognosis markers or therapeutic targets. In this study, we utilized multiple datasets to identify potential osteosarcoma prognosis markers and/or therapeutic targets.
As a component of 26S proteasome, PSMD14 forms multiprotein complex with several other components in the proteasome. Together, they are involved in the ATP-dependent degradation of ubiquitinated proteins. [10][11][12] This is the most important machinery to maintenance of protein homeostasis by removing misfolded or damaged proteins. Besides involved in the proteasome function, PSMD14 reportedly plays critical roles in regulating many signaling pathways, such as double-strand DNA break repairs, 10 embryonic stem cell differentiation, 13 cellular proliferation, 14,15 maintenance of protein stability, 16,17 Golgito-ER retrograde transport, 18 and inflammasome activation. 19 However, whether PSMD14 plays any roles in osteosarcoma is not elucidated. In this study, we utilized existing osteosarcoma gene expression profile datasets from Gene Expression Ominbus (GEO) and identified PSMD14 as a top differential expressed gene (DEG). Based on the novelty and our preliminary screens results, we did further analysis on the correlation of PSMD14 expression and osteosarcoma clinical features. The expression level of PSMD14 was found to be highly correlated to osteosarcoma prognosis. This finding hinted that PSMD14 could potentially be used as an osteosarcoma prognosis marker, could also potentially be used as a therapeutic target.

| Identification of differential expressed genes
DEGs between the osteosarcoma samples and control samples in all datasets were screened by using the Limma package. 24 Criteria used for DEGs are p < .05 and "fold change" >1.5. The overlapping of the DEGs between the datasets identified from Limma analysis of each dataset were analyzed with VENN package and plotted with Venn-Diagram. 25 For identification of DEGs associated with PSMD14 expression, osteosarcoma patients were divided into high PSMD14 expression and low PSMD14 expression groups by using median expression of PSMD14 as cut-off.

| Gene ontology enrichment assay
Gene ontology enrichment of the DEGs were analyzed by using Metascape (https://metascape.org) online software, 26 with the parameters setting as follows: min overlap = 3, min enrichment = 1.5 and p value cut-off = .01. Biological Process, Molecular Function, Cellular Component categories were focused and presented.

| Protein-protein interaction network analysis
In order to understand the interactions among the DEGs in osteosarcoma, we used STRING (http://www.string-db.org) 27 to predict the potential interactions (score > 0.4). The interaction networks were visualized by using Cytoscape software. 28 PPI network within the proteins that potentially interact with DEG proteins were analyzed and established by using NetworkAnalyst (https://www.networkanalyst.   ca) online analysis. 29 PPI networks within the DEGS associated to PMSD14 expression were analyzed by using MCODE, and top 10 hub genes were identified by using CytoHubba. 30

| Immune cells infiltration analysis
Immune cells infiltration in osteosarcoma tissues analysis was performed by analyzing the TCGA gene expression profile dataset and TARGET dataset on TIMER2.0 web server. 31 The correlation between expression of PSMD14 gene and the immune cells infiltration was analyzed.

| Survival analysis
The overall effect of the expression level of PSMD14 on patient survival was analyzed by using Kaplan-Meier curve. Patients in the dataset obtained from TARGET were divided into PSMD14 high expression group and PSMD14 low expression group by using median expression of PSMD14 as cut-off. The difference between the low and high PSMD14 expression groups was analyzed by log-rank test. A P value less than 0.05 was defined as significant different.

| Gene set enrichment analysis
In order to analyze the gene ontology of the genes significantly correlated to PSMD14 expression in osteosarcoma, we utilized gene set enrichment analysis (GSEA) computational method. 32 Patients were divided into PSMD14 high-expression and lowexpression groups by using median expression of PSMD14 as cut-

| RESULTS
Identification of DEGs in osteosarcoma and gene ontology analysis. In order to identify the DEGs in osteosarcoma, we analyzed three genome-wide gene expression profile datasets GSE14359, 20 GSE42352, 21,22 and GSE39262. From these three datasets, we identified 1708, 2214 and 451 DEGs, respectively, in osteosarcoma to compared normal (or non-neoplastic) cells ( Figure 1A). There are 48 DEGs were overlapping within the DEGs from the three datasets ( Figure 1A). To understand the overall function of the overlapped DEGs, we further performed gene ontology enrichment analysis for these 48 genes, focusing on three categories, biological process, molecular function and cellular component. We found that these genes were enriched in these biological process categories: mitotic cell cycle phase transition, nuclear DNA replication, spindle organization ( Figure 1B). In molecular function category, these genes were mainly enriched in single-stranded DNA binding, motor activity ( Figure 1C). In cellular component category, we found nuclear replisome was the most enriched category ( Figure 1D). To understand the relationship between the proteins encoded by these DEGs, we performed protein-protein interaction (PPI) analysis, we found that 25 proteins among the proteins encoded by these DEGs interact with each other ( Figure 1E). Furthermore, numerous proteins are predicted to interact with these proteins and form a PPI network ( Figure 1F).
The results indicated that these DEGs may play important roles in DNA replication and cell proliferation.

| Up-regulation of PSMD14 expression was correlated to osteosarcoma
Among the 48 DEGs, Table 1 listed top 10 up-regulated and Table 2 listed top 10 down-regulated genes. Among the top 10 up-regulated genes, we found that PSMD14 was the most significantly upregulated gene. PSMD14 has been recently reported to play important roles in other cancers, such as lung adenocarcinoma, 14,33 hepatocellular carcinoma, 34,35 glioma, 17 and esophageal squamous cell carcinoma. 36 But whether PSMD14 plays any roles in osteosarcoma is not studied. Based on the preliminary screening analysis and the novelty, we performed further analysis focus on PSMD14 gene. In all three datasets, PSMD14 gene was significantly up-regulated ( Figure 2A-C). By using relative larger sample sized GSE42352 dataset, the receiver operating characteristic (ROC) showed that PSMD14 expression could positively predict whether the tissue was osteosarcoma ( Figure 2D). This finding hinted a possibility of using PSMD14 as a diagnosis marker for osteosarcoma. Huvos Grade of osteosarcoma, we found that higher grade, that is, grade 4 had higher level of PSMD14 expression compare to grade 2 and grade 3 ( Figure 3C). Logic regression model analysis further confirmed that age of the patients was significantly correlated to PSMD14 gene expression level ( Figure 3D).

| High expression of PMSD14 gene was correlated to osteosarcoma immune cell infiltration and poor prognosis
By analyzing the SARC patients gene profile dataset from the TCGA, we found that expression level of PSMD14 was positively correlated to infiltration of CD8+ T lymphocytes ( Figure 4A), neutrophils ( Figure 4B), as well as myeloid dendritic cells ( Figure 4C). We further analyzed osteosarcoma patient dataset from TARGET database by comparing PSMD14 high expression patient group to PSMD14 low expression patient group. The result indicated that higher PSMD14 expression was correlated to shorter survival time ( Figure 4D). The Kaplan-Meier curve further showed that high expression of PSMD14 was significantly correlated to poor prognosis ( Figure 4E).

| Gene ontology analysis of DEGs associated with expression level of PSMD14 gene
We utilized GSE42352 dataset to identify genes that were differentially expressed between the high and low PSMD14 expression groups in osteosarcoma, by dividing the GSE42352 cohort into PSMD14 high expression and PSMD14 low expression groups based on the median expression level of PSMD14. The characteristics of the patients are listed in Table 3. We identified 753 DEGs, including 619 genes that were significantly up-regulated in PSMD14 high group, and 134 genes that were significantly down regulated in the PSMD14 high group ( Figure 5A,B). Gene ontology enrichment analysis (Tables 4-6) further revealed that these DEGs were overrepresented in translational initiation, mitotic cell cycle phase transition, antigen processing and presentation of peptide antigen, and ribosome biogenesis categories ( Figure 5C, Table 4). For molecular function, these DEGs were overrepresented in amide binding, cell adhesion molecule binding, peptide antigen binding, structural constituent of ribosome, and protein kinase regulator activity categories ( Figure 5D, Table 5). As for cellular component terms, genes were primarily enriched in anchoring junction, MHC protein complex, vacuolar part, ribosome, and cytoplasmic vesicle membrane ( Figure 5E, Table 6).

| Gene set enrichment analysis
GSEA verified that enriched biological processes were different, when comparing PSMD14 low expression group with PSMD14 high expression group (Table 7). Significantly enriched biological process in the PSMD14 low expression group included: interferon γ response ( Figure 6A), interferon α response ( Figure 6B Figure 6N). However, in the PSMD14 high expression group, enriched categories were more related to cell proliferation, such as G2M checkpoint ( Figure 6O) and Myc target ( Figure 6P). This data indicated that low PSMD14 expression osteosarcoma patients may have higher immune response.

| Protein-protein interaction network analysis
To further understand the potential relationships between the DEGs associated to PSMD14 expression level, we performed PPI analysis ( Figure 7A-E). The PPI network revealed that these DEGs were involved in five different subnetworks. Ten hub genes were identified by using CytoHubba. Interestingly, these genes were all ribosomal genes: RPL7A, RPS24, RPL26, RPS18, RPL23A, RPL27A, RPL7, RPL10A, RPS3A, and RPL23. This was consistent with the GO enrichment analysis ( Figure 5). The PPI network within the 10 hub genes further proved the interaction between these proteins. identify potential prognosis marker(s), and/or therapeutic targets.
We utilized existing datasets to identify the DEGs between osteosarcoma and normal tissue or cells counterpart. Among the top DEGs, PSMD14 was the most significant one. By further analyzing the data, we found that high expression of PSMD14 was significantly correlated to higher osteosarcoma grade and metastasis ( Figure 3). This data indicated that PSMD14 could potentially serve as a prognosis marker.  47 Interestingly, T lymphocytes are the one of major immune cells infiltrated in osteosarcoma tissues, 48,49 and high CD8+ T lymphocytes/FOXP3 cells ratio predicted better prognosis. 48 However, by identifying from the gene expression, it seemed that CD8+ T lymphocytes were relatively abundant in the PSMD14 high expression osteosarcoma tumors (Figure 4).
The status of the CD8+ T lymphocytes within these tumors will be very interesting to be further investigated. For instance, the possibility that PSMD14 high expression tumor cells could induce CD8+ T lymphocytes senescence or exhaustion could be an explanations for the positive correlation between CD8+ T lymphocytes, PSMD14 high expression and worse prognosis in osteosarcoma. In fact, the low PSMD14 expression osteosarcoma group, which had better prognosis, F I G U R E 7 Visualization of protein-protein-interaction networks of significant differential expressed genes (DEGs) based on the expression of PSMD14. (A-E) Five subnetworks identified by the MCODE algorism. (F) Ten hub genes of significant DEGs identified by PPI networks showed relatively enriched immunresponse as shown in Figure 6A-D, and showed higher apoptosis ( Figure 6F). However, the DEGs in the higher PSMD14 expression group indicated more enrichment of cell proliferation genes, such as G2M checkpoints and Myc targets ( Figure 6O,P). Whether high expression of PSMD14 osteosarcoma cells suppress immune cells activation and the possible cellular and molecular mechanisms warrant further investigation, which will potentially lead to better understanding of the mechanisms of osteosarcoma progression and/or metastasis or even more possible osteosarcoma treatments.
Taken together, in this study we reported that PSMD14 was significantly up-regulated in osteosarcoma compare to normal tissue in all datasets we analyzed. The high expression level of PSMD14 was found to be correlated with metastasis and worse prognosis in osteosarcoma. Therefore, it is possible that PSMD14 can serve as a prognosis marker and a therapeutic target. To our knowledge, this is the first report demonstrating that PSMD14 can be potentially used as a prognosis marker for osteosarcoma. However, due to the limitation of bioinformatics analysis, the protein level of PSMD14 in osteosarcoma is not well studied. Our next goal is to confirm the protein level expression of PSMD14 in human osteosarcoma biopsies, and to further prove the possibility of PSMD14 as a prognosis marker and/or therapeutic target.