Identification of a prognostic gene signature based on an immunogenomic landscape analysis of bladder cancer

Abstract Cancer immune plays a critical role in cancer progression. Tumour immunology and immunotherapy are one of the exciting areas in bladder cancer research. In this study, we aimed to develop an immune‐related gene signature to improve the prognostic prediction of bladder cancer. Firstly, we identified 392 differentially expressed immune‐related genes (IRGs) based on TCGA and ImmPort databases. Functional enrichment analysis revealed that these genes were enriched in inflammatory and immune‐related pathways, including in ‘regulation of signaling receptor activity’, ‘cytokine‐cytokine receptor interaction’ and ‘GPCR ligand binding’. Then, we separated all samples in TCGA data set into the training cohort and the testing cohort in a ratio of 3:1 randomly. Data set GSE13507 was set as the validation cohort. We constructed a prognostic six‐IRG signature with LASSO Cox regression in the training cohort, including AHNAK, OAS1, APOBEC3H, SCG2, CTSE and KIR2DS4. Six IRGs reflected the microenvironment of bladder cancer, especially immune cell infiltration. The prognostic value of six‐IRG signature was further validated in the testing cohort and the validation cohort. The results of multivariable Cox regression and subgroup analysis revealed that six‐IRG signature was a clinically independent prognostic factor for bladder cancer patients. Further, we constructed a nomogram based on six‐IRG signature and other clinicopathological risk factors, and it performed well in predict patients' survival. Finally, we found six‐IRG signature showed significant difference in different molecular subtypes of bladder cancer. In conclusions, our research provided a novel immune‐related gene signature to estimate prognosis for patients' survival with bladder cancer.


| INTRODUC TI ON
Bladder cancer is one of the most common malignant cancers worldwide. The new global cancer statistics showed an estimated 549 000 new bladder cancer cases and 200 000 deaths in 2018. 1 The main treatment of bladder cancer is surgery and chemotherapy. The therapeutic effect is still not satisfactory, Patients who underwent radical cystectomy had a 5-year overall survival rate of 66% and a 10-year survival rate of 43%, and there is still a lack of public recognized, universally applicable bladder cancer diagnosis and prognostic evaluation markers. Increasing evidence has revealed tumour microenvironment plays an important role in carcinogenesis and progression of bladder cancer. [2][3][4] Some immunological checkpoint molecules have been found to be particularly important in the regulation of tumour microenvironment, A series of corresponding immunological checkpoint inhibitors as PD-1/PD-L1 has been applied in bladder cancer treatment. 5,6 However, current PD-1/PD-L1 immunotherapies showed a poor clinical efficacy for clinical patients only range from 16% to 25%. An increasing number of studies have found that many other immune-related genes can also affect the prognosis and efficacy of immunotherapy. Therefore, there is an urgent need to elucidate the immunophenotypes and identify novel immune-related prognostic genes in bladder cancer.
In this study, we aimed to gain insight into the immunogenomic landscape of bladder cancer and develop an immune-related gene signature to improve the prognostic predictions of bladder cancer. nlm.nih.gov/geo/). This data set included 165 primary bladder cancer samples, 23 recurrent non-muscle invasive tumour tissues, 58 normal looking bladder mucosae surrounding cancer and 10 normal bladder mucosae for microarray analysis. We also obtained a list of immune-related genes (IRGs) from the Immunology Database and Analysis Portal (ImmPort). 7,8

| Data processing and differentially expressed gene (DEG) screening
For TCGA-BLCA data, the gene expression data were based on the RNA-sequencing technology of Illumina HiSeq. Ensembl IDs were converted to gene symbol by the corresponding GENCODE files.
'edgeR' R package was performed for differential expression analysis. The cut-off criteria for screening DEGs were the FDR (false discovery rate) < 0.05 and |log 2 (fold change)| ≥ 1. For the microarray analyses, RMA method was used to perform background correction for the raw expression data at first, and log2 transformation and normalization were performed for processed signals. Then, probes were annotated by the corresponding annotation files.

| Pathway and functional enrichment analysis
To further indicate the underlying mechanism of IRGs impact on correlative clinical feature, the pathway enrichment analysis was performed using above differentially expressed IRGs. Metascape (http://metas cape.org/) was used to gain insights into the biological functions of IRGs. 9 Metascape is a web portal for gene annotation and analysis. It provides automated meta-analysis tool to understand common and unique pathways within a group of orthogonal target-discovery studies. It also supports protein-protein interaction (PPI) analysis based on BioGRID, interactive visualization of Gene Ontology (GO) Networks and enrichment heatmap generation.

| Establishment of IRG signature with LASSO Cox regression model
The LASSO Cox regression analysis (LASSO, least absolute shrinkage and selection operator 10 ) can achieve shrinkage and variable selection simultaneously by performing the Cox regression model with LASSO penalty. It was performed to establish the most valuable predictable IRG signature. Firstly, we used R software to generate a random number, and all samples in TCGA data sets were randomly assigned to the training cohort (n = 303) and the testing cohort (n = 101) in a ratio of 3:1 using the randomization method. Data set GSE13507 was set as the validation cohort. Univariable Cox analysis was used to identify immune-related genes associated with survival.
Then, LASSO Cox regression analysis was performed using R package 'glmnet' to construct a prognostic IRG gene signature based on survival-related IRGs in the training group. We estimated the optimal values of penalty parameter lambda through 10-fold cross-validations. 11,12 A prognostic IRG signature was then constructed based on the mRNA expression level weighted by the estimated regression coefficient in the LASSO Cox regression model. The risk score of six-IRG signature = ∑ n i=1 (coefi * Expri), where Expr i was the expression of the IRGs in the signature for patient i, and coef i is the LASSO coefficient of the IRGs i.

| Estimate of IRG signature for patients' prognosis
The optimal cut-off values of prognostic IRG signature were evaluated using the X-tile software version 3.6.1 (Yale University School of Medicine, New Haven, CT, USA). 13 The patients were separated into the high-risk and low-risk groups according to the optimal cutoff values. Kaplan-Meier survival analysis and log-rank test were selected to assess the association between each IRG and survival of patients. The accuracy of the survival prediction based on the IRG signature was analysed by a time-dependent receiver operating characteristic (ROC) curve, 14 with quantification of the area under the curve at different cut-off times using the 'survival ROC' package. Cox regression analyses and subgroup analyses were performed to identify whether six-IRG signature was a clinically independent prognostic factor.

| Construction and assessment of the nomogram
The nomogram was used to construct a prognostic scoring system for predicting survival in bladder cancer patient. In the nomogram, a vertical line was drawn from each risk factor to the 'Points' line to get a score in the nomogram. Then, the score of each risk factor was added to obtain the total points, which could be used to estimate survival probability. Calibration plots were performed to assess the performance of the nomogram. R package 'rms' was performed for construction and assessment of the nomogram.

| Genetical alteration of IRG genes
The cBioPortal for Cancer Genomics (http://www.cbiop ortal.org/) is a large-scale cancer genomics database. 15 It provides an open platform to explore, visualize and analyse multi-dimensional cancer genomic data. Researchers can interactively explore the genetic changes of different samples, genes and paths. This site also provides gene-level graphical summaries from multi-platform, web visualization analysis and survival analysis. We used cBioPortal to explore genetic alterations connected with the six IRGs and their correlation with other famous genes.

| Statistical analysis
R software version 3.5.2 was used for all the statistical analyses. All statistical tests were two-sided, and P < .05 was considered statistically significant. Group comparisons were performed using the t test for continuous variables and chi-squared test for categorical variables. The Harrell's concordance index (C-index) and Akaike information criterion (AIC) were performed to measure and compare predictive accuracy of the prognostic models.

| Identification of immune-related genes and pathway analysis
The gene expression matrix, which consists of 414 bladder cancer specimens and 19 adjacent non-tumour bladder specimens in the TCGA data set, was analysed to identify differentially expressed genes (DEGs). A total of 5180 DEGs were screened out with a threshold criterion |log2FC| > 1 and FDR < 0.05, including 3108 up-regu-

| Identification of the predictive six-IRG signature for survival prediction
To avoid the interference of unrelated causes of death, patients with follow-up time <30 days were excluded. Afterwards, by performing univariable Cox regression analysis between 392 differentially expressed IRGs above and survival data of 404 patients, we obtained 92 prognosis-related IRGs, which all reached a statistical significance (P < .05). Then, all the 404 patients were randomly assigned to the training cohort (n = 303) and the testing cohort (n = 101). To precisely construct a prognostic model to predict the survival of bladder cancer patients, LASSO Cox regression model was performed to identify the most optimal prognostic IRG signature of bladder cancer patients in the training cohort ( Figure S1). As the result, a prognostic IRG signature consisted of six immune-related genes was screened out, including AHNAK, OAS1, APOBEC3H, SCG2, CTSE and KIR2DS4. Then, TIMER, 17,18 a software calculating the proportion of cell types, was performed to estimate the correlation between six IRGs and immune cell infiltration. As expected, the proportion of all these seven cell types was significantly and negatively correlated with six IRGs

| Validation of the six-IRG signature for survival prediction
The prognostic value of six-IRG signature was further assessed in the testing cohort and the validation cohort. Patients were assigned to the low-risk group and the high-risk group via the optimal cut-off value of risk score in these two independent cohorts. The distributions of the six-IRG signature based on risk F I G U R E 1 Differentially expressed immune-related genes. A-C, The heatmaps visualize the differentially expressed genes in TCGA data set and differentially expressed immune-related genes (IRGs), respectively. B-D, The volcano plots visualize the differentially expressed genes in TCGA data set and differentially expressed immune-related genes (IRGs), respectively. The red nodes represent up-regulated genes. The green nodes represent down-regulated genes score, survival status and six-IRG expression profiles of testing set and validation set are shown in Figure 4A

| Association of six-IRG signature with patients' clinicopathological characteristics
To explore association between six-IRG signature and patients' clinicopathological characteristics, clinicopathological data including age, gender, stage and grade were collected from TCGA database. All the patients were assigned to the high-risk group and the low-risk group  Table 1, the risk score of six-IRG signature showed a significant relevance with age, stage and grade (P < .05). However, there was no association between the risk score and gender of patients with bladder cancer.

| Validation of the six-IRG signature as an independent prognostic factor
To explore whether six-IRG signature was a clinically independent prognostic factor in bladder cancer patients, we performed univariable and multivariable Cox regression analyses in TCGA data set. The risk score of six-IRG signature and other clinicopathological variables, including age, gender, stage and grade, were included as covariates. The result is demonstrated in Figure 5A, the six-IRG signature remained to be an independent prognostic factor even adjusted by age and other covariates in multivariable analyses.

| Construction of nomogram prognostic model based on six-IRG signature
As six-IRG signature performed a strong prognostic ability, we explored its potential to improve prognostic accuracy of bladder cancer clinicopathological risk factors. The result showed the accuracy and efficiency of prognostic risk factors for bladder cancer including age, grade and stage were improved, after incorporating six-IRG signature (Table S2). The integration of six-IRG signature and other prognostic risk factors showed higher c-indices and lower AICs than a single prognostic risk factor. Hence, we estab- The calibration plots showed nomogram did well compared with the ideal model for predicting OS at 1, 3 and 5 years ( Figure 6B-D) F I G U R E 3 Prognostic analysis of six-IRG signature in the training cohort. (A) X-tile analysis indicates red inverse association between risk score and overall survival, (B) shows distribution of risk score of six-IRG signature, (C) timedependent ROC for survival prediction models, (D) indicates Kaplan-Meier survival curves between the high-risk group and the low-risk group

| Gene set enrichment analysis (GSEA)
Gene set enrichment analysis software was performed to in-

F I G U R E 4
Prognostic analysis of six-IRG signature in the testing cohort and the validation cohort. (A-B) The distribution of risk score, OS and OS status and the heatmap of prognostic six-IRG signature in the testing cohort and the validation cohort, respectively. The dotted line indicates the cut-off point of the optimal risk score used to stratify patients into the low-risk group and high-risk group, (C-D) the survival analysis of six-IRG signature in the testing cohort and the validation cohort. Patients in the high-risk group suffered shorter survival intervals

| Genetical alteration of hub genes
The alteration statuses of six IRGs were analysed using TCGA bladder cancer patients' data of cBioPortal database. The six IRGs altered in 122 (29.5%) of 413 cases ( Figure S3B), and the frequency of alteration of each hub gene is shown in Figure S3A. APOBEC3H and OAS1 altered most (10% and 6%, respectively). mRNA upregulation and amplification were the main type. Figure S3C demonstrates the relationship of the 9 genes and the other 50 most frequently altered neighbour genes. OAS1 was significantly important in the network.

| Distribution of IRG scores in different molecular subtypes
Bladder cancer is a heterogeneous disease that is characterized by genomic instability and high gene mutation rate. According to different genomic and transcriptomic characteristics, bladder cancer can be classified into different molecular subtypes. 19 Different molecular subtypes had dramatically different clinical prognosis and respond variably to therapeutic options. To explore the relationship between IRG scores and molecular subtypes, we analysed the distribution of IRG scores in different consensus molecular subtypes, and the results showed the IRG scores had significant difference in different molecular subtypes (Figure 8). The Ba/Sq subtype with the poor prognosis had the highest IRG scores, but the best prognostic LumU and LumP also had low IRG scores.

| D ISCUSS I ON
Recently, the clinical application of immunosuppressive point inhibitors is one of the most important advances in the field of cancer treatment. But immune evasion mechanisms in the TME of advanced human cancers are highly heterogeneous. The PD-1/ PDL1 pathway is responsible for dysfunctional immunity in fewer than 40% of human solid tumours. 20,21 Except for the PD-1/PD-L1, many other molecular or cellular mechanisms also contribute to dysfunctional immunity in the TME, for example up-regulation of suppressive molecules, cytokines, metabolites and down-regulation of immune stimulatory molecules. 22,23 Korpal et al found PPARγ/RXRα pathway impairs CD8 + T cell infiltration and confers partial resistance to immunotherapies of bladder cancer. 24 A study based on 209 patients with bladder cancer showed immune-inflammation index predicts prognosis of bladder cancer patients after radical cystectomy. 25   APOBEC3H is a member of the apolipoprotein B mRNA-editing enzyme catalytic polypeptide 3 families of proteins, and it induces somatic mutagenesis in cancer cells that drive tumour evolution and may manifest clinically as recurrence, metastasis and/or therapy resistance. 31 Hence, it may be used as target for cancer therapy. SCG2 is a neuroendocrine secretory proteins, and it can regulate leucocyte, endothelial and mesenchymal cell functions. 32  In conclusion, the current study gained insight into the immunogenomic landscape of bladder cancer and identified a novel six-IRG-based prognostic model to predict survival of bladder cancer patients. The six-IRG-based nomogram had higher prognostic value than the conventional TNM stage in patients with bladder cancer.
Therefore, our research may provide a novel immune-related gene signature to estimate prognosis for patient survival with bladder cancer, and it might facilitate bladder cancer patients counselling and individualized management.

ACK N OWLED G EM ENTS
The excellent technical assistance of Yayun Fang and Danni Shan, Zhongnan Hospital of Wuhan University, is gratefully acknowledged. We would like to acknowledge the TCGA database. We also would like to acknowledge the GEO, ImmPort and TIMER databases for free use. This study was supported by grant from the Health Commission of Hubei Province Scientific Research Project (grant numbers WJ2019H013, WJ2019H023 and WJ2019H080).
The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are openly available in The Cancer Genome Atlas (TCGA) data portal (https:// tcga-data.nci.nih.gov/tcga/), Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) and ImmPort data portal (https://www.immpo rt.org).