Multi‐omics of the expression and clinical outcomes of TMPRSS2 in human various cancers: A potential therapeutic target for COVID‐19

Abstract Growing evidence has shown that Transmembrane Serine Protease 2 (TMPRSS2) not only contributes to severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) infection, but is also closely associated with the incidence and progression of tumours. However, the correlation between coronavirus disease (COVID‐19) and cancers, and the prognostic value and molecular function of TMPRSS2 in various cancers, are not fully understood. In this study, the expression, genetic variations, correlated genes, immune infiltration and prognostic value of TMPRSS2 were analysed in many cancers using different bioinformatics platforms. The findings revealed that the expression of TMPRSS2 was considerably decreased in many tumour tissues. In the prognostic analysis, the expression of TMPRSS2 was closely linked with clinical outcomes in brain, blood, colorectal, breast, ovarian, lung and soft tissue cancers. In the protein network analysis, we identified 27 proteins as protein partners of TMPRSS2, which may regulate the progression and prognosis of cancer mediated by TMPRSS2. In addition, a high level of TMPRSS2 was linked with immune cell infiltration in various cancers. Furthermore, according to the pathway analysis of differentially expressed genes (DEGs) associated with TMPRSS2 in lung, breast, ovarian and colorectal cancer, 160 DEGs were found, and these were significantly enriched in respiratory system infection and tumour progression pathways. In conclusion, the findings of this study demonstrate that TMPRSS2 may be an effective biomarker and therapeutic target in various human cancers, and may also provide new directions for protecting specific tumour patients from SARS‐CoV‐2 infection during the COVID‐19 outbreak.


| INTRODUCTION
Cancer is one of the most serious and prevalent diseases in humans, and its morbidity and mortality rates are among the highest in the world. The 2020 cancer report estimated 19.3 million new cancer cases and 10.0 million cancer-associated deaths. 1 Owing to the growing and ageing world population, the global cancer burden is expected to reach 22.2 and 28.4 million new cases in 2030 and 2040, respectively, if current trends continue. 1,2 In recent years, great efforts have been made in cancer prevention, screening, early detection, standardized treatment and regular follow-up. However, the world still bears a large tumour burden owing to the unclear pathogenesis of most tumours and the lack of effective biomarkers. 3 It is therefore crucial to explore the pathogenesis of tumours extensively and to identify effective screening markers that can be exploited as therapeutic targets.
Since the first cases of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection occurred in December 2019, the number of global coronavirus disease (COVID-19) cases has increased steadily. As of 1 November 2021, the global COVID-19 pandemic had led to 246,929,884 confirmed cases and 5,003,404 deaths across 188 countries/regions. 4 There is clear evidence that patients with co-morbidities are more susceptible to COVID-19 and are more likely to develop complications and die after infection. 5,6 During the epidemic, owing to ageing, decreased immunity, delays in diagnosis, treatment and follow-up, surgery, radiotherapy and chemotherapy, and multiple tumour-related co-morbidities, cancer has been identified as an individual risk factor for COVID-19. 6-9 A cohort study from China showed that the prevalence of COVID-19 in cancer patients was considerably higher than the overall incidence in the general population (1% vs. 0.29%). 10,11 In addition, various studies have reported that cancer patients are more likely to have serious clinical outcomes after contracting COVID-19. 7,10,12-14 This suggests that attention should be paid to the internal relationship between tumours and COVID-19, which could help to prevent and control COVID-19 in cancer patients.
Transmembrane Serine Protease 2 (TMPRSS2) is a multifunctional gene that encodes a member of the serine protease family. TMPRSS2 contains four domains, that is, a protease domain, a type II transmembrane domain, a receptor class A domain and a scavenger receptor cysteine-rich domain. 15 Among them, the serine protease domain is cleaved off by autocleavage and then secreted into the cell culture medium. It thereby participates in the processing of viruses in host cells. 15,16 TMPRSS2 has been reported to contribute to the entry into host cells of human influenza viruses and of coronaviruses including SARS-CoV, SARS-CoV-2, Middle East respiratory syndrome coronavirus (MERS-CoV) and human coronavirus 229E (HCoV-229E). 17,18 Currently, modulating the expression or activity of TMPRSS2 is considered a potential intervention against human influenza viruses and coronaviruses, including SARS-CoV-2. 18 On the other hand, multiple studies have revealed that the expression of TMPRSS2 is considerably downregulated in tumour tissues compared with non-tumorous ones, and that abnormal expression of TMPRSS2 is closely related to tumour growth, invasion, metastasis and prognosis in various cancers, especially prostate cancer. 19,20 More importantly, inhibition of TMPRSS2 expression can reduce the invasion and metastasis of prostate or head and neck cancer cells, and reduce the infection of human lung Calu-3 cells with SARS-CoV-2. 21,22 In addition, a TMPRSS2 knockout mouse model in a cancer study showed that TMPRSS2 inhibition is safe and effective for the molecular therapy of tumours, with few on-target side effects. 23 Therefore, a systematic and in-depth investigation of the function of TMPRSS2 in multiple tumours and COVID-19 could pave the way for precision medicine and TMPRSS2-targeted strategies.
FIGURE 1 Expression pattern of TMPRSS2 mRNA in many cancers by Oncomine, UALCAN and GEPIA databases. (A) The expression pattern of TMPRSS2 mRNA in various cancers was searched in the Oncomine database. The graphic was generated from the Oncomine database and shows the number of datasets with statistically significant (p < 0.01) mRNA over-expression (red) or under-expression (blue) of TMPRSS2 (tumour tissue vs. corresponding normal tissue). The thresholds were set as follows: p = 0.001, fold-change = 1.5, data type mRNA. (B) The expression of TMPRSS2 mRNA in many cancers was searched in the UALCAN database. Boxes represent the median and the 25th and 75th percentiles; green and red boxes indicate normal and tumour tissues respectively. (C) The expression level of TMPRSS2 mRNA in many cancers was searched in the TIMER database. Boxes indicate the median and the 25th and 75th percentiles; blue and red boxes indicate normal and tumour tissues respectively. Blue and red dashed lines indicate the average value of normal and tumour tissues respectively. *p = 0.05, **p = 0.01, ***p = 0.001
Herein, to investigate the potential relationship between tumours and COVID-19, and to assess the expression level of TMPRSS2 and its prognostic value in different carcinomas, we systematically studied the expression of TMPRSS2 and its clinical consequences in different types of carcinoma using multiple recognized online databases. Furthermore, we examined the genes co-altered with TMPRSS2 in common cancer types and performed functional enrichment analysis. These analyses may clarify the value of TMPRSS2 expression for the survival of patients with cancer, and may give direction for preventing COVID-19 in specific tumour patients.

| Analyses of Oncomine dataset
In this study, the public web-based microarray database Oncomine (www.oncomine.org) was used to analyse TMPRSS2 transcription levels in cancer specimens and to compare them with those in healthy specimens (controls). 24 The thresholds were set as follows: fold-change = 1.5; p = 0.001; data type: mRNA.
FIGURE 2 TMPRSS2 promoter methylation in various cancers using the UALCAN database. The methylation levels of the TMPRSS2 promoter in BRCA, CESC, ESCA, HNSC, KIRC, KIRP and UCEC were considerably higher than those in normal tissue. In contrast, the methylation levels of the TMPRSS2 promoter in COAD, PRAD, READ and TGCT were slightly lower than those in normal tissue. Boxes indicate the median and the 25th and 75th percentiles; green and red boxes show normal and tumour tissues respectively. Green and red dashed lines indicate the average value of all normal and tumour tissues respectively. **p = 0.001, ***p = 0.001
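The thresholds given for the Oncomine analysis (fold-change = 1.5, p = 0.001) can be illustrated with a minimal Python sketch. The record layout, the function name and the convention that fold-change is a tumour/normal ratio (so that under-expression corresponds to a ratio of at most 1/1.5) are assumptions made for illustration; Oncomine applies its own filtering internally, and this is not its implementation.

```python
from dataclasses import dataclass

# Thresholds stated in the text: fold-change = 1.5, p = 0.001.
FOLD_CHANGE_CUTOFF = 1.5
P_VALUE_CUTOFF = 0.001


@dataclass
class Comparison:
    """One tumour-vs-normal comparison (hypothetical record layout)."""
    dataset: str
    fold_change: float  # assumed tumour/normal ratio; below 1 means lower in tumour
    p_value: float


def classify(c: Comparison) -> str:
    """Return 'over', 'under' or 'not significant' for one comparison."""
    if c.p_value > P_VALUE_CUTOFF:
        return "not significant"
    if c.fold_change >= FOLD_CHANGE_CUTOFF:
        return "over"
    if c.fold_change <= 1.0 / FOLD_CHANGE_CUTOFF:
        return "under"
    return "not significant"
```

Counting the 'over' and 'under' labels per cancer type would give the kind of summary shown in an Oncomine overview graphic.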

| Analysis of GEPIA dataset
Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn) offers interactive and customizable functions, such as profile plotting, differential expression analysis and correlation analysis, for RNA sequencing expression data from 8587 healthy and 9736 cancer samples in the Genotype-Tissue Expression (GTEx) and TCGA projects. 25 We used GEPIA to verify the differences in TMPRSS2 gene expression between healthy tissues and different types of cancer tissue. In addition, profile plotting by pathological stage or cancer type, patient survival analysis, similar gene detection, correlation analysis and dimensionality reduction analysis can be carried out with GEPIA.

| UALCAN database analysis
UALCAN (http://ualcan.path.uab.edu/index.html) is a user-friendly integrated data-mining platform for the extensive analysis of cancer OMICS data. 26 It is built on PERL-CGI and can be used to analyse gene expression, promoter methylation, prognosis and correlation. In this study, the UALCAN database was used to analyse the mRNA expression pattern and the promoter methylation profile of TMPRSS2.

| cBioPortal dataset and muTarget database analysis
TCGA (cancergenome.nih.gov/) is a comprehensive database containing both sequencing and pathological data for 30 forms of cancer. For cancer genomics, cBioPortal (http://www.cbioportal.org/) is an open-access platform, which can be utilized for the multi-functional visualization of complex cancer genomics,
FIGURE 3 Frequency of mutations, CNAs and expression in many cancers obtained from the cBioPortal web. (A) TMPRSS2 was altered in only 371 (3%) of 10,953 queried patients and 371 (3%) of 10,967 queried samples. (B) A total of 329 mutations were found within amino acids 1 to 492 of TMPRSS2. Among them, there are 7 missense mutation sites, 19 truncating sites, 11 inframe mutation sites and 242 fusion mutation sites respectively. Mutation sites were found in a hotspot in the SRCR_2 and trypsin domains. (C) In 29 cancer studies, the expression of TMPRSS2 mRNA (RNA Seq V2) was generated from the cBioPortal web. The x-axis is categorized by cancer type and the y-axis indicates the expression level of TMPRSS2 mRNA. The expression frequency revealed fusions (violet), missense mutations (green), no mutations (blue) and truncating (deep blue). (D) Among the 32 datasets, the percentage of TMPRSS2 alteration frequency varied from 0% to 42.7% in many cancers. The highest alteration frequency was found in prostate cancer (42.7%), whereas all other types of tumour exhibited a very low alteration frequency (<10%) among all of the queried cancer samples. The alteration frequency revealed fusions (violet), mutations (green) and multiple
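The alteration frequency quoted in the Figure 3 legend (371 altered cases among 10,953 queried patients, reported as 3%) is a simple proportion. The sketch below shows the arithmetic in Python; the function name is hypothetical and is not part of the cBioPortal interface.

```python
def alteration_frequency(altered: int, queried: int) -> float:
    """Percentage of queried cases carrying at least one alteration."""
    if queried <= 0:
        raise ValueError("queried must be positive")
    if not 0 <= altered <= queried:
        raise ValueError("altered must lie between 0 and queried")
    return 100.0 * altered / queried


# Counts taken from the Figure 3 legend.
per_patient = alteration_frequency(371, 10953)
per_sample = alteration_frequency(371, 10967)
print(f"{per_patient:.2f}% of patients, {per_sample:.2f}% of samples")
```

Both values round to the 3% stated in the legend.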

| PrognoScan database analysis
The online database PrognoScan (http://dna00.bio.kyutech.ac.jp/PrognoScan/) provides a platform to assess candidate tumour biomarkers and therapeutic targets. 30 In PrognoScan, the significance threshold was set at Cox p < 0.05.

| The Kaplan-Meier plotter analysis
The web-based database Kaplan-Meier plotter (www.kmplot.com) is used to study the prognostic implications of genes in various forms of carcinoma. The database comprises survival and gene expression data for 7461 samples from 21 different types of tumour. 31 We used this database to validate the prognostic value of TMPRSS2 in different types of cancer, reporting the hazard ratio (HR) with 95% confidence interval (CI) and the log-rank p-value.
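For readers unfamiliar with the survival curves produced by such tools, the Kaplan-Meier product-limit estimator can be sketched in a few lines of Python. This is a generic textbook implementation under the usual right-censoring assumptions, with hypothetical function and argument names; it is not the code used by the Kaplan-Meier plotter website and it does not compute the HR or the log-rank test.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve
```

Running the function separately on a high-expression and a low-expression group gives the two curves that a log-rank test would then compare.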

| Protein-Protein interaction analysis
The online interface GeneMANIA (https://genemania.org/) is a user-friendly data mining platform. It can be used to generate genes related to a set of input genes, based on protein and genetic interactions, pathways, co-localization, co-expression and protein domain similarity. 32 STRING (https://string-db.org/) is a database of protein-protein interactions. 33 Herein, we used both the GeneMANIA and STRING servers to explore the genes related to TMPRSS2.
kidney and liver cancer, respectively, as depicted in Figure 1A.

| Co-expressed and pathway analysis
The UALCAN, TIMER and GEPIA were used to further explore the
were relatively lower than those in normal tissue. The TMPRSS2 promoter methylation level was negatively correlated with the gene expression level, which suggests that TMPRSS2 promoter hypermethylation in various cancers may contribute to its reduced expression.
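A negative correlation between promoter methylation and expression of the kind described above can be quantified with a Pearson coefficient. The following Python sketch is illustrative only: the function name is hypothetical, the input values in the usage lines are invented toy numbers rather than data from this study, and the source does not state which correlation statistic was actually used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Toy values only: methylation beta values and matching expression levels.
methylation = [0.1, 0.2, 0.3, 0.4]
expression = [8.0, 6.0, 4.0, 2.0]
r = pearson_r(methylation, expression)  # negative when expression falls as methylation rises
```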

| TMPRSS2 genetic variation in various cancers
The gene mutations of TMPRSS2 were explored in 32 common

| TMPRSS2 gene and protein partners in various cancers
GeneMANIA web was used for the prediction of the functionally similar genes including TMPRSS2, which accumulates data on colocalization, co-expression, genetic interactions, involved cascades, prediction of physical interactions and shared protein domains, as depicted in Figure 5A. The predicted functionally similar genes of

| Prognostic analysis of TMPRSS2 in many cancers
We explored the prognostic value of TMPRSS2 mRNA expression in many types of cancer using the PrognoScan and Kaplan-Meier plotter databases. The PrognoScan database results revealed that the  Table 1). The opposite results were found in colorectal cancer datasets. The GSE17536, GSE14333 and GSE17536 datasets revealed that the patient group with an elevated level of TMPRSS2 mRNA expression had significantly better OS and disease-free survival (DFS) than the low expression group (Figure 6B and Table 1).
Analysis of the GSE31210 and GSE13213 datasets in PrognoScan revealed considerably poorer OS and RFS in lung cancer patients with low TMPRSS2 mRNA expression than in those with high expression (Figure 6C and Table 1). In ovarian cancer, the low TMPRSS2 expression group showed better OS and DFS than the high expression group according to the DUKE-OC, GSE9891 and GSE26712 (205102_at) datasets, whereas the GSE26712 (211689_s_at) dataset showed the reverse, with reduced expression associated with poor DFS (Figure 6D and Table 1).  Table 2). In addition, the Kaplan-Meier plot indicated that reduced expression of TMPRSS2 was linked with poor RFS in liver hepatocellular carcinoma, ovarian cancer, pheochromocytoma, stomach adenocarcinoma, paraganglioma, testicular germ cell tumour and uterine corpus endometrial carcinoma (Figure S3 and Table 2).
Abbreviations: 95% CI, confidence interval; HR, hazard ratio; OS, overall survival; PFS, progression-free survival.

| The association of differentially expressed genes with the expression of TMPRSS2 in many cancers
We further explored the potential signalling mechanisms and pathways linked with TMPRSS2 in four common types of cancer (breast, lung, colorectal and ovarian) using the R2 platform and the Reactome tool. A Venn diagram showed that 160 differentially expressed genes (DEGs) were shared among the four selected cancers (Figure 9A). The Reactome disease analysis revealed that these 160 co-expressed differential genes are considerably associated with the occurrence and development of diseases, involving cell cycle monitoring, cell death, DNA repair, immune response, stress response and infection (Figure 9B). Among infectious diseases, their participation in respiratory infections, such as influenza, SARS-CoV, tuberculosis and HCMV, was particularly notable (Figure 9C). In addition, the Reactome pathway analysis indicated that particularly associated genes were classified in FGFR3 point
FIGURE 9 Analysis of genes positively correlated with TMPRSS2 and their predicted pathways using the R2 and Reactome tools. (A) Venn diagram of the 160 differentially co-expressed genes positively correlated with TMPRSS2, indicating the genes shared among lung, breast, ovarian and colorectal cancers. (B) The 160 co-expressed differential genes are associated with the incidence and progression of human diseases. (C) The 160 co-expressed differential genes participate in respiratory infections such as influenza, SARS-CoV, tuberculosis and HCMV. (D) Pathway analysis of TMPRSS2 by Reactome, followed by classification based on their cascades.
Cumulative studies have reported that the entry of SARS-CoV-2 into host alveolar epithelial cells is closely related to angiotensin-converting enzyme 2 (ACE2) and TMPRSS2. 18,37 Of these, TMPRSS2 is a key molecular target that enhances the invasive capacity of SARS-CoV-2 and has been earmarked as a candidate drug target for interventions against viral pathologies. 18 Besides its role in COVID-19, TMPRSS2 contributes considerably to several physiological and pathological processes, including tumour biology. 19,20 It is worth noting that TMPRSS2 is not only closely associated with the occurrence and progression of prostate cancer, but can also be used as an attractive therapeutic target. 20 It has been reported that TMPRSS2 has a key role in many kinds of cancer in vitro and in vivo, including breast, 38
46 revealed that TMPRSS2 expression was considerably higher in normal tissues than in colorectal cancers across several existing databases. As for lung cancer, one single-nucleus and single-cell RNA sequencing study showed strong expression of TMPRSS2 in lung tissue and in cells derived from subsegmental bronchial branches, and smoking may increase susceptibility to COVID-19 by affecting the expression of TMPRSS2. 47,48 In addition, both our study and previous studies suggest that TMPRSS2, as a tumour suppressor gene, is significantly down-regulated in LUAD and LUSC tumour tissues, and that its expression has an impact on the prognosis of lung cancer. 49 As for ovarian cancer, Huang et al. observed that the oncogenic gene fusion TMPRSS2:ERG is not present in ovarian cancer. Therefore, unlike in prostate cancer, ERG cannot be used as a potential diagnostic or prognostic indicator for ovarian cancer. 40 Unfortunately, little is known about the prognostic value and molecular mechanism of TMPRSS2 in ovarian cancer. In our analysis, we found better OS and DFS in the low TMPRSS2 expression group of ovarian cancer than in the high expression group, based on the DUKE-OC, GSE9891 and GSE26712 datasets, but the specific mechanism needs further verification and discussion in the future.
To better explore the internal connection between TMPRSS2, COVID-19 and tumours, we used the GeneMANIA and STRING web servers to predict the functional protein partners of TMPRSS2. A total of 27 genes or proteins were identified as gene and protein partners of TMPRSS2, which might contribute to regulating the TMPRSS2-mediated progression of carcinoma and patient survival. The mutations and CNAs of the 27 predicted partner-coding genes were then analysed via the cBioPortal database; the results showed that these genes were altered in 38% (3225/10,953) of the queried cancer patients. Furthermore, the association between TMPRSS2 expression and the level of immune infiltration was determined; our results indicated that an elevated level of TMPRSS2 was linked with infiltration of immune cells in BRCA, COAD, LUAD and OV. These results are consistent with those of Luo et al., 42 who reported that elevated expression of TMPRSS2 mRNA was linked with elevated immune infiltration in PRAD. Next, we evaluated the genes associated with TMPRSS2 in several types of cancer, that is, lung, breast, colorectal and ovarian cancer, using the R2 platform. Our analysis showed that 160 differentially co-expressed and positively correlated genes were shared by these four common cancers. According to the pathway analysis, these genes are significantly enriched in pathways related to respiratory system infection and tumour progression; the multiple viral infection pathways and the PI3K/AKT signalling pathway are especially worthy of attention.
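The Venn-diagram step that yields genes shared by all four cancer types amounts to a set intersection. The Python sketch below shows this under stated assumptions: the function name is hypothetical, and the gene identifiers in the usage lines are placeholders, not genes reported in this study or output of the R2 platform.

```python
from typing import Dict, Set


def shared_genes(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Genes present in every per-cancer gene set (the Venn diagram centre)."""
    if not gene_sets:
        return set()
    return set.intersection(*(set(s) for s in gene_sets.values()))


# Placeholder identifiers only; real input would be the per-cancer lists of
# genes positively correlated with TMPRSS2.
correlated = {
    "lung": {"GENE_A", "GENE_B", "GENE_C"},
    "breast": {"GENE_A", "GENE_C", "GENE_D"},
    "ovarian": {"GENE_A", "GENE_E"},
    "colorectal": {"GENE_A", "GENE_B"},
}
common = shared_genes(correlated)
```

The resulting set is what would be passed on to a pathway enrichment tool such as Reactome.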

| CONCLUSIONS
In our study, the expression, promoter methylation, genetic variation, protein partners, associated genes, immune infiltration and prognostic value of TMPRSS2 were systematically analysed in many kinds of human cancer. The results indicated that TMPRSS2 expression was distinctly decreased in many tumour tissues and was significantly related to clinical outcomes in brain, blood, colorectal, breast, lung, soft tissue and ovarian cancer. The multi-omics analysis also revealed the significance of TMPRSS2 expression and the possible pathways associated with TMPRSS2 in several kinds of human cancer and in the progression of COVID-19. Hence, these results may help to establish TMPRSS2 as an effective biomarker and therapeutic target for many kinds of human cancer, and may also provide directions for the prevention of COVID-19 in particular tumour patients.

ACKNOWLEDGEMENTS
The authors would like to thank all authors who share public database data for use in our research.

CONFLICT OF INTERESTS
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
All data generated or analysed during this study are included in this article.