Genetic alterations, RNA expression profiling and DNA methylation of HMGB1 in malignancies

Abstract The high mobility group box 1 (HMGB1) protein is a potential biomarker and therapeutic target in various human diseases. However, a systematic, comprehensive pan‐cancer analysis of HMGB1 in human cancers remains to be reported. This study analysed the genetic alterations, RNA expression profile and DNA methylation of HMGB1 in more than 30 types of tumours. HMGB1 is overexpressed in malignant tissues, including lymphoid neoplasm diffuse large B‐cell lymphoma (DLBC), pancreatic adenocarcinoma (PAAD) and thymoma (THYM). Interestingly, high expression of HMGB1 is positively correlated with a favourable survival prognosis in THYM. Finally, this study comprehensively evaluates the genetic variation of HMGB1 in human malignant tumours. Because HMGB1 is a prospective biomarker of COVID‐19, its role in THYM is highlighted.


| INTRODUCTION
Recent studies have identified high mobility group box 1 (HMGB1) as a potential biomarker for severe COVID-19. [1][2][3][4] Serum HMGB1 is significantly elevated in patients with severe COVID-19. In some circumstances, exogenous HMGB1 can promote the entry of SARS-CoV-2 into alveolar epithelial cells expressing the receptor ACE2. 2 Genetic and pharmacological inhibition of the HMGB1-AGER pathway can play an important role in blocking the expression of ACE2. HMGB1 is a multifunctional protein that plays different roles in different cell compartments. Extracellular HMGB1 is considered a damage-associated molecular pattern (DAMP) protein released in response to stress, and it serves as a central mediator of lethal systemic inflammation in tissue injury or infection. Alarmins are constitutive endogenous molecules that are released upon tissue injury and activate the immune system. [5][6][7] HMGB1 is one of the prototypical alarmins that activate innate immunity. 8 Although references to alarmins in the literature are increasing rapidly, HMGB1 remains the best characterized alarmin in health and disease. Finally, cancer is a known individual risk factor for COVID-19, and many patients affected by COVID-19 have malignant tumours. 9 During the current COVID-19 outbreak, one potential risk for cancer patients is limited access to necessary medical services. Furthermore, patients with lung cancer who are ≥60 years of age tend to be at higher risk of COVID-19 infection. 9,10 However, comprehensive pan-cancer analyses have yet to be conducted to investigate the potential impact of HMGB1 aberration in human cancers. 11,12 Here, we conducted a pan-cancer analysis of HMGB1 in malignant tumours. The most common genetic alterations were investigated in the TCGA pan-cancer analysis. Next, the expression of HMGB1 in tumour tissues and normal control tissues was compared.
Because COVID-19 is mainly transmitted through the air, respiratory tumours should be one focus. Furthermore, this study examined the genetic alterations of HMGB1 in cancer. Interestingly, COVID-19 is related to aging and inflammatory diseases, and a dysfunctional thymus may be a predisposing factor. 13,14 We report that HMGB1 plays an important role in THYM. Using bioinformatics tools, this result highlights the relationship between COVID-19 and disorders of the thymus gland.

| Gene expression analysis of HMGB1
Initially, the tumour immune-estimation resource, version 2 (TIMER2) webserver (http://timer.cistrome.org/) was used to investigate the difference in HMGB1 mRNA expression between tumour and normal tissues for the different tumours in the TCGA project. However, some tumours in the TCGA project have no or very few matched normal tissues. For these tumours, the gene expression profiling interactive analysis 2 (GEPIA2) webserver (http://gepia2.cancer-pku.cn/) was used to generate box plots comparing mRNA expression between the tumour tissues and the corresponding normal tissues from the genotype-tissue expression (GTEx) database. 15 To determine the difference in HMGB1 protein expression between tumour tissues and normal tissues, protein expression analyses were performed on the Clinical Proteomic Tumour Analysis Consortium (CPTAC) datasets using UALCAN (http://ualcan.path.uab.edu). 16 Six tumours were available: breast cancer, ovarian cancer, colon cancer, renal cell cancer, endometrial cancer and lung adenocarcinoma. UALCAN is a comprehensive and interactive web resource for analysing cancer OMICS data, including TCGA, MET500 and CPTAC. 16 Furthermore, this study investigated HMGB1 expression at different pathological stages across cancer types using the GEPIA2 stage-plot module. The cut-off value was set to 50% to separate the samples into high- and low-expression cohorts.
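The 50% cut-off described above amounts to a median split of the samples. The following Python sketch illustrates the idea only; it is not the GEPIA2 implementation. The sample IDs and expression values are made up, the function name `split_by_median` is hypothetical, and placing samples that fall exactly on the median in the low cohort is an assumption, since the tie-handling rule is not stated here.

```python
from statistics import median


def split_by_median(expression):
    """Split samples into high/low cohorts at the median (a 50% cut-off).

    Assumption: samples exactly at the median go to the low cohort.
    """
    cutoff = median(expression.values())
    high = sorted(s for s, v in expression.items() if v > cutoff)
    low = sorted(s for s, v in expression.items() if v <= cutoff)
    return cutoff, high, low


# Hypothetical expression values for six made-up samples (not real TCGA data).
toy_expression = {"S1": 5.2, "S2": 7.9, "S3": 6.1, "S4": 8.4, "S5": 4.8, "S6": 7.0}
cutoff, high_cohort, low_cohort = split_by_median(toy_expression)
print(cutoff, high_cohort, low_cohort)
```

With an even number of samples the median is the mean of the two middle values, so the two cohorts are of equal size unless ties occur at the cut-off.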

| Survival prognosis of HMGB1
GEPIA2 was also used for custom statistical analyses, such as survival analyses on a given dataset. The survival-map module in GEPIA2 was applied to generate plots for overall survival (OS) and disease-free survival (DFS). The cut-off value was 50% to separate the samples into high- and low-expression cohorts. The log-rank test was used for hypothesis testing. Using the comparison/survival module, p-values, q-values and Kaplan-Meier plots for disease-free, overall, disease-specific and progression-free survival were obtained for TCGA cases. Statistical analyses were performed using the 'survival' package with R statistical software, version 4.0.5.
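For readers unfamiliar with the log-rank test, the following is a minimal, self-contained Python sketch of the standard two-group statistic. It is illustrative only: the analyses in this study were run in GEPIA2 and with the R 'survival' package, not with this code. The function name `logrank_test` and the toy follow-up times are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value).

    events_* hold 1 for an observed event and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_total = sum(1 for ti, _, _ in data if ti >= t)         # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)   # at risk, group A
        d_total = sum(1 for ti, e, _ in data if ti == t and e)    # events at time t
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        obs_minus_exp += d_a - d_total * n_a / n_total
        if n_total > 1:
            frac_a = n_a / n_total
            variance += (d_total * frac_a * (1 - frac_a)
                         * (n_total - d_total) / (n_total - 1))
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical follow-up times (months) and event flags for two toy cohorts.
high_times, high_events = [5, 8, 12, 20, 24], [1, 1, 1, 0, 1]
low_times, low_events = [15, 22, 30, 34, 40], [1, 0, 1, 0, 0]
chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

The statistic compares observed and expected event counts in one group at each event time, so it is symmetric in the two cohorts and does not depend on which one is labelled "high".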

| DNA methylation and genetic alteration analyses
The DNA methylation level of HMGB1 was analysed using the methylation panel from the TCGA module via UALCAN. 17,18 More than 30 tumours were available for the analyses.

| Immune infiltration analysis of HMGB1
The Immune-Estimation module of the TIMER2 webserver was used to explore the association between the level of HMGB1 ex-

| Gene-related enrichment analysis
The STRING website (https://string-db.org/) was used to search for HMGB1 under the protein name section, with Homo sapiens as the organism. 20,21 The main parameters under the settings panel were set by selecting (evidence) for the meaning of network edges and (Experiments) for active interaction sources. In addition, we selected (low confidence [0.150]) for the minimum required interaction score and (no more than 50 interactors) for the maximum number of interactors. Using these settings, the top 50 HMGB1-binding proteins were identified for further analysis.
The 'Similar Gene Detection' panel on the GEPIA2 webserver was applied to obtain the top 100 HMGB1-correlated genes based on the datasets from all TCGA tumours and normal tissues.
Pearson correlation analysis of selected genes was conducted using the 'Correlation Analysis' module, which provides the p-value and the correlation coefficient. TIMER2 produced the heatmaps; these contain the p-values and partial correlations from the purity-adjusted Spearman's test. The intersection of the HMGB1-binding and HMGB1-correlated genes was determined using a Venn diagram. Finally, pathway enrichment analyses were performed using 'clusterProfiler' in R statistical software, version 4.0.5, and the bubble plots were produced with the 'tidyr' and 'ggplot2' packages.
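The intersection and enrichment steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the workflow used in this study, which relied on a Venn diagram tool and the R package 'clusterProfiler'. Over-representation analysis of this kind is commonly based on a hypergeometric test, which is what the sketch computes. The function names are hypothetical, and the placeholder gene names and all counts are made up; only HMGB2, SRSF1 and SSRP1 are taken from the text.

```python
from math import comb


def overlap(binding_genes, correlated_genes):
    """Genes present in both lists (the centre of a two-set Venn diagram)."""
    return sorted(set(binding_genes) & set(correlated_genes))


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric over-representation test.

    N: background genes, K: background genes annotated to the pathway,
    n: genes in the query list, k: query genes annotated to the pathway.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy lists: PARTNER_* and CORR_* are made-up placeholders, not real genes.
binding_toy = ["HMGB2", "SRSF1", "SSRP1", "PARTNER_X", "PARTNER_Y"]
correlated_toy = ["SSRP1", "HMGB2", "CORR_GENE_1", "SRSF1"]
shared = overlap(binding_toy, correlated_toy)
print(shared)

# Toy enrichment: 2 of 4 query genes hit a pathway of 3 genes in a 10-gene background.
p_toy = hypergeom_enrichment_p(k=2, n=4, K=3, N=10)
print(round(p_toy, 4))
```

In practice the resulting p-values would also be adjusted for multiple testing across pathways; that step is omitted here for brevity.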

| HMGB1 is overexpressed in three tumours out of 33 tumours
Initially, the expression pattern of HMGB1 was analysed across various cancer types of TCGA using TIMER2. As shown in Figure

| Overexpression of HMGB1 is linked to poor prognosis in five tumours
After examining the significant dysregulation of HMGB1 expression in different cancer types and its correlation with the pathological stage, one potential hypothesis is that this protein might be used as

| DNA methylation and genetic alteration analysis
Eleven probes in the HMGB1 promoter were used in this study to detect the DNA methylation level of HMGB1 (Figure S4). Interestingly,  Figure 3C). The most frequently observed mutation was R163*/Q; the 3D structure of the HMGB1 mutations is shown in a graphic panel (Figure 3C).
The results showed that mutations were not significantly associated with RNA expression of HMGB1 (Figure S5). Furthermore, copy number variations were also not significantly associated with HMGB1 expression (Figure S5). One possible explanation is that the upregulation of HMGB1 expression is not a direct consequence of genetic variation. Thus, we further investigated the post-translational features of HMGB1 in 33 cancers.

| Phosphorylation levels of HMGB1 in several cancers
HMGB1 phosphorylation levels were compared between normal tissues and primary tumour tissues using CPTAC datasets for four types of tumours (breast cancer, clear cell carcinoma, LUAD and UCEC). Figure S6 summarizes the HMGB1 phosphorylation sites that differ significantly from the control group: the S35 locus and the S100 locus. The S35 locus shows a significantly lower phosphorylation level in primary tumour tissues than in normal tissues for breast cancer (p = 2e-05), LUAD (p = 6e-38) and UCEC (p = 9e-06) (Figure S6). By contrast, the S100 locus shows a significantly decreased phosphorylation level only in breast cancer (Figure S6, p = 2e-04), but not in LUAD or UCEC.

| Immune infiltration analysis
As an important part of the tumour microenvironment, tumour-infiltrating immune cells were reported to be closely related to the initiation, promotion, progression or metastasis of tumours. 24,25 Furthermore, according to previous research, cancer-associated fibroblasts regulate the functions of various cancer-infiltrating immune cells. 26

| Enrichment of HMGB1-related partners
To further explore the molecular mechanism of HMGB1 in tumorigenesis, HMGB1-binding proteins and HMGB1 expression-related genes were obtained for a series of pathway enrichment analyses. First, 50 binding proteins were identified using the STRING tool, all supported by experimental evidence. The interaction network of these proteins is presented in Figure 5. Next, based on the GEPIA2 tool, the top 100 genes related to HMGB1 expression were obtained by combining all tumour expression data of TCGA. Finally,
FIGURE 5 Enrichment analysis of the HMGB1 gene. (A) Fifty proteins that bind to HMGB1 were identified using the STRING tool. In addition, 100 genes associated with HMGB1 were acquired from the TCGA database. (B) KEGG pathway analysis based on the HMGB1-binding and interacted genes. (C) An intersection analysis of the HMGB1-binding and correlated genes was conducted. (D) The cnetplot for the molecular function data in GO analysis
LUAD is the most common type among the COVID-19 patients with malignant tumours. 28,29 In addition, lung cancer patients have been confirmed to have a higher COVID-19 incidence and more severe symptoms. 28,29 Here, we demonstrated that RNA expression of HMGB1 is significantly upregulated in THYM patients but not significantly changed in LUAD and LUSC. The phosphorylation analyses using the CPTAC dataset included four cancer types. The results showed decreased phosphorylation levels at S35 and S100 in different tumours. Furthermore, compared with the normal control group, the total protein level and the phosphorylation level of HMGB1 at the S35 locus in the primary tumour were significantly lower for breast cancer, LUAD and UCEC (Figure S6, all p < 0.01). However, the total protein levels of HMGB1 were significantly higher for both ovarian cancer and colon cancer.
Although the clinical significance of these post-translational modification sites remains to be determined, the current analyses do not rule out the possibility that the significantly decreased level of  Figure S7). Furthermore, the results from the STRING and GEPIA2 analyses shared three members (HMGB2, SRSF1 and SSRP1) in the enrichment analyses of HMGB1-related partners (Figure 5C, Figure S10). These three members have been reported to be associated with lung cancers or breast cancers. [40][41][42][43][44][45]
In this study, we combined several publicly available databases to investigate the expression of HMGB1, explore its correlations with prognosis and evaluate potential mechanisms of regulation in tumour patients. We utilized the TCGA, ONCOMINE, cBioPortal, UALCAN, GEPIA and STRING databases to obtain a comprehensive understanding of the structure and function of HMGB1. Based on the correlation analysis of HMGB1 and survival prognosis using GEPIA2, overexpression of HMGB1 is significantly associated with poor prognosis in five tumours (ACC, ESCA, KICH, LUAD and PAAD), whereas it is significantly associated with better prognosis in KIRC and THYM. These results suggest that the expression of HMGB1 has the potential to serve as a prognostic biomarker and therapeutic target for cancer patients.
COVID-19 is a respiratory disease that causes severe symptoms in the lungs. However, one difference from other respiratory diseases is that its high fatality rate is due initially to thick, copious mucus in the lungs and then to the impairment of lung function. 46,47
Therefore, disease of the chest cavity caused by thymic tumours, such as THYM, could greatly promote mucus secretion or make it easier for the mucus to affect the lungs. This finding helps us understand the further impact of thoracic cavity structure and function on COVID-19, rather than focusing only on lung function. To the best of our knowledge, this is the first report linking COVID-19 and THYM through HMGB1.
In summary, the pan-cancer analysis of HMGB1 showed that HMGB1 expression was significantly related to prognosis, genetic changes, immune cell infiltration and drug sensitivity across different tumours. HMGB1 acts as a tumour promoter in most of the tumours studied and has potential as a prognostic marker. This helps us under-

ACKNOWLEDGEMENTS
I thank the numerous investigators who contributed datasets used in this manuscript and members of the Lemos lab at Harvard University for discussions.

CONFLICT OF INTEREST
The authors confirm that there are no conflicts of interest.

DATA AVAILABILITY STATEMENT
All the data used in this study were obtained from publicly available databases; the data and results analysed in this study are available on request.