Identity of MMP1 and its effects on tumor progression in head and neck squamous cell carcinoma

Abstract Background Head and neck squamous cell carcinoma (HNSCC) is the sixth most common malignant tumor worldwide with high morbidity and mortality. However, the diagnosis and molecular mechanisms of HNSCC remains poor. Methods Robust rank aggregation method was performed to excavate the differentially expressed genes (DEGs) in five datasets (GSE6631, GSE13601, GSE23036, GSE30784, GSE107591) from the Gene Expression Omnibus. Search Tool for the Retrieval of Interacting Genes (STRING) database extracted hub genes from the protein‐protein interaction network. The expression of the hub genes was validated using expression profile from The Cancer Genome Atlas and Oncomine database. The module analysis and disease‐free survival analysis of the hub genes were analyzed by Cytoscape and the Kaplan‐Meier curve, respectively. The expression of hub genes was verified in clinical specimens. The functions of MMP1 which is most important in hub genes were explored in vitro and in vivo. Results Totally, 235 DEGs were identified in the present study which consists of 103 up‐regulated and 132 down‐regulated genes which were significantly enriched in the molecular function of calcium ion binding followed in the biological process of skin development. The mainly enriched pathways were ECM (extracellular matrix)‐receptor interaction (hsa04512) and protein digestion and absorption (hsa04974). Six hub genes were screened out which showed dramatically increased expression in HNSCC samples compared with normal samples, including COL4A1, MMP1, PLAU, RBP1, SEMA3C, and COL4A2. These hub genes all showed worse disease‐free survival with higher expression and were up‐regulated in HNSCC clinical samples. MMP1 was proved to promote cell growth, migration, and phosphorylation of AKT in vitro and to promote liver metastasis in vivo. Conclusion Bioinformatics analysis identified six key genes in HNSCC. Of these, MMP1 is the most likely biomarker. It activates the AKT pathway and promotes tumor progression.


| INTRODUCTION
Head and neck squamous cell carcinoma (HNSCC) is the most common malignant tumor of the head and neck in the world. 1 Although the level of surgery and the efficacy of chemotherapy drugs have improved significantly in the past half century, the 5-year survival rate of HNSCC has not improved significantly, and the survival rate in patients diagnosed with advanced disease is only about 50%. 2 In addition, although increasing numbers of studies have focused on the mechanism of HNSCC formation and progression, the molecular mechanism of HNSCC is still unclear. 3 Therefore, exploring reliable and effective markers and the hub genes which are essential to the discovery of molecular mechanisms of HNSCC is urgently needed.
With the development of next generation sequencing (NGS) and chip technology, the datasets produced by these NGS and chip have been applied to exploring amounts of molecular biomarkers in different cancers, 4,5 especially for the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases. 6,7 Although it can rapidly screen out the differential expressed genes (DEGs) which have important effects on cancer progression, it may cause biased and unreliable results because of the high expenses and limited sample tissues in individual experiments. Therefore, in order to obtain accurate and reliable results, amounts of studies have been analyzed by integrated bioinformatics analysis using multiple datasets. So far, various gene chips have been used in many studies to identify key molecular factors in HNSCC and various genes. [8][9][10] For example, Feng et al screened 1459 differentially expressed lncRNAs and 2381 DEGs in advanced laryngeal squamous cell carcinoma by lncRNA and mRNA integrated microarrays 11 ; Recently, HOTTIP was found to be a key candidate biomarker by Yin et al and SERPINE1, PLAU, and ACTA1 could act as biomarkers in HNSCC detected by Yang et al via integrated bioinformatics analysis. 10,12 However, there are few reliable diagnostic biomarkers and therapeutic targets for HNSCC. In addition, the involved mechanisms in the development of HNSCC for the hub genes obtained by integrated bioinformatics analysis are not clear. Therefore, it is really urgent to discover the effective molecular biomarkers and the molecular mechanism of HNSCC which will be crucial to the diagnosis and treatment of HNSCC.
In the present study, we performed an integrated bioinformatics analysis to explore the DEGs from five GEO datasets. Furthermore, Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed to assess functional pathways of DEGs. And the hub genes were extracted from a protein-protein interaction (PPI) network and the expression of the hub genes were validated using expression profile from TCGA and Oncomine database. The Kaplan-Meier curve of these genes was used to evaluate disease-free survival of the six hub genes based on TCGA and GTEx in GEPIA. We detected their expression in clinical specimens. Finally, we focused on MMP1 and confirmed its function in vitro and in vivo.

K E Y W O R D S
differentially expressed genes, head and neck squamous cell carcinoma, hub genes, integrated bioinformatics analysis, MMP1 Ethanol, xylene, and neutral gum were from Sinopharm Chemical Reagent Co., Ltd. HE dye solution set was bought from Wuhan Servicebio Technology Co., Ltd. BALB/C nude mice were bought from Hunan SJA Laboratory Animal Co., Ltd.

| Collection of tissue specimens
Carcinoma of tonsil, laryngeal cancer, hypopharyngeal cancer, carcinoma of the buccal mucosa, tongue cancer tissues, and their adjacent normal tissues (n = 3) were obtained from patients in the Third Xiangya Hospital (Changsha, People's Republic of China).

| Data acquisition and preprocessing
All the five GEO datasets were downloaded from GEO (https://www.ncbi.nlm.nih.gov/geo/) using GEO query described by Sean and Meltzer, 14 including GSE6631, GSE13601, GSE23036, GSE30784, GSE107591. The detailed information of all the five GEO datasets is listed in Table 1. The raw microarray data of expression files were normalized and log2-transformed. DEGs were identified by the Bioconductor Limma package and then robust rank aggregation method was used to integrate and rank all of the DEGs from five GEO datasets. In addition, the edge R package was used to screen DEGs with thresholds of |log2-fold-change[FC]|>1 and the thresholds of the adjusted p < 0.05.

| GO analysis and KEGG pathway analysis
The DEGs from GEO database were analyzed by an online program Database for annotation, visualization, and integrated discovery (DAVID) (http://david.abcc. ncifc rf.gov/). 15 The GOchord R package and DAVID database were used to perform GO analysis and KEGG pathway maps with cut-off p < 0.05, respectively. 16

| Hub genes analysis
The seed genes in modules with the most connectivity referred to hub genes and TCGA data was used to perform validation using GEPIA database. 18 The analysis for expression level of hub genes between HNSCC samples (n = 519) and normal samples (n = 44) was based on GTEx data in GEPIA from TCGA. Oncomine database was used to further analyze the expression level of hub genes with clinical traits. 19 Logrank value p < 0.05 was considered to be statistically significant.

| QRT-PCR
RNA of tissues was isolated by E.Z.N.A.® miRNA Kit. Concentration of RNA was tested by Thermo® NanoDrop 2000. After reverse transcription, Roche LightCycler® 480 detected amplification for qRT-PCR. The expression of mRNAs was normalized by ACTB (beta-actin).

| Western blotting (WB)
Cells and tissues were lysed by RIPA containing PMSF and phosphatase inhibitor. Concentration of protein was tested by BCA protein assay kit. 10% SDS-PAGE gels were prepared for separating 10-30 μg total protein.
The protein was transferred to the PVDF membrane, blocked in 5% BSA for 1 h, and incubated in primary antibodies at 4°C overnight. Then, the membrane was incubated in secondary antibody for 1 h. Enhanced chemiluminescence reaction was applied to detect the protein expression.

| Cell growth curve
3 × 10 4 FaDu-CON313 and FaDu-siMMP1 cells per well were, respectively, seeded in a 24-well plate. Three wells of cells were each digested with pancreatin and counted under 10 × 10 microscope for 7 consecutive days.

| Wound-healing assay
4 × 10 5 cells per well were seeded in a 24-well plate and cultured in MEM containing 10% FBS at 37°C in an incubator with 5% CO 2 for 24 h. Then, 200-μl pipet tips were used to scrape them. Cells were then cultured in MEM containing 1% FBS. Images of wounds were captured after 0 and 24 h.

| Transwell migration assay
5 × 10 5 cells per well were seeded in transwell chambers with 8.0 μm pore membrane. 100 μl MEM without FBS was in the upper chambers and 500 μl MEM with 10% FBS was in the lower wells. After being cultured at 37°C for 24 or 48 h, cells were fixed by methanol and dyed by crystal violet. All figures were captured under 10 × 10 microscope. Five fields were collected for each chamber, and 3 chambers for each group.

| AKT inhibiting and colony formation assay
AKT inhibitor VIII (AKTi) could inhibit Akt1, Akt2, and Akt3 activity. The final concentration of AKTi in the experimental group was 10 μM. For wound-healing assay, cells were pretreated with AKTi for 12 h before being scraped. For transwell migration assay, the upper chambers were added with AKTi. For colony formation assay, 2 × 10 3 cells per well were seeded in a 24-well plate. Cells were treated with 10 μM AKTi for 12 h. After being totally cultured for a week, cells were fixed by methanol and dyed by crystal violet.

| In vivo subcutaneous tumor model
200 μl PBS dissolving 2 × 10 6 FaDu-CON313 or FaDu-siMMP1 cells were respectively injected into two male 4-6-week-old BALB/C nude mice under the skin of armpit to obtain tumor tissues. These tumors were cut into about 1 mm 3 masses of tissue. The other eight male 4-6-week-old BALB/C nude mice were randomly divided into two groups (n = 4). Tumor tissues derived from FaDu-CON313 and FaDu-siMMP1 were, respectively, buried under dorsal skin. These eight nude mice were executed after 4 weeks to compare volume of tumors and anatomized to estimate situation of metastasis. These nude mice were euthanized through cervical dislocation.

| HE staining
Paraffin sections were dewaxed by xylene and ethanol and rinsed with water. Sections were stained by hematoxylin solution, differentiated by hematoxylin differentiation solution, and blued by hematoxylin Scott's tap bluing. Then, the sections were dyed with eosin. Finally, sections were dehydrated by ethanol and xylene, and observed with microscope inspection.

| Statistical analysis
All statistical analyses in the present study were calculated using SPSS 19.0 (SPSS Inc.). Cells of migration, density of WB bands, and area of colonies were calculated by ImageJ 1.52a. All bar charts and cell growth curve were drawn by GraphPad Prism 5. All data are presented as mean ± standard deviation (SD). Statistical significance between two groups was evaluated by Student's t-test between two groups. p < 0.05 was statistically significant.

| Identification of DEGs in HNSCC among five GEO datasets
In order to explore the DEGs in HNSCC, five GEO datasets containing 170 normal samples and 276 HNSCC samples were downloaded, including GSE6631, GSE13601, GSE23036, GSE30784, and GSE107591 (Table 1). Finally, 235 DEGs were identified with 103 up-regulated and 132 down-regulated DEGs after reprocessing for raw microarray data of expression files in the five GEO datasets. The top 20 significant differentially up-regulated and down-regulated genes from five GEO datasets are shown in Figure 1.

| GO and KEGG analysis for DEGs
The up-and down-regulated DEGs were used for GO analysis by GOChord R package, respectively. The DEGs were significantly enriched in various GO categories of molecular function (MF), biological process (BP), and cellular component (CC). And the top 12 and 11 GO terms for up-and down-regulated DEGs are listed in Table S1.  (Figure 2A,B). KEGG analysis was performed in order to know the biological pathways of the DEGs. The DEGs with up-regulation were most significantly enriched in extracellular matrix (ECM)-receptor interaction (hsa04512), Protein digestion and absorption (hsa04974) and Amoebiasis (hsa05146) ( Figure 2C and Table S2) and the DEGs with downregulation were most significantly enriched in Retinol metabolism (hsa00830), Drug metabolism-cytochrome P450 (hsa00982), and Chemical carcinogenesis (hsa05204) ( Figure 2D and Table S2).

| Construction of PPI network and module analysis
Interaction networks for different proteins will help us to systematically understand the molecular mechanisms in the development of HNSCC. PPI networks were constructed using STRING database and all the nodes without connections will be removed from PPI network. The genes with the most highly connected cluster were extracted by MCODE plug-in in Cytoscape after PPI networks were analyzed. The DMNC and MCC algorithms of Cytoscape were used to identify hub genes. The top 10 hub genes based on the two methods were screened, and six mutual hub genes were isolated, including COL4A1, MMP1, PLAU, RBP1, SEMA3C, and COL4A2 ( Figure 3). These six genes all significantly up-regulated in HNSCC samples compared with normal samples were used for further analysis.

| Hub genes analysis
In order to better know the expression pattern of the hub genes, the expression of the hub genes was analyzed using the TCGA database which contains 519 HNSCC tumor samples and 44 normal samples. As Figure 4A showed, the expression of all the hub genes showed a significantly increase in HNSCC tumor samples when compared with normal samples (p < 0.05). These results were also confirmed by  Figure 4B). In addition, the disease-free survival rate of COL4A1, MMP1, PLAU, RBP1, SEMA3C, and COL4A2 were analyzed using Kaplan-Meier curve. As Figure 4C shows, HNSCC patients with high transcripts per million for all the hub genes showed poorer survival rates. These results demonstrated that the six hub genes may play crucial roles in the development of HNSCC.

| Hub genes were upregulated in HNSCCs
We detected the expression of these six hub genes in carcinoma of tonsil, laryngeal cancer, hypopharyngeal cancer, carcinoma of the buccal mucosa, and tongue cancer. These genes were all expressed higher in these cancer tissues than their adjacent tissues ( Figure 5A), and the fold-changes in these genes are shown in Table 2. Among the six hub genes, the difference in MMP1 expression between cancer tissues and adjacent tissues was the highest. Thus, we further detected the protein expression of MMP1 in these clinical samples.

| Knockdown of MMP1 inhibited cell growth, migration, and phosphorylation of AKT in vitro
We used siMMP1 lentivirus and controlling lentivirus to stably transfect FaDu cells. The WB results demonstrated that infection of siMMP1 lentivirus can decrease the expression of MMP1 ( Figure 6A). The fold-change of MMP1 in FaDu-siMMP1 was 0.002 ± 0.001. Subsequently, we detected cell growth of FaDu-siMMP1 and FaDu-CON313 as control for consecutive 7 days. The cell growth curve indicated that after knockdown of MMP1, cells grew much more slowly during the whole week ( Figure 6B). We then conducted wound-healing assay and transwell assay. The wound-healing assay showed that FaDu-siMMP1 crept in a shorter distance than FaDu-CON313 ( Figure 6C). The transwell assay revealed that migrated cells of FaDu-siMMP1 were less than FaDu-CON313 ( Figure 6D). The percentage area of migrated cells was 22.03 ± 1.71% in a field after silencing MMP1, while that of the control group was 88.60 ± 2.73%. The absence of MMP1 can attenuate cell growth and migration.
We conducted WB to identify the pathway affected by the knockdown of MMP1. AKT and p-AKT (Ser473) both expressed lower in FaDu-siMMP1 than FaDu-CON313. But the expression of p-AKT (Ser473) was more inhibited than total AKT in FaDu-siMMP1 cells ( Figure 6A). The fold-changes of AKT and p-AKT (Ser473) in FaDu-siMMP1 were 0.594 ± 0.0001 and 0.282 ± 0.00009. MMP1

| AKT inhibition attenuated cell growth and migration
We then explored the relationship between AKT inhibition and the reduced ability of proliferation and migration. For colony formation assay, the area of colonies in the experimental group was significantly bigger than that in the control ( Figure 7A). The area of experimental group was 15.96 ± 1.73% of the area of control group. But the difference in colony numbers between the two groups was not statistically significant. The wound-healing assay showed that the presence of AKT inhibitor decreased the creeping distance of FaDu cells ( Figure 7B). Similarly, the transwell assay also demonstrated that inhibition of AKT attenuated the ability of migration. The area of FaDu cells passed through pores accounted for 7.33 ± 3.21% in a field after AKT being inhibited, while the area of migrated untreated FaDu cells accounted for 27.01 ± 8.28% ( Figure 7C). These results indicated that AKT inhibition can also attenuate tumor progression as same as silencing MMP1. MMP1 activated AKT pathway to promote proliferation and migration of hypopharyngeal cancer cells.

| Absence of MMP1 decreased tumor migration but not growth in vivo
We conducted xenograft tumor models to explore the function of MMP1 in vivo. The subdermal tumors derived from FaDu-siMMP1 were not statistically different from tumors in the FaDu-CON313 group ( Figure 8A). The volume of FaDu-CON313 group and FaDu-siMMP1 group was 32.56 ± 10.48 and 37.07 ± 14.16 mm 3 . However, the liver metastasis in the FaDu-CON313 group was more severe than that in the FaDu-siMMP1 group. There were much more macroscopical gray metastases in the livers of FaDu-CON313 group ( Figure 8B). But the swelling mesenteric lymph nodes could be observed both in the FaDu-CON313 group and FaDu-siMMP1 group ( Figure 8B). To deeper understand the situation of multiple organ metastases, we conducted HE staining of lungs, livers, and swelling mesenteric lymph nodes in both groups. Shown in Figure 8C, metastatic necroses were more in FaDu-CON313 group compared to FaDu-siMMP1 group in

| DISCUSSION
HNSCC is a malignant tumor that could lead to high mortality rate with an extremely poor 5-year survival rate. Although amounts of molecular biomarkers have been identified, the biomarkers which were truly used for the diagnosis, prognosis in HNSCC are still scarce. In the present study, five high-quality GEO datasets were selected to excavate the hub genes and the biological pathways of the hub genes involved in HNSCC by integrated bioinformatics analysis. Finally, 103 up-regulated and 132 down-regulated genes were identified. Six hub genes were screened out which showed dramatically increased expression level in HNSCC samples, including COL4A1, MMP1, PLAU, RBP1, SEMA3C, and COL4A2. These hub genes all showed worse disease free survival with higher expression level. Besides, the expression of MMP1 was found to be the most significantly differential. Our study demonstrated that MMP1 attenuated cell growth, migration, and phosphorylation of AKT. These results indicated that these hub genes, especially MMP1, might participate in regulating the progression of HNSCC and could be potential molecular biomarkers in the precise diagnosis of HNSCC. The GO analysis revealed that DEGs were significantly enriched in MF of calcium ion binding (GO:0005509) and BP of skin development (GO:0043588). Yang et al demonstrated that the DEGs were significantly enriched in protein binding and calcium ion binding by meta-analysis of DEGs in osteosarcoma based on microarray data. 20 However, there is no study related to HNSCC. In this study, the expression level of the genes related to calcium ion binding in patients with HNSCC, such as PTHLH, MMP10, LTBP1, MMP9, NELL2, MMP3, MMP13, MMP12, and MMP1 were significantly altered, indicating that these genes might be crucial to HNSCC development related to calcium ion binding. Meanwhile, KEGG pathway analysis revealed that the upregulated DEGs were significantly enriched in ECMreceptor interaction (has04512) and Protein digestion and . Previous studies have shown that ECM remodeling not only promotes cancer development but also indicates poor prognosis in HNSCC patients. 21 The GO and KEGG pathways analysis of the DEGs provide a solid foundation for revealing the BPs and mechanisms involved in the development of HNSCC.
In line with the GO and KEGG analysis, six genes were considered to be hub genes in PPI network, and their expression characteristics were also verified by TCGA database. So far, increasing numbers of studies have demonstrated that the expression of collagen family members, such as COL1A1 and COL1A2, was abnormal in several cancers, such as breast and lung cancers. [22][23][24] PLAU has already been demonstrated to be involved in metastasis and immune escape for different tumors. 25 Recent studies showed that PLAU may act as a biomarker of HNSCC based on integrated bioinformatics analysis. 9,10 Vaitkienė et al identified that SEMA3C had a potential as a prognostic marker for glioma, breast, and oral neoplasia, respectively. 26,27 Recently, Hui and Liu et al considered that SEMA3C was associated with poor prognosis in cervical cancer and might act as a therapeutic target in prostate and other cancers. 28,29 Chen et al demonstrated that the expression of RBP1 is up-regulated in tongue squamous cell carcinoma. 30 Many studies have proved that MMP1 is associated with tumor progression in different kinds of cancers. For example, MMP1 was highly expressed in colorectal cancer tissues and its high expression was related to lymphatic metastasis and TNM stage. The experiments in vitro and in vivo demonstrated that knockdown of MMP1 attenuated tumor growth and migration. 31 In esophageal squamous cell cancer, MMP1 was expressed higher in tumor tissues than their adjacent normal tissues. Its high expression was associated with lymph node metastasis, microvessel density, and advanced TNM stage. The experiments in vitro and in vivo also proved that overexpression of MMP1 promoted cell growth, migration, and colony formation. 32 MMP1 regulated by ETV4 was also found to promote cell migration and invasion in nonsmall-cell lung cancer. 33 In triple-negative breast cancer, MMP1 regulated by YB-1 can promote invasion. 34 In addition, MMP1 was proved to activate PI3K-AKT pathway and epithelial-mesenchymal transition. 31,32 In cervical squamous cell carcinoma, high expressed MMP1 was associated with poor prognosis and negatively related to the amount of T cells and macrophages infiltration. 35 MMP1 might be a biomarker for immunotherapy and prognostic judgment in cervical squamous cell carcinoma. In HNSCC, there were several reports about MMP1 and its effects on tumor progress. Lallemant et al reported that MMP1 shows high sensitivity and specificity for serving as a diagnostic marker in HNSCC. 36 A research about human oral squamous cell carcinoma showed that MMP1 was expressed higher in tumor than normal tissue and its higher expression was positively correlated with a shorter overall survival rate. 37 Wang et al also found that MMP1 3'UTR can sponge miR-188-5p to facilitate the proliferation and migration of human oral squamous cell carcinoma. 38 Consistently, our results showed that MMP1 was significantly increased in HNSCCs and its high expression was positively correlated with poor prognosis. However, few research has focused on MMP1 expression and effect in hypopharyngeal cancer specifically. Our experiment in vitro showed that the absence of MMP1 in hypopharyngeal cancer cells attenuated cell growth and migration. And MMP1 was also found to promote liver metastatic necrosis of hypopharyngeal cancer in vivo. Interestingly, our experiments in vivo did not show slowly growing tumor after decreasing MMP1 expression. Ignoring the individual differences of the nude mice, we speculated that cells proliferated more in the control group than the silencing MMP1 group, but the increased cells contributed to liver metastasis instead of growing in situ. In this study, we verified that MMP1 promoted tumor progress and could serve as a biomarker in HNSCC, especially in hypopharyngeal cancer.

| CONCLUSION
In the present study, we have identified six hub genes involved in HNSCC using integrated bioinformatics analysis (COL4A1, MMP1, PLAU, RBP1, SEMA3C, and COL4A2). These hub genes all showed worse disease-free survival with higher expression in an online database. Then, we verified that these hub genes were expressed higher in our clinical samples of HNSCC. We demonstrated that MMP1 can facilitate cell growth and migration in vitro by phosphorylating the AKT pathway. Furthermore, we proved that MMP1 promotes liver metastasis in vivo. Thus, MMP1 might serve as a biomarker for HNSCC.