PRKCB is relevant to prognosis of lung adenocarcinoma through methylation and immune infiltration

Abstract Background Lung adenocarcinoma (LUAD) is one of the tumor‐related diseases with high morbidity worldwide. Epigenetic modifications such as DNA methylation changes may involve in tumorigenesis. This study aimed to explore new biomarkers that have prognostic significance of LUAD. Methods First, we downloaded the gene expression and methylation data set from Gene Expression Omnibus. R software was then used to identify abnormally methylated differentially expressed genes (MDEGs). Next, R package Cluster Profiler was used to analyze the enrichment and pathway of the MDEGs. Analysis using STRING revealed the protein–protein interaction network. The result was then visualized by Cytoscape and obtained 10 hub genes. Afterward, they were further verified by The Cancer Genome Atlas to select candidate genes. Moreover, quantitative real‐time polymerase chain reaction (qRT‐PCR) and immunohistochemistry were used to verify the expression and prognostic value of candidate genes in LUAD patients. Results The results showed that the expressions of ADCY5 and PRKCB are indeed related to LUAD. The clinical relevance to PRKCB was confirmed by its clinical correlation analysis. Gene set enrichment analysis (GSEA) and tumor immune estimation resource (TIMER) tumor immune correlations showed that PRKCB is involved in the cancer‐related Kyoto Encyclopedia of Genes and Genomes pathway and is involved in immune infiltration. It was also verified by qRT‐PCR and immunohistochemistry that PRKCB was lowly expressed in LUAD patients and correlated with prognosis. Conclusions PRKCB is relevant to prognosis of LUAD through methylation and immune infiltration.


INTRODUCTION
Lung cancer can be divided into small-cell lung cancer (SCLC, about 15%) and non-small-cell lung cancer (NSCLC, $85%) according to histopathology. Lung adenocarcinoma (LUAD) is one of the histological subtypes of NSCLC, and its incidence has increased significantly in recent years. 1 LUAD is usually diagnosed in advanced stages. However, if diagnosed early, survival rate of LUAD patients can be greatly extended. Therefore, to reduce the mortality of LUAD, effective early Jinjie Wang and Muqi Shi contributed equally to this work. identification methods and related biomarkers are urgently needed. Currently, in addition to low-dose tomography widely used in lung cancer screening and postoperative monitoring, potential biomarkers such as autoantibodies, complement fragments, microRNAs, circulating tumor DNA, DNA methylation, and blood protein profiles have attracted widespread attention. 3 DNA methylation is a form of epigenetic modification that can change genetic performance without changing the DNA sequence. This is one of the current research hotspots in tumor and molecular biology. 4 Recent studies have shown that changes in the methylation pattern of tumor cells can be divided into two types: hypomethylation of oncogenes leads to activation of oncogenes, and increased levels of DNA methylation in specific regions cause inactivation of tumor suppressor genes. 5 Because DNA methylation often occurs in lung cancer, 6 we sought to discover new DNA methylation biomarkers in LUAD patients, which may become a prognostic factor for LUAD patients.
With the advantage of big data networks, convenient and public databases such as Gene Expression Omnibus (GEO) 7 and The Cancer Genome Atlas (TCGA), 8 which contain gene expression levels and methylation characteristics of various tumors and normal samples, makes it possible to select the most detectable organism from a large number of potential markers.
In this study, we sought to identify genes that are abnormally methylated in LUAD through systematic bioinformatics analysis, which is a more accurate analysis of huge biological and genomic data. To increase the persuasive power of this study, we verified by quantitative real-time polymerase chain reaction (qRT-PCR) that PRKCB was indeed downregulated in LUAD tissues. At the same time, we further verified by immunohistochemistry (IHC) that LUAD patients with low PRKCB expression level had worse overall survival (OS). In short, PRKCB may act on LUAD patients through methylation and immune infiltration.

Identification of methylated differentially expressed genes (MDEGs) and functional analysis
The raw data of GSE118370 were preprocessed and normalized by using the affy package under the R environment (https://www.r-project.org/). After pretreatment, we used the limma package to identify genes differentially expressed in LUAD tissues with jlogFCj >1 and adjusted p value <0.05. Meanwhile, data of GSE118370, which related to the methylation expression level of genes, were first standardized and normalized in the R environment using the wateRmelon package. Next, we took β value >0.2 and adjusted p value <0.05 as the standard of abnormal methylation. Finally, we used the online Wayne diagram to cross-contrast the DEGs with abnormally methylated to obtain MDEGs. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of MDEGs were performed by R package cluster profiler based on org.Hs.eg.db database. [9][10][11][12] The GO and KEGG analysis results were visualized using the enrichplot and the GOplot package. 13 GO analysis includes biological processes (BP), molecular functions (MF), and cellular components (CC).

Construction of protein-protein interaction (PPI) network
Potential relationships between MDEGs were identified by the online STRING (https://www.string-db.org/), which is a database that uses bioinformatics to predict the PPI network. 14 Next, Cytoscape (https://cytoscape.org) was used to visualize the PPI and further find hub genes. 15 The top 10 hub genes were then identified using the cytoHubba plugin and the Maximal Clique Centrality method. In addition, core modules of the PPI network with degree cutoff = 2, node score cutoff = 0.2, and K-core = 2 were selected by the plug-in Molecular Complex Detection (MCODE) in Cytoscape software.

Expression and methylation levels of hub genes in TCGA
The Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn) combined with TCGA database was used to further confirm the expression levels of hub genes between LUAD and normal tissues. 16 The online The University of Alabama at Birmingham Cancer data analysis portal (UALCAN) (http://ualcan.path.uab.edu) was also used to confirm the methylation levels of hub Genes between LUAD and normal tissues combined with the TCGA database. 17 In addition, cBioPortal for Cancer Genomics (https://www.cbioportal.org) was used to further analyze the correlation between expression and methylation of hub genes.

Survival and prognosis analysis
GEPIA was used to evaluate the relationship between the expression of candidate genes in LUAD and survival rate. With the help of the TCGA database, OS of LUAD patients can be assessed based on the different expression levels of each gene.

Analysis of clinical relevance of candidate genes in TCGA
All gene expression data (594 cases) and corresponding clinical information of LUAD can be downloaded from the TCGA official website. Samples with incomplete clinical information were exclued when investigating clinical relevance.

Gene set enrichment analysis of PRKCB
Genes was classified by gene set enrichment analysis (GSEA) according to the degree of differential expression in two types of samples, and then checks whether the top or end of the preset list is enriched with the preset gene set. 18 GSEA first, generated a ranked table of all genes according to the correlation between the expression of all genes and PRKCB, and second, divides the expression of PRKCB into high expression group (PRKCB-h) and low expression group (PRKCB-l) according to the median. At last, GSEA was then performed to clarify the significant survival differences between PRKCB-h and PRKCB-1. Each analysis was performed 1000 times. The nominal p value and normalized enrichment score (NES) were used to rank each phenotype enrichment pathway.

Correlation analysis of PRKCB and immune cell infiltration
Tumor immune evaluation resources (TIMER) (https:// cistrome.shinyapps.io/timer) is a free online website based on the TCGA database that uses statistical methods to detect the infiltration of immune cells in tumor tissues and its impact on the prognosis of patients. 19 The immune cells in this study included CD4 + T cells, CD8 + T cells, B cells, macrophages, neutrophils, and dendritic cells.

Clinical samples collection
We collected 20 tumor tissues (tumor) and adjacent normal tissues (normal) from LUAD patients, which were taken from the Affiliated Hospital of Nantong University and stored at À80 C for subsequent RNA extraction. In addition, we retrospectively studied a tissue microarray (TMA) of 60 tissues of LUAD patients who underwent surgical treatment at the Affiliated Hospital of Nantong University from January 2010 to June 2017. We extracted the clinical characteristics of the LUAD patients from the medical record, including age, gender, smoking history, differentiation, and pathological TNM stage. This experimental protocol was approved by the Ethics Committee of the Affiliated Hospital of Nantong University.

RNA extraction and qRT-PCR
We selected 20 pairs of LUAD tumor tissues and paired normal lung tissues clinically, extracted total RNA from them with TRIzol reagent (Life Technologies) and transcribed to complementary DNA (cDNA) using the PrimeScript RT reagent kit (Takara). Finally, qRT-PCR was used to analyze the expression levels of candidate genes in tissues. We set the reaction conditions as follows: incubate at 95 C for 2 minutes and then perform 45 cycles at 95 C for 5 seconds and 60 C for 30 seconds. The analysis software (Eppendorf) displayed the cycle threshold of each reaction. The GAPDH gene served as an internal control. The primers of PRKCB were as follows: forward 5 0 -CGTCCTCATTGTCCTCGTAA-3 0 and reverse 5 0 -TGTC TCATTCCACTCAGGGTT-3 0 .

IHC
IHC method was used to detect the expression level of PRKCB in paraffin-embedded LUAD specimens. The samples were first incubated with rabbit anti-PRKCB antibody (1:200), and then incubated with goat anti-rabbit secondary antibody (1:500) for secondary staining. Finally, a microscope (Leica DMR 3000; Leica Microsystem) was used to capture images of each slice at a magnification of 200-fold. PRKCB (brown)-positive staining is mainly located in the cytoplasm. It was scored according to staining intensity and percentage of PRKCB-positive tumor cells. The median was used as the cutoff value for high or low PRKCB expression.

Statistical analysis
The Wilcoxon signed-rank test and logistic regression were used to analyze the relationship between clinicopathological characteristics and candidate genes. Cox regression was used to assess the clinicopathological features associated with overall survival in TCGA patients. Multivariate Cox analysis was used to compare the effect of candidate gene expression F I G U R E 4 The expression of 10 hub genes through GAPIA database (left column reflects tumor data and right column reflects normal data, *p < 0.05) on survival and other clinical characteristics. The hazard ratio referred to the risk of death in LUAD patients as the value of each risk factor increased. Statistical significance of qRT-PCR was determined using Student's t-test. Relationships between PRKCB expression and clinicopathological characteristics were evaluated using the χ2 test or Fisher's exact test. Kaplan-Meier method was used to construct OS curve, and log-rank test was used to analyze the difference between the curves.

Identification of MDEGs
To find genes with differential expression of LUAD, we first downloaded GSE118370, which contains all gene expression datasets from LUAD tissues and paired normal lung tissues, from GEO. We identified 2085 significant DEGs (301 upregulations, 1784 downregulations) (Figure 1(a),(b)) in LUAD from GSE118370. At the same time, the GSE139032 data was processed for the methylation data of LUAD to further obtain 780 methylation differential genes (Figure 1(c)) in LUAD. Next, the genes screened from the two gene sets described above were jointly imported into a Venn diagram (Figure 1(d)). This resulted in 124 overlapping MDEGs.

GO and KEGG analysis of MDEGs
Next, we sought to find out the common biological function of these 124 MDEGs. Therefore, we used R package Cluster Profiler for GO and KEGG analysis. As we can see from Figure 2 activity, and DNA-binding transcription factor binding. In addition, KEGG pathway analysis showed that 124 MDEGs were concentrated in the calcium signaling pathway, circadian entrainment, salivary secretion, parathyroid hormone synthesis, secretion and action, and long-term depression (Figure 2(b)).

Construction of PPI network
To find out key genes, we used an online STRING platform to identify potential protein interactions between these MDEGs. The resulting PPI network graph contains 123 nodes and 97 edges. We used Cytoscape software to visualize the PPI network (Figure 3(a)). Through the Cytoscape plugin, we identified the first 10 hub genes: RYR2, ADCY4, ADCY5, DRD5, PRKCB, NMUR2, ADRA2C, DRD4, SOX17, and FGF2. Meanwhile, the entire network was then analyzed using Cytoscape's MCODE plugin, which identified three subnetworks (Figure 3(b)-(e)).

Expression and prognostic value in TCGA database
To further verify the expression levels of candidate genes, we performed differential expression analysis using GEPIA online tool. It can be seen from the boxmap in Figure 4 that the expression of ADCY4 (fold change = 0.36), ADCY5 (fold change = 0.36), ADRA2C (fold change = 0.28), FGF2 (fold change = 0.24), PRKCB (fold change = 0.62), RYR2 (fold change = 0.19), and SOX17 (fold change = 0.30) in LAUD was lower than that in normal lung tissues. That is, these seven candidate genes may be suppressors in LUAD oncogene. There was no statistical difference in the expression level of the remaining three genes in the tumors and normal tissues. Hypermethylation in the DNA promoter region is an important regulatory mechanism for tumorigenesis, which is widely present in a variety of tumor suppressor genes. Therefore, we speculated that the downregulation of the expression of these seven genes in LUAD may be related to DNA promoter hypermethylation. We used the UALCAN to (fold change = 1.78, p < 1E-12), and SOX17 (fold change = 3.07, p < 1E-12) promoter regions in LUAD tissues were significantly higher than those in normal tissues ( Figure 5). The methylation levels of ADCY4, ADCY5, FGF2, PRKCB, and SOX17 were negatively correlated with transcriptional expression ( Figure 6). Next, we used the TCGA database to study the relationship between candidate gene expression levels and clinical characteristics. We performed univariate analysis of the clinicopathological characteristics and the five candidate genes to further screen prognostic genes (Figure 7(a)). Meanwhile, multivariate Cox analysis indicated that FGF2 and PRKCB were independent prognostic factors (Figure 7(b),(c)). Figure 7€ showed that the median survival time of patients with low expression of PRKCB was about 40 months, whereas the median survival time of patients with high expression of PRKCB was about 55 months. Therefore, we can conclude that highly expressed PRKCB in LUAD patients had better survival results (log rank p = 0.0014). Meanwhile, highly expressed FGF2 did not show the survival advantage (log rank p = 0.74) (Figure 7(d)). At the same time, we can see from Figure 8(a) that the expression of PRKCB in stage I was higher than that in stage II and stage III. The expression of PRKCB in T1 was significantly higher than that in T2 and T3 (Figure 8(b)). The expression of PRKCB in N2 was lower than that in N0 (Figure 8(c)). Although the median expression of PRKCB in M1 was lower than that in M0, it had no statistical significance (p > 0.05, Figure 8(d)). Therefore, we can preliminarily conclude that the expression of PRKCB was lower in the advanced stage than in the early stage.

Identification of PRKCB-Related signal paths with GSEA
To explore the potential mechanism of PRKCB in LUAD, KEGG was analyzed by GSEA method. As shown in Figure 9, genes related to high expression of PRKCB were concentrated on NSCLC, pathways in cancer, B cell receptor signaling, T cell receptor signaling, VEGF signaling pathway, and so on. In contrast, the PRKCB low expression related gene sets were rich in Huntington's disease, oxidative phosphorylation, purine metabolism, pyrimidine metabolism, and base excision repair. Taken together, these results suggested that PRKCB was indeed involved in the cancerrelated KEGG pathway.

Correlation between PRKCB and tumor infiltrating immune cells
It can be seen from GSEA that PRKCB expression was related to immune cell receptor signaling, and we used TIMER software to analyze the relationship between PRKCB expression and tumor infiltrating immune cells The PRKCB-related lung adenocarcinoma gene set enrichment analysis (GSEA) was identified by TIMER software. The nominal p-value and normalized enrichment score (NES) were used to rank each phenotypic enrichment pathway (Figure 10(a)). PRKCB expression and CD4 + T cells (cor = 0.582), CD8 + T cells (cor = 0.478), B cells (cor = 0.6), macrophages (cor = 0.42), neutrophils (cor = 0.605), and dendritic cells (cor = 0.657) had significant correlations. Among them, the expression of PRKCB is more correlated with neutrophils and dendritic cells. In addition, the expression of PRKCB was combined with the expression of each immune cell to analyze its influence on the prognosis of LUAD patients. We found that LUAD patients with high expression of PRKCB combined with high expression of B cells had better prognosis. Meanwhile, LUAD patients with high expression of

PRKCB was less expressed in LUAD tumor tissues
We used qRT-PCR to detect the messenger RNA (mRNA) expression of PRKCB in 20 pairs of LUAD tumors and adjacent normal lung tissues to confirm the role of PRKCB in LUAD. As shown in Figure 11(a), the expression of PRKCB in tumor tissues was significantly lower than that in normal lung tissues (p < 0.05). This result was consistent with the TCGA database ( Figure 4).

In LUAD patients, low PRKCB expression was associated with poor clinical prognoses
We conducted an immunohistochemical study on the tumor specimens of 60 patients with LUAD. Figure 11(b),(c), respectively showed representative images of PRKCB positive and negative staining. According to the median IHC score, the expression can be divided into high PRKCB group (25 cases) and low PRKCB group (35 cases). We next evaluated the relationship between PRKCB IHC scores and clinical characteristics in LUAD patient specimens. The expression of PRKCB had a significant correlation between age and differentiation, but had no obvious correlation between gender, smoking, or pathological TNM stage (Table 1). Survival analysis showed that patients with low expression of PRKCB in the TMA cohort had a lower overall survival rate (Figure 11(d)). This result was consistent  III-IV 14 8 6 with the TCGA database (Figure 7). Univariate and multivariate Cox analyses shown in Table 2 indicated that PRKCB gene expression level is an independent protective factor for LUAD patients.

DISCUSSION
With the impact on air pollution and smoking, the incidence and mortality of lung cancer continue to rise. LUAD is the most common histological subtype of NSCLC. It is often diagnosed at advanced stage because of its absence of obvious symptoms. This year, epigenetic modification has received increasing attention, especially in cancer-related research. DNA methylation, one of the epigenetic modifications, controls cell proliferation, differentiation, and apoptosis in eukaryotes, and directly or indirectly controls tumorigenesis. 20 With the continuous study of gene promoter methylation, we not only have a new understanding of the mechanism of tumorigenesis, but also identified useful biomarkers through changes in gene DNA methylation, providing new methods for the diagnosis and treatment of diseases. 21 Therefore, an in-depth understanding of tumor suppressor genes associated with LUAD will be of great value of early clinical diagnosis and treatment of this disease.
In this study, we freely obtained a LUAD gene expression profile (GSE118370) and a LUAD methylation expression profile (GSE139032) from GEO. R software packages were used to analyze LUAD tissue and normal samples. This study was aimed to identify potential biomarkers related to aberrant methylation of LUAD to contribute to the early diagnosis and treatment of LUAD patients. In these two data, 124 overlapping genes were found, that is, 124 aberrantly methylated genes. To further analyze overlapping genes, we used R package Cluster Profiler for functional and pathway analysis. The results of GO analysis indicate that these genes have something to do with the regulation of calcium ion homeostasis and transcription activator activity. Meanwhile, KEGG pathway analysis indicated that overlapping genes are mainly concentrated in calcium signaling pathway. We constructed a PPI network with 123 nodes and 97 edges, and selected the first 10 genes as the central genes using CytoHubba, including RYR2, ADCY4, ADCY5, DRD5, PRKCB, NMUR2, ADRA2C, DRD4, SOX17, and FGF2. We used the TCGA database for further verification. As a result, ADCY4, ADCY5, FGF2, PRKCB, and SOX17 were obtained. They are all hypermethylated and underexpressed in LUAD. Next, we used TCGA database analysis to show that PRKCB and FGF2 are associated with the clinical prognosis of LUAD and are independent prognostic factors. Moreover, the effect of PRKCB and FGF2 on the OS rate of LUAD was analyzed by GEPIA. We can conclude that the low expression of PRKCB in LUAD patients has a poor prognosis. Next, we further studied the GSEA pathway of PRKCB in LUAD through the TCGA database. The results show that PRKCB is indeed involved in cancerrelated pathways and immune cell receptor signaling pathways. TIMER verified that PRKCB is associated with immune cell infiltration in LUAD. In addition, different PRKCB gene expression levels combined with different immune cell contents have an impact on the prognosis of LUAD patients. We speculate that the expression of PRKCB may affect OS by regulating the degree of immune cell infiltration in LUAD. This suggests that PRKCB may play a role in LUAD through methylation and immune cell infiltration.
PRKCB is a member of the protein kinase C (PRKC) family, which is composed of several serine/threonine kinases and can be activated by calcium and a second messenger diacylglycerol. 22 It can be concluded from previous studies that PRKCB plays multiple roles in cell life and survival, especially in regulating cell survival and apoptosis. 23 Studies have found that PRKCB promoters are hypermethylated in a variety of adenocarcinomas. 24 At the same time, research indicates that PRKCB may regulate its expression in NSCLC through the Wnt signaling pathway. 25 In summary, based on comprehensive data processing and analysis, we found that PRKCB is highly methylated and lowly expressed in LUAD and is associated with immune cell infiltration. At the same time, the survival prognosis of LUAD patients with high PRKCB expression is better. Subsequent experiments such as qRT-PCR and IHC have also preliminarily verified this conclusion. Therefore, PRKCB is relevant to prognosis of LUAD through methylation and immune infiltration. Moreover, because gene methylation modifications usually occur in the early stage of cancers, the methylation change of the PRKCB gene may occur in the early stage of LUAD, which may have certain value for the early diagnosis of LUAD patients. 26 Of course, the current research is still limited, and further research is needed. The expression of PRKCB is accompanied by a large number of immune cell infiltrations, which means that PRKCB may play an important role in the tumor microenvironment of LUAD by regulating the tumor infiltration of immune cells. 27 This provides new ideas for future antitumor immunotherapy and anti-drug resistance in LUAD.