m6A RNA methylation regulators could contribute to the occurrence of chronic obstructive pulmonary disease

Abstract N6‐methyladenosine (m6A) RNA methylation, the most prevalent internal chemical modification of mRNA, has been reported to participate in the progression of various tumours via the dynamic regulation of m6A RNA methylation regulators. However, the role of m6A RNA methylation regulators in chronic obstructive pulmonary disease (COPD) has never been reported. This study aimed to determine the expression and potential functions of m6A RNA methylation regulators in COPD. Four gene expression data sets were acquired from Gene Expression Omnibus. Gene ontology function, Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses, weighted correlation network analysis and protein‐protein interaction network analysis were performed. The correlation analyses of m6A RNA methylation regulators and key COPD genes were also performed. We found that the mRNA expressions of IGF2BP3, FTO, METTL3 and YTHDC2, which have the significant associations with some key genes enriched in the signalling pathway and biological processes that promote the development progression of COPD, are highly correlated with the occurrence of COPD. In conclusion, six central m6A RNA methylation regulators could contribute to the occurrence of COPD. This study provides important evidence for further examination of the role of m6A RNA methylation in COPD.

In the current study, we first explored and validated key m6A RNA methylation regulators that were significantly correlated with the occurrence of COPD and determined key COPD genes. In addition, we further analysed the relevance of m6A RNA methylation regulators with these key genes.

| Data acquisition and preprocessing
Four gene array expression Series Matrix Files of small airway epithelium samples, containing GSE5058, GSE8545, GSE11906 and GSE20257, were obtained from the Gene Expression Omnibus data sets (https://www.ncbi.nlm.nih.gov/geo/). GSE5058 includes molecular profiles of 15 samples from COPD cases and 24 samples from healthy controls. 15 GSE8545 contains expression profiles of 54 samples, including 18 COPD cases and 36 healthy controls. 16 GSE11906 includes 33 samples from COPD cases and 98 from healthy controls. 17 GSE20257 contains 23 samples from COPD cases and 112 from healthy controls. 18 Their experiments were conducted on the Affymetrix Human Genome U133 Plus 2.0 Array (GPL570 platform, Affymetrix, Inc). Gene symbols of data sets were obtained using the R software and annotation packages. Subsequently, the log2 converted and standardized mRNA expression data were obtained by R 3.61 and limma package 3.40.6. Given the limited sample size of small airway epithelium in COPD, we merged and batch-normalized GSE8545 and GSE20257 into the train group using R packages (sva and limma3.40.6). Validation group 1 was obtained from the merged and batch-normalized GSE5058 and GSE11906, and validation group 2 was gained from the merged and batch-normalized four data sets.
A flow chart of this study was showed in Figure S1.

| Selection of m6A RNA methylation regulators
Twenty-four widely recognized m6A RNA methylation regulators were collated from published literature. 3,4,6,8,9 We then obtained their mRNA expression data from the train group, validation group 1 and validation group 2, respectively. Differential expression analysis for 24 m6A RNA methylation regulators between COPD cases and health controls was performed using wilcox.test. R package corrplot was used to identify the correlation between m6A RNA methylation regulators. The P value < .05 was considered as statistical significant.

| Identification of co-expression modules and key COPD genes
Firstly, the differentially expressed genes (DEGs) between COPD cases and health controls for train group, validation group 1 and validation group 2, were separately identified by the limma package 3.40.6. The P value < .05 after being corrected by false discovery rate (FDR) and |log2 fold change (FC)| > 1 were applied as the cut-off for DEGs screening. The overlapped DEGs were obtained from the above three groups.
Subsequently, we used differentially expressed genes (FDR < 0.05) obtained from train group to identify the hub COPD genes by weighted correlation network analysis (WGCNA) package on mRNA expression data of validation group 1. The detailed procedure and introduction of WGCNA were shown in WGCNA R package. 19 In brief, a weighted adjacency matrix was first constructed based on the selected soft threshold power. Subsequently, the connectivity measure per gene was calculated according to the connection strengths with other genes. Then, the module structure preservation was identified by module preservation R function. Finally, the gene expression profiles of each module were summarized by the module eigengene, and each module eigengene was regressed on COPD trait using the linear model in the limma R package. 20 Finally, key COPD genes were acquired by taking the intersection of the overlapped DEGs and hub COPD genes.

| Bioinformatic analysis
The interactions among 24 m6A RNA methylation regulators and key COPD genes were analysed using the STRING database (http:// www.strin g-db.org/) and were visualized by Cytoscape 3.7.1. 21 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment and Gene Ontology (GO) functional analyses were performed to analyse 24 m6A RNA methylation regulators and key COPD genes using R packages (clusterProfiler, org.Hs.eg.db, enrichplot, colorspace, stringi and ggplot2). The P value < .05 was applied as the cut-off.

| The Correlation of m6A RNA methylation regulators with key COPD genes
R package ggpubr was used to determine the correlation between m6A RNA methylation regulators and key COPD genes in the COPD data of validation group 2. The P value < .05 was considered as statistical significant.

| Expression of m6A RNA methylation regulators in COPD
Firstly, the differential expression analysis for 24 m6A RNA methylation regulators between the COPD samples and control samples was performed in train group, we found that 16 m6A RNA methylation regulators were significantly associated with the occurrence of COPD ( Figure 1A

| Identification of DEGs in COPD
As shown in Volcano plots ( Figure  up-regulated genes and 3 down-regulated genes were overlapped among three groups ( Figure S2D).

| Identification of co-expression modules and key COPD genes
In order to determine the modules and genes correlated with the occurrence of COPD, we performed WGCNA for 4410 genes obtained from train group according to the thresholds of FDR < 0.05 on validation group 1. The lowest soft threshold power 6, for which the scale-free topology fit index reaches 0.90, was used to conduct a hierarchical clustering tree. Subsequently, nine modules were generated, including black, blue, brown, green, greenyellow, grey, magenta, pink and yellow modules. The grey module, including the non-co-expressed genes, was not further analysed ( Figure S3A,B).
To identify modules correlated with COPD status, we analysed the relationship between each module and COPD. Form Figure S3C, we could conclude that the modules of black, green, greenyellow, magenta and yellow revealed the strongest significant correlations with COPD, indicating that genes in these modules are predominantly correlated with this disease. Based on the criteria that cor.
COPD > 0.2 and cor.module membership > 0.6, a total of 336 genes with the high connectivity in black, green, greenyellow, magenta and yellow modules were screened as hub genes ( Figure S3D-H).
Finally, 27 COPD key genes were acquired by taking the intersection of 32 overlapped DEGs and 336 hub COPD genes (Table 1).
Among the above 27 key COPD genes, except that LTF and SEMA5B expressions were notably reduced in COPD and negatively associated with COPD, the expression of other genes was notably up-regulated and positively correlated with this disease. and nucleic acid and protein binding (Table S1).

| KEGG and GO enrichment analyses
GO and KEGG pathway enrichment analyses were performed for the above 27 key COPD genes to explore potential biological processes correlated with COPD. GO analysis showed that these genes were involved in twenty two biological processes, includ-

| D ISCUSS I ON
In this study, the PPI network analyses between m6A RNA methylation regulators and key COPD genes showed that

This work was funded by Sichuan Provincial Health and Family
Planning Commission Universal Application Project (18PJ514) and Education Office of Sichuan province (17ZA0130).

CO N FLI C T O F I NTE R E S T
No conflicts of interest, financial or otherwise, are declared by the authors.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of our study are openly available in the Gene Expression Omnibus (GSE5058, GSE8545, GSE11906 and GSE20257) at https://www.ncbi.nlm.nih.gov/geo/.