Differential coexpression networks in bronchiolitis and emphysema phenotypes reveal heterogeneous mechanisms of chronic obstructive pulmonary disease

Abstract Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease with multiple molecular mechanisms. To investigate and contrast the molecular processes differing between bronchiolitis and emphysema phenotypes of COPD, we downloaded the GSE69818 microarray data set from the Gene Expression Omnibus (GEO), which based on lung tissues from 38 patients with emphysema and 32 patients with bronchiolitis. Then, weighted gene coexpression network analysis (WGCNA) and differential coexpression (DiffCoEx) analysis were performed, followed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes enrichment analysis (KEGG) analysis. Modules and hub genes for bronchiolitis and emphysema were identified, and we found that genes in modules linked to neutrophil degranulation, Rho protein signal transduction and B cell receptor signalling were coexpressed in emphysema. DiffCoEx analysis showed that four hub genes (IFT88, CCDC103, MMP10 and Bik) were consistently expressed in emphysema patients; these hub genes were enriched, respectively, for functions of cilium assembly and movement, proteolysis and apoptotic mitochondrial changes. In our re‐analysis of GSE69818, gene expression networks in relation to emphysema deepen insights into the molecular mechanism of COPD and also identify some promising therapeutic targets.

The two classical clinical phenotypes of COPD are bronchiolitis and emphysema. 4 Patients with bronchiolitis display persistent inflammation, goblet cell hyperplasia and mucin hyperexpression in the airway, 5 increasing intraluminal mucus, wall muscle fibrosis and airway stenosis. 6 Emphysema involves elastolytic destruction of the alveolar wall without obvious fibrosis and loss of normal lung tissue. 7 COPD patients with either the bronchiolitis or the emphysema phenotype differ in their clinical characteristics and treatment response. Emphysema patients seem to present worse pulmonary function and greater dyspnoea than bronchiolitis patients. 8 Long-acting beta-agonist and inhaled corticosteroid treatment leads to greater improvement in lung function and dyspnoea in obstruction-dominant COPD patients than in emphysemadominant patients. 9 This suggests that the bronchiolitis and emphysema phenotypes of COPD occur via different mechanisms, leading to different treatment response and clinical outcome.
We know from gene expression and polymorphism studies that COPD is a polygenic disease involving multiple pathogenic signalling pathways. [10][11][12] Whether and how the disease processes differ between the emphysema and bronchiolitis phenotypes of COPD remains to be investigated.
Large-scale gene expression analysis using systems biology may help to address this question. Weighted gene coexpression network analysis (WGCNA) identifies correlations among genes across microarray samples, so it can detect clusters (modules) of highly correlated genes. It can also summarize such clusters using the module eigengene or an intramodular hub gene, it can relate modules to one another and to external clinical traits, and it can measure module membership. 13 Building on WGCNA, the method known as differential coexpression (DiffCoEx) can identify changes in gene correlation patterns. 14 Using differential gene expression analysis, Faner et al 15 characterized genes differentially expressed between COPD patients with bronchiolitis or emphysema. They found that several B cell-related genes were up-regulated in patients with emphysema but not in patients with bronchiolitis. We felt that simple up-and down-regulation of differential gene expression were unlikely to capture the full complexity of COPD pathology, so we used WGCNA and DiffCoEx to re-analyse the GSE69818 gene expression data set at the Gene Expression Omnibus database. Our aim was to begin to identify genes differentially coexpressed between the bronchiolitis and emphysema phenotypes of COPD as a way to develop promising biomarkers for diagnosis and as a way to identify molecular pathways involved in each phenotype. Data for the 49 386 probes representing 20 958 genes in this microarray were extracted using ENSEMBL Gene ID. When the expression level of a gene was measured using multiple probes, the expression level was calculated by averaging the level for all the probes. Gene expression was normalized using the robust multiarray average algorithm implemented in the limma package in R software. 19 Genes differentially expressed between the two COPD phenotypes were identified using the RankProd package in R. The difference in expression had to be associated with P < .05 in order to qualify for further analysis. In the end, 8150 differentially expressed genes (DEGs) were analysed further.

| WGCNA
Comparability was assessed in terms of gene expression levels and connectivity. Connectivity was measured using the soft connectivity function in the WGCNA package, which drew on data from 5000 randomly selected genes. A WGCNA was constructed in the WGCNA package, and cluster analysis was performed using the flash clust function in the flash Clust package. Modules were identified using the cutreeHybrid function in the Dynamic Tree Cut package.
Modules are groups of genes that have similar patterns of connection strengths with all other genes of the network and that usually share similar functions. 20 The module eigengenes function in the WGCNA package was used to identify module eigengenes (MEs), and then, correlations between MEs and individual genes were assessed using signed module membership (MM). These correlations were used to determine whether a given gene belonged to a given module.

| Differential coexpression analysis
The differential coexpression of genes between bronchiolitis and emphysema was analysed using the DiffCoEx algorithm, which operates differently from WGCNA. Briefly, unsigned Pearson correlation matrices called adjacency matrices were calculated separately for the microarray results from patients with the emphysema or bronchiolitis phenotype, and differences between these matrices were calculated. A topological overlap matrix was calculated from the matrix of correlation change. To find modules of genes differing in similar ways between the two phenotypes, genes were clustered by average hierarchical clustering, using the topological overlap matrix as a distance metric. Modules were defined using the 'hybrid' method of dynamic tree cutting. The tree was cut at a height of 0.93, requiring a minimum cluster size of 20 genes, and modules whose eigengenes branched at a height ≤0.2 were merged.

| Gene function analysis and the protein-protein interaction (PPI) network
Genes and hub genes in each module were functionally analysed using the clusterProfiler algorithm for gene ontology (GO) and pathway enrichment according to the Kyoto Encyclopedia of Genes and F I G U R E 1 Hierarchical clustering of module hub genes that summarize the modules yielded in the clustering analysis. Branches of the dendrogram (the meta-modules) represent together hub genes that are positively correlated Genomes (KEGG). 21 Interactions among genes were described using STRING 22 and displayed using CytoScape. 23

| Identification of gene coexpression networks and modules
Of the 70 patients with COPD, the clinical information of the samples involving GOLD and DLCO was summarized in Supplementary   Table S1  were consistent with those identified previously. 15 The strongest association between module and trait was between the pink module and DLCO_S (cor = 0.48, P = 2E-05; Supplementary Figure S2).
Other significant correlations were found between the pink module and emphysema (cor = 0.34, P = .004) and GOLD grade 4 (cor = 0.33, P = .003). Therefore, the pink and tan modules were analysed further.

| Functional enrichment and WGCNA clustering
Given the similarity among genes within each module, the genes differentially expressed between the bronchiolitis and emphysema phenotypes in the pink and tan modules were functionally annotated and analysed for pathway enrichment. The top 20 GO and KEGG F I G U R E 3 Gene ontology (GO) annotation of genes in the (A) pink and (B) tan module. The significant GO terms that conformed to P < .05 were screened. The x-axis shows the number of entries enriched for specific GO-BP: the longer the column, the greater the numbers of enriched entries. The y-axis represents specific GO-BP entries. Column colour is used to map the p value of specific functional items. Stronger red colour of the column indicates smaller P value of the corresponding enrichment, thus more significant enrichment terms were extracted for further analysis, with a threshold of P < .05.

| Modules differentially coexpressed between emphysema and bronchiolitis phenotypes of COPD
Of the 8150 genes analysed by DiffCoEx, 1001 were assigned to one of five modules, which were arbitrarily assigned colours ( Figure 4).
The blue and brown modules, containing 307 genes, were significantly more highly correlated with each other in patients with the bronchiolitis phenotype than in patients with the emphysema phenotype. The opposite was observed for the turquoise module, containing 606 genes. In this way, the modules showed differential coexpression patterns between emphysema and bronchiolitis.
The differential coexpression network was also visualized using Cytoscape ( Figure 5).

| Functional and pathway enrichment analysis of DiffCoEx
The biological significance of the modules was assessed using gene set enrichment analysis with clusterProfiler. 21 In the biological process category, the following four GO terms were enriched significantly in the turquoise module: cilium assembly, epithelial cilium movement involved in determination of left/right asymmetry, proteolysis and apoptotic mitochondrial changes (Table 1). Then, we identified IFT88, CCDC103, MMP10 and Bik as hub genes of the above four GO items, respectively. Moreover, FOXJ1 and RFX3, which interacted with many cilia-related genes in our gene interaction network diagram (Supplementary Figure S4), magnify the induction of cilia-associated genes. 24 The blue module, in contrast, was enriched for genes involved in cell division, mitotic spindle midzone assembly, DNA replication initiation and G2/M transition of mitotic cell cycle ( Table 1). The blue module was also significantly involved in three pathways (Supplementary Table S3 Table S3). Hub genes in each module were listed in Table 2.

| D ISCUSS I ON
The present study identified coexpression of several B cell-related genes enriched in the tan module, which positively correlate with emphysema, which is consistent with the work of Faner et al 15 and other groups that implicates antigen and immune processes in COPD pathogenesis. 25,26 Moreover, we also identified other vital genes using gene interaction network analysis (Supplementary Figure S3). For example, CD79A increases with increasing emphysema severity, 27 and targeting the CD79A interacting genes CXCL13 and CXCR5 causes a decrease in the immune responses of allergic airway inflammatory process. 28 This should be explored in greater detail in future studies.
In addition, we focused on differential coexpression rather than differential expression in an attempt to identify functional gene modules with different coexpression signatures in the two COPD phenotypes. Thus, we performed DiffCoEx analysis to identify transcriptome differences between the two phenotypes. In this approach, genes are clustered into modules according to how similarly their expression differs between the two phenotypes, and then, significant differences among these modules can be identified.
Ciliary function abnormalities have been associated with smoking and reduced mucociliary clearance in COPD, 29 contributing to F I G U R E 4 Differentially coexpressed modules between bronchiolitis and emphysema: comparative correlation heatmap. The upper diagonal of the main matrix shows a correlation between pairs of genes among the emphysema (red colour corresponds to higher correlations; yellow, lower correlations). The lower diagonal of the heatmap shows a correlation between the same gene pairs in bronchiolitis. Colour bars at the edge of the square show modules in which genes coexpressed differently between emphysema and bronchiolitis (second column); darker colours indicate higher mean expression levels increased inflammation and infection. 30 Motile multi-ciliated cells, the dominant cell type in the airway epithelium, perform the critical function of transporting mucus and entrapped inhaled environmental contaminants up from the airways. 31,32 In the human airway epithelium,  Figure S4). Prior work has shown that in the large airway epithelium, cigarette smoking in healthy individuals and COPD is associated with shorter cilia than healthy non-smokers. 33 A number of proteins related to IFT have been shown in model systems to F I G U R E 5 Differential coexpression network between emphysema and bronchiolitis. Dots in different colours represent genes in different modules. Edges shows different correlation between two genes; only Pearson δR differences (δR = |Remph|−|Rbronch| ) reaching 0.7 are shown. Red lines show stronger correlation in emphysema than in bronchiolitis; green lines show stronger correlation in bronchiolitis directly affect cilia length. 34 For instance, cells carrying homozygous IFT88 hypomorphic mutation fail to grow cilia with normal length. 35 The results of our DiffCoEx identify IFT88 as a hub gene enriched in cilium assembly of the turquoise module, and we observed the interaction between IFT88 and FOXJ1 in the gene interaction network diagram in that module (Supplementary Figure S4). Previous research has also suggested that FOXJ1 overexpression reversed the CSEmediated inhibition of cilia growth. 36 Whether and how these two genes work together to maintain the normal length of cilia remains to be further studied. Moreover, IFT88 is part of the intraflagellar transport machinery and helps maintain cilia motility. 37,38 Knockout of IFT88 or the other IFT-B subunits IFT20 and IFT38 results in a 'no cilia' phenotype. 39,40 CCDC103, a hub gene enriched in epithelial cilium movement involved in determination of left/right asymmetry, is an oligomeric coiled-coil protein that binds tightly to the ciliary axoneme and facilitates dynein attachment to it, thereby promoting ciliary motility. Patients with loss-of-function mutations in CCDC103 have largely static cilia. 41 Our results suggest that future studies should TA B L E 1 GO enrichment analysis in differential coexpression modules  46 In fact, we failed to identify coexpressing hub genes related to normal cilium assembly and movement and promotion of mucous cell apoptosis in patients with the bronchiolitis phenotype of COPD, which may contribute to cough and expectoration in affected individuals.
Imbalance of proteases plays an important role in the development and progression of COPD. The MMP10 gene, which we identified here as a proteolysis related hub gene, is a potential COPD biomarker. 47 It promotes emphysema development by influencing the proteolytic and inflammatory activities of macrophages. 48 Knocking out the MMP10 gene in mice strongly attenuates emphysema-like lung injury induced by cigarette smoke. 49 Future study should examine MMP10 as a therapeutic target in emphysema.
Cigarette smoke may contribute to COPD pathogenesis by interfering with the removal of apoptotic cells (efferocytosis), 50 and active RhoA inhibits efferocytosis by inducing the formation of stress fibres and focal adhesions as well as by inducing cell spreading. 51 It has been reported that apoptotic cells are abundant in animal models of emphysema. 52,53 In addition, the RhoA/ROCK pathway may up-regulate inflammatory genes via NF-κB, 54 and blockade of ROCK inhibits NF-κB activation and production of proinflammatory cytokines. 55

| CON CLUS IONS
This study showed that emphysema and bronchiolitis phenotypes of COPD are different in transcriptional level. Immune-related processes, cilium assembly and movement, proteolysis, apoptotic mitochondrial changes, neutrophil degranulation and RhoA/ROCK signalling play a role in the pathological process of emphysema. Hub genes related to these biological processes may be biomarkers of emphysema. These findings should be verified and extended in future experimental work.

CO N FLI C T O F I NTE R E S T S
The authors declare that they have no conflict of interests.

DATA AVA I L A B I L I T Y S TAT E M E N T
All data used to support the findings of the current study are available from Gene Expression Omnibus (https ://www.ncbi.nlm.nih.gov/ geo/query/ acc.cgi?acc=GSE69818).

E TH I C S A PPROVA L A N D CO N S E NT TO PA RTI CI PATE
Not applicable.