Identification of key genes and upstream regulators in ischemic stroke

Abstract
Introduction: Ischemic stroke (IS) causes severe neurological impairments and physical disabilities and carries a high economic burden. Our study aims to identify the key genes and upstream regulators in IS by integrated microarray analysis.
Methods: An integrated analysis of microarray studies of IS was performed to identify the differentially expressed genes (DEGs) in IS compared to normal controls. Based on these DEGs, we performed functional annotation and constructed a transcriptional regulatory network. Quantitative real‐time polymerase chain reaction (qRT‐PCR) was performed to verify the expression of DEGs.
Results: From two Gene Expression Omnibus datasets, we obtained 1526 DEGs (534 up‐regulated and 992 down‐regulated genes) between IS and normal controls. Functional annotation showed that Oxidative phosphorylation and Alzheimer's disease were significantly enriched pathways in IS. The four transcription factors (TFs) with the most downstream genes were PAX4, POU2F1, ELK1, and NKX2‐5. The expression of six genes (ID3, ICAM2, DCTPP1, ANTXR2, DUSP1, and RGS2) was measured by qRT‐PCR. Except for DUSP1 and RGS2, the genes showed the same expression pattern in qRT‐PCR as in our integrated analysis.
Conclusions: The dysregulation of these six genes may be involved in the process of IS. Four TFs (PAX4, POU2F1, ELK1, and NKX2‐5) were inferred to play a role in IS. Our findings provide clues for exploring the mechanism of IS and developing novel diagnostic and therapeutic strategies.


| INTRODUCTION
Stroke is one of the leading causes of serious long-term disability and mortality worldwide. It is characterized by symptoms such as inability to move or feel one side of the body, problems speaking or understanding, and dizziness (Donnan, Fisher, Macleod, & Davis, 2008; Dorrance & Fink, 2015). Ischemic stroke (IS) is one of the main types of stroke and is characterized by cerebral ischemia. Survivors of IS can be left with severe neurological impairments and physical disabilities, which carry a high economic burden (Evers et al., 2004; Volny, Kasickova, Coufalova, Cimflova, & Novak, 2015). Up to now, thrombolysis is the only efficacious treatment for IS (Chi & Chan, 2017). Despite recent progress, little is known about the underlying pathophysiological mechanisms of IS, and much work is still needed to elucidate them fully.
With high-throughput genetic analysis, gene expression profiling has become an effective method to identify differentially expressed genes (DEGs) in a variety of diseases, which helps to explore pathogenesis and develop biomarkers (Alieva et al., 2014; Du, Yang, Tian, Wang, & He, 2014; Kong et al., 2018). Because individual microarray studies differ in their samples and platforms, an integrated analysis of multiple microarray studies, with its larger combined sample size, can identify more accurate profiles of DEGs than a single microarray study. Exploring the upstream transcription factors (TFs) that mediate abnormal gene expression in disease can help to understand pathophysiological changes in complex diseases (Li, Dani, & Le, 2009; Zhao, Wang, Xu, Li, & Yu, 2017).
Our study aims to make an integrated analysis of multiple IS microarray data to obtain the key DEGs in the pathogenesis of IS.
Functional annotation and construction of an IS-specific transcriptional regulatory network were performed to explore the biological functions of DEGs, which will hopefully provide clues for exploring the mechanism of IS and developing novel diagnostic and therapeutic strategies.
The inclusion criteria for this study were: (a) the type of dataset was described as "expression profiling by array"; (b) the dataset was a whole-genome mRNA expression profile by array; (c) the dataset was obtained from blood samples of IS patients and a normal control group (no drug stimulation or transfection); and (d) the dataset was normalized or original. Two sets of mRNA data for IS (GSE22255 and GSE16561) were selected. In GSE22255, gene expression profiling was performed in peripheral blood mononuclear cells of 20 IS patients and 20 sex- and age-matched controls using Affymetrix microarrays. In GSE16561, total RNA was extracted from whole blood of 39 IS patients and 24 healthy control subjects.

| Identification of DEGs between IS and normal controls
MetaMA, an R package, was used to combine data from the multiple microarray datasets and to obtain the individual p-values. The Benjamini & Hochberg method was used for multiple comparison correction to obtain the false discovery rate (FDR). DEGs in IS compared to normal controls were obtained with FDR < 0.05. The heat map of the top 100 DEGs was generated with the pheatmap package.
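The FDR step above can be sketched in a few lines. The following is a minimal, illustrative Python implementation of the Benjamini & Hochberg adjustment, not the MetaMA or R code used in this study. The function names `benjamini_hochberg` and `select_degs`, and the gene labels and p-values in the usage note, are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        scaled = pvalues[idx] * m / rank
        running_min = min(running_min, scaled)
        adjusted[idx] = running_min
    return adjusted


def select_degs(pvalues_by_gene, threshold=0.05):
    """Keep genes whose adjusted p-value is below the threshold (assumed cutoff)."""
    genes = list(pvalues_by_gene)
    fdr = benjamini_hochberg([pvalues_by_gene[g] for g in genes])
    return {g: q for g, q in zip(genes, fdr) if q < threshold}
```

For example, `select_degs({"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.005})` adjusts the four made-up p-values jointly and keeps every gene whose adjusted value falls under the 0.05 cutoff.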

| Functional annotation of DEGs
Using GeneCodis3 (http://genecodis.cnb.csic.es/analysis), gene ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed to uncover the biological functions of DEGs. FDR < 0.05 was considered statistically significant.
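To illustrate what an enrichment p-value measures, the sketch below computes a one-sided hypergeometric (over-representation) probability for a single GO term or KEGG pathway. This is a common choice for such analyses, but it is an assumption here: the exact statistic and settings applied by GeneCodis3 are not stated in this section. The function name `hypergeom_enrichment_p` and its parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, selected, overlap):
    """Probability of seeing at least `overlap` annotated genes by chance.

    background: number of genes in the reference set
    annotated:  reference genes annotated to the term or pathway
    selected:   number of DEGs submitted for analysis
    overlap:    DEGs that are annotated to the term or pathway
    """
    upper = min(annotated, selected)
    total = comb(background, selected)
    # Sum the hypergeometric probabilities for overlap, overlap + 1, ..., upper.
    tail = sum(
        comb(annotated, k) * comb(background - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

The resulting per-term p-values would then be corrected for multiple testing before applying the FDR < 0.05 cutoff.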

| Construction of IS-specific transcriptional regulatory networks
Based on the integrated analysis, the promoters of the top 20 up-regulated or down-regulated DEGs were obtained from the UCSC Genome Browser (http://genome.ucsc.edu). The TFs involved in regulating these DEGs were derived with the Match tool in TRANSFAC. The

| Validation of the expression of DEGs by qRT-PCR
Five adult patients with IS and five normal controls were enrolled in our study from the First Affiliated Hospital of Shantou University Medical College. Patients with neurological diseases, cardiac embolism, transient ischemic attack, hemorrhagic infarction, occult cerebrovascular malformation, or traumatic cerebrovascular disease were excluded. Normal controls without a history of stroke, head trauma and surgery, heart surgery, or neurological disease were included in this study. All subjects fasted for 12 hr, and blood samples were then collected by venipuncture at 7:00-8:00 the next morning. The patient demographics, clinical features, and risk factors were extracted from medical records and are displayed in Table 1. Table 2.

| DEGs in IS
Two datasets (GSE22255 and GSE16561) were downloaded from GEO (Table 3). Samples of GSE22255 and GSE16561 were obtained from participants in Portugal and the USA, respectively. Compared with the normal controls, 1526 DEGs in IS were obtained with FDR < 0.05, of which 534 genes were up-regulated and 992 genes were down-regulated. The top 20 DEGs between IS and normal controls are displayed in Table 4. The heat map of the top 100 DEGs is displayed in Figure 1.

| Functional annotation of DEGs
According to the GO enrichment analysis with FDR < 0.05, apoptotic process (FDR = 1.25E-12), respiratory electron transport chain
regulatory network maps were built, which consisted of 22 nodes and 26 edges (Figure 4).

| Validation of the expression of DEGs by qRT-PCR
Based on our integrated microarray analysis of the GEO datasets, six DEGs (ID3, ICAM2, DCTPP1, ANTXR2, DUSP1, and RGS2) were selected for confirmation by quantitative real-time polymerase chain reaction (qRT-PCR) (Figure 5).

TABLE 4 The top 20 DEGs in IS
in the stress response, neural plasticity, and neural circuitry (Avecilla, Doke, & Felty, 2017). ID3 was down-regulated in the peripheral blood of IS patients and was identified as a candidate gene that can accurately detect IS (O'Connell et al., 2016). The diagnostic robustness of the 10 identified candidate genes (ID3 is one of them) was subsequently confirmed in an independent patient population, and the results further suggest that the expression pattern is temporally stable over the first 24 hr of stroke pathology (O'Connell, Chantler, & Barr, 2017). In this study, ID3 was down-regulated in the blood of patients with IS in both the bioinformatics analysis and the qRT-PCR validation. Our results provide further evidence that ID3 may be an important biomarker for the diagnosis of IS.
Intercellular adhesion molecule-2 (ICAM2) belongs to the ICAM family of adhesion proteins. ICAM2 is expressed in vascular endothelial cells and blood cells, and plays a key role in cell-cell interactions during humoral immunity (Lyck & Enzmann, 2015). ICAM2 also promotes neutrophil binding to and migration through vascular endothelium as a component of immune reactions (Huang et al., 2006). Platelet-leukocyte aggregation and platelet activation are elevated in IS patients, and ICAM2 is an important gene in regulating the interaction of platelets with leukocytes.
Herein, ICAM2 was down-regulated in the blood of patients with IS in both the bioinformatics analysis and the qRT-PCR validation.
The transcriptional regulatory network showed that ICAM2 was co-expressed with PAX4, ELK1, and NKX2-5. Therefore, we speculated that ICAM2 may be involved in the occurrence of IS.
ANTXR cell adhesion molecule 2 (ANTXR2), also known as CMG2, encodes a 55-kDa type I transmembrane protein that serves as capillary morphogenesis protein 2 (Youssefian et al., 2017). ANTXR2 is also known as the main receptor of the anthrax toxin (Scobie, Rainey, Bradley, & Young, 2003). ANTXR2 was up-regulated in the peripheral blood of IS patients and was identified as a candidate gene that can accurately detect IS (O'Connell et al., 2016). In this study, both the bioinformatics analysis and the qRT-PCR validation showed that ANTXR2 was up-regulated in IS patients. ANTXR2 was co-expressed with PAX4 and ELK1 in the transcriptional regulatory network. These findings indicate that ANTXR2 may play a pivotal role in IS.
(Nicholls, 2008). IS and AD, despite being distinct disease entities, share numerous pathophysiological mechanisms such as those mediated by inflammation, immune exhaustion, and neurovascular unit compromise (Lucke-Wold et al., 2015). Therefore, we speculated that the Oxidative phosphorylation and AD pathways may play an important role in the pathogenesis of IS.
In conclusion, our study identified several DEGs, TFs, and pathways in IS, which provide clues for understanding its pathology and developing diagnostic and therapeutic targets. The new DEGs, TFs, and pathways obtained in our integrated analysis suggest that integrated microarray analysis is a useful way to uncover the molecular mechanisms of disease. However, this study has limitations. The number of samples for qRT-PCR confirmation was small. Further research with a larger sample size is needed to confirm our findings and to explore the precise role of key DEGs in IS.

ACKNOWLEDGMENTS
We thank Beijing Medintell Bioinformatic Technology Co., LTD for assistance in data analysis.

CONFLICT OF INTEREST
None.

DATA AVAILABILITY STATEMENT
The dataset supporting the conclusions of this article is included within the article.