Bioinformatic analysis identifies key transcriptome signatures in temporal lobe epilepsy

Abstract Aims To identify transcriptome signatures underlying epileptogenesis in temporal lobe epilepsy (TLE). Methods Robust rank aggregation analysis was used to integrate multiple microarrays in rodent models of TLE and determine differentially expressed genes (DEGs) in acute, latent, and chronic stages. Functional annotation and protein‐protein interaction analysis were performed to explore the potential functions of the DEGs and identify hub genes with the highest intramodular connectivity. The association between hub genes and hippocampal sclerosis/seizure frequency was analyzed using publicly available RNA‐sequencing datasets from TLE patients. We subsequently established a pilocarpine‐induced status epilepticus (SE) model in rats and validated mRNA expression of hub genes by quantitative reverse transcription PCR (qRT‐PCR). Results The DEGs in the acute, latent, and chronic phases of TLE in animal models were prominently enriched in inflammatory response. Hub genes identified in the acute phase mainly participated in biological processes including inflammation, blood‐brain barrier damage, and cell adhesion. The hub genes in the latent phase were related to microglia/macrophage activation (Emr1 and Aif1) and phagocytosis (Cd68, Tyrobp, and Lyz). In the chronic phase, the hub genes were associated with activation of complements and microglia/macrophages. We further found that some hub genes identified in human TLE, such as Tlr2, Lgals3, and Stat3, were positively correlated with seizure frequency. Other hub genes, including Lgals3 and Serpine1, were associated with hippocampus sclerosis. qRT‐PCR analysis confirmed that the mRNA levels of hub genes in rat hippocampus were significantly up‐regulated after SE induction. Conclusions Our integrated analysis identified hub genes in different stages of epilepsy. The functional annotations suggest that the activation and phagocytic activities of microglia/macrophages may play critical roles in epileptogenesis of TLE.


| INTRODUC TI ON
Temporal lobe epilepsy (TLE) is characterized by refractory seizures, significant cognitive decline, and depression. TLE accounts for 36% of intractable epilepsy. 1 Epileptogenesis is a progressive process in which an insult to a normal brain culminates in the occurrence of spontaneous seizures. There are three stages in epileptogenesis-the acute, latent, and chronic stages.
Pathophysiologic changes occur in the injured zone in the acute phase, which is followed by the gradual maturation of epileptic circuit in the latent phase and the ultimate appearance of spontaneous recurrent seizure (SRS) in the chronic phase. 2 Animal models that mimic the clinical and histopathological features of TLE are particularly helpful to study molecular mechanisms in different stages of epileptogenesis. 3 Previous studies identified several pathological mechanisms of epileptogenesis involving immune responses, synaptic plasticity, neurodegeneration, neurogenesis, alterations in metabolism, and blood-brain barrier (BBB) integrity. 2,4 Alterations of transcriptomic profiles have been reported in brains from epilepsy patients and animal models. 5 Approximately 2000 genes were published to be differentially expressed in TLE. 6 Characterizing stage-specific gene expression profiles and identifying key regulatory genes are crucial to elucidate the molecular mechanisms of epileptogenesis in TLE.
High-throughput microarray and RNA-sequencing (RNAseq) analyses provide superior platforms for a comprehensive and unbiased transcriptome-wide analysis and have been widely applied in epileptic studies in animal models and in patients in the past two decades. However, concerns about representativeness and repeatability of an individual microarray/RNAseq study have been raised. 7 The heterogeneities in experimental design, animal model, sampling time, and sample size confounded result interpretation. 7 Wang et al compared the lists of differentially expressed genes (DEGs) from different microarray studies in TLE and found that only 53 among over 2000 DEGs altered in more than two studies. 6 An integrated analysis could efficiently overcome the low validity and reproducibility of an individual experiment, and may help to identify candidate genes that are implicated across different studies. Several bioinformatic analyses on microarray studies in epilepsy have been published recently. However, three analyses were based on summarizing published results rather than performing integrated analysis of original microarrays. 5,7,8 Another study integrated multiple rodent TLE microarrays with RankProd package, whereas their research of key gene regulators was derived from one dataset instead of from gene expression profiles across different microarrays. 9 Robust rank aggregation (RRA) algorithm has been developed for microarray integration in 2012. The basis of RRA is to compare the actual list of each dataset with a null assumption of random order and to screen the DEGs by assigning a significance score for each gene. The aggregated list kept only the statistically relevant DEGs. 10 RRA has been used for integrated microarray analysis of triple-negative breast cancer and discovered key dysregulated miRNAs. 11 Song et al also identified hub genes in prostate cancer using RRA. 12 Moreover, RRA can process the variable gene lists and handle missing ranks from different microarray platforms. 10 In the current study, we conducted an integrated analysis on TLE microarrays using RRA to identify key genes consistently expressed across different epileptic models and platforms. Further gene annotation and protein interaction analysis identified key pathways and hub genes in three phases of epilepsy. The association between hub genes and hippocampal sclerosis (HS)/seizure frequency was also analyzed using RNA sequencing datasets of TLE patients. In addition, the expression of several hub genes was validated in a rat pilocarpine model of TLE using quantitative reverse transcription polymerase chain reaction (qRT-PCR).

| Microarray datasets of epilepsy
The microarray datasets were obtained through searching the Gene Expression Omnibus (GEO) database. We systematically searched the microarray studies using the terms: "epilepsy" or "seizure" or "pilocarpine" or "kainate" or "kainic acid" or "electrical stimulation".
Datasets or samples were included according to the following cri-

| Dataset preprocessing and obtaining DEGs
We downloaded raw data or series matrix files of six microarray datasets from GEO database. In order to reduce heterogeneity of analysis, only the ipsilateral hippocampus samples in each microarray were included. The corresponding annotation documents were used to transform the probes into gene symbols. If multiple probes matched the same symbol, the mean signal intensity was calculated.
Each dataset was normalized by RMA algorithm. Since gene expression patterns may change with the development of epilepsy, we divided our analysis into three distinct phases: acute stage (0-3d), latent stage (4-14d), and chronic stage (the first SRS or after 14d).
The samples in six included microarrays were regrouped according to their sampling time after model establishment. The common control was used if there was no control for each epileptic group in the microarray. The DEGs between epileptic and control hippocampus in different phases were identified using the "limma" package in R language. The cutoff criteria were set as |log 2 fold change (FC)| >1 and adjusted P-value <0.05.

| RRA analysis to integrate microarrays in different phases
To minimize the deviation caused by the analysis of a single array, we integrated the results of several microarrays by adopting RRA algorithm. 10 Before RRA analysis, up-ranked and down-ranked matrixes of all genes in each dataset were generated based on log 2 FC between epileptic and control groups. The log 2 FC matrixes were then merged to get a combined matrix. "RobustRankAggreg" package was applied to integrate the combined log 2 FC matrix. The adjusted P-value indicated the gene ranking in the final list. Genes with Bonferroni-adjusted P-value <0.05 and the |log 2 FC|>0.5 were considered as robust DEGs.

| Pathway and process enrichment analysis
The DEGs obtained in RRA analysis were uploaded to Metascape (http://metas cape.org/gp/index.html) to perform pathway and process enrichment analysis. 13 Terms with a P-value <0.01, a minimum count of 3, and an enrichment factor >1.5 were collected.

| Protein-protein interaction (PPI) analysis and identification of hub genes
Hub genes are defined as genes of high connectivity with other genes. The alteration of hub genes will cause prominent changes in the gene network. The DEGs obtained in RRA analysis were uploaded to STRING database to predict the interactions among proteins. 14 The minimum required interaction score was set as 0.4. Cytoscape software 3.6.1 was used to visualize the protein networks. The top 5 connective genes in each phase were identified as hub genes.

| RNA-SEQUENCING DATA ANALYSIS IN BRAINS FROM HUMAN TLE PATIENTS
The raw count matrices of the two RNA sequencing datasets (GSE12 7871 and GSE71058) were downloaded from the GEO database. In GSE12 7871, the hippocampal tissues resected from 12 intractable TLE patients with seizure frequencies ranging from 0.33 to 120 seizures per month were sequenced. The CPM of genes was calculated using the cpm function of the "edgeR" package. A Spearman rank correlation analysis was conducted using SPSS 24 to assess the association

Latent stage (4-14d)
Chronic Note: A total of six microarrays were included in the integrated analysis. In these microarrays, seizure was induced in rat or mouse by either pilocarpine or kainic acid or electrical stimulation. Hippocampus or dentate gyrus of the epileptic rats or mice was sampled at different time points for expression profile detection in microarray platform. The samples in six included microarrays were divided according to sampling time after model establishment into three groups: acute stage (0-3d), latent stage (4d-14d), and chronic stage (the first spontaneous recurrent seizure or after 14d between cpm of hub genes and seizure frequency. Genes with coefficient P-value <0.05 were considered to be significantly associated with seizure frequency. In GSE71058, the dentate gyrus of 5 TLE patients with HS and 7 without HS were sequenced. The DEGs were obtained using "DESeq2" package. Genes with the |log 2 FC| >1 and adjusted Pvalue <0.05 were considered as HS-related genes. The scatterplots and barplots were prepared by GraphPad Prism 7 software. were sacrificed 7d after SE.

| RNA extraction and qRT-PCR
Total RNA was extracted from hippocampus of rats using TRIzol ing to the 2-△△Ct method. 16 Primers are listed in Table S1.

| Statistical analysis
For qRT-PCR analysis, ANOVA was used if data were normal distribution and homogeneous variance. Otherwise, the Kruskal-Wallis test was used. An adjusted P-value <0.05 was considered to be statistically significant. The data were reported as mean ± standard error.
The GraphPad Prism 7 software was used for data analysis and plotting.

| Information of included microarrays
According to the inclusion criteria, GSE14763, GSE49849, GSE73878, GSE47752, GSE49030, and GSE88992 were included in this study. The detailed information of these datasets is shown in Table 1. We divided our analysis into the acute, la-

| Identification of DEGs in epilepsy models
We standardized each microarray dataset using RMA algorithm to achieve homogeneity between samples. The boxplots after standardization are shown in Figure S1. The "limma" package in R soft- ware was used to screen the DEGs according to the cutoff criteria.

| RRA integrated analysis
Through RRA integration, we were able to obtain the robust DEGs that were stably up-regulated or down-regulated across microarrays in each stage. We identified 60 robust DEGs in acute period  Figure 2 and Table S2.

| Pathway and process enrichment analysis
The pathway and process enrichment analyses were then performed on the robust DEGs identified by RRA analysis in each stage ( Figure 3, Table S3). A pronounced role of inflammation was exhibited persistently in all three phases of epilepsy, especially in the acute and latent stages. Biological processes including regulation of angiogenesis, apoptotic signaling pathway, endothelial cell differentiation, negative regulation of cell proliferation, and leukocyte migration were highly enriched in acute stage compared to the other two stages. In latent stage, biological processes such as neuroinflammatory response, Toll-like receptor 7 signaling pathway, defense response to virus, negative regulation of immune system process, regulation of cytokine production, and integrin-mediated signaling pathway were prominently enriched. Due to the small number of DEGs identified in the chronic stage of TLE, the chronic stageassociated biological processes were relatively less compared to the acute and latent stages and mainly related to inflammation, immune responses, and apoptotic signaling pathway.

| Protein-protein interaction (PPI) network analysis and identification of hub genes
In the PPI analysis, the top 5 connective genes were identified as hub genes for their robust expression and high connectivity ( Figure 4, Table S4). The hub genes in acute stage were Tlr2, Stat3, Timp1,

F I G U R E 3
The representative heat map of pathway and process enrichment analysis identified by Metascape using robust DEGs obtained by RRA analysis in the acute, latent, and chronic stages. In Metascape, terms with a P-value <0.01, a minimum count of 3, and an enrichment factor >1.5 are collected and grouped into clusters based on their membership similarities. A similarity >0.3 is considered a cluster. The top 20 clusters were chosen according to Log 10 (P) ranking and displayed. The color scale represents the value of −log 10 (P). The items enriched are listed on the right of each row

| Hub genes associated with seizure frequency and hippocampal sclerosis in human TLE
Among the abovementioned 16 hub genes in three stages of epileptogenesis (6 in the acute, 5 in the latent and 5 in the late stage), the hub genes Tlr2, Lgals3, and Stat3 were positively correlated with seizure frequency in GSE12 7871 ( Figure 5A). Lgals3 and Serpine1 were significantly increased in TLE patients with HS compared with those without HS (NO-HS) in GSE71058 ( Figure 5B).

| The validation of hub genes in rat model of pilocarpine-induced SE
We then validated the expression of several hub genes in a rat model

Epileptogenesis in TLE is an intricate and heterogeneous process
with different pathological mechanisms such as inflammation,

F I G U R E 4
The protein-protein interaction (PPI) network analysis was performed using STRING database of differentially expressed genes (DEGs) obtained by robust rank aggregation (RRA) analysis in the three stages of epilepsy. In the PPI network, each node represents a protein encoded by DEGs. The edge between nodes represents the interaction of the proteins. The edges of each RRA gene were counted and ranked. Hub genes were genes with top 5 connectivity. A-C: PPI network of DEGs in the acute, latent, and chronic stages of epilepsy, respectively. Red indicates the hub genes with the top 5 connectivity. Yellow indicates several important RRA genes with connectivity second to hub genes. Blue represents other RRA genes. D-F: The barplots of top 30 genes with most connections in the acute and latent stages, and all genes in the chronic stage of epilepsy, respectively. The number labeled at the right side of the barplots represents the edges counts of the DEGs neurodegeneration, BBB disruption, and synapse reorganization. 2 To better understand the molecular mechanisms underlying epileptogenesis, it is necessary to integrate the results from various animal TLE models. To the best of our knowledge, our study is the first to systematically integrate the existing microarray analyses in rodent TLE models using RRA and to analyze gene expression patterns in different phases after SE induction. The advantages of RRA analysis include strong robustness to noise, ability to handle incomplete lists, assigning p-value to each gene in the final list, and high computing efficiency. 10 These advantages make RRA an excellent tool for microarray integration. We also identified the hub genes in each stage, and some of them were associated with seizure frequency or hippocampus sclerosis in TLE patients.
Interestingly, we found that the hub genes in the latent phase, the most critical phase for epileptogenesis, were closely associated with the activation and phagocytic activity of microglia/macrophages, suggesting that microglia/macrophages may play crucial roles in epileptogenesis. 17 The robust DEGs obtained by RRA analysis were significantly enriched in inflammatory response in all three phases of epileptogenesis, especially in the acute and latent stage. This is consistent with the previous results in Gorter's microarray analysis and Walker's proteomic analysis. 18,19 Inflammation could participate in epilepsy through increasing the BBB permeability, regulating neurotransmitter transport, promoting neuron degeneration, and influencing neurogenesis. 2,20 Some promising antiinflammatory small molecules have been evaluated for their capacities to fight against epilepsy and epileptogenesis. 21,22 Our RRA analysis further supports the importance of inflammatory response in epileptogenesis of TLE.
In our study, the acute stage is defined as the first 3 days after SE induction. 23 The enriched biological processes in acute stage were found to be involved in inflammatory response, neuron apoptosis, angiogenesis, and leukocyte migration, which are important pathological changes in the acute stage and may affect the severity of seizure. 2,24 The six hub genes identified in the acute phase have also been reported in previous microarrays studies. 8 Specifically, we found Tlr2, Stat3, and Lgals3 were positively correlated with seizure frequency. In addition, Lgals3 and Serpine1 were also associated with HS, further strengthening the importance of these hub genes. Tlr2, Stat3 and Ptgs2 are well-known mediators of neuroinflammation and are potential antiseizure or antiepilepsy targets. 21,[25][26][27] CD44 is an extracellular adhesion molecule expressed by glial cells and neurons. It regulates neurite outgrowth, leukocyte homing, synaptic transmission, and BBB integrity. 28 It is reported that CD44 expression coincided with early mossy fiber sprouting at 3 days following pilocarpine-induced SE. 29 Our qRT-PCR results showed that the hub gene Timp1 was activation, and appears to be critical in microglial phagocytosis. 40 Lyz2 encodes a lysosome protein that is related to macrophage phagocytosis. 41 The mRNA level of Cd68 and Tyrobp was also significantly up-regulated in the latent stage after epilepsy in the current study.
These results suggest that the activation and phagocytic activity of microglia/macrophage could play a critical role in epileptogenesis of TLE. Indeed, activated microglia has been detected in brain tissues of both TLE animals and TLE patients. 42

F I G U R E 6
The expression of several hub genes was validated in the rat hippocampus by real-time quantitative PCR at 48h (n = 4), 7d (n = 4), and 35d (n = 3) after pilocarpine-induced status epilepticus. The control animals received equal amounts of vehicle (n = 4). *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. Data were expressed as mean ± standard error The chronic period was the time when SRS occurs and epileptic neural network matures. 52 Our analysis showed that in- Recent studies revealed a critical role of TREM2/TYROBP signaling in the regulation of microglial phagocytosis in Alzheimer's disease and other neurological diseases. 54,55 The role of TYROBP in epilepsy is still unknown and deserves further exploration.
LGALS3 is abundantly expressed and secreted by activated microglia and acts as an endogenous TLR4 ligand to promote microglial activation. 56 Despite a strong up-regulation in microglial cells, mice lacking LGALS3 only exhibited limited neuroprotection in cerebral cortex and no effect in hippocampus 3 days following pilocarpine injection. 57 The role of LGALS3 in epilepsy needs to be further determined.
Our study has several limitations. First, the microarray detects the alterations in transcriptional level but not protein level.
Proteomics research should be performed to confirm our conclusions. Second, due to limited amount of high-throughput RNAseq data in epileptic animal models in GEO database, we only analyzed the microarray data which are not as accurate in detecting gene expression as high-throughput sequencing. Third, we integrated both the rat and mouse microarrays to obtain the genes robustly expressed across different species. However, different species may respond to SE differently, which may confound our data interpretation. In addition, a cutoff of |log 2 FC|> 0.5 was used to screen robust DEGs in RRA analysis to avoid missing important genes.
A loose cutoff, however, could also bring several less significant genes.
In summary, using RRA analysis, we found that the robust DEGs were enriched in inflammation response in all three phases. We also identified the hub genes in each stage of epileptogenesis. Some hub genes were associated with seizure frequency or hippocampus sclerosis. Our results provide novel insights into the molecular mechanisms of TLE and highlight the importance of microglia/macrophage responses and neuroinflammation in epileptogenesis.

ACK N OWLED G EM ENTS
The authors would like to thank Zengsong Zhang for help on bioinformatics analysis. This work was supported by project grants from National Natural Science Foundation of China (Code: 81771308 and 31771184).

D I SCLOS U R E
None.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are available from the corresponding author upon reasonable request.