CpG methylation signature defines human temporal lobe epilepsy and predicts drug‐resistant

Abstract
Aims: Temporal lobe epilepsy (TLE) is the most common focal epilepsy syndrome in adults and frequently develops drug resistance. Studies have investigated the value of peripheral DNA methylation signatures as molecular biomarkers for diagnosis or prognosis. We aimed to explore methylation biomarkers for TLE diagnosis and pharmacoresistance prediction.
Methods: We first conducted genome-wide DNA methylation profiling in TLE patients, then selected candidate CpGs in a training cohort and validated them in an independent cohort using machine learning algorithms. Furthermore, a nomogram comprising DNA methylation and clinicopathological data was generated to predict drug response in the entire patient cohort. Lastly, bioinformatics analysis of CpG-associated genes was performed using Ingenuity Pathway Analysis.
Results: After screening and validation, eight CpGs were identified as a diagnostic biomarker with an area under the curve (AUC) of 0.81, and six CpGs as a biomarker for predicting drug resistance with an AUC of 0.79. The nomogram for predicting drug resistance comprised the methylation risk score, disease course, seizure frequency, and hippocampal sclerosis, with an AUC as high as 0.96. Bioinformatics analysis indicated that the genes corresponding to drug response-related CpGs are closely related to DNA methylation.
Conclusions: This study demonstrates that a peripheral DNA methylation signature can serve as a molecular biomarker for epilepsy diagnosis and prediction of drug resistance.


| INTRODUCTION
Temporal lobe epilepsy (TLE) is the most common focal epilepsy syndrome in adults and frequently becomes drug resistant, 1 in which case surgical treatment offers a comparatively favorable prognosis. 2,3 Moreover, cognitive impairment and psychiatric comorbidities, including depression and anxiety disorders, together with long-term seizures and the accompanying drug use, often severely affect quality of life and individual health. 4,5 At present, the diagnosis of epilepsy depends mainly on clinical manifestation, neuroimaging, and electroencephalogram (EEG). These methods are not only expensive and time-consuming but also require professional equipment and trained specialists that are not accessible to many patients, which results in delayed diagnosis or misdiagnosis to some extent. 6,7 Furthermore, drug-response prediction is based mainly on clinical features judged subjectively from experience, and no consensus has been reached. 4,8,9 Earlier identification of drug-resistant patients would allow them to benefit from epilepsy surgery sooner. Thus, biomarkers that assist current diagnosis and predict treatment outcome are urgently needed. Preliminary attempts have been made to identify circulating molecular biomarkers of epilepsy, including inflammatory cytokines, S100 calcium-binding protein B (S100B), matrix metallopeptidase 9 (MMP9), and, more recently, miRNA. [10][11][12][13][14] However, these studies are limited mainly by small sample sizes and lack of validation, as well as by the heterogeneity of epilepsy, all of which restrict the clinical value of these biomarkers. 15
DNA methylation, the best-studied epigenetic mechanism, refers to the covalent attachment of methyl groups to cytosine residues (mainly confined to CpG sites), mediated by DNA methyltransferases (DNMTs). 16,17 It is mostly stable throughout the genome and is associated with transcriptional activation or repression. 18,19 Aberrant DNA methylation, which has been implicated in the mechanisms underlying epileptogenesis and the progression of epilepsy, has gained considerable attention.
Altered expression of DNMTs and methylation changes in individual candidate genes (eg, RELN) have been found in TLE patients. [20][21][22][23] Several genome-wide studies using epileptic brain tissue have identified differential methylation events in genes associated with inflammation, neuronal development, etc. [24][25][26][27] Moreover, our previous research reported dysregulated methylation of both protein-encoding genes and noncoding RNA genes in peripheral blood DNA from TLE patients. 28,29 A substantial number of studies have investigated the value of peripheral DNA methylation signatures as molecular biomarkers for diagnosis or prognosis, especially in cancer research. 15,[30][31][32] The prognostic value of O6-methylguanine-DNA-methyltransferase (MGMT) promoter methylation in glioblastoma and the use of methylated SEPTIN 9 (SEPT9) in plasma for detection of asymptomatic colorectal cancer are well-known paradigms, [33][34][35] which have been included in clinical guidelines and translated into commercially available clinical tests. 15 In addition, researchers have increasingly favored combinatorial biomarkers based on multi-CpG signatures. [36][37][38][39] DNA methylation-based biomarkers offer several advantages for clinical application: presence in various biofluids, greater stability than other biological materials (such as RNA or protein), easy detection by well-established methodologies, and cell-type specificity. 15,31 However, to date, methylation biomarkers for TLE diagnosis and pharmacoresistance prediction have not been explored.
In this study, we aimed to identify and validate disease-related and drug response-related CpGs in TLE. We first conducted genome-wide DNA methylation profiling in TLE patients; then, we selected candidate CpGs in a training cohort and validated those CpGs in an independent cohort using machine learning algorithms. Furthermore, a nomogram comprising DNA methylation and clinicopathological data was generated to predict drug response in the entire patient cohort. Lastly, mechanistic links were explored by bioinformatics analysis of the genes corresponding to all biomarker CpGs.

| Patient cohorts
The study was carried out on a cohort of 78 patients with TLE and 78 sex- and age-matched healthy controls from the Department of Neurology at Xiangya Hospital. All patients underwent a comprehensive medical history, physical examination, cranial magnetic resonance imaging (MRI) scans, and EEG. Inclusion criteria for TLE and drug-resistant epilepsy followed those of our previous research. 28 Written informed consent was obtained from all enrolled participants. The study was conducted in accordance with the guidelines for research involving human participants and was approved by the Ethics Committee of Central South University, Xiangya School of Medicine and the affiliated Xiangya Hospital (201303120). The data were divided into two sets: in the training cohort, 30 TLE patients were analyzed; in the validation phase, candidate CpGs were validated in another independent cohort (n = 48).

| DNA methylation quality control and processing
Whole blood DNA extraction and quality control were performed as in our previous study. 28 The discovery and training samples were [...] All samples passed the Illumina quality control. Methylation at each individual CpG was reported as a methylation β-value, ranging continuously from 0 (unmethylated) to 1 (completely methylated). The minfi R package (Version 1.18.1) was used to read the raw data from the 450K and 850K arrays. Initially, we excluded probes located on the sex chromosomes and null probes. We also removed failed probes, defined as those with a detection P-value > .05 in more than 5% of samples. Probes with single-nucleotide polymorphisms of MAF > 5% within 10 bp of the CpG site were also rejected. We next performed Subset-quantile Within Array Normalization (SWAN) for normalization. 40 The probes of the 450K assay are expected to perform similarly on data from the 850K array. In this study, we removed the set of CpG sites that were not included in the 850K array.
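The detection P-value rule described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline (which used the minfi R package); the function names, the dictionary layout, and the probe IDs are hypothetical and serve only to show the rule "detection P-value > .05 in more than 5% of samples", together with the exclusion of sex-chromosome probes.

```python
def failed_probe(detection_pvalues, p_cutoff=0.05, max_failed_fraction=0.05):
    """Return True if the probe fails (P > p_cutoff) in more than
    max_failed_fraction of the samples."""
    n_failed = sum(1 for p in detection_pvalues if p > p_cutoff)
    return n_failed / len(detection_pvalues) > max_failed_fraction


def filter_probes(detection_p, sex_chromosome_probes=frozenset()):
    """Keep probe IDs that pass the detection rule and are not flagged as
    sex-chromosome probes. `detection_p` maps a probe ID to the list of
    per-sample detection P-values (hypothetical layout)."""
    return [
        probe_id
        for probe_id, pvals in detection_p.items()
        if probe_id not in sex_chromosome_probes and not failed_probe(pvals)
    ]


# Hypothetical toy data: 3 made-up probes measured in 20 samples.
toy = {
    "cg_toy_1": [0.001] * 20,              # no failed sample: kept
    "cg_toy_2": [0.001] * 18 + [0.2] * 2,  # 2/20 = 10% failed: removed
    "cg_toy_3": [0.001] * 19 + [0.2],      # 1/20 = 5%, not more than 5%: kept
}
kept = filter_probes(toy)
```

Note that a probe failing in exactly 5% of samples is retained under this reading, because the text specifies "more than 5%".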

| Building a diagnostic model
We used three phases to identify and validate a disease-related CpG signature for patients with TLE. In the discovery phase, logistic regression was performed to obtain differentially methylated CpG sites (DMCs) between 30 TLE and 30 normal control samples, and a P-value threshold of .001 was then used for filtering. The resulting 237 candidate CpGs were analyzed by the Least Absolute Shrinkage and Selection Operator (LASSO) method. The CpGs were then ranked by their regression parameters. In the training phase, to further shrink the number of markers to a reasonable range, a support vector machine (SVM) algorithm was applied to different numbers of CpGs. As a result, the eight CpGs with the highest prediction accuracy were confirmed. The SVM algorithm was tuned by 5-fold internal cross-validation, which was used to determine its optimal parameters. In the validation phase, the parameters of the SVM model from the training cohort were applied to an independent cohort of 96 samples (48 TLE and 48 normal controls) to validate the diagnostic performance of the model.
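The mechanics of 5-fold cross-validation mentioned above can be sketched as follows. This is an illustration only: the classifier is passed in as hypothetical `fit` and `predict` callables, and the usage example substitutes a trivial midpoint-threshold rule for the SVM, so it shows how held-out accuracy is averaged over folds rather than how the authors' model was tuned. All names and the toy β-values are assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


def cross_validated_accuracy(features, labels, fit, predict, k=5, seed=0):
    """Mean held-out accuracy of a classifier given as fit/predict callables."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(labels), k, seed):
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Stand-in classifier (NOT an SVM): threshold halfway between class means.
def fit_midpoint(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_midpoint(threshold, x):
    return 1 if x > threshold else 0


# Hypothetical toy beta-values for 30 cases (label 1) and 30 controls (label 0).
toy_features = [0.8] * 30 + [0.2] * 30
toy_labels = [1] * 30 + [0] * 30
toy_accuracy = cross_validated_accuracy(
    toy_features, toy_labels, fit_midpoint, predict_midpoint
)
```

In a tuning loop, this mean accuracy would be computed for each candidate parameter setting (or each candidate number of CpGs) and the best-scoring setting retained.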

| Building a predictive model for drug response
In the discovery phase, the t test was performed to identify DMCs between 10 drug-resistant and 20 drug-responsive samples, with a P-value threshold of .005. After 99 DMCs were obtained, we used SVM-Recursive Feature Elimination (SVM-RFE) to select candidate CpG sites. In the training phase, logistic regression was used to further narrow down the CpGs. The six CpGs with the highest prediction accuracy were identified, with parameter tuning conducted by 5-fold cross-validation. A risk score was calculated for each patient using a formula derived from the methylation levels of these 6 CpGs weighted by their regression coefficients. Validation analyses were performed in another cohort (17 drug-resistant and 13 drug-responsive samples). In addition, a nomogram integrating the DNA methylation risk score and clinicopathological data was generated to predict drug response. The performance of the nomogram was assessed graphically by calibration plots.
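As an illustration of the weighted risk score, the sketch below computes it from six β-values using the coefficients, intercept, and cutoff reported for the prediction model in the Results of this paper. The function names and the example β-values are hypothetical. Two points are assumptions, because the text does not state them: that the cutoff of 0.78 is compared directly with the raw score (rather than with a transformed probability), and which side of the cutoff corresponds to which drug-response group.

```python
# Coefficients, intercept, and cutoff as reported in the Results.
COEFFICIENTS = {
    "cg15999964": 19.3,
    "cg08768218": -43.5,
    "cg11954680": 54.9,
    "cg17706086": 26.3,
    "cg21761639": 66.8,
    "cg26119877": 29.9,
}
INTERCEPT = -86.3
CUTOFF = 0.78


def methylation_risk_score(beta_values):
    """Weighted sum of the six CpG beta-values plus the intercept."""
    missing = set(COEFFICIENTS) - set(beta_values)
    if missing:
        raise KeyError(f"missing beta-values for: {sorted(missing)}")
    for cpg in COEFFICIENTS:
        if not 0.0 <= beta_values[cpg] <= 1.0:
            raise ValueError(f"beta-value for {cpg} must lie in [0, 1]")
    return INTERCEPT + sum(w * beta_values[cpg] for cpg, w in COEFFICIENTS.items())


def above_cutoff(score, cutoff=CUTOFF):
    """Assumption: the cutoff is applied to the raw score. The clinical
    meaning of each side of the cutoff is not specified here."""
    return score > cutoff


# Hypothetical patient with every beta-value equal to 0.5.
example_score = methylation_risk_score({cpg: 0.5 for cpg in COEFFICIENTS})
```

Because β-values lie between 0 and 1, the raw score under these coefficients can range from −129.8 to 81.6.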

| Bioinformatics analysis
Pathway analysis for CpG-associated genes was performed using Ingenuity Pathway Analysis (IPA; http://www.ingenuity.com/). For the purposes of this study, the canonical pathway and diseases functions analyses available in IPA were applied, so that the analysis included the CpG-corresponding genes and the other identified genes that interact with them. Fisher's exact test was applied to measure the significance of the association between the genes mapped by IPA and each canonical pathway.
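For readers unfamiliar with the enrichment statistic, the following sketch computes a one-sided Fisher's exact test on a 2x2 table from first principles. It is not IPA's implementation; the choice of a right-tailed (over-representation) test and the layout of the table are assumptions made for illustration, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_right_tailed(a, b, c, d):
    """Right-tailed Fisher's exact test P-value for the 2x2 table

        [[a, b],    a: mapped genes in the pathway
         [c, d]]    b: mapped genes outside the pathway
                    c: other reference genes in the pathway
                    d: other reference genes outside the pathway

    i.e. P(X >= a) under the hypergeometric null with fixed margins."""
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    denominator = comb(total, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denominator


# Classic small example: the table [[3, 1], [1, 3]] gives P = 17/70.
p_value = fisher_exact_right_tailed(3, 1, 1, 3)
```

A small P-value indicates that the overlap between the mapped genes and the pathway is larger than expected by chance given the table margins.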

| Bisulfite pyrosequencing of selected DNA methylation loci
Bisulfite pyrosequencing is a well-established technique used for quantitative methylation analysis of genomic regions at single-nucleotide resolution. 41 We selected 4 CpG loci (cg25838818, cg27564766, cg11954680, and cg26119877) for assay cross-validation by bisulfite pyrosequencing. Blood DNA samples from 10 TLE patients and 10 healthy control cases, or from 10 drug-responsive TLE cases and 10 drug-resistant TLE cases, were bisulfite converted [...] The pyrosequencing assay, purification, and subsequent processing of the biotinylated single-stranded DNA were carried out according to the manufacturer's recommendations.

| Statistical analysis
In the comparative analysis of clinical characteristics (SPSS 18.0), measurement data (age, disease course, and seizure frequency) were subjected to the Kolmogorov-Smirnov (K-S) test and then analyzed with Student's t test or a nonparametric test, and enumeration data (HS, aura, and SGS) were assessed using the chi-square test, with P-value < .05 considered statistically significant. For the current research, scikit-learn [...] Table 1.

| Diagnostic model for TLE
To identify the TLE-associated CpGs, we first studied the global methylation profiles in the DNA of whole peripheral blood obtained from 30 TLE patients and 30 healthy controls. Epigenome-wide association analysis identified 237 DMCs associated with TLE at P < .001 by logistic regression (Figure 1A, Table S1).

| Prediction model for drug response
The t test was used to analyze the DMCs between 10 drug-resistant and 20 drug-responsive patients, which identified 99 DMCs at P < .005 (Figure 2A, Table S4). The 99 DMCs were analyzed by the SVM-RFE algorithm to select significant CpGs. Logistic regression was used to further narrow down the CpGs.
To better investigate the performance of the CpG signature in predicting drug response, a methylation risk score was built, with the CpGs weighted by the coefficients of the logistic regression model, in the validation cohort (17 drug-resistant, 13 drug-responsive). The methylation risk score was calculated as follows: risk score = 19.3 * cg15999964 − 43.5 * cg08768218 + 54.9 * cg11954680 + 26.3 * cg17706086 + 66.8 * cg21761639 + 29.9 * cg26119877 − 86.3, with a cutoff value of 0.78. Applying the model yielded a sensitivity of 77%, a specificity of 71%, and an accuracy of 73%, with an AUC of 0.79 in the validation cohort, for distinguishing drug-responsive from drug-resistant patients (Figure 2C, Table S5).
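The relation between sensitivity, specificity, and accuracy can be made concrete with a short sketch. The confusion-matrix counts used below are not reported in this section; they are an assumption, chosen only because, if the 13 drug-responsive patients are treated as the positive class, they reproduce the rounded percentages quoted above. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positives among all actual positives
    specificity = tn / (tn + fp)   # true negatives among all actual negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (illustrative, not from the paper):
# 13 drug-responsive patients as positives -> 10 correct, 3 missed;
# 17 drug-resistant patients as negatives  -> 12 correct, 5 misclassified.
sens, spec, acc = classification_metrics(tp=10, fn=3, tn=12, fp=5)
percentages = [round(100 * value) for value in (sens, spec, acc)]
```

Other assignments of the positive class would give different counts, so this should be read as a consistency check on the arithmetic rather than as the study's actual confusion matrix.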

| Building a predictive nomogram
We performed a multivariate analysis of the association of the methylation risk score and clinicopathological characteristics with drug response in the entire TLE cohort (Figure S1). The methylation risk score and HS were

| Bioinformatics analysis
The genes corresponding to all biomarker CpGs were uploaded to IPA for canonical pathway and diseases functions analyses, and for network generation based on defined molecular interactions. Results were visualized as networks (Figure S3, Figure S4) and ranked by the diseases functions and canonical pathways involved (Table S6). The IPA analysis showed that "cell death and survival" and "cellular development" were the top-ranked diseases functions. DLC1, IFI27L1, TP73, and

| Cross-validation of methylation with bisulfite pyrosequencing
To evaluate the accuracy of the DNA methylation data from the methylation BeadChip, a subset of CpG loci was selected for additional methylation validation by pyrosequencing. Blood DNA samples from TLE patients (n = 10) and controls (n = 10) were subjected to methylation detection at 2 loci (cg25838818 and cg27564766), and samples from drug-responsive TLE (n = 10) and drug-resistant TLE (n = 10) patients were subjected to methylation detection at 2 loci (cg11954680 and cg26119877).
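Agreement between the two assays can be summarized per locus by a correlation coefficient across individual samples. The sketch below computes a Pearson correlation; the choice of Pearson (rather than, for example, Spearman) is an assumption, since the type of correlation is not stated here, and the function name and paired methylation values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired methylation ratios for one locus in four samples.
beadchip = [0.10, 0.20, 0.30, 0.40]
pyroseq = [0.12, 0.22, 0.32, 0.42]
r = pearson_r(beadchip, pyroseq)
```

A significance test for r (for instance via the t distribution with n − 2 degrees of freedom) would be needed to obtain a P-value; it is omitted to keep the sketch dependency-free.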

| DISCUSSION
In this study, we used the methylation array to screen for differential CpGs and selected significant CpGs by applying machine learning algorithms in the training cohort. Subsequently, we validated the [...] Further characterization of molecules such as TBC1D24, BAIAP2, and SULT1C2 will provide new insights into TLE development and progression.
In addition, we noted that the genes corresponding to drug response-related CpGs are closely related to DNA methylation, which implies that DNA methylation plays an essential role in the pharmacoresistance mechanisms of epilepsy. The tumor suppressor gene deleted in liver cancer 1 (DLC1) has been shown to induce apoptosis, is frequently silenced by methylation, and correlates negatively with DNMT expression. 52,53 Interestingly, previous research found increased expression of DNMT1 and DNMT3A in patients with intractable TLE. 22 Hypermethylation of gene promoters was also the predominant effect in TLE patients and in rodent models. 24,[27][28][29] Given the well-studied epigenetic pathomechanisms underlying drug resistance in cancer, Kobow proposed that the methylation hypothesis of pharmacoresistance could open a similar new avenue in the field of epilepsy. 45 We produced a nomogram including DNA methylation risk score, disease course, seizure frequency, and HS for estimation of individ-

CONFLICT OF INTEREST
The authors declare no conflict of interest.

FIGURE 5
Cross-validation of DNA methylation with pyrosequencing. Shown are the degrees of methylation of 4 CpG loci reported by the methylation BeadChip (Y axis, ratio) and pyrosequencing (X axis, ratio) assays. For cg25838818 (A), cg27564766 (B), cg11954680 (C), and cg26119877 (D), the degrees of methylation detected by the two methods were positively correlated (P < .05) across individual samples.