Discovery of the neuroprotective effects of alvespimycin by computational prioritization of potential anti-Parkinson agents

Authors

  • Li Gao,

    1. Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
  • Gang Zhao,

    1. Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
  • Jian-Song Fang,

    1. Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
  • Tian-Yi Yuan,

    1. Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
  • Ai-Lin Liu,

    Corresponding author
    1. Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
    2. Beijing Key Laboratory of Drug Target and Screening Research, Beijing, China
    • Correspondence

      A.-L. Liu or G.-H. Du, 1 Xian Nong Tan Street, Beijing 100050, China

      Fax: +86 010 63165184

      Tel: +86 010 63165184

      E-mail: liuailin@imm.ac.cn; dugh@imm.ac.cn

  • Guan-Hua Du

    Corresponding author
    1. Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
    2. State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Beijing, China

Abstract

Based on public gene expression data, we propose a computational approach to optimize gene expression signatures for use with the Connectivity Map (CMap) to reposition drugs or discover lead compounds for Parkinson's disease. This approach integrates genetic information from the Gene Expression Omnibus (GEO) database, the Parkinson's disease gene expression database (ParkDB), the Online Mendelian Inheritance in Man (OMIM) database and the Comparative Toxicogenomics Database (CTD), with the aim of identifying a set of interesting genes for use in computational drug screening via CMap. The results showed that CMap, using the top 20 differentially expressed genes identified by our approach as a gene expression signature, outperformed the same method using all differentially expressed genes (n = 535) as a signature. Utilizing this approach, the candidate compound alvespimycin (17-DMAG) was selected for experimental evaluation in a model of rotenone-induced toxicity in human SH-SY5Y neuroblastoma cells and isolated rat brain mitochondria. The results showed that 17-DMAG significantly attenuated rotenone-induced toxicity, as reflected by an increase in cell viability, a reduction in intracellular reactive oxygen species generation and a reduction in mitochondrial respiratory dysfunction. In conclusion, this computational method provides an effective systematic approach for drug repositioning or lead compound discovery for Parkinson's disease, and the discovery of the neuroprotective effects of 17-DMAG supports the practicability of this method.

Abbreviations

CMap, Connectivity Map
CTD, Comparative Toxicogenomics Database
DE gene, differentially expressed gene
DCFH-DA, 2′,7′-dichlorodihydrofluorescein diacetate
FC, fold-change
GEO, Gene Expression Omnibus
HSP90, heat shock protein 90
MTT, 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide
OMIM, Online Mendelian Inheritance in Man
ParkDB, Parkinson's disease gene expression database
PD, Parkinson's disease
RCR, respiratory control ratio
ROS, reactive oxygen species
RWR, random walk with restart
SAM, significance analysis of microarrays
SN, substantia nigra

Introduction

Drug discovery is an interdisciplinary, expensive and time-consuming process. Considerable efforts, such as in vitro high-throughput screening and combinatorial chemistry [1], have been expended to improve the efficiency of the drug discovery process. However, these methods are usually expensive and laborious. With the advent of the post-genomic era, tremendous genomic resources and bioinformatic approaches have become available. Based on analysis and integration of omics data, drug repositioning is increasingly attracting attention because it can identify new indications for existing drugs [2, 3]. Thus, it is necessary to develop computational methods to prioritize small molecules for the purpose of drug repositioning or lead compound discovery.

Connectivity Map (CMap) [4] is a computational drug repositioning tool. The current version (build 02) of CMap contains 6100 gene expression profiles from five human cell lines representing treatment with 1309 distinct small molecules at various doses. A CMap query begins with a gene expression signature that is representative of the significantly up- and down-regulated genes in a biological condition such as a disease. By comparing the disease signature with all reference profiles from chemical perturbations, a ranked list of compounds with connectivity scores ranging from −1 to +1 is obtained. In particular, a strong negative correlation between the disease signature and the expression profile of a compound suggests that the compound could potentially have a therapeutic effect on the disease [2, 5].
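To make the scoring idea concrete, the following is a minimal sketch of a Kolmogorov-Smirnov-style connectivity score, written under stated assumptions rather than taken from the CMap code base. The function names (`ks_score`, `raw_connectivity`, `normalize`) are hypothetical, ranks are assumed to be 1-based with rank 1 the most up-regulated gene under compound treatment, and the scaling of positive and negative scores by the largest and smallest raw score is a simplification of the published procedure.

```python
import numpy as np

def ks_score(tag_ranks, n_total):
    """KS-like enrichment of signature genes within a ranked profile.

    tag_ranks: 1-based ranks of the signature genes in the compound profile.
    n_total:   total number of genes in the profile.
    """
    v = np.sort(np.asarray(tag_ranks, dtype=float))
    t = len(v)
    j = np.arange(1, t + 1)
    a = np.max(j / t - v / n_total)        # tags concentrated near the top
    b = np.max(v / n_total - (j - 1) / t)  # tags concentrated near the bottom
    return a if a > b else -b

def raw_connectivity(up_ranks, down_ranks, n_total):
    """Combine the up and down enrichment scores into one raw score."""
    ks_up = ks_score(up_ranks, n_total)
    ks_down = ks_score(down_ranks, n_total)
    if np.sign(ks_up) == np.sign(ks_down):
        return 0.0  # up and down lists move the same way: no connection
    return ks_up - ks_down

def normalize(scores):
    """Scale raw scores into [-1, +1] across all treatment instances."""
    s = np.asarray(scores, dtype=float)
    out = np.zeros_like(s)
    if (s > 0).any():
        out[s > 0] = s[s > 0] / s.max()
    if (s < 0).any():
        out[s < 0] = -s[s < 0] / s.min()
    return out

# Toy profile of 100 genes: the disease "up" genes sit at the bottom of the
# compound profile and the disease "down" genes at the top, i.e. the compound
# reverses the disease signature.
reversing = raw_connectivity(up_ranks=range(96, 101), down_ranks=range(1, 6), n_total=100)
mimicking = raw_connectivity(up_ranks=range(1, 6), down_ranks=range(96, 101), n_total=100)
```

In this toy case the reversing compound receives a negative raw score and the mimicking compound a positive one, which is the sign convention behind selecting compounds with strongly negative connectivity scores as candidates.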

Subsequent to the introduction of the CMap methodology, several successful applications [5] of drug repositioning and lead compound discovery have emerged. Studies have focused on many types of diseases, such as lung cancer [6], breast cancer [7], muscle atrophy [8], acute myelogenous leukaemia [9] and Alzheimer's disease [10]. In these studies, the gene expression signatures were directly used for CMap queries without optimization. In the present study, we optimized the signature of Parkinson's disease (PD) instead of using all differentially expressed genes (DE genes) as a signature for the CMap query with the aim of obtaining superior performance for compound prioritization.

PD is primarily characterized by the progressive loss of dopaminergic neurones in the substantia nigra (SN) pars compacta. Accumulating evidence suggests that its pathogenesis is attributable to an as yet unspecified combination of genetic and environmental factors [11]. Rotenone is an inhibitor of mitochondrial respiratory chain complex I that leads to oxidative stress and mitochondrial dysfunction, both of which are implicated in the pathogenesis of PD [12]. Rotenone has been extensively used to establish PD models [13].

In the present study, we present a workflow for gene expression signature optimization and candidate compound discovery for PD, as illustrated in Fig. 1: (a) based on the gene expression data from SN tissues from 31 PD patients and 17 controls, a set of DE genes was identified; (b) the Online Mendelian Inheritance in Man (OMIM) [14] database and the Parkinson's disease gene expression database (ParkDB) [15] were integrated to identify a second set of DE genes; (c) the Comparative Toxicogenomics Database (CTD) [16] and ParkDB were integrated to identify a third set of DE genes; (d) the DE genes from these sources were integrated, and four scoring methods were applied to rank the DE genes; (e) the scores were normalized and the combined score for each DE gene was calculated; (f) the performance of CMap was evaluated for gene expression signatures consisting of different numbers of genes ranked by the combined score, and the appropriate signature size was selected; and (g) based on the ranked list of compounds, alvespimycin (17-DMAG), a compound evaluated in phase I clinical studies of patients with advanced solid tumours [17] and acute myeloid leukaemia [18], was selected for further evaluation in a model of rotenone-induced neurotoxicity in human SH-SY5Y neuroblastoma cells and isolated rat brain mitochondria.

Figure 1.

The workflow for gene expression signature optimization and candidate compound discovery for PD.

To the best of our knowledge, in the present study, CMap was used for the first time in the investigation of PD for drug repositioning and lead compound discovery. After optimization of the gene expression signature, it was found that the top 50 molecules returned by CMap were enriched with therapeutic molecules for PD. This protocol can be considered as an effective method for acquiring a more representative gene signature. The neuroprotective effect of 17-DMAG against rotenone-induced neurotoxicity was discovered and is worthy of further study.

Results

Identification of DE genes in PD from different sources

In the present study, 535 DE genes were derived from three different sources: the Gene Expression Omnibus (GEO) database [19], the integration of the OMIM database and ParkDB and the integration of CTD and ParkDB.

Identification of DE genes from the GEO database

The significance analysis of microarrays (SAM) algorithm [20] was used to identify genes with statistically significant changes in expression from the GEO database. The DE genes in SN tissues from PD patients compared to controls are illustrated in SAM plot sheets (Fig. S1). In total, 467 DE genes (210 specific to the medial SN, 109 specific to the lateral SN, 148 common to both and none specific to the frontal cortex) were identified in a dataset consisting of 15 PD patients and eight controls; 25 DE genes were identified in a dataset consisting of 16 PD patients and nine controls. Among them, 15 DE genes (TH, SLC18A2, DDC, SLC6A3, KCNJ6, SV2C, RBM3, AGTR1, ALDH1A1, PCDH8, FAM70A, CLSTN2, EBP, DIRAS2 and CDH8) were common to both datasets, and all of these genes were down-regulated in PD patients compared to controls. After removal of the overlap, the GEO database yielded a total of 477 DE genes (Table S1).

Identification of DE genes from the integration of OMIM and ParkDB

OMIM is a comprehensive database that summarizes human genes and disease phenotypes [14]. ParkDB provides comprehensive access to gene expression datasets involved in PD [15]. DE genes from different experiments (i.e. different tissues, cell lines, treatments and species) can be found in ParkDB. Ninety-eight PD-associated genes were retrieved from the OMIM database (Table S2). The genes were filtered by ParkDB with a human species parameter, and only the DE genes with high confidence (P ≤ 0.01) were retained. As a result, 45 DE genes were identified by integrating OMIM and ParkDB (Fig. 2A and Table S3).

Figure 2.

(A) The DE genes identified by integrating OMIM and ParkDB. The node size indicates the OMIM score of each gene. (B) The DE genes identified by integrating CTD and ParkDB. The node size indicates the CTD score of each gene. The round nodes (coloured yellow) represent up-regulated genes and the triangle nodes (coloured purple) represent down-regulated genes.

Identification of DE genes from the integration of CTD and ParkDB

CTD provides information about the interaction of environmental chemicals, genes and diseases [16]. The therapeutic molecule-gene interactions involved in PD were extracted from CTD. The genes that were targeted by therapeutic molecules for PD were filtered by ParkDB. As a result, 36 DE genes were obtained by integrating CTD and ParkDB (Fig. 2B and Table S4).

Prioritization of DE genes

Simple expression fold-change (FC) ranking

For each of the 477 DE genes from GEO, the absolute value of the average log2 expression FC was defined as the FC score. For the other 58 DE genes, the FC scores were defined as zero (Table S5).

Network-based random walk with restart (RWR) algorithm ranking

The unweighted co-expression network consisting of the 477 DE genes from GEO was constructed using a rank-based method [21] and the top three most correlated genes were considered in the network construction. The rank-based network construction method was aimed at creating a sparse co-expression network with highly reliable edges and few false-positive connections. The final co-expression network contained 477 nodes and 1425 edges after removal of the duplicate edges.

RWR, a network-based algorithm for gene prioritization [22], is among the best and most suitable methods for gene ranking in a co-expression network. It simulates a random walker that starts on a seed node and moves to its immediate neighbours randomly at each step. In the present study, the RWR algorithm was run on the above co-expression network to rank the candidate genes using the 15 common DE genes from the two GEO datasets as seed nodes. Among them, ten genes have been reported to be associated with PD (Table S6). The co-expression sub-network, consisting of the top 100 ranked genes after prioritization with RWR, is shown in Fig. 3. A receiver operating characteristic curve was used to validate the performance of RWR prioritization (Fig. S2). With an area under the curve value of 77.5%, the network-based RWR ranking was shown to be effective. For each of the 477 DE genes from GEO, the RWR score was obtained by applying the RWR algorithm. For the other 58 DE genes, the RWR scores were defined as zero (Table S5).

Figure 3.

The co-expression sub-network of the top 100 ranked DE genes identified by the RWR algorithm. The network is rendered by a force-directed layout. The genes used as seed nodes are represented by triangles, whereas the others are represented by rounded rectangles. The node size in the network indicates the RWR score of each gene.

Ranking by integration of OMIM and ParkDB

For each of the 45 DE genes from the integration of OMIM and ParkDB, the absolute value of the average log2 expression FC obtained from ParkDB was defined as the OMIM score. For the other 490 DE genes, the OMIM scores were defined as zero (Table S5).

Ranking by integration of CTD and ParkDB

For each of the 36 DE genes from the integration of CTD and ParkDB, the number of therapeutic molecules that were related to PD and that targeted this gene was counted as the CTD score. For the other 499 DE genes, the CTD scores were defined as zero (Table S5).

Combined score of candidate genes

The scores were normalized, and the weight for each prioritization approach was calculated using an entropy-based method. According to the results obtained from this method, we assigned weights of 0.021, 0.246, 0.387 and 0.346 to the FC score, the RWR score, the OMIM score and the CTD score, respectively. The combined score was the weighted sum of the normalized scores from the individual approaches. The 535 DE genes and their combined scores are listed in Table S5.

CMap analysis and performance measurement

A gene expression signature for CMap is made up of two lists of probe sets in separate files (one list for the up-regulated genes and the other for the down-regulated genes), both of which are required. Signatures with different numbers of genes, selected after ranking by the combined score, were used to query CMap to determine a suitable size. As shown in Fig. 4A,B, the enrichment factor and the absolute number of therapeutic molecules for PD in the top 50 molecules returned by CMap were higher when we used the top 20 DE genes (Table 1) as a signature than when we used the top 50, 100, 200 or all DE genes (Table S5).
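As an illustration of how such a two-list signature might be prepared, the sketch below splits an excerpt of the top-ranked genes into up and down probe-set lists. The helper names `split_signature` and `write_tag_file` are hypothetical, and the one-probe-set-per-line text file layout is an assumption about the query format rather than something stated here.

```python
# Excerpt of the top-ranked genes: (gene symbol, probe set, direction).
TOP_GENES = [
    ("SLC6A3", "206836_at", "Down"),
    ("DRD2", "216938_x_at", "Down"),
    ("PARK7", "200006_at", "Down"),
    ("DDC", "205311_at", "Down"),
    ("EN1", "220559_at", "Down"),
    ("SERPINA3", "202376_at", "Up"),
]

def split_signature(rows):
    """Return (up, down) probe-set lists; both must be non-empty."""
    up = [probe for _, probe, direction in rows if direction == "Up"]
    down = [probe for _, probe, direction in rows if direction == "Down"]
    if not up or not down:
        raise ValueError("a query needs both an up list and a down list")
    return up, down

def write_tag_file(path, probes):
    """Write one probe-set identifier per line (assumed file layout)."""
    with open(path, "w") as handle:
        handle.write("\n".join(probes) + "\n")

up_probes, down_probes = split_signature(TOP_GENES)
```

The explicit check for an empty list reflects the requirement that neither file can be omitted; a signature made up only of down-regulated genes cannot be submitted as a query.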

Table 1. The detailed information of the top 20 DE genes from the combined score.
| Gene symbol | Probe set name | Combined score | Up-/down-regulation |
|---|---|---|---|
| SLC6A3 | 206836_at | 6.506 | Down |
| DRD2 | 216938_x_at | 5.411 | Down |
| PARK7 | 200006_at | 3.712 | Down |
| DDC | 205311_at | 3.463 | Down |
| EN1 | 220559_at | 3.146 | Down |
| SERPINA3 | 202376_at | 3.108 | Up |
| SLC18A2 | 205857_at | 2.988 | Down |
| CCK | 205827_at | 2.792 | Down |
| SNCA | 204467_s_at | 2.704 | Down |
| TH | 208291_s_at | 2.654 | Down |
| BDNF | 206382_s_at | 2.463 | Down |
| ALDH1A1 | 212224_at | 2.302 | Down |
| ASCL1 | 209987_s_at | 2.286 | Up |
| NR4A2 | 204621_s_at | 2.281 | Down |
| AGTR1 | 205357_s_at | 2.256 | Down |
| ELAVL4 | 206051_at | 2.179 | Down |
| KCNJ6 | 210454_s_at | 2.155 | Down |
| FAM70A | 219895_at | 2.094 | Down |
| CDH8 | 210518_at | 2.085 | Down |
| GRIN1 | 210781_x_at | 2.076 | Down |

Figure 4.

Comparison of performances for assaying the top 50 of the ranked molecules. (A) Enrichment factor. (B) Number of therapeutic molecules for PD returned by CMap when we used different signature sizes (top 20, 50, 100, 200 or all DE genes) for the combined score. (C) Enrichment factor. (D) Number of therapeutic molecules for PD returned by CMap when we used top 20 DE genes for the combined score and individual scores (FC score, OMIM score, CTD score).

We also compared the performance of the integrated approach with that of the individual approaches using the top 20 genes as signatures (Table S7). CMap using the top 20 genes given by the RWR approach was unable to yield results because the list contained no up-regulated genes. Consequently, the RWR approach was not included in the comparison. As shown in Fig. 4C,D, the performance of the integrated approach was superior to that of the individual approaches when CMap used the top 20 DE genes as a signature.

Using the optimized gene expression signature composed of the genes in Table 1, 11 therapeutic molecules for PD were included among the top 50 molecules, which yielded an enrichment factor that was 3.1-fold greater than expected by chance. Unexpectedly, we found that 17-DMAG, a heat shock protein 90 (HSP90) inhibitor, was highly ranked (rank 5) in the ranked list. HSP90 inhibitors have been considered as promising agents for the treatment of PD [23-26]. Therefore, we selected 17-DMAG to investigate whether this compound possessed protective effects against rotenone-induced neurotoxicity. The connectivity scores and ranks of 17-DMAG and the 11 therapeutic molecules for PD are presented in Table 2. All of the top 50 molecules are listed in Table S8.
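The 3.1-fold figure can be checked with a short calculation. The sketch below assumes the common definition of the enrichment factor (the hit rate in the top of the list divided by the hit rate in the whole library) and uses the counts reported in this study: 11 therapeutic molecules in the top 50, and 93 therapeutic molecules among the 1309 molecules in CMap. The function name is hypothetical.

```python
def enrichment_factor(hits_in_top, top_size, hits_total, library_size):
    """Hit rate in the top of the ranked list relative to the whole library."""
    observed = hits_in_top / top_size    # 11 / 50
    expected = hits_total / library_size # 93 / 1309
    return observed / expected

ef = enrichment_factor(hits_in_top=11, top_size=50, hits_total=93, library_size=1309)
print(f"{ef:.1f}")  # prints 3.1
```

Under this assumed definition, roughly 3.6 therapeutic molecules would be expected among 50 randomly drawn molecules, so finding 11 corresponds to about a threefold enrichment.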

Table 2. The connectivity scores and ranks of 17-DMAG and the 11 therapeutic small molecules in the top 50 molecules based on the optimized gene expression signature.
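| Rank | Therapeutic small molecule | Dose | Cell line | Minimum connectivity score |
|---|---|---|---|---|
| 5 | Alvespimycin (17-DMAG) | 100 nM | HL60 | −0.960 |
| 14 | Rosiglitazone | 10 μM | HL60 | −0.922 |
| 15 | Tanespimycin | 1 μM | MCF7 | −0.921 |
| 16 | Procyclidine | 12 μM | HL60 | −0.920 |
| 18 | α-oestradiol | 10 nM | HL60 | −0.917 |
| 23 | Valproic acid | 1 mM | PC3 | −0.912 |
| 32 | Resveratrol | 50 μM | MCF7 | −0.904 |
| 37 | Vorinostat | 10 μM | HL60 | −0.902 |
| 38 | Acetylsalicylic acid | 100 μM | MCF7 | −0.902 |
| 40 | Lidocaine | 15 μM | HL60 | −0.902 |
| 41 | Ranitidine | 11 μM | MCF7 | −0.901 |
| 43 | Carbamazepine | 17 μM | MCF7 | −0.901 |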

17-DMAG attenuates rotenone-induced neurotoxicity

17-DMAG attenuates rotenone-induced cell death

Exposure of human SH-SY5Y neuroblastoma cells to 17-DMAG (10⁻¹¹ to 10⁻⁸ M) for 24 h and 48 h did not affect cell viability as measured by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide (MTT) assay. However, 17-DMAG at concentrations of 10⁻⁷ to 10⁻⁵ M caused significant reductions in cell viability at 24 h and 48 h (P < 0.001) (Fig. 5A). In the subsequent experiments, 17-DMAG was used at concentrations below 10 nM.

Figure 5.

(A) The effects of 17-DMAG (10⁻¹¹ to 10⁻⁵ M) on SH-SY5Y cell viability for 24 h and 48 h. The control group (Con) had no exposure to 17-DMAG. ***P < 0.001 compared to the control group (n = 4). (B) The effects of 17-DMAG on rotenone-induced neurotoxicity in SH-SY5Y cells. Cells were treated with 0.3, 1 or 3 nM 17-DMAG for 1 h, followed by incubation with 2 μM rotenone or vehicle control for 48 h. ###P < 0.001 compared to the control group; *P < 0.05, **P < 0.01 compared to the vehicle group (n = 4). (C) The effects of 17-DMAG on rotenone-induced ROS production in SH-SY5Y cells. Cells were treated with 0.3, 1 or 3 nM 17-DMAG for 1 h, followed by incubation with 2 μM rotenone or vehicle control for 24 h. ###P < 0.001 compared to the control group; **P < 0.01 compared to the vehicle group (n = 3). (D) Fluorescent images of ROS production taken with a × 10 objective lens by a Cellomics ArrayScan VTI HCS Reader (Thermo Fisher Scientific-Cellomics). Nuclei stained by Hoechst 33234 are shown in blue; the intracellular ROS stained by DCFH-DA are shown in green.

As shown in Fig. 5B, after exposure to 2 μM rotenone for 48 h, the cell viability decreased by 30% compared to the control group (P < 0.001). Pre-treatment with 17-DMAG at concentrations of 1 nM and 3 nM significantly attenuated rotenone-induced cell death (P < 0.05 and P < 0.01, respectively). This result suggested that 17-DMAG was effective in inhibiting rotenone-induced cell death.

17-DMAG attenuates rotenone-induced intracellular ROS accumulation

Increasing evidence suggests that mitochondrial complex I may be a source of reactive oxygen species (ROS) and that the partial inhibition of complex I by rotenone can enhance ROS generation [13]. To determine the effect of 17-DMAG on the intracellular ROS level, the rotenone-induced ROS level was measured in the presence or absence of 17-DMAG. As shown in Fig. 5C,D, the production of ROS significantly increased (over six-fold) after exposure to 2 μM rotenone for 24 h (P < 0.001). Pre-treatment with 17-DMAG, at a concentration of 3 nM, significantly reduced ROS production by 44% (P < 0.01), suggesting that 17-DMAG is effective in attenuating rotenone-induced intracellular ROS accumulation.

17-DMAG reduces mitochondrial respiratory dysfunction induced by rotenone

The respiratory control ratio (RCR) is calculated as the ratio of the state 3 to the state 4 oxygen consumption rate. State 3 respiration is the rate of oxygen consumption in the presence of ADP and state 4 respiration is the rate of oxygen consumption when the availability of ADP becomes the limiting factor. As shown in Fig. 6, the RCR was significantly decreased by 49% after exposure of mitochondria to 0.1 μM rotenone (P < 0.001). Pre-treatment with 17-DMAG (3 nM) 1 min prior to rotenone (0.1 μM) significantly increased the RCR by 12% (P < 0.05), suggesting that 17-DMAG might reduce mitochondrial respiratory dysfunction induced by rotenone.

Figure 6.

The effects of 17-DMAG on RCR in rotenone-treated mitochondria from rat brains. ###P < 0.001 compared to the control group; *P < 0.05 compared to the vehicle group (n = 4).

Discussion

CMap is a drug repositioning tool based on systematic analysis of transcriptomics data [2]. Based on CMap, many successful applications of drug repositioning and lead compound discovery have been reported [5]. In these applications, the possible indication of the candidate compound was validated by in vivo and/or in vitro studies. Traditionally, a gene expression signature obtained from analysis of gene expression profiles is directly used for the CMap query. With the emergence of several gene prioritization methods [27], it has become possible to select the most promising DE genes to form a signature.

In the present study, we proposed a computational approach to optimize the PD gene expression signature for use with CMap to reposition drugs or discover lead compounds. This approach integrated genetic information from the GEO, OMIM, CTD and ParkDB databases; the prioritization methods were simple expression FC ranking and network-based RWR ranking. The different sets of scores were normalized, and were then combined using appropriate weights assigned by an entropy-based approach. After optimization of the gene expression signature, the performance of CMap for enriching therapeutic molecules related to PD was improved.

17-DMAG, an HSP90 inhibitor, was ranked fifth in the ranked list, arousing great interest. HSP90, a cytosolic molecular chaperone, has been considered as a target for PD therapeutics [23-26]. HSP90 has been found to be increased predominantly in brains of PD patients, which correlated with an increased level of α-synuclein [28]. HSP90 inhibitors significantly reduce PTEN-induced kinase 1 levels in cells [29] and disrupt the leucine-rich repeat kinase 2/HSP90 complex [30]. Geldanamycin, an analogue of 17-DMAG, has proven to be beneficial in animal models of PD [31, 32]. Furthermore, 17-DMAG was shown to ameliorate polyglutamine-mediated motor neurone degeneration in mouse models of spinal and bulbar muscular atrophy [33].

Rotenone is a mitochondrial complex I inhibitor. Rotenone-treated rats have been demonstrated to reproduce the behavioural, neurochemical and neuropathological features of PD [34]. The rotenone-induced cell model has also been widely used as an in vitro model to investigate PD [35, 36]. Moreover, rotenone can enhance ROS generation and disrupt mitochondrial function [13] by partial inhibition of complex I. Our task was to determine whether 17-DMAG could protect against rotenone-induced neurotoxicity.

The results showed that 17-DMAG at high concentrations (10⁻⁷ to 10⁻⁵ M) significantly decreased SH-SY5Y cell viability, which was in accordance with its antitumour effects [17]. Exposure of SH-SY5Y cells to 2 μM rotenone for 48 h significantly reduced cell viability, and the addition of 17-DMAG (0.3, 1 and 3 nM) increased cell viability in a dose-dependent manner. Such an effect might be partially attributed to the inhibition of intracellular ROS production. To observe the direct effect of 17-DMAG on mitochondrial respiratory dysfunction induced by rotenone, isolated rat brain mitochondria were used. The results showed that 0.1 μM rotenone significantly inhibited mitochondrial respiration, whereas pre-treatment with 3 nM 17-DMAG significantly reduced the mitochondrial respiratory dysfunction induced by rotenone.

Despite these encouraging results, 17-DMAG was found to cause serious adverse events in cancer trials [37]. However, a clinical study of 17-DMAG is in progress for relapsed chronic lymphocytic leukaemia, small lymphocytic lymphoma and B-cell prolymphocytic leukaemia (ClinicalTrials.gov Identifier: NCT01126502). Although the clinical utility of 17-DMAG is uncertain and further research is required, the present findings indicate that 17-DMAG is effective at protecting cells from rotenone toxicity.

In conclusion, in the present study, a computational method for gene expression signature optimization with the aim of drug repositioning was proposed. This workflow can be considered as an efficient tool for drug repositioning or lead compound discovery. The neuroprotective role of 17-DMAG against rotenone-induced toxicity was discovered, which also validated this method. The results showed that 17-DMAG might confer protection against rotenone-induced toxicity by inhibiting cell death and intracellular ROS generation and by reducing mitochondrial respiratory dysfunction, and thus is worthy of further study.

Materials and methods

SAM-based analysis for identifying DE genes from the GEO database

Only two Affymetrix (Affymetrix, Santa Clara, CA, USA) microarray datasets related to human SN tissues from PD patients were available in the GEO database. The microarray gene expression data taken from GEO with accession numbers GSE8397 [38] (Affymetrix Human Genome U133A Array) and GSE7621 [39] (Affymetrix Human Genome U133 Plus 2.0 Array) were obtained from the SN tissues of post-mortem human brains from normal controls and PD patients. The GSE8397 dataset was derived from the SN tissues (split into medial and lateral portions) and the frontal cortex of 15 sporadic PD patients and eight controls, and the GSE7621 dataset was derived from the SN tissues of 16 PD patients and nine controls.

The data from GSE8397 (22 284 probe sets) and GSE7621 (54 675 probe sets) were processed and normalized using BRB-ArrayTools, version 4.2.1 (http://linus.nci.nih.gov/BRB-ArrayTools.html). The Affymetrix MAS5.0 algorithm was applied to compute probe set summaries and the data were normalized and transformed into log2 values. Next, the data were filtered: (a) the FC of the gene median expression value was ≥ 1.5 and the percentage of this change was ≥ 20% and (b) the percentage of data missing or filtered out was ≤ 50%.

Shi et al. [20] found that simple SAM performed better than the P-value (calculated by t-test) as a ranking method to generate more reproducible results among platforms [Affymetrix, Agilent (Affinity Bioreagents, Golden, CO, USA) and Amersham (Little Chalfont, UK)] [40]. The two-class SAM [20] was used to find the DE genes and to control the false discovery rate. Gene expression was considered significantly different if the FC > 1.5 and the false discovery rate < 0.1.

Prioritization of DE genes from GEO based on expression FCs

The DE genes were ranked according to the absolute values of the log2 expression FCs, which were defined as FC scores. For the DE genes that were common to the medial and lateral SN or were common between the two datasets, the average FCs were calculated.
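The FC score described above can be sketched in a few lines. The gene names and log2 FC values below are made-up placeholders, not data from the study, and the function names are hypothetical; the only logic taken from the text is that multiple log2 FCs for one gene are averaged first and the absolute value is taken afterwards.

```python
from statistics import mean

def fc_score(log2_fcs):
    """Absolute value of the average log2 fold-change for one gene."""
    return abs(mean(log2_fcs))

def rank_by_fc(gene_to_fcs):
    """Return (gene, FC score) pairs sorted from highest to lowest score."""
    scores = {gene: fc_score(fcs) for gene, fcs in gene_to_fcs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Placeholder input: a gene seen in several regions or datasets has several values.
toy = {
    "GENE_A": [-1.5, -2.5],  # e.g. medial and lateral SN
    "GENE_B": [0.75],
    "GENE_C": [-1.0, -0.5],
}
ranked = rank_by_fc(toy)
```

Because the average is taken before the absolute value, a gene with opposite-sign changes in two regions would receive a low score under this reading.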

Gene co-expression network construction

Pairwise Pearson correlation coefficients were calculated for the DE genes from the two datasets (GSE8397 and GSE7621) of GEO. For each pair of genes that were differentially expressed in both the medial and lateral SN of the GSE8397 dataset or in both datasets, the average Pearson correlation coefficient was calculated and defined as the final correlation score. The gene co-expression network was constructed using a rank-based method [21] with slight modifications. Nodes in the network represent the DE genes and unweighted edges represent the correlations between genes. Two genes were connected if (a) the absolute value [41] of their Pearson correlation coefficient was greater than 0.5 and (b) one gene was ranked among the top three genes most correlated with the other. The gene co-expression network was constructed using Cytoscape [42].
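The two edge criteria can be illustrated with a small sketch. It assumes a single genes-by-samples expression matrix and ranks neighbours by the absolute correlation; both points are assumptions made for illustration, and the averaging of correlations across regions or datasets described above is omitted. The function name is hypothetical.

```python
import numpy as np

def rank_based_network(expr, top_k=3, min_abs_r=0.5):
    """Build an unweighted co-expression network.

    expr: array of shape (genes, samples).
    Returns a set of undirected edges (i, j) with i < j.
    """
    r = np.corrcoef(expr)        # pairwise Pearson correlations between genes
    np.fill_diagonal(r, 0.0)     # ignore self-correlation
    abs_r = np.abs(r)
    edges = set()
    for i in range(abs_r.shape[0]):
        # The top_k genes most correlated with gene i (assumed: by |r|).
        for j in np.argsort(-abs_r[i])[:top_k]:
            j = int(j)
            if abs_r[i, j] > min_abs_r:
                edges.add((min(i, j), max(i, j)))  # the set removes duplicates
    return edges

# Toy data: genes 0-2 are strongly (anti-)correlated, gene 3 is unrelated.
toy_expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [1.0, -1.0, 1.0, -1.0, 1.0],
])
toy_edges = rank_based_network(toy_expr)
```

Since each gene contributes at most three edges, a network of 477 genes can contain at most 1431 edges before duplicates are removed, which is in line with the 1425 edges reported for the final network.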

Prioritization of DE genes from GEO using RWR algorithm

The RWR algorithm was run on the above co-expression network. GPEC [43], a Cytoscape plug-in, was used to rank the DE genes from GEO through an RWR algorithm [22]. RWR mimics the behaviour of a random walker that moves from an initial node to a neighbouring node or goes back to the starting nodes. Formally, RWR is defined as:

$$p^{t+1} = (1 - \gamma)\, M p^{t} + \gamma\, p^{0} \tag{1}$$

where $M$ is the column-normalized adjacency matrix, $\gamma$ is the restart probability, and $p^{0}$ is the initial probability vector, where the sum of the probabilities of the training nodes is equal to 1. The genes were ranked according to the steady-state probability vector $p$, which is numerically approximated by performing iterations until the difference between $p^{t+1}$ and $p^{t}$ (measured by the L1 norm) falls below $10^{-6}$ [43]. The score given by the RWR algorithm was defined as the RWR score.
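The iteration can be sketched as follows. This is an illustrative implementation under stated assumptions, not the GPEC code: the function name is hypothetical, the restart probability `gamma=0.5` is a placeholder because no value is given here, and the seed probability is assumed to be split equally among the seed nodes.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, gamma=0.5, tol=1e-6, max_iter=1000):
    """Approximate the steady-state RWR probabilities on an undirected network.

    adj:   symmetric 0/1 adjacency matrix of shape (n, n).
    seeds: indices of the seed nodes.
    gamma: restart probability (placeholder value, an assumption).
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    M = adj / col_sums                     # column-normalized adjacency matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # seed probabilities sum to 1
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - gamma) * (M @ p) + gamma * p0
        if np.abs(p_next - p).sum() < tol: # L1 norm of the change
            return p_next
        p = p_next
    return p

# Toy path graph 0 - 1 - 2 - 3 with node 0 as the only seed.
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
scores = random_walk_with_restart(path, seeds=[0])
```

On this toy graph the score decreases with distance from the seed, which is the behaviour that lets RWR rank candidate genes by their network proximity to known seed genes.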

Prioritization of DE genes from integration of OMIM and CTD with ParkDB

By using the search term ‘Parkinson's’ to search the OMIM database (http://www.omim.org/), 272 hits were extracted. We checked the genes one by one via PubMed to verify the correlations with PD and deleted the replicate genes. Ninety-eight PD-associated genes were obtained (Table S2). The genes that were targeted by therapeutic molecules for PD were extracted from CTD (http://ctd.mdibl.org). To identify the DE genes, the genes from OMIM and CTD were filtered by ParkDB (http://www2.cancer.ucl.ac.uk/Parkinson_Db2/) by applying the criteria: (a) human species was selected; (b) P ≤ 0.01; and (c) the up- or down-regulation of a DE gene was consistent across different experiments in ParkDB. The up- or down-regulations of DE genes were recorded for the CMap query. For each DE gene from OMIM, the absolute value of the average log2 expression FC, obtained from different experiments in ParkDB, was defined as the OMIM score. For each DE gene from CTD, the number of therapeutic molecules that were related to PD and that targeted this gene was defined as the CTD score.
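The three filtering criteria and the resulting OMIM score can be sketched as below. The record layout (the keys `gene`, `species`, `p_value` and `log2_fc`), the function name and the toy records are all hypothetical; ParkDB's actual export format is not described here.

```python
from collections import defaultdict
from statistics import mean

def filter_parkdb(records, candidate_genes, p_max=0.01):
    """Keep candidate genes that pass the three criteria.

    (a) human experiments only, (b) P <= p_max, and
    (c) the same direction of change in every retained experiment.
    Returns {gene: {"direction": "up" or "down", "score": |mean log2 FC|}}.
    """
    by_gene = defaultdict(list)
    for rec in records:
        if (rec["gene"] in candidate_genes
                and rec["species"] == "human"
                and rec["p_value"] <= p_max):
            by_gene[rec["gene"]].append(rec["log2_fc"])
    result = {}
    for gene, fcs in by_gene.items():
        if all(fc > 0 for fc in fcs) or all(fc < 0 for fc in fcs):
            result[gene] = {
                "direction": "up" if fcs[0] > 0 else "down",
                "score": abs(mean(fcs)),
            }
    return result

# Made-up records for illustration only.
toy_records = [
    {"gene": "G1", "species": "human", "p_value": 0.001, "log2_fc": -1.0},
    {"gene": "G1", "species": "human", "p_value": 0.005, "log2_fc": -2.0},
    {"gene": "G2", "species": "human", "p_value": 0.002, "log2_fc": 1.0},
    {"gene": "G2", "species": "human", "p_value": 0.003, "log2_fc": -1.0},
    {"gene": "G3", "species": "mouse", "p_value": 0.001, "log2_fc": 2.0},
    {"gene": "G4", "species": "human", "p_value": 0.05, "log2_fc": 3.0},
]
kept = filter_parkdb(toy_records, candidate_genes={"G1", "G2", "G3", "G4"})
```

In this toy input only G1 survives: G2 changes direction between experiments, G3 is not human and G4 fails the P-value threshold.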

Integration of multiple scores

The scores from different sources were not directly comparable; therefore, it was necessary to normalize and combine the individual scores (the FC, RWR, OMIM and CTD scores). Considering the variety of genetic information and data sources, different weights should be assigned to the four sets of scores. In the present study, the weights were assigned using an entropy-based method [44], which objectively determines each weight according to the variability of the corresponding set of scores.

Score normalization

Let m denote the number of prioritization approaches, n denote the number of DE genes (535), $X_{ij}$ (i = 1, …, m; j = 1, …, n) represent the initial score of the jth DE gene under the ith prioritization method and $S_{ij}$ represent the corresponding normalized score. The four sets of scores were transformed into normalized scores [45] by the formula:

display math(2)

Entropy-based weight assignment

For the ith method, entropy [46] was defined as:

\[ e_i = -k \sum_{j=1}^{n} f_{ij} \ln f_{ij} \tag{3} \]

where $f_{ij} = S_{ij} / \sum_{l=1}^{n} S_{il}$ and $k = 1/\ln n$. The entropy weight for the ith method was defined as:

\[ w_i = \frac{1 - e_i}{m - \sum_{l=1}^{m} e_l} \tag{4} \]

where $0 \le w_i \le 1$ and the weights $w_i$ (i = 1, …, m) sum to 1.

Score combination

The combined score of each DE gene was calculated by the function:

\[ \text{Combined score}_j = \sum_{i=1}^{m} w_i S_{ij} \tag{5} \]
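The normalization, weighting and combination steps can be sketched end to end as below. This is an illustration under stated assumptions rather than the authors' code: min-max scaling is assumed for the normalization step (the exact form of Eqn 2 is not recoverable here), the entropy and weight formulas follow the standard entropy weight method with k = 1/ln n, and all function names and toy scores are hypothetical.

```python
import math


def min_max_normalize(scores):
    """Assumed normalization: rescale one set of scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(x - lo) / (hi - lo) for x in scores]


def entropy(normalized):
    """Entropy of one method's normalized scores, scaled by k = 1 / ln n."""
    n = len(normalized)
    total = sum(normalized)
    if total == 0:
        return 1.0  # no variability: treated as maximal entropy
    k = 1.0 / math.log(n)
    # Terms with f = 0 contribute nothing (limit of f * ln f as f -> 0).
    return -k * sum((s / total) * math.log(s / total)
                    for s in normalized if s > 0)


def entropy_weights(normalized_sets):
    """Weight of each method: (1 - e_i) / (m - sum of all e)."""
    entropies = [entropy(s) for s in normalized_sets]
    denominator = len(entropies) - sum(entropies)
    return [(1 - e) / denominator for e in entropies]


def combine(normalized_sets, weights):
    """Weighted sum of normalized scores for every gene."""
    n_genes = len(normalized_sets[0])
    return [sum(w * s[j] for w, s in zip(weights, normalized_sets))
            for j in range(n_genes)]


# Two toy methods scoring three genes (illustrative numbers only).
raw_scores = [[1.0, 2.0, 3.0], [0.0, 0.0, 10.0]]
normalized = [min_max_normalize(s) for s in raw_scores]
weights = entropy_weights(normalized)
combined = combine(normalized, weights)
```

In the toy data the second method concentrates all of its score on one gene, so its entropy is lower and it receives the larger weight. This is the sense in which the weights reflect the variability of each set of scores.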

CMap analysis and performance measurement

Among the 1309 molecules in CMap, 93 therapeutic molecules for PD (Table S9) were collected based on the DrugBank database [47] (http://www.drugbank.ca/), the Clinical Trials database (http://www.clinicaltrials.gov/), the Therapeutic Target Database [48] (http://bidd.nus.edu.sg/group/ttd/ttd.asp), PolySearch [49] (http://wishart.biology.ualberta.ca/polysearch/), CTD and PubMed.

Signatures comprising the top 20, 50, 100, 200 or all (535) DE genes (Table S5) from the integrated approach were used to query CMap to determine a suitable signature size. Signatures comprising the top 20 genes from each individual prioritization approach and from the integrated approach were also used to query CMap to compare their performance. For the expression changes representing each of the 1309 treatment instances, CMap calculated the enrichment in up- and down-regulated genes of the given input signature based on the nonparametric rank-ordered Kolmogorov–Smirnov statistic. The enrichment scores were then combined to produce a connectivity score [10]. The molecules were ranked according to the minimum connectivity score. The performance of CMap with each signature was measured by the enrichment factor [50] and by the absolute number of therapeutic molecules for PD among the top 50 molecules of the ranked list returned by CMap. The enrichment factor [50] was calculated by the formula:

\[ \text{Enrichment factor} = \frac{Hits_{sampled} / N_{sampled}}{Hits_{total} / N_{total}} \tag{6} \]

where $Hits_{sampled}$ represents the number of therapeutic molecules for PD in the top 50 of the ranked list, $N_{sampled}$ represents 50, $Hits_{total}$ represents 93 and $N_{total}$ represents 1309.
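As a worked illustration of the enrichment factor, the sketch below computes the ratio of the hit rate in the sampled top of the list to the hit rate in the whole library. The function name is hypothetical and the hit count of 10 is an invented example, not a result from this study; the defaults of 50, 93 and 1309 are the values stated above.

```python
def enrichment_factor(hits_sampled, n_sampled=50, hits_total=93, n_total=1309):
    """Hit rate in the sampled subset divided by the hit rate in the library."""
    return (hits_sampled / n_sampled) / (hits_total / n_total)


# Hypothetical case: 10 known PD therapeutic molecules in the top 50.
example_ef = enrichment_factor(10)

# A random ranking is expected to place 50 * 93 / 1309 hits in the top 50,
# which corresponds to an enrichment factor of 1.
random_ef = enrichment_factor(50 * 93 / 1309)
```

A value above 1 means the signature ranks known therapeutic molecules near the top more often than a random ordering would.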

Cell culture and treatment

Human SH-SY5Y neuroblastoma cells were cultured in Dulbecco's modified Eagle's medium/F-12 medium (Neuronbc, Beijing, China) supplemented with 10% foetal bovine serum (Hyclone, Logan, UT, USA). SH-SY5Y cells were seeded at a density of 3 × 10³ cells per well in 96-well plates, and they were kept in a humidified atmosphere of 5% CO₂ and 95% air at 37 °C. After 24 h of cultivation, the serum was withdrawn for another 24 h. Then, the cells were pre-incubated with 17-DMAG (Selleck Chemicals, Houston, TX, USA) at concentrations of 10⁻¹¹ to 10⁻⁵ M (cytotoxicity assay) or 0.3 nM, 1 nM and 3 nM (other assays) for 1 h, followed by incubation with 2 μM rotenone (Sigma-Aldrich, St Louis, MO, USA) or vehicle control for 24 h or 48 h.

Cell viability assay

Cell viability was assessed by the MTT method, as described previously [51]. Briefly, after 24 h or 48 h of cultivation, the medium in each well was replaced with 100 μL of MTT (Sigma-Aldrich) solution at a concentration of 0.5 mg·mL⁻¹ and the cells were incubated for 3.5 h. The absorbance of formazan dissolved in dimethylsulfoxide was measured at 570 nm using a microplate reader (SpectraMax M5; Molecular Devices, CA, USA).

Measurement of intracellular ROS

The intracellular ROS level was determined using the 2′,7′-dichlorodihydrofluorescein diacetate (DCFH-DA) fluorescent probe (Sigma-Aldrich), as described previously [52]. Cells were treated with various concentrations of 17-DMAG (0.3 nM, 1 nM and 3 nM) for 1 h, followed by incubation with 2 μM rotenone or vehicle control for 24 h. Then, the cells were incubated with 10 μM DCFH-DA and 10 μM Hoechst 33 234 for 20 min at 37 °C in the dark, and washed twice with NaCl/Pi. The fluorescence intensity was detected and the images were photographed using a Cellomics ArrayScan VTI HCS Reader (Thermo Fisher Scientific-Cellomics, Pittsburgh, PA, USA). The intracellular ROS level was indicated by the mean of the average fluorescent intensity (Mean_AvgInten).

Measurement of mitochondrial respiration

Isolation of rat brain mitochondria

The animal experimental procedure was in accordance with the institutional guidelines and ethics for the use and care of laboratory animals and was approved by our local Animal Ethics Committee. The preparation of rat brain mitochondria and the measurement of mitochondrial respiration were performed as described previously [53] with slight modifications. Briefly, male Sprague–Dawley rats (250–300 g; Vital River Laboratory Animal Technology Co., Ltd, Beijing, China) were decapitated, and the cerebral cortex was rapidly removed into ice-cold isolation medium (0.25 M sucrose containing 10 mM Tris-HCl, 1 mM EDTA and 250 μg·mL⁻¹ BSA, pH 7.4). The cerebral cortex was repeatedly washed with the isolation medium and then homogenized in a glass homogenizer. The homogenate was centrifuged at 2000 g for 12 min at 4 °C to discard nuclei and cell debris. The supernatant was collected and centrifuged at 12 000 g for 12 min at 4 °C. The precipitate was suspended in 800 μL of isolation medium per rat to standardize the concentration. Mitochondria were freshly prepared for each experiment and were used immediately for respiration assays.

Measurement of mitochondrial respiration

Oxygen consumption by mitochondria was measured using a Clark oxygen electrode (Strathkelvin Instrument Co., Motherwell, UK). The oxygen consumption experiment was conducted at 25 °C in respiratory medium (225 mM sucrose, 5 mM KH₂PO₄, 10 mM Tris-HCl, 10 mM KCl, 0.2 mM EDTA and 100 μg·mL⁻¹ BSA, pH 7.4) in a total reaction volume of 500 μL. Mitochondria were incubated for 1 min in the Clark oxygen electrode chamber, followed by the addition of 0.1 μM rotenone in the presence or absence of 17-DMAG (3 nM). Then, 10 mM L-glutamate was added as the respiratory substrate, followed by 250 μM ADP. The respiratory control ratio (RCR) was calculated as the ratio of state 3 respiration to state 4 respiration.
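The RCR calculation can be illustrated with a short sketch. State 3 is taken here as the ADP-stimulated rate and state 4 as the rate after the added ADP has been consumed. The function names, the two-point slope estimate and the trace readings are hypothetical and serve only to show the arithmetic.

```python
def oxygen_consumption_rate(o2_start, o2_end, t_start, t_end):
    """Rate of oxygen consumption as the negative slope of the O2 trace."""
    return (o2_start - o2_end) / (t_end - t_start)


def respiratory_control_ratio(state3_rate, state4_rate):
    """RCR = state 3 (ADP-stimulated) rate / state 4 (ADP-exhausted) rate."""
    if state4_rate <= 0:
        raise ValueError("state 4 rate must be positive")
    return state3_rate / state4_rate


# Invented readings: dissolved O2 (arbitrary units) at the start and end
# of each respiratory state, with time in minutes.
state3 = oxygen_consumption_rate(o2_start=200.0, o2_end=140.0, t_start=2.0, t_end=3.0)
state4 = oxygen_consumption_rate(o2_start=140.0, o2_end=125.0, t_start=3.0, t_end=4.0)
rcr = respiratory_control_ratio(state3, state4)
```

Because the RCR is a ratio of two rates measured in the same units, the units of the oxygen signal cancel.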

Statistical analysis

The results are presented as the mean ± SEM of at least three experiments. Statistical analysis was conducted using one-way analysis of variance followed by Dunnett's test. P < 0.05 was considered statistically significant.
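The summary statistics named above can be sketched with the Python standard library. The sketch covers only the mean ± SEM and the one-way ANOVA F statistic; the P value and Dunnett's test require a dedicated statistics package and are not attempted here. The function names and data are hypothetical, and the SEM is assumed to use the sample standard deviation (n − 1 in the denominator).

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def one_way_anova_f(groups):
    """F statistic: between-group mean square / within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Invented measurements for three treatment groups (three experiments each).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
group_mean, group_sem = mean_sem(toy_groups[0])
f_statistic = one_way_anova_f(toy_groups)
```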

Acknowledgements

This research work was supported by a Research Special Fund for Public Welfare Industry of Health (No. 200802041), National Great Science and Technology Projects (2012ZX09301002-001-001 and 2012ZX09301002), the International Collaboration Project (2011DFR31240) and Peking Union Medical College graduate student innovation fund (2012-1007-006). Partial analyses were performed using BRB-ArrayTools developed by Dr Richard Simon and the BRB-ArrayTools Development Team. We greatly appreciate the comments made by the anonymous referees that improved our manuscript.
