Bioinformatics approach to identifying molecular biomarkers and networks in multiple sclerosis

Authors


Jun-ichi Satoh MD, PhD, Department of Bioinformatics and Molecular Neuropathology,
Meiji Pharmaceutical University,
2-522-1 Noshio, Kiyose, Tokyo 204-8588,
Japan.
Tel: +81-42-495-8678
Fax: +81-42-495-8678
Email: satoj@my-pharm.ac.jp

Abstract

Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system (CNS) white matter mediated by an autoimmune process triggered by a complex interplay between genetic and environmental factors, in which the precise molecular pathogenesis remains to be comprehensively characterized. The global analysis of the genome, transcriptome, proteome and metabolome, collectively termed omics, enables us to characterize the genome-wide molecular basis of MS. However, because omics studies produce large volumes of high-throughput experimental data at one time, it is often difficult to extract meaningful biological implications from such huge datasets. Recent advances in bioinformatics and systems biology have made major breakthroughs by illustrating the cell-wide map of complex molecular interactions with the aid of literature-based knowledgebases of molecular pathways. The integration of omics data derived from disease-affected cells and tissues with the underlying molecular networks provides a rational approach not only to identifying disease-relevant molecular markers and pathways, but also to designing effective network-based drugs for MS. (Clin. Exp. Neuroimmunol. doi: 10.1111/j.1759-1961.2010.00013.x, 2010)

Introduction

Multiple sclerosis (MS) is an inflammatory demyelinating disease that exclusively affects the central nervous system (CNS) white matter and is mediated by an autoimmune process triggered by a complex interplay between genetic and environmental factors.1 Intravenous administration of interferon-gamma (IFNγ) provoked acute relapses of MS, indicating a pivotal role of proinflammatory T helper type 1 (Th1) lymphocytes. More recent studies proposed a pathogenic role of Th17 lymphocytes in sustained tissue damage in MS.2 MS shows a great range of phenotypic variability. The disease is classified into relapsing-remitting MS (RRMS), secondary progressive MS (SPMS) or primary progressive MS (PPMS) with respect to the clinical course. Pathologically, MS shows remarkable heterogeneity in the degree of inflammation, complement activation, antibody deposition, demyelination and remyelination, oligodendrocyte apoptosis, and axonal degeneration.3 Drugs currently available in clinical practice for MS, including interferon-beta (IFNβ), glatiramer acetate, mitoxantrone, FTY720 and natalizumab, have shown only limited efficacy in subpopulations of patients.4 These observations suggest the hypothesis that MS is a neurological syndrome caused by different immunopathological mechanisms that lead to a final common pathway provoking inflammatory demyelination. Therefore, the identification of specific biomarkers relevant to the heterogeneity of MS is highly important for establishing molecular mechanism-based personalized therapy in MS.

Since the completion of the Human Genome Project in 2003, the global analysis of the genome, transcriptome, proteome and metabolome, collectively termed omics, has enabled us to characterize the genome-wide molecular basis of diseases, and has helped us to identify disease-specific molecular signatures and biomarkers for diagnosis and prediction of prognosis. Indeed, the genome-wide association study (GWAS) of MS revealed novel risk alleles for susceptibility to MS.5 The comprehensive transcriptome and proteome profiling of brain tissues and lymphocytes identified key molecules aberrantly regulated in MS, whose role in the pathogenesis of MS had not been previously predicted.6,7 Most recently, the application of next-generation sequencing technology to personal genomes has enabled us to investigate the genetic basis of MS at the level of individual patients.8

Because omics studies usually produce large volumes of high-throughput experimental data at one time, it is often difficult to extract meaningful biological implications from such huge datasets. Recent advances in bioinformatics and systems biology have made major breakthroughs by showing the cell-wide map of complex molecular interactions with the aid of literature-based knowledgebases of molecular pathways.9 The logically arranged molecular networks construct a whole system characterized by robustness, which maintains the proper function of the system in the face of genetic and environmental perturbations.10 In a scale-free molecular network, targeted disruption of a limited number of critical components, designated hubs, on which the biologically important molecular connections concentrate, could disturb the whole cellular function by destabilizing the network.11 From this point of view, the integration of omics data derived from disease-affected cells and tissues with the underlying molecular networks provides a rational approach not only to characterizing disease-relevant pathways, but also to identifying effective network-based drug targets.

An increasing number of human disease-oriented omics datasets have been deposited in public databases, such as the Gene Expression Omnibus (GEO) repository (http://www.ncbi.nlm.nih.gov/geo) and the ArrayExpress archive (http://www.ebi.ac.uk/microarray-as/ae). Most of these are transcriptome datasets. Importantly, they contain potentially valuable information on molecular biomarkers and networks of the diseases, which can be recovered when they are reanalyzed by appropriate bioinformatics approaches, followed by validation of in silico observations with in vitro and in vivo experiments.12

The present review has focused on bioinformatics approaches to identifying MS-associated molecular biomarkers and networks from high-throughput data of omics studies.

Global gene expression analysis

DNA microarray technology is an innovative approach that allows us to systematically monitor the genome-wide gene expression pattern of disease-affected tissues and cells. This approach enables us to illustrate most efficiently a global picture of cellular activity using messenger RNA (mRNA) expression levels as an indicator, although the levels of mRNA do not always correlate with the levels of the proteins directly involved in cellular function. Nevertheless, DNA microarrays are more convenient for collecting temporal and spatial snapshots of gene expression than conventional mass spectrometry, which is often hampered by the limited resolution of protein separation. In transcriptome analysis, we could logically assume that a set of coregulated genes might have similar biological functions within the cells.

First, I briefly outline the workflow of gene expression analysis (Fig. 1). In general, total RNA fractions containing mRNA species are extracted from cells and tissues, individually labeled with fluorescent dyes, and processed for hybridization with thousands of oligonucleotides of known sequences immobilized on the arrays. After washing, the arrays are processed for signal acquisition on a scanner. Various types of microarrays are currently available, and the MicroArray Quality Control (MAQC) project verified that the core results are well reproducible among the different platforms used.13 Nevertheless, it is recommended that each experiment should contain biological replicates to validate the reproducibility of the observations. The raw data are normalized by representative methods, including the quantile normalization method and the Robust MultiChip Average (RMA) method, using the R software with Bioconductor packages (http://www.cran.r-project.org) or the GeneSpring software (Agilent Technologies, Palo Alto, CA, USA).
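The quantile normalization step mentioned above can be illustrated with a minimal Python sketch. It is not the Bioconductor or GeneSpring implementation; the function name `quantile_normalize` and the toy intensity values are assumptions made only for illustration, and ties are broken by probe position rather than averaged.

```python
def quantile_normalize(arrays):
    """Force every array (sample) to share the same intensity distribution.

    `arrays` is a list of equal-length lists, one list of probe
    intensities per sample. Hypothetical helper for illustration only.
    """
    n_probes = len(arrays[0])
    # Sort each array, then average across arrays at each rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(a[i] for a in sorted_arrays) / len(arrays)
        for i in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            # Replace each intensity by the mean intensity of its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy data: three samples, four probes each (made-up values).
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(arrays)
```

After normalization, every sample contains exactly the same set of values, assigned according to the within-sample rank of each probe.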

Figure 1.

The road map from global gene expression profiling to molecular network analysis. Total RNA samples labeled with fluorescent dyes are processed for hybridization with oligonucleotide probes on the arrays, which should include biological replicates. The arrays are processed for signal acquisition on a scanner. To identify the list of differentially expressed genes (DEG) among the samples, the normalized data are processed for statistical analysis, followed by validation by quantitative reverse transcription polymerase chain reaction (qRT–PCR). They are also processed for hierarchical clustering analysis and gene ontology and function analysis. To identify biologically relevant molecular pathways, the list of DEG is imported into pathway analysis tools endowed with a comprehensive knowledgebase. ANOVA, analysis of variance; DAVID, Database for Annotation, Visualization and Integrated Discovery; FDR, false discovery rate; GSEA, Gene Set Enrichment Analysis; IPA, Ingenuity Pathways Analysis; KEGG, Kyoto Encyclopedia of Genes and Genomes; MCT, multiple comparison test; PANTHER, Protein Analysis Through Evolutionary Relationships; STRING, Search Tool for the Retrieval of Interacting Genes/Proteins.

To identify differentially expressed genes (DEG) among distinct samples, the normalized data are processed for statistical analysis using the t-test for comparison between two groups or analysis of variance (ANOVA) for comparison among three or more groups, followed by a multiple comparison test with the Bonferroni correction or by control of the false discovery rate (FDR) below 0.05 to adjust P-values.
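The two ways of adjusting P-values named above can be compared in a short Python sketch. The Benjamini-Hochberg procedure is used here as one common way of controlling the FDR, which is an assumption because the text does not name a specific FDR method; the gene labels and raw P-values are made up for illustration.

```python
def bonferroni(pvalues):
    """Bonferroni correction: multiply each P-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values (one way to control the FDR)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, keeping the
    # adjusted values monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values from per-gene t-tests (illustration only).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
raw_p = [0.001, 0.012, 0.02, 0.041, 0.6]

bonf_hits = [g for g, p in zip(genes, bonferroni(raw_p)) if p < 0.05]
fdr_hits = [g for g, p in zip(genes, benjamini_hochberg(raw_p)) if p < 0.05]
```

In this toy example, the Bonferroni correction retains only one gene, whereas FDR control at 0.05 retains three, which illustrates why the less conservative FDR approach is often preferred for genome-wide screening.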

In the next step, the levels of expression of DEG should be validated by quantitative reverse transcription polymerase chain reaction (qRT–PCR). The normalized data are also processed for hierarchical clustering analysis to classify expression profile-based groups of genes and samples by using GeneSpring or open-access resources, such as Cluster 3.0 (http://www.bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster) and TreeView (http://www.sourceforge.net/projects/jtreeview). The Gene ID Conversion tool of the Database for Annotation, Visualization and Integrated Discovery (DAVID) (http://www.david.abcc.ncifcrf.gov)14 converts large-scale array-specific probe IDs into the corresponding Entrez Gene IDs, HUGO Gene Symbols, Ensembl Gene IDs or UniProt IDs, which are more convenient for downstream analysis. Both the DAVID Functional Annotation tool and the Gene Set Enrichment Analysis (GSEA) tool (http://www.broad.mit.edu/gsea/downloads.jsp)15 are open-access resources that help us to identify a set of enriched genes with a specified functional annotation in the entire list of genes. Many other approaches for preprocessing microarray data are applicable, and the resources are available elsewhere.
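The hierarchical clustering step mentioned above can be sketched as follows. This is a toy average-linkage clustering with a Pearson correlation-based distance, not the algorithm settings of Cluster 3.0 or GeneSpring; the sample names and expression values are invented for illustration.

```python
from math import sqrt


def pearson_distance(a, b):
    """Distance defined as 1 minus the Pearson correlation coefficient."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [
                    pearson_distance(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                ]
                mean_dist = sum(dists) / len(dists)
                if best is None or mean_dist < best[0]:
                    best = (mean_dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical expression values of four discriminator genes per sample.
profiles = {
    "relapse_1": [1.0, 2.0, 3.0, 4.0],
    "relapse_2": [2.0, 4.0, 6.0, 9.0],
    "remission_1": [4.0, 3.0, 2.0, 1.0],
    "remission_2": [9.0, 6.0, 5.0, 2.0],
}
clusters = average_linkage(profiles, n_clusters=2)
```

Samples with positively correlated profiles are merged first, so the two made-up relapse samples and the two made-up remission samples end up in separate clusters.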

Molecular network analysis

To identify biologically relevant molecular pathways from large-scale data, we could analyze them by using a battery of pathway analysis tools endowed with a comprehensive knowledgebase, namely Kyoto Encyclopedia of Genes and Genomes (KEGG; http://www.kegg.jp), the Protein Analysis Through Evolutionary Relationships (PANTHER) classification system (http://www.pantherdb.org), Search Tool for the Retrieval of Interacting Genes/Proteins (STRING; string.embl.de), Ingenuity Pathways Analysis (IPA; Ingenuity Systems, http://www.ingenuity.com) and KeyMolnet (Institute of Medicinal Molecular Design, http://www.immd.co.jp) (Fig. 1). KEGG, PANTHER and STRING are open-access databases, whereas IPA and KeyMolnet are commercial databases that are updated regularly. Both transcriptome and proteome data are acceptable for all the databases described here.

KEGG systematically integrates genomic and chemical information to create the whole biological system in silico.16 KEGG includes manually curated reference pathways that cover a wide range of metabolic, genetic, environmental and cellular processes, and human diseases. Currently, KEGG contains 108 983 pathways generated from 358 reference pathways. PANTHER, operating on computational algorithms that relate the evolution of protein sequences to the evolution of protein functions and biological roles, provides a structured representation of protein function in the context of biological reaction networks.17 PANTHER includes information on 165 regulatory and metabolic pathways, manually curated by expert biologists. By uploading the list of Gene IDs, the PANTHER gene expression data analysis tool identifies the genes in terms of over- or under-representation in canonical pathways, followed by statistical evaluation by a multiple comparison test with the Bonferroni correction. STRING is a database that contains physical and functional protein-protein interactions covering 2 590 259 proteins from 630 organisms.18 STRING integrates information from numerous sources, including experimental repositories, computational prediction methods and public text collections. By uploading the list of UniProt IDs, STRING illustrates the union of all possible association networks.

IPA is a knowledgebase that contains approximately 2 270 000 biological and chemical interactions and functional annotations with definite scientific evidence, curated by expert biologists.19 By uploading the list of Gene IDs and expression values, the network-generation algorithm identifies focused genes integrated in a global molecular network. IPA calculates the score as a P-value that reflects the statistical significance of the association between the genes and the networks, using Fisher’s exact test.

KeyMolnet contains knowledge-based content on 123 000 relationships among human genes and proteins, small molecules, diseases, pathways and drugs, curated by expert biologists.20 They are categorized into the core content collected from selected review articles with the highest reliability or the secondary content extracted from abstracts of PubMed and the Human Protein Reference Database (HPRD). By importing the list of Gene IDs and expression values, KeyMolnet automatically provides the corresponding molecules as nodes on networks. The “common upstream” network-search algorithm enables us to extract the most relevant molecular network composed of the genes coordinately regulated by putative common upstream transcription factors. The “neighboring” network-search algorithm selects one or more molecules as starting points to generate the network of all kinds of molecular interactions around the starting molecules, including direct activation/inactivation, transcriptional activation/repression, and complex formation within the designated number of paths from the starting points. The “N-points to N-points” network-search algorithm identifies the molecular network constructed by the shortest route connecting the start-point molecules and the end-point molecules. The generated network is compared side-by-side with 430 human canonical pathways of the KeyMolnet library. The algorithm counting the number of overlapping molecular relations between the extracted network and the canonical pathway makes it possible to identify the canonical pathway showing the most significant contribution to the extracted network. The significance of the similarity between the two is scored by the following formula, where O is the number of overlapping molecular relations between the extracted network and the canonical pathway, V is the number of molecular relations located in the extracted network, C is the number of molecular relations located in the canonical pathway, T is the number of total molecular relations, and X is the sigma variable that defines coincidence.

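\[
\mathrm{Score} = -\log_2\bigl(\mathrm{Score}(p)\bigr), \qquad
\mathrm{Score}(p) = \sum_{X=O}^{\min(C,V)} f(X), \qquad
f(X) = \frac{\binom{C}{X}\,\binom{T-C}{V-X}}{\binom{T}{V}}
\]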
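The quantities O, V, C and T defined above are the ingredients of a hypergeometric tail probability. The following Python sketch assumes that the score is the negative base-2 logarithm of the probability of observing O or more overlapping relations by chance; this form is an assumption made here for illustration and is not taken from the KeyMolnet implementation. The function name and the toy numbers are hypothetical.

```python
from math import comb, log2


def overlap_score(O, V, C, T):
    """Hypergeometric tail probability and its -log2 transform.

    O: overlapping relations, V: relations in the extracted network,
    C: relations in the canonical pathway, T: total relations.
    Assumed form of the score, for illustration only.
    """
    upper = min(C, V)
    # Sum the probability of X = O, O + 1, ..., min(C, V) overlaps.
    tail = sum(comb(C, x) * comb(T - C, V - x) for x in range(O, upper + 1))
    p = tail / comb(T, V)
    return p, -log2(p)


# Toy numbers: 20 relations in total, 5 in the pathway,
# 4 in the extracted network, 3 of which overlap.
p, score = overlap_score(O=3, V=4, C=5, T=20)
```

A smaller tail probability gives a larger score, so the canonical pathway with the highest score is the one whose overlap with the extracted network is least likely to arise by chance.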

Biomarkers for predicting MS relapse

The molecular mechanisms underlying acute relapse of MS remain unknown. If molecular biomarkers for MS relapse were identified, we could predict the timing of relapses, which would be invaluable for starting preventive intervention as early as possible.

By gene expression profiling with Affymetrix Human Genome U133 plus 2.0 arrays, Corvol et al. identified 975 genes that separate patients with clinically isolated syndrome (CIS) into four groups.21 Surprisingly, 92% of the patients in group 1, which was characterized by a subset of 108 genes, converted to clinically definite MS (CDMS) within 9 months of the first attack. They proposed downregulation of TOB1, a negative regulator of T cell proliferation, as a marker predicting the conversion from CIS to CDMS.

By gene expression profiling with Affymetrix Human Genome U133A2 arrays, Achiron et al. showed that 1578 DEG of peripheral blood mononuclear cells (PBMC) of RRMS patients, differentiating acute relapse from remission, are enriched in the apoptosis-related pathway, in which proapoptotic genes are downregulated, whereas antiapoptotic genes are upregulated during acute relapse.22 The same group also compared 62 patients with CDMS and 32 patients with CIS by combining gene expression profiling with support vector machine (SVM)-based prediction of the time to the next acute relapse, setting up a two-stage predictor composed of First Level Predictors (FLP) and Fine Tuning Predictors (FTP).23 They identified three sets of the best 10-gene FLP that predict the next relapse with a resolution of 500 days and four sets of the best 9-gene FTP that predict the forthcoming relapse with a resolution of 50 days. The predictor genes are enriched in the TGFB2-related signaling pathway. More recently, Achiron et al. compared nine subjects who developed MS during a 9-year follow-up period (the preactive stage of MS; MS-to-be) and 11 control subjects unaffected by MS (MS-free) by gene expression profiling.24 They found downregulation of the nuclear receptor NR4A1 in the preactive stage of MS, suggesting that self-reactive T cells are not eliminated in the MS-to-be population, owing to a defect in the NR4A1-dependent apoptotic mechanism.

By gene expression profiling with a custom microarray of the Peter MacCallum Cancer Institute, Arthur et al. showed that a set of dysregulated genes in peripheral blood cells during the relapse and the remission phases of RRMS are enriched in the categories involved in apoptosis and inflammation, when annotated according to the GOstat program.25 They also found upregulation of TGFB1 during the relapse. These observations support the working hypothesis that MS relapse involves an imbalance between promoting and preventing apoptosis of autoreactive and regulatory T cells. By gene expression analysis with Affymetrix Human Genome U133 plus 2.0 arrays, Brynedal et al. showed that MS relapses reflect the gene expression change in PBMC, but not in cerebrospinal fluid (CSF) lymphocytes, suggesting the importance of initial events triggering relapses occurring outside the CNS.26

By gene expression profiling with a custom DNA microarray (Hitachi Life Science, Saitama, Japan), we identified 43 DEG in peripheral blood CD3+ T cells between the peak of acute relapse and the complete remission of RRMS patients.27 We isolated highly purified CD3+ T cells, because autoreactive pathogenic and regulatory cells, which potentially play a major role in MS relapse and remission, might be enriched in this fraction. By using 43 DEG as a set of discriminators, hierarchical clustering separated the cluster of relapse from that of remission. The molecular network of 43 DEG extracted by the common upstream search of KeyMolnet showed the most significant relationship with transcriptional regulation by the nuclear factor-kappa B (NF-κB). NF-κB is a central regulator of innate and adaptive immune responses, cell proliferation, and apoptosis.28 A considerable number of NF-κB target genes activate NF-κB itself, providing a positive regulatory loop that amplifies and perpetuates inflammatory responses, leading to persistent activation of autoreactive T cells in MS. These observations support the logical hypothesis that NF-κB plays a central role in triggering molecular events in T cells responsible for induction of acute relapse of MS, and suggest that aberrant gene regulation by NF-κB on T-cell transcriptome serves as a molecular biomarker for monitoring the clinical disease activity of MS. Supporting this hypothesis, increasing evidence has shown that NF-κB represents a central molecular target for MS therapy.29

We also studied the gene expression profile of purified CD3+ T cells isolated from four Hungarian monozygotic MS twin pairs with a custom DNA microarray (Hitachi Life Science, Saitama, Japan).30 By comparing three concordant pairs and one discordant pair, we identified 20 DEG aberrantly regulated between the MS patient and the genetically identical healthy subject. The molecular network of 20 DEG extracted by the common upstream search of KeyMolnet showed the most significant relationship with transcriptional regulation by the Ets transcription factor family. Ets transcription factor proteins, by interacting with various co-regulatory factors, control the expression of a wide range of target genes essential for cell proliferation, differentiation, transformation and apoptosis. Importantly, Ets-1, the prototype of the Ets family members, acts as a negative regulator of Th17 cell differentiation.31 It is worth noting that discordant monozygotic MS twin siblings do not show any genetic or epigenetic differences, as validated by whole genome sequencing analysis and genome-scale DNA methylation profiling.8

Biomarkers for predicting IFNβ responders

Although recombinant IFNβ therapy is widely used as the gold standard to reduce disease activity of MS, up to 50% of the patients continue to have relapses, followed by progression of disability. If molecular biomarkers for IFNβ responsiveness were identified, we could select the best treatment option for each patient, which would be invaluable for establishing personalized therapy of MS.

By genome-wide screening of single-nucleotide polymorphisms (SNP) with Affymetrix Human 100K SNP arrays, Byun et al. identified allelic differences between IFNβ responders and non-responders of RRMS patients in several genes, including HAPLN1, GPC5, COL25A1, CAST and NPAS3, although odds ratios of SNP differences of individual genes are fairly low.32

By gene expression profiling with Affymetrix Human Genome U133A Plus 2.0 arrays, Comabella et al. showed that IFNβ non-responders of RRMS patients after treatment for 2 years are characterized by the overexpression of type I IFN-induced genes in PBMC, associated with increased endogenous production of type I IFN by monocytes at pretreatment.33 These observations suggest that IFNβ non-responsiveness in MS is attributable to a preactivated type I IFN signaling pathway. By gene expression profiling with Affymetrix Human Genome Focus arrays, Sellebjerg et al. showed that in vivo injection of IFNβ rapidly induces elevation of IFI27, CCL2 and CXCL10 in PBMC of MS patients, even after 6 months of treatment,34 consistent with previous studies.35 The induction of IFN-responsive genes is greatly reduced in patients with neutralizing antibodies (NAbs) against IFNβ.34 In contrast, there are no global differences in gene expression profiles of PBMC of RRMS patients between NAbs-negative IFNβ non-responders and responders.36

By gene expression profiling with Affymetrix Human Genome U133A/B arrays, Goertsches et al. found that IFNβ administration in vivo elevates a panel of IFN-responsive genes in PBMC of RRMS patients during a 2-year treatment, but it also downregulates several genes, including CD20, a known target of B-cell depletion therapy in MS.37 By using the Pathway Architect software (Stratagene, La Jolla, CA, USA), they identified two major gene networks in which upregulated STAT1 and downregulated ITGA2B act as central molecules, although they did not further characterize the responder/non-responder-linked gene expression profiles.

By gene expression profiling with a custom array of the National Institutes of Health (NIH)/National Institute of Neurological Disorders and Stroke (NINDS) Microarray Consortium, Fernald et al. showed that a 1-week IFNβ administration in vivo induces a set of coregulated genes whose networks are related to immune- and apoptosis-regulatory functions, involving JAK-STAT and NF-κB cascades, whereas the networks of untreated subjects are composed of the genes of cellular housekeeping functions.38 By combining kinetic RT–PCR analysis of expression of 70 genes in PBMC of RRMS with the integrated Bayesian inference system approach, the same group previously reported that nine sets of gene triplets detected at pretreatment, including a panel of caspases, well predict the response to IFNβ with up to 86% accuracy.39

By gene expression profiling with a custom microarray (Hitachi), we previously identified a set of interferon-responsive genes expressed in purified peripheral blood CD3+ T cells of RRMS patients receiving IFNβ treatment.40 IFNβ immediately induces a burst of expression of chemokine genes with potential relevance to IFNβ-related early adverse effects in MS.41 The majority of the top 30 most significant DEG in CD3+ T cells between untreated MS patients and healthy subjects are categorized into apoptosis signaling regulators.42 Furthermore, we found that T cell gene expression profiling classifies a heterogeneous population of Japanese MS patients into four distinct subgroups that differ in disease activity and therapeutic response to IFNβ.43 We identified 286 DEG between 72 untreated Japanese MS patients and 22 age- and sex-matched healthy subjects. By importing the list of 286 DEG into the common upstream search of KeyMolnet, the generated network showed the most significant relationship with transcriptional regulation by NF-κB.30 Although no single gene alone serves as an MS-specific biomarker, NR4A2 (NURR1), a target of NF-κB acting as a positive regulator of IL-17 and IFNγ production, is highly upregulated in MS T cells.42,43 It is worth noting that IFNβ is beneficial in the disease induced by Th1 cells, but detrimental in the disease mediated by Th17 cells in mouse experimental autoimmune encephalomyelitis (EAE), and that IFNβ non-responders among RRMS patients show higher serum IL-17F levels, suggesting that IL-17 serves as a biomarker predicting a poor IFNβ response in MS.44

Molecular networks of MS brain lesion proteome

Recently, Han et al. investigated a comprehensive proteome of six frozen MS brains.7 Proteins were prepared from small pieces of brain tissue isolated by laser-capture microdissection (LCM); the tissues were characterized separately by standard histological examination and classified into acute plaques (AP), chronic active plaques (CAP) or chronic plaques (CP) based on the disease activity. The proteins were then separated on one-dimensional SDS-PAGE gels and digested in-gel with trypsin, and the peptide fragments were processed for mass spectrometric analysis. Among 2574 proteins determined with high confidence, the INTERSECT/INTERACT program identified 158, 416 and 236 lesion-specific proteins detected exclusively in AP, CAP and CP, respectively. They found that overproduction of five molecules involved in the coagulation cascade, including tissue factor and protein C inhibitor, plays a central role in molecular events ongoing in CAP. Furthermore, in vivo administration of coagulation cascade inhibitors indeed reduced the clinical severity in EAE, supporting the view that blockade of the coagulation cascade would be a promising approach for the treatment of MS.43 However, nearly all the remaining proteins are uncharacterized in terms of their implications in MS brain lesion development.
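The notion of lesion-specific proteins, that is, proteins detected exclusively in one lesion type, can be expressed with simple set operations. The sketch below is not the INTERSECT/INTERACT program; the function name and the protein identifiers P1 to P6 are hypothetical placeholders.

```python
def lesion_specific(proteomes):
    """Return, for each lesion type, the proteins found in no other type."""
    specific = {}
    for stage, proteins in proteomes.items():
        # Union of the proteins detected in all the other lesion types.
        others = set().union(
            *(p for s, p in proteomes.items() if s != stage)
        )
        specific[stage] = set(proteins) - others
    return specific


# Made-up protein identifiers for the three lesion types.
proteomes = {
    "AP": {"P1", "P2", "P3"},
    "CAP": {"P2", "P4", "P5"},
    "CP": {"P3", "P5", "P6"},
}
specific = lesion_specific(proteomes)
```

Proteins shared by two or more lesion types (P2, P3 and P5 in this toy example) are excluded from every lesion-specific list.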

We studied molecular networks and pathways of the proteome dataset of Han et al. by using four different bioinformatics tools for molecular network analysis, namely KEGG, PANTHER, KeyMolnet and IPA.45 KEGG and PANTHER showed the relevance of extracellular matrix (ECM)-mediated focal adhesion and integrin signaling to the CAP and CP proteomes. KeyMolnet, by the N-points to N-points search, disclosed a central role of the complex interaction among diverse cytokine signaling pathways in brain lesion development at all disease stages, as well as a role of integrin signaling in CAP and CP. IPA identified the network constructed with a wide range of ECM components, such as COL1A1, COL1A2, COL6A2, COL6A3, FN1, FBLN2, LAMA1, VTN and HSPG2, as one of the networks highly relevant to the CAP proteome. Thus, four distinct tools commonly suggested a role of ECM and integrin signaling in the development of chronic MS lesions, suggesting that selective blockade of the interaction between ECM and integrin molecules in brain lesions in situ would be a target for therapeutic intervention to terminate ongoing events responsible for the persistence of inflammatory demyelination.

KeyMolnet identifies a candidate molecular target for MS therapy

The KeyMolnet library includes 91 MS-linked molecules, collected from selected review articles with the highest reliability (Table 1). By importing the list of these molecules into KeyMolnet, the neighboring search within one path from starting points generates the highly complex molecular network composed of 913 molecules and 1005 molecular relations (Fig. 2a). The extracted network shows the most significant relationship with transcriptional regulation by vitamin D receptor (VDR) with P-value of the score = 4.415E-242. Thus, VDR, a hub that has direct connections with 118 closely related molecules of the extracted network (Fig. 2b, Table 2), serves as one of the most promising molecular target candidates for MS therapy, because the adequate manipulation of the VDR network capable of producing a great impact on the whole network could efficiently disconnect the pathological network of MS. Indeed, vitamin D plays a protective role in MS by activating VDR, a transcription factor that regulates the expression of as many as 500 genes, although the underlying molecular mechanism remains largely unknown.46
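The idea of a neighboring search within one path, followed by identification of a hub as the most highly connected molecule, can be sketched on a toy network. This is not the KeyMolnet algorithm; the function names, the molecule labels (mol_A to mol_H) and the relations among them are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_adjacency(relations):
    """Undirected adjacency map from a list of (molecule, molecule) pairs."""
    adj = defaultdict(set)
    for a, b in relations:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def neighboring_search(adj, starts, max_paths=1):
    """Collect all molecules within `max_paths` steps of the start points."""
    seen = set(starts)
    frontier = set(starts)
    for _ in range(max_paths):
        frontier = {n for node in frontier for n in adj[node]} - seen
        seen |= frontier
    return seen


def rank_by_degree(adj, nodes):
    """Rank molecules by their number of connections inside the network."""
    degree = {n: len(adj[n] & nodes) for n in nodes}
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up molecular relations (illustration only).
relations = [
    ("mol_A", "mol_B"), ("mol_A", "mol_C"), ("mol_A", "mol_D"),
    ("mol_A", "mol_E"), ("mol_B", "mol_F"), ("mol_E", "mol_G"),
    ("mol_G", "mol_H"),
]
adj = build_adjacency(relations)
starts = {"mol_A", "mol_B", "mol_C", "mol_D"}
network = neighboring_search(adj, starts, max_paths=1)
ranked = rank_by_degree(adj, network)
```

In this toy network, mol_A has the largest number of direct connections and would therefore be regarded as the hub, whereas molecules more than one path away from the start points (mol_G and mol_H) are not included in the extracted network.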

Table 1. Multiple sclerosis-linked molecules of the KeyMolnet library
KeyMolnet ID  KeyMolnet symbol  Description
KMMC:04422    2,3cnPDE          2′,3′-cyclic nucleotide 3′-phosphodiesterase
KMMC:04421    aBcrystallin      Alpha crystallin B chain
KMMC:01024    ADAM17            A disintegrin and metalloproteinase 17
KMMC:04753    AMPAR             AMPA-type glutamate receptor
KMMC:00019    APP               Amyloid beta A4 protein
KMMC:07424    AQP4              Aquaporin 4
KMMC:06672    b-arrestin1       Beta-arrestin 1
KMMC:04017    BAFF              B-cell activating factor
KMMC:00868    Bcl-2             B-cell lymphoma 2
KMMC:00728    Ca                Calcium ion
KMMC:00605    caspase-1         Caspase-1
KMMC:00429    CCL2              Chemokine (C-C motif) ligand 2
KMMC:00425    CCL3              Chemokine (C-C motif) ligand 3
KMMC:00424    CCL5              Chemokine (C-C motif) ligand 5
KMMC:00450    CCR1              Chemokine (C-C motif) receptor 1
KMMC:00454    CCR5              Chemokine (C-C motif) receptor 5
KMMC:03088    CD28              T-cell-specific surface glycoprotein CD28
KMMC:00530    CD80              T-lymphocyte activation antigen CD80
KMMC:03089    CTLA-4            Cytotoxic T-lymphocyte protein 4
KMMC:00418    CXCL10            Chemokine (C-X-C motif) ligand 10
KMMC:00447    CXCR3             Chemokine (C-X-C motif) receptor 3
KMMC:00271    ERa               Estrogen receptor alpha
KMMC:00362    FGF-2             Fibroblast growth factor 2
KMMC:04423    GFAP              Glial fibrillary acidic protein
KMMC:01120    Glu               Glutamic acid
KMMC:00396    glucocorticoid    Glucocorticoid
KMMC:03232    hH1R              Histamine H1 receptor
KMMC:00344    HLA class II      HLA class II histocompatibility antigen
KMMC:09224    HLA-C5            HLA-C5
KMMC:09221    HLA-DQA1*0102     HLA-DQA1*0102
KMMC:06358    HLA-DQA1*0301     HLA-DQA1*0301
KMMC:06359    HLA-DQB1*0302     HLA-DQB1*0302
KMMC:09222    HLA-DQB1*0602     HLA-DQB1*0602
KMMC:06309    HLA-DRB1          HLA-DRB1
KMMC:06315    HLA-DRB1*0301     HLA-DRB1*0301
KMMC:09223    HLA-DRB1*0405     HLA-DRB1*0405
KMMC:09191    HLA-DRB1*11       HLA-DRB1*11
KMMC:07762    HLA-DRB1*15       HLA-DRB1*15
KMMC:06903    HLA-DRB1*1501     HLA-DRB1*1501
KMMC:07763    HLA-DRB1*1503     HLA-DRB1*1503
KMMC:09220    HLA-DRB5*0101     HLA-DRB5*0101
KMMC:04418    HSP105            Heat-shock protein 105 kDa
KMMC:00526    IFNb              Interferon beta
KMMC:00404    IFNg              Interferon gamma
KMMC:00292    IGF1              Insulin-like growth factor 1
KMMC:03611    IgG               Immunoglobulin G
KMMC:00402    IL-10             Interleukin-10
KMMC:03248    IL-12             Interleukin-12
KMMC:04266    IL-12Rb2          Interleukin-12 receptor beta-2 chain
KMMC:03129    IL-17             Interleukin-17
KMMC:03383    IL-18             Interleukin-18
KMMC:00521    IL-1b             Interleukin-1 beta
KMMC:00296    IL-2              Interleukin-2
KMMC:06578    IL-23             Interleukin-23
KMMC:00533    IL-2Rac           Interleukin-2 receptor alpha chain
KMMC:00400    IL-4              Interleukin-4
KMMC:03255    IL-5              Interleukin-5
KMMC:00108    IL-6              Interleukin-6
KMMC:03257    IL-7Rac           Interleukin-7 receptor alpha chain
KMMC:00523    IL-9              Interleukin-9
KMMC:00555    iNOS              Inducible nitric oxide synthase
KMMC:00982    int-a4/b1         Integrin alpha-4/beta-1
KMMC:00968    int-aM            Integrin alpha-M
KMMC:00970    int-aX            Integrin alpha-X
KMMC:04094    MBP               Myelin basic protein
KMMC:06533    mGluR             Metabotropic glutamate receptor
KMMC:04420    MOG               Myelin-oligodendrocyte glycoprotein
KMMC:04419    MPLP              Myelin proteolipid protein
KMMC:03210    N-VDCC            Voltage dependent N-type calcium channel
KMMC:04712    NCAM              Neural cell adhesion molecule
KMMC:06537    NCE               Na(+)-Ca2+ exchanger
KMMC:05576    NeuroF            Neurofilament protein
KMMC:09225    neurofascin       Neurofascin
KMMC:05903    NF-H              Neurofilament triplet H protein
KMMC:05904    NF-L              Neurofilament triplet L protein
KMMC:03785    NMDAR             N-methyl-D-aspartate receptor
KMMC:07764    NMDAR1            N-methyl-D-aspartate receptor subunit NR1
KMMC:07765    NMDAR2C           N-methyl D-aspartate receptor subtype 2C
KMMC:07766    NMDAR3A           N-methyl-D-aspartate receptor subtype NR3A
KMMC:02064    NO                Nitric oxide
KMMC:07767    Olig-1            Oligodendrocyte transcription factor 1
KMMC:01005    OPN               Osteopontin
KMMC:03073    PDGF              Platelet derived growth factor
KMMC:06225    Sema3A            Semaphorin 3A
KMMC:06229    Sema3F            Semaphorin 3F
KMMC:00111    SMAD3             Mothers against decapentaplegic homolog 3
KMMC:03839    tau               Microtubule-associated protein tau
KMMC:00349    TNFa              Tumor necrosis factor alpha
KMMC:00545    VCAM-1            Vascular cell adhesion protein 1
KMMC:03832    VD                Vitamin D
KMMC:03711    VDR               Vitamin D3 receptor

The 91 multiple sclerosis-linked molecules of the KeyMolnet library are listed in alphabetical order.

Figure 2. Molecular network of 91 MS-linked molecules. (a) When the 91 MS-linked molecules are imported into KeyMolnet, the neighboring search within one path from the starting points generates a highly complex molecular network composed of 913 molecules and 1005 molecular relations. (b) The extracted network shows the most significant relationship with transcriptional regulation by vitamin D receptor (VDR), which has direct connections with 118 closely related molecules of the extracted network. VDR is indicated by a blue circle. Red nodes represent starting point molecules, whereas white nodes show additional molecules extracted automatically from the core contents to establish molecular connections. Molecular relations are shown by a solid line with an arrow (direct binding or activation), a solid line with an arrow and stop (direct inactivation), a solid line without an arrow (complex formation), a dashed line with an arrow (transcriptional activation), and a dashed line with an arrow and stop (transcriptional repression). High-resolution figures are available at http://www.my-pharm.ac.jp/~satoj/sub22.html.
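
The neighboring search described in the Figure 2 legend can be pictured as a one-step expansion from a set of starting molecules over a list of pairwise relations. The following Python sketch illustrates that general idea only; it is not KeyMolnet's implementation, which is not described here. The function name `neighboring_search`, the three starting molecules and every relation in the toy list are assumptions made for illustration and are not taken from the KeyMolnet knowledgebase.

```python
from collections import defaultdict


def neighboring_search(seeds, relations):
    """One-step expansion: keep every relation that touches a starting molecule.

    seeds     -- iterable of starting molecule symbols (hypothetical input)
    relations -- iterable of (molecule_a, molecule_b) pairs, treated as undirected
    """
    seeds = set(seeds)
    # Keep only relations with at least one endpoint among the starting molecules.
    kept = {frozenset(pair) for pair in relations if seeds & set(pair)}
    # Network nodes are the starting molecules plus everything they connect to.
    nodes = seeds.union(*kept)
    # Count how many kept relations each molecule takes part in.
    degree = defaultdict(int)
    for relation in kept:
        for molecule in relation:
            degree[molecule] += 1
    return {
        "start_nodes": seeds,          # drawn as red nodes in Figure 2
        "added_nodes": nodes - seeds,  # drawn as white nodes in Figure 2
        "relations": kept,
        "degree": dict(degree),
    }


# Toy input: placeholder relations for illustration only.
seeds = ["VDR", "OPN", "IL-2"]
relations = [
    ("VDR", "RXR"),
    ("VDR", "CYP24A1"),
    ("VDR", "OPN"),
    ("VDR", "IL-2"),
    ("RXR", "NCOA1"),      # touches no starting molecule, so it is dropped
    ("IL-2", "IL-2Rac"),
]

result = neighboring_search(seeds, relations)
# The most highly connected molecule in the toy network.
hub = max(result["degree"], key=result["degree"].get)
print(sorted(result["added_nodes"]), len(result["relations"]), hub)
```

In this toy network the most highly connected molecule plays the role that VDR plays in panel (b). Ranking pathways by statistical significance, as KeyMolnet does, would require additional information that this sketch does not model.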

Table 2. Molecules constructing the transcriptional regulation by vitamin D receptor network
The 118 molecules constructing the transcriptional regulation by VDR network are listed in alphabetical order.

KeyMolnet ID | KeyMolnet symbol | Description
KMMC:02959 | 1a,25(OH)2D3 | 1 alpha, 25-dihydroxyvitamin D3
KMMC:00751 | amphiregulin | Amphiregulin
KMMC:03795 | ANP | Atrial natriuretic peptide
KMMC:00090 | b-catenin | beta-catenin
KMMC:00301 | c-Fos | Protooncogene c-fos
KMMC:00183 | c-Jun | Protooncogene c-jun
KMMC:00626 | c-Myc | Protooncogene c-myc
KMMC:03813 | CA-II | Carbonic anhydrase II
KMMC:04105 | CalbindinD28K | Vitamin D-dependent calcium-binding protein, avian-type
KMMC:03531 | CalbindinD9K | Vitamin D-dependent calcium-binding protein, intestinal
KMMC:00289 | caseinK2 | Casein kinase 2
KMMC:04195 | CaSR | Extracellular calcium-sensing receptor
KMMC:00268 | CBP | CREB binding protein
KMMC:00922 | CD44 | CD44 antigen
KMMC:00136 | CDK2 | Cyclin dependent kinase 2
KMMC:00135 | CDK6 | Cyclin dependent kinase 6
KMMC:01008 | collagen | Collagen
KMMC:06770 | collagenase-I | Type I collagenase
KMMC:04081 | CRABP2 | Cellular retinoic acid-binding protein II
KMMC:00060 | CRT | Calreticulin
KMMC:00401 | CXCL8 | Chemokine (C-X-C motif) ligand 8 (IL8)
KMMC:00137 | cyclinA | Cyclin A
KMMC:00061 | cyclinD1 | Cyclin D1
KMMC:05926 | cyclinD3 | Cyclin D3
KMMC:00093 | cyclinE | Cyclin E
KMMC:02960 | CYP24A1 | Cytochrome P450 24A1
KMMC:02958 | CYP27B1 | Cytochrome P450 27B1
KMMC:04593 | CYP3A4 | Cytochrome P450 3A4
KMMC:06769 | cystatin M | Cystatin M
KMMC:06762 | Cytokeratin 13 | Keratin, type I cytoskeletal 13
KMMC:06751 | Cytokeratin 16 | Keratin, type I cytoskeletal 16
KMMC:00053 | DHTR | Dihydrotestosterone receptor
KMMC:00928 | E-cadherin | E-cadherin
KMMC:00594 | ErbB1 | Receptor protein-tyrosine kinase erbB-1
KMMC:00068 | filamin | Filamin
KMMC:00341 | FN1 | Fibronectin 1
KMMC:06760 | FREAC-1 | Forkhead box protein F1
KMMC:06763 | G0S2 | G0/G1 switch protein 2
KMMC:00617 | GM-CSF | Granulocyte macrophage colony stimulating factor
KMMC:06755 | Hairless | Hairless protein
KMMC:05978 | HOXA10 | Homeobox protein Hox-A10
KMMC:06767 | HOXB4 | Homeobox protein Hox-B4
KMMC:00404 | IFNg | Interferon gamma
KMMC:00579 | IGF-BP3 | Insulin-like growth factor binding protein 3
KMMC:04498 | IGF-BP5 | Insulin-like growth factor binding protein 5
KMMC:00402 | IL-10 | Interleukin-10
KMMC:03241 | IL-10R | Interleukin-10 receptor
KMMC:03239 | IL-10Rac | Interleukin-10 receptor alpha chain
KMMC:03240 | IL-10Rbc | Interleukin-10 receptor beta chain
KMMC:03248 | IL-12 | Interleukin-12
KMMC:03246 | IL-12A | Interleukin-12 alpha chain
KMMC:00403 | IL-12B | Interleukin-12 beta chain
KMMC:00296 | IL-2 | Interleukin-2
KMMC:00108 | IL-6 | Interleukin-6
KMMC:00973 | int-b3 | Integrin beta-3
KMMC:03747 | IVL | Involucrin
KMMC:00629 | JunB | Protooncogene jun-B
KMMC:04334 | JunD | Protooncogene jun-D
KMMC:06764 | KLK10 | Kallikrein-10
KMMC:06765 | KLK6 | Kallikrein-6
KMMC:04635 | Mad1 | Max dimerization protein 1
KMMC:06757 | Metallothionein | Metallothionein
KMMC:06722 | MKP-5 | MAP kinase phosphatase 5
KMMC:00595 | MMP-2 | Matrix metalloproteinase 2
KMMC:03104 | MMP-3 | Matrix metalloproteinase 3
KMMC:00631 | MMP-9 | Matrix metalloproteinase 9
KMMC:00556 | MnSOD | Manganese superoxide dismutase
KMMC:00927 | N-cadherin | N-cadherin
KMMC:00074 | NCOA1 | Nuclear receptor coactivator 1
KMMC:00075 | NCOA2 | Nuclear receptor coactivator 2
KMMC:00080 | NCOA3 | Nuclear receptor coactivator 3
KMMC:00282 | NCOR1 | Nuclear receptor corepressor 1
KMMC:00270 | NCOR2 | Nuclear receptor corepressor 2
KMMC:00392 | NFAT | Nuclear factor of activated T cells
KMMC:00104 | NFkB | Nuclear factor kappa B
KMMC:03120 | OPG | Osteoprotegerin
KMMC:01005 | OPN | Osteopontin
KMMC:00304 | osteocalcin | Osteocalcin
KMMC:00100 | p21CIP1 | Cyclin dependent kinase inhibitor 1
KMMC:00155 | p27KIP1 | Cyclin dependent kinase inhibitor 1B
KMMC:00195 | p300 | E1A binding protein p300
KMMC:03204 | PLCb1 | Phospholipase C beta 1
KMMC:03295 | PLCd1 | Phospholipase C delta 1
KMMC:00724 | PLCg1 | Phospholipase C gamma 1
KMMC:04869 | plectin1 | Plectin 1
KMMC:06772 | PMCA1 | Plasma membrane calcium-transporting ATPase 1
KMMC:06766 | PP1c | Serine/threonine protein phosphatase PP1 catalytic subunit
KMMC:00786 | PP2A | Serine/threonine protein phosphatase 2A
KMMC:03442 | PPARd | Peroxisome proliferator activated receptor delta
KMMC:03710 | PTH | Parathyroid hormone
KMMC:00346 | PTHrP | Parathyroid hormone-related protein
KMMC:03115 | RANKL | Receptor activator of NFkB ligand
KMMC:04537 | RelB | Transcription factor RelB
KMMC:00091 | RIP140 | Nuclear factor RIP140
KMMC:00383 | RXR | Retinoid X receptor
KMMC:06771 | SCCA | Squamous cell carcinoma antigen
KMMC:05340 | SKIP | Ski-interacting protein
KMMC:04103 | SUG1 | 26S protease regulatory subunit 8
KMMC:05702 | TAFII130 | Transcription initiation factor TFIID subunit 4
KMMC:06753 | TAFII28 | Transcription initiation factor TFIID subunit 11
KMMC:06752 | TAFII55 | Transcription initiation factor TFIID subunit 7
KMMC:04955 | TCF-1 | T-cell-specific transcription factor 1
KMMC:03075 | TCF-4 | T-cell-specific transcription factor 4
KMMC:06754 | TFIIA | Transcription initiation factor IIA
KMMC:04089 | TFIIB | Transcription initiation factor IIB
KMMC:06768 | TGase I | Transglutaminase I
KMMC:04184 | TGFb1 | Transforming growth factor beta 1
KMMC:05986 | TGFb2 | Transforming growth factor beta 2
KMMC:04104 | TIF1 | Transcription intermediary factor 1
KMMC:00349 | TNFa | Tumor necrosis factor alpha
KMMC:00277 | TRAP220 | Thyroid hormone receptor-associated protein complex component TRAP220
KMMC:06759 | TRPV5 | TRP vanilloid receptor 5
KMMC:06758 | TRPV6 | TRP vanilloid receptor 6
KMMC:06756 | TRR1 | Thioredoxin reductase 1
KMMC:03711 | VDR | Vitamin D3 receptor
KMMC:04853 | VDUP1 | Vitamin D3 up-regulated protein 1
KMMC:06761 | ZNF-44 | Zinc finger protein 44
KMMC:05147 | ZO-1 | Tight junction protein ZO-1
KMMC:05811 | ZO-2 | Tight junction protein ZO-2

Conclusion

MS is a complex disease with remarkable heterogeneity caused by the intricate interplay between various genetic and environmental factors. Recent advances in bioinformatics and systems biology have made major breakthroughs by illustrating the cell-wide map of complex molecular interactions with the aid of literature-based knowledgebases of molecular pathways. The efficient integration of high-throughput experimental data derived from disease-affected cells and tissues with the underlying molecular networks helps to characterize the molecular markers and pathways relevant to MS heterogeneity, and facilitates the identification of network-based effective drug targets for personalized therapy of MS.

Acknowledgements

This work was supported by grants from the Research on Intractable Diseases, the Ministry of Health, Labour and Welfare of Japan (H22-Nanchi-Ippan-136), and the High-Tech Research Center Project, the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan (S0801043). The author thanks Dr Takashi Yamamura, Department of Immunology, National Institute of Neurosciences, NCNP for his continuous help with our studies.
