Corresponding author: Javier Costas, Instituto de Investigación Sanitaria de Santiago, Fundación Pública Galega de Medicina Xenómica-SERGAS, Edif. Consultas Andar –2, E-15706 Santiago de Compostela, Spain. Tel: 34-981951490; Fax: 34-981951473; E-mail: firstname.lastname@example.org
A balanced translocation affecting DISC1 cosegregates with several psychiatric disorders, including schizophrenia, in a Scottish family. DISC1 is a hub protein of a network of protein–protein interactions involved in multiple developmental pathways within the brain. Gene set-based analysis has been proposed as an alternative to individual analysis of single nucleotide polymorphisms (SNPs) to get information from genome-wide association studies. In this work, we tested for an overrepresentation of the DISC1 interacting proteins within the top results of our ranked list of genes based on our previous genome-wide association study of missense SNPs in schizophrenia. Our data set consisted of 5100 common missense SNPs genotyped in 476 schizophrenic patients and 447 control subjects from Galicia, NW Spain. We used a modification of the Gene Set Enrichment Analysis adapted for SNPs, as implemented in the GenGen software. The analysis detected an overrepresentation of the DISC1 interacting proteins (permuted P-value = 0.0158), indicative of the role of this gene set in schizophrenia risk. We identified seven leading-edge genes, MACF1, UTRN, DST, DISC1, KIF3A, SYNE1, and AKAP9, responsible for the overrepresentation. These genes are involved in neuronal cytoskeleton organization and intracellular transport through the microtubule cytoskeleton, suggesting that these processes may be impaired in schizophrenia.
Introduction
Genome-wide association studies (GWAS) of schizophrenia identified a few common variants reaching the highly stringent criteria for genome-wide significance (Purcell et al., 2009; Stefansson et al., 2009; Ripke et al., 2011; Steinberg et al., 2011). They also revealed the existence of many susceptibility single nucleotide polymorphisms (SNPs) embedded within the distribution of P values, far from reaching statistical significance (Purcell et al., 2009; Lee et al., 2012). Gene set-based analyses have been proposed as a way to obtain additional information from GWAS results (Wang et al., 2007; Cantor et al., 2010).
The DISC1 gene was discovered when it was disrupted by a translocation that segregated with several psychiatric disorders, including schizophrenia, in a Scottish family (St Clair et al., 1990; Millar et al., 2000). Since then, it has been extensively studied. One of the approaches to the study of its role in psychiatric phenotypes was the analysis of protein–protein interactions involving DISC1 (Millar et al., 2003; Miyoshi et al., 2003, 2004; Morris et al., 2003; Ozeki et al., 2003; Camargo et al., 2007). These studies have shown that DISC1 is a “hub” protein, i.e., a highly connected node in its protein–protein network, involved in multiple processes, such as microtubule organization, kinesin-mediated transport of vesicles, neurite extension, or regulation of neural progenitor proliferation (Camargo et al., 2007; Chubb et al., 2008; Brandon et al., 2009).
Several proteins that interact with DISC1 may be associated with schizophrenia risk. For instance, the DISC1 interacting protein PDE4B is also disrupted by a balanced translocation in a patient with schizophrenia and a relative with chronic psychiatric illness (Millar et al., 2005). In addition, one SNP at this gene, rs910694, is associated with schizophrenia according to the SZgene meta-analysis (Allen et al., 2008). There is some evidence of epistasis between DISC1 and its partners in relation to schizophrenia susceptibility (Burdick et al., 2008; Nicodemus et al., 2010; Andreasen et al., 2011). Nevertheless, most of the DISC1 partners have not been tested for association with schizophrenia. Some animal models of DISC1 interaction partners exhibit behavioral and neuroanatomical phenotypes related to schizophrenia (Ikeda et al., 2008; Sakae et al., 2008; Youn et al., 2009; Carlisle et al., 2011), similar to DISC1 mutant mice (Brandon & Sawa, 2011).
In a previous study, we performed a genome-wide analysis of missense SNPs to identify specific SNPs associated with an increased risk of schizophrenia, leading to the identification of SLC39A8 as a new schizophrenia susceptibility gene (Carrera et al., 2012). Here, following the hypothesis that different perturbations of the DISC1 interaction network may increase individual susceptibility to schizophrenia, we reanalyzed these data to test for overrepresentation of the DISC1 interactome gene set at the top of the ranked list of association test statistics. To this end, we used a Gene Set Enrichment Analysis method, based on previous studies on gene expression, specifically adapted for GWAS using SNPs (Wang et al., 2007).
Materials and Methods
Genotype Data Set
Our genotype data set, once the quality control procedures were applied, consisted of 5100 missense SNPs at frequencies higher than 5% located in 3751 different genes, genotyped in 476 unrelated patients (69% males) with a DSM-IV (Diagnostic and Statistical Manual of Mental Disorders-IV) diagnosis of schizophrenia and 447 unrelated control subjects (49% males) from Galicia, NW Spain (Carrera et al., 2012). All the cases gave their written informed consent to participate in this study, which was approved by the Ethical Committee of Clinical Research from Galicia, Spain. The genomic inflation factor for this data set was 1.035 using the trend test, a value that remains almost identical after correcting for the four main components of ancestry using multidimensional scaling, indicating that population stratification is not a concern in this data set. Further details of the data set are in Carrera et al. (2012).
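As a rough illustration of the genomic inflation factor reported above, the following sketch computes it as the median of the observed 1-df trend-test chi-square statistics divided by the median expected under the null (about 0.4549). The function name and the toy statistics are assumptions made for illustration; they are not values or code from the study.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation_factor(chi2_stats):
    """Return lambda = median(observed chi-square) / median(null chi-square, 1 df)."""
    if not chi2_stats:
        raise ValueError("at least one test statistic is required")
    return median(chi2_stats) / CHI2_1DF_MEDIAN

# Hypothetical per-SNP trend-test statistics (not data from the study).
toy_stats = [0.02, 0.10, 0.30, 0.47, 0.90, 1.60, 3.10]
lam = genomic_inflation_factor(toy_stats)
print(round(lam, 3))
```

Values close to 1, such as the 1.035 reported here, indicate little inflation of the test statistics.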
Association results for stage 1 of the Schizophrenia Psychiatric GWAS Consortium (PGC) mega-analyses (Ripke et al., 2011) were obtained from the PGC Web site (https://pgc.unc.edu/Sharing.php). Currently, this is the largest data set of results of association with schizophrenia, comprising 9394 cases and 12,462 controls.
DISC1 Interactome Gene Sets
Two sets of genes were analyzed. The first set comprised all genes with genotyped data that interact with DISC1, according to: (i) STRING, a database of protein–protein interactions, considering only data from experiments and databases (Jensen et al., 2009); (ii) the Entrez GENE database from NCBI (Maglott et al., 2005); and (iii) bibliographic searches. The second set comprised those genes that interact with at least two additional genes of the “DISC1 interactome” described by Camargo et al. (2007). This second set is named “DISC1 interactome core” throughout this work.
Gene Set Analysis
The main gene set analysis was performed as described in Wang et al. (2007), using the GenGen software (http://www.openbioinformatics.org/gengen/). First, a Cochran–Armitage trend test for association was done at each SNP. Next, a ranked list of genes based on the P-value of the most significant SNP at each gene was generated. Then, the putative overrepresentation of the DISC1 interactome at the top of the ranked list was estimated by calculation of an enrichment score based on the weighted Kolmogorov–Smirnov-like running-sum statistic. Finally, the significance of the enrichment score was assessed by 5000 permutations of the case–control labels. Because only the labels are permuted, the linkage disequilibrium structure and SNP density at each gene are preserved.
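The steps above can be sketched as follows. This is a minimal illustration, not the GenGen implementation: it assumes genotype counts ordered by number of copies of the tested allele (0, 1, 2), takes the largest trend-test chi-square among the SNPs of a gene as the gene statistic, and uses a weight exponent of 1. All function names and the toy gene labels are hypothetical, and the label-permutation step is omitted.

```python
def trend_chi2(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend chi-square (1 df) from genotype counts."""
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_w_cases = sum(w * r for w, r in zip(weights, cases))
    sum_w_all = sum(w * t for w, t in zip(weights, totals))
    sum_w2_all = sum(w * w * t for w, t in zip(weights, totals))
    denom = n_cases * n_controls * (n * sum_w2_all - sum_w_all ** 2)
    if denom == 0:
        raise ValueError("statistic undefined (monomorphic SNP or empty group)")
    return n * (n * sum_w_cases - n_cases * sum_w_all) ** 2 / denom

def enrichment_score(gene_stats, gene_set, p=1.0):
    """Weighted Kolmogorov-Smirnov-like running sum over genes ranked by statistic.

    gene_stats: dict gene -> association statistic (larger = stronger).
    Returns (maximum running sum, in-set genes ranked at or before that maximum).
    """
    members = set(gene_set) & set(gene_stats)
    if not members or len(members) == len(gene_stats):
        raise ValueError("gene set must be a non-empty proper subset of ranked genes")
    ranked = sorted(gene_stats, key=gene_stats.get, reverse=True)
    hit_total = sum(abs(gene_stats[g]) ** p for g in members)
    miss_step = 1.0 / (len(ranked) - len(members))
    running, best, best_idx = 0.0, float("-inf"), -1
    for i, gene in enumerate(ranked):
        if gene in members:
            running += abs(gene_stats[gene]) ** p / hit_total  # step up, weighted
        else:
            running -= miss_step  # step down, uniform
        if running > best:
            best, best_idx = running, i
    leading_edge = [g for g in ranked[: best_idx + 1] if g in members]
    return best, leading_edge

# Hypothetical gene statistics (not data from the study).
toy_genes = {"A": 6.0, "B": 5.0, "C": 4.0, "D": 3.0, "E": 2.0, "F": 1.0}
es, leading = enrichment_score(toy_genes, {"A", "C", "F"})
print(round(es, 4), leading)
```

In a full analysis, the same score would be recomputed after each permutation of the case–control labels to obtain its null distribution.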
An additional gene set analysis for confirmation of results was done with the gene set option of PLINK 1.07 (Purcell et al., 2007). The main gene set statistic in this analysis is the mean of the different single SNP statistics that, in our case, is the χ2 value of the Cochran–Armitage trend test for association. Several parameters have to be chosen for this analysis, such as the r2 threshold that determines whether a SNP is included in the gene set statistic and the maximum number of SNPs to include in this statistic. We chose intermediate values of 0.5 and 10, respectively, for these parameters. Significance of the gene set statistic was calculated by 10,000 permutations of the case–control label.
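A simplified sketch of this kind of set statistic is given below. It is not PLINK code: it assumes a greedy selection that keeps the strongest SNP first and skips any SNP whose r2 with an already selected SNP exceeds the threshold, then averages the chi-square values of at most `max_snps` SNPs. Any further SNP filters that PLINK applies are left out, and the function names, the `+1` correction in the empirical P-value, and the toy values are assumptions.

```python
def set_statistic(snp_chi2, r2, r2_max=0.5, max_snps=10):
    """Mean chi-square of greedily selected SNPs not in high LD with each other.

    snp_chi2: dict SNP -> chi-square statistic.
    r2:       dict frozenset({snp_a, snp_b}) -> pairwise r2 (missing pairs count as 0).
    """
    chosen = []
    for snp in sorted(snp_chi2, key=snp_chi2.get, reverse=True):
        if len(chosen) == max_snps:
            break
        if all(r2.get(frozenset((snp, kept)), 0.0) <= r2_max for kept in chosen):
            chosen.append(snp)
    if not chosen:
        raise ValueError("no SNPs available for the set statistic")
    return sum(snp_chi2[s] for s in chosen) / len(chosen), chosen

def empirical_p(observed, permuted):
    """Share of permuted statistics at least as large as the observed one."""
    exceed = sum(1 for value in permuted if value >= observed)
    return (exceed + 1) / (len(permuted) + 1)

# Hypothetical SNP statistics and LD values (not data from the study).
toy_chi2 = {"rs_a": 6.0, "rs_b": 5.0, "rs_c": 2.0}
toy_r2 = {frozenset(("rs_a", "rs_b")): 0.8}
stat, kept_snps = set_statistic(toy_chi2, toy_r2)
p_emp = empirical_p(stat, [1.0, 2.0, 4.5, 0.5])
print(stat, kept_snps, p_emp)
```

Here `rs_b` is dropped because its r2 with `rs_a` exceeds 0.5, so the statistic is the mean of the two remaining SNPs.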
Results
Twenty-one of the 3751 genes with genotype data in our previous work present experimental evidence of interaction with DISC1 (Table 1). The most significant P-value corresponds to the SNP rs587404 at MACF1 (trend test P-value = 0.0123, false discovery rate q = 0.71, SNP rank by significance: 85). The GenGen analysis detected a significant enrichment of the genes from the DISC1 interactome at the top of the ranked list of genes by significance (P = 0.0158), as determined by 5000 permutations of the phenotypic labels. The analysis of the DISC1 interactome core, based only on data for eight genes, confirmed the significance of the network (P = 0.021). Two of the eight genes of the DISC1 interactome core are not included in the wide DISC1 interactome, while 13 of the genes of the wide DISC1 interactome are not in the core gene set (Table 1).
Table 1. DISC1 interactome genes present in the data set
Leading-edge analysis identified seven genes that ranked before or at the point of maximum enrichment score (Fig. 1). Only four of the seven genes presented at least one SNP with nominal P-value lower than 0.05 (Table 1).
Additional analyses using the set-based test of PLINK showed similar results, with P-values of 0.055 and 0.025 for the broad DISC1 interactome and the DISC1 interactome core, respectively.
Data from the Schizophrenia PGC are available for 16 of the 21 SNPs (Table 1). Three SNPs reached significance at the nominal level, including the most significant in our own data. Three additional SNPs present a P-value lower than 0.1 and four additional SNPs present a P-value lower than 0.2. Comparison of this distribution of P-values with the distribution of P-values of the linkage disequilibrium clumped Schizophrenia PGC SNPs (N = 113,774, r2 = 0.25 within 500 kb window) reveals a trend for overrepresentation of the DISC1 interactome in the upper part of ranked P-values (Kolmogorov–Smirnov test, P = 0.078), although lack of access to individual genotype data precludes a detailed analysis.
Discussion
Our analysis identified a significant overrepresentation of the proteins that interact with DISC1 within the top results of the association with schizophrenia, suggesting the relevance of this gene set. Using an approach to detect protein–protein interacting networks that locally maximize the proportion of genes with low P-values in two different GWAS data sets, Jia et al. identified 205 genes with a joint effect significantly associated with schizophrenia. Among these genes are DISC1 and seven direct interactors, including SYNE1, KIF3A, MACF1, and UTRN, all of them leading-edge genes in our analysis (Jia et al., 2012).
The leading-edge genes of the DISC1 interactome identified in our analysis are involved in neuronal cytoskeleton organization and dynamics, affecting processes such as cytoskeletal stability or intracellular transport through the microtubule cytoskeleton, a process overrepresented within the DISC1 interacting proteins (Camargo et al., 2007). UTRN binds the actin cytoskeleton, linking it to the extracellular matrix, suggesting a role in structural support of neuronal membranes (Perronnet & Vaillend, 2010). AKAP9 is required for the assembly of microtubules on Golgi membranes (Rivero et al., 2009), and is probably involved in cytoskeletal attachment of NMDA receptors (Lin et al., 1998). KIF3A is a member of the kinesin-II complex, a motor protein complex involved in the anterograde transport of membranous organelles to the axons (Kondo et al., 1994; Hirokawa et al., 2009). DST is a microtubule cross-linking factor responsible for the high stability of axonal microtubules necessary for transport over long distances (Yang et al., 1999). Disruption of its association with dynactin results in defects in retrograde axonal transport (Sonnenberg & Liem, 2007). MACF1 is also a microtubule cross-linking factor closely related to DST. DST-null mice suggest compensation by MACF1 in the central nervous system, indicative of some functional redundancy (Leung et al., 2002). DISC1 is involved in transport along microtubules to the distal part of axons, acting as a cargo receptor in a complex with kinesin-I (Shinoda et al., 2007; Taya et al., 2007). Finally, SYNE1 is a constituent of the LINC (linker of nucleoskeleton and cytoskeleton) complex, involved in the anchorage of the nuclear envelope to the actin cytoskeleton, playing an important role in neurogenesis (Zhang et al., 2009). A fragment of SYNE1 binds the KIF3B subunit of the kinesin-II complex (Fan & Beck, 2004).
Interestingly, a meta-analysis of GWAS of bipolar disorder revealed association with SYNE1 (Sklar et al., 2011), which was recently confirmed and extended to major depression (Green et al., 2013). Furthermore, exome sequencing of sporadic autism patients identified a de novo missense mutation of predicted functional consequence in SYNE1 (O'Roak et al., 2012) and a truncating de novo mutation in DST (Iossifov et al., 2012). In addition, Yu et al. (2013) applied exome sequencing to consanguineous families with several members affected by autism spectrum disorders (ASD) and found a putatively functional missense mutation in SYNE1 that cosegregates with ASD. Taking into account that genetic factors of susceptibility to psychiatric disorders are often shared by several different psychiatric diagnoses (Gratacos et al., 2009; Malhotra & Sebat, 2012; Cross-Disorder Group of the Psychiatric Genomics Consortium, 2013), including the original DISC1 translocation (St Clair et al., 1990), these findings lend additional support to our main result.
At least three of the seven leading-edge genes of the DISC1 interactome, DST, MACF1, and SYNE1, encode proteins that also bind DTNBP1 (Benson et al., 2001; Camargo et al., 2007; Guo et al., 2009). DTNBP1 is considered another putative schizophrenia susceptibility gene (Straub et al., 2002; Allen et al., 2008). Early work found that UTRN formed a complex with DTNBP1 in brain (Benson et al., 2001), but further work failed to replicate this result (Nazarian et al., 2006). DTNBP1 is also involved in cytoskeletal organization and in vesicle transport from the neuronal cell body to synaptic terminals as a subunit of the BLOC-1 complex (Kubota et al., 2009; Mead et al., 2010; Larimore et al., 2011).
Detection of enrichment of the DISC1 interacting gene set has been facilitated by several characteristics of our study design. First, we used a hypothesis-driven gene set, a strategy recently suggested for the analysis of GWAS data as a whole in order to avoid the excessive burden of correction for multiple testing intrinsic to the hypothesis-free approach (Askland et al., 2011). Second, we manually compiled the candidate gene set taking advantage of the many studies on the DISC1 interacting partners. This is especially important for neuropsychiatric diseases because common pathway/gene set databases may not have well-annotated pathways for neuronal function (Wang et al., 2010). Although the exact genes belonging to the DISC1 interactome gene set may be a matter of debate, the detection of significant overrepresentation of the DISC1 interactome using two different criteria for gene set selection shows that our result is robust to these criteria. Third, the use of missense SNPs led to a direct assignment of SNP to gene, avoiding the selection of arbitrary criteria to assign intergenic SNPs to genes. Finally, while other methods to test overrepresentation of a set of genes consider an arbitrary threshold of significance, the method by Wang et al. (2007) considers all the P-values. Taking into account that there is no clear consensus on the best way to perform gene set analysis, the use of different gene set approaches to examine consistency of the results has been suggested (Wang et al., 2010). To this end, we performed an additional gene set analysis using PLINK (Purcell et al., 2007). Unlike GenGen, PLINK requires choosing several parameters that may have a considerable effect on the final result. Using intermediate parameters for consideration of SNPs to enter the gene set statistic, we obtained results in agreement with GenGen, increasing the strength of the evidence for association of the DISC1 interactome with schizophrenia.
One of the main findings of GWAS in schizophrenia is the existence of an important common polygenic component, quantified by a polygenic score, usually measured as the sum of the number of susceptibility alleles at each SNP weighted by its odds ratio (Purcell et al., 2009; Carrera et al., 2012). Interestingly, this polygenic score explains a larger proportion of the population variance in liability for developing schizophrenia if it is based on a larger set of SNPs instead of just the most significant ones. In fact, Purcell et al. (2009) found that polygenic scores based on all the SNPs at P-values lower than 0.5 in their GWAS explained the largest proportion of variance in liability to schizophrenia in independent case–control samples. This suggests that there are many common susceptibility variants that do not reach significance, although their P-values are slightly lower than expected by chance. In agreement with this, we identified a trend for overrepresentation of the SNPs of the DISC1 interactome analyzed in this work at the top of the ranked list of P-values from the PGC data (Ripke et al., 2011).
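The polygenic score described above can be sketched for a single individual as follows. This is only an illustration under stated assumptions: the natural logarithm of the odds ratio is used as the per-SNP weight (a common convention, assumed here), SNPs enter the score only if their discovery P-value is below a chosen threshold, and the function name, SNP labels, and numbers are hypothetical.

```python
import math

def polygenic_score(allele_counts, odds_ratios, p_values, p_threshold=0.5):
    """Sum of risk-allele counts weighted by log odds ratio, over SNPs passing the threshold.

    allele_counts: dict SNP -> number of susceptibility alleles carried (0, 1, or 2).
    odds_ratios:   dict SNP -> odds ratio estimated in the discovery sample.
    p_values:      dict SNP -> discovery P-value (SNPs without one are skipped).
    """
    score = 0.0
    for snp, count in allele_counts.items():
        if snp in odds_ratios and p_values.get(snp, 1.0) < p_threshold:
            score += count * math.log(odds_ratios[snp])
    return score

# Hypothetical values for one individual (not data from the study).
counts = {"s1": 2, "s2": 1, "s3": 0}
ors = {"s1": 1.2, "s2": 1.1, "s3": 1.3}
pvals = {"s1": 0.01, "s2": 0.6, "s3": 0.2}
score = polygenic_score(counts, ors, pvals)
print(round(score, 4))
```

Raising `p_threshold` admits more SNPs into the score, which mirrors the comparison of thresholds discussed above.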
Some limitations of the work should be considered. The association at each gene was summarized by the best P-value and therefore this method does not take into account putative allelic heterogeneity (Cantor et al., 2010). In addition, our genotype data set included few genes from the DISC1 interactome. Another caveat is the absence of an additional data set for formal replication. Finally, we must take into account that gene set analysis of association studies is at a preliminary stage, and there is no consensus method for analysis.
In spite of these limitations, we have found a significant association between a subset of the DISC1 interactome gene set and schizophrenia in agreement with previous findings (Jia et al., 2012). The result may be of relevance, mainly under the consideration of schizophrenia as a pathway disease (Sullivan, 2012).
Acknowledgments
This work has been funded by Grants FIS/FEDER 08/1522 from ISCIII (Instituto de Salud Carlos III) and INCITE08PXIB9101149PR from Xunta de Galicia to JC and by the REGENPSI (Red de Genética de Enfermedades Neurológicas y Psiquiátricas) network (Xunta de Galicia). The funding sources had no involvement in any additional aspect of the study. We thank the healthcare professionals of Complexo Hospitalario de Santiago and the staff of the Centro de Transfusión de Galicia who assisted in the recruitment of study participants, and the Centro de Supercomputación de Galicia (CESGA) for the use of their computing facilities.
Conflict of Interest
The authors declare that they have no conflict of interest.