*M. F. Gosso, Department of Biological Psychology, Vrije Universiteit, Van der Boechorststraat 1, 1081 BT Amsterdam, the Netherlands. E-mail: email@example.com
The synaptosomal associated protein of 25 kDa (SNAP-25) gene, located on chromosome 20 p12-12p11.2, encodes a presynaptic terminal protein. SNAP-25 is differentially expressed in the brain, and primarily present in the neocortex, hippocampus, anterior thalamic nuclei, substantia nigra and cerebellar granular cells. Recently, a family-based genetic association was reported between variation in intelligence quotient (IQ) phenotypes and two intronic variants on the SNAP-25 gene. The present study is a follow-up association study in two Dutch cohorts of 371 children (mean age 12.4 years) and 391 adults (mean age 36.2 years). It examines the complete genomic region of the SNAP-25 gene to narrow down the location of the causative genetic variant underlying the association. Two new variants in intron 1 (rs363043 and rs363016), close to the two previously reported variants (rs363039 and rs363050), showed association with variation in IQ phenotypes across both cohorts. All four single nucleotide polymorphisms were located in intron 1, within a region of about 13.8 kbp, and are known to affect transcription factor-binding sites. Contrary to what is expected in monogenic traits, subtle changes are postulated to influence the phenotypic outcome of complex (common) traits. As a result, functional polymorphisms in (non)coding regulatory sequences may affect spatial and temporal regulation of gene expression underlying normal cognitive variation.
Cognitive ability is currently considered as a polygenic trait influenced by many genes of moderate to small effect that in turn may interact with each other and with environmental factors (Butcher et al. 2006; Plomin & Spinath 2004; Savitz et al. 2006). Identifying the actual genes underlying normal cognitive variation has proven to be a daunting task, mainly because of this polygenic nature. So far, successful identification of genes underlying genetic variation in human cognitive ability has been mainly limited to mutations for relatively rare neurological disorders with considerably severe cognitive effects in which mental retardation or milder forms of cognitive disability are part of a syndromic phenotype [i.e. fragile X syndrome (Verkerk et al. 1991), Apert syndrome (Ibrahimi et al. 2005), Rett syndrome (Neul & Zoghbi 2004)]. These mutations occur generally in key regulatory proteins within general neuronal signaling pathways.
We recently conducted a family-based association study using an indirect (tagging) approach that involved the SNAP-25 gene and psychometric intelligence scores as a measure of cognitive ability in humans (Gosso et al. 2006a). Psychometric intelligence tests consist of a number of component subtests that taken together are used to infer a general intelligence quotient (IQ) score. Two single nucleotide polymorphisms (SNPs) in the SNAP-25 gene showed a highly significant association with IQ. Both were (non)coding variants. Associations in a (non)coding region of SNAP-25 can arise from variants in intronic and untranslated regions (UTR) that influence gene expression [e.g. variants located on promoter regions, transcription starting sites and 3′ UTR microRNA target sites], which in turn might result in individual variation among IQ phenotypes.
The initial analyses (Gosso et al. 2006a) were based on a tagging approach. Here we perform follow-up analyses to (1) narrow down the location of the causative genetic variant underlying the association in intron 1 and (2) identify extra regions on the SNAP-25 gene not tagged during the previous analyses. Two independent extended cohorts of children (mean age 12.4 years) and adults (mean age 36.2 years) were used in order to identify these putative regulatory genomic variants underlying variation among IQ phenotypes.
Materials and methods
All twins and their siblings were part of two larger cognitive studies and were recruited from the Netherlands Twin Registry (Boomsma 1998; Boomsma et al. 2006). Informed consent was obtained from the participants (adult cohort) or from their parents if they were under 18 years (young cohort). The current study was approved by the institutional review board of the VU University Medical Center. None of the individuals tested suffered from severe physical or mental handicaps, as assessed through standard questionnaire.
The young cohort consisted of 177 twin pairs born between 1990 and 1992, and 55 siblings (Polderman et al. 2006a,b), of whom 371 were available for genotyping. The genotyped twins were 12.4 (SD = 0.9) years of age and the siblings were between 8 and 15 years old at the time of testing. There were 35 monozygotic male (MZM) twin pairs, 28 dizygotic male (DZM) twin pairs, 48 monozygotic female (MZF) twin pairs, 23 dizygotic female (DZF) twin pairs, 26 dizygotic opposite-sex (DOS) twin pairs, 24 male siblings, 24 female siblings and 3 subjects from incomplete twin pairs (1 male and 2 females). Participation in this study included a voluntary agreement to provide buccal swabs for DNA extraction.
This sample is similar to the sample used in our initial analyses, except for 20 individuals that were deleted from analyses in the current sample because of a more stringent threshold of genotyping failure per individual.
A total of 793 family members from 317 extended twin families participated in the adult cognition study (Posthuma et al. 2005). Participation in this study did not automatically include DNA collection; however, part of the sample returned to the lab to provide blood for DNA extraction (276 subjects) or participated in the NTR Biobank project (115 subjects) (Hoekstra et al. 2004). Mean age was 36.25 years (SD = 12.60). There were 25 MZM twin pairs, 15 DZM twin pairs, 1 DZM triplet, 20 MZF twin pairs, 28 DZF twin pairs, 23 DOS twin pairs, 29 female siblings, 28 male siblings and 109 subjects from incomplete twin pairs (41 males and 68 females).
In the young cohort, cognitive ability was assessed with the Dutch adaptation of the Wechsler Intelligence Scale for Children-Revised (Wechsler 1986), and consisted of four verbal subtests (similarities, vocabulary, arithmetic and digit span) and two performance subtests (block design and object assembly).
In the adult cohort, IQ was assessed with the Dutch adaptation of the Wechsler Adult Intelligence Scale III-Revised (Wechsler 1997), which consisted of four verbal subtests (information, similarities, vocabulary and arithmetic) and four performance subtests (picture completion, block design, matrix reasoning and digit–symbol substitution). In both cohorts, verbal IQ (VIQ), performance IQ (PIQ) and full-scale IQ (FSIQ) were normally distributed. Correlations between FSIQ/VIQ, FSIQ/PIQ and PIQ/VIQ were 0.89, 0.81 and 0.45, respectively, in the young cohort, and 0.90, 0.84 and 0.55, respectively, in the adult cohort. Means and SDs of the full and genotyped cohorts are provided in Table 1.
Table 1. Means and standard deviations of PIQ, VIQ and FSIQ in the young and adult cohorts
Longitudinal studies have shown that heritability estimates increase from around 30% in preschool children to 80% in early adolescence and adulthood (Bouchard & McGue 1981; Petrill et al. 2004). Furthermore, the stability of IQ performance during childhood has been shown to be mainly influenced by genetic factors (Bartels et al. 2002; Plomin 1999), whose effects are amplified as children grow older. Heritability estimates for the young and adult cohorts were reported elsewhere (Gosso et al. 2006b). Power for detecting relatively small quantitative trait loci (QTL) effects of 1% and 3%, assuming relatively high linkage disequilibrium (LD) between the genotyped marker and the causal variant, is about 0.76 and 0.98, respectively.
DNA collection and genotyping
Buccal swabs were obtained from 371 children; DNA in adults was collected from blood samples (276 subjects) and buccal swabs (115 subjects). DNA isolation from buccal swabs was performed using a chloroform/isopropanol extraction (Meulenbelt et al. 1995). DNA was extracted from blood samples using the salting out protocol (Miller et al. 1988). Zygosity was assessed using 11 polymorphic microsatellite markers (heterozygosity > 0.80). Single nucleotide polymorphisms were selected based on their minor allele frequency (MAF) as obtained from a randomly selected population with northern and western European ancestry by the Centre d’Etude du polymorphisme Humain (CEPH) (http://www.hapmap.org/thehapmap.htlm.en). MAF had to be >0.10 in order to have enough power to detect common variants and to be able to observe all three possible genotypes of a bi-allelic marker. Forty-nine SNPs were selected using Haploview v.3.32 (http://www.broad.mit.edu/mpg/haploview) (NCBI Build 36.1) to be genotyped in both cohorts. Genotyping was performed blind to familial status and phenotypic data. Both MZ twins of a pair were included in genotyping and served as additional controls.
The SNPlex assay was conducted following the manufacturer’s recommendations (Applied Biosystems, Foster City, CA, USA). All pre-polymerase chain reaction (PCR) steps were performed on a cooled block. Reactions were carried out in Gene Amp 9700 Thermocycler (Applied Biosystems). PCR products were analyzed with ABI3730 Sequencer (Applied Biosystems). Data were analyzed using genemapper v3.7 (Applied Biosystems).
Transcription factors (TFs) are proteins that recognize specific DNA sequences, the so-called transcription factor-binding sites (TFBS); their interaction is fundamental for the regulation of gene expression [for review see Garvie & Wolberger (2001)]. Physical TFBS are constitutive DNA sequences found every 10–15 bp throughout the genome, and weight matrices are usually used to predict them (Cartharius et al. 2005). In order to identify gain or loss of physical TFBS, SNP variation was analyzed using MatInspector v. 7.4.8 and SNPInspector v. 6.3 [both programs are available via the Genomatix browser (http://www.genomatix.de.html)]. While the former aids in the identification of physical TFBS, the latter identifies physical TFBS affected by SNPs (SNPs with putative regulatory activity). Each TF is assigned a random expectation value (the expected number of TFBS matches per 1000 bp of random DNA sequence), as well as the percentage of vertebrate promoters containing the TFBS.
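The weight-matrix idea mentioned above can be illustrated with a minimal sketch. It is not the MatInspector or SNPInspector algorithm: the count matrix, the sequences and all function names below are hypothetical and chosen only to show how a single-base change can lower (or raise) the match score of a predicted binding site.

```python
import math

BASES = "ACGT"

# Hypothetical count matrix for a 4-bp motif (rows: motif positions,
# columns: A, C, G, T). The numbers are invented for illustration only.
COUNTS = [
    [8, 1, 0, 1],
    [0, 9, 1, 0],
    [1, 0, 9, 0],
    [0, 0, 1, 9],
]


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert base counts into log2 odds scores against a uniform background."""
    matrix = []
    for row in counts:
        total = sum(row) + 4 * pseudocount
        matrix.append(
            [math.log2(((c + pseudocount) / total) / background) for c in row]
        )
    return matrix


def site_score(site, matrix):
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(matrix[i][BASES.index(base)] for i, base in enumerate(site))


def best_window_score(sequence, matrix):
    """Highest site score over all windows of motif width in the sequence."""
    width = len(matrix)
    return max(
        site_score(sequence[i:i + width], matrix)
        for i in range(len(sequence) - width + 1)
    )


def snp_effect(ref_sequence, position, alt_base, matrix):
    """Best match score for the reference and the alternative allele."""
    alt_sequence = ref_sequence[:position] + alt_base + ref_sequence[position + 1:]
    return best_window_score(ref_sequence, matrix), best_window_score(alt_sequence, matrix)


PWM = log_odds_matrix(COUNTS)
# Hypothetical sequence: the motif "ACGT" sits at index 2; the SNP changes C to T.
ref_score, alt_score = snp_effect("TTACGTTT", 3, "T", PWM)
```

In this toy case the alternative allele disrupts the best-matching window, so `alt_score` is lower than `ref_score`, which is the kind of predicted site loss that such tools report. Real tools additionally apply matrix-specific thresholds and scan both strands, which this sketch omits.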
Allele frequencies of selected SNPs were estimated in both young and adult cohorts using Haploview v3.32 (http://www.broad.mit.edu/mpg/haploview/). Hardy–Weinberg equilibrium (HWE) P values were estimated for each variant, which is the probability that its deviation from HWE could be explained by chance. Only one member of a twin pair was selected for HWE calculations. Linkage disequilibrium parameter (r2) was calculated from the haplotype frequencies estimates using Haploview 3.32 (http://www.broad.mit.edu/mpg/haploview). Haplotypes were estimated using SNPs that showed a significant association with IQ in both samples, using the expectation–maximization (EM) algorithm to obtain the maximum likelihood estimates of haplotype frequencies in each sample (Excoffier & Slatkin 1995), as implemented in the Allegro software package v. 2 (Gudbjartsson et al. 2005). The EM algorithm allows for missing data and can be applied when no parental genotypes are available.
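As an illustration of the HWE check described above, the following is a minimal sketch of a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. It is an assumption for illustration only: Haploview may use a different (for example exact) test, and the function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions.

    Takes the observed counts of the three genotypes of a bi-allelic
    marker and returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts that match HWE proportions exactly (p = 0.5).
chi2_fit, p_fit = hwe_chi_square(25, 50, 25)
# Hypothetical counts with no heterozygotes at all: a strong deviation.
chi2_dev, p_dev = hwe_chi_square(50, 0, 50)
```

Consistent with the text, only one member of each twin pair would be entered into such counts, because the test assumes independent individuals.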
Genetic association tests were conducted using the program qtdt, which implements the orthogonal association model proposed by Abecasis et al. (2000) (see also Fulker et al. 1999; extended by Posthuma et al. 2004). This model allows the decomposition of the genotypic association effect into orthogonal between-family (βb) and within-family (βw) components, can incorporate fixed effects of covariates and can also model the residual sib correlation as a function of polygenic or environmental factors. MZ twins can be included and are modeled as such by adding zygosity status to the data file. They are not informative for the within-family association component (unless they are paired with non-twin siblings), but are informative for the between-family component. The between-family association component is sensitive to population admixture, whereas the within-family component is significant only in the presence of LD because of close linkage. Testing for the equality of the βb and βw effects serves as a test of population stratification. If population stratification acts to create a false association, the test for association using the within-family component is still valid and provides a conservative test of association. If the stratification test is not significant, the between- and within-family effects can be considered equal and the total association test, which uses the whole population at once, can be applied. It should be noted, however, that given the relatively modest sample size, both the within-family test and the population stratification test are not as powerful as the ‘total’ association test.
When evaluating potentially interesting results from a number of statistical tests, it is necessary to determine how often a ‘significant’ P value would arise by chance if the study were repeated under the hypothesis of no genetic association. Bonferroni correction has proved to be too conservative, especially when non-independent phenotypes (e.g. IQ phenotypes) are used in the context of association studies, and as we tested multiple SNPs, a significance level of 0.01 was kept.
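The between- and within-family decomposition used by the orthogonal association model can be sketched as follows for sibships without parental genotypes, where the between-family component is taken as the sibship mean of the additive genotype score. This shows the genotype coding only; qtdt estimates βb and βw by maximum likelihood within a variance components framework, which is not reproduced here. The function names and the example genotypes are hypothetical.

```python
def additive_score(genotype, increaser_allele):
    """Additive coding: number of increaser alleles minus 1, giving -1, 0 or 1."""
    return genotype.count(increaser_allele) - 1


def orthogonal_components(sib_scores):
    """Split sibship genotype scores into between and within components.

    The between component is the sibship mean; each within component is the
    individual's deviation from that mean, so the within components sum to 0
    in every family.
    """
    between = sum(sib_scores) / len(sib_scores)
    within = [score - between for score in sib_scores]
    return between, within


# Hypothetical DZ pair with genotypes T/T and C/T, increaser allele T.
dz_scores = [additive_score("TT", "T"), additive_score("CT", "T")]
dz_between, dz_within = orthogonal_components(dz_scores)

# Hypothetical MZ pair: identical genotypes, so no within-family contrast.
mz_scores = [additive_score("CT", "T"), additive_score("CT", "T")]
mz_between, mz_within = orthogonal_components(mz_scores)
```

The MZ example shows why such pairs contribute to the between-family component only: both within-family deviations are zero unless a non-twin sibling with a different genotype is added to the sibship.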
Single SNP analysis
Genotyping of 7 of the 49 SNPs failed in both cohorts (rs1889188, rs2423487, rs363040, rs362548, rs11547873, rs11547859 and rs3025896). The LD structure for the remaining 42 SNPs is given in Table 2 for the young and adult cohorts separately. Three of the remaining 42 SNPs were not in HWE in either cohort. Further analyses focus on the 39 variants that were in HWE. SNP positions as well as LD values for the combined cohort are given in Fig. 1.
Table 2. SNP descriptives for young and adult cohorts
SNPs were selected if allele frequency was > 10% (18.0% heterozygosity) and the pairwise SNP correlation (ρ) was < 0.85. SNPs already reported in our previous study (Gosso et al. 2006a) are given in bold.
MA/MAF, minor allele/frequency.
Chromosomal location in base pairs based on Build 36.1.
qtdt modeled additive allelic between- and within-family effects. Residual sib correlations were modeled as a function of polygenic additive effects and non-shared environmental effects. Tests for the presence of population stratification were all nonsignificant indicating that genotypic effects within families were not significantly different from those observed between families, suggesting that the more powerful total association test can be interpreted.
Four SNPs (rs363039, rs363043, rs363016 and rs363050) located in intron 1 showed significant association with IQ phenotypes. Significance was strongest in the young cohort (P < 0.01) and showed a trend in the same direction in the adult cohort (P < 0.10) (see Tables 3 and 4). Analyses of the combined sample resulted in highly significant associations for these four SNPs. Two of these SNPs (rs363039 and rs363050) were previously associated with IQ variation in these same young and adult cohorts (Gosso et al. 2006a), whereas the other two (rs363043 and rs363016) were newly added in the current follow-up analyses and were also found to be significant. All four SNPs showed association in the same direction and of the same order of magnitude. As can be observed from Table 3, rs363050 and rs363016 are in nearly complete LD (r2 = 0.98), and as expected, the similarity of their association results reflects this high LD. The strongest association within this intron 1 region was observed between rs363016 and FSIQ (χ2 = 15.99, P = 0.0001). The increaser allele of this SNP was associated with an increase of 3.28 IQ points (see Table 4). Subsequently, haplotype analysis was conducted with only three of the four significantly associated SNPs (rs363039, rs363043 and rs363016) (see Haplotype Analysis).
Table 3. Mean (SD) per genotype for PIQ, VIQ and FSIQ for young and adult cohorts for the four tag-SNPs within the SNAP-25 gene that show a significant association in intron 1
Table 4. Family-based association analysis for SNAP-25 tag-SNPs for young, adult and combined cohorts
The genotypic effect is the increase in IQ points associated with the increaser allele. P < 0.01 are in bold. Residual variance was modeled as a function of polygenic effects and non-shared environmental effects.
A few other significant P values (i.e. ≤0.01) were observed that may suggest a second, new association peak located within a region of 4.9 kbp in the 3′UTR of SNAP-25. This involved two untranslated variants located only 658 bp apart (rs362620 and rs362557) in the young cohort. The A allele of rs362620 was associated with an increase of 1.29 VIQ points (χ2 = 6.56, P = 0.01), whereas the C allele of rs362557 was associated with an increase of 2.84 VIQ points (χ2 = 7.02, P = 0.008). LD between these two SNPs was low (r2 = 0.18). In addition, within the same region, an association was observed in the adult cohort with a neighboring variant (rs6104580) located 4.32 kbp away from the variants associated in the young cohort. In the adult cohort, the T allele of rs6104580 was associated with an increase of 3.26 PIQ points (χ2 = 8.36, P = 0.004). LD between rs6104580 and the two SNPs associated in the young cohort (rs362620 and rs362557) was relatively low (r2 between 0.04 and 0.34). Variants rs363039, rs363043 and rs363016 were considered for subsequent haplotype analysis (see Haplotype Analysis).
Based on LD patterns among the four (non)coding SNPs significantly associated with IQ phenotypes, only three were selected to conduct further haplotype analysis. The selected SNPs encompassed a genomic region of about 10.7 kbp (rs363039, and two variants within LD block 4, rs363043 and rs363016). LD (r2) among these variants ranged between 0.10 and 0.54. These SNPs were used to estimate haplotypes within each sample. Haplotype analysis of SNPs with a relatively low LD is more powerful than single SNP analysis because the combination of SNPs into a haplotype can be considered as a multiallelic marker that is more informative than a biallelic marker when the causal variant(s) are not genotyped.
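To make the haplotype estimation step concrete, the following is a minimal sketch of the expectation–maximization (EM) idea for two bi-allelic SNPs in unrelated individuals, followed by r2 computed from the estimated haplotype frequencies. It is a simplified illustration, not the Allegro implementation used here: it ignores family structure and missing genotypes, handles two SNPs rather than three, and all names and genotype counts are hypothetical.

```python
def em_haplotype_frequencies(genotypes, max_iter=200, tol=1e-10):
    """Estimate two-SNP haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2), the number of copies (0, 1 or 2) of
    allele "1" at SNP 1 and SNP 2. Returns frequencies in the order
    00, 01, 10, 11 (allele at SNP 1, allele at SNP 2).
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    freqs = [0.25, 0.25, 0.25, 0.25]
    total = 2.0 * len(genotypes)
    for _ in range(max_iter):
        counts = [0.0, 0.0, 0.0, 0.0]
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 00/11 or 01/10.
                cis = freqs[0] * freqs[3]
                trans = freqs[1] * freqs[2]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts[0] += w
                counts[3] += w
                counts[1] += 1.0 - w
                counts[2] += 1.0 - w
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                first = 2 * (g1 >= 1) + (g2 >= 1)
                second = 2 * (g1 == 2) + (g2 == 2)
                counts[first] += 1.0
                counts[second] += 1.0
        # M-step: new frequencies are the expected haplotype proportions.
        new_freqs = [c / total for c in counts]
        converged = max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


def r_squared(freqs):
    """Pairwise LD (r2) from haplotype frequencies ordered 00, 01, 10, 11."""
    p1 = freqs[2] + freqs[3]  # frequency of allele "1" at SNP 1
    p2 = freqs[1] + freqs[3]  # frequency of allele "1" at SNP 2
    denom = p1 * (1.0 - p1) * p2 * (1.0 - p2)
    if denom == 0.0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    d = freqs[3] - p1 * p2
    return d * d / denom


# Hypothetical sample consistent with only haplotypes 00 and 11 being present.
sample = [(0, 0)] * 10 + [(2, 2)] * 10 + [(1, 1)] * 20
hap_freqs = em_haplotype_frequencies(sample)
ld_r2 = r_squared(hap_freqs)
```

For this toy sample the estimates converge to frequencies of about 0.5 for haplotypes 00 and 11, and r2 approaches 1, which corresponds to complete LD between the two markers.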
Five of the eight possible haplotypes were observed in our samples (A-C-C, A-T-T, G-C-C, G-C-T and G-T-T). Haplotypes A-T-C and G-T-C were not observed and A-C-T was only observed in the adult cohort at a very low frequency (see Table 5). Significant associations were found in both samples; however, it is worth noting that different allelic combinations were associated across cohorts. Within the young cohort, G-T-T was the most strongly associated haplotype [PIQ χ2(1) = 9.36, P = 0.002], whereas G-C-T showed the strongest association in the adult cohort [FSIQ χ2(1) = 10.08, P = 0.001]. When the data were combined, highly significant associations were observed among all IQ phenotypes for both the G-T-T [PIQ χ2(1) = 8.27, P = 0.004] and G-C-T [VIQ χ2(1) = 8.61, P = 0.003] haplotypes (see Table 5), confirming the single SNP association results.
Table 5. Family-based association analysis for tagging haplotypes (rs363039, rs363043 and rs363016) in intron 1 of the SNAP-25 gene for young, adult and combined cohorts
Young cohort (n = 328)
Adult cohort (n = 325)
Combined cohort (n = 653)
Table 6. List of putative TFBS modified by common SNPs within intron 1 on the SNAP-25 gene
Haplotype analysis was also conducted for the rs362620 and rs362557 haplotype in the second region. Significant association was found between the A-C haplotype and VIQ [χ2(1) = 6.09, P = 0.01] with an increase of 2.65 IQ points, corroborating the single SNP analysis in the young cohort. This association should be interpreted with more than the usual caution because the single SNP analyses had not shown replication between cohorts or within cohort across different IQ phenotypes.
A search with MatInspector and SNPInspector showed that all three SNPs affected TFBS (see Table 6). Although there is no hard evidence that these TFBS are functional, this at least allows for the possibility that the (non)coding variants identified could affect the regulation of gene expression.
To continue our investigation of the possible role of the SNAP-25 gene in intelligence, we employed a family-based genetic association test in two independent cohorts of 371 children (mean age 12.42 years) and 391 adults (mean age 36.25 years). The selected SNPs gave a dense coverage of the first intron of SNAP-25, which was previously reported to be associated with intelligence (Gosso et al. 2006a). Single SNP and haplotype analyses were conducted in the present study in order to (1) narrow down the location of the causative genetic variant underlying the association in intron 1 and (2) identify extra regions on the SNAP-25 gene not tagged during the previous analyses. Four SNPs (rs363039, rs363043, rs363016 and rs363050) located in intron 1 showed significant association with IQ phenotypes. Haplotype analysis confirmed the single SNP association results. Combined data across age cohorts showed highly significant associations among IQ phenotypes for both the G-T-T [PIQ χ2(1) = 8.27, P = 0.004] and G-C-T [VIQ χ2(1) = 8.61, P = 0.003] haplotypes. Interestingly, two different haplotypes were found to be associated with IQ phenotypes in the young and adult cohorts. Within the young cohort, G-T-T was the most strongly associated haplotype [PIQ χ2(1) = 9.36, P = 0.002], whereas G-C-T showed the strongest association in the adult cohort [FSIQ χ2(1) = 10.08, P = 0.001]. Variance in these haplotypes accounts for 1% and 3% of the phenotypic variance in PIQ and FSIQ, respectively.
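For readers who want to relate an additive allelic effect in IQ points to a proportion of phenotypic variance, a standard approximation for a bi-allelic locus in Hardy–Weinberg equilibrium is 2p(1 − p)a²/σ², where p is the allele frequency, a the effect per allele and σ the phenotypic standard deviation. The sketch below applies this formula to purely hypothetical input values; it does not reproduce the variance estimates reported above, which come from the fitted family-based model for haplotypes.

```python
def additive_variance_explained(allele_freq, effect_per_allele, phenotypic_sd):
    """Proportion of phenotypic variance due to an additive bi-allelic locus.

    Assumes Hardy-Weinberg equilibrium and a purely additive effect, so the
    locus variance is 2 * p * (1 - p) * a**2.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    if phenotypic_sd <= 0.0:
        raise ValueError("phenotypic SD must be positive")
    locus_variance = 2.0 * allele_freq * (1.0 - allele_freq) * effect_per_allele ** 2
    return locus_variance / phenotypic_sd ** 2


# Hypothetical inputs: allele frequency 0.5, 3 IQ points per allele, SD of 15.
share = additive_variance_explained(0.5, 3.0, 15.0)
```

With these illustrative inputs the locus variance is 4.5 and the phenotypic variance is 225, so the share is 0.02, that is, 2% of the phenotypic variance.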
Such differential genotypic effects might be explained within a heterogeneous genomic context. Although physical TFBS are a constitutive portion of the genome, the cellular and genomic context will determine whether a given TF-binding sequence becomes functional or not. Transcription factors are differentially expressed in response to developmental requirements and, even more importantly, as with QTL, a single TF will not be sufficient to trigger a regulatory response; rather, TFs will interact in a collaborative manner. For example, low levels of the TF p53 are constitutively expressed in the developing nervous system (embryonic and neonatal) under normal growth conditions, and this expression is downregulated in adults (Komarova et al. 1997). Nevertheless, the role of p53 in differentiation rather than apoptosis during sensory neuronal development still has to be determined.
IQ can most likely be considered a truly polygenic complex trait, and as such, no single common allelic variant might be responsible for IQ phenotype variation; rather, similar genetic effects might be exerted by diverse allelic variants. Alternatively, our results could indicate that the causal variant is older than the SNPs that have been tested and is in fact present on both haplotypes. It is worth noting that rare alleles may still contribute importantly to variation in cognitive ability, although their small effects can only be identified within a multicenter collaborative study framework, mainly because of the relatively large number of samples required to achieve power to detect their genetic effects.
The SNAP-25 gene, located on chromosome 20 p12-12p11.2, encodes a presynaptic terminal protein. SNAP-25 is thought to be differentially expressed in the brain and is primarily present in the neocortex, hippocampus, anterior thalamic nuclei, substantia nigra and cerebellar granular cells. In the mature brain, expression is mainly seen at presynaptic terminals (Oyler et al. 1989). Two splice variants of SNAP-25 exist, the SNAP-25a and SNAP-25b isoforms (Bark & Wilson 1991). During development, the SNAP-25a isoform is involved in synaptogenesis, forming presynaptic sites and neuritic outgrowth (Osen-Sand et al. 1993; Oyler et al. 1989), whereas in the mature brain, the SNAP-25b isoform forms a complex with syntaxin and the synaptic vesicle proteins (synaptobrevin and synaptotagmin) that mediates exocytosis of neurotransmitter from the synaptic vesicle into the synaptic cleft (see Bark & Wilson 1994; Horikawa et al. 1993; Low et al. 1999; Seagar & Takahashi 1998).
SNAP-25 isoforms (SNAP-25a and SNAP-25b) are fundamental for keeping a balanced trade-off between synaptic formation and neurotransmitter vesicle release; however, evolutionary (comparative genomics) analysis of the coding sequence showed no selection in favor of any of the gene-coding variants on SNAP-25. If variation of coding variants may not per se be associated to phenotypic variation, then, the next possible scenario might be the presence of regulatory effects exerted by variants on (non)coding regions. Regulatory (non)coding variants may interact in a concerted manner rather than in isolation, with the capacity to regulate gene expression. Genetic (non)coding variants present within intron 1 might be involved in regulation of protein isoforms expression. All associated SNPs were involved in TFBS changes (gain/loss of TFBS). Furthermore, because functional TFBS are predicted to interact in a co-operative manner rather than in isolation, a global overview might be required to (1) identify known and unknown TFBS and (2) putative functional (non)coding polymorphisms that may affect spatial and temporal regulation of gene expression.
Contrary to what is expected in Mendelian traits, subtle changes are postulated to influence the phenotypic outcome of complex (common) traits. Further functional studies may aid in identification of functional polymorphisms that may affect functional TFBS, which in turn may be used to uncover genetic regulatory interactions underlying normal cognitive variation.
This study was supported by the Universitair Stimulerings Fonds (grant number 96/22), the Human Frontiers of Science Program (grant number rg0154/1998-B), the Netherlands Organization for Scientific Research (NWO) grants 904-57-94 and NWO/SPI 56-464-14192. This study was also supported by the Centre for Medical Systems Biology (CMSB), a centre of excellence approved by the Netherlands Genomics Initiative/Netherlands Organization for Scientific Research (NWO). We thank Saskia van Mil and David Sondervan from the Medical Genomics Laboratory for technical support. We would also like to thank the families from the Netherlands Twin Registry (NTR) who participated in this study.