Neuroligins are a family of synaptic type I transmembrane proteins that consist of a large extracellular portion similar to cholinesterases, a glycosylated linker sequence, a transmembrane domain, and a short C-terminal tail containing a type I PDZ-binding motif (Ichtchenko et al., 1995). Neuroligins belong to a family of molecules containing the cholinesterase-like domain, called cholinesterase-like adhesion molecules (CLAMs), which include glutactin, neurotactin, and gliotactin (Gilbert and Auld, 2005). However, unlike cholinesterases, Neuroligins lack one of the residues of the catalytic triad located within the extracellular esterase-like domain, which renders them enzymatically inactive. Thus, instead of mediating enzyme/substrate interplay, the Neuroligins' cholinesterase-like domain is involved in protein–protein interaction.
Neuroligin proteins have been identified in different species from invertebrates to human (Ichtchenko et al., 1995; Bolliger et al., 2001; Gilbert et al., 2001; Kwon et al., 2004; Biswas et al., 2008). The mouse genome encodes four Neuroligin family members while five different genes (NLGN1, NLGN2, NLGN3, NLGN4X, and NLGN4Y) are present in the human genome (Ichtchenko et al., 1996; Bolliger et al., 2001, 2008; Paraoanu et al., 2006). The extracellular portion of mammalian Neuroligins contains two conserved regions subjected to alternative splicing events, indicated as Site A and B (Ichtchenko et al., 1995, 1996; Bolliger et al., 2001; Paraoanu et al., 2006). Usually, Neuroligin 1 and Neuroligin 3 can present three different isoforms at site A: without inserts, with insert A1/A2, or with inserts A1+A2. Neuroligin 2 and Neuroligins 4 can only present the insert A2. Finally, only Neuroligin 1 can be alternatively spliced at Site B.
Neuroligins are localized at the postsynaptic side of both excitatory and inhibitory synapses of the central nervous system (CNS). They bind to presynaptic alpha and beta forms of Neurexins (Ichtchenko et al., 1995, 1996; Missler et al., 1998; Boucard et al., 2005) and the alternative splicing of both protein families controls their binding affinities (Boucard et al., 2005; Chih et al., 2006; Comoletti et al., 2006; Fabrichny et al., 2007; Koehnke et al., 2008a, b; Shen et al., 2008). The current data suggest that Neurexin–Neuroligin binding is governed by a complex code that is based on the type of isoforms and splice variants involved, calcium binding, and glycosylation (Sudhof, 2008).
Neuroligin 1 expression is mainly restricted to the CNS, while Neuroligins 2–4 present a broader expression pattern (Song et al., 1999; Philibert et al., 2000; Bolliger et al., 2001; Gilbert et al., 2001; Kang et al., 2004; Suckow et al., 2008). Some lines of evidence suggest that the expression of Neuroligins may differ between species. For example, rat Neuroligin 3 is mainly detected in brain, whereas human Neuroligin 3 and 4 are also expressed in different peripheral tissues (Ichtchenko et al., 1996; Nemeth et al., 1999; Philibert et al., 2000; Bolliger et al., 2001).
Numerous studies have investigated the synaptic roles of Neuroligins and Neurexins. Although several studies suggested their involvement in synapse formation (Scheiffele et al., 2000; Dean et al., 2003; Graf et al., 2004; Prange et al., 2004; Chubykin et al., 2005; Levinson et al., 2005), other studies and the analysis of knockout mice indicated that Neuroligins, along with the alpha-forms of Neurexins, are involved in synaptic function and maturation (Missler et al., 2003; Sara et al., 2005; Varoqueaux et al., 2006; Chubykin et al., 2007). Triple knockout mice lacking Nlgn1–3 die shortly after birth but present normal synapse numbers with an apparently normal ultrastructure (Varoqueaux et al., 2006). Moreover, the importance of Neuroligins in proper brain function is documented by the fact that mutations in the human Neuroligin 3 and Neuroligin 4 genes appear to be involved in mental retardation and autism (Jamain et al., 2003; Chih et al., 2004; Comoletti et al., 2004; Laumonnier et al., 2004; Talebizadeh et al., 2006).
In this study, we describe the isolation and characterization of Neuroligin genes in Danio rerio. The zebrafish Neuroligin gene family includes seven genes very similar to their human homologs, suggesting that, during evolution, they have been subjected to strong evolutionary pressure in order to preserve their function. Through a phylogenomic analysis of the intron/exon structure, we reconstructed the evolution of Neuroligin genes in vertebrates. Our data highlighted the presence of eleven intron gains and a single intron loss event, mostly involving major branching points in the vertebrate tree such as the origin of mammals and teleosts. Moreover, our data confirm previous independent studies on vertebrate genomes indicating that intron gain is more prevalent than intron loss. Through reverse-transcriptase polymerase chain reaction (RT-PCR)-based methods, we analyzed the alternative splicing pattern during embryo development and in adult organs, finding that: (1) in adult fish, Neuroligins are expressed in many different organs, (2) in many cases different organs present specific alternative splicing patterns, and (3) subfunctionalization events occurred, differentiating the expression pattern and the alternative splicing regulation of paralogous genes. Finally, gene expression analyses by whole mount in situ hybridization showed that Neuroligins are widely expressed in the brain of developing embryos. In particular, almost all genes share a similar expression pattern in the midbrain and the hindbrain. Nevertheless, we also found evidence of a differential expression between paralogous genes.
RESULTS AND DISCUSSION
Molecular Cloning and Characterization of the Zebrafish Neuroligins
The sequences of the human Neuroligins were assembled using ENSEMBL and VEGA and then used as queries in BLAST searches at the Zebrafish Genome Browser (www.ensembl.org/Danio_Rerio) and in public EST databases available at the NCBI (www.ncbi.nlm.nih.gov/BLAST). We found seven different genes homologous to human Neuroligins in different Linkage Groups (LGs) of the zebrafish genome. Using in silico cloning and rapid amplification of cDNA ends (RACE) techniques, we assembled the complete coding sequences (CDS) of zebrafish Neuroligins. Finally, we confirmed each CDS by RT-PCR and sequencing.
According to Zebrafish Nomenclature Guidelines (www.zfin.org), we designated these new genes as: nlgn1 (partially located on LG11 and LG2 in Ensembl Zv8 assembly), nlgn2a (LG7), nlgn2b (LG10), nlgn3a (LG5), nlgn3b (LG14), nlgn4a (LG1), and nlgn4b (LG9). In zebrafish, Neuroligin 2–4 genes are duplicated; only the Neuroligin 1 gene seems to be present in a single copy and, notwithstanding extensive database searches, we did not find any traces of a duplicate form of this gene. The lack of a Neuroligin 1 paralogous gene in other teleosts such as G. aculeatus, O. latipes, T. nigroviridis, and T. rubripes (see below) strongly suggests that, after the whole genome duplication (Postlethwait, 2007), one copy quickly disappeared.
The multialignment of zebrafish and human Neuroligin amino acid sequences (Fig. 1) shows that they present common features. Similarly to human Neuroligins, zebrafish proteins are composed of an N-terminal signal peptide of variable length, an esterase-like domain encompassing almost the entire extracellular portion, a short linker just upstream of the transmembrane domain, and a cytosolic region. As observed in mammalian Neuroligins, the highest degree of sequence conservation can be found in the esterase-like domain, in the transmembrane domain, and in the PDZ binding domain, which mediates their interaction with the scaffold protein PSD-95 (Hata et al., 1996; Barrow et al., 2009). Notably, the linker between the esterase-like domain and the transmembrane region differs among all the vertebrate Neuroligins. Like mammalian Neuroligins, each zebrafish protein presents a substitution (Gly for Ser) of one residue of the catalytic triad of esterases and it is, therefore, catalytically inactive. Moreover, the positions of all cysteine residues involved, through disulphide bridges, in the correct folding of these proteins are completely maintained in zebrafish proteins (asterisks in Fig. 1).
Table 1 shows the percentage of amino acid identity and conservation between mammalian and zebrafish homologs gathered from the multialignment in Figure 1. The identity and conservation values are higher than 70 and 80%, respectively, further supporting the identity of the cloned genes. As we previously observed when comparing zebrafish and human Neurexin proteins (Rissone et al., 2007), zebrafish Neuroligins are also very similar to their human homologs, notwithstanding the evolutionary distance from the common ancestor (Aparicio et al., 2002). This high degree of sequence similarity can be explained by strong selective pressure acting on both Neurexins and Neuroligins, which strongly suggests a conservation of their functions across vertebrate evolution. Remarkably, both Neuroligins 4 (a and b) are more similar to human Neuroligin 4X than to Neuroligin 4Y. The lack of a zebrafish homolog of human NLGN4Y is consistent with the assumption that Neuroligin 4Y is a primate-specific gene that originated from a recent duplication (Sudhof, 2008). It is remarkable that, in spite of their global similarity, the different Neuroligins (from Danio rerio to Homo sapiens) are on average only 60–70% identical (see Supporting Information Table S1, which is available online). This suggests an appreciable evolutionary distance among the different genes (Nlgn1, Nlgn2, Nlgn3, and Nlgn4s) of the same family and a probable functional diversification.
Table 1. Percentage of Identity and Conservation Between Homo sapiens and Danio rerio Homologs
Numbers in bold indicate identity (and conservation) percentages among each couple of homologs.
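As an illustration of how identity and conservation percentages of this kind can be derived from a pairwise alignment, the following minimal sketch counts identical and similar residues over the aligned (non-gap) columns. The function name, the similarity groups, and the choice of denominator are assumptions made for the example; they are not the scoring scheme used to build Table 1.

```python
# Hypothetical groups of amino acids treated as conservative substitutions.
# This grouping is an assumption for illustration only.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_and_conservation(aligned_a, aligned_b):
    """Return (% identity, % conservation) for two aligned sequences.

    Both strings must have the same length and use '-' for gaps.
    Columns containing a gap in either sequence are ignored (an assumption;
    other denominators are also in common use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = identical = conserved = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            conserved += 1  # identical residues also count as conserved
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            conserved += 1

    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * conserved / compared


# Toy alignment (not real Neuroligin sequence).
print(identity_and_conservation("MKV-LSD", "MRVALTE"))
```

Because the denominator and the similarity groups both affect the result, values computed this way are comparable only when the same conventions are applied to every pair.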
To further support the data obtained by the multialignment, we performed phylogenetic (Fig. 2) and syntenic analyses (Supporting Information Fig. S1). The full-length nucleotide and amino acid sequences of zebrafish Neuroligins were used as queries to find homologous genes in other vertebrate and invertebrate species by performing BlastN and TBlastN searches in different genomic databases. In Supporting Information Table S7, we list all the accession numbers of the sequences used for phylogenetic tree construction. In Figure 2, we present a rooted neighbour-joining (NJ) tree obtained by aligning multiple protein sequences of members of the Neuroligin protein family from different species. The topology of the tree is coherent with the known relationship between the different taxa and is supported by robust bootstrap values in almost all the nodes. All vertebrate Neuroligins are divided in four groups that correspond to the four different types of Neuroligins. The putative invertebrate orthologs, used as outgroup, represent the root of the tree. Subsequently, we used the MultiContigView tool of ENSEMBL Genome Browser (Hubbard et al., 2005) in order to analyse the neighbouring genomic regions of each zebrafish Neuroligin with the mapped human or mouse genome (Supporting Information Fig. S1). We found conserved genes in genomic flanking regions around each zebrafish homolog. Notably, in the case of Neuroligins 4a and 4b, we observed syntenic genes only in flanking regions of human Neuroligin 4X but not in Neuroligin 4Y, confirming that NLGN4Y is specific to primates.
Taken together, phylogenetic and syntenic analyses confirm that all the zebrafish Neuroligin genes derived from a whole genome duplication event that occurred at the base of the teleost radiation approximately 350 Mya (Postlethwait, 2007). Both paralogous genes survived during evolution with the exception of one duplicate of Neuroligin 1, which was lost before the first speciation events within the teleosts.
Gene Structure of the Zebrafish Neuroligins
We determined the exon–intron structure of zebrafish Neuroligin genes by mapping each CDS to the genomic sequence available in the ENSEMBL genomic database and then we compared them to the human genes. Supporting Information Tables S2–S5 present the results of this analysis. Globally, all the splicing borders present canonical GT and AG donor and acceptor sequences and, with specific exceptions discussed below, all the exons display the same protein reading frame, are identical or very similar in size, and encode the same protein region.
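The check for canonical splice borders can be sketched as follows, assuming the CDS has already been mapped to the genomic sequence and that exon coordinates are available on the forward strand. The function names and the coordinate convention (0-based, half-open) are assumptions for this example and do not describe the tooling actually used.

```python
def intron_borders(genomic, exons):
    """Return the (donor, acceptor) dinucleotides of each intron.

    genomic: genomic sequence on the coding strand.
    exons:   list of (start, end) pairs, 0-based and half-open,
             sorted in transcript order (hypothetical convention).
    """
    borders = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        intron = genomic[prev_end:next_start].upper()
        borders.append((intron[:2], intron[-2:]))
    return borders


def all_canonical(genomic, exons):
    """True if every intron starts with GT and ends with AG."""
    return all(donor == "GT" and acceptor == "AG"
               for donor, acceptor in intron_borders(genomic, exons))


# Toy sequence: two exons separated by one 12-base intron.
toy = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
print(intron_borders(toy, [(0, 6), (18, 24)]))
print(all_canonical(toy, [(0, 6), (18, 24)]))
```

For genes annotated on the reverse strand, the sequence would need to be reverse-complemented before applying the same test.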
Among all the zebrafish Neuroligins, nlgn1, nlgn3b, and nlgn4a are the most similar to human genes. Indeed, all the exon sizes are identical to the human counterpart, with the exception of the first and the last exons, which present different sizes in all the zebrafish Neuroligins. In particular, nlgn1 presents a high level of sequence conservation in exonic borders and intronic flanking regions of exons 2, 3, and 5, which are alternatively spliced at sites A1, A2, and B, respectively (Supporting Information Tables S2, S6). The exonic and intronic flanking regions of alternatively spliced exons can present numerous cis-regulatory elements that serve as either splicing enhancers or silencers (Ladd and Cooper, 2002; Black, 2003; Chen and Manley, 2009). As shown in Supporting Information Table S6, intronic flanking regions of alternative splicing sites of Neuroligin 1 are slightly more conserved with respect to the other zebrafish Neuroligins, suggesting a possible conservation of regulatory elements involved in alternative splicing regulation.
Overall, these data suggest the existence of a strong selective pressure to preserve the original gene function due to the early loss of one copy after gene duplication.
Classical models indicate that the most common fate of a gene duplication event is nonfunctionalization. Because of functional redundancy, most new gene copies tend to accumulate deleterious mutations and degenerate into pseudogenes before being, eventually, lost (Ohno, 1970, 1973; Holland et al., 1994; Sidow, 1996; Petrov and Hartl, 2000; Postlethwait, 2007). Alternative fates of duplicated genes are neofunctionalization and subfunctionalization. In neofunctionalization, one or both paralogous genes acquire new functional roles; in subfunctionalization, the genes divide their original functions (Postlethwait et al., 2004; Taylor and Raes, 2004; Postlethwait, 2007). In addition to gene duplication, alternative splicing is another major source of protein function diversity. When an alternatively spliced gene is duplicated, each copy can lose some alternative splicing isoforms due to the functional redundancy, or it can acquire new isoforms (Su et al., 2006). Therefore, alternative splicing can contribute also to the neo- or subfunctionalization of one or both newly duplicated genes.
nlgn3a and nlgn4b, which lost alternative splicing sites A1 and A2, respectively (see Supporting Information Tables S4, S5), can represent typical cases of subfunctionalization events. Their paralogous genes (nlgn3b and nlgn4a) present the exons lost by nlgn3a and nlgn4b and they produce the corresponding specific isoforms, as demonstrated by RT-PCR analyses (see Fig. 5A and B for further details).
Both nlgn2 genes (a and b) are more divergent from the human gene. Exons 1–4 present similar features but, from exon 5 to exon 7, the genomic organization varies (see Supporting Information Table S3). In particular, exons 5–7 of both zebrafish nlgn2s seem to correspond to exons 5–6 of the human gene. Notably, the amino acid sequence of this terminal region of the protein is almost fully conserved. A similar case is represented by nlgn3a (see Supporting Information Table S4), where exons 6–7 correspond to a single exon in human Neuroligin 3. The presence of a single exon in nlgn3b suggests that in nlgn3a the exonic fragmentation is the result of a specific phenomenon of evolutionary divergence, which occurred after the complete duplication of the zebrafish genome.
Analysis of Intron Loss/Gain Events During Evolution
In recent years, numerous independent studies have indicated that, in eukaryotes, intron evolution is a dynamic process and introns are gained and lost in different genomes in response to strong selective pressures (Jeffares et al., 2006). Therefore, at least a fraction of introns seems to be important for genome adaptation. Since the reconstruction of intron gain/loss events may provide valuable information to clarify evolutionary relationships within gene families and may provide insights about their possible functional implications (Coulombe-Huntington and Majewski, 2007), we compared the region corresponding to exons 5–7 of human Neuroligin 1 in different species of vertebrates. Furthermore, we reconstructed the possible intron gain/loss events along the evolutionary history of the Neuroligin gene family (Fig. 3A–B).
In Figure 3A we present a phylogenomic reconstruction of the above-mentioned exonic region, and in Figure 3B we show a graphic representation of the same region limited to the genes subjected to intron gain/loss events with a classification of exons and introns. In our phylogenomic analysis, we postulated that the presence of a single exon 6 (of 790 bases) could represent the original condition of the hypothetical ancestral gene. This working hypothesis is essentially based on the following observations: (1) in most cases, this region of the CDS is encoded by a single exon, (2) in almost all the species, the sum of the fragmented exons corresponds to the size of the single-exon form, and (3) the exon obtained by assembling all the fragmented exons presents similar borders and, if compared to the single-exon form of exon 6, it encodes the same protein region. As shown in Figure 3, using the mapped presence/absence of introns on the phylogenetic tree, we calculated the presence of 11 intron gains (green circles) and a single intron loss event (red square). Intriguingly, Neuroligin 1 presents four intron gains only in teleosts (see Fig. 3A and B), with the only exception of D. rerio. This means that these events followed the zebrafish speciation and, consequently, since zebrafish diverged about 314–332 Myr ago from the other teleost species (Kasahara et al., 2007), they are relatively old. Moreover, they were completely fixed during evolution. Although it is not possible to exclude possible intron loss events in zebrafish, our hypothesis remains the most parsimonious.
The Neuroligins 2 present by far the most fragmented exon structure. In our analysis, only X. tropicalis and A. carolinensis present a unique exon 6 as the supposed ancestor. In Neuroligins 2, we identified two intron gains (introns i5a5 and i5b4 in Fig. 3B) before the teleost radiation, followed by a specific intron loss in T. nigroviridis (resulting in the formation of exon 6a6) and, lastly, an independent intron gain event probably at the base of mammals (i5a4). It is important to note that although exons 6a3 and 6a5 of teleost nlgn1s and nlgn2s, respectively, present very similar sizes (267 vs. 266 bp), introns i5a3 and i5a5 cannot be considered orthologous introns subjected to intron sliding events because they present different phases (Garcia-Espana et al., 2009). Therefore, our data indicate that, during evolution, two different introns were independently inserted in almost the same exonic position, suggesting that this region could represent a hotspot of intron gain.
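The phase argument above relies on the usual convention that an intron is in phase 0 when it lies between two codons, and in phase 1 or 2 when it falls after the first or second base of a codon. A minimal sketch of that calculation is given below; the function name and the exon lengths in the example are hypothetical, and the lengths must refer to coding bases only.

```python
from itertools import accumulate


def intron_phases(coding_exon_lengths):
    """Return the phase (0, 1, or 2) of each intron in a CDS.

    coding_exon_lengths: number of coding bases contributed by each exon,
    in transcript order (untranslated bases excluded).
    The phase of an intron is the number of coding bases that precede it,
    modulo 3.
    """
    # The last exon is not followed by an intron, so it is skipped.
    return [upstream % 3 for upstream in accumulate(coding_exon_lengths[:-1])]


# Hypothetical exon lengths, for illustration only.
print(intron_phases([100, 266, 10]))
```

Two introns located at nearly the same position in homologous genes but with different phases interrupt the reading frame differently, which is why phase is informative when deciding whether they share a common origin.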
Finally, the gene structures of Neuroligin 3 and 4 appear more conserved during evolution. Only limited and independent events occurred in two different fish species. As previously noted, an intron gain occurred only in zebrafish nlgn3a (the insertion of intron i5a1 with the formation of exons 6a1 and 6b1) and three different events in T. nigroviridis Neuroligin 4b.
Taken together, these results suggest that major branching events in the vertebrate tree (like the origin of mammals and teleosts) seem to be correlated with intron gain events. Furthermore, our results are very consistent with the following independent results: (1) genome-wide informatics studies suggest that intron gain is more prevalent than intron loss (Babenko et al., 2004; Kumar and Hedges, 2005; Roy and Gilbert, 2005a, b), (2) the fish lineage has gained many introns after it diverged from the ancestor of the mammalian lineage (Venkatesh et al., 1999), and (3) intron gain is extremely rare in mammals (Roy et al., 2003).
Although in almost all the highlighted cases for Neuroligin genes, intron gain/loss events do not induce variations in the amino acid sequence and protein structure, they represent strong selective events. Thus, a possible explanation is that intron gain/loss events could represent evolutionary events, which can result in a different gene regulation of distinct members of the same gene family. Indeed, some introns are known to enhance or be necessary for normal levels of mRNA transcription, processing, and transport. Moreover, they can encode a variety of untranslated RNAs including microRNAs, small nucleolar RNAs, and guide RNAs for RNA editing (Jeffares et al., 2006). From this point of view, the presence of the vast majority of these events in the most ancient species analyzed (teleosts) suggests a possible correlation between intron gain/loss events and the evolutionary age of the species. It is interesting to note that, at least in vertebrates, all the Neuroligin intron gain/loss events are localized in the last exons of CDS, which encode part of the extracellular region and the intracellular portion of the proteins. A possible explanation is that this exonic region represents the majority of the CDS (∼62%) and as a result it displays an increased probability of being subjected to these events. However, the possibility that this increased frequency is caused by the presence of specific nucleotide sequences favoring gain/loss events (as suggested by independent insertion of introns i5a3 and i5a5 in teleost nlgn1 and nlgn2, respectively) cannot be excluded and it requires further analyses.
Correlation of Intron Gain/Loss Events and Secondary Protein Structure
Experimental evidence indicates that a non-random tendency exists for introns to be located in interdomain regions of proteins (Patthy, 1999; Liu and Altman, 2003) and that introns have a propensity to avoid secondary structure elements such as alpha-helices and beta-strands (Contreras-Moreira et al., 2003). Therefore, we investigated the possible correlation between intron gain/loss events and the secondary structure of Neuroligin proteins.
In Figure 4, we aligned the amino acid sequences of the Neuroligins subjected to intron gain/loss events (see also Fig. 3B) and we compared their different exonic structures to the secondary structure elements of crystallized Neuroligins (Fabrichny et al., 2007; Koehnke et al., 2008a). In Neuroligins, most of the intron gain/loss events are localized outside or at the borders of secondary structures and only few events are found inside alpha-helices. This tendency could be explained by the purifying effects of natural selection, as a result of the chance of disrupting the alpha-helices. Mammalian Neuroligins 2 (represented in Fig. 4 by the mouse sequence) and teleost Neuroligin 1 (represented by the fugu sequence) present a unique intronic insertion (respectively, introns i5a4 and i5b3 in Fig. 3B) in an interdomain region. Moreover, all the teleost Neuroligins 2 display an intron (i5b4) at the border of one of the two alpha-helices involved in Neuroligin dimerization (α14) and T. nigroviridis Neuroligin 4b presents an intron immediately before alpha-helix 17 (indicated as α17 in Fig. 4).
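The comparison between intron positions and secondary structure elements can be sketched as a simple interval classification. The function name, the one-residue border tolerance, and the coordinates in the example are assumptions for illustration; they are not taken from the crystal structures or from Figure 4.

```python
def classify_intron(position, elements, tolerance=1):
    """Classify an intron position relative to secondary structure elements.

    position:  residue index at which the intron interrupts the protein.
    elements:  list of (name, start, end) with inclusive residue indices.
    tolerance: distance (in residues) from an element end that still
               counts as "border" (hypothetical threshold).
    Returns a (category, element_name) pair; element_name is None
    when the position lies outside every element.
    """
    for name, start, end in elements:
        if abs(position - start) <= tolerance or abs(position - end) <= tolerance:
            return ("border", name)
        if start < position < end:
            return ("inside", name)
    return ("outside", None)


# Hypothetical element coordinates, for illustration only.
toy_elements = [("alpha9", 10, 20), ("beta11", 30, 35)]
for pos in (15, 21, 25, 30):
    print(pos, classify_intron(pos, toy_elements))
```

Applied to every mapped intron, a tally of the three categories gives a quick view of whether insertions tend to avoid the interior of helices and strands.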
In a few cases, the intron insertions involve specific secondary structure elements. In D. rerio nlgn3a, the intron i5a1, which divides exon 6 into exons 6a1 and 6b1 (Fig. 3B), was inserted inside the coding sequence for alpha-helix 9 (α9 in Fig. 4). Moreover, in teleost Neuroligins 1–2 two different intron insertions occurred (i5a3 and i5a5, respectively) between α11 and β11. However, in all these cases, the intron gain did not produce any sort of protein sequence alteration. As shown in Figure 4, the region between α11 and β11 (residues QGEFLN, highlighted with a violet box in Fig. 4) represents the main binding site for beta-Neurexin 1 (Fabrichny et al., 2007). Notably, the specific intron loss event in T. nigroviridis produced the insertion in nlgn2b CDS of a specific sequence absent in all the other Neuroligins (see Fig. 4); this nucleotide sequence encodes 27 amino acids that break the main binding region (Q-GEFLN, where the dash indicates the insertion). Nevertheless, surprisingly, the amino acid insertion terminates with a glutamine (Q), thus reforming the correct pair of residues (QG) involved in Ca2+ coordination at the interface between Neuroligin and beta-Neurexin 1 (Fabrichny et al., 2007). The effect of this sequence insertion on the Neuroligin structure and, therefore, on the Neurexin–Neuroligin binding requires further analyses.
Analysis of Alternative Splicing of Zebrafish Neuroligins in Adult Organs and During Embryonic Development
Mammalian Neuroligins present two different alternative splice sites (sites A and B) inside the esterase-like domain. In particular, each human Neuroligin gene encodes different isoforms at site A (A1 and A2), while only Neuroligin 1 is alternatively spliced at site B (see Fig. 1).
Previous studies have already focused on the expression pattern of Neuroligin genes in different species (from mammals to birds), but few data concerning the alternative splicing pattern variation are available (Philibert et al., 2000; Bolliger et al., 2001; Chih et al., 2006; Suckow et al., 2008).
In order to analyse the Neuroligins expression and their alternative splicing pattern variation in zebrafish, we performed RT-PCR assays on cDNA obtained from different adult organs and developmental stages, using primers specific for zebrafish isoforms (Fig. 5A,B and Supporting Information Table S8). Zebrafish Neuroligin transcripts are detected in many adult organs with the exception of muscle and swim bladder where they are barely identifiable or completely absent (Fig. 5A). Notably, in some cases the alternative splicing pattern seems to be organ specific.
nlgn1 is strongly expressed in ovary, testis, brain, and eye samples. Ovary and testis have a mirror-image pattern and, interestingly, only in brain and eye is a third band corresponding to the isoform A1+A2 detectable. At site B (indicated as SS2 in Fig. 5) only eye, brain, ovary, and testis present a band corresponding to the insert-minus form, which is expressed at a low level.
nlgn2a and nlgn2b present a very similar expression pattern. Their expression is abundant in brain, eye, and testis where both forms (with or without exon A2) are present. The major difference between these two duplicated genes is that in nlgn2a, the short form seems to be more represented. Moreover, as shown in Supporting Information Table S6, both genes present a low percentage of sequence identity in intronic regions flanking the alternatively spliced exon at site A2. Together, these data indicate the presence of subfunctionalization events in the regulatory mechanisms of alternative splicing of both genes.
The nlgn3a and nlgn3b genes represent another case of subfunctionalization event. While nlgn3a shows a unique band corresponding to insert-minus isoform, nlgn3b presents all the possible isoforms (compare eye and brain samples in Fig. 5A and 24–120-hpf stages in Fig. 5B).
With the exception of the swim bladder, nlgn4a is expressed in all the organs tested. The splice site A of nlgn4a can produce two different isoforms (with or without insert A2), confirming that alternative splicing occurs at site A. Moreover, while the longer isoform is expressed only in specific organs, the isoform without the insert is shared by all the organs tested. nlgn4b insert-minus transcripts are present at low levels in digestive system, gills, and testis, while their expression is particularly enriched in eye, ovary, and brain. Overall, these data indicate that, in adult Danio rerio, Neuroligin genes are expressed and alternatively spliced in different organs inside and outside the CNS. Although their broad expression can indicate possible unknown functions, its real functional meaning requires further analyses.
Figure 5B shows the different isoforms expressed during zebrafish embryonic development. Like the Neurexin genes (Rissone et al., 2007), many nlgns (1, 2a, 2b, 3a, and 4b) are expressed from the earliest stages of development and they increase their expression from 24 hr post fertilization (hpf).
As independently confirmed by an EST from ovary (Acc. Number: BI709920), all the nlgn1 isoforms are maternally inherited. For nlgn1 splice site A, the intermediate form (with inserts A1 or A2) is the more abundant throughout embryo development. The expression level of the other isoforms decreases from tailbud to 15–20 somite stage and afterwards it increases starting from 24 hpf. Concerning site B, both possible forms are present during development. Starting from late epiboly, the shorter form is no longer detected and, as observed in other vertebrate species (Sudhof, 2008), the longer form predominates (Fig. 5B).
The insert-plus isoform of nlgn2a is expressed, at very low levels, from the first stages of development. A lower band representing the insert-minus isoform is detectable from the 8-somite stage and, in the last examined stages, both isoforms are more represented. nlgn2a and nlgn2b present a similar expression during embryonic development but, as previously observed in adult organs, the alternative splicing pattern is different. In particular, the insert-minus isoform of nlgn2b is expressed at very low levels (Fig. 5B).
As previously mentioned, nlgn3a presents a unique isoform without insert, which is present at all developmental stages examined. Although the intensity of the band seems to vary at different stages, it is possible to observe a marked increase starting from 24 hpf. On the contrary, nlgn3b is not expressed until the 8-somite stage. Notably, starting from 24 hpf its expression increases and all the possible alternatively spliced isoforms are produced.
Finally, nlgn4a and nlgn4b display a complementary distribution. While nlgn4a is not expressed until the 50% epiboly stage (Fig. 5B), nlgn4b transcripts are maternally inherited and they are detectable up to 120 hpf. The complementary expression of duplicate nlgn3s and nlgn4s could be induced by further evolutionary phenomena targeting their promoter regions, which resulted in different temporal expressions.
The data presented so far indicate that, during evolution, duplicate Neuroligin genes underwent different evolutionary fates that differentiated their gene expression and alternative splicing regulation.
Analysis of Expression Patterns of Zebrafish Neuroligin Genes During Embryonic Development
We analyzed the spatial and temporal expression pattern of zebrafish Neuroligins during embryogenesis by whole mount in situ hybridization (WISH) from 48 hr post-fertilization (hpf) to 120 hpf. None of the sense probes showed any staining (data not shown). In general, Neuroligins are widely expressed throughout the brain in all the stages analyzed (Figs. 6–7).
At 48 hpf, nlgn1 is mainly expressed in the mesencephalon and the rostral part of rhomboencephalon (Fig. 6A, B). Moreover, a discrete staining is weakly present also in the ventral diencephalon and telencephalon. Afterwards, a strong signal is visible throughout the whole brain and the retina (Fig. 6C,D) while the expression of nlgn1 appears to fade at 120 hpf (Fig. 6E,F).
In all the analyzed stages, the expression pattern of nlgn2a is quite similar to that of nlgn1 with the exception of the retina at 72 hpf (compare Fig. 6D and J). nlgn2b is widely detected at stage 48 hpf in the brain. Notably, compared to nlgn2a, the staining in the telencephalon is wider and, intriguingly, a positive signal is also present in the retina (black arrowheads in Fig. 6N, P, and R). Later during development (72 and 120 hpf), the mesencephalon and the rhomboencephalon are strongly stained, while only discrete regions within the diencephalon show a positive signal.
The two Neuroligin 3 paralogs display rather distinct expression patterns (Fig. 7A–L). nlgn3a mRNA is detected at 48 hpf in discrete regions of the telencephalon, diencephalon, and rhomboencephalon. At later stages, positive staining is widespread in the brain, with discrete regions visible in the telencephalon and diencephalon, while a faint signal is also present in the retina (black arrowhead in Fig. 7F). In contrast, at 48 and 120 hpf nlgn3b presents a more restricted expression pattern (Fig. 7G,H and K,L), while at 72 hpf it is expressed throughout the brain (Fig. 7I,J). Notably, the retina is completely unstained at all the analyzed stages.
At 48 hpf, nlgn4a mRNA is detectable in discrete areas of the telencephalon, diencephalon, and rhomboencephalon (Fig. 7M,N). At 72 hpf, its expression expands also to mesencephalon, while at 120 hpf the ventral mesencephalon, the anterior rhomboencephalon, and the diencephalon are stained (Fig. 7O–R).
For nlgn4b, a strong signal is visible in all the analyzed stages. At 48 hpf, the staining is present in the diencephalon and rhomboencephalon (Fig. 7S,T); starting from 72 hpf, the mRNA is mainly detectable in the ventral mesencephalon and in the anterior rhomboencephalon (Fig. 7U–X). At 72 hpf, a faint signal is also visible in the dorsal diencephalon (Fig. 7U).
Taken together, these data indicate that, starting from 48 hpf, several Neuroligins are widely expressed in the brain of developing embryos. Although almost all Neuroligins share a similar expression pattern in some brain regions, such as the midbrain and the hindbrain, in some cases we observed differential expression between paralogous genes.
For example, nlgn2b and nlgn3a are expressed in the retina (from 48 hpf and at 120 hpf, respectively) and at 48 hpf nlgn4b is strongly expressed only in the dorsal diencephalon and ventral hindbrain. Finally, nlgn3b at 48 and 120 hpf presents a very restricted staining pattern.
Overall, RT-PCR and WISH analyses indicate that, during evolution, different subfunctionalization events differentiated the gene expression and the alternative splicing pattern of paralogous Neuroligins in zebrafish.
Zebrafish were raised and maintained under standard laboratory conditions as described in the Zebrafish Book (Westerfield, 2000) and staged according to Kimmel et al. (1995). Beginning at 24 hpf, embryos were cultured in fishwater containing 0.003% 1-phenyl-2-thiourea (PTU) to prevent pigmentation and 0.01% methylene blue to prevent fungal growth.
Databases and Bioinformatic Analysis of Data
The following genome assemblies were searched with the tools available from ENSEMBL: Homo sapiens (GRCh37, Feb. 2009), Mus musculus (NCBI m37, Apr. 2007), Danio rerio (Zv8, Dec. 2008), Caenorhabditis elegans (WS200, Jan. 2009), Tetraodon nigroviridis (TETRAODON 8.0, Mar. 2007), Takifugu rubripes (FUGU 4.0, June 2005), Oryzias latipes (HdrR, Oct. 2005), Gasterosteus aculeatus (BROAD S1, Feb. 2006), Gallus gallus (WASHUC2, May 2006), Anolis carolinensis (AnoCar1.0, Feb. 2007), Monodelphis domestica (monDom5, Oct. 2006), Rattus norvegicus (RGSC 3.4, Dec. 2004), Xenopus tropicalis (JGI 4.1, Aug. 2005), Ciona savignyi (CSAV 2.0, Oct. 2005), and Ciona intestinalis (JGI version 2.0). Multi-alignment of human and zebrafish Neuroligins was performed with AlignX of VECTOR NTI Advance 10.1.1 (Invitrogen Corporation, Carlsbad, CA) and edited with GeneDoc version 2.7 (Nicholas, 1997). Structural features of amino acid sequences of human Neuroligins were predicted using release 20.52 (dated 28 July 2009) of PROSITE (http://www.expasy.org/prosite/).
Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 4.1, build number 4103 (Tamura et al., 2007). Amino acid sequences of Neuroligins from organisms representing different taxa were aligned in MEGA 4.1 using the Clustal W algorithm with a Gonnet protein weight matrix. A rooted phylogenetic tree was built using the Neighbor-Joining method. Bootstrap analyses with 1,000 replicates were used to assess the strength of the topologies.
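The Neighbor-Joining step can be illustrated with a minimal Python sketch. This is not the MEGA implementation: it only shows the standard pair-selection criterion and distance update on a toy matrix. The four taxon labels and their distances are invented for illustration, and bootstrapping and rooting are left out.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Return the clusters (frozensets of leaf names) joined by Neighbor-Joining.

    names  : list of taxon labels
    matrix : symmetric list of lists of pairwise distances, in the order of names
    The loop stops when three nodes remain, which fixes the unrooted topology.
    """
    # Each active node is represented by the frozenset of leaves below it.
    nodes = [frozenset([n]) for n in names]
    dist = {}
    for i, j in combinations(range(len(names)), 2):
        dist[frozenset((nodes[i], nodes[j]))] = matrix[i][j]

    def d(a, b):
        return dist[frozenset((a, b))]

    joined = []
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence r_i: sum of distances from node i to all other nodes.
        r = {a: sum(d(a, b) for b in nodes if b != a) for a in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) * d(i, j) - r_i - r_j.
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d(p[0], p[1]) - r[p[0]] - r[p[1]],
        )
        new = a | b
        # Distance from the new internal node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                dist[frozenset((new, c))] = 0.5 * (d(a, c) + d(b, c) - d(a, b))
        nodes = [c for c in nodes if c not in (a, b)] + [new]
        joined.append(new)
    return joined


# Toy additive distances for a tree in which A groups with B and C with D.
toy_names = ["A", "B", "C", "D"]
toy_matrix = [
    [0, 3, 6, 8],
    [3, 0, 7, 9],
    [6, 7, 0, 6],
    [8, 9, 6, 0],
]
clusters = neighbor_joining(toy_names, toy_matrix)
```

In a bootstrap analysis, the same procedure would be repeated on distance matrices computed from alignment columns resampled with replacement, and the support for a cluster would be the fraction of replicates in which it reappears.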
Cloning of Zebrafish Neurexins and Neuroligins
Public databases were searched to find genes and their genomic environments: Ensembl (http://www.ensembl.org/) and NCBI (http://www.ncbi.nlm.nih.gov/). For 5′ RACE, we used two different kits: the FirstChoice® RLM-RACE Kit (Ambion, Austin, TX) and the 5′ RACE System for Rapid Amplification of cDNA Ends (Invitrogen), following the manufacturers' instructions. The cDNA sequences obtained with PCR methods were compared with the genomic sequences to identify the splice sites.
To find evidence for the conservation of synteny, we compared genomic regions neighbouring the zebrafish Neuroligins to the genes neighbouring human and mouse Neuroligins. Putative orthologs for each zebrafish gene were located on the human and mouse map using the MultiContigView comparative tools in release 49 of ENSEMBL Genome Browser (Hubbard et al., 2005).
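The logic of the synteny comparison can be sketched in Python as follows. This is only an illustration of the idea, not the MultiContigView procedure: it assumes that each zebrafish neighbour has already been mapped to the name of its putative ortholog, and the gene names used here are hypothetical placeholders.

```python
def shared_neighbors(zebrafish_neighbors, reference_neighbors):
    """Return the zebrafish neighbours whose ortholog also flanks the reference gene.

    Both arguments are lists of ortholog names (already mapped to a common
    nomenclature); the zebrafish order is preserved in the result.
    """
    reference = set(reference_neighbors)
    return [gene for gene in zebrafish_neighbors if gene in reference]


# Placeholder names only; real analyses would use the annotated flanking genes.
conserved = shared_neighbors(
    ["geneA", "geneB", "geneC"],
    ["geneC", "geneA", "geneX"],
)
```

A larger number of shared neighbours around a zebrafish gene and its candidate mammalian ortholog supports conservation of synteny.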
Total RNAs were prepared from different zebrafish adult organs (muscle, swim bladder, digestive system, eye, ovary, heart, gills, brain, and testis), oocytes, and embryos at different developmental stages using the Totally RNA Isolation Kit (Ambion) or the RNAgents Total RNA Isolation System (Promega, Madison, WI). The RNAs were treated with RNase-free DNase I (Roche) to remove possible genomic DNA contamination and then reverse transcribed using either Superscript II (Invitrogen) with oligo dT primers or the ImProm-II Reverse Transcription System (Promega) with random primers. The cDNAs were then subjected to PCR amplification with specific primers (see Supporting Information Table S8) and Expand High Fidelity Taq polymerase (Roche) or GoTaq (Promega), following the manufacturer's instructions. Whenever possible, the two primers of each pair were designed on different exons to avoid amplifying any genomic DNA still present in the cDNA preparations. Control PCR experiments with samples prepared without reverse transcriptase were performed to ensure that genomic DNA contamination did not contribute to the PCR amplification (data not shown). The PCRs consisted of an initial denaturation at 95°C for 3 min, followed by 35 cycles. Each cycle consisted of a denaturation step at 95°C for 30 s, a 30-s annealing step at the temperature specified in Supporting Information Table S8, and an extension step at 72°C for a time depending on fragment length. A final extension step of 10 min at 72°C was added to each PCR. Products were then separated on agarose gels of different concentrations (from 1 to 3%, depending on fragment length), visualised by ethidium bromide staining, and scanned with a Typhoon 8600 (Molecular Dynamics, Sunnyvale, CA). Nucleotide sequences of PCR products were determined by cloning into the pCRII system (Invitrogen) or the pGEM-T and pGEM-T Easy Vector systems (Promega), followed by sequencing of both strands (PRIMM).
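The thermal profile described above can be written down as a small Python sketch, which makes the structure of the program explicit and gives a rough estimate of the hold time. The annealing temperature and extension time are primer- and fragment-specific (Supporting Information Table S8), so the values passed in the example are hypothetical; ramping times between steps are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def build_program(annealing_c, extension_s, cycles=35):
    """Return (initial step, per-cycle steps, final step, cycle count)."""
    initial = Step("initial denaturation", 95, 3 * 60)
    cycle_steps = [
        Step("denaturation", 95, 30),
        Step("annealing", annealing_c, 30),   # temperature depends on the primer pair
        Step("extension", 72, extension_s),   # time depends on fragment length
    ]
    final = Step("final extension", 72, 10 * 60)
    return initial, cycle_steps, final, cycles


def total_seconds(annealing_c, extension_s, cycles=35):
    """Total hold time of the program, excluding ramping between steps."""
    initial, cycle_steps, final, n = build_program(annealing_c, extension_s, cycles)
    per_cycle = sum(step.seconds for step in cycle_steps)
    return initial.seconds + n * per_cycle + final.seconds


# Hypothetical example: 58°C annealing and a 60-s extension.
example_total = total_seconds(annealing_c=58, extension_s=60)
```

With these example values the hold time is 180 + 35 × 120 + 600 = 4,980 s, that is, 83 min.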
A fragment of zebrafish β-actin cDNA was amplified by PCR (35 cycles) as an internal control for cDNA quality, using a primer pair that also demonstrates the absence of genomic contamination in our RNA preparations (Argenton et al., 2004).
The accession numbers of zebrafish Neuroligin family members and all the other sequences used in phylogenetic analyses are listed in Supporting Information Table S7.
Gene-Specific Primer Sequences
The sequences of primers used in alternative splicing pattern analysis are listed in Supporting Information Table S8.
In Situ Hybridization
To prepare the Neuroligin probes, gene-specific fragments were amplified by RT-PCR from a suitable template. All the primers were designed to cover the 3′ end of the coding sequence and part of the 3′UTR of the gene of interest. PCR products were cloned into the pGEM-T Easy Vector (Promega) or pCRII-TOPO (Invitrogen) and recombinant plasmids were sequenced on both strands. MAXIscript SP6/T7 Kits (Ambion) were used to synthesize antisense or sense RNA probes. All embryos used for whole-mount in situ hybridization (WISH) were fixed for 2 hr in 4% paraformaldehyde/phosphate buffered saline, rinsed with PBS-Tween, dehydrated in 100% methanol, and stored at −20°C until processed for WISH. WISH assays were carried out according to Thisse et al. (1993), with modifications. Wild-type embryos were hybridized using digoxigenin-11-UTP (Boehringer, Roche) in vitro–labeled riboprobes at a concentration of 0.5–1 ng/μl. Hybridization was carried out at 68°C. Hybridized probes were then detected using an anti-digoxigenin antibody conjugated to alkaline phosphatase (AP, Boehringer-Roche) at a 1:5,000 dilution, and nitroblue tetrazolium/5-bromo-4-chloro-3-indolyl phosphate (NBT/BCIP; Promega) was used as the substrate for AP. Stained embryos were dehydrated and stored in methanol.
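The probe concentration and antibody dilution given above translate into working amounts by simple arithmetic, sketched here in Python. The reaction volumes in the example are hypothetical, since the text does not state them, and the 1:5,000 dilution is interpreted as one part antibody stock in 5,000 parts final volume.

```python
def probe_mass_range_ng(hyb_volume_ul, low_ng_per_ul=0.5, high_ng_per_ul=1.0):
    """Mass of riboprobe (ng) needed to reach 0.5-1 ng/ul in the given volume."""
    return hyb_volume_ul * low_ng_per_ul, hyb_volume_ul * high_ng_per_ul


def antibody_volume_ul(final_volume_ul, dilution_factor=5000):
    """Volume of antibody stock (ul) for a 1:dilution_factor dilution."""
    return final_volume_ul / dilution_factor


# Hypothetical volumes: 200 ul of hybridization mix, 1,000 ul of antibody solution.
probe_low_ng, probe_high_ng = probe_mass_range_ng(200)
antibody_ul = antibody_volume_ul(1000)
```

For these example volumes, 100 to 200 ng of probe and 0.2 μl of antibody stock would be required.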
We thank all of the members of Prof. Cotelli's lab for helpful advice and expertise. Special thanks go to Giulio Pavesi and David Horner for helpful discussions and to Erica Bresciani, Manuela Marai, Luca Del Giacco, and Roberto Marotta for a critical reading of the manuscript. This study was supported by the Associazione Italiana per la Ricerca sul Cancro (AIRC), Regione Piemonte (Ricerca Finalizzata 2007, 2008, 2009; Ricerca industriale e competitiva 2006, grant PRESTO; Ricerca Tecnologie convergenti 2007, grant PHOENICS; Piattaforme tecnologiche per le biotecnologie, grant Druidi); Fondazione CRT-Torino, and “Ministero della Salute” (Programma Ricerca Oncologica 2006, Ricerca Finalizzata 2006); Fondazione Cariplo (grant number 2006.0807) and Progetto Cariplo N.O.B.E.L. (biological and molecular characterization of cancer stem cells), and Regione Piemonte (Ricerca Scientifica Applicata 2004, A17).