FOXP2 is a forkhead domain transcription factor whose mutation has been associated with severe deficits in language. A single amino acid missense mutation in FOXP2 was identified in a family (KE family) with a dominantly inherited impairment of grammar and articulation. An unrelated individual with similarly affected language was also identified and found to have a chromosomal translocation breakpoint in FOXP2 (Lai et al., 2001). To date, it is the only gene known to specifically affect the development of language.
The family of forkhead-box (FOX) genes is conserved through evolution from fungi to mammals, and plays diverse roles in developmental processes including eye development, T-cell maturation, mesoderm patterning, and nervous system differentiation (reviewed in Carlsson and Mahlapuu, 2002). The FOX proteins are characterized by their forkhead domain, a highly conserved 100-residue DNA-binding domain. The sub-family of FOXP proteins has four members (FOXP1–4), which share high conservation of the forkhead domain as well as throughout the entire protein. The FOXP forkhead domain is 80% identical between human and Drosophila proteins, and the entire protein is 62% identical between FoxP1 and FoxP2 (Shu et al., 2001; Banerjee-Basu and Baxevanis, 2004).
Mouse Foxp1 and Foxp2 are expressed in the CNS, but the other two members of the FOXP gene family are not. Foxp3 has been implicated in T-cell development, and mutations in Foxp3 cause immune system dysregulation (Bennett et al., 2000; Brunkow et al., 2001; Hori et al., 2003). Foxp4 is expressed in lung and gut tissues but not the CNS (Lu et al., 2002).
FOXP2 has been cloned and its expression described in human, mouse, and songbird (Shu et al., 2001; Ferland et al., 2003; Takahashi et al., 2003; Lai et al., 2003; Haesler et al., 2004; Teramitsu et al., 2004). In vitro studies suggest that FOXP2 may function as a transcriptional repressor (Shu et al., 2001), possibly by homo- or hetero-dimerization via a leucine zipper motif (Wang et al., 2003). Here we describe the initial characterization of zebrafish foxP2. The highly conserved sequence and expression pattern of this gene suggest that, in addition to its specific role in human language development, it is likely to play a more general role in vertebrate CNS development.
RESULTS AND DISCUSSION
Molecular Cloning, Sequencing, and Mapping of foxP2
To identify zebrafish orthologs of FOXP2, the sequences of the human cDNAs encoding FOXP2 and FOXP1 were used in a BLAST search of zebrafish EST sequences. Two candidate ESTs were identified, and their corresponding cDNAs were obtained and completely sequenced. Based on comparisons to known FOXP1 and FOXP2 genes, one clone was most similar to known FOXP1 genes (although it lacked both the 5′ and 3′ ends), while the other clone was most similar to known FOXP2 genes. This latter cDNA spans 2,384 bp; however, it was missing the 5′-most 200 bp of the predicted foxP2 sequence, which we subsequently cloned and sequenced using 5′-RACE from cDNA. To confirm the nucleotide sequence, we also cloned and sequenced the entire zebrafish foxP2 cDNA using RT-PCR with primers from the 5′ and 3′ UTRs. The entire sequence has been deposited under GenBank accession number DQ061052.
The predicted protein has 697 amino acids (Fig. 1). Comparison to available genomic sequence (Sanger Centre Whole Genome Assembly version 4, which is incomplete in this area) reveals that the foxP2 gene includes at least 19 exons. The gene spans more than 92 kb, and includes one alternatively spliced exon in the 5′ untranslated region (identified using 5′-RACE) (Fig. 2A).
To determine the chromosomal location of foxP2, radiation hybrid mapping on the T51 panel (Geisler et al., 1999) was performed using PCR primers Prj1 and Prj2, which span an intron-exon boundary of foxP2. foxP2 maps approximately 300 cR (centiRays) from the top of linkage group 4, in the vicinity of brn 1.2. The closest markers are chunp30694 (25 cR) and fc13d11 (28 cR).
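For readers unfamiliar with centiRay distances, the following is a minimal sketch of a standard two-point radiation hybrid estimate. It is an illustration only: the text does not state how the reported distances were computed, and the function name, the 0/1 scoring vectors, and the assumption of a single shared retention frequency are all hypothetical. Under that model, the fraction of hybrids in which exactly one marker is retained equals 2r(1 − r)θ, where r is the retention frequency and θ the breakage probability, and the distance in centiRays is −100 ln(1 − θ).

```python
import math


def two_point_centirays(scores_a, scores_b):
    """Estimate the distance (in cR) between two markers.

    scores_a and scores_b are hypothetical scoring vectors over the same
    hybrid cell lines: 1 if the marker was retained, 0 if it was absent.
    Assumes both markers share one retention frequency.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score vectors must be non-empty and of equal length")
    n = len(scores_a)
    # Average retention frequency across both markers.
    retention = (sum(scores_a) + sum(scores_b)) / (2 * n)
    # Fraction of hybrids that retain exactly one of the two markers.
    discordant = sum(1 for a, b in zip(scores_a, scores_b) if a != b) / n
    denominator = 2 * retention * (1 - retention)
    if denominator == 0:
        raise ValueError("markers are uninformative (retained in all or no hybrids)")
    theta = discordant / denominator  # estimated breakage probability
    if theta >= 1:
        return math.inf  # markers behave as unlinked
    return max(0.0, -100.0 * math.log(1.0 - theta))


# Toy data, not taken from the T51 panel.
marker_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
marker_b = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
print(round(two_point_centirays(marker_a, marker_b), 1))
```

Full map construction combines many such pairwise estimates across markers; this sketch covers only the pairwise step.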
A comparison of the amino acid sequence with other FOXP2 proteins reveals significant sequence divergence of zebrafish FoxP2 from its orthologs. The entire protein is 78% identical, and 85% similar, to its mouse and primate orthologs (Fig. 2B). Notable differences in zebrafish FoxP2 include a stretch of only three glutamines at the location where its human ortholog contains 39, and a 15-amino acid insertion (compared to its orthologs) in its C-terminus. Phylogenetic analysis shows that zebrafish FoxP2, although significantly diverged from its orthologs, is nevertheless clearly distinct from the FOXP1 homologs (Fig. 2C).
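The distinction between percent identity and percent similarity can be made concrete with a short sketch. The code below is illustrative and is not the analysis used for Figure 2B: the residue groupings in `SIMILAR_GROUPS` are an assumed, simplified stand-in for a substitution matrix, the choice to count gapped columns in the denominator is one of several possible conventions, and the two aligned strings are toy data.

```python
# Assumed groups of chemically similar residues (a simplification; alignment
# software typically scores similarity with a substitution matrix instead).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both strings must have the same length and use '-' for gaps. Columns in
    which both sequences have a gap are skipped; columns with a gap in only
    one sequence count as compared but neither identical nor similar.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    group_of = {aa: i for i, group in enumerate(SIMILAR_GROUPS) for aa in group}
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == "-" or b == "-":
            continue
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in group_of and group_of[a] == group_of.get(b):
            similar += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy alignment: a shortened glutamine run in one sequence, an insertion in the other.
identity, similarity = percent_identity_similarity("MKQQQLS-T", "MRQ--LSAT")
print(f"{identity:.1f}% identical, {similarity:.1f}% similar")
```

Because similarity counts conservative substitutions in addition to exact matches, it is always at least as high as identity for the same alignment.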
A high degree of conservation is maintained, however, in the zinc-finger domain (amino acids 311 to 335), in the leucine zipper domain (amino acids 348 to 375), and in the forkhead domain (amino acids 470 to 579). The zinc-finger domain is 100% identical, and the leucine zipper and forkhead domains 96% identical, to their mouse and human orthologs. In addition, zebrafish FoxP2 maintains a conserved arginine at position 519, which lies in the third alpha-helix of the forkhead domain. This arginine is invariant in all forkhead proteins, but is mutated in the KE family with language difficulties.
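Per-domain identity values such as those above depend on mapping residue numbers in the ungapped zebrafish protein onto columns of a gapped alignment. The following is a minimal sketch of that bookkeeping. The domain coordinates are those given in the text; the function names and the toy alignment are hypothetical, and this is not the software used for the published analysis.

```python
# Zebrafish FoxP2 domain boundaries (1-based, inclusive) as stated in the text.
ZEBRAFISH_FOXP2_DOMAINS = {
    "zinc finger": (311, 335),
    "leucine zipper": (348, 375),
    "forkhead": (470, 579),
}


def residue_to_column(aligned_ref):
    """Map 1-based residue numbers of the reference to 0-based alignment columns."""
    mapping = {}
    residue = 0
    for column, char in enumerate(aligned_ref):
        if char != "-":
            residue += 1
            mapping[residue] = column
    return mapping


def domain_identity(aligned_ref, aligned_other, start, end):
    """Percent identity over reference residues start..end (1-based, inclusive)."""
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    column_of = residue_to_column(aligned_ref)
    if start < 1 or end < start or end not in column_of:
        raise ValueError("domain coordinates fall outside the reference sequence")
    first, last = column_of[start], column_of[end]
    matches = compared = 0
    for a, b in zip(aligned_ref[first:last + 1], aligned_other[first:last + 1]):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared


# Toy alignment; with real data, loop over ZEBRAFISH_FOXP2_DOMAINS.items().
ref, other = "MA-KQLT", "MAGKELT"
print(domain_identity(ref, other, 2, 4))
```

With a real alignment of zebrafish FoxP2 against an ortholog, passing each coordinate pair from `ZEBRAFISH_FOXP2_DOMAINS` would give one identity value per domain.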
In the N-terminal region shown to mediate transcriptional repression (Shu et al., 2001; Li et al., 2004), FoxP2 is only 78% similar to its mouse and human orthologs (even though the leucine zipper and zinc-finger domains in this region are highly conserved). Human FOXP2 has a conserved Thr-to-Asn substitution at position 303, which is shared by all human lineages surveyed but by none of 29 non-human species (Enard et al., 2002; Zhang et al., 2002). Interestingly, zebrafish FoxP2 has a Thr-to-Ser substitution at this position, which is also not found in any of the other species surveyed.
foxP2 mRNA Expression
To determine where foxP2 might act during zebrafish development, we carried out whole-mount in situ hybridization at stages from 10 to 72 h postfertilization (hpf). At 10 hpf, foxP2 is diffusely expressed in much of the embryo, with stronger expression seen in the head region (Fig. 3A). By 24 hpf, expression is localized to the pallial and subpallial telencephalon (Fig. 3B). This expression in the telencephalon persists, and by 36 hpf additional areas of foxP2 expression in the diencephalon (ventral and dorsal thalamus) and hindbrain are visible (Fig. 3C,D). Axons crossing in the anterior commissure also appear to contain foxP2 mRNA (Fig. 3C,D). We have seen similar apparent axonal staining with several mRNA probes (Hutson et al., 2003; unpublished results). Finally, expression in the heart is detected at this stage (Fig. 3D).
At 48 hpf, levels of expression in the telencephalon decrease but are still clearly detectable. foxP2 is strongly expressed in the tectum and cerebellum. Expression persists in the diencephalon, and in the hindbrain there is a clear stripe of expression at the rostral edge as well as along the lateral edges (Fig. 3E,F).
At 60 hpf, expression in the diencephalon, cerebellum, and hindbrain is clearly visible, and it extends into the spinal cord (Fig. 3G,H,J,K). Expression in the telencephalon is greatly decreased. The tectum continues to express foxP2, and strong staining for foxP2 in the inner plexiform layer of the eye is first observed. foxP2 expression is seen in both the anterior and post-optic commissures (Fig. 3I).
Expression in the telencephalon, diencephalon, and tectum persists at 72 hpf. In the hindbrain, expression in the cerebellum is well defined. Spinal cord expression is undetectable (Fig. 4A–C), while the retinal inner plexiform layer continues to stain for foxP2 (Fig. 4D).
foxP2 Is Expressed in Pallial and Subpallial Telencephalon
To determine whether foxP2 is expressed in both the pallial and subpallial telencephalon, we examined whether foxP2 expression overlaps with expression of Dlx5 and Dlx6. In teleosts such as zebrafish, the dorsal telencephalon is believed to represent the pallium, and the ventral telencephalon to represent the subpallium (Rink and Wullimann, 2004). The subpallium develops to form structures of the basal forebrain (such as the basal ganglia). In mouse, Dlx5 and Dlx6 are expressed in the basal ganglia during development (Liu et al., 1997). A zebrafish transgenic reporter construct, dlx6a(156i, 156ii):gfp, expresses GFP in the ventral telencephalon of zebrafish and also in the basal ganglia of mouse when expressed as a stable transgene (Ghanem et al., 2003), mirroring endogenous Dlx5 and Dlx6 expression.
We performed whole-mount double-labeling at 32 hpf, using an in situ probe against foxP2 and an antibody directed against GFP, in the dlx6a(156i, 156ii):gfp zebrafish line. Examination of the double-labeled whole-mount embryos shows a narrow band of overlapping foxP2 and GFP expression in the ventral telencephalon (Fig. 5A). Analysis of 7-μm transverse sections through the ventral telencephalon reveals co-expression of foxP2 and GFP (Fig. 5B–D), demonstrating that foxP2 is indeed expressed in the subpallium.
FoxP2 in zebrafish is the most divergent of the known FoxP2 protein family members, most notably in the size of its glutamine repeats and its distinctive N-terminus. However, the forkhead domain, zinc-finger domain, and amino acids critical for human language development are all highly conserved. The gene structure of foxP2 is also highly conserved. Zebrafish foxP2 has 16 coding exons, as does the human ortholog (Lai et al., 2001; Bruce and Margolis, 2002). The exon sizes are also similar, although the precise acceptor and donor splice sites are not conserved between human and zebrafish.
The divergence of zebrafish foxP2 (85% similar to the human ortholog) is striking when taken in the context of the high conservation of foxP2 in such disparate species as mouse and songbird, which maintain almost 99% amino acid similarity to each other and to the human ortholog. However, the degree of foxP2 divergence is comparable to that of other zebrafish orthologs of human genes. For example, zebrafish, chicken, and mouse slit2 are 80.5%, 91.5%, and 96.5% similar, respectively, to their human ortholog. This suggests that foxP2 may have been under strong selection pressure in tetrapods, but not in fish.
The high amino acid conservation of FOXP2 in mammals and birds, together with evidence of a recent selective sweep in human evolutionary history (based on the pattern of nucleotide polymorphisms conserved in different human lineages), argues that the CNS role of FOXP2 is highly conserved (Enard et al., 2002; Zhang et al., 2002). However, the amino acid changes in humans and in other vocal-learning species (songbirds, whales, dolphins, and bats) are not identical to one another, but rather are highly conserved within each vocal-learning species (Enard et al., 2002; Webb and Zhang, 2005). This suggests that FOXP2's role is related to the creation of neuroanatomic substrates that in some species are co-opted for vocal learning (Lai et al., 2003).
Expression patterns of FoxP2 are similar in humans, mice, and songbirds, and, as we have shown, in zebrafish. Zebrafish foxP2 is expressed in a temporally and spatially dynamic pattern, which includes the telencephalon, diencephalon, cerebellum, hindbrain, spinal cord, and retinal ganglion cells. This is similar to mouse Foxp2, which is expressed in the cerebral cortex, thalamus, cerebellum, and spinal cord interneurons (Ferland et al., 2003; Lai et al., 2003). This expression pattern is not obviously similar to that of other known genes. Zebrafish foxP2 is also expressed in the heart (Fig. 3D), but we have not detected embryonic expression in the spleen, kidney, or gut, unlike its mouse and songbird orthologs (Shu et al., 2001; Haesler et al., 2004).
Further, we have shown that zebrafish foxP2 is expressed in the subpallium, which will develop into the striatum (the subcortical nuclei of the telencephalon). In humans, these FOXP2-expressing subcortical regions include the caudate nucleus and putamen (Lai et al., 2003; Teramitsu et al., 2004), which in the KE family show abnormalities when imaged using MRI or functional MRI (Watkins et al., 2002; Liégeois et al., 2003).
The function of Foxp2, and downstream targets of Foxp2, are not known. Foxp2 can act as a transcriptional repressor for lung epithelium-specific promoters (Shu et al., 2001; Li et al., 2004), but whether Foxp2 operates as a transcriptional repressor in other tissues is not known. This repressor activity is localized in amino acid residues 260–500 of the mouse protein, which encompasses the zinc-finger domain, but not the forkhead domain. However, the zinc-finger domain itself is not critical for repressor activity (Li et al., 2004).
Foxp2 is able to homodimerize, as well as to heterodimerize with two other FOXP family members, Foxp1 and Foxp4 (Li et al., 2004). Dimerization is mediated by a leucine zipper motif (Wang et al., 2003; Li et al., 2004) that is highly conserved across species, including zebrafish. Foxp1 has been hypothesized to cooperate with Foxp2 in neural development, as Foxp1 is expressed in overlapping regions in the CNS and shows sexually dimorphic expression in the developing zebra finch song circuit (Ferland et al., 2003; Takahashi et al., 2003; Haesler et al., 2004; Teramitsu et al., 2004).
The conserved expression patterns of the mammalian, avian, and teleost orthologs raise the question of what unique effect human FOXP2 has in language development. Further, the high sequence homology amongst vertebrates argues that foxP2 controls similar genetic pathways, and likely has similar roles in neural development. Determining the role of foxP2 in the CNS will require studies of cell fate determination, axon pathfinding, and synaptogenesis, for which the zebrafish may be a useful model, especially given the rapid loss-of-function experiments possible using antisense morpholino oligonucleotides (Nasevicius and Ekker, 2000). The zebrafish system may also allow identification of upstream and downstream genetic partners of foxP2.
Zebrafish were maintained and bred under standard conditions. Embryos were raised in phenylthiourea to inhibit pigment formation and staged at 28.5°C according to Kimmel et al. (1995). cDNAs corresponding to potential foxP2 ESTs were obtained from the Resource Center of the Human Genome Project (RZPD, Berlin, Germany). Sequencing was performed at the University of Utah Sequencing Core Facility. The final sequence was determined from agreement of sequence reads in both directions over the entire coding region. Mapping of foxP2 was performed in triplicate on the T51 radiation hybrid panel (Geisler et al., 1999) using primers Prj1 (5′-TCCTTGACGTGAATGGGAGGC-3′) and Prj2 (5′-GATCAAAAGAGGCCAGTGGGC-3′), which yield a 400-bp product. Protein sequence analysis was performed using software at the San Diego Supercomputer Center Biology Workbench. Protein alignment was performed with CLUSTALW (Corpet, 1988), version 3.2, and unrooted tree analysis with PHYLIP, version 3.2. Species included in our analysis were Gorilla gorilla (gorilla), Homo sapiens (human), Macaca mulatta (Rhesus monkey), Mus musculus (mouse), Pan paniscus (pygmy chimp), Pan troglodytes (chimpanzee), Taeniopygia guttata (zebra finch), and Xenopus laevis (frog). Whole-mount in situ hybridization, photography, and image processing were performed as previously described (Lee et al., 2001).
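As a quick sanity check on the mapping primers listed above, basic properties of Prj1 and Prj2 can be computed directly from their sequences. This sketch is illustrative rather than part of the original procedure: the helper names are hypothetical, and the Wallace rule (2°C per A/T plus 4°C per G/C) is a rough approximation intended for short oligonucleotides, so the melting temperatures should be read only as a check that the two primers are balanced.

```python
# Primer sequences (5' to 3') as given in the text.
PRIMERS = {
    "Prj1": "TCCTTGACGTGAATGGGAGGC",
    "Prj2": "GATCAAAAGAGGCCAGTGGGC",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature (°C) by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement, i.e. the opposite strand read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt",
          f"GC {gc_percent(seq):.1f}%",
          f"Tm ~{wallace_tm(seq)} °C",
          "revcomp", reverse_complement(seq))
```

Both primers are 21 nt long with 12 G/C bases each, so this rule of thumb assigns them the same approximate melting temperature.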
The dlx6a(156i, 156ii):gfp line was obtained from M. Ekker (Ghanem et al., 2003), and expresses GFP in the domains of dlx expression in the telencephalon and diencephalon. Whole-mount double-labeling was performed by sequential staining of dlx6a(156i, 156ii):gfp embryos with foxP2 antisense digoxigenin-labeled probe detected using BM Purple (Roche), followed by anti-GFP antibody staining detected using a fluorescent Alexa-488 tyramide reaction (Molecular Probes, Eugene, OR). After dehydration through an ethanol series, double-labeled whole embryos were embedded in methacrylate and allowed to polymerize overnight. Serial 7-μm plastic sections were cut using a microtome, floated on a water bath, collected on slides, and cover-slipped. To combine brightfield and GFP images for Figure 5D, the Image Calculations Subtract function in Adobe Photoshop was used to subtract the GFP image from the blue and green channels of the brightfield image.
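The channel arithmetic used to combine the brightfield and GFP images can be sketched in a few lines. This is an approximation of the operation described, not a reproduction of Photoshop's implementation: images are represented as plain nested lists, the function name is hypothetical, results are simply clipped at zero, and any scale or offset settings of the Subtract function are ignored.

```python
def subtract_gfp_from_brightfield(brightfield, gfp):
    """Subtract a GFP intensity image from the green and blue channels.

    brightfield: rows of (red, green, blue) tuples with values 0-255.
    gfp: rows of single intensities 0-255, with the same dimensions.
    The red channel is left unchanged; green and blue are clipped at 0.
    """
    if len(brightfield) != len(gfp):
        raise ValueError("images must have the same number of rows")
    combined = []
    for bf_row, gfp_row in zip(brightfield, gfp):
        if len(bf_row) != len(gfp_row):
            raise ValueError("images must have the same row width")
        out_row = []
        for (red, green, blue), signal in zip(bf_row, gfp_row):
            out_row.append((red, max(0, green - signal), max(0, blue - signal)))
        combined.append(out_row)
    return combined


# Toy 1 x 2 image: a bright background pixel without GFP, and a stained pixel with GFP.
brightfield = [[(255, 255, 255), (100, 50, 200)]]
gfp = [[0, 120]]
print(subtract_gfp_from_brightfield(brightfield, gfp))
```

Because only the green and blue channels are reduced, pixels with GFP signal shift toward red in the combined image, while pixels without GFP keep their brightfield color.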
We thank A. Kugath for assistance with radiation hybrid mapping, A. Suli for assistance with microscopy and immunohistochemistry, R.F. Doolittle for assistance in the phylogeny analysis, J. Rosenthal for assistance in sectioning, M. Ekker for generously sharing the dlx6a(156i, 156ii):gfp line, and other members of the Chien lab for their assistance in preparing this work. This work was supported by a Primary Children's Medical Center Foundation grant to J.L.B., a Primary Children's Medical Center Foundation Scholar Grant to J.L.B., and NIH grant R01 EY12873-01 to C.B.C.