Forkhead genes encode transcriptional regulators that share a highly conserved, 110 amino acid DNA-binding domain (Weigel and Jackle, 1990). Over 100 different forkhead genes have been identified among numerous eukaryotic species, and expression and functional analyses have implicated these genes in a wide range of biological functions (Kaufmann and Knochel, 1996; Carlsson and Mahlapuu, 2002). We have previously identified zebrafish foxi1 and demonstrated that this forkhead gene is expressed in otic placode precursor cells and pharyngeal pouch endoderm (Solomon et al., 2003). Embryos homozygous for hearsay (hsy), a frameshift mutation in the foxi1 gene, display developmental defects in both of these structures. Intriguingly, hsy mutants display considerable variability in severity of the otic defect, even though this mutation appears to create a null allele. FoxI class genes have been reported for Xenopus (Lef et al., 1994, 1996; Pohl et al., 2002), humans (Pierrou et al., 1994; Larsson et al., 1995), rat (Clevidence et al., 1993), and mouse (Hulander et al., 1998, 2003). Among these, Xenopus FoxI1c appeared most similar to zebrafish foxi1 by both phylogenetic and expression analyses (Pohl et al., 2002; Solomon et al., 2003), whereas less similarity was observed for the other Xenopus and mammalian paralogs. Human and rat FOXI1 transcripts have only been detected in tissue isolated from kidney (Clevidence et al., 1993). In situ hybridization experiments have demonstrated expression of mouse FoxI1 in the developing otic vesicle, in addition to the kidney (Hulander et al., 1998). In contrast to zebrafish foxi1, the mouse homolog is not expressed in preplacodal tissue, and the initial induction and formation of the otic placode occurs normally in mouse embryos homozygous for a targeted disruption of this gene, although later patterning defects are observed.
Compared with other vertebrates, several gene families (hox, dlx, msx, eng) have additional members in the zebrafish genome, consistent with the hypothesis that ray-finned fish have undergone an additional round of genome duplication after diverging from the tetrapod lineage (Postlethwait et al., 1998). The variability observed for the hsy phenotype, together with the low levels of similarity between the mammalian and zebrafish FoxI genes, suggests that additional, unidentified FoxI family members exist for zebrafish. If so, a second foxi1-like paralog with a partially redundant function may be able to compensate for a loss of foxi1 function, and this paralog could explain the variable otic placode phenotype observed in hsy mutants. Furthermore, additional zebrafish FoxI genes may be more closely related to the previously reported mammalian family members. To address these issues, we have performed database searches for FoxI class genes in zebrafish and other available chordate genomes. Here, we report the identification and expression analysis of three additional zebrafish FoxI genes, foxi2, foxi3a, and foxi3b, and the determination of their phylogenetic relationships with other chordate FoxI homologs.
RESULTS AND DISCUSSION
Identification of Zebrafish foxi2, foxi3a, and foxi3b Genes
We used the zebrafish Foxi1 protein sequence, excluding the forkhead domain, as a query in a BLAST search of the Danio rerio translated trace repository on the Ensembl Trace Server Web site (http://trace.ensembl.org/). Nucleotide sequences of hits containing identity for at least five amino acids were retrieved and assembled into contigs using LASERGENE Navigator sequence alignment software. Although most sequences failed to fall into alignments, four separate contigs containing multiple sequences were identified. One of these contigs defined the foxi1 sequence. Consensus sequences were derived for the other three contigs and were used as translated queries in BLAST searches of protein databases on the NCBI Web site (http://www.ncbi.nlm.nih.gov/BLAST/). All three sequences showed the closest similarity to other FoxI class proteins, as was the case for the original analysis performed with foxi1 (Solomon et al., 2003). Potential complete coding sequences were derived for these contigs through searches of the partially assembled Danio rerio Ensembl database and comparisons with other FoxI coding sequences that showed high similarity by BLAST search, and we have named these sequences Foxi2, Foxi3a, and Foxi3b.
Expression of Zebrafish foxi2
foxi2 is expressed faintly in the notochord at the three-somite stage (3s), and this expression extends to a circular region corresponding to the position of the tailbud (Fig. 1A,B). By 18s, strong expression is detected in the pharyngeal arch region (Fig. 1C,D) and the anterior portion of the optic primordium (Fig. 1C,E), and expression is maintained in both of these domains at 2 days postfertilization (2d; Fig. 1F–H). Expression in the eye at 2d is detected ventrally to the lens and extends along the choroid fissure (Fig. 1F,G), and it is also detected more faintly in the ventral cells of the retina surrounding the choroid fissure (Fig. 1G). However, foxi2 is not expressed in the optic nerve or stalk. Four distinct stripes of expression are detected in the pharyngeal arch domain at this stage, potentially corresponding to either the cartilaginous precursor cells or the adjacent pharyngeal arch endoderm (Fig. 1F–H). Higher magnification reveals that at least some cartilaginous cells are included within this expression domain (Fig. 1H). Also at 2d, foxi2 is expressed bilaterally as two ventral, circular domains immediately anterior to the first somite, and in the region surrounding the mouth, including strong bilateral expression in cartilage adjacent to the ventro-lateral boundary of this structure (Fig. 1G). The identity of the tissue expressing the two ovoid domains remains unclear at this time. At 4d, foxi2 is expressed ventrally to the lens and bilaterally in a single stripe through a noncartilaginous region of the gill arches (Fig. 1I,J).
Expression of Zebrafish foxi3a and foxi3b
foxi3a and foxi3b display nearly identical expression patterns, in accordance with our proposal that these FoxI homologs represent a recently duplicated gene pair. At 3s, both genes are expressed in a punctate pattern over a large portion of the yolk sac (Fig. 2A,B), and this pattern is detectable as early as 90% epiboly for foxi3a, but not foxi3b (data not shown). A similar pattern is observed at 18s, although in some cases the expression extends from the yolk sac to include the lateral portions of the trunk and tail (Fig. 2C,D). This punctate pattern is very similar to the mucous cell expression reported for several Na/K ATPase subunit genes (Blasiole et al., 2002; Canfield et al., 2002) and the parvalbumin genes pvalb3a and pvalb3b (Hsiao et al., 2002). Based on this comparison, it is likely that foxi3a and foxi3b are expressed within the mucous cell population of the epidermis, and this is the first report of expression within this cell type during late gastrulation/early somitogenesis in zebrafish. At 2d, expression is detected in the regions surrounding the pharyngeal arches and the posterior border of the eye and extends posteriorly along the trunk/yolk contact site, ending near the anterior-most portion of the hind-yolk (Fig. 2E,F). At 4d, extensive expression is detected in the gill epithelium for both genes (Fig. 2G,H). foxi3a expression also extends along the yolk sac/trunk border, similar to the pattern observed at 2d, whereas foxi3b expression is weaker in this domain.
Identification of Additional Chordate FoxI Class Homologs
To identify all available FoxI homologs in other species, we used the sequences of the zebrafish FoxI genes (excluding the forkhead domain) as queries. Multiple iterative searches were performed against protein databases and translated nucleotide databases (including both nonredundant sequences and expressed sequence tags [ESTs]) using the NCBI BLAST server. Sequences from the pufferfish, Fugu, were identified by BLAST search of the second draft Fugu genome assembly (http://fugu.hgmp.mrc.ac.uk/blast/). Sequences from the urochordates Ciona intestinalis and Ciona savignyi were obtained from the Ensembl Trace Server (http://trace.ensembl.org/). Sequences were selected as potential FoxI members based on the following criteria: (1) high amino acid similarity with more than one of the zebrafish FoxI queries or with multiple known FoxI proteins from other species, and (2) sequence similarity extending for a region outside of the conserved forkhead domain.
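The two selection criteria can be illustrated with a minimal sketch. It is not the procedure actually used for the searches: the `Hit` record, its field names, the `forkhead_domains` mapping, and the choice to represent "high similarity" simply as the presence of a hit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One similarity-search hit (hypothetical record, for illustration only)."""
    query: str       # name of the query protein, e.g. "foxi1"
    subject: str     # identifier of the candidate sequence
    subj_start: int  # first aligned residue on the subject (1-based)
    subj_end: int    # last aligned residue on the subject (inclusive)


def extends_outside(hit, domain_start, domain_end):
    """Return True if the aligned region reaches beyond the forkhead domain."""
    return hit.subj_start < domain_start or hit.subj_end > domain_end


def select_candidates(hits, forkhead_domains, min_queries=2):
    """Apply both criteria and return the sorted list of passing subjects.

    Criterion 1: the subject is hit by at least `min_queries` distinct queries.
    Criterion 2: at least one hit extends outside the subject's forkhead domain.
    `forkhead_domains` maps each subject to the (start, end) of its domain.
    """
    queries_by_subject = {}
    has_outside_hit = set()
    for hit in hits:
        queries_by_subject.setdefault(hit.subject, set()).add(hit.query)
        start, end = forkhead_domains[hit.subject]
        if extends_outside(hit, start, end):
            has_outside_hit.add(hit.subject)
    return sorted(
        subject
        for subject, queries in queries_by_subject.items()
        if len(queries) >= min_queries and subject in has_outside_hit
    )
```

In this sketch, a candidate matched by two queries only within its forkhead domain is rejected, as is a candidate matched by a single query, however long the match.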
Phylogenetic Analysis of FoxI Genes
A data set was assembled including 26 available amino acid sequences comprising the zebrafish FoxI genes, the previously reported mouse (Hulander et al., 1998), human (Pierrou et al., 1994), rat (Clevidence et al., 1993), and Xenopus homologs (Lef et al., 1994, 1996), and additional sequences derived from our database searches. Alignment of the winged-helix domains of these proteins with representatives of other Fox classes confirmed their identity as members of the FoxI class (Fig. 3).
To address the phylogenetic relationships within the FoxI genes, a multiple sequence alignment was initially constructed by using ClustalW (Thompson et al., 1994), and then carefully edited by eye using MacClade (Maddison and Maddison, 1989). The resulting sequence alignment consisted of 215 residues that could be aligned with confidence; this alignment was subjected to phylogenetic analyses using both distance and parsimony methods as available in the PAUP*, Tree-Puzzle, and PHYLIP packages (Felsenstein, 1995; Swofford, 1999; Schmidt et al., 2002). The tree shown in Figure 4 represents a summary of these analyses.
Although multiple FoxI homologs were identified in all of the vertebrates, only one gene was identified from each of the two available urochordate genomes of Ciona intestinalis and Ciona savignyi. This finding suggests that gene duplications of vertebrate FoxI genes occurred after their divergence from urochordates. This parsimonious scenario indicates that the singular Ciona foxi represents the ancestral gene copy.
Inspection of the resulting trees (e.g., Fig. 4) indicates three major subgroups of FoxI genes in vertebrates. One subgroup, indicated as A in Figure 5, includes Danio foxi1 and its apparent Fugu ortholog (which we have named foxi1) together with putative orthologs from Xenopus laevis and X. tropicalis (which we have named X. laevis and tropicalis FoxI2, due to the previously described Xenopus FoxI1 genes). Our analysis suggests that this assemblage is an orthologous group that arose at the base of vertebrate evolution. If so, the absence of this ortholog in completed mammalian genomes would indicate a gene loss in the lineage leading to mammals. However, it is also possible that the sequences of mammalian FoxI1 proteins are substantially divergent, making it impossible for us to recognize them as orthologs. The only expression data reported within this group are for zebrafish foxi1 (Solomon et al., 2003), and, as we have previously noted, this gene shares some similarity in expression with Xenopus laevis FoxI1c of subgroup B.
Subgroup B is composed of two well-supported groups: one contains only mammalian genes (FoxI2 and FoxI3) and another is composed of amphibian (FoxI1) and fish (FoxI2) genes. Our phylogenetic analysis does not explicitly support combining these two groups into one subgroup, but it does not strongly reject it either. Based on the observed phylogeny and the representation of vertebrate lineages, we propose that the mammalian, amphibian, and fish members together constitute a single orthologous subgroup. Clear similarities in expression exist between fish and amphibian members of this subgroup (Fig. 5), consistent with their evolutionary relationship. However, no expression data for the mammalian members have been reported; such data may provide additional support for this grouping.
Subgroup C has the strongest support of all the groups, and contains two zebrafish genes, foxi3a and foxi3b. These two genes are highly similar (56% amino acid identity) and appear to represent a recently duplicated gene pair. Also in this group are the previously described mouse, human, and rat orthologs (FoxI1) and additional Xenopus laevis, X. tropicalis, and Fugu members (FoxI3 genes). Of interest, this highly supported subgroup shows no conservation of expression pattern between mammals and fish (Fig. 5), although the expression patterns of several members of this subgroup, particularly the amphibians, remain uncharacterized.
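For readers unfamiliar with how a figure such as percent amino acid identity is obtained from an alignment, the following is a minimal sketch. The text does not state which denominator was used; counting only columns in which neither sequence has a gap is an assumption of this example, and the function name and the toy sequences in the usage note are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise or multiple alignment.

    Assumption: columns where either sequence has a gap are skipped, so the
    denominator is the number of ungapped columns shared by both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared
```

With the toy rows "MKT-LL" and "MRTALL", five columns are compared and four are identical, so the function returns 80.0. Other conventions (for example, dividing by the length of the shorter sequence) would give different values for the same alignment.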
We have identified three additional zebrafish FoxI class genes, foxi2, foxi3a, and foxi3b. Our phylogenetic analysis reveals that zebrafish foxi1 is more closely related to FoxI orthologs in other species than to these three zebrafish paralogs. Additionally, the expression patterns of these new genes are substantially different from that of foxi1. foxi3a and foxi3b are expressed within the mucous cells of the epidermis and appear to represent a recently duplicated gene pair. Both foxi1 and foxi2 are expressed in the eye and pharyngeal arch domain, but no otic expression was detected for foxi2, and additional expression domains for these genes are nonoverlapping. Together, these observations argue against the proposal that foxi1, together with one of these other FoxI paralogs, constitutes a recently duplicated gene pair; it is unlikely that any of these other FoxI genes share a partially redundant function in otic placode formation with foxi1. Interestingly, our phylogenetic analysis demonstrates that foxi3a and foxi3b, and not foxi1, are more closely related to, and likely orthologs of, the previously characterized mouse, human, and rat FoxI1 genes. The dissimilarity of expression between the zebrafish and mammalian FoxI genes of subgroup C, and the similarity of expression between zebrafish foxi1 (subgroup A) and Xenopus laevis FoxI1c (subgroup B), may indicate a shift of function between paralogs in different species. The zebrafish genome sequence coverage is currently at 3–4x (http://www.sanger.ac.uk/Projects/D_rerio/faqs.shtml -second), and approximately 200,000 EST sequences are available (http://zfish.wustl.edu/). However, it remains possible that additional zebrafish FoxI class members could be identified. Resolution of these hypotheses will require additional expression analysis, particularly from mammals, and continued database searches as the zebrafish genome sequence assembly becomes more complete.
Portions of foxi2, foxi3a, and foxi3b cDNA sequences were amplified from wild-type zebrafish 6s (foxi3a, foxi3b) or 26s (foxi2) cDNA libraries. A 799-bp foxi2 fragment was amplified with the primers foxi2-5′-2 (5′-cgtgcgcccgccatactcctactc-3′)/foxi2-3′-2 (5′-tggcgggtattcacgagcat-3′), a 543-bp foxi3a fragment with primers foxi3a-5′-2 (5′-ccggcgcaagaggaagagaaag-3′)/foxi3a-3′-1 (5′-aggcaccgcgatgagaactgg-3′), and a 644-bp foxi3b fragment with primers foxi3b-5′-1 (5′-ttgggggataacataaaagagaagg-3′)/foxi3b-3′-1 (5′-tgtgtcgaattgagttttgccatcc-3′). PCR products were cloned into the pCRII-TOPO vector (Invitrogen). For antisense RNA probes, foxi2 plasmid was linearized with XbaI and transcribed with SP6 RNA polymerase, foxi3a was linearized with SpeI and transcribed with T7, and foxi3b was linearized with NotI and transcribed with SP6. Whole-mount in situ hybridization was performed as described (Thisse and Thisse, 1998).
Amino acid sequence alignments are available upon request from the authors. Maximum likelihood distances were determined by using Tree-Puzzle version 5.0 (Schmidt et al., 2002) with a JTT model of substitution, amino acid frequencies estimated from the data set, and a mixed model of rate heterogeneity with one invariable and eight gamma rates. The gamma distribution parameter alpha was estimated from the data set. Neighbor-Joining distance trees were constructed by Neighbor in PHYLIP version 3.6a3 (Felsenstein, 1995). Distance bootstrap analyses were carried out with Protdist (using JTT), Neighbor, and Consense in PHYLIP version 3.6a3 (Felsenstein, 1995). Maximum parsimony bootstrap analyses were done using PAUP* version 4.0b10 (Swofford, 1999), using 10 random addition replicates per bootstrap replicate.
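The resampling step that underlies a bootstrap analysis can be sketched as follows. This is a generic illustration of nonparametric bootstrapping of alignment columns, not the PHYLIP or PAUP* implementation; the function names, the seed, and the dictionary representation of the alignment are assumptions, and the number of bootstrap replicates is a free parameter because it is not stated above.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    `alignment` maps sequence names to aligned sequences of equal length.
    The same sampled columns are applied to every sequence, so each
    pseudo-replicate has the original length and intact columns.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("alignment must contain sequences of one common length")
    n_columns = lengths.pop()
    sampled = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {
        name: "".join(seq[i] for i in sampled)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Generate `n_replicates` pseudo-alignments from one seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

Each pseudo-alignment would then be passed through the same distance or parsimony tree-building step as the original data, and the resulting trees summarized as a consensus to obtain support values for each group.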
We thank Drs. Bernard Thisse and Iain Drummond for their helpful insight in the interpretation of the expression patterns.