Syntenic Analysis of the Zebrafish tbx22 Gene
To identify the zebrafish tbx22 gene, homology was first established between the human TBX22 mRNA and Fugu tbx22 genomic nucleotide sequences. The Fugu tbx22 sequence was then used as BLAST bait to screen the Sanger Institute Danio rerio (Dr) database using Ensembl software. A putative zebrafish tbx22 ortholog was initially identified within contig ctg10123.1 of the Zv4 assembly, and a zebrafish tbx22 partial transcript was identified as ENSDART00000020757, found at Scaffold Zv5_NA5339: 72.86k. Limited synteny between the human TBX22 and zebrafish tbx22 loci, consisting of an adjacent FAM46 group transcript, ENSDARESTT00000032536, led us to clone the zebrafish tbx22 gene from this site.
Cloning, Sequencing, and In Vitro Translation of Full-Length Zebrafish tbx22-1 and tbx22-2 Splice Variants
Primers chosen from the bioinformatically identified partial tbx22 coding sequence were used in 5′ and 3′ RACE reactions to generate a series of overlapping PCR products (Fig. 1A, PCR fragments a–c), which were subsequently used to generate full-length cDNAs (Fig. 1A, PCR fragments d-1 and d-2). Nucleotide sequence analysis of 5′ RACE products revealed two alternatively spliced zebrafish tbx22 transcripts, which were named tbx22-1 and tbx22-2 (Fig. 1B). The 1,856-bp full-length tbx22-1 cDNA spans 8 exons of the zebrafish tbx22 locus. The initiation codon prediction program ATGpr (Salamov et al., 1998) identified a start codon located at nucleotide 174 of the tbx22-1 isoform clone with a reliability score of 0.47, which predicts a Tbx22-1 protein of 444 amino acids. An additional in-frame start codon located 54 nucleotides downstream of the first putative ATG start codon, and predicting a protein of 426 amino acids, had a much lower ATGpr reliability score (0.14), indicating that the first start codon is most likely the correct start site. The 1,969-bp full-length tbx22-2 cDNA spans 7 exons, and its start codon has an ATGpr reliability score of 0.43. The tbx22-2 transcript retains an intron between exons 1-1 and 1-2 that is spliced out of tbx22-1 (Fig. 1A). Zebrafish tbx22-2 encodes a predicted protein of 400 amino acids. Note that the tbx22-2 start codon is spliced out of the tbx22-1 transcript, along with an additional upstream, in-frame stop codon. In tbx22-2, the upstream ATG start codon for tbx22-1 is eliminated as a possible start codon by an in-frame stop codon located 147 nucleotides downstream. Nucleotide sequence comparison of tbx22-1 and tbx22-2 reveals the alternative splicing and ATG start codons, with otherwise identical nucleotide sequence (Fig. 2A).
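The relationship between the two candidate start codons and the predicted protein lengths can be checked with simple codon arithmetic. The following is a minimal sketch, not the ATGpr method: the function name and the toy sequence are hypothetical, and only the numbers 54, 444, and 426 come from the text above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orf_length_aa(cdna, start_1based):
    """Count codons from an ATG at a 1-based position up to,
    but not including, the first in-frame stop codon."""
    i = start_1based - 1
    if cdna[i:i + 3].upper() != "ATG":
        raise ValueError("no ATG at the given position")
    n = 0
    while i + 3 <= len(cdna):
        if cdna[i:i + 3].upper() in STOP_CODONS:
            return n
        n += 1
        i += 3
    return n  # reached the end of the sequence without a stop codon

# Toy sequence (hypothetical, not tbx22): two in-frame ATGs, 6 nt apart.
toy = "ccATGAAAATGGGGTAAcc"
first = orf_length_aa(toy, 3)
second = orf_length_aa(toy, 9)

# Arithmetic reported in the text: an in-frame ATG 54 nt downstream
# removes 54 / 3 = 18 codons from a 444-residue protein.
codons_lost = 54 // 3
shorter_protein = 444 - codons_lost
```

In the toy case the downstream ATG shortens the reading frame by two codons; with the reported values the same reasoning gives 444 - 18 = 426 amino acids.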
Figure 1. Alternatively spliced zebrafish tbx22 transcripts, tbx22-1 and tbx22-2. A: Intron and exon maps of identified overlapping zebrafish tbx22 RT-PCR products are shown. The tbx22 RT-PCR product “a” spans exons 1–4, fragment “b” spans exons 4–8, and fragment “c” spans exons 7–8. RT-PCR products “d-1” and “d-2” represent the two full-length clones, tbx22-1 and tbx22-2, respectively, generated from 5′ and 3′ RACE products. Internal primers for fragment a were used in developmental RT-PCR analyses presented in Figure 5. B: The transcript map depicts the two alternatively spliced zebrafish tbx22 transcripts, tbx22-1 and tbx22-2. Exons 1–8 are illustrated by boxes, open reading frames are indicated by arrows, and T-box domains are indicated in black.
Figure 2. Detailed genomic organization of zebrafish tbx22-1 and tbx22-2. A: Nucleotide sequence comparison of the first three exons of the tbx22-1 cDNA and the first two exons of the tbx22-2 cDNA distinguishes the alternatively spliced isoforms. A Clustal W 1.83 alignment of the zebrafish tbx22-1 and tbx22-2 nucleotide sequences is shown. Coding sequence is shown in uppercase, and noncoding sequence is shown in lowercase font. Asterisks indicate nucleotide identity, and dashes indicate gaps in the alignment. B: Multisequence alignments of the predicted amino acid sequences for zebrafish tbx22-1 and tbx22-2, as compared to human and mouse Tbx22 proteins, are shown in nexus format. The T-box domain (as defined in Muller and Herrmann, 1997) is indicated in bold type. Functionally conserved amino acids within the T-box domain are indicated as follows: amino acids involved in Tbx22 dimerization are underlined, and amino acids involved in DNA binding are italicized. Dots represent identity, and dashes represent gaps in the alignment. Human TBX22, ENSP00000362393; Mouse Tbx22, ENSMUSP00000033593.
Figure 5. Developmental RT-PCR analysis of tbx22-1 and tbx22-2 mRNAs. A: The developmental expression of zebrafish tbx22-1 and tbx22-2 RT-PCR products, 402 and 515 bp, respectively, size fractionated in a non-denaturing TAE agarose gel. B: tbx22-1 and tbx22-2 RT-PCR products size fractionated in a denaturing, alkaline agarose gel. The non-denatured, heteroduplex, high molecular weight product present in A is eliminated in the alkaline denaturing gel. C: β-actin control RT-PCR products.
To determine which zebrafish tbx22 isoform gene product represents the more evolutionarily conserved canonical Tbx22, both isoform gene products were compared to the human TBX22 sequence by BLAST at NCBI, and found to have 68% amino acid identity (156/228) and 83% amino acid similarity (190/228) to human TBX22 (Fig. 2B). However, a multi-sequence alignment was found to be more informative. The predicted amino acid sequences of zebrafish tbx22-1 and tbx22-2 cDNAs were compared to human and mouse Tbx22 amino acid sequences (Fig. 2B). This comparison shows that zebrafish Tbx22-1 and Tbx22-2 differ only at the amino-terminal portion of the protein, and that zebrafish Tbx22-1 most closely resembles human and mouse Tbx22, as it aligns with a stretch of conserved positively and negatively charged amino acids just upstream of the T-box, which are missing from zebrafish Tbx22-2. Curiously, both human and mouse Tbx22 have a conserved Groucho repression motif, “FSVEAL,” located just upstream of the conserved hydrophilic region, which is not present in zebrafish Tbx22-1. An earlier report identified this motif in the closely related human TBX15, -18, -22, -20, -2, and -3 paralog groups (Copley, 2005), suggesting that the Groucho repression domain was present in the common ancestors of Tbx22 paralogs, but was subsequently lost within the zebrafish lineage.
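The identity and similarity percentages quoted above follow directly from the match counts over the aligned length. A minimal sketch of that calculation is given here; the helper names and the toy aligned pair are hypothetical, and the column-counting rule (gap columns count toward the aligned length) is an assumption modeled on how BLAST reports identities.

```python
def percent(matches, aligned_length):
    """Percentage rounded to the nearest whole number."""
    return round(100 * matches / aligned_length)

def identity_counts(a, b):
    """Return (identical columns, aligned length) for two equal-length
    aligned sequences; gap columns count toward the length."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(x == y and x != "-" for x, y in zip(a, b))
    return identical, len(a)

# Values reported in the text for the BLAST comparison to human TBX22.
reported_identity = percent(156, 228)
reported_similarity = percent(190, 228)

# Toy aligned pair (hypothetical, not Tbx22 sequence).
toy_identical, toy_length = identity_counts("MKT-LV", "MRTALV")
```

Similarity would be counted the same way, with conservative substitutions scored as matches according to the substitution matrix in use.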
Coupled in vitro transcription and translation was used to confirm the predicted translational start sites for tbx22-1 and tbx22-2. Full-length cDNAs for both splice variants were cloned into expression vectors designed for high protein yield in coupled transcription and translation reactions. Biotin-conjugated lysine tRNAs were used to label the 24 and 21 lysine residues predicted to be present in Tbx22-1 and Tbx22-2, respectively. Two amounts of plasmid template (2.5 and 7.5 μg) and two loading volumes (1 and 2 μl) were used to reveal all translation products using gel electrophoresis (Fig. 3). The predicted molecular weights for the two possible translation start sites for Tbx22-1 are 50.3 and 48.3 kD, respectively, and that of Tbx22-2 is 45.2 kD. The biotinylated translation products are expected to migrate at slightly higher molecular weights than non-biotinylated products. Translation products corresponding to the biotinylated 50.3-kD Tbx22-1 and 45.2-kD Tbx22-2 were observed. A putative 48.3-kD Tbx22-1 translation product was not observed, nor was a small 5–6-kD Tbx22-2 translation product, which could have been generated from a tbx22-2 transcript using the same start codon as Tbx22-1. These results confirmed the location of the ATG start codon at bp 174 of the tbx22-1 cDNA, as predicted by the strong Kozak consensus sequence (Kozak, 1996).
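Predicted molecular weights such as those above are typically obtained by summing average residue masses and adding one water. The sketch below illustrates that calculation under stated assumptions: the mass table uses standard average residue masses, the function names and toy peptide are hypothetical, and the tool actually used for the reported values is not specified in the text.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free termini

def protein_mass_kd(seq):
    """Predicted mass in kD of an unmodified polypeptide."""
    return (sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER) / 1000.0

def lysine_count(seq):
    """Number of lysines, i.e. potential biotin-lysine labeling sites."""
    return seq.upper().count("K")

# Toy peptide (hypothetical, not Tbx22 sequence).
toy = "MKTAYK"
toy_mass = protein_mass_kd(toy)
toy_lysines = lysine_count(toy)

# Consistency check on the reported values: the two Tbx22-1 start sites
# differ by 18 residues and 2.0 kD, about 111 Da per residue.
per_residue_da = (50.3 - 48.3) * 1000 / 18
```

The roughly 111 Da per residue implied by the reported values is close to the usual average residue mass of about 110 Da, which supports the internal consistency of the 50.3- and 48.3-kD predictions.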
Figure 3. In vitro transcription and translation of tbx22 splice variants. Western blot analysis revealed distinct biotinylated Tbx22-1 and Tbx22-2 in vitro translation products corresponding to the predicted 50.3- and 45.2-kD Tbx22 isoforms, respectively. No putative 48.3-kD Tbx22-1 translation product was observed, consistent with the location of the Tbx22-1 start codon at bp 174. Plasmid concentrations used in the in vitro transcription reactions, and the amount of translation product loaded in each well, are as indicated. A no-template negative control is also shown. M, Molecular weight protein standard markers.
Phylogenetic Analysis of the Zebrafish tbx22 Gene
The predicted amino acid sequences of the zebrafish tbx22-1 and tbx22-2 cDNAs were used to confirm the identity of the zebrafish tbx22 transcripts, and to establish that no additional tbx22 paralogs were present in the Zv7, August 2007 zebrafish genomic assembly. Closely related human T-box-containing proteins with known craniofacial expression domains, and several other T-box proteins representing other paralogous groups, were used in phylogenetic analyses to help assign zebrafish T-box genes to the previously identified eight T-box paralog subfamily groups (Takatori et al., 2004). The abbreviated gene and protein sequence identifiers used in these comparisons are listed in FASTA format in Supplemental Figure 1.
Phylogenetic analyses of the predicted amino acid sequences of the 19 identified zebrafish T-box genes (Supp. Fig. S1) used only the 125 highly conserved amino acid positions previously used to define the major vertebrate T-box paralog groups (Ruvinsky et al., 2000). MEGA 3.1 software was used to generate the Maximum Parsimony (MP) phylogenetic tree shown below (Fig. 4). Neighbor-Joining (NJ) analyses of these same sequences produced a nearly identical tree topology (data not shown). The MP tree illustrates the strong separation of the Tbx15/18/22 T-box subfamily from the other paralogs (bootstrap value = 88). In addition, the separation of the zebrafish tbx22 sequence from the Tbx15 and Tbx18 paralogs is well supported (bootstrap value = 82). These analyses helped us to assign previously ill-defined zebrafish T-box gene sequences to their appropriate orthologous families. In conclusion, our phylogenetic analyses placed the predicted amino acid sequence for the newly identified zebrafish tbx22 into closest relationship to the human TBX22 sequence, with high bootstrap values ranging from 82 to 88. No other Tbx22-coding ESTs were identified in an extensive search of the ENSEMBL and NCBI databases. For this search, BLAST was used to identify all putative T-box loci within the zebrafish genome, the predicted amino acid sequence was determined for each, and global phylogenetic analysis was used to assign each to its appropriate orthologous group. In this way, our phylogenetic analyses indicated that the bioinformatically identified and cloned zebrafish tbx22 cDNAs likely represent the only zebrafish ortholog of the vertebrate Tbx22.
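Bootstrap values such as those reported above come from resampling alignment columns with replacement and asking how often a grouping recurs. The sketch below illustrates only the column-resampling step. It is not the Maximum Parsimony analysis performed in MEGA: as a simplifying assumption it replaces tree building with a nearest-sequence test based on p-distance, and the function names, sequence labels, and toy sequences are all hypothetical.

```python
import random

def p_distance(a, b):
    """Fraction of compared columns that differ; columns with a gap
    in either sequence are skipped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)

def bootstrap_nearest(alignment, query, candidate, replicates=1000, seed=0):
    """Percent of column-bootstrap replicates in which `candidate`
    is the sequence closest to `query`."""
    rng = random.Random(seed)
    others = [name for name in alignment if name != query]
    length = len(alignment[query])
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(s[c] for c in cols) for n, s in alignment.items()}
        nearest = min(others, key=lambda n: p_distance(resampled[query], resampled[n]))
        hits += nearest == candidate
    return 100 * hits / replicates

# Toy alignment (hypothetical residues, not real T-box sequences).
toy_alignment = {
    "query": "ACDEFGHIKL",
    "close": "ACDEFGHIKL",  # identical to the query at every column
    "far":   "LKIHGFEDCA",  # differs from the query at every column
}
support = bootstrap_nearest(toy_alignment, "query", "close", replicates=200)
```

In a real analysis each replicate alignment would be used to build a full tree, and the bootstrap value for a clade would be the percentage of replicate trees that contain it.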
Figure 4. Maximum parsimony tree analysis of zebrafish Tbx22 within the eight major paralog subfamilies of T-box proteins. Paralogous zebrafish T-box subfamily members found in the zebrafish ENSEMBL Zv7 assembly, release 46, were analyzed along with representative members of each paralog subfamily (see Supp. Fig. S1). Human TBX15, -18, -22, -1, -10, and -20 were included to provide more robust paralog family definitions. The maximum parsimony (MP) tree was constructed from the multi-sequence alignment of the 125 most conserved amino acids within the protein sequence (see Supp. Fig. S1). These data are presented as a condensed, topology-only maximum parsimony tree, rooted on the Ciona Brachyury T paralog group. Only bootstrap values above 50% are shown. The position of the zebrafish tbx22 is highlighted by a black oval.
Developmental Expression of Zebrafish tbx22 Isoform mRNAs
Based on the alternative splice sites for tbx22-1 and tbx22-2, PCR primers were designed to generate isoform-specific, 402-bp tbx22-1 and 515-bp tbx22-2 RT-PCR products. The developmental expression of tbx22 isoform mRNAs was examined by RT-PCR, using total RNA isolated from staged zebrafish at 1, 6, 19, 38, and 48 hr post-fertilization (hpf), 5 and 44 days post-fertilization (dpf), and 6-month-old adult zebrafish. Unexpectedly, three distinct PCR products of approximately 545, 515, and 402 bp were consistently generated (Fig. 5A). To confirm the identities of the PCR products, each band was isolated, cloned, and sequenced. As anticipated, the 402- and 515-bp PCR products were found to be specific for the tbx22-1 and tbx22-2 isoform sequences, respectively. Nucleotide sequence analysis of 46 clones generated from the 545-bp PCR product revealed 20 tbx22-1 clones and 26 tbx22-2 clones, suggesting that the larger-sized RT-PCR product was in fact an incompletely denatured heteroduplex of the two zebrafish tbx22 isoform products. Analysis of the RT-PCR products in denaturing gels revealed only the 515- and 402-bp products (Fig. 5B), consistent with this interpretation. Developmental RT-PCR analysis revealed that tbx22-1 was expressed at all developmental stages examined, from maternal to adult, while tbx22-2 was not maternally expressed, but exhibited zygotic expression at all stages examined. The expression of zebrafish β-actin was used as an internal control (Fig. 5C).
WISH analyses were performed using the full-length tbx22 probe to characterize tbx22 expression in developing branchial arch and craniofacial structures. tbx22 was first detected in 28-hpf embryos, where discrete expression was observed in segmental paraxial mesodermal tissues (Fig. 6A, arrow), in a manner suggestive of pre-patterning of the vertebrae. This expression was quite transient; no tbx22 expression was detected in 26- or 30-hpf embryos. Distinct, bilateral tbx22 craniofacial expression domains were first detectable at 38 hpf, coincident with the onset of pectoral fin bud growth (Fig. 6B,C, arrows). Analyses of whole mount and sectioned 40-hpf embryos revealed bilateral tbx22 expression domains (Fig. 6D, E–G, respectively), localized to mesenchymal cells underlying the oral epithelial bilayer. These tissues develop into bifurcated upper and lower elements prior to the formation of the mouth opening at 42 hpf (Fig. 6H,I). Medial to lateral serial sections of 42-hpf WISH-stained specimens revealed discrete tbx22 expression at each mouth corner (Fig. 6J–N). tbx22 was also detected in the jaw joint of 51- and 55-hpf embryos, in a pattern similar to that of bapx1 (Miller et al., 2003) (Fig. 6O,P, arrows), consistent with roles for tbx22 in primitive jaw joint formation.
The expression patterns of zebrafish tbx22 mirror some aspects of Tbx22 expression in the mouse (Bush et al., 2002; Herr et al., 2003), where Tbx22 mRNAs were found to be expressed in mesenchymal tissue underlying the inferior nasal septum epithelium, which eventually fuses with the right and left palatal shelves to form the secondary palate. Mouse Tbx22 mRNAs were also detected in caudal regions of the palatal shelves prior to fusion, and in basal anterior tongue mesenchyme (Bush et al., 2002; Herr et al., 2003). The expression pattern of zebrafish tbx22 is notable for its association with the stomodeum, the depression marking the future location of the mouth opening, and with the developing jaw joint. Expression of Tbx22 in the developing mouse joint has not been reported. It is interesting that tbx18, a very close paralog of tbx22, is also expressed around the zebrafish mouth at 40 hpf, and in the palate of 3-dpf embryos (Begemann et al., 2002). In Xenopus, stomodeal ectoderm and endodermal layers progressively thin via apoptosis to form the primary mouth opening (Dickinson and Sive, 2006). Although the zebrafish mouth opening is thought to form by a different mechanism, where changes in intercellular junctions and cell orientations occur prior to rupture of the oral membrane (Hamlett et al., 1996; Waterman and Kao, 1982), tbx22 expression in the relatively simple zebrafish mouth formation correlates well with that of the comparatively complex palatal shelf development in the mouse and human.