Much organismal complexity observed within the vertebrate lineage has been attributed to whole genome duplication events. The subsequent relaxation in evolutionary constraint upon genes retained in duplicate has enabled the acquisition of novel functions whilst preserving the ancestral gene role (Force et al., 1999). Given the lack of correlation between morphological diversity and protein coding content of the vertebrate genome, it has been proposed that such functional innovations are largely influenced by diverging transcriptional regulation of duplicate genes (Levine and Tjian, 2003). Evidence of this can be derived from studies of gene families that have arisen from whole genome duplication events. Two whole genome duplication events gave rise to the vertebrate paired box gene subfamily that consists of PAX2, PAX5, and PAX8 (Wada et al., 1998). Collectively these transcription factors play critical roles in interneuron specification, the formation of the midbrain hindbrain boundary, and in aspects of kidney, eye, ear, and thyroid organogenesis. However, despite highly conserved functional protein-coding regions, members of this subfamily have assumed divergent functional roles, both between paralogues and across animal lineages. Given their complex and overlapping temporo-spatial patterns of expression during embryonic development, to date much of the literature has necessarily focused upon individual genes and specific aspects of the collective PAX258 role, generally in association with a particular animal model. Here, we distil these data from a number of species so that we can begin to understand how diverging transcriptional regulation of one gene subfamily can impact on the evolving animal body plan.
THE ORIGINS OF PAX2, PAX5 AND PAX8 GENES
The defining feature of the PAX gene family is the presence of a highly conserved 128–amino acid N-terminal paired domain (PD), which has bi-partite DNA-binding properties (Czerny et al., 1993; Mansouri et al., 1996; Chi and Epstein, 2002). In mammals, there are nine members of the PAX family, and these are divided into subfamilies (Fig. 1) based on the presence or absence of an additional DNA-binding domain in the form of a prd-like homeodomain (HD), and an octapeptide domain (OP), which can bind co-repressors of the Groucho family as well as inhibit transactivation. PAX258 genes are characterised by both an octapeptide domain and a partial homeodomain, which still has DNA-binding properties (Eberhard and Busslinger, 1999). They also have a C-terminal domain, which has been shown by deletion analysis to be modular and to have both activating and inhibitory properties (Dorfler and Busslinger, 1996). The genes are subject to extensive alternative splicing (Kozmik et al., 1993; Ward et al., 1994; Tavassoli et al., 1997; Zwollo et al., 1997; Mackereth et al., 2005), affecting their transactivation potential (Kozmik et al., 1993), expression, and DNA-binding properties (Zwollo et al., 1997). The vertebrate PAX subfamilies have approximate homology to four of the five Drosophila pax gene subfamilies and to other invertebrate pax gene classes B, C, and D (Fig. 1). The single Drosophila sparkling (also known as shaven and D-Pax2) and the invertebrate class B pax genes are orthologous to the vertebrate PAX258 subfamily, which is derived from two whole genome duplication events (Fig. 2). The first led to two copies of the gene, one resembling PAX2/5 and the other PAX8. The second duplication event, followed by loss of one of the PAX8 duplicates, resulted in the presence of PAX2, PAX5, and PAX8 genes in all jawed vertebrates. In teleosts, a further duplication event resulted in two copies of pax2 being retained in zebrafish, pax2a and pax2b (Pfeffer et al., 1998).
The previous nomenclature of zebrafish pax2a can be somewhat confusing since it was originally referred to as pax[b] or pax[zf-b]. Pax2a and pax2b have also been known as pax2.1 and pax2.2, respectively, and this nomenclature is commonly used in other teleosts where two pax2 genes have been identified.
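The duplication history outlined in this section can be followed as simple bookkeeping. The sketch below is illustrative only: the function name, the intermediate labels (such as "PAX2/5" and "PAX8-copy"), and the treatment of pax5 and pax8 as single retained copies in zebrafish are assumptions made for the example, not established nomenclature or additional data.

```python
# Illustrative bookkeeping of the PAX258 duplication history described in the text.
# All intermediate labels and the helper name are hypothetical.

def duplicate(genes, retained):
    """Apply one duplication event to every gene in `genes`.

    `retained` maps a gene to the tuple of copies kept after the event.
    Genes not listed are assumed to keep both copies (gene, gene + "-copy").
    """
    result = []
    for gene in genes:
        result.extend(retained.get(gene, (gene, gene + "-copy")))
    return result

ancestral = ["pax258"]  # single invertebrate-type gene

# First whole genome duplication: one PAX2/5-like and one PAX8-like copy.
after_first = duplicate(ancestral, {"pax258": ("PAX2/5", "PAX8")})

# Second whole genome duplication, followed by loss of one PAX8 duplicate.
after_second = duplicate(after_first, {"PAX2/5": ("PAX2", "PAX5")})
jawed_vertebrate = [g for g in after_second if g != "PAX8-copy"]

# Teleost event: two pax2 copies retained; single pax5 and pax8 assumed here.
zebrafish = duplicate(
    jawed_vertebrate,
    {"PAX2": ("pax2a", "pax2b"), "PAX5": ("pax5",), "PAX8": ("pax8",)},
)

print(jawed_vertebrate)
print(zebrafish)
```

Representing each event as a mapping from a gene to its retained copies keeps duplication and subsequent loss in one step, which is sufficient for the gene counts discussed here.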
IN BASAL-DERIVING ANIMAL GROUPS, THE SINGLE PAX GENE MOST RESEMBLES PAX258/B
Whilst most PAX genes play a key role in neural development, their existence predates neurons in the animal lineage (Fig. 2). The most basal diverging animal groups, the Porifera and the Placozoa, lack these cells and yet both possess a single pax gene with closer homology to the paxB subfamily than to any other pax subfamily (Hoshiyama et al., 1998; Hadrys et al., 2005). Whilst this gene has paxB-like paired box and octapeptide domains, unlike the paxB subfamily, it has a full homeodomain (HD). This has also been observed in the Cnidaria (Groger et al., 2000; Miller et al., 2000; Sun et al., 2001; Kozmik et al., 2003; Matus et al., 2007), whereas all Bilaterian animal groups have only a partial HD. However, differences in the residue content of the Hydrozoan Pax-B proteins are likely to affect the DNA binding ability of the HD at its carboxyl end, possibly resulting in partial function (Miller et al., 2000).
Expression of the pax2/5/8/B subfamily in neural cells appears universal across all animal groups that possess this cell type (Fig. 2). The morphologically simple placozoan, Trichoplax adhaerens, exhibits PaxB expression in distinct cell patches close to the outer edge of the animal (Hadrys et al., 2005). It has been proposed that this expression domain consists of proliferating and differentiating cells and that the PaxB gene may play a role in specifying fibre cells, which are putative proto-neural or muscle cells (Hadrys et al., 2005).
AN ANCIENT ASSOCIATION WITH THE EYE
Members of the pax2/5/8/B subfamily also play an enduring role in both mechanosensory and photosensory cells across various animal lineages (see Fig. 2). PaxB is the only pax gene found in the box jellyfish, Tripedalia cystophora, which belongs to the most basal animal phylum that has evolved eyes (Kozmik et al., 2003). It is expressed in both the eye and the statocyst, a geosensory organ. Interestingly, it has dual pax258 and pax6 type properties. Phylogenetic profiling of the paired domain clusters it with the pax258 subfamily; it has an octapeptide domain that is characteristic of pax258 genes; and the C-terminal transactivation domain has a structure that is more similar to pax258 than to pax6 (Kozmik et al., 2003). In addition, the paired domain possesses pax2-specific binding properties, enabling the activation of the J3-crystallin promoter. PaxB is also able to rescue sparkling (spapol), a Drosophila D-Pax2 eye mutant. However, whilst the octapeptide domain is essential for the interaction of pax258 proteins with corepressors of the groucho family, no such interaction was evident when PaxB was tested in bandshift assays. Consequently, it appears that the octapeptide domain in PaxB does not possess all pax258-type functions.
The full HD domain of the paxB gene is more characteristic of pax6, and like the Drosophila pax6 co-orthologue Ey, it has been shown to enable the activation of rhodopsin gene expression via binding sites in the proximal promoter region (Kozmik et al., 2003). A distinguishing feature of pax6 is that it is sufficient to induce ectopic eyes in Drosophila, binding via the paired domain to the promoter of the pax6 homologue ey (Nornes et al., 1998). Not only has PaxB demonstrated this ability, but also the Drosophila D-Pax2 (Kozmik et al., 2003), which suggests that the latter retains some pax6 function. The fact that the spapol phenotype is also rescued by the Drosophila pax6 co-orthologues Ey and Toy suggests they can substitute for pax2, just as D-Pax2 is able to substitute for pax6 function in eye induction (Kozmik et al., 2003). Therefore, prior to the divergence of the diploblasts and triploblasts, the pax gene associated with the eye had both pax258 and pax6 type properties. These subsequently became distinct in the vertebrate lineage but remnant pax258 properties were retained in the Drosophila pax6 orthologues and vice versa.
AN ANCIENT ASSOCIATION WITH THE EAR
As with the eye, the role of the PAX258 gene family in developing mechanosensory systems appears to be ancient. Expression of pax258B has been detected in the statocyst (an ancient geosensory structure) of jellyfish (Kozmik et al., 2003) and abalone (O'Brien and Degnan, 2003), in addition to the second antennal segment of Drosophila, within precursors to sensory organs including the Johnston's organ, the auditory device of flies (Fu et al., 1998). In basal chordates, pax258 expression in the otic domain sparked a debate as to the existence of sensory placodes outside the vertebrate lineage. The correlation of pax258 gene expression with ectodermal thickenings in the ascidian atrial primordium, thought to be homologous to the vertebrate inner ear, is suggestive of homology to vertebrate sensory placodes (Wada et al., 1998). Equally, the correlation of amphioxus AmphiPax258 gene expression with ectodermal openings is indicative of a role in fusion, adhesion, and perforation (Kozmik et al., 1999). Both correlations have recently been observed in the larvacean Oikopleura dioica, in which the regions of ectodermal thickenings and openings are topographically separate (Bassham et al., 2008).
PAX2/5/8/B GENE EXPRESSION AND THE TRIPARTITE BRAIN
Expression of the pax258 genes in the developing nervous system of non-vertebrate chordates has also raised speculation about the existence of an equivalent structure to the vertebrate midbrain–hindbrain boundary (MHB) (Wada et al., 1998; Krelova et al., 2002; Ikuta and Saiga, 2007). In vertebrates, this forms part of the tripartite structure of the developing brain, positioned between an anterior mid/forebrain and a posterior hindbrain (Martinez, 2001; Rhinn and Brand, 2001). Molecularly, it is marked by the coexpression of PAX2, 5, and 8, Fgf8, En, and Wnt at the junction between abutting rostral Otx and caudal Gbx expression. It is also located in the gap between Otx and caudal Hox1 expression, which corresponds to the “neck” region of the ascidian Halocynthia roretzi where the HrPax258 gene is expressed (Wada et al., 1998). However, although there is similar expression in Ciona intestinalis and Ciona savignyi, the other MHB markers are absent from this domain (Jiang and Smith, 2002; Ikuta and Saiga, 2007). In addition, pax258 expression immediately posterior to the otx domain is absent in the larvacean urochordate Oikopleura dioica (Canestro et al., 2005) and in the cephalochordate amphioxus (Kozmik et al., 1999). It has been suggested that rather than possessing a homologue to the vertebrate MHB organiser as marked by the co-expression of a number of genes, in other chordates homologous gene sets are required for the development of specific neurons (Canestro et al., 2005). The restriction of “MHB” gene expression in some ascidians may be a consequence of functional necessity to direct neuron specification from a single site in the nerve cord (Lacalli, 2006). Whether tripartite organisation of the brain is exclusively a vertebrate characteristic is still not fully resolved, since both Hemichordates (Lowe et al., 2003) and Ecdysozoans (Hirth et al., 2003) show co-expression of MHB marker homologues at the otx-gbx interface.
However, recent analyses in Drosophila revealed that there is no expression of Pax2/5/8-related genes in this interface during the early period of brain development (Urbach, 2007), whereas in the developing vertebrate brain, specification of the MHB is one of the earliest events.
The tripartite otx-pax258-hox1 organisation of gene expression has also been recently reported in the endostyle of the larvacean urochordate Oikopleura dioica (Canestro et al., 2008). This offers interesting parallels between the anterior-posterior patterning of the endodermally derived endostyle and the ectodermally derived CNS.
DIFFERENTIAL PAX258 ROLES AFTER DUPLICATION IN CHORDATES
Figure 2 illustrates the many commonalities in PAX258B gene expression across the animal lineage. Their expression in the central nervous system, excretory system, eye, ear, and thyroid has been observed in all vertebrates including lamprey (McCauley and Bronner-Fraser, 2002). Similar expression has also been observed in invertebrate chordates, which have a single pax258 gene in cephalochordates (Kozmik et al., 1999; Krelova et al., 2002) and two pax258 genes in urochordates via an independent duplication event (Wada et al., 1998; Canestro et al., 2005; Ikuta and Saiga, 2007). However, whilst the collective role of PAX258B genes has endured, subsequent to the vertebrate/urochordate duplication events there are many instances of partitioning of the ancestral sub-functions (see Fig. 3), in addition to neo-functionalisation.
When compared to members of the vertebrate PAX258 gene subfamily, the single AmphiPax258 gene has similarities in expression, structure, and DNA-binding specificity (Krelova et al., 2002). The independent duplication event in other non-vertebrate chordates, the urochordates, means that these are the most basally derived animal lineage for which there is evidence of functional partitioning of pax258. Comparisons of larvacean and ascidian pax258 gene expression (e.g., around the oral opening) imply that some of this partitioning occurred prior to the lineage divergence, with the expression of AmphiPax258 (Kozmik et al., 1999) representing the sum of urochordate pax258a and pax258b expression (Bassham et al., 2008; Hiruta et al., 2005). In other domains, such as the endostyle (homologous to the vertebrate thyroid), the two copies have subfunctionalised in the larvacean lineage, but exhibited a loss of function in one copy in the ascidian lineage (Bassham et al., 2008; Hiruta et al., 2005). Therefore, the function of duplicated genes has differentiated within the basally derived chordates.
Whilst much of the non-vertebrate chordate pax258 expression is detected in domains homologous to those of the vertebrate PAX258 genes, as discussed above there does not appear to be a structure equivalent to the MHB. In vertebrates, the transcription factor engrailed acts downstream of Pax2 and possesses a Pax2-binding site in its proximal promoter region. The lack of a discernible AmphiPax258 binding site in the AmphiEn regulatory region implies the absence in stem chordates of at least one essential gene interaction for MHB formation (see Fig. 7 and Table 3).
OVERLAPPING AND DIFFERENTIAL EXPRESSION OF PAX258 GENES IN VERTEBRATES
The MHB is a critical organiser centre required for patterning and neural differentiation of the midbrain and anterior hindbrain and, with the exception of Pax8 in frog (Heller and Brändli, 1999), is one region where all three members of the vertebrate PAX258 gene family are expressed (Fig. 3 and see Table 1 for references). Elsewhere, both overlapping and differential roles are evident in members of the PAX258 gene subfamily and these can differ across the vertebrate lineages. This is schematised in Figures 4 and 5 and described in Table 1, where the reader is directed for relevant references. In the case of chick, much of the literature relates to pax2, but additional information can be found at http://geisha.arizona.edu/geisha/ (Bell et al., 2004). Within the vertebrates studied to date there are many commonalities in PAX2 function. Early interest in this gene arose from a frameshift mutation in the octapeptide domain that resulted in the various phenotypes exhibited in Renal Coloboma Syndrome (RCS) (Sanyanusin et al., 1995). These present as eye malformations, kidney abnormalities, and mild sensorineural deafness. Similar phenotypes are exhibited in Krd mouse mutants, which have a deletion that includes Pax2 (Keller et al., 1994); in mouse Pax21NEU mutants, which have a frameshift mutation and provide a model for RCS (Favor et al., 1996); and in Pax2 null mice (Torres et al., 1995). The zebrafish pax2 noi mutant carries a nonsense mutation in the paired box resulting in a truncated protein lacking DNA binding and transactivation function (Lun and Brand, 1998). This mutant also displays eye, kidney, and otic phenotypes, reflecting a prevailing role of PAX2 in the development of these structures in vertebrates.
Table 1. Details of Spatial Expression in Developmental Domains Where the Expression of PAX258 Genes Overlap

Note: The development of each domain is given in the first column, introducing abbreviations used in Figure 4, which schematises these descriptions. PAX258 expression is outlined for four vertebrate animal models: Chick (C), Fish (F), Mouse (M), and Frog (X). We have indicated where there is no data (ND) available. Spatial expression is described for the single tetrapod Pax2, the teleost pax2 co-orthologues, pax5, then pax8. More detailed information is available for the pax5 expression in teleosts and is, therefore, listed separately. References are numbered on the right and given in superscript in the remainder of the text (shown here in parentheses after each species abbreviation). Useful reviews (R) are also listed.

Domain: MHB
Development: Also known as the isthmus (or isthmic organiser, IsO). Critical for the patterning of the midbrain and cerebellum.
Genes and references: Pax2/5/8: C (i); F (iv, v); M (iv, vi, vii)
Expression: At isthmus and posterior midbrain; pax5 expression is broader, extending into anterior hindbrain.

Domain: Spinal cord
Development: Early influence of SHH and BMP signalling establishes specific dorsal-ventral domains in the ventricular zone (VZ). Here progenitor cells exit the cell cycle and migrate laterally, and the combinatory expression of transcription factors in post-mitotic cells impacts on neural fate.
Genes and references: Pax2/5/8: C (ii, iii); F (i, ix); M (iii, viii, x)
Expression: Pax2: in newly postmitotic cells subsequently occupying one large ventral and a smaller dorsal population. Also expressed later in the DILA. C (ii, iii); F (i, vi, vii); M (iii, x); X (iv)

Domain: Eye
Development: The optic cup forms by invagination of the optic vesicle (OPV). Ventrally, the choroid fissure (CF) channels blood vessel access and here retinal axons exit via the optic stalk (OS), which later transforms into the optic nerve (ON). Axons navigate towards the optic chiasm (OC) at the midline where they may or may not cross depending upon the visual system (iii) (see Fig. 4).
Genes and references: Pax2: C (ix); F (i, iv, vii); M (i, ii, v, x); X (iii)
Expression: Pax2, co-expressed with Pax6 in the early OPV, M (ii); subsequently in the CF and OS/N, C (ix); F (vii); M (v, viii, x)

Domain: Ear
Development: Otic placodes (OP) form as paired ectodermal thickenings. Invagination/cavitation forms the otocyst/otic vesicle (OV). Inner ear chambers arise from folding and differentiation. These contain sensory epithelia (SE), consisting of hair cells (HC) and support cells (SC). The maculae in the saccule (S) and utricle (U) are the first SE to form, followed by the cristae in the semicircular canals (SCC). HCs synapse with neurons of the statoacoustic ganglion (SAG). HC function requires the correct ion balance of the endolymph (the inner ear fluid) maintained via the endolymphatic duct/sac (ELD/S).
Genes and references: Pax2/5/8: F (vi, ix, xi); X (ii)
Expression: Pax2: maculae, cristae, and SAG but also nonsensory regions of the cochlea (C), lateral SCC, and the ELD/S. C (iii, v, xii); M (i, vii, viii)

Domain: Kidney
Development: Derives from the intermediate mesoderm (IM), located between the paraxial and lateral plate mesoderm (PM, LM). First visible as the pronephric/Wolffian ducts (PD/WD) followed by formation of pronephric tubules (PT). The mature mammalian kidney develops when the ureteric bud (UB) induces condensation and proliferation of the metanephric mesenchyme (MM). In the mesonephros, the tubules and WD are transformed into the male genital tract whereas the female genital tract derives from the Mullerian duct (MD), which forms in the IM later, but in parallel to WD.
Genes and references: Pax2/8: F (iv, vi); M (i, ii); X (iii)
Expression: Pax8 is expressed early, with later expression of Pax2. Both are expressed in the IM, PD/WD, PT, MD, UB, and MM (i-vi).
Pax2 expression overlaps with both Pax5 and Pax8 in the developing mammalian spinal cord (Pillai et al., 2007). Figure 4 shows spinal cord Pax2 expression in the context of other transcription factors since, as well as impacting upon neural fate (for review, see Lewis, 2006), their combinatorial expression also influences neurotransmitter properties (Cheng et al., 2004, 2005; Pillai et al., 2007; Batista and Lewis, 2008). The role of pax5 in spinal cord development has not, however, been fully retained in other vertebrates, since in teleosts pax5 expression is only transient and very weak in these cells (Pfeffer et al., 1998), and undetectable in frog (Heller and Brändli, 1999). Both Pax2 and Pax8 are also involved in cell fate specification, induction, and maintenance of the otic domain (Burton et al., 2004; Mackereth et al., 2005). Pax5, by contrast, is involved in teleost ear development, regulating the development of the utricular macula (Kwak et al., 2006), and has been detected in the otic domain of frog (Heller and Brändli, 1999) and chick (http://hg.wustl.edu/lovett/projects/nohr/inner_ear_ratio.html), but it has not been detected in mouse. Expression of this gene is also absent from the eye, in which PAX2 is required for morphogenesis and retinal axon guidance (Favor et al., 1996; Macdonald et al., 1997; Pfeffer et al., 1998). Whilst teleost pax8 expression in the ventral eye overlaps with the pax2 co-orthologues (Pfeffer et al., 1998), its role in this domain is unclear, and expression in the eye of other vertebrates has not been documented. The intermediate mesoderm is another domain where PAX2 and PAX8 are co-expressed (see Table 1 for references) and co-operatively control the development of the urogenital system (Bouchard et al., 2002), whereas expression of PAX5 and teleost pax2b is absent from this domain.
Perhaps the most distinctive division in the way in which members of the PAX258 subfamily have retained their ancestral function occurs in the thyroid. In humans, PAX8 has been associated with congenital hypothyroidism due to point mutations in the paired domain severely affecting its DNA binding capacity (Macchia et al., 1998). In mouse, this gene is required for the differentiation of endoderm cells into thyroid follicles (Mansouri et al., 1998) and together with Foxe1, Hhex, and Titf1 forms part of a regulatory network for thyroid morphogenesis. Surprisingly, when the frog was assessed for pax258 gene expression, pax8 was absent from this domain. Instead, pax2 was the member of the subfamily detected in the thyroid (Heller and Brändli, 1999). In teleosts, on the other hand, not only was pax8 detected, but also pax2a (Pfeffer et al., 1998). Therefore, the role in thyroid development was retained differentially not only amongst the paralogues but also amongst the teleost co-orthologues, since pax2b is not expressed in this domain. As described previously, this differential retention of the pax258 subfunction has also been observed within the urochordate endostyle, a homologue of the vertebrate thyroid.
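The differential retention of the thyroid role can be tabulated directly from the observations cited in this paragraph. The following is a minimal sketch: the dictionary layout and function name are hypothetical, only genes explicitly mentioned above are included, and genes for which no thyroid data are given here are simply omitted rather than scored as absent.

```python
# Thyroid expression of PAX258 subfamily members as stated in the text above.
# True = detected/required in the thyroid; False = reported absent.
# Layout and names are illustrative assumptions, not a published data set.
THYROID_EXPRESSION = {
    "mouse": {"Pax8": True},
    "frog": {"pax2": True, "pax8": False},
    "teleost": {"pax2a": True, "pax2b": False, "pax8": True},
}

def thyroid_retained(lineage):
    """Return the subfamily members reported in the thyroid for a lineage."""
    return sorted(
        gene for gene, detected in THYROID_EXPRESSION[lineage].items() if detected
    )

for lineage in THYROID_EXPRESSION:
    print(lineage, thyroid_retained(lineage))
```

Laid out this way, the contrast between frog (pax2 only) and teleosts (pax8 plus one of the two pax2 co-orthologues) is immediately visible.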
UNIQUE ROLES ACQUIRED BY MEMBERS OF THE PAX258 GENE FAMILY
As well as differential retention of the ancestral PAX258 gene function, members of this subfamily adopted unique roles. The fundamental interest in the PAX5 gene is its key role in B-cell differentiation and maintenance (Nutt et al., 1999) and strong association with leukaemia (O'Neil and Look, 2007). A recurrent t(9;14)(p13;q32) translocation in small lymphocytic leukaemia juxtaposes the strong Eμ enhancer of the IgH gene upstream of the PAX5 exon 1A (Busslinger et al., 1996). Although the breakpoint on chromosome 9 left the PAX5 protein coding sequence intact, its dissociation from the full complement of regulatory elements could contribute to an observed increase in expression levels in cell lines. In addition, both up- and down-regulation of PAX5 can result in leukaemia (Cobaleda et al., 2007; Souabni et al., 2007).
Undetectable by immunohistochemistry, Pax2 has been found to be expressed in the pancreas and was first detected in rat islet cell lines by RT-PCR (Ritz-Laser et al., 2000). The role of Pax2 in pancreas formation is as yet obscure (see Fig. 5), but in vitro it can transactivate the glucagon gene promoter (Ritz-Laser et al., 2000). In the mouse Pax21NEU mutant, the volume occupied by the pancreas is increased due to a larger number and size of insulin containing β-cells (Zaiko et al., 2004). This may implicate Pax2 in controlling the correct proportion of these cells to the glucagon containing α-cells in the developing pancreas. Such sensitive detection methods have revealed a novel Pax2 function, unique amongst the PAX258 subfamily. Given the subtle phenotype, more cryptic PAX258 functions may yet be discovered.
THE FUNCTION OF PAX258 GENES
The cumulative PAX258 expression data given here can only be suggestive rather than indicative of gene function. Instead, the more clearly identifiable PAX258 mutant/misexpression phenotypes provide a perspective. Here we have summarised phenotypes in Table 2, with the relevant references.
Table 2. Mutant/Misexpression Phenotypes in Developmental Domains Where the Expression of PAX258 Genes Overlap

Note: Genes are listed in order of gene (PAX2, 5, then 8) and species (human, mouse, fish, and chick). Mutant phenotypes are listed first, then misexpression/knockdown phenotypes. Where documented, the effect on PAX258 expression is given in the adjacent column. Abbreviations correspond to those used in Table 1. Numbers for references are given in superscript (shown here in parentheses), and animals described are abbreviated as C, chick; F, zebrafish; H, human; M, mouse.

Misexpression of Pax2: Induction of ectopic nephric structures in the IM. C (i)
SETTING THE BOUNDARIES
In many instances during the development of an embryo, boundaries between tissue-specific territories are set via mutually antagonistic signalling. Members of the vertebrate PAX258 gene subfamily are involved in such interactions in the developing eye, brain, and kidney. The unique organisation of the vertebrate brain into a tripartite structure involves an early territorial division of the neural tube into fore-, mid-, and hindbrain primordia. This requires the cooperation of pax2/5 and pax6 in regionalising the boundaries of these domains such that pax2/5 demarcates the midbrain vesicle and is nested between rostral and caudal pax6 expression domains (Schwarz et al., 1999). Similarly, in the developing eye, pax2 and pax6 respectively regionalise ventral and dorsal domains via reciprocal repression (Schwarz et al., 2000) (see Fig. 4). Conversely, earlier in eye development these two genes are also able to function redundantly when establishing the retinal pigment epithelium (Baumer et al., 2003), perhaps reflecting their functional commonalities elsewhere in the animal lineage as discussed previously. In the developing kidney, there appears to be mutual antagonism between Pax2 and Wt1 in establishing the boundary between the pronephric tubule and the podocyte, a basket-like structure that forms the blood filtration barrier to capillaries (Yang et al., 1999; Majumdar et al., 2000; Drummond, 2003). In each of these cases, loss of one interacting gene leads to the territorial encroachment by the other.
INDUCTION, SPECIFICATION, AND DIFFERENTIATION
Evidence from mutant phenotypes indicates that PAX258 genes are involved in multiple steps of embryonic development. Analyses of pax2/5 mutants suggest that the establishment of the MHB, described above, requires Pax2 for initiation and Pax5 for maintenance of this structure (Schwarz et al., 1997). Both are required for cell fate determination, since ectopic expression of Pax2 or Pax5 leads to the transformation of the diencephalon into tectum-like structures (Funahashi et al., 1999; Okafuji et al., 1999). Cell fate specification is also influenced by pax2/8 in the spinal cord, where absence of these genes specifically affects the neurotransmitter properties of interneurons (Batista and Lewis, 2008). The influence of PAX258 genes upon cell specification and differentiation pervades the majority of domains where they are expressed, such as the inner ear (Riley et al., 1999; Burton et al., 2004), developing kidney (Majumdar et al., 2000; Bouchard et al., 2002; Narlis et al., 2007), and egg-laying system of the nematode (Chamberlin et al., 1997). PAX5 alone is fundamental for the specification of the B-cell lineage (Nutt et al., 1999; Carotta and Nutt, 2008; Holmes et al., 2008), indicative of the deep-seated function of PAX258 paralogues in cell specification.
PAX258 gene expression is often located in regions that will form ectodermal openings (see Fig. 2), and there is evidence that members of the subfamily are required for the morphogenesis of these openings. In mouse, a recurrent Pax2 mutant phenotype is vaginal atresia and possible obstruction of the male genital duct (Bouchard et al., 2002). The C. elegans pax2 orthologue, egl-38, is also required for correct vulval and uterine morphogenesis (Chamberlin et al., 1997). As well as regulating ectodermal openings, members of the PAX258 gene subfamily are also required for the correct closure of a temporary embryonic groove in the eye, the choroid fissure (Sanyanusin et al., 1995; Favor et al., 1996; Torres et al., 1996; Macdonald et al., 1997; Schwarz et al., 2000). In mammals, Pax2 is necessary for morphogenesis of the cochlear and endolymphatic ducts (Nornes et al., 1990; Torres et al., 1996; Burton et al., 2004). In mutants, outgrowth of the cochlear duct is arrested early and this is accompanied by massive cell death (Favor et al., 1996; Burton et al., 2004). Therefore, as well as correct initiation and specification, members of the subfamily are required for cell survival. This is further suggested by the kidney mutant phenotypes, in which early structures form but then degenerate (see Table 2).
The ability to protect cells from apoptosis has been demonstrated in the C. elegans PAX2/5/8 orthologues egl-38 and pax-2, where mutants exhibit increased cell death and induced expression reduces apoptosis in the germ line of wild type animals (Park et al., 2006). Apoptosis has also been detected in the intermediate mesoderm during kidney development in Pax2/8 mutants (Bouchard et al., 2002; Narlis et al., 2007), although the influence of Pax2/8 is probably indirect, since apoptosis is delayed. In the MHB, downstream targets of Pax2 regulation include modulators of cell signalling pathways involved in cell proliferation and survival (Bouchard et al., 2005). Their prevalent association with cancer (Dressler and Douglass, 1992; Eccles et al., 1992; Kozmik et al., 1995; Poleev et al., 1995; Stuart et al., 1995; Busslinger et al., 1996; Muratovska et al., 2003; Fonsato et al., 2006; Gibson et al., 2007; Mullighan et al., 2007; Souabni et al., 2007) further exemplifies the role of PAX258 genes in apoptosis resistance and cell survival. They are frequently expressed in tumour-derived cell lines, where down-regulation of expression correlates with increased cell death (Muratovska et al., 2003; Fonsato et al., 2006; Gibson et al., 2007) and enhanced expression of tumour suppressor genes (Fonsato et al., 2006). Conversely, in normal cell lines, PAX2 introduction results in apoptosis resistance with increased proliferation and angiogenesis (Fonsato et al., 2006). Therefore, the correct balance of PAX258 expression is required both during embryonic development and in the adult animal.
THE EFFECT OF PAX258 GENE DOSE ON DEVELOPMENTAL PROCESSES
The retention of duplicate genes provides a robust environment for correct embryonic development, and this is evidenced by studies of single as opposed to double or triple knockdown experiments (see Table 2). In many cases, mutations/knockdown of single genes can produce only a mild or undetectable phenotype, suggesting compensation by other members of the subfamily. Compound knockdown experiments corroborate this and there is growing evidence of functional redundancy amongst the PAX258 genes.
FUNCTIONAL REDUNDANCY AND EQUIVALENCY WITHIN THE PAX258 GENE FAMILY
As suggested by the milder phenotypes that arise from single knockdowns, there is a level of functional redundancy within the PAX258 gene subfamily, and once again there are lineage-specific differences. The absence of an MHB in zebrafish noi mutants is accompanied by loss of pax5/8 expression in this domain (Pfeffer et al., 1998). In contrast, highly strain-specific MHB phenotypes in mouse (Favor et al., 1996; Schwarz et al., 1997) could in part be due to compensation by Pax5, since Pax5 expression persists (although highly reduced) and Pax2/5 double mutants ablate the structure (Schwarz et al., 1999).
Knockdown experiments have provided further evidence of functional redundancy in other domains. Mouse Pax2 mutants show abnormal cochlear development with strain-specific severity (Nornes et al., 1990; Torres et al., 1996; Burton et al., 2004) with a loss of spiral and vestibular ganglia (Burton et al., 2004). In contrast, Pax8 mutants appear to have normal ear development, which may be due to compensation by Pax2 (Bouchard et al., 2002). In the case of zebrafish, mutant phenotypes are relatively mild. Whilst the formation of the otic placode and vesicle is unimpaired in pax2a single mutants, there is an increase in initial hair cell production in the utricular macula (Whitfield et al., 2002). Morpholino knockdown of pax2b impairs early production of hair cells, which later form normally (Whitfield et al., 2002). These co-orthologues, therefore, appear to have subfunctional roles within the otic domain. It has been suggested that pax2b is required for hair cell initiation and that pax2a regulates the lateral inhibition of hair cell formation, restricting these cells to the endogenous domain (Whitfield et al., 2002). In the case of zebrafish pax8, when the translation of all splice form variants is blocked by morpholino injection, hair cell production is reduced and the anterior-posterior dimensions of the otic placodes are halved (Mackereth et al., 2005). The phenotype of the pax8 morphant in a pax2a− mutant background is more severe in that embryos rarely form an otic vesicle, although residual otic cells persist (Hans et al., 2004). When all three genes are knocked down, otic induction is comparable to that in pax8 morphants, but the otic structures are not maintained: unlike the pax8-depleted pax2a− mutants, no otic tissue can be detected at 24 hpf. Since there is no evidence for cell death, this is likely to be due to dedifferentiation of otic cells (Mackereth et al., 2005).
Functional redundancy has also been demonstrated in the spinal cord and in the kidney. In single or double knockdown experiments, the spinal cord of zebrafish Pax2 mutants only exhibits a mild phenotype, but in a triple knockdown there is a dramatic reduction in the number of cells with inhibitory neurotransmitter properties (Batista et al., 2008). Whilst mouse Pax8 mutants develop a normal urogenital system, possibly due to compensation by Pax2 (Mansouri et al., 1998), in Pax2 mutants the Wolffian duct initially forms but fails to extend to the cloaca and later degenerates leading to the absence of the metanephros and genital tracts (Torres et al., 1995; Favor et al., 1996). The zebrafish noi mutant also initially forms pronephric primordia but the differentiation of the tubule and anterior pronephric duct is abnormal resulting in loss of these structures (Majumdar et al., 2000). Compound Pax2−/−/Pax8−/− mouse mutants indicate that Pax2 and Pax8 cooperatively control the development of the urogenital system in that in the absence of both genes, embryos are unable to form the pronephros. Due to the failure of mesenchymal-epithelial transitions, later structures are also absent (Bouchard et al., 2002).
Whilst members of the PAX258 subfamily appear to function differentially during normal development, these examples of functional redundancy are indicative of remnant functional equivalency. Compelling evidence for such functional equivalency came from an experiment with Pax5 in mouse. Here a Pax5 minigene (cDNA) was inserted into the Pax2 locus, creating a knock-in (Pax2^5ki/5ki) mouse line (Bouchard et al., 2000). This effectively rescued the MHB phenotype of the C3H/He strain, corroborating functional redundancy of Pax2 and Pax5 in this domain. However, in addition to rescuing this phenotype, the Pax2^5ki/5ki allele was also able to rescue eye, ear, and urogenital phenotypes. Therefore, although Pax5 is not normally expressed in these domains, given the correct regulatory environment, it is biochemically capable of substituting for Pax2 function.
THE ROLE OF CIS-REGULATORY ELEMENTS IN DIFFERENTIAL PAX258 GENE FUNCTION
Given their highly conserved protein-coding regions, this evidence of functional equivalency strongly implicates cis-regulatory elements as candidates responsible for the differential roles/expression patterns within the PAX258 gene family. An important class of putative cis-regulatory elements of developmental genes is readily characterised by extraordinary conservation across multiple vertebrate species. These are referred to as Conserved Non-coding Elements, CNEs (Woolfe et al., 2005; Venkatesh et al., 2006), Ultra-Conserved Elements, UCEs (Bejerano et al., 2004), or Ultra-Conserved Regions, UCRs (Sandelin et al., 2004), and can be found in databases such as Condor (http://condor.fugu.biology.qmul.ac.uk/) (Woolfe et al., 2007) and the Enhancer Browser (http://enhancer.lbl.gov/) (Visel et al., 2007). Whilst the functional characterisation of these unusual elements is at an early stage, their potential role in cis-regulation has been corroborated by demonstrable enhancer activity in zebrafish (de la Calle-Mustienes et al., 2005; Shin et al., 2005; Woolfe et al., 2005), mouse (Pennacchio et al., 2006), chick (Sabherwal et al., 2007), and frog (de la Calle-Mustienes et al., 2005). The PAX2 locus is occupied by almost 60 CNEs (Fig. 6) that are conserved across vertebrates and that in human span a range of 265 kb. Interestingly, a remarkable number (almost three quarters) of these CNEs are retained in both of the teleost pax2 co-orthologues, perhaps reflecting the overlapping functions of these duplicates. There is a striking difference in the number of CNEs populating the PAX5 and PAX8 loci, with PAX5 only possessing 16 CNEs and PAX8 a mere two (Fig. 6). Of these, four of the PAX5 and one of the PAX8 CNEs have also been retained in PAX2 (and both teleost pax2 co-orthologues). Many of these elements have shown enhancer activity (http://condor.fugu.biology.qmul.ac.uk/) (Woolfe et al., 2007), suggesting a cis-regulatory function.
Therefore, this intriguing distribution of CNEs subsequent to whole genome duplication events perhaps contributes to the differential ancient and emerging roles of this gene family.
The contribution of cis-regulatory elements to various aspects of PAX258 expression can be derived from transgenic lines (examples of these are mapped onto Fig. 6). Short zebrafish pax2a constructs containing 0.8–2.4-kb upstream sequence can recapitulate some endogenous gene expression, but the reproducibility of this transient expression is unreliable with strong ectopic expression (Picker et al., 2002). In contrast, longer constructs, containing 4.5- and 5.3-kb upstream sequence give more stable and reproducible results with expression in endogenous domains. However, ectopic expression is still observed and whilst some endogenous expression is faithful, in other domains there are temporo-spatial differences with complete absence of expression from the optic stalk (see Fig. 6). A similar result is obtained from a mouse transgenic line containing 8.5 kb upstream sequence of Pax2 (Rowitch et al., 1999). Also, departures from endogenous expression have been observed in a mouse Pax2-Cre line, despite incorporating 101 kb of upstream and 20 kb of downstream sequence (Ohyama and Groves, 2004).
These valuable studies clearly demonstrate that faithful recapitulation of spatiotemporal endogenous gene expression requires that the entire regulatory architecture be in place. As part of this architecture, the readily identifiable CNE population can put these constructs in context (Fig. 6) and can be used as a means of suggesting candidate regions responsible for regulating gene expression in PAX258 domains not so far recapitulated in the current transgenic lines.
GENERATING GENE REGULATORY NETWORKS
Many gene regulatory regions consist of transcription factor binding sites, and information from these can enable us to build up gene regulatory networks (GRNs). There is already much literature on the place of the PAX258 gene family in tissue-specific GRNs, based on functional studies. For example, in the MHB they are involved in an intricate network of interacting genes that induce and then maintain this structure. The MHB is first demarcated by the abutting early expression of otx2 rostrally and gbx2 caudally. Pax2 is required for initiating the expression of the fgf8 signalling molecule (Ye et al., 2001) that is both necessary and sufficient for inducing midbrain and cerebellum development (Crossley et al., 1996). Since the expression of the PAX258 gene family overlaps with that of a number of other transcription factors in this domain, it is hard to decipher their place within this network. However, many loss- and gain-of-function studies have indicated that Pax2 acts upstream of a number of the MHB-expressing genes and controls intracellular signalling to modulate cell proliferation and differentiation (see Table 2 and Fig. 7).
This control of cell cycle is a common theme in the role of members of the PAX258 subfamily in numerous developmental domains, often participating in the Pax-Six-Eya-Dach-Network (PSEDN). First identified in Drosophila eye development, it appears that members of this network are frequently deployed in other developmental processes such as myogenesis, nephrogenesis, placode development, and the development of endocrine tissues. Detailed analyses in amphioxus suggest that rather than a strictly conserved network, the PSEDN is loosely reiterated in a number of developmental processes (Kozmik et al., 2007). The participant members of this network vary from domain to domain, and the paralogous genes involved are lineage specific. For example, the AmphiPax258 is expressed in the developing amphioxus nephridium but no other members of the PSEDN are expressed apart from Amphi3/7, a gene subfamily that is not expressed in the vertebrate kidney. Here, six1/2 and eya are expressed in the meso- and metanephric mesenchyme, of which there is no homologue in amphioxus. How members of the PAX258 subfamily fit within the PSEDN in the vertebrate kidney is as yet obscure (and therefore has not been included in Fig. 7), apart from evidence of downregulation of Pax2 in Six1 null mice. However, together with Hox11, a combination of Pax2, Eya, and Six appears to regulate the expression of Gdnf. From the literature reviewed here, the only domain where this subfamily appears to participate in the full complement of the PSEDN is in the ear (Fig. 7) whereas in the eye, Pax6 is the Pax paralogue participant (Brodbeck and Englert, 2004).
The dynamics of GRNs shift throughout development, often making the construction of GRNs difficult, particularly if specific subdomains employ separate regulatory networks. For example, in the eye there are unique regulatory influences involved in dorsal-ventral patterning, specification of the blind spot and optic stalk, and in retinal axon guidance. Also, as discussed, different lineages utilise different participants of GRNs, whether this be an entire gene family or its paralogous members. The PAX258 gene subfamily has three DNA-binding domains, and the paired box domain alone has a fairly broad and degenerate DNA-binding recognition sequence (Czerny et al., 1993); identifying direct target genes with any confidence is therefore experimentally challenging. However, using a combination of in silico analyses and experimental corroboration, a number of novel PAX258 target genes in the otic region were recently identified from just two characterised recognition matrices of the paired box domain (Ramialison et al., 2008). It is hoped that such powerful approaches will rapidly enrich our knowledge of direct gene interactions. In the case of the PAX258 gene subfamily, this knowledge is so far limited. Nevertheless, we have attempted to reconstruct representative GRNs from the current literature so as to give an overview of the complex interactions in which these critical genes participate (Table 3, Fig. 7).
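To give a rough sense of why a broad and degenerate recognition sequence makes target identification difficult, the following Python sketch estimates how many matches to a degenerate consensus would be expected by chance in a region of a given length. It is only an illustration under stated assumptions: bases are treated as independent and equally frequent, the both-strand count is a simple doubling, and the function names and any consensus string passed to them are hypothetical rather than taken from a published paired domain matrix.

```python
# IUPAC nucleotide codes mapped to the bases each code permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def match_probability(consensus):
    """Probability that a random window matches the consensus.

    Assumption: bases are independent and each occurs with frequency 0.25.
    """
    probability = 1.0
    for code in consensus.upper():
        probability *= len(IUPAC[code]) / 4.0
    return probability


def expected_chance_hits(consensus, region_length, both_strands=True):
    """Expected number of chance matches in a region of region_length bases.

    Assumption: searching both strands is approximated by doubling the
    number of windows, which slightly overcounts palindromic motifs.
    """
    windows = max(region_length - len(consensus) + 1, 0)
    strands = 2 if both_strands else 1
    return windows * strands * match_probability(consensus)
```

For instance, `expected_chance_hits("GNNNGCRTGA", 265_000)` would estimate the chance matches to this made-up consensus across a region the size of the human PAX2 CNE range; each additional degenerate position multiplies the expectation, which is one way of seeing why in silico predictions need experimental corroboration.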
Table 3. Participants in PAX258 GRNs within their overlapping expression domains
Indirectly interacting genes
Superscript numbers refer to stages in development (see below) and numbers in parentheses relate to the relevant reference, unless covered in one of the reviews. *, References detailing indirect interactions. Genes may appear in more than one column depending upon their relative position in the GRN based upon specified experimental evidence or stage of development. We use the terms “upstream” and “downstream” genes loosely, referring to genes that, respectively, influence or are influenced by the expression of PAX258 genes. Indirectly interacting genes are given as one list for all members of the PAX258 gene family; refer to Fig. 7 for these interactions. So as to include all members of the PAX258 gene subfamily, molecular interactions in the ear relate to teleosts. We have omitted the downstream target genes identified by Ramialison et al. (2008), since these are novel with unknown roles within GRNs involved in ear development. Unless interactions are teleost specific, mouse gene nomenclature is given. The model organisms used are given in parentheses: C, chick; D, Drosophila; F, zebrafish; Fm, medaka; M, mouse; R, review covering multiple model organisms; X, Xenopus. Subsequently, the experimental evidence is abbreviated as follows: bs, band shift assays; ca, chimeric animals; cc, ChIP-chip assays; ci, coimmunoprecipitation; df, DNase I footprint analyses; ev, ex vivo experiments; fs, FACS sorting; ivi, in vivo experiments; ivt, in vitro experiments; gf, gain of function experiments; kd, knockdown/knockout experiments; ma, microarray; me, misexpression experiments; mp, mutant phenotype; pd, pull down assays; qp, quantitative PCR; re, rescue experiments; ta, transgenic analyses; te, transplantation experiments. Developmental stages are as follows. Ear: 1 Pre-placode stage, 2 placode stage, 3 otic vesicle stage.
Eye: 1 Establishing RPE progenitor domain, 2 Establishing dorsal-ventral polarity, 3 Formation of optic disc/choroid fissure, 4 Formation of optic stalk and retinal axon guidance, 5 Formation of optic chiasm. Kidney: 1 pronephros, 2 mesonephros, 3 metanephros. MHB: 1 Initiation, 2 Maintenance. Spinal cord: 1 Early wave of neurogenesis, 2 Late wave of neurogenesis. Thyroid: 1 endoderm specification, 2 thyroid differentiation.
The whole genome duplication events that gave rise to different members of the PAX258 subfamily have led to a degree of robustness in their collective role, leading to a considerable level of functional redundancy and equivalency. Phylogenetic analyses of s-Pax2/5/8 cDNA from the freshwater sponge Ephydatia fluviatilis show that with increasing organismal complexity, there is a decrease in the evolutionary rate of amino acid substitutions of Pax proteins (Hoshiyama et al., 1998). This perhaps reflects the constraint on protein-coding sequences when a gene undertakes multiple roles. Whilst the coding sequence in itself may still be capable of ancestral functions, it is likely that it is the cis-regulatory environment that influences the differential roles adopted by members of the PAX258 subfamily. This is strongly suggested by the fact that when directed by Pax2 cis-regulatory elements a Pax5 gene exhibits Pax2 function (Bouchard et al., 2000).
Historically, the search for cis-regulatory regions has focused primarily on sequence immediately upstream of the protein-coding region. With an increasing amount of whole genome data and a growing appreciation of the range over which cis-regulatory elements can act, candidate regions for influencing developmental anomalies and disease have vastly expanded beyond the protein-coding region alone. Interestingly, a remarkable amount of endogenous gene expression can be recapitulated from just a tiny component of the gene regulatory repertoire (e.g., from the pax2 constructs mapped onto Fig. 6). This perhaps reflects the robust nature of the overall gene expression. Also, whilst deletion mapping has successfully delineated tissue-specific enhancer elements, when functionally tested in isolation, additional ectopic expression has been observed. For example, a mouse MHB enhancer element upstream of Pax5 showed ectopic expression in the basal plate of the spinal cord and in the forebrain (Pfeffer et al., 2000). In addition, a 0.4-kb upstream sequence of mouse Pax2 was able to specifically recapitulate endogenous gene expression in the Wolffian duct, but with ectopic expression in the branchial arch (Kuschert et al., 2001). Identifying the activity of discrete regulatory regions when taken out of their genomic environment is crucial for recognising potential regulatory anomalies when this genomic environment is perturbed. In context, this ectopic enhancer activity is suppressed, and context should therefore be considered as a regulatory factor. This is particularly important when regulatory regions are eliminated on the basis of their inability to drive particular aspects of endogenous gene expression. For example, none of the Pax2 constructs discussed earlier are able to drive expression in the optic vesicle. However, using band shift assays and sequence comparisons, Schwarz et al. (2000) were able to identify optic-specific enhancers that fell within these 4.5-, 8.5-, and 101-kb mouse constructs (Rowitch et al., 1999; Picker et al., 2002; Ohyama and Groves, 2004). Therefore, not only is ectopic expression observed when a regulatory element is taken out of context, but also absence of endogenous gene expression can be seen with a partial regulatory complement despite the presence of the relevant tissue-specific enhancers. It therefore appears that suppressors in the larger constructs obscure the presence of these optic stalk enhancers.
Endeavouring to identify regulatory elements responsible for specific aspects of endogenous gene expression is, therefore, fraught with difficulties, particularly as, given the sheer number and variety of regulatory elements, it seems unlikely that a lone element is responsible for such a critical role. However, the value of functionally analysing individual elements lies in identifying the underlying sequence language that drives expression in particular spatial and temporal patterns, whether endogenous or ectopic. Whilst databases such as TRANSFAC (Matys et al., 2006) and JASPAR (http://jaspar.cgb.ki.se/) (Sandelin et al., 2004) are invaluable for identifying transcription factor-binding sites (TFBS), low signal-to-noise ratios make this a notoriously difficult task, and the search is limited to known TFBS. In addition, this presupposes that the mode of action of all regulatory elements requires the binding of transcription factors. The extent of conservation of one class of regulatory element, the CNE, is not compatible with this supposition given the short length and degree of degeneracy of TFBS. However, CNEs may present a tightly restricted series of overlapping TFBS that have been retained across the vertebrate lineage. Comparisons of elements that show similar or divergent reporter gene expression profiles can be explored both in vivo and in silico by examining the relationship between sequence grammar and enhancer function. Using the many CNEs that have remained in duplicate within the PAX258 gene subfamily will facilitate such analyses. Programs such as MEME (Bailey and Elkan, 1994; http://meme.sdsc.edu/meme4/intro.html) and WeederH (Pavesi and Pesole, 2006; http://18.104.22.168/modtools/) enable common motifs to be identified de novo, whether they are TFBS or as yet unknown functional motifs.
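As a minimal sketch of the kind of matrix-based TFBS search described above, the following Python code converts a count matrix into log-odds scores and scans a sequence for windows scoring above a threshold. It is not the method used by TRANSFAC, JASPAR, or any program cited here; the count matrix is invented for illustration, the uniform background and pseudocount are assumptions, only the forward strand is scanned, and all function names are hypothetical.

```python
import math

BASES = "ACGT"

# Invented 4-position count matrix (one dict per motif position).
# Illustration only: this is NOT a real paired domain recognition matrix.
EXAMPLE_COUNTS = [
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 2, "G": 0, "T": 8},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 7, "C": 0, "G": 3, "T": 0},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a background.

    Assumption: a uniform background frequency and a flat pseudocount.
    """
    pwm = []
    for column in counts:
        total = sum(column[base] for base in BASES) + pseudocount * len(BASES)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score_window(window, pwm):
    """Sum the log-odds scores of a window the same width as the matrix."""
    return sum(pwm[i][base] for i, base in enumerate(window))


def scan_sequence(sequence, pwm, threshold):
    """Return (start, score) for each forward-strand window at or above threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = score_window(window, pwm)
        if score >= threshold:
            hits.append((start, score))
    return hits
```

Lowering the threshold or using a shorter, more degenerate matrix increases the number of reported windows, most of which would be expected to be chance matches; this is the signal-to-noise problem that makes comparison across conserved or duplicated elements attractive.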
Deciphering how regulatory elements coordinate endogenous gene expression is still in its infancy and yet it is key for understanding the influence that genes such as PAX2, 5, and 8 have upon normal and abnormal development as well as disease. From functional analyses so far, it is clear that the full recapitulation of endogenous gene expression requires the full complement of gene regulatory components. Readily identifiable regulatory elements such as the CNEs enlighten us as to the range over which such elements can act, encroaching into neighbouring unrelated genes (reviewed in Kleinjan and van Heyningen, 2005; Kleinjan and Lettice, 2008). Duplicated conserved non-coding elements give us some indication of this range, namely that half of them are able to act over 250 kb away from their associated genes (Vavouri et al., 2006). The populations of pan-vertebrate PAX2, PAX5, and PAX8 CNEs encompass 260, 470, and 15.5 kb of the human genome, respectively. In order to study endogenous gene expression in the context of the full cis-regulatory repertoire, these entire regions should be taken into consideration. Fortunately, emerging technologies such as recombineering are enabling vast genomic regions to be functionally analysed intact, such that reporter genes can be substituted for coding sequence in order to study the dynamics of endogenous gene expression under the control of the full complement of regulatory elements (Yu et al., 2000; Lee et al., 2001).
With their complex overlapping roles during embryonic development, members of the PAX258 subfamily represent a model for understanding how genes function as a whole and how the differentially retained cis-regulatory components contribute to differential function. BAC recombineering will prove invaluable for understanding the dynamics and subtleties of authentic gene expression in vivo. Comparative functional and bioinformatic analyses of their cis-regulatory repertoire will substantially contribute towards understanding the underlying sequence language that directs tissue-specific expression. In addition to the invaluable existing data, these combined analyses will advance our perspective into how the cohorts of cis-regulatory elements act in concert during intricate developmental processes to direct complex gene regulatory networks.
We thank Stefan Pauls for his helpful comments on an early draft of this manuscript and Dave Cheesman for helping to edit the final document. We are also indebted to the reviewers of this manuscript for their constructive advice.