Within plant cells, some genes are expressed constitutively, whereas others respond to specific stimuli [2–10]. Both patterns depend on the interaction of transcription factors with cis-acting elements and/or with other transcription factors required for gene expression, and they are important in the regulation of cell activities. Therefore, alteration in the expression of transcription factor genes normally results in dramatic changes to a plant [12–14], and structural changes to these genes may represent a significant evolutionary force. As a practical consequence, engineering of transcription factor genes provides a valuable means for manipulation of plants, but success in such endeavors depends on how well the genes are understood. To this end, numerous plant transcription factor genes and the proteins they encode have been characterized [11,16–28]. In this review, we analyze recent advances in the study of higher plant transcription factors, with emphasis on the domain structure of these proteins as well as the evolution and regulation of their genes.
A typical plant transcription factor contains, with few exceptions, a DNA-binding region, an oligomerization site, a transcription-regulation domain, and a nuclear localization signal. Most transcription factors exhibit only one type of DNA-binding and oligomerization domain, occasionally in multiple copies, but some contain two distinct types. DNA-binding regions are normally adjacent to or overlap with oligomerization sites, and their combined tertiary structure determines critical aspects of transcription factor activity. Pairs of nuclear localization signals exist in several transcription factors, and basic amino acid residues play essential roles in their function, a property also true for DNA-binding domains. Multigene families encode transcription factors, with members either dispersed in the genome or clustered on the same chromosome. Distribution and sequence analyses suggest that transcription factor families evolved via gene duplication, exon capture, translocation, and mutation. The expression of transcription factor genes in plants is regulated at transcriptional and post-transcriptional levels, while the activity of their protein products is modulated post-translationally. The purpose of this review is to describe the domain structure of plant transcription factors, and to relate this information to processes that control the synthesis and action of these proteins.
Functional domains of transcription factors
General features of plant transcription factors
Many transcription factors have been examined by X-ray crystallography and NMR spectroscopy, but for plants only the structure of the Arabidopsis thaliana TATA box-binding protein 2 is known precisely. The functional domains of plant transcription factors are usually inferred by comparing amino acid sequences deduced from cDNA clones with those of their animal counterparts. Study of putative functional domains by mutational and functional analysis has demonstrated that typical plant transcription factors consist of a DNA-binding region, an oligomerization site, a transcription regulation domain and a nuclear localization signal (NLS) (Fig. 1), although some lack either a transcription regulation domain or a specific DNA-binding region [27,31,32]. In addition, the identification of conserved RNA-binding motifs in the carnation ethylene-responsive element-binding protein-1, a G-protein β-subunit-like motif in COP1 (constitutive photomorphogenic 1), and a putative membrane-spanning region in PEND (plastid envelope DNA-binding), a bZIP (basic zipper) protein, demonstrates that novel functional domains occur in plant transcription factors.
Classification of transcription factors depends on their structural features, with families sometimes subdivided according to the number and spacing of conserved residues in the most similar domain (Table 1, Fig. 2) [24–26,28,35–63]. For instance, owing to the quantity and arrangement of cysteine (C) and histidine (H) residues, the factors containing zinc fingers fall into five classes: C2H2, C3H, C2C2 (GATA finger), C3HC4 (RING finger), and C2HC5 (LIM finger). Alternatively, transcription factors of the same family are categorized by reference to a domain that falls outside the most conserved region. As one example, homeodomain factors exist as five groups, including the homeodomain zipper, homeodomain finger, GLABRA2, ELK-homeodomain [42,64], and twin-homeodomain factors. The conserved regions upon which nomenclature is based were initially thought to engage only in sequence-specific DNA interactions. However, some of them consist of both oligomerization and DNA-binding domains arranged in positions that vary relative to one another (Fig. 3). Indeed, with the exception of HMG (high mobility group) finger, homeodomain finger, and the A. thaliana HAT homeobox zipper factors, each plant transcription factor contains only one type of combined DNA-binding/oligomerization domain, but it may be present in multiple copies.
|Family||Structural features of the conserved domain||Reference|
|Zinc finger||Finger motif(s) each maintained by cysteine and/or histidine residues organized around a zinc ion|||
|bZIP||A basic region and a leucine-rich zipper-like motif|||
|Myb-related||A basic region with one to three imperfect repeats each forming a helix–helix–turn–helix|||
|Trihelix||Basic, acidic and proline/glutamine-rich motif which forms a trihelix DNA-binding domain|||
|Homeodomain||Approximately 60 amino acid residues producing either three or four α-helices and an N-terminal arm|||
|Myc b/HLH||A cluster of basic amino acid residues adjacent to a helix–loop–helix motif|||
|MADS||Approximately 57 amino acid residues that comprise a long α-helix and two β-strands|||
|AT-hook motif||A consensus core sequence R(G/P)RGRP with the RGR region contacting the minor groove of A/T-rich DNA|||
|HMG-box||L-shaped domain consisting of three α-helices with an angle of about 80° between the arms|||
|AP2/EREBP||A 68-amino acid region with a conserved domain that constitutes a putative amphipathic α-helix|||
|B3||A 120-amino acid conserved sequence at the C-termini of VP1 and ABI3|||
|ARF||A 350-amino acid region similar to B3 in sequence||[24,49]|
The DNA-binding domains of plant transcription factors, many of which are basic in character, contain amino acid residues that contact DNA bases at cis-acting elements, and these determine the specificity of the protein [59,67]. Other residues enhance transcription factor binding by contacting DNA nonspecifically through interaction with either phosphate or deoxyribose moieties. The base-recognition residues are often highly conserved [59,67]. For example, one arginine residue in bZIP domains, either one or two cysteine and histidine residues in zinc-finger motifs, and single arginine and lysine residues in MADS (Mcm1, Agamous, Deficiens, and Serum response factor) domains are nearly identical within plants, animals and fungi (Fig. 2). The spatial arrangement of these amino acid residues in DNA-binding domains is important for specificity, as evident in the cysteine-rich regions of the HMG-finger transcription factor ENBP1 (pea early nodulin gene promoter binding protein 1) and the Arabidopsis homeodomain-finger factor HAT3.1. The cysteine-containing regions form zinc-finger DNA-binding motifs, but unlike other zinc-finger motifs that recognize specific promoter sequences, they interact with DNA in a sequence-independent manner [41,66]. The loss of specificity is probably caused by changes in the three-dimensional structure of the DNA-binding domain, which are due mainly to the positioning of cysteine residues. Several plant transcription factors possess both specific and nonspecific DNA-binding domains, with the latter occasionally necessary for transactivation of target genes, as shown for VP1 (VIVIPAROUS 1), a regulator of plant genes including wheat Em. VP1 has a weak nonspecific DNA-binding domain termed BR2 (B2) and a sequence-specific binding domain, BR3 (B3), that recognizes Sph elements (CATGCATG) found in Em and a few other plant genes. Interestingly, VP1 activates Em via BR2 instead of BR3. This BR2-dependent regulation requires members of the 14-3-3 family, originally found as soluble proteins in extracts of mammalian brain. The 14-3-3 proteins lack DNA-binding domains but link VP1 and a cis-element-specific transcription factor such as EmBP1 (Em-binding protein 1) to the promoter region of Em [32,69].
Secondary structure in DNA-binding domains seems to affect their affinity and selectivity. In this context, some neutral amino acid residues, such as proline and glycine, are important [40,70]. For example, the C-terminal DNA-binding domain of the rice trihelix factor, GT-2 (GT2-box-binding factor), loses its activity when helix-breaking prolines are substituted for other amino acid residues. Moreover, probably because of inhibition of α-helix formation, a proline–glycine pair in the center of the basic DNA-binding region of some bHLH (basic helix–loop–helix)-type transcription factors is apparently essential for recognition of the E2F cis-element (TTT[G/C][G/C]CGC), while preventing interaction of these bHLH factors with the cis-element, E-box (CANNTG; N = A, T, G, C).
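The degenerate notation used for the two cis-elements above maps directly onto regular expressions. The following is a minimal sketch of how such sites might be located in a promoter string; the helper name `find_motif` and the test sequence are invented for illustration, and a sequence match says nothing about binding affinity in vivo.

```python
import re

# Degenerate cis-element patterns taken from the text (N = A, T, G or C).
# The lookahead wrapper lets overlapping sites be reported as well.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
E2F = re.compile(r"(?=(TTT[GC][GC]CGC))")

def find_motif(pattern, sequence):
    """Return (0-based start, matched site) pairs for every occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Hypothetical promoter fragment, invented for illustration only.
promoter = "AACACGTGTTTTGGCGCAACAGCTGA"

e_box_hits = find_motif(E_BOX, promoter)
e2f_hits = find_motif(E2F, promoter)
print("E-box sites:", e_box_hits)
print("E2F sites:", e2f_hits)
```

Only the forward strand is scanned here; a fuller search would also scan the reverse complement, because E2F-type elements are not palindromic.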
Usually, each plant transcription factor has only one type of DNA-binding domain, occurring in either single or multiple copies. For instance, most plant Myb-related proteins have two Myb domains, whereas the potato MybSt1 (Myb Solanum tuberosum 1) and the Arabidopsis CCA1 each contain only one copy. One-, two- or three-fingered DNA-binding motifs occur for C2C2 and C2H2 zinc-finger transcription factors [50,72], and AP2 (APETALA2) factors may exhibit two DNA-binding domains connected by a conserved sequence. In plants, each HMG-1/2 type factor possesses one HMG-1/2 DNA-binding domain, whereas an HMG-I/Y type factor has either four or seven repeated HMG-I/Y sites. The repeated DNA-binding domains in animal transcription factors can interact co-operatively with the same cis-acting element on a target gene. As one example, the second (R2) and third (R3) DNA-binding domains of Myb-related proteins make sequence-specific DNA contacts; they are closely packed in the major groove of the DNA with their helices in contact, and they bind bases co-operatively. These types of data are not generally available for plant transcription factors, but similar DNA-binding domains may have different specificities, as demonstrated for GT-2 in rice. The sequence-related trihelix motifs located in the C- and N-terminal halves of GT-2, respectively, bind preferentially to GT2-bx (GGTAATT) and GT3-bx (GGTAAAT), GT-box motifs of the rice phytochrome A gene. Although much information is available, there is an urgent need to further characterize this critical domain if plant transcription factors are to be understood as well as these proteins are in animals.
Many plant transcription factors form hetero- and/or homo-oligomers, affecting DNA-binding specificity, the affinity of transcription factors for promoter elements [73,74] and nuclear localization. Oligomers are stabilized either by hydrophobic interactions between coiled coils and β sheets, or by reactions between hydrophilic residues, wherein the alignment of residues affects oligomerization by altering ionic environments. Amino acid sequences of transcription factor oligomerization domains are usually highly conserved, each type yielding, in combination with DNA-binding regions, discrete three-dimensional arrangements (Table 1, Fig. 2). The oligomerization domain of bZIP-type factors is characterized by several regularly spaced leucine residues and by a zipper-like structure. In b/HLH type factors, a helix–loop–helix composition appears, while MADS factors have oligomerization domains that form two α helices and two β-pleated sheets. Transcription factors of the same family may differ in oligomerization domain length. For most bZIP factors, the leucine zipper is composed of four or five heptad repeats, but in Arabidopsis ATB2, it contains nine repeats. In other cases, dissimilar regions outside oligomerization domains influence subunit association, as shown for MADS transcription factors in which a keratin-like domain termed K and an intervening domain designated I are essential for interaction. These variations in oligomerization increase the versatility of the transcription machinery, and they have the capacity to modulate gene expression in plants.
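The "regularly spaced leucine residues" of a bZIP zipper recur every seventh position, so the number of heptad repeats can be estimated from sequence alone. The sketch below illustrates that counting; the function name and the toy peptide are hypothetical, and the strict requirement for leucine is a simplifying assumption, since real zippers tolerate other hydrophobic residues at some heptad positions.

```python
def longest_leucine_heptad_run(protein):
    """Return the largest number of leucines (L) found at successive
    positions spaced exactly seven residues apart."""
    protein = protein.upper()
    best = 0
    for start in range(len(protein)):
        count = 0
        pos = start
        # Walk forward one heptad at a time while a leucine is present.
        while pos < len(protein) and protein[pos] == "L":
            count += 1
            pos += 7
        best = max(best, count)
    return best

# Invented zipper-like peptide: four heptads, each beginning with leucine.
toy_zipper = "LAAAAAA" * 4 + "GSG"
print(longest_leucine_heptad_run(toy_zipper))
```

Applied to a zipper with nine repeats, such as the one reported for ATB2, the same function would be expected to return nine, provided every heptad carries leucine at the counted position.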
Transcription regulation domains
Transcription factors of the same family generally have distinct actions because of differences in their regulation domains, regions of the proteins that tend to diverge from one another. Regulation domains, and hence transcription factors, function as either repressors or activators, depending on whether they inhibit or stimulate the transcription of target genes.
Repression of gene expression may occur via exclusion of activators from target promoters by competitive binding between transcription factors for the same cis-acting element. Other possible mechanisms include masking of regulation domains by dimerization of transcription factors, as well as interaction of repression domains with transcription factors. No evidence for competition between plant transcription factors is available, but data on the latter two processes are emerging. For example, two bZIP proteins from rice, namely OsZIP-2a and -2b, dimerize in vitro with the wheat bZIP factor EmBP1, preventing its interaction with target promoters. The observations indicate that these rice bZIP factors are able to quench other bZIP proteins, but definitive conclusions can only be drawn when they are tested more extensively. Several observations suggest the existence of repression domains in plant transcription factors, but they remain poorly characterized. PvALF (Phaseolus vulgaris ABI-3-like factor) activates transcription from selected genes, including the French bean phytohemagglutinin gene DLEC2. ROM2 (regulator of maturation-specific protein 2), a bZIP protein, binds to the enhancer site of DLEC2 and represses PvALF-activated transcription of the gene, but this ability is lost after the portion of the molecule N-terminal to its bZIP domain is deleted. Interestingly, the truncated protein binds to the enhancer site, and if it is joined to the activation domain of PvALF, the chimeric protein activates DLEC2. This indicates that a repression domain in the N-terminal half of ROM2 inhibits the PvALF-activated transcription of DLEC2.
Activation domains of plant transcription factors often exhibit sequence divergence, although the GCB motif found in many HBP-1a/GBF (histone promoter-binding protein-1a/G-box-binding factor) type bZIP factors is an exception. The GCB (GBF-conserved box) motif, with the consensus sequence NLNIGMDXW, activates reporter genes when fused to the yeast GAL4 DNA-binding domain. Modulation of gene expression by truncated transcription factors fused to the yeast GAL4 DNA-binding domain revealed that activation domains are enriched in either acidic amino acids, proline or glutamine, although this is not necessarily important for function [80–82]. Site-directed mutagenesis of residues in the activation domain of the maize Myb-like transcription factor, C1, demonstrated that only one of 11 acidic residues, namely aspartate 256, is essential. Leucine 253 is also involved in activation of transcription, and modification of other amino acids in the domain had no effect, indicating that single strategically placed residues determine activation. Study of single amino acid changes in the activation domain of VP16, a herpes simplex virus transcription factor, and in the general transcription factors, TBP (TATA-binding protein) and TFIIB, also indicates that activation domain function depends on interactions between individual residues.
Amphipathic α-helices within acidic activation domains may be important functionally, but introduction of helix-incompatible amino acid residues into this region has little impact on C1. Moreover, the acidic activation domain of solubilized VP16 lacks an α-helix. β-Sheet in proline-rich activation domains was also thought to be required for function, but disruption of this secondary structure is without effect on HBP-1a(17) action. In comparison with the results just described, the proline-rich region of HBP-1a(17) has several trans-activation modules that function when separate from one another, but not when they are present as a single unit. As a consequence of this finding, it was proposed that intramolecular interactions cause conformational changes, thereby modulating activation potency. Comparison of plant transcription factor regulation domains with corresponding regions of related proteins from other organisms is required and will provide valuable insight.
Nuclear localization signals
As for proteins from other organisms that selectively enter the nucleus, plant transcription factors contain NLSs characterized by a core peptide enriched in arginine (R) and lysine (K) [65,83–88]. The action of the basic core is influenced by flanking residues. Within plant transcription factors, the NLSs vary in sequence, organization and number (Table 2). There may be a single NLS in which the basic residues are closely associated (SC, Table 2) or a single NLS, wherein the basic residues form two functionally important groups separated by several non-conserved residues, the so-called bipartite structure (SB, Table 2). Other transcription factors contain multiple copies of the NLS [65,84–86] which may be functionally independent [84,86,87] and either clustered or dispersed within the protein (Table 2). In the maize homeobox-finger proteins, ZmHox2a and 2b, for instance, eight putative bipartite NLS repeats are arranged in tandem at the N-terminus of each protein. On the other hand, the rice and Arabidopsis trihelix factors GT-2, the wheat homeodomain factor KNOTTED1, and the maize bZIP factor Opaque2 possess two functionally independent NLSs, either located within or exterior to DNA-binding domains.
|NLS type||NLS amino acid sequence||Transcription factor||Location of NLS||Reference|
|SC||KRIAEGSKKRRIKQD*||Tomato HSFA1||Oligomerization domain|||
|SB||RKDKQRIEVGQKRRLTM*||Tomato HSFA2||Oligomerization domain|||
|DB||KKCKEKFENVHKYYKRTK*||Arabidopsis, rice GT-2||N-terminal trihelix domain|||
|||KRCKEKWENINKYFKKVK*||Arabidopsis, rice GT-2||C-terminal trihelix domain|||
|||RKRKESNRESARRSRYRK*||Maize Opaque2||Basic region of bZIP domain|||
|||RRKLEEDLEAFKMTR*||Maize Opaque2||Vicinity of activation domain|||
|MC||ERSKKRSRE**||Tobacco bZIP TAF-1||N-terminal, unspecified region|||
|||ERELKREKRKQ**||Tobacco bZIP TAF-1||Basic region of bZIP domain|||
|||ARRSRLRKQ**||Tobacco bZIP TAF-1||Basic region of bZIP domain|||
|MB||PAA.KRK….S…SP…VRVLRS**||ZmHox2a and 2b||N-terminus, unspecified region|||
Using site-directed mutagenesis, the NLS within the DNA-binding area of Opaque2 was shown to retain its nuclear translocation capability, even when DNA-binding ability is lost. Therefore, these two activities are independent of one another. Other experiments indicate that the functional competence of NLSs varies. As one example, the NLS within the DNA-binding domain of Opaque2, when fused to β-glucuronidase, translocates the reporter protein into the nucleus more efficiently than does its companion NLS. In addition, although two putative NLSs are found in tomato heat-shock-responsive transcription factors HSFA1 and 2, only the NLS adjacent to the oligomerization domain (Table 2) supports translocation. Some plant transcription factors may lack an NLS, and they are thought to be imported into the nucleus by dimerizing with proteins that possess these signals.
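The enrichment of arginine and lysine in NLS cores can be tallied directly from the sequences in Table 2. The sketch below does this for the two tomato HSF signals; the function names are hypothetical, and the tally is only descriptive. It does not reproduce the SC/SB classification, which rests on functional data rather than on residue counts.

```python
import re

# NLS sequences copied from Table 2 (footnote asterisks removed).
TABLE2_NLS = {
    "Tomato HSFA1 (SC)": "KRIAEGSKKRRIKQD",
    "Tomato HSFA2 (SB)": "RKDKQRIEVGQKRRLTM",
}

def basic_fraction(peptide):
    """Fraction of residues that are arginine (R) or lysine (K)."""
    return sum(residue in "KR" for residue in peptide) / len(peptide)

def basic_clusters(peptide, min_len=2):
    """Runs of at least min_len consecutive R/K residues, in sequence order."""
    return re.findall(r"[KR]{%d,}" % min_len, peptide)

for name, peptide in TABLE2_NLS.items():
    print(name, round(basic_fraction(peptide), 2), basic_clusters(peptide))
```

The `min_len` threshold of two residues is an arbitrary choice for illustration; no cut-off is given in the text.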
Evolution of transcription factor genes
Transcription factor genes of the same family but from diverse eukaryotic organisms show structural and functional similarity, suggesting that they evolved from a common ancestor. Gene duplication undoubtedly played an important role during this evolution, an idea supported indirectly by the observation that related pairs of maize homeodomain knox genes reside in duplicated regions of the genome. After duplication, transcription factor gene distribution may be altered through translocation, and related family members are either dispersed throughout the genome or clustered on one chromosome [55,90,91].
Sequence alignment of transcription factor genes indicates that nucleotide substitution played a central role in the evolution of conserved regions, whereas substitutions and small insertions/deletions contributed to variable region diversification. In addition, exon capture through recombination of different genes or parts thereof formed new transcription factor genes [14,54]. Strong evidence for exon capture was obtained by demonstration of spontaneous fusion between a metabolic enzyme-encoding gene and a homeodomain factor gene. Sequence comparisons suggest that homeodomain leucine zipper genes, homeodomain ring-finger genes, b/HLH leucine zipper genes and HMG-finger genes originated through exon capture.
The basic helix–loop–helix DNA-binding/oligomerization sequences in the myc-like R genes of seven monocot and dicot species have a much lower nucleotide replacement rate than other regions. Similarly, nonsynonymous nucleotide substitution within sequences encoding the DNA-binding/oligomerization domain of MADS genes is significantly less frequent than in other areas. Regions of transcription factor genes that encode DNA-binding/oligomerization domains thus appear to diverge at reduced rates, and one explanation is that these regions are critical for function, even though they constitute less than half of each gene. Mutations in these positions are usually detrimental and they are eliminated by natural selection. Clarification of evolutionary events and their importance in the design of these proteins awaits the characterization of more plant transcription factor gene families.
Regulation of transcription factor genes by transcriptional and post-transcriptional mechanisms
Biological and physical aspects of transcription factor gene control
Plant transcription factor genes may be expressed either constitutively or in organ-limited [12,50,93], stimulus-responsive [31,38,94–98], development-dependent [61,78,99,100] and cell-cycle-specific manners. Although members of transcription factor gene families can differ in their time of expression, they may be transcribed simultaneously and co-ordinately, as shown for three Brassica GBFs, designated BnGBF1a, 1b and 2a. Their genes are transcribed at a constant ratio (1a > 2a > 1b) in various organs and developmental stages, with mRNA pools largest in photosynthetically active organs such as leaves and cotyledons.
Light [38,97], hypoxia [31,96], non-freezing low temperature [94,96], salt stress, abscisic acid [96,98] and gibberellic acid individually influence transcription factor gene expression. In addition, plant transcription factor genes respond to multiple environmental signals, as shown for mLIP15 (maize low temperature-induced protein 15), a maize bZIP factor that binds to the promoters of wheat histone gene H3 and the low-temperature-inducible gene Adh1 (alcohol dehydrogenase-1). The amount of mlip15 transcript increases dramatically upon exposure to reduced temperature, salt stress and exogenous abscisic acid, but neither heat shock nor drought affects expression of this gene. Equally interesting, members of a gene family are not necessarily responsive to the same stimulus; some genes of the bZIP family are regulated by light [53,102], while others respond to abscisic acid, auxin and salicylic acid [98,103–105]. The impressive range of effectors that modulate expression of transcription factor genes forecasts an equally large number of response mechanisms, and these are described in the following sections.
Expression kinetics of transcription factor genes
Coincidental accumulation of mRNA from plant transcription factor genes and their targets occurs in response to several stimuli, and consistent with a role in activation, the accumulation of many regulatory gene mRNAs precedes the expression of their effector genes [31,38,95,98]. In contrast, quantitative changes in transcripts from other regulatory and target genes are inversely correlated, indicating that products of the former are transcriptional repressors [53,106]. Members of the same plant transcription factor multigene family may be expressed differently. Examples are the bZIP factors, CPRF-1 (common plant regulatory factor-1), -2 and -3, which bind specifically to Box II, an important promoter region of the parsley light-responsive chalcone synthase gene (chs). CPRF-1 mRNA is made transiently when dark-grown parsley cells are exposed to light, suggesting that its product participates in light-mediated activation of the parsley chs gene. Concurrently, CPRF-3 mRNA decreases gradually, revealing an inverse correlation to the amount of chs mRNA. CPRF-2 mRNA is unaffected by light. These distinct kinetics imply differences in the regulatory capacity of each of the related transcription factors.
The relationships between regulatory and regulated genes exhibit other nuances. Some genes encoding transcription factors that recognize stimulus-responsive cis-acting elements are not regulated by the stimulus; the tobacco bZIP factor, TFHP-1, interacts specifically with the wounding-responsive cis-acting element of prxC2, a horseradish peroxidase gene, but transcription of the TFHP-1 gene is indifferent to leaf damage. Similarly, transcription of genes encoding GT-1 (GT-rich motif binding factor 1), and CGF-1 (chlorophyll a/b-binding protein gene GATA-box factor 1), factors that associate with GATA boxes of selected light-inducible genes, is constitutive rather than light-responsive. As one explanation for this behavior, the GT-1 and CGF-1 regulator genes may lack appropriate stimulus-responsive cis-acting elements. In another case of atypical regulation, fluctuation in the amount of regulator gene transcripts does not influence expression of putative target genes, as occurs for ZmHox (Zea mays homeobox) 1a and 1b. These maize transcription factors recognize the promoter of Sh1 (shrunken-1); however, this gene is unaffected by altering the expression of either ZmHox1a or ZmHox1b. Modulation of Sh1 potentially entails other stimulus-responsive regulators that co-operate with transcription factors, or the regulator genes may be controlled post-transcriptionally, as discussed below.
Regulation of transcription factor genes by cis-acting elements
Quantitative variations in transcription factor mRNA, achieved by suppression and over-expression experiments, cause substantial changes in plants. Therefore, accurate regulation of transcription factor genes by their cis- and trans-acting elements is potentially very important and several examples of this are known. The seed-specific expression of Atmyc1, a Myc-related transcription factor gene of A. thaliana, is conferred by a cis-regulatory element designated as the Sph box (CATGCATG). Because of common cis-acting elements, regulatory and regulated genes are sometimes influenced by the same transcription factors [68,111]. In addition, some transcription factor genes have cis-acting elements that are affected by their own products [68,97,111].
The reaction of a single plant transcription factor gene to more than one stimulus may depend upon multiple cis-elements. The regulatory sequence of the Arabidopsis HMG-I/Y gene contains putative binding sites for bZIP, Myc, Myb, GT-2, and HBP-1 transcription factors. The upstream sequence from position −142 to −132 (CGTCCATGCAT) of the maize transcription factor gene, C1, is essential for regulation by VP1, whereas a larger overlapping element, from −147 to −132 (CGTGTCGTCCATGCAT), modulates activation by abscisic acid. A separate light-sensitive cis-element, similar to those found in other genes that respond to light, is located between positions −116 and −59 of C1. These findings support the premise that differential expression of related transcription factor genes upon exposure to disparate environmental stimuli is due to cis-acting elements. However, the regulatory sequences of more transcription factor genes must be characterized before the general applicability of this rule to plants is certain.
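The coordinates and sequences quoted for the two overlapping C1 elements can be checked against one another with a few lines of arithmetic. The sketch below treats the coordinates as inclusive and confirms that the VP1-responsive element sits at the 3′ end of the abscisic acid-responsive element; the function and variable names are hypothetical and serve only this illustration.

```python
# Element sequences and inclusive coordinates as quoted in the text for maize C1.
VP1_ELEMENT = ("CGTCCATGCAT", -142, -132)
ABA_ELEMENT = ("CGTGTCGTCCATGCAT", -147, -132)

def span_length(start, end):
    """Number of nucleotides between two inclusive upstream coordinates."""
    return abs(end - start) + 1

def is_nested(inner, outer):
    """True if inner occurs in outer at the offset implied by the coordinates."""
    inner_seq, inner_start, _ = inner
    outer_seq, outer_start, _ = outer
    offset = inner_start - outer_start
    return outer_seq[offset:offset + len(inner_seq)] == inner_seq

lengths_agree = all(
    len(seq) == span_length(start, end)
    for seq, start, end in (VP1_ELEMENT, ABA_ELEMENT)
)
nested = is_nested(VP1_ELEMENT, ABA_ELEMENT)
print(lengths_agree, nested)
```

Both checks pass for the values given, so the abscisic acid-responsive element extends the VP1-responsive element by five nucleotides (CGTGT) on its 5′ side.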
Alternative mRNA splicing, a post-transcriptional mechanism for regulation of transcription factor genes in plants
Alternative splicing has been documented for mRNAs originating from several plant transcription factor genes [72,114,115]. An example is the maize P gene, which generates two transcripts by alternative splicing at the 3′ end of the precursor mRNA. The larger message of 1802 nucleotides encodes a 43.7-kDa protein with an N-terminal region showing 40% identity with the DNA-binding domain of several Myb family proteins. A smaller message of 945 nucleotides produces a 17.3-kDa protein that contains most of the Myb domain, but it differs from the first protein at the C-terminus. A second illustration of alternative splicing involves the constitutively transcribed LSD1 (lesion simulating disease resistance 1), an Arabidopsis zinc finger protein gene encoding a repressor that acts during pathogen-induced cell death. Alternative splicing of LSD1 mRNA affords major and minor transcripts, with one and two start codons, respectively. The latter message produces the more prevalent protein, while the former gives rise to a protein with five extra amino acid residues at its N-terminus. In yet another case, alternative splicing of the rice myb7 gene transcript occurs. One mRNA yields a protein containing a partial Myb domain followed by a leucine zipper motif encoded by an unspliced intron; the other mature transcript, with two introns spliced, furnishes a complete Myb factor.
Transcription factors derived from the same precursor mRNA by alternative splicing may have distinct regulatory functions. As a consequence, adjusting the ratio between mRNAs, as occurs for rice myb7 during anoxia stress, acts as a switch that affects expression of other genes. More comparative analyses of sibling transcripts must be performed in order to determine how important alternative mRNA splicing is to the synthesis and function of plant transcription factors.
Repression of plant transcription factor mRNA translation by upstream open reading frames (ORFs)
Short upstream ORFs, usually with start codons in poor context when compared with the eukaryotic consensus (A > G)CCATGG, occur in the leader sequences of transcripts originating from members of several plant transcription factor gene families, including Arabidopsis HMG-I/Y, maize B/R and bZIP, and rice myb. As one example, the upstream ORF of Arabidopsis HMG-I/Y mRNA, which encodes a polypeptide of 13 amino acid residues, ends 60 nucleotides upstream of the start codon for the transcription factor. Upstream ORFs in the mRNA transcripts from a few transcription factors repress translation from downstream ORFs. To demonstrate that expression of rice myb7 mRNA is repressed by its upstream ORF, the myb7 gene fragment containing the upstream ORF was fused upstream of a chloramphenicol acetyltransferase (CAT) reporter gene. In a transient expression system, CAT activity increased threefold when the start codon in the upstream ORF was mutated, suggesting that the upstream ORF represses translation of the myb7 mRNA ORF. Because upstream ORFs only exist in some members of a multigene family, translational repression by upstream ORFs varies the expression of related transcription factor genes, offering another mechanism for differential regulation of structural genes in plants.
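Upstream ORFs of the kind described above can be located by scanning a leader sequence for an ATG followed by an in-frame stop codon. The sketch below shows one way this might be done; the function name and the toy leader are invented, with the leader built only to mirror the geometry reported for Arabidopsis HMG-I/Y (a 13-residue upstream ORF ending 60 nucleotides before the main start codon). It is not the real sequence, and start-codon context is not evaluated.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_upstream_orfs(leader):
    """Return (start, end, peptide_length) for each ATG in the leader that is
    followed by an in-frame stop codon within the leader.
    Coordinates are 0-based; end is exclusive and includes the stop codon."""
    leader = leader.upper()
    orfs = []
    for start in range(len(leader) - 2):
        if leader[start:start + 3] != "ATG":
            continue
        # Step through codons in frame with this ATG until a stop is met.
        for pos in range(start + 3, len(leader) - 2, 3):
            if leader[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3, (pos - start) // 3))
                break
    return orfs

# Invented leader: ATG plus 12 alanine codons, a stop codon, then a 60-nt spacer.
toy_leader = "ATG" + "GCT" * 12 + "TGA" + "C" * 60

uorfs = find_upstream_orfs(toy_leader)
spacer = len(toy_leader) - uorfs[0][1]  # distance from uORF end to main ATG
print(uorfs, spacer)
```

Here the main ORF is assumed to begin immediately after the leader, so the spacer length is simply the number of nucleotides remaining after the upstream stop codon.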
Post-translational regulation of transcription factor activity
Effect of translocation on transcription factor activity
Plant transcription factors enter the nucleus by a selective process, which is a prerequisite for their function [116,117]. The nuclear pores of higher plants contain proteins that bind the NLS of transcription factors, and mutations in localization sequences impair interaction with these proteins. Some transcription factors, such as the Arabidopsis bZIP protein HY5 (hypocotyl 5), normally occur in the nucleus, but the location of many others is controlled by environmental stimuli [88,117]. Two pools of bZIP GBFs were found in dark-grown parsley by using a cell-free system prepared from protoplasts. Nuclear translocation of one pool was determined by light-induced cytosolic phosphorylation of the constituent transcription factors, while localization of the other seemed dependent on phosphorylation caused by a different stimulus. Phosphorylation also regulates nuclear import of many proteins by altering cytoplasmic retention factors, by affecting intra- and inter-molecular NLS masking, or by directly modifying the NLS. However, it is not known whether the translocation of plant transcription factors is controlled by these mechanisms.
As revealed by in-situ immunolocalization, several plant transcription factors move between cells via plasmodesmata, resulting in nonautonomous effects [120,121]. The details of this mechanism were reviewed recently and indicate that intercellular translocation plays an important role in the actions of transcription factors.
Post-translational modifications affect binding of transcription factors to DNA
Regulation of transcription factor binding to DNA via protein phosphorylation and dephosphorylation may determine the expression of many target genes, including those that encode transcription factors. The interaction of some transcription factors with DNA is abolished by dephosphorylation [122,123] and stimulated by phosphorylation [122,124], whereas the reverse is true for others [125–127], and still others are unaffected. It is possible that several kinases influence the binding of various plant transcription factors to DNA, but only casein kinase II and serine kinases [124,126,127] have been characterized in this light. These kinases function in the cytoplasm, the nucleus, or other organelles, where they may associate with the transcription machinery. Both external and internal stimuli affect the regulatory mechanisms. For instance, serine residues in the DNA-binding domain of the bZIP transcription factor HBP-1a(17) are phosphorylated in a Ca2+-dependent manner, while phosphorylation of another bZIP factor, Opaque2, is controlled by a circadian-clock-related mechanism.
Functional domains within many plant transcription factors have been characterized, but much remains to be learned about these proteins. The cloning and sequencing of more plant transcription factor genes is necessary, as is examination of their expression, either during development or upon exposure to external stimuli. X-ray crystallography and NMR spectroscopy will provide essential data, revealing in high resolution the structure of transcription factors and their interaction with DNA. Most known transcription factors possess only one distinct DNA-binding/oligomerization region; however, a few contain two, and some have unusual sequences such as an RNA-binding site or a G-protein β-subunit homologous domain. A demonstration of the existence of additional novel transcription factors and their biological significance awaits further study. Clarification of transcription factor gene evolution and chromosomal distribution will increase our knowledge of organism diversification and, from a practical perspective, advance genetic engineering. In addition, identification of cis- and trans-acting elements associated with plant transcription factors will reveal mechanisms that regulate their expression. Examination of post-transcriptional and post-translational controls of transcription factor gene expression will further elucidate the biological consequences of alternative mRNA splicing, translational repression, modification of DNA-binding activity and differential translocation on transcription factor activity. Clearly, control of transcription factor gene expression and function involves an important network of interrelated processes. Discerning the biophysical and biochemical bases of these processes is a major challenge, as well as an excellent opportunity to make fundamental contributions to our understanding of plant cell/molecular biology.
We thank Drs R. Mackay, B. Pohajdak, D. Richardson, S. Douglas, and Y. Pan for their critical comments. We are grateful for support from Saint Mary’s University. This work was funded by grants from the Natural Sciences and Engineering Research Council of Canada to M.J.W. and T.H.M.