*For correspondence (fax +1 212 995 4204; e-mail firstname.lastname@example.org). †Current address: Roanoke College Department of Biology, 221 College Lane, Salem, VA 24153, USA.
Mutations at the SCARECROW (SCR) locus in Arabidopsis thaliana result in defective radial patterning in the root and shoot. The SCR gene product contains sequences which suggest that it is a transcription factor. A number of Arabidopsis Expressed Sequence Tags (ESTs) have been identified that encode gene products bearing remarkable similarity to SCR throughout their carboxyl-termini, indicating that SCR is the prototype of a novel gene family. These ESTs have been designated SCARECROW-LIKE (SCL). The gene products of the GIBBERELLIN-INSENSITIVE (GAI) and the REPRESSOR of ga1–3 (RGA) loci show high structural and sequence similarity to SCR and the SCLs. Sequence analysis of the products of the GRAS (GAI, RGA, SCR) gene family indicates that they share a variable amino-terminus and a highly conserved carboxyl-terminus that contains five recognizable motifs. The SCLs have distinct patterns of expression, but all of those analyzed show expression in the root. One of them, SCL3, has a tissue-specific pattern of expression in the root similar to SCR. The importance of the GRAS gene family in plant biology has been established by the functional analyses of SCR, GAI and RGA.
Identification of gene families is an important first step in elucidating the common molecular mechanisms by which members of the family function and in establishing the biochemical structures and interactions responsible for their activities. Sequence information is routinely used to identify specific functional domains. Sequence comparisons can also identify residues potentially vital for the function of the gene products, based on their absolute conservation in all members of the family. The effects of mutations at these sites may then be determined through reverse genetics.
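The idea of flagging residues by their absolute conservation across a family can be illustrated with a minimal sketch. It assumes the sequences have already been aligned to equal length; the function name `conserved_columns` and the toy alignment are hypothetical and are not taken from the GRAS data.

```python
def conserved_columns(alignment, gap="-"):
    """Return (index, residue) for every column identical in all sequences."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        # A column counts only if one residue (not a gap) fills it entirely.
        if len(column) == 1 and gap not in column:
            conserved.append((i, alignment[0][i]))
    return conserved


# Hypothetical toy alignment, for illustration only.
toy_alignment = [
    "MKVHIIDLP",
    "MRIHLVDLP",
    "MKLHVIDFP",
]
print(conserved_columns(toy_alignment))
```

Positions reported by such a scan would be natural candidates for site-directed mutagenesis, although conservation alone does not demonstrate function.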
We report the molecular analysis of a novel plant gene family in Arabidopsis thaliana. The first member of this family, SCARECROW (SCR), was isolated as the result of a screen for mutations that affect root development (Benfey et al. 1993). Mutations at the SCR locus disrupt radial patterning of the root, resulting in the loss of a layer of ground tissue (Scheres et al. 1995). The predicted SCR gene product contains a number of putative domains which strongly suggest that SCR functions as a transcription factor (Di Laurenzio et al. 1996). A comparison of the predicted SCR sequence with sequences present in the databases revealed that several Arabidopsis Expressed Sequence Tags (ESTs) encode gene products with homology to a region termed the VHIID domain (Di Laurenzio et al. 1996). Subsequently, we have derived the full-length sequences of these and other ESTs and discovered that their putative gene products show significant sequence similarity to SCR and to each other throughout their carboxyl (C)-termini. This highly conserved region does not show significant similarity to members of any recognized gene family, indicating that these sequences define a novel gene family whose members we have called SCARECROW-LIKE (SCL). Recently, the importance of this family has been confirmed through the molecular analysis of two components of the gibberellin (GA) signal transduction pathway. The gene products of the GIBBERELLIN-INSENSITIVE (GAI) and the REPRESSOR of ga1–3 (RGA) loci have been shown to be members of this family (Peng et al. 1997; Silverstone et al. 1998). For the family as a whole, we will use the acronym GRAS, based on the locus designations of these three genes (GAI, RGA, SCR).
At present the GRAS family includes 19 members in Arabidopsis. Here, we report the deduced amino acid sequences of the SCL gene products that we have sequenced, in addition to the expression of these sequences in Arabidopsis. Intriguingly, the majority of the SCL genes are expressed predominantly in the root, and one of these (SCL3) has a tissue-specific expression pattern in the root that is similar to that of SCR. The fact that the SCR, GAI and RGA gene products have diverse roles in fundamental processes in plant biology (SCR in pattern formation and GAI/RGA in signal transduction) suggests that other members of this family may also play important roles in the physiology and development of higher plants.
Results and Discussion
Identification of the SCLs
Three Arabidopsis ESTs whose predicted gene products bear striking similarity to SCR in a region termed the VHIID domain ( Di Laurenzio et al. 1996 ) were sequenced in their entirety. Comparisons of these sequences (now designated SCL1, SCL3 and SCL5) with SCR indicated that the similarity among the predicted gene products extended beyond the VHIID domain, in both the N- and C-terminal directions. Additional Arabidopsis ESTs were identified on the basis of their similarity to these highly conserved sequences ( Table 1), and several were sequenced in their entirety (SCL6, SCL7, SCL8, SCL9, SCL11, SCL13 and SCL14).
Table 1. Accession numbers and map positions of the GRAS sequences in Arabidopsis
‘E’ indicates an EST or a BAC end sequence. ‘Com’ indicates a complete EST sequence. ‘G’ indicates a genomic sequence.
Database searches have also identified eight genomic sequences that potentially belong to this family (Table 1). Six of these are represented in the EST database, of which three (SCL6, SCL9 and SCL13) correspond to ESTs that we have sequenced. Another three (SCL4, SCL15 and SCL19) are represented by ESTs that were not initially identified as members based on the partial sequence available. Two of the genomic sequences (SCL16 and SCL18) are not represented by ESTs and must be considered tentative members of the family. Unlike SCR (which contains a single intron), the genomic sequences for SCL4, SCL6, SCL9 and SCL15 all appear to contain a single open reading frame encompassing all of the motifs present within the GRAS family.
The GRAS gene products share significant similarity throughout their C-termini, beginning at approximately 110 amino acid residues N-terminal of the highly conserved VHIID sequence and continuing throughout the C-terminal portion of the predicted products ( Fig. 1). This extensive sequence similarity can be subdivided into five distinct sequence motifs, found in the following order: leucine heptad repeat I (LHR I); the VHIID motif; leucine heptad repeat II (LHR II); the PFYRE motif; and the SAW motif ( Fig. 1).
The Leucine Heptad Repeats. The leucine heptad repeats (LHR I and LHR II) are unusual in structure. LHR I appears to consist of two repeat units (A and B in Fig. 1b) that are separated by a spacer that often contains a proline residue, known to disrupt alpha-helical structures. The two units within LHR I are not in phase with each other. LHR IA is similar to LHRs found in other proteins, consisting of three to five regular heptads. LHR IB is shorter, usually consisting of only two such repeats. LHR II is also unusual: although specific leucine heptad repeats can be identified in this region in nearly all members of the family, the number of repeats is small, usually two or three. The presence of leucine heptad repeats in the GRAS proteins suggests that these gene products may function as multimers (Hurst 1994). The presence of four possible (albeit unusual) LHRs in some of the members suggests a potentially complicated higher order of interaction.
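As a rough illustration of how heptad-spaced leucines and their phase might be located, the sketch below scans for leucines recurring every seventh residue. It is a simplification under stated assumptions: only leucine is accepted at the repeat position (other hydrophobic residues are ignored here), and the function name `leucine_heptads` and the toy sequence are hypothetical, not the procedure used to annotate Fig. 1.

```python
def leucine_heptads(seq, min_repeats=2):
    """Return (start, count) for maximal runs of leucines spaced 7 residues apart."""
    seq = seq.upper()
    runs = []
    for start, residue in enumerate(seq):
        if residue != "L":
            continue
        # Skip leucines that only continue a run begun 7 residues earlier.
        if start >= 7 and seq[start - 7] == "L":
            continue
        count = 1
        pos = start + 7
        while pos < len(seq) and seq[pos] == "L":
            count += 1
            pos += 7
        if count >= min_repeats:
            runs.append((start, count))
    return runs


# Hypothetical toy sequence: three in-phase leucines, a proline-containing
# spacer, then two leucines in a different phase.
toy = "LAAAAAA" * 2 + "LAAAPAAAA" + "LAAAAAA" + "L"
for start, count in leucine_heptads(toy):
    print(f"start={start} repeats={count} phase={start % 7}")
```

Comparing `start % 7` between runs gives a simple check of whether two repeat units are in phase with each other.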
The VHIID Motif. The VHIID sequence is readily recognizable in all members of the family, although it is not absolutely conserved: substitutions of valine, isoleucine and leucine at the 1, 3 and 4 positions yield a number of permutations. Within the larger region that we term the VHIID motif, the P-N-H-D-Q-L residues are absolutely conserved (Fig. 1). The spacing between the proline and asparagine residues is identical among all members, as is the spacing between the histidine, aspartate, glutamine and leucine residues. The VHIID motif is bounded at its C-terminus by a conserved sequence referred to as LRITG for simplicity (Fig. 1). Most of the deviations from this consensus sequence represent conservative changes.
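The permitted substitutions in the core VHIID sequence can be written as a simple pattern. The sketch below assumes that valine, isoleucine or leucine may occupy positions 1, 3 and 4 while H and D are fixed; it enumerates every variant the pattern allows, which is not a claim that all of them occur in the family. The names `VHIID_PATTERN` and `find_vhiid` and the toy sequence are hypothetical.

```python
import re
from itertools import product

# V, I or L at positions 1, 3 and 4; H and D fixed (an assumed reading of the text).
VHIID_PATTERN = re.compile(r"[VIL]H[VIL][VIL]D")


def find_vhiid(seq):
    """Return (start, matched_text) for each VHIID-like pentapeptide."""
    return [(m.start(), m.group()) for m in VHIID_PATTERN.finditer(seq.upper())]


# Every pentapeptide the pattern permits (3 x 3 x 3 combinations).
variants = ["".join((a, "H", b, c, "D")) for a, b, c in product("VIL", repeat=3)]

# Hypothetical toy sequence, for illustration only.
toy = "MSTVHIIDLQAAIHVLDKK"
print(find_vhiid(toy))
print(len(variants))
```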
The PFYRE Motif. The PFYRE motif is not as well conserved at the sequence level as are the VHIID and SAW motifs (only the P is absolutely conserved) (Fig. 1). Within the PFYRE domain, however, the sequences are largely co-linear and portions of this region show a high degree of sequence similarity among all members of the family.
The SAW Motif. The SAW motif is characterized by three pairs of absolutely conserved residues: R-E, W-G and W-W (Fig. 1). The W-W pair found nearly at the C-terminus of these sequences shows absolute conservation of spacing, as does the W-G pair. The spacing between the W-G and W-W pairs, however, is not conserved.
Those GRAS gene products for which N-terminal sequence data exists beyond that shown in Fig. 1 do not contain significant similarity among their N-termini, except in the case of GAI/RGA/RGAL ( Peng et al. 1997 ; Silverstone et al. 1998 ; Truong et al. 1997 ). The SCLs for which N-terminal sequence is available (SCL4, SCL6, SCL8, SCL9, SCL14 and SCL15) do not show any significant similarity to each other, to SCR, or to GAI/RGA/RGAL in this region (data not shown). The one common feature is that most of them contain homopolymeric stretches of certain amino acid residues (S, T, P, Q, G, E and/or H).
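Homopolymeric stretches of the residues listed above can be located with a back-referencing pattern. This is a minimal sketch: the minimum run length of four is an assumption made for illustration (the text does not define a threshold), and the function name `homopolymer_runs` and the toy sequence are hypothetical.

```python
import re


def homopolymer_runs(seq, residues="STPQGEH", min_len=4):
    """Return (start, residue, length) for runs of a single listed residue."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # ([...]) captures one residue; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([%s])\1{%d,}" % (re.escape(residues), min_len - 1))
    return [(m.start(), m.group(1), len(m.group()))
            for m in pattern.finditer(seq.upper())]


# Hypothetical toy N-terminus, for illustration only.
toy = "MASSSSSQPLQQQQHHTG"
print(homopolymer_runs(toy))
```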
In summary, the GRAS gene products are characterized by a variable N-terminal region and a highly conserved C-terminal region. Importantly, the order of these motifs within each protein is the same. While the functions of the VHIID, PFYRE and SAW motifs are currently unknown, the absolute conservation of the residues in the VHIID and SAW motifs indicates that these residues are required for the activity of the GRAS gene products.
The GAI and RGA gene products also contain a sequence that fits the consensus sequence (LXXLL) demonstrated to mediate the binding of steroid receptor co-activator complexes to nuclear receptors ( Heery et al. 1997 ; Peng et al. 1997 ; Silverstone et al. 1998 ; Torchia et al. 1997 ). Sequences conforming to this consensus are also found in SCL4, SCL6, SCL15, RGAL and SCR. The significance of this sequence in plants is unknown.
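A scan for the LXXLL consensus, where X is any residue, can be sketched as follows. A lookahead is used so that overlapping occurrences are also reported; the function name `find_lxxll` and the toy sequence are hypothetical.

```python
import re

# Lookahead so that overlapping LXXLL occurrences are all reported.
LXXLL = re.compile(r"(?=(L[A-Z]{2}LL))")


def find_lxxll(seq):
    """Return (start, matched_text) for each LXXLL occurrence."""
    return [(m.start(), m.group(1)) for m in LXXLL.finditer(seq.upper())]


# Hypothetical toy sequence, for illustration only.
toy = "MKLHELLSTAALQRLLG"
print(find_lxxll(toy))
```

Because the consensus is only five residues long, with two positions unconstrained, matches are expected by chance in many proteins; a hit is therefore only a starting point for functional tests.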
The combination of motifs present in the GRAS family members suggests that they may act as transcriptional regulatory proteins. It is tempting to hypothesize further that the N-termini of the GRAS proteins function as activation domains: the variability of these sequences may result in the ability to mediate a number of different interactions with the basic transcriptional machinery and accessory proteins. The LHR I-VHIID-LHRII region may function as a DNA-binding domain, analogous to the bZIP protein–DNA interaction ( Ellenberger et al. 1992 ), with the LHRs mediating protein–protein interactions and the VHIID motif mediating protein–DNA interactions.
Comparison of conserved motifs among members of the GRAS family suggested that they could be grouped into distinct subsets. To determine the evolutionary relationship among these genes, the highly conserved sequences spanning the five motifs from the Arabidopsis members were analyzed by heuristic and bootstrap analyses to determine maximum parsimony. In the resulting phylogram, several distinct groups can be distinguished. These include: SCL11/14/9, SCL13/5/1, SCL4/7, SCL6/15 and GAI/RGA/RGAL/SCL19 ( Fig. 2). Three members, SCR, SCL3 and SCL8, do not group with any of the other sequences. The trees derived from analyses of amino acid and nucleotide sequences were nearly identical. Distance-based analyses yielded similar results.
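To make the notion of a distance-based comparison concrete, the sketch below computes the uncorrected proportion of differing sites (p-distance) for each pair of aligned sequences. This is one simple measure chosen for illustration and is not necessarily the distance used in the analyses reported here; the function names and the toy alignment are hypothetical.

```python
def p_distance(a, b, gap="-"):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = differences = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(alignment):
    """Return {(name1, name2): p-distance} for every unordered pair of names."""
    names = sorted(alignment)
    return {(a, b): p_distance(alignment[a], alignment[b])
            for i, a in enumerate(names) for b in names[i + 1:]}


# Hypothetical toy alignment, for illustration only.
toy = {"seqA": "VHIIDLP", "seqB": "VHIIDFP", "seqC": "IHLVDFA"}
for pair, d in distance_matrix(toy).items():
    print(pair, round(d, 3))
```

Sequences separated by small distances would be expected to fall into the same group in a distance-based tree.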
Database searches identified putative GRAS family sequences in other plant species (rice, oat, alfalfa, maize, watermelon and Brassica napus). Recently, the Lateral suppressor (Ls) gene in tomato has been shown to be a new member of the GRAS family ( Schumacher et al. 1999 ). However, to date no significant similarity to the GRAS gene family has appeared among the sequences in any non-plant genome, including the fully sequenced yeast genome. Thus the GRAS gene family, like the AP2 family ( Okamuro et al. 1997 ; Weigel 1995), appears to be restricted to higher plants.
RNA gel blot analyses
To begin to characterize the expression patterns of the SCL genes, RNA gel blot analyses of mRNA extracted from siliques, shoots and roots were performed. As can be seen in Fig. 3, all of the SCLs analyzed are expressed in the root. SCL6 and SCL9 appear to be root-specific. A majority of the others show the highest level of expression in the roots (SCL1 and SCL7 are the exceptions). Although the levels of the SCL transcripts cannot be compared directly on these films due to variable exposure times and (in some cases) amounts of mRNA loaded on the gels, most of the SCLs appear to have similar levels of expression. The notable exceptions to this are SCL1, SCL6, SCL7 and SCL9. Hybridization with the SCL1 probe reproducibly resulted in multiple bands ( Fig. 3). Moreover, the exposure time required for detection with the SCL1 probe was significantly shorter than that required for all of the other probes (minutes as opposed to hours). In contrast, detection of the SCL6, SCL7 and SCL9 transcripts required increased amounts of mRNA and longer exposure times, indicating significantly reduced levels of expression of these SCLs relative to the others. The SCL probes do not show any detectable cross-hybridization under the conditions used. This was initially indicated by the fact that Southern analyses using the SCL1, SCL3 and SCL5 probes resulted in a single hybridizing band in recombinant inbred analysis. Additionally, the unique messages detected using the SCLs as probes (with the exception of SCL1) vary in size: from 1.8 kb for SCL3 to over 3.0 kb for both SCL9 and SCL14. In summary, our expression analysis shows that many of the SCL sequences are expressed predominantly in the root, suggesting that a subset of these sequences may play important roles in root biology.
In situ analyses
RNA in situ hybridization with probes for SCL3 was performed in order to begin to establish tissue-specific expression patterns. SCL3 is expressed predominantly in the root endodermis, in a pattern strikingly similar to that of SCR (Fig. 4). This pattern does not result from cross-hybridization with SCR or with other SCLs. This conclusion is supported by several facts. Most importantly, SCL3 does not have significant stretches of absolute sequence homology at the nucleotide level with SCR or with any other member of the family. In addition, SCL3 probes routinely result in only single hybridization patterns on both Southern and Northern blots. Finally, as noted above, the sizes of the transcripts hybridizing to different SCL probes can be clearly distinguished. The fact that SCL3 is expressed in the root in a pattern very similar to SCR suggests that there exists a subset of SCLs involved in radial patterning and that SCL3 plays a role in endodermal specification, perhaps by regulating the expression of SCR or by being regulated by SCR.
Significance of the GRAS gene products
In spite of many unknowns, the significance of the GRAS family is beginning to be understood. The SCARECROW gene, the defining member of this family, is absolutely required for the proper radial patterning of the root and shoot in Arabidopsis ( Fukaki et al. 1998 ; Scheres et al. 1995 ).
Plants mutant at the GAI locus are reduced in stature and do not respond to applications of exogenous GA, indicating that the GAI protein is involved in GA perception and response ( Koornneef et al. 1985 ). GAI may act as a negative regulator of cell elongation. It has been hypothesized that in wild-type plants, GAI represses cell elongation in the absence of GA ( Peng et al. 1997 ). The phenotype of the rga mutants indicates that RGA also negatively regulates GA perception and response ( Silverstone et al. 1997 ). The N-termini of GAI and RGA are highly similar, indicating that they may act through similar mechanisms. Deletion of five amino acids (DELLA) in the N-terminus in GAI results in the dominant phenotype, implicating this region in GA perception and response ( Peng et al. 1997 ). Therefore, these three prototypical GRAS gene products (SCR, GAI and RGA) establish that the members of the GRAS family play important roles in plant biology.
DNA sequencing, alignments and phylogeny
The SCL ESTs were obtained from the Arabidopsis Stock Center (Columbus, OH, USA) with the exception of SCL6, which was kindly provided by Thierry Desprez (INRA, France). The plasmid DNA was prepared by alkaline lysis ( Sambrook et al. 1989 ) and sequenced using Sequenase 2.0 (United States Biochemicals), according to the manufacturer’s instructions. Sequences were translated using GeneWorks 2.0 (Oxford, UK) and aligned manually, based on alignments performed using GeneWorks and additional BLAST ( Altschul et al. 1997 ) searches. The sequences of the highly conserved region shown in Fig. 1 were analyzed using the PAUP program. Trees were obtained using both maximum parsimony (gaps informative) and minimum evolution (distance) of both the protein and nucleotide sequences. Bootstrap analyses confirmed that the branches were strongly supported (all clades occurred at a frequency greater than 0.75, with 1000 replicates).
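The principle behind bootstrap support can be sketched as follows: alignment columns are resampled with replacement, the grouping is re-derived from each replicate, and support is the fraction of replicates in which the grouping recurs. The sketch is not the PAUP procedure. As a loudly stated simplification, the "grouping" tested here is merely the closest pair of sequences by mismatch count rather than a clade in a parsimony tree, and all function names and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def closest_pair(alignment):
    """Return the sorted name pair with the fewest mismatches (ties: first pair)."""
    names = sorted(alignment)
    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            mismatches = sum(x != y for x, y in zip(alignment[a], alignment[b]))
            if best is None or mismatches < best[0]:
                best = (mismatches, (a, b))
    return best[1]


def pair_support(alignment, pair, replicates=1000, seed=0):
    """Fraction of replicates in which `pair` (names in sorted order) is closest."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        if closest_pair(bootstrap_alignment(alignment, rng)) == pair:
            hits += 1
    return hits / replicates


# Hypothetical toy alignment, for illustration only.
toy = {"seqA": "VHIIDLPNHDQL", "seqB": "VHIIDFPNHDQL", "seqC": "IHLVDFANHEQL"}
support = pair_support(toy, ("seqA", "seqB"))
print(f"support for seqA+seqB: {support:.2f}")
```

Under this scheme, a threshold such as 0.75 would simply mean that the grouping recurred in more than three quarters of the resampled alignments.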
The map positions of most of the sequenced Arabidopsis ESTs were determined using either the recombinant inbred lines (SCL1, SCL5) or PCR-based yeast artificial chromosome (YAC) library screening (SCL3, SCL6, SCL7, SCL8, SCL9, SCL13). Recombinant inbred mapping was performed as described previously (Di Laurenzio et al. 1996). For YAC library screening, primer pairs specific for each of the SCLs (18–21-mers) were obtained from Ransom Hill. These primer pairs were used in polymerase chain reactions with DNA from the CIC YAC library (Creusot et al. 1995) using protocols and conditions described by Camilleri et al. (1998). The map positions of seven of the SCL genomic clones (SCL4, SCL6, SCL9, SCL13, SCL15, SCL16, SCL18) are known as a result of the Arabidopsis genome sequencing projects (Bevan et al. 1998; Camilleri et al. 1998; Schmidt et al. 1995; Schmidt et al. 1997; Zachgo et al. 1996). SCL11 and SCL14 could not be placed on a YAC by the PCR-based method, and the map positions of SCL19 and RGAL are not known. The results of the mapping are summarized in Table 1.
RNA extraction and blot analysis
Total RNA was extracted from the roots and shoots of 14-day-old seedlings grown under standard sterile conditions (see Di Laurenzio et al. 1996) and from siliques of plants grown on soil. Ten (SCL1, SCL3, SCL5, SCL8, SCL11 and SCL13) or 18 (SCL6, SCL7, SCL9, SCL14) micrograms of total RNA were separated on formaldehyde gels, as in Di Laurenzio et al. (1996). The RNA was transferred to HyBond-N (Amersham) and hybridized with digoxigenin-labeled single-stranded DNA probes using the GENIUS non-radioactive detection system (Boehringer Mannheim), as per the manufacturer’s instructions.
In situ analyses
Four- to seven-day-old light-grown seedlings grown under standard sterile conditions were fixed in paraformaldehyde, embedded in Paraplast Plus (Fisher), sectioned, and hybridized as reported in Di Laurenzio et al. (1996). Probes were digoxigenin-labeled using the protocol also described in Di Laurenzio et al. (1996).
Accession numbers for GRAS sequences in other plants
The GenBank/EMBL database accession numbers for the two members of this family that we sequenced from rice (OsSCL1) and maize (ZmSCL1) are AF067400 and AF067401, respectively. The following are accession numbers for ESTs that encode products with a VHIID motif: AA231684 (oat), AA751595 (rice), C72495 (rice), AA754049 (rice); with a SAW motif: AA231910 (oat), H74669 (Brassica), C28500 (rice); with similarities to the PFYRE motif: C20324 (rice), D15490 (rice), C28384 (rice). Additional ESTs encoding products with significant similarity to the GRAS sequences are C71780 (rice), AA660090 (watermelon), AA750594 (rice), and AA751136 (rice). The accession numbers of the ESTs that encode products with the DELLA sequence (like GAI and RGA) are AA660952 (watermelon) and D39460 (rice).
The authors would like to thank D. Fitch for assistance with the phylogenetic analyses; Y. Helariutta and Y. Zhang for assistance with in situ protocols; the Arabidopsis Stock Center and T. Desprez for providing EST clones; C. Lister for analyzing the RI mapping data and determining the map positions for SCL1 and SCL5; and J. Malamy, G. Schindelman, J. Lim, K. Birnbaum and J. Jung for many helpful discussions. L.D.P. was supported by an NSF Postdoctoral Fellowship in Biosciences Related to the Environment, J.W.W.D. was supported by an NIH Postdoctoral Fellowship, and the work in the Benfey lab was supported by a grant from the NIH (R01 GM43778).