The −197 bp promoter of the rice seed storage protein gene GluB-1 is capable of conferring endosperm-specific gene expression. This proximal 5′ flanking region contains four motifs, GCN4, AACA, ACGT and the Prolamin-box, which are conserved in many seed storage protein genes. We previously showed that multiple copies of GCN4 conferred an endosperm expression pattern when fused to the −46 core promoter of CaMV 35S. In this paper we demonstrate, using a similar approach, that tandemly repeated copies of any of the other three motifs are unable to direct expression in seeds or in other tissues of transgenic rice plants. Mutation of individual motifs in the −197 bp promoter resulted in marked reductions in promoter activity. These results indicate that the GCN4 motif acts as an essential element determining endosperm-specific expression and that the AACA, ACGT and Prolamin-box motifs are involved in quantitative regulation of the GluB-1 gene. A set of gain-of-function experiments using transgenic rice showed that neither the Prolamin-box nor AACA, although often coupled with GCN4 in many genes, is sufficient to form a functional promoter unit with GCN4, whereas a combination of the GCN4, AACA and ACGT motifs was sufficient to confer a detectable level of endosperm expression. Taken together, our results provide direct insight into the importance of combinatorial interplay between cis-elements in regulating the expression of seed storage protein genes.
Seed development serves as a transient link between parental and progeny sporophytic generations. A dominant process characteristic of seed development is the accumulation of reserve carbohydrates, lipids and proteins that are subsequently used by germinating seeds as a source of energy, carbon and nitrogen. Reserve seed proteins consist mainly of storage proteins, which are the products of a limited number of specific genes whose expression is tightly regulated during seed development. Thus, seed storage protein genes provide a model system to investigate the molecular mechanisms of specific gene expression. Several lines of evidence have indicated that the synthesis of storage proteins is controlled primarily at the transcriptional level. Furthermore, promoter regions of many storage protein genes have been shown to be capable of conferring seed-specific expression (for reviews, see Morton et al. 1995; Thomas, 1993). However, the molecular mechanisms governing seed-specific expression are not well understood, especially for genes encoding seed storage proteins of cereal crops, where the efficient production of stable transgenic plants has not been routine until recently.
Cereal seeds are structurally similar in having relatively small embryos and large endosperms, the latter serving as the storage organ for deposition of proteins and other reserves. A survey of cereal storage protein genes shows that several consensus sequences are often found in their promoter regions. These include the GCN4 motif and the Prolamin-box (PROL). The GCN4 and PROL motifs are conserved in many seed storage protein genes from maize, wheat, rice, barley, sorghum and oat (for lists of some of these genes, see Müller & Knudsen, 1993; Vicente-Carbajosa et al. 1997). In many prolamin genes, GCN4 and PROL are coupled with each other, separated by only a few nucleotides, and together are designated the bifactorial endosperm box (Marzabal et al. 1998). The involvement of these two motifs in regulating gene expression has been demonstrated for several genes (Albani et al. 1997; Marzabal et al. 1998; Müller & Knudsen, 1993).
The GCN4, PROL and AACA motifs are each recognized by specific DNA-binding proteins. The GCN4 motif is recognized by the bZIP protein Opaque-2 (O2) and its homologues (Albani et al. 1997; de Pater et al. 1994; Vicente-Carbajosa et al. 1998; Wu et al. 1998); the PROL box is recognized by the DOF class of zinc finger proteins (Mena et al. 1998; Vicente-Carbajosa et al. 1997), while the AACA motif is recognized by MYB proteins (Suzuki et al. 1998). In addition, it is well known that O2 regulates the 22 kDa zein genes and that the O2 binding sites in those genes contain an ACGT core (Schmidt et al. 1992). These findings provide an initial framework for elucidating the mechanisms that control endosperm-specific expression. Further dissection of these important cis-elements and their possible interactions using homologous transgenic cereal plants should help to explain precisely how a promoter is able to direct transcription in a tissue-specific and quantitative manner.
We have analyzed the promoter of the rice storage protein gene GluB-1 in transgenic rice and identified a minimal promoter of 197 bp upstream of the transcription start site (−197) which is able to confer endosperm-specific expression (Wu et al. 1998). It is interesting to note that this minimal promoter contains GCN4, PROL, AACA and ACGT core-containing (ACGT) motifs (Fig. 1). We are interested in determining how each of the four elements contributes to the observed activity of the −197 promoter, and whether any interactions exist between them. We have recently demonstrated that a nucleotide substitution mutation of the GCN4 motif in the −197 promoter caused a significant reduction of promoter activity and an alteration of the expression pattern in the endosperm (Wu et al. 1998). Moreover, tandem repeats of a 21 bp fragment containing this motif exhibited an expression pattern similar to that of the native promoter in transgenic rice (Wu et al. 1998). In this study, we demonstrate by substitution mutations that AACA, PROL and ACGT are each required for maximum activity of the −197 promoter. However, none of them, as a multimer, can confer detectable transcriptional activity. To gain insight into the combinatorial interplay among these motifs, a series of constructs containing various combinations were examined in transgenic rice. Our results show that a combination of three motifs, including GCN4, is a minimal requirement for endosperm-specific expression.
Results and Discussion
Involvement of the Prolamin-box, ACGT and AACA motifs in regulating the transcriptional activity of the −197 GluB-1 promoter
To understand how the Prolamin-box (PROL), ACGT core-containing (ACGT) and AACA motifs function in the native context of the minimal −197 promoter (−197) of the rice glutelin gene GluB-1, we introduced substitution mutations into each of the three motifs (Fig. 1, lower case letters). The intact promoter and the three resulting mutated versions of the −197 promoter were transcriptionally fused 5′ to the GUS reporter gene (Fig. 2) and co-introduced with the selectable marker gene into rice. Analysis of GUS activity in the seeds of stable transgenic plants showed that a mutation in any one of the three motifs resulted in a significant reduction of GUS expression compared with that of the intact version (Fig. 2). A mutation in the PROL site led to at least a 10-fold reduction in promoter activity (Fig. 2, −197mPROL), and GUS expression directed by this mutated promoter was observed only weakly around the aleurone layer (Fig. 4b). A mutation of the ACGT site caused on average about a fourfold reduction in promoter activity (Fig. 2, −197mACGT) but no apparent change in the endosperm expression pattern compared with the intact −197 promoter (Fig. 4, compare c with a). A mutation of the AACA site resulted in a complete loss of promoter activity (Fig. 2, −197mAACA), and no GUS activity was detected histochemically (Fig. 4d). GUS activity in leaves was not detectable for either the intact −197 promoter or its mutated versions (data not shown). These results demonstrate that all three motifs are required for maximum transcriptional activity of the −197 promoter, with each motif contributing to a different extent.
The PROL, ACGT and AACA motifs alone are unable to confer seed-specific expression
We next determined whether the PROL, ACGT and AACA motifs could, on their own, confer transcriptional activity on a heterologous inactive basal promoter. Gain-of-function experiments were performed by fusing a desired fragment to the −46 CaMV 35S core promoter (−46) followed by the GUS reporter gene (Fig. 2). The synthesized PROL fragment is derived from nucleotides −139 to −120, the ACGT fragment from −90 to −71, and the AACA fragment from −82 to −47 (Fig. 1, underlined). Because ACGT and AACA flank each other, the ACGT oligonucleotide overlaps with the 5′ partial sequence of the AACA motif and the AACA oligonucleotide also covers the ACGT motif (Fig. 1, double underlined). Since multimerization of a promoter motif can enhance its activity, all three oligonucleotides were tested as tandemly repeated trimers (Fig. 2, 3×PROL, 3×ACGT and 3×AACA). As the −46 promoter has been shown to be inactive in transgenic rice (Wu et al. 1998) and contains no protein binding sites other than the TATA box, any promoter activity, if gained, should be attributed to sequences upstream of −46. As shown in Fig. 2, neither PROL nor ACGT alone, nor the AACA motif together with ACGT, was able to direct significant GUS gene expression in transgenic rice. Similarly, a hexamer of the ACGT motif placed upstream of the CaMV 35S core promoter was also tested and shown to be unable to direct detectable GUS activity in transgenic rice (data not shown). Collectively, these results show that the PROL, AACA and ACGT motifs as single cis-elements, and the AACA motif even together with the ACGT motif, are insufficient to confer seed-specific expression even as tandem repeats, but contribute to the quantitative expression of the −197 GluB-1 promoter. Combining these results with the data for GCN4 described previously (Wu et al. 1998), we conclude that among the four tested motifs from the −197 promoter, GCN4 is the only one able to reproduce the endosperm-specific expression pattern of the native promoter when multimerized.
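The fragment coordinates quoted above can be checked with a short calculation. The sketch below, in Python, computes the length of each synthetic fragment and the extent of the overlap between the ACGT and AACA oligonucleotides. It assumes that the coordinates are inclusive positions relative to the transcription start site; the names `FRAGMENTS`, `fragment_length` and `overlap_length` are illustrative and do not come from the original work.

```python
# Promoter coordinates (relative to the transcription start site, inclusive)
# of the synthetic fragments described in the text. Names are illustrative.
FRAGMENTS = {
    "PROL": (-139, -120),
    "ACGT": (-90, -71),
    "AACA": (-82, -47),
}


def fragment_length(start, end):
    """Length in bp of an inclusive interval lying entirely upstream of +1."""
    return end - start + 1


def overlap_length(a, b):
    """Number of positions shared by two inclusive intervals (0 if none)."""
    shared_start = max(a[0], b[0])
    shared_end = min(a[1], b[1])
    return max(0, shared_end - shared_start + 1)


lengths = {name: fragment_length(*coords) for name, coords in FRAGMENTS.items()}
acgt_aaca_overlap = overlap_length(FRAGMENTS["ACGT"], FRAGMENTS["AACA"])

print(lengths)
print(acgt_aaca_overlap)
```

Under these assumptions the PROL and ACGT fragments are each 20 bp, the AACA fragment is 36 bp, and the ACGT and AACA fragments share 12 positions (−82 to −71), which is why the 3×AACA construct cannot be regarded as testing the AACA motif in isolation.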
Combinatorial interactions between GCN4 and other cis-elements determine the endosperm specificity and expression level
The GluB-1 promoter contains one copy of the GCN4 motif (Fig. 1). It is notable that a single copy of the isolated GCN4 motif is not sufficient to confer detectable promoter activity in stable transgenic rice (Wu et al. 1998). Thus, we proposed that the GCN4 motif may serve as a central cis-element and that its combinatorial interactions with other cis-elements determine the tissue specificity and expression level of GluB-1.
To test this hypothesis, various promoter combinations were constructed and their ability to direct expression of the GUS reporter gene was examined in transgenic rice (Fig. 3). In the −245 GluB-1 promoter, another AACA motif exists upstream of −197 (Fig. 1), and deletion of this motif resulted in an eightfold reduction in promoter activity (Wu et al. 1998). Therefore, we used the fragment from −245 to −145 as a native pair of AACA and GCN4 to test whether this combination of the two motifs is able to confer detectable transcriptional activity on the −46 core promoter. As shown in Fig. 3, one set of AACA and GCN4 (construct a) was not sufficient to form a functional unit, whereas a dimer of this fragment (construct b) could direct expression of the GUS gene at a level much higher than that conferred by the GCN4 trimer (compare with Wu et al. 1998). Duplication of the −245 to −145 fragment in the reverse orientation (Fig. 3c) was also able to direct GUS gene expression, although at a level lower than that obtained with the same fragment in the normal orientation. The seed expression pattern exhibited by these two constructs was similar to that of the −197 promoter (Fig. 4e,f). Because neither a dimer of the GCN4 motif nor a trimer of the reverse-oriented GCN4 motif had detectable promoter activity in transgenic rice under our assay conditions (data not shown), the activities observed with the dimerized AACA-GCN4 fragment must result from an interaction between the two motifs. It should be pointed out that the spacing between the two motifs and the TATA box may influence the expression level.
We further tested the transcriptional activity of the GCN4 and PROL combination. These two cis-elements are coupled in many cereal storage protein genes. A fragment between nucleotides −171 and −127 of the GluB-1 promoter was used as a native pair of the two motifs. This fragment was fused to −46/GUS and introduced into rice (Fig. 3d). As with the AACA-GCN4 fragment, a monomer of the GCN4-PROL fragment was unable to confer detectable promoter activity (data not shown), whereas the dimerized version could direct significant GUS gene expression in seeds (Fig. 3d) without any apparent change in the endosperm-specific expression pattern (Fig. 4g). As a control, a fragment from −113 to −45, encompassing the ACGT element and the proximal AACA but lacking GCN4, was also tested by fusing it to −46/GUS (Fig. 3e). This fragment, even when dimerized, directed very little promoter activity (Fig. 3e). Such low activity was histochemically undetectable, even in aleurone cells, where the native promoter always shows the strongest expression (data not shown). This observation is consistent with the results obtained with the inactive AACA trimer, which covered a short upstream region including the ACGT core (Fig. 2, 3×AACA), as well as with the inactive −113 GluB-1 promoter, which contained ACGT, AACA and the native TATA box (Wu et al. 1998). In addition, the −145 GluB-1 promoter, containing PROL, ACGT and AACA but lacking GCN4, is inactive in transgenic rice plants (Wu et al. 1998). Thus, we conclude that GCN4 is a core element, and that the addition of one other motif is not sufficient to confer detectable promoter activity unless such a two-component combination is dimerized. In other words, although GCN4 is often coupled with either AACA or PROL, neither native pair in a single copy is capable of conferring transcriptional activity on a basal promoter. These results suggest that in the native context, the GCN4 core element may require two or more partners to form a functional unit.
We examined this possibility by further analyzing the −245 promoter in transgenic rice. It has been previously shown in transgenic tobacco that the −245 native promoter carrying an internal deletion from −145 to −90, which retained GCN4, ACGT and both AACAs, had even higher promoter activity than the intact −245 promoter (Washida et al. 1999). We tested the function of this AACA/GCN4/ACGT/AACA combination in rice (Fig. 3f). This internally deleted promoter directed relatively high-level expression of the GUS gene, but the level was significantly lower than that conferred by the intact −245 promoter (compare Fig. 3f with Wu et al. 1998), in contrast to the tobacco data. Another promoter construct, containing the same four motifs in the native order but fused to the −46 CaMV 35S core promoter, was also able to direct GUS gene expression (Fig. 3g). The lower expression conferred by this hybrid promoter may be attributed to the use of a heterologous TATA box. Based on these observations, we further examined whether a combination of three motifs is sufficient to confer detectable promoter activity. Of the two constructs tested, one consisting of AACA/GCN4/ACGT (Fig. 3h) and the other of AACA/GCN4/AACA (Fig. 3i), only the combination of AACA, GCN4 and ACGT was able to direct GUS gene expression, and only at a low level. As an alternative test, the order of the three motifs was changed from AACA/GCN4/ACGT to GCN4/ACGT/AACA by using the proximal AACA instead of the distal one (Fig. 3j). This change had no effect on promoter activity, indicating that the distal and proximal AACAs are functionally interchangeable (compare h with j in Fig. 3). Inactivity of the AACA/GCN4/AACA combination was also observed in transgenic tobacco, where an internally deleted promoter retaining only GCN4 and the two AACAs displayed an activity close to the background level (Washida et al. 1999). All of the active constructs (Fig. 3f,g,h,j) showed predominant expression in aleurone cells of transgenic seeds (Fig. 4h–k). GUS activity was not detectable in leaves for any of the constructs tested here (data not shown).
Taken together, we have shown that a combination consisting of at least three motifs is a prerequisite for constituting a minimal functional promoter unit, and that the two AACAs contained in the −245 promoter are functionally interchangeable. It is possible that the promoter fragments tested contain other, as yet unidentified, functional motifs, which could confound our interpretation of the combinatorial interactions among the four motifs examined. However, several mutations outside the four motifs in the −245 GluB-1 promoter did not significantly affect promoter activity in transgenic tobacco (Washida et al. 1999), making this possibility unlikely.
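The single-copy gain-of-function results described in this section can be summarized as a small lookup table and compared with a simple rule. The Python sketch below is only an illustration: the activity calls are transcribed from the text (with the "very little" activity of the −113 to −45 control fragment scored as not detected), copy number and motif order are ignored, and the rule "GCN4 plus AACA plus ACGT" is merely one rule consistent with these particular constructs, not a claim about combinations that were not tested (for example, three-motif combinations including PROL). The names `CONSTRUCTS`, `REQUIRED` and `predicts_activity` are hypothetical.

```python
# Single-copy fragments fused to the -46 CaMV 35S core promoter.
# Each entry: (description, distinct motifs present, activity detected?).
# Activity calls are transcribed from the text of this section.
CONSTRUCTS = [
    ("AACA/GCN4 monomer (-245 to -145)", {"AACA", "GCN4"}, False),
    ("GCN4/PROL monomer (-171 to -127)", {"GCN4", "PROL"}, False),
    ("ACGT/AACA control (-113 to -45)", {"ACGT", "AACA"}, False),
    ("AACA/GCN4/ACGT/AACA", {"AACA", "GCN4", "ACGT"}, True),
    ("AACA/GCN4/ACGT", {"AACA", "GCN4", "ACGT"}, True),
    ("AACA/GCN4/AACA", {"AACA", "GCN4"}, False),
    ("GCN4/ACGT/AACA", {"AACA", "GCN4", "ACGT"}, True),
]

# Hypothetical rule: activity requires all three of these motifs.
REQUIRED = {"GCN4", "AACA", "ACGT"}


def predicts_activity(motifs):
    """Return True if the motif set contains every motif in REQUIRED."""
    return REQUIRED <= set(motifs)


# Collect any construct whose observed activity disagrees with the rule.
mismatches = [
    name
    for name, motifs, observed in CONSTRUCTS
    if predicts_activity(motifs) != observed
]

print(mismatches)
```

For the constructs listed, the rule and the reported observations agree, so `mismatches` is empty. Because the table records only the presence of motifs, it cannot capture the activity gained by dimerizing the two-motif fragments or the effects of spacing relative to the TATA box.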
The minimal promoter of the glutelin GluB-1 gene (−197 to +18), mutagenized at the Prolamin-box, ACGT or AACA motif, was amplified by polymerase chain reaction (PCR) from the corresponding previously constructed mutagenized fragments (−245 to +18) using a forward primer containing an overhanging SalI recognition site and a reverse primer containing a BamHI site. Amplified fragments were purified by electrophoresis on a 1.5% agarose gel and then completely digested with SalI and BamHI. Digested fragments were introduced into the corresponding sites of pBI201 to form constructs −197mPROL, −197mACGT and −197mAACA.
For the Prolamin-box (PROL) trimer, complementary synthetic oligonucleotides comprising a trimer of the 20 bp PROL sequence (between −139 and −120), with TCGA overhangs at the 5′ and 3′ ends, were phosphorylated at the 5′ end and annealed. The annealed oligonucleotide was inserted into the SalI site of the S2–46GUS vector. For the construction of trimers of the ACGT motif and the AACA motif, a 24 bp synthetic complementary oligonucleotide between −90 and −71 with a GATC at the 5′ end and a 40 bp synthetic complementary DNA fragment between −82 and −47 with an ACTG at the 5′ end were phosphorylated at the 5′ end, annealed, concatenated with T4 DNA ligase and then introduced into pUC18. After confirmation by sequencing, the inserts were excised by digestion with SalI and SmaI and then introduced into the SalI and StuI sites of S2–46GUS.
The fragments between −245 and −145 and between −113 and −45 were amplified as described previously (Washida et al. 1999) and inserted into pUC18. After the orientation and sequence of the inserts had been confirmed, they were double digested with SalI and SmaI and then cloned into the SalI and StuI sites of S2–46GUS. These fragments were also amplified by PCR with forward and reverse primers carrying overhanging SalI sites. The resulting fragments were concatenated with T4 DNA ligase and then purified by gel electrophoresis. Dimers of these fragments were inserted into the SalI site of S2–46GUS. The synthetic oligonucleotide between −171 and −127 was annealed and concatenated. A dimer of this fragment was inserted into the StuI site of S2–46GUS.
The various promoter deletions used in this study were amplified by PCR from the deletion series constructed as described previously (Takaiwa et al. 1996; Washida et al. 1999) using a forward primer with a SalI site and a reverse primer with a SalI or EcoRV site. The PCR fragments were completely digested with SalI, or with SalI and EcoRV, purified by gel electrophoresis, and then inserted into the SalI site or the SalI and StuI sites of S2–46GUS.
Production of transgenic plants and analysis of GUS gene expression
Production of transgenic rice plants (Oryza sativa cv. Kitaake), the fluorometric assay and histochemical analysis of GUS gene expression were performed essentially as described previously (Wu et al. 1998).
The authors thank Drs T. Okita (Washington State University, USA) for critical reading of the manuscript and T. Guilfoyle (University of Missouri, USA) for providing the S2–46GUS vector. The authors are grateful to Ms M. Utsuno, Ms Y. Suzuki, Ms F. Ito and Ms Y. Shi for technical assistance. This work was supported by research grants from the Science and Technology Agency (Enhancement System of Center-of-Excellence), the Bio-orientated Technology Research Advancement Institute (PROBRAIN) and the Ministry of Agriculture, Forestry and Fishery of Japan to F.T.