Many plants synthesize the volatile phenylpropene compounds eugenol and isoeugenol to serve in defense against herbivores and pathogens and to attract pollinators. Clarkia breweri flowers emit a mixture of eugenol and isoeugenol, while Petunia hybrida flowers emit mostly isoeugenol with small amounts of eugenol. We recently reported the identification of a petunia enzyme, isoeugenol synthase 1 (PhIGS1), that catalyzes the formation of isoeugenol, and an Ocimum basilicum (basil) enzyme, eugenol synthase 1 (ObEGS1), that produces eugenol. ObEGS1 and PhIGS1 both utilize coniferyl acetate, are 52% sequence identical, and belong to a family of NADPH-dependent reductases involved in secondary metabolism. Here we show that C. breweri flowers have two closely related proteins (96% identity), CbIGS1 and CbEGS1, that are similar to ObEGS1 (58% and 59% identity, respectively) and catalyze the formation of isoeugenol and eugenol, respectively. In vitro mutagenesis experiments demonstrate that substitution of only a single residue can substantially affect the product specificity of these enzymes. A third C. breweri enzyme identified, CbEGS2, also catalyzes the formation of eugenol from coniferyl acetate and is only 46% identical to CbIGS1 and CbEGS1 but more similar (>70%) to other types of reductases. We also found that petunia flowers contain an enzyme, PhEGS1, that is highly similar to CbEGS2 (82% identity) and that converts coniferyl acetate to eugenol. Our results indicate that plant enzymes with EGS and IGS activities have arisen multiple times and in different protein lineages.
Eugenol and isoeugenol belong to a class of compounds, the phenylpropenes, which are derived from phenylalanine. The phenylpropenes are important constituents in many spices used by humans and have therefore played important roles in human nutrition (Prasad et al., 2004). The phenylpropenes are generally toxic to animals and microorganisms, and many plants synthesize them in their vegetative parts as defense against herbivores and pathogens (Grossman, 1993; Obeng-Ofori and Reichmuth, 1997).
The floral scent bouquet of many species contains volatile phenylpropenes. For example, the flowers of the California annual Clarkia breweri emit a mixture of volatiles that include eugenol, isoeugenol, methyleugenol, and methylisoeugenol (Raguso and Pichersky, 1995; Figure 1a). Flowers of Petunia hybrida, another moth-pollinated species, emit high levels of isoeugenol, as well as smaller amounts of eugenol (Verdonk et al., 2003). Many herbs, such as basil, also synthesize and store phenylpropenes in glands on their leaves (Gang et al., 2001).
We have recently shown that petunia flowers possess a NADPH-dependent enzyme, isoeugenol synthase 1 (PhIGS1), which converts coniferyl acetate to isoeugenol, while leaf glands of basil (Ocimum basilicum) possess the enzyme eugenol synthase 1 (ObEGS1), which converts the same precursor to eugenol (Koeduka et al., 2006). The PhIGS1 and ObEGS1 proteins are approximately 50% identical and are also homologous to several other reductases involved in phenylpropanoid metabolism in plants, including pinoresinol–lariciresinol reductase (PLR), isoflavone reductase (IFR), phenylcoumaran benzylic ether reductase (PCBER), leucoanthocyanidin reductase (LAR), and pterocarpan reductase (PTR), collectively termed the PIP reductase family after the first three enzymes discovered in this family (Akashi et al., 2006; Gang et al., 1999; Min et al., 2003; Tanner et al., 2003).
Eugenol and isoeugenol differ in the position of the double bond in the propene side chain (Figure 1a). PhIGS1 and ObEGS1 represent an interesting example of two similar enzymes that use the same substrate but catalyze the formation of a different product. Recently, the crystal structure of ObEGS1 complexed with NADP+ and a coniferyl-acetate analog was obtained (Louie et al., 2007). Examination of this structure shows that the enzyme acts on the substrate via a ‘push–pull’ mechanism, removing the proton of the para-hydroxyl group and promoting the cleavage of the acetyl group. In the resultant quinone–methide intermediate, the C7 atom serves as the acceptor of the hydride ion from NADPH (Figure 1b). Presumably, the position of the substrate in the active site of PhIGS1 is such that the hydride is transferred to C9 instead of C7 (Figure 1b). However, the high overall level of divergence between the two enzymes precludes an easy identification of the residues involved without additional structural information, and attempts to crystallize PhIGS1 have so far failed.
Because C. breweri flowers, unlike petunia flowers or basil glands, emit a mixture of both eugenol and isoeugenol in roughly similar proportions, we investigated whether a single enzyme is responsible for their biosynthesis, or, if two or more enzymes are involved, how related they are to each other and to PhIGS1 and ObEGS1. Our results indicate that C. breweri flowers have a eugenol synthase and an isoeugenol synthase (CbEGS1 and CbIGS1) that are closely related to each other, and that the differing product specificity is determined by very few residues. Furthermore, C. breweri possesses a second eugenol synthase (CbEGS2) that is very unlike both CbEGS1 and CbIGS1 and is more related to non-phenylpropene producing enzymes. We also discovered that petunia has a eugenol synthase that is closely related to CbEGS2. These results suggest that both convergent and divergent evolutionary pathways have given rise to phenylpropene-forming enzymes in plants.
Levels of eugenol synthase and isoeugenol synthase activities in C. breweri
Enzyme activity measurements from crude protein extracts obtained from different floral parts 1 day post-anthesis indicated that the highest levels of IGS-specific activity were found in petals, followed by stamens, pistil, and sepals, with no activity in leaves (Table 1). The highest specific activity levels of EGS were found in the stamens, followed closely by the pistil and petals (Table 1). However, since the petal tissue constitutes the bulk of the flower (Pichersky et al., 1994), the highest overall amounts of enzymatic activities for both EGS and IGS occur in the petals (Table 1). These results are consistent with previous observations that emission of eugenol, isoeugenol, and their methylated derivatives from C. breweri flowers occurred mostly from petals (Wang et al., 1997).
Table 1. Isoeugenol synthase (IGS) and eugenol synthase (EGS) activities from crude extracts of different parts of Clarkia breweri flowers at 1 day post-anthesis

| Organ (total weight per flower) | IGS specific activity (pkat g FW−1) | IGS total activity (pkat flower−1) | EGS specific activity (pkat g FW−1) | EGS total activity (pkat flower−1) |
| --- | --- | --- | --- | --- |
| Sepals (22.5 mg) | 0.03 ± 0.00 | 0.65 ± 0.09 | 0.03 ± 0.01 | 0.72 ± 0.11 |
| Petals (64 mg) | 0.32 ± 0.04 | 20.74 ± 2.37 | 0.05 ± 0.00 | 3.07 ± 0.13 |
| Stamens (24 mg) | 0.13 ± 0.03 | 3.22 ± 0.60 | 0.06 ± 0.01 | 1.54 ± 0.14 |
| Pistil (16 mg) | 0.05 ± 0.01 | 0.72 ± 0.18 | 0.05 ± 0.01 | 0.86 ± 0.10 |

Values are the averages of three independent experiments ± SE.
FW, fresh weight; ND, not detected.
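The statement that petals account for most of the activity in the flower can be checked directly from the per-flower totals in Table 1. The following is a minimal sketch using only the reported mean values (standard errors are ignored); the function and variable names are illustrative and not part of the original work.

```python
# Total activities per flower as reported in Table 1 (pkat per flower, mean values only).
igs_total = {"sepals": 0.65, "petals": 20.74, "stamens": 3.22, "pistil": 0.72}
egs_total = {"sepals": 0.72, "petals": 3.07, "stamens": 1.54, "pistil": 0.86}


def organ_shares(totals):
    """Return each organ's fraction of the summed per-flower activity."""
    whole = sum(totals.values())
    return {organ: value / whole for organ, value in totals.items()}


igs_share = organ_shares(igs_total)
egs_share = organ_shares(egs_total)

for organ in igs_total:
    print(f"{organ:8s} IGS {igs_share[organ]:.0%}  EGS {egs_share[organ]:.0%}")
```

With these values, petals contribute the largest share of both activities, most markedly for IGS.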
Purification of EGS and IGS activities from petals of C. breweri
Both IGS and EGS activities were therefore purified from petals in a protocol employing several chromatographic steps (Table 2). Fractions were monitored for both IGS and EGS activities. After diethylaminoethyl (DEAE) chromatography, a broad peak with both IGS and EGS activities was obtained. Two separate peaks of EGS activity were observed eluting from the Hitrap-Phenyl column, while only a single peak of IGS activity was obtained from this column, co-eluting with the earlier peak of EGS activity (EGS peak 1 described in Table 2). The fractions of peak 1 were pooled and loaded on a Hitrap-Q column, and eluted with a KCl gradient. The IGS and EGS activities eluting from this column did not separate, and the fractions in the peak of IGS and EGS activities contained three major bands of approximately 55, 38, and 36 kDa (Figure 2, lane 1). The presence of the 36-kDa protein (marked with an asterisk in Figure 2, lane 1) correlated best with IGS and EGS activities in the various fractions. The fractions containing EGS activity in peak 2 of the Hitrap-Phenyl column were also pooled and loaded onto a second Hitrap-Q column, and activity was eluted with a KCl gradient. A sharp peak of EGS activity was obtained (150–230 mM KCl range), and the pooled fractions of this peak contained a single protein of approximately 38 kDa (marked with an asterisk in Figure 2, lane 2) with EGS activity.
Table 2. Purification of isoeugenol synthase (IGS) and eugenol synthase (EGS) from Clarkia breweri petals
Columns: Total protein (mg); Total activity (pkat); Specific activity (pkat mg−1)
Purification steps: Hitrap-Phenyl (peak 1)a; Hitrap-Phenyl (peak 2)a; Hitrap-Q col. 1; Hitrap-Q col. 2
aThere were two peaks of EGS activities eluting from the Hitrap-Phenyl column, and the single peak of IGS activity coincided with the first peak of EGS activity on this column, as described in the text. Data in this table labeled peak 1 and peak 2 were obtained from pooled fractions constituting each peak.
Isolation and characterization of C. breweri cDNAs encoding proteins with IGS and EGS activities
The 36-kDa protein band obtained in the final purification step (HiTrap-Q column) from the peak of mixed EGS/IGS activity originally obtained from the HiTrap-Phenyl column (Figure 2, lane 1) was eluted from the gel, trypsinized, and sequenced by liquid chromatography-tandem mass spectrometry (LC-MS/MS), and the resulting peptide sequences were used to screen a C. breweri flower expressed sequence tag (EST) database comprising around 2000 sequences (D’Auria et al., 2002). The seven peptide sequences obtained from this 36-kDa protein band were found in the predicted protein sequences encoded by two closely related sets of ESTs that fell into two contigs, and there were no other sequences encoding any of these peptides. Because of the short nature of the sequences in this database (<500 nucleotides) and the few differences between the two contigs, it was not possible to unambiguously assign all seven ESTs to one contig or the other. However, there was one region that was clearly different between the two contigs (Figure 3a), although none of the seven peptide sequences obtained corresponded to this region. We subsequently designated the two genes represented by these two contigs as CbEGS1 and CbIGS1. Based on comparisons with ObEGS1 and PhIGS1, the contig representing CbEGS1 contained a complete open reading frame, but the contig representing CbIGS1 was missing the 5′ end. A rapid amplification of 5′ complementary DNA ends (5′ RACE) experiment was conducted to obtain the sequence of the beginning of CbIGS1, using an internal primer based on sequence in a divergent region between CbEGS1 and CbIGS1 (Table S1). The sequence thus obtained showed that the nucleotide sequences of CbEGS1 and CbIGS1 around the beginning of the open reading frame were identical to each other.
Based on this information, additional cDNAs (>10) of CbEGS1 and CbIGS1 were generated by RT-PCR with primers designed for the beginning and end of the open reading frame, and the sequences of all cDNAs thus obtained were identical to either CbEGS1 or CbIGS1.
The single protein with an approximate molecular mass of 38 kDa present in the fractions constituting the peak EGS activity eluting from the second Hitrap-Q column (Figure 2, lane 2) was analyzed in the same way. The sequence of 11 peptides obtained from it matched the sequence of a protein encoded by a gene represented by four ESTs in the database constituting a single contig that was deemed to contain the entire coding region (based on comparison with PCBER proteins, see below). We consequently designated this gene as CbEGS2 and the protein it encodes as CbEGS2 (Figure 3a). There were no other EST variants encoding any of these peptides.
Transcript levels of CbEGS1, CbIGS1, and CbEGS2 were measured and found to be highest in petals; no transcripts were found in leaves (Figure 4a). Both CbEGS1 and CbIGS1 encode proteins of 318 amino acids (aa) with a calculated molecular mass of 36.0 kDa. When CbIGS1 and CbEGS1 were expressed in Escherichia coli, the resulting (non-His-tagged) proteins co-migrated on SDS-PAGE with the 36-kDa protein from Figure 2, lane 1 (data not shown). CbEGS2 encodes a protein of 309 amino acids with a calculated molecular mass of 34.2 kDa. Expression of full-length, non-fusion cDNA of CbEGS2 in E. coli resulted in a protein co-migrating with the 38-kDa protein from Figure 2, lane 2 (data not shown). These results indicate that the characterized cDNAs of CbEGS1, CbIGS1, and CbEGS2 each contain the complete coding information for the respective proteins.
The full-length cDNAs of CbEGS1, CbIGS1, and CbEGS2 were expressed in E. coli to produce His-tagged CbEGS1, CbIGS1, and CbEGS2 proteins, which were then purified and assayed for activity with coniferyl acetate. The purified CbIGS1 enzyme catalyzed the formation of only isoeugenol (Figure 5a). The purified CbEGS1 and CbEGS2 proteins produced only eugenol (Figure 5b,c, respectively).
Analysis of the evolutionary relatedness of CbEGS1, CbIGS1 and CbEGS2 to each other and to other PIP proteins
The CbEGS1 and CbIGS1 proteins are 95.9% identical to each other. A phylogenetic analysis based on the maximum likelihood method (Figure 3b) as well as other methods including neighbor-joining and maximum parsimony (not shown) all indicated that CbEGS1 and CbIGS1 are the most closely related, among biochemically characterized proteins, to ObEGS1 (52% identity) as well as to PhIGS1 (51% identity; Figure 3b). CbEGS2, however, is more closely related to several enzymes previously characterized (Gang et al., 1999) to have PCBER activity (Figure 3b). For example, it is 78% identical to Populus trichocarpa PCBER, but only 46% identical to CbEGS1 or CbIGS1.
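The identity values quoted here are simple ratios of matching positions to compared positions in a pairwise alignment. The sketch below illustrates one common way to compute such a value, under the assumption that gapped columns are excluded from the denominator (other conventions exist, and the alignment method used for the published values is not restated here). The toy fragments are hypothetical and are not the real sequences; the arithmetic check uses the 318-residue length and the 13 differing positions reported in the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns (an assumption of this sketch)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
toy_identity = percent_identity("MKVLA-GT", "MKILAQGT")

# CbEGS1 versus CbIGS1: 318 residues, 13 differing positions (figures from the text).
cb_pair_identity = 100.0 * (318 - 13) / 318

print(round(toy_identity, 1), round(cb_pair_identity, 1))
```

The second value reproduces the 95.9% identity reported for the CbEGS1/CbIGS1 pair.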
Isolation of a petunia eugenol synthase 1 (PhEGS1) closely related to CbEGS2
Since petunia flowers emit small amounts of eugenol in addition to high levels of isoeugenol (eugenol levels are <3% of those of isoeugenol, Verdonk et al., 2003), and the previously characterized PhIGS1 catalyzes the formation of isoeugenol but not eugenol, the origin of eugenol in petunia was not clear. A search of petunia flower EST databases (containing >3000 ESTs) identified several cDNAs encoding a protein of 308 aa with 82.1% identity to CbEGS2 (Figure 3), but no other cDNAs encoding proteins with similarity to known IGS and EGS sequences. This 308-aa protein, designated as Petunia hybrida eugenol synthase 1 (PhEGS1), has only 47.7% aa identity to PhIGS1. Characterization of purified PhEGS1 produced in E. coli revealed that the protein catalyzes the formation of eugenol from coniferyl acetate (Figure 5d). PhEGS1 is expressed specifically in the scent-producing parts of the flowers (limbs and tube) but not in other parts of the flowers nor in leaves (Figure 4b). PhEGS1 transcript levels were about threefold lower than those of PhIGS1 (Figure 4c).
Enzymatic properties of C. breweri IGS1, EGS1, and EGS2 and P. hybrida EGS1
The apparent Km values of CbIGS1, CbEGS1, and CbEGS2 for coniferyl acetate were 212 ± 28, 93 ± 6, and 311 ± 45 μM, respectively (Table 3). The apparent kcat value of CbIGS1 was 0.99 ± 0.13 sec−1 and the corresponding values for CbEGS1 and CbEGS2 were 0.26 ± 0.01 and 0.25 ± 0.02 sec−1. Thus, the apparent catalytic efficiency (kcat/Km) of CbIGS1 is about twofold higher than that of CbEGS1 and sixfold higher than that of CbEGS2 (Table 3). PhEGS1 has a Km value for coniferyl acetate of 245 μM and a kcat of 0.60 sec−1, similar to the re-measured Km value for PhIGS1, but its turnover rate is twofold lower than that of PhIGS1 (Table 3).
Table 3. Kinetic parameters of Clarkia breweri and Petunia hybrida eugenol synthase (EGS) and isoeugenol synthase (IGS) enzymes for coniferyl acetate

| Enzyme | Km (μM) | Vmax (nmol sec−1 mg−1) | kcat (sec−1) | kcat/Km (sec−1 mM−1) |
| --- | --- | --- | --- | --- |
| CbIGS1 | 211.5 ± 27.9 | 27.6 ± 3.7 | 0.99 ± 0.13 | |
| CbEGS1 | 93.3 ± 6.3 | 7.3 ± 0.2 | 0.26 ± 0.01 | |
| CbEGS2 | 310.5 ± 45.2 | 6.9 ± 0.6 | 0.25 ± 0.02 | |
| PhEGS1 | 245.3 ± 58.0 | 18.4 ± 1.6 | 0.60 ± 0.05 | |
| PhIGS1 | 226.1 ± 70.3 | 35.7 ± 5.9 | 1.3 ± 0.2 | |

Values are averages of three independent experiments ± SE.
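The relationship between the kinetic parameters in Table 3 can be illustrated with a short calculation. The sketch below converts Vmax to kcat and derives kcat/Km from the reported mean values. It assumes the calculated mass of 36.0 kDa given in the text for the untagged CbIGS1 protein; the mass of the His-tagged protein actually assayed may differ, so the conversion is approximate. All names are illustrative.

```python
# Mean values from Table 3: Km in uM, Vmax in nmol s^-1 mg^-1, kcat in s^-1.
KINETICS = {
    "CbIGS1": {"km_uM": 211.5, "vmax": 27.6, "kcat": 0.99},
    "CbEGS1": {"km_uM": 93.3, "vmax": 7.3, "kcat": 0.26},
    "CbEGS2": {"km_uM": 310.5, "vmax": 6.9, "kcat": 0.25},
    "PhEGS1": {"km_uM": 245.3, "vmax": 18.4, "kcat": 0.60},
    "PhIGS1": {"km_uM": 226.1, "vmax": 35.7, "kcat": 1.3},
}


def kcat_from_vmax(vmax_nmol_s_mg, mass_kda):
    """kcat (s^-1) from Vmax; 1 mg of an M-kDa protein contains 1000/M nmol."""
    return vmax_nmol_s_mg * mass_kda / 1000.0


def catalytic_efficiency(kcat, km_uM):
    """kcat/Km in s^-1 mM^-1."""
    return kcat / (km_uM / 1000.0)


efficiency = {
    name: catalytic_efficiency(values["kcat"], values["km_uM"])
    for name, values in KINETICS.items()
}

# Assumed mass: 36.0 kDa, the calculated value for untagged CbIGS1.
cbigs1_kcat = kcat_from_vmax(KINETICS["CbIGS1"]["vmax"], 36.0)

for name, value in efficiency.items():
    print(f"{name}: kcat/Km = {value:.2f} s^-1 mM^-1")
```

Under these assumptions the computed kcat for CbIGS1 agrees with the reported 0.99 sec−1, and the efficiency ratios are consistent with the roughly twofold and sixfold differences described in the text.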
Because of the sequence similarity of CbEGS2 and PhEGS1 to proteins characterized as PCBER enzymes, we tested them as well as CbEGS1, CbIGS1, and the previously characterized ObEGS1 and PhIGS1 (Koeduka et al., 2006) for their ability to reduce the PCBER substrate dehydrodiconiferyl alcohol (DDC) to isodihydrodehydrodiconiferyl alcohol (IDDDC). No IDDDC product was detected in any of these reaction assays after 1 h. (In the control assays containing coniferyl acetate instead of DDC with the same amount of protein and carried out for 1 h, >35% of the substrate was converted to the product.) When the enzymatic assays using DDC as a substrate were carried out over longer time periods (>3 h) and with a 23-fold increase in protein concentration, ObEGS1, PhIGS1, CbIGS1, and CbEGS1 were still not able to reduce any DDC to IDDDC (Figure 6a–d); however, CbEGS2 and PhEGS1 catalyzed the formation of a small amount of IDDDC (Figure 6e,f) at the calculated rates of 6.7 and 24.4 nmol h−1 (mg protein)−1, respectively. These rates were comparable to the rates of 53 and 104.2 nmol h−1 (mg protein)−1 reported for P. trichocarpa and Pinus taeda PCBERs, respectively (Gang et al., 1999), but are approximately 2700- to 4000-fold slower than the rates at which PhEGS1 and CbEGS2, respectively, catalyze the production of eugenol from coniferyl acetate (Table 3).
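The fold differences quoted above follow from putting the two sets of rates on the same time basis. The sketch below makes that conversion explicit, assuming the comparison is between the Vmax values with coniferyl acetate from Table 3 (per second) and the measured DDC reduction rates (per hour); the variable names are illustrative.

```python
SECONDS_PER_HOUR = 3600

# Vmax with coniferyl acetate (Table 3), in nmol s^-1 (mg protein)^-1.
vmax_coniferyl_acetate = {"PhEGS1": 18.4, "CbEGS2": 6.9}

# Rates of IDDDC formation from DDC, in nmol h^-1 (mg protein)^-1.
rate_ddc_per_hour = {"PhEGS1": 24.4, "CbEGS2": 6.7}


def fold_slower_with_ddc(enzyme):
    """Ratio of the coniferyl acetate rate to the DDC rate, both expressed per hour."""
    per_hour = vmax_coniferyl_acetate[enzyme] * SECONDS_PER_HOUR
    return per_hour / rate_ddc_per_hour[enzyme]


fold = {enzyme: fold_slower_with_ddc(enzyme) for enzyme in rate_ddc_per_hour}
for enzyme, value in fold.items():
    print(f"{enzyme}: about {value:.0f}-fold slower with DDC")
```

This gives roughly 2700-fold for PhEGS1 and roughly 3700-fold for CbEGS2, in line with the approximate range stated in the text.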
Amino acid residues in CbIGS1, CbEGS1, and other EGS and IGS proteins involved in determining product specificity
Sequence comparison of CbIGS1 and CbEGS1 shows that they differ at only 13 positions, and nine of these positions reside in a small region between positions 73 and 95 (Figure 7). To identify the specific residues that determine their product specificity, we first produced a hybrid protein by fusing the first 95 codons of CbEGS1 with codons 96–318 of CbIGS1. This hybrid protein catalyzed the formation of mostly eugenol, with a small proportion of isoeugenol (mutant 1 in Table 4). (The reciprocal hybrid protein, which was soluble and stable to a similar degree as the first hybrid protein, did not show any activity.) Since the result suggested that the product specificity mostly resides in region 73–95, site-directed mutagenesis of CbIGS1 was used to change individual residues or a cluster of residues in this region to the corresponding amino acids found in CbEGS1. Changing residues 83, 84, 87, and 88 (mutant 3) gave a protein with the highest ratio of eugenol/isoeugenol production, 85:15. Changing residues 73, 77, 83, and 84 (mutant 2) gave a product with a 29:71 eugenol/isoeugenol ratio. In contrast, changing amino acids at positions 91, 92, and 95 (mutant 4) did not affect the product preference. These data suggested that residues at positions 83, 84, 87, and 88, or at least some of them, affect the product-forming specificity significantly.
Table 4. Preferential product formation in eugenol synthase (EGS) and isoeugenol synthase (IGS)
To narrow down the contribution of these residues to product specificity, five additional mutants were generated (mutants 5–9 in Table 4). Among these mutants, mutant 7, in which residue V84 was changed to F and residue Y87 was changed to I, had the highest ratio of eugenol to isoeugenol formation, 75:25, followed by mutant 6 (66:34). Moreover, even the single substitution at position 87 produced a protein which catalyzed the formation of more eugenol than isoeugenol (62:38, mutant 9 in Table 4).
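The hybrid and point-mutant constructs described above can be expressed as simple sequence operations, which is useful when planning or checking such designs in silico. The sketch below is only an illustration: the 318-residue placeholder strings are hypothetical and are not the real CbIGS1 or CbEGS1 sequences (only positions 84 and 87 are set to the residues named in the text), and the helper names are invented for this example.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply substitutions written like 'V84F' (1-based positions) to a protein sequence."""
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{sub}: position outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(f"{sub}: expected {old}, found {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)


def make_hybrid(n_terminal_donor, c_terminal_donor, junction):
    """Take the first `junction` residues from one protein and the rest from the other."""
    if len(n_terminal_donor) != len(c_terminal_donor):
        raise ValueError("this sketch assumes equal-length, co-linear proteins")
    return n_terminal_donor[:junction] + c_terminal_donor[junction:]


# Placeholder sequences of the reported length (318 aa); NOT the real proteins.
LENGTH = 318
placeholder = ["A"] * LENGTH
placeholder[84 - 1] = "V"  # V at position 84, as stated for CbIGS1
placeholder[87 - 1] = "Y"  # Y at position 87, as stated for CbIGS1
igs_like = "".join(placeholder)
egs_like = "G" * LENGTH

# Analogous to mutant 7 (V84F, Y87I) and to the hybrid joined after residue 95.
mutant7_like = apply_substitutions(igs_like, ["V84F", "Y87I"])
hybrid_like = make_hybrid(egs_like, igs_like, junction=95)
print(mutant7_like[80:90], hybrid_like[90:100])
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which matter here because the corresponding positions are numbered slightly differently in each protein.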
Effects of amino acid changes in CbIGS1 on the specific activity of the mutant enzymes
To determine how the amino acid changes described above affected the specific activity of the proteins, we measured this parameter for mutants 7, 8, and 9. Mutants 7 and 8 exhibited 1.8- and 3.4-fold higher total activity, respectively, compared with the wild-type enzyme (Figure 8). In contrast, the change at position 87 (mutant 9, Y87I) had no significant effect on total specific activity of the protein (Figure 8), even though the amino acid at this position appears to contribute most to the product specificity (Table 4).
Site-directed mutagenesis of CbEGS1, ObEGS1, PhIGS1, PhEGS1, and CbEGS2
Since positions 84 and 87 in CbIGS1 were identified as key positions determining product specificity, we examined the specific amino acids in the corresponding positions in the other EGS and IGS proteins identified in this and previous studies (Figure 7b; these positions are numbered slightly differently in each protein because of differences in upstream sequences, but for clarity the numbers referring to the CbIGS1 positions are used in the text to discuss comparisons). At position 84, both IGS enzymes (CbIGS1 and PhIGS1) have a V, while three EGS enzymes have F and one (PhEGS1) has Q. At position 87, both IGS proteins have Y, while three EGS proteins have I and one (again, PhEGS1) has L. The amino acids in these two positions in each protein were then changed to the corresponding residues in CbIGS1 (for EGS enzymes) or CbEGS1 (for PhIGS1). In each case, some change in product specificity, ranging from 12% to 69%, was observed (Table 4, mutants 10–14).
C. breweri and P. hybrida have distinct synthases for eugenol and isoeugenol biosynthesis
Purification of EGS and IGS activities from C. breweri flowers yielded three distinct proteins, two of which display only EGS activity and the third possessing only IGS activity. One of the C. breweri EGSs identified in this approach of direct protein purification and characterization, CbEGS2, proved to be only distantly related to the previously characterized basil EGS and petunia IGS, and its sequence was used to identify a gene from petunia, PhEGS1, that encodes an EGS. All three newly characterized C. breweri enzymes and the new petunia EGS, as well as the previously characterized PhIGS1 and ObEGS1, use coniferyl acetate to make a single product – either eugenol or isoeugenol – and have similar affinity to the substrate, although their catalytic efficiencies may differ by as much as sixfold. The lower catalytic efficiency of PhEGS1 and its lower level of expression compared with PhIGS1 may explain the much lower levels of eugenol, as compared with isoeugenol, emitted from petunia flowers.
Few residues in EGS and IGS enzymes have a major effect on product specificity
Our earlier crystallographic studies (Louie et al., 2007) of ObEGS1 in complex with an analog of coniferyl acetate, (7S,8S)-ethyl (7,8-methylene)-dihydroferulate (EMDF), clearly revealed the substrate-binding mode within the ObEGS1 active site (Figure 9a). The guaiacol ring is stacked against the nicotinamide ring of the co-factor, and the side chain, which in EMDF bears a cyclopropyl group and is distinctly kinked, is accommodated in a predominantly hydrophobic pocket at the top of the active-site pocket. Notably, the coniferyl acetate substrate would be most appropriately positioned for acceptance at C7 of a hydride from the cofactor nicotinamide, consistent with the formation of the eugenol product (Figure 9a).
The results from our in vitro mutagenesis experiments suggest that the residues at positions 84 and 87 in CbIGS1 and at the corresponding positions in the other EGS and IGS proteins in this study (Figure 7b) are major determinants of product specificity. Through a protocol identical to that used for the crystallographic analysis of wild-type ObEGS1 (Louie et al., 2007), we have now determined the structure of the ObEGS1 (F85V, I88Y) variant (mutant 12 in Table 4). This structure shows that the Y88 side chain projects into the base of the substrate-binding pocket (Figure 9b). The positioning of the Y88 side chain is assisted by the accompanying replacement of F85 by a residue bearing a smaller side chain, V. The bulky Y ring causes a slight displacement of neighboring residues within the active site, most notably the K132 side chain. The altered positioning of K132, the enlarged space arising from the loss of the bulky F85 side chain, and the introduction of an additional hydrogen bonding group (OH of Y88) that could interact with the C3 or C4 oxygen atoms of the substrate are likely to lead to a shift in the location of the substrate binding site, so that C9 is more appropriately positioned as the hydride acceptor, with the consequence that the altered enzyme produces a significant level (36%) of isoeugenol.
We have also elucidated the structure of the holoenzyme form of CbEGS1, without additional bound substrate or product analogs. The active sites of CbEGS1 and ObEGS1 are shown to be nearly identical (Figure 9c). Therefore, for these two enzymes, equivalent amino acid replacements would be expected to effect similar outcomes in product specificity, and, in general, this prediction is borne out by the results of the mutagenesis experiments. Nevertheless, in comparison to the ObEGS1 (F85V, I88Y) variant, the corresponding CbEGS1 (F84V, I87Y) variant (mutant 10 in Table 4) produces a much greater proportion of isoeugenol (69% versus 36%). In addition, the reciprocal changes in CbIGS1 (V84F, Y87I) produce more striking effects on product specificity (75% eugenol, mutant 7 in Table 4) than in petunia IGS (39% eugenol). A detailed understanding of the greater effect of these amino acid replacements on C. breweri EGS1 and IGS1, which is not apparent through modeling, must await structural analysis of these enzymes in complex with substrate, product(s), or analog.
The observation that changes of just a few residues can lead to new substrate or product specificity is also consistent with what has been found in other families of enzymes involved in specialized metabolism (Pichersky et al., 2006), for example the terpene synthase family, which also consists of several groups of enzymes that, within each group, use the same substrate but produce a different product (and, sometimes, multiple products from this same substrate).
Evolution of enzymatic function in the PIP reductase family
The previously identified petunia PhIGS1 and basil ObEGS1 were shown to be members of the PIP family of reductases, which also includes isoflavone reductase (IFR), pinoresinol–lariciresinol reductase (PLR), phenylcoumaran benzylic ether reductase (PCBER), pterocarpan reductase (PTR), and leucoanthocyanidin reductase (LAR; Figure 3b). Our phylogenetic analyses indicate with a high degree of certainty that PhIGS1 and ObEGS1 fall into close but separate clades, and that CbEGS2 and PhEGS1, on the other hand, are closely related to a complex clade that contains proteins biochemically characterized to possess PCBER, IFR, and PTR activities.
The highly similar CbIGS1 and CbEGS1 (96% identity) both fall into the same clade with ObEGS1. The basal position of ObEGS1 in this clade suggests that the ancestor of CbIGS1 and CbEGS1 had EGS activity, although this conclusion is tentative because no other protein in this clade besides these three has been biochemically characterized. It is nevertheless clear that the function of these two proteins diverged recently, and thus that the IGS activity of CbIGS1 evolved independently of that of PhIGS1 (alternatively, if the ancestral protein in this clade had IGS activity, then the EGS activity of CbEGS1 evolved independently of that of ObEGS1).
Surprisingly, the CbEGS2 and PhEGS1 proteins fall in a clade in which proteins characterized to have PCBER activity as well as IFR and PTR enzymes also reside (Figure 3b). In this clade, the branching of IFR enzymes and the single PTR enzyme currently known, all from legumes, is uncertain: in the neighbor-joining tree the position of these two branches is reversed relative to each other, and in the maximum parsimony tree (legend to Figure 3b), all of these four sequences are monophyletic. However, in all of these trees the gymnospermous PCBER sequences (PtdPCBER from P. taeda and ThPCBER from Thuja plicata) constitute an outgroup to the IFRs, PTR, the angiospermous PCBER sequences (PtPCBER from P. trichocarpa and FiPCBER from Forsythia intermedia), and CbEGS2 and PhEGS1.
Gang et al. (1999), in their original report on the characterization of PCBER enzymes, pointed out that the P. taeda and P. trichocarpa PCBER enzymes (the only PCBER enzymes for which Vmax values have been reported) have an extremely low turnover rate with DDC that ‘cannot be explained at the present time’. They showed that these enzymes could not reduce pinoresinol, but, understandably, they did not test them with coniferyl acetate, as EGS and IGS and their substrate were not known at the time. We show here that CbEGS2 and PhEGS1 also have very low turnover rates for the reduction of DDC that are comparable with those of the P. trichocarpa and P. taeda PCBER enzymes. On the other hand, CbEGS2 and PhEGS1 each have a turnover rate for coniferyl acetate that is several thousand-fold higher than their rates with DDC and similar to the turnover rates of CbEGS1, CbIGS1, ObEGS1 and PhIGS1 with coniferyl acetate. These results indicate that CbEGS2 and PhEGS1 are bona fide EGSs, and further suggest that the sequences currently characterized as PCBER enzymes, as well as other sequences in this branch, might in fact prefer coniferyl acetate or other related substrates to DDC. While we were writing this paper, Vassão et al. (2007) reported that a PCBER-related protein (despite their title, the protein investigated was more similar to PCBER than to PLR) in the creosote bush (Larrea tridentata) is capable of synthesizing phenylpropenes from esters of alcohols of lignin precursors. However, this activity was not linked to specific phenylpropenes in the plant, nor was the PCBER activity of this protein examined.
Whether the ancestral protein of this clade was a bona fide EGS that used coniferyl acetate or another type of enzyme, and what the biochemical activities of the many other uncharacterized proteins in this clade are (deduced from ESTs and representing a wide variety of plant species) remain intriguing questions that will require additional studies to resolve. However, the phylogenetic analysis suggests that the protein that was ancestral to this clade as well as to the LAR and PLR clades was unlikely to be an EGS/IGS. It thus appears likely that phenylpropene synthases have evolved independently at least twice during plant evolution. A less likely scenario is that the ancestor of the entire family possessed phenylpropene synthase activity, in which case it appears that the phenylpropene synthases eventually evolved for unknown reasons into two distinct lineages.
Koeduka et al. (2006) have presented indirect evidence that ObEGS1 and PhIGS1 use a quinone methide intermediate-based mechanism to generate an intermediate to which the reductive transfer of the reducing hydride can then be easily accomplished, and Akashi et al. (2006) have proposed that some other PIP enzymes may also use the same mechanism on substrates that, like coniferyl acetate, contain a para-hydroxybenzyl moiety. The recent structure–function studies utilizing the crystal structure of basil EGS indeed support this hypothesis (Louie et al., 2007). Thus, the potential to generate a quinone–methide intermediate with a variety of substrates that share a para-hydroxybenzyl moiety appears to underlie the basis for the multiple types of substrates that the PIP family enzymes have evolved to handle. In addition to the previously demonstrated diversity of the PIP family, we now show here that the same product specificity, and most likely the same substrate specificity, have evolved independently more than once. Overall, these results highlight the strong potential for the evolution of new functions inherent in this family.
Plant materials and growth condition
Clarkia breweri were grown as described in Raguso and Pichersky (1995). Petunia hybrida cv. Mitchell (Ball Seed, http://www.ballseed.com/) plants were grown in the greenhouse with a 16-h light period (supplemented to 100 μmol m−2 sec−1) and temperature of 21°C, and an 8-h dark period at 16°C.
The EGS and IGS enzymatic reactions were performed and products were analyzed as previously described (Koeduka et al., 2006), except that a temperature gradient from 50°C to 275°C at 14°C min−1 was applied during gas chromatography/mass spectrometry (GC-MS). PCBER enzymatic assays were performed under conditions identical to those used for the EGS/IGS assays, except that the substrate DDC was used instead of coniferyl acetate. The product of the reaction was identified by the published retention value (Gang et al., 1999) and by its UV absorption spectrum.
All manipulations were carried out at 4°C unless stated otherwise. Crude extract (50 ml, representing 5.0 g fresh weight petal tissue) was loaded onto a DEAE-cellulose column (8 ml of DE53, Whatman, http://www.whatman.com/) that was pre-equilibrated with a solution containing 25 mM Bis-2-amino-2-(hydroxymethyl)-1,3-propanediol (Bis-Tris), pH 7.0, and 1 mM DTT (buffer A). After a wash with 20 ml of buffer A, IGS and EGS activities were eluted with 20 ml of buffer A containing 200 mM KCl. Fractions with the highest IGS and EGS activities (which co-eluted) were pooled and loaded on a Hitrap-Phenyl HP column (0.7 × 2.5 cm, Pharmacia Biotech Inc.; http://www.gehealthcare.com) pre-equilibrated with 1 M (NH4)2SO4 in buffer A at a flow rate of 0.5 ml min−1. After washing with 5 ml of 1 M (NH4)2SO4 in buffer A, the activities were eluted with a linear reverse gradient (15 ml) from 1 M (NH4)2SO4 in buffer A to 0 M (NH4)2SO4 in buffer A followed by an additional 15 ml of buffer A. Fractions containing peak IGS activity and the first peak of EGS activity (which co-eluted) were pooled (total 6.75 ml) and loaded onto a Hitrap-Q HP column (0.7 × 2.5 cm; Pharmacia Biotech Inc.) previously equilibrated with buffer A. The column was washed with 5 ml of buffer A, and IGS and EGS activities were then eluted with a 20-ml linear gradient (0–400 mM) of KCl in buffer A at a flow rate of 0.5 ml min−1. Fractions containing the second peak of EGS activity from the Hitrap-Phenyl HP column were pooled, loaded onto a second Hitrap-Q HP column, and processed in the same way.
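The linear gradients used in the purification above can be put into numbers with a short calculation. The following is only a minimal sketch, assuming an ideal linear gradient with no column dead volume or mixing delay; the function names are hypothetical and are not part of the published protocol.

```python
def gradient_concentration(volume_ml, start_conc, end_conc, gradient_ml):
    """Nominal eluent concentration after `volume_ml` of an ideal linear gradient.

    Assumes a perfectly linear gradient and ignores dead volume (an idealization).
    Concentrations are returned in the same unit as start_conc and end_conc.
    """
    if not 0 <= volume_ml <= gradient_ml:
        raise ValueError("volume must lie within the gradient")
    fraction = volume_ml / gradient_ml
    return start_conc + (end_conc - start_conc) * fraction


def elution_time_min(volume_ml, flow_ml_per_min):
    """Time needed to deliver `volume_ml` at a constant flow rate."""
    return volume_ml / flow_ml_per_min


# KCl gradient on the Hitrap-Q column: 0 to 400 mM over 20 ml at 0.5 ml per min
kcl_midpoint_mM = gradient_concentration(10, 0, 400, 20)
kcl_run_min = elution_time_min(20, 0.5)

# Reverse ammonium sulfate gradient on the Hitrap-Phenyl column: 1 M to 0 M over 15 ml
sulfate_midpoint_M = gradient_concentration(7.5, 1.0, 0.0, 15)
```

Under these assumptions, a fraction collected halfway through the KCl gradient corresponds to a nominal 200 mM KCl, and the full 20-ml gradient takes 40 min at the stated flow rate.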
The 36-kDa protein band (Figure 2, lane 1) was eluted from the gel, trypsinized, and subjected to LC-MS/MS analysis as previously described (Chen et al., 2005), followed by a search of the C. breweri flower EST database using the program MASCOT (Perkins et al., 1999). Seven unique peptides obtained from this 36-kDa protein band (RSMGVTIIEGEMEEHEKM, KFVLNYEEDIAKY, RIVIYRPPKN, KSGLSFKK, KVHMPDEQLVRL, RLSQELPQPQNIPVSILHSIFVKG, RKDDIEASNLYPELEFTSIDGLLDLFISGRA) matched the two closely related protein sequences encoded by CbEGS1 and CbIGS1 (see Results) at a significance threshold of 0.001. At this stringency, no peptide hits were found in a randomized database of the same size and amino acid composition, suggesting a very low false-positive rate. The single protein of approximate molecular mass 38 kDa in the peak EGS activity eluting from the second Hitrap-Q column (Figure 2, lane 2) was analyzed in the same way, and the 11 peptide sequences obtained from it (KILIIGGTGYIGKF, KFIVEASVKE, KEGHPTFALVRE, RETTVSDPVKGKL, KFQNLGVSLLYGDLYDHDSLVKA, KQVDVVISTVGFMQIADQTKI, KIIAAIKE, KEAGNVKRF, KRFFPSEFGNDVDHVNAVEPAKS, KSVAFAVKA, RDKVIIPGDGNPKA) matched the protein sequence encoded by CbEGS2 (Figure 3a).
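The text does not state how the randomized control database was generated. One common way to obtain a database of the same size and amino acid composition is to shuffle the residues of every sequence, and the sketch below illustrates that idea only. The function name and the toy sequences are hypothetical, and this is not presented as the procedure actually used.

```python
import random


def shuffled_decoys(sequences, seed=0):
    """Return one residue-shuffled decoy per input protein sequence.

    Shuffling within each sequence keeps the number of entries, their lengths,
    and their amino acid composition, while destroying the native peptide order.
    """
    rng = random.Random(seed)  # fixed seed so that the decoy set is reproducible
    decoys = []
    for seq in sequences:
        residues = list(seq)
        rng.shuffle(residues)
        decoys.append("".join(residues))
    return decoys


# Toy input; in practice this would be the translated EST database
toy_database = ["MKILIIGGTGYIGKF", "MSGLSFKKVHMPDEQLVRL"]
decoy_database = shuffled_decoys(toy_database)
```

Searching the same spectra against such a decoy set at the same significance threshold gives an empirical indication of how often matches arise by chance.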
Isolation, characterization, and expression in E. coli of C. breweri cDNAs encoding proteins with IGS and EGS activities
As described in Results, a 5′ RACE experiment (Chenchik et al., 1996; Matz et al., 1999) was performed, using internal primers (Table S1), to obtain the complete sequence of CbIGS1. To construct E. coli expression vectors, full-length cDNAs were amplified by RT-PCR from flower RNA with forward and reverse primers (Table S1), and the PCR fragments were spliced into pENTR (Invitrogen, http://www.invitrogen.com/) and verified by sequencing. Each cDNA fragment was then transferred to the expression vector pHIS9, a modified pET/T7 vector (Varbanova et al., 2007), to give an N-terminal in-frame addition of a peptide containing nine His residues. Expression in E. coli (BL21-CodonPlus-RIL) and DEAE and His-tag affinity purification of the proteins were performed as previously described (Koeduka et al., 2006; Nishimoto et al., 2007). CbEGS1, CbIGS1, and CbEGS2 were also amplified and spliced directly into the E. coli expression vector pEXP5-CT/TOPO (Invitrogen) for production of non-fused, non-tagged proteins.
In vitro mutagenesis
The EGS and IGS mutants were constructed in the pEXP5-CT/TOPO TA expression vector (Invitrogen) with the PCR method (Ho et al., 1989). The mutagenic primers for each mutation were designed in complementary pairs (Table S1). Mutations were confirmed by sequencing both strands.
Crystals of the ObEGS1 (F85V, I88Y) variant were obtained using the same protocol previously employed for wild-type ObEGS1 (Louie et al., 2007). Determination of the structure of the ObEGS1 variant was initiated with the isomorphous, orthorhombic crystal structure of wild-type ObEGS1 (Protein Data Bank entry 2QX7), and yielded a refined atomic model at a resolution of 2.15 Å with a crystallographic R-factor of 0.255 (free R 0.287). Crystals of CbEGS1 were grown from solutions of the protein mixed with 0.1 M sodium citrate (pH 5.4), 20% (v/v) isopropanol, 20% (w/v) polyethylene glycol 4000, and 5 mM NADP+. These crystals belong to space group C2 with unit-cell parameters a = 67.3 Å, b = 87.4 Å, c = 51.5 Å, and β = 101.3°. The initial structure solution of CbEGS1 was obtained by molecular replacement with a homology model constructed from ObEGS1. The atomic model of CbEGS1 was refined against X-ray data to 1.8-Å resolution with a crystallographic R-factor of 0.188 (free R 0.217). All crystallographic procedures employed were as described previously (Louie et al., 2007). X-ray diffraction data were measured at beamline 8.2.2 of the Advanced Light Source (Lawrence Berkeley National Laboratory). The atomic coordinates and structure factors were deposited in the PDB under accession codes 3C3X [ObEGS1 (F85V, I88Y)] and 3C1O (wild-type CbEGS1).
A multiple alignment of 32 PIP protein sequences was constructed using the MUSCLE program (Edgar, 2004). Alignment columns containing more than 16 gap characters were removed prior to tree reconstruction. A sequence distance matrix was constructed using the protdist program of the PHYLIP package (Felsenstein, 1996) with the Jones–Taylor–Thornton evolutionary model (Jones et al., 1992). Neighbor-joining and least-squares trees were constructed using the neighbor and fitch programs, respectively, of the PHYLIP package. The maximum parsimony tree was constructed using the protpars program of the PHYLIP package. The maximum likelihood tree was constructed using the protml program of the MOLPHY package (Hasegawa et al., 1991) by optimizing the least-squares tree with local rearrangements (Jones–Taylor–Thornton evolutionary model with adjustment for observed amino acid frequencies). Reliability of the internal branches was estimated using 10 000 resampling of estimated log-likelihood (RELL) bootstrap replications with the protml program of the MOLPHY package.
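The gap-filtering step applied before tree reconstruction can be illustrated with a short script. This is a minimal sketch, assuming the alignment is held as a list of equal-length strings and that gaps are written as "-" or "."; the function name is hypothetical, and the tool actually used for this step is not stated in the text.

```python
def drop_gappy_columns(alignment, max_gaps=16, gap_chars="-."):
    """Remove alignment columns that contain more than `max_gaps` gap characters.

    `alignment` is a list of aligned sequences (strings of equal length).
    With 32 sequences and max_gaps=16, a column is kept only if at least
    half of the sequences carry a residue at that position.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        col
        for col in range(length)
        if sum(row[col] in gap_chars for row in alignment) <= max_gaps
    ]
    return ["".join(row[col] for col in keep) for row in alignment]


# Toy alignment of four sequences; the threshold is lowered to 2 for illustration
toy_alignment = ["MK-LV", "MR-LV", "M--LI", "MKALV"]
filtered = drop_gappy_columns(toy_alignment, max_gaps=2)
```

In the toy alignment the third column has three gaps and is dropped, whereas the second column has a single gap and is retained.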
We thank Dr Yuri I. Wolf (Associate Investigator, NCBI/NLM/NIH) for performing the phylogenetic analyses. This work was supported by National Science Foundation grants 0331353, 0312466, and 0718152 to EP, by National Science Foundation grant 0718064 to JPN, by National Research Initiative of the US Department of Agriculture Cooperative State Research, Education, and Extension Service grant 2005-35318-16207 and National Science Foundation/USDA-NRI Interagency Metabolic Engineering Program grant 0331333 to ND, and by a grant from the Fred Gloeckner Foundation, Inc. to ND. JPN is an investigator of the Howard Hughes Medical Institute.