A systems biology investigation of the MEP/terpenoid and shikimate/phenylpropanoid pathways points to multiple levels of metabolic control in sweet basil glandular trichomes


*For correspondence (fax +1 520 6217 186; e-mail gang@ag.arizona.edu).


The glandular trichome is an excellent model system for investigating plant metabolic processes and their regulation within a single cell type. We utilized a proteomics-based approach with isolated trichomes of four different sweet basil (Ocimum basilicum L.) lines possessing very different metabolite profiles to clarify the regulation of metabolism in this single cell type. Significant differences in the distribution and accumulation of the 881 highly abundant and non-redundant protein entries demonstrated that although the proteomes of the glandular trichomes of the four basil lines shared many similarities they were also each quite distinct. Correspondence between proteomic, expressed sequence tag, and metabolic profiling data demonstrated that differential gene expression at major metabolic branch points appears to be responsible for controlling the overall production of phenylpropanoid versus terpenoid constituents in the glandular trichomes of the different basil lines. In contrast, post-transcriptional and post-translational regulation of some enzymes appears to contribute significantly to the chemical diversity observed within compound classes for the different basil lines. Differential phosphorylation of enzymes in the 2-C-methyl-d-erythritol 4-phosphate (MEP)/terpenoid and shikimate/phenylpropanoid pathways appears to play an important role in regulating metabolism in this single cell type. Additionally, precursors for different classes of terpenoids, including mono- and sesquiterpenoids, appear to be almost exclusively supplied by the MEP pathway, and not the mevalonate pathway, in basil glandular trichomes.


Glandular trichomes are highly specialized epidermal cells common to many plant groups and are involved in the synthesis, storage, and secretion of a large array of specialized metabolites such as phenylpropanoids and terpenoids, which are important for plant defense (Gang, 2005; Wink, 2003) and possess many properties important for human health (Cheng et al., 2007; Grassmann et al., 2002; Julsing et al., 2006; Kurkin, 2003; Withers and Keasling, 2007). Despite their great importance, the biosynthesis of these compounds is not completely understood and the mechanisms used by the plant to control or regulate the production of specific compounds are still largely unknown.

An array of phenylpropanoid and terpenoid pathway-derived compounds are produced in the peltate glands of sweet basil (Ocimum basilicum L.; Gang et al., 2001; Iijima et al., 2004a). Many chemically distinct yet morphologically similar breeding lines have been established (Gang et al., 2001; Gupta, 1994). In addition, because the glandular trichomes can easily be removed from the leaf surface and isolated from all other cell types, yet remain metabolically active, it is possible to monitor protein and RNA levels and enzyme activities in a single, fully differentiated plant cell type. This makes the basil peltate glandular trichome an attractive model system for investigating the regulation of specialized metabolite biosynthesis in planta, and these advantages have been utilized to elucidate details of the biosynthesis of biologically, pharmaceutically, and economically important compounds such as eugenol, methylchavicol, methylcinnamate, and citral (Gang et al., 2001, 2002a,b; Iijima et al., 2004b, 2006; Koeduka et al., 2006; Vassao et al., 2006). An expressed sequence tag (EST) database, containing over 23 000 ESTs and 7963 non-redundant unigenes (contigs plus singletons), has been established from four separate cDNA libraries from four basil lines, EMX-1, SD, SW, and MC. This EST database provided the sequence information necessary for protein identification using a database searching approach, and has also enabled the comparison of transcriptomic and proteomic data with metabolic profiling data.

Proteomics techniques have been widely adopted in the investigation of function and regulation of proteins (Chen and Harmon, 2006; Rossignol et al., 2006). Shotgun proteomic techniques (Lee and Cooper, 2006), including multidimensional protein identification technology (MudPIT) and gel-enhanced liquid chromatography-tandem mass spectrometry (GeLC-MS/MS) analysis, have been developed recently as alternatives to the more conventional and more expensive 2D-gel approach. Proteomic analysis has been applied to hairy trichomes of Arabidopsis (Wienkoop et al., 2004) and glandular secreting trichomes of tobacco (Nicotiana tabacum) (Amme et al., 2005) and resulted in the identification of 63 and 7 proteins, respectively. Although these studies offered certain insights into the metabolism and function of the trichomes, the limited number of proteins identified in these investigations was not sufficient to provide a comprehensive picture of the proteome of these cell types. Moreover, the integration of transcriptomic, proteomic, and metabolic profiling data to provide a systems-level framework to aid the elucidation of specialized metabolic pathways and their regulation has not been reported for glandular trichomes. Because the peltate glandular trichomes (glands) of sweet basil have been the target of both transcriptional and metabolic profiling as well as focused biochemical investigations, they provide an ideal opportunity for such an integrated investigation of metabolism within a single cell type.

Results and discussion

Profiles of volatile metabolites in basil lines

Major volatile compounds in the leaves of basil lines SW, MC, EMX-1, and SD were analyzed using GC-MS (see Figure 1 for examples of typical results). The phenylpropenes eugenol and methylchavicol are the major constituents in SW and EMX-1, respectively. In contrast, line MC contains almost no phenylpropenes, but primarily accumulates methylcinnamate in its glands. Line SD contains mainly terpenoids, such as the monoterpenoids neral and geranial and several sesquiterpenoids, such as β-caryophyllene, germacrene D, and α-bisabolene, among others, as its major volatile constituents. The major terpenoid in both EMX-1 and MC is the monoterpenoid 1,8-cineole. Line SW produces large amounts of linalool in addition to 1,8-cineole and an array of sesquiterpenoids, whereas MC produces only trace levels of sesquiterpenoids. Line EMX-1 produces only low levels of terpenoids, compared with its major phenylpropanoid pathway-derived volatile constituent, methylchavicol. These results (see Table 1 for a summary) are consistent with previous findings (Gang et al., 2001; Iijima et al., 2004a).

Figure 1.   Representative total ion GC-MS chromatograms showing different metabolic profiles of basil lines.
Ethyl acetate extracts of basil leaves were analyzed. All vertical axis scales are normalized to peak area of the internal standard (ITS, 1,3,4-trichlorobenzene). Major compounds identified are: 1, 1,8-cineole; 2, limonene; 3, E-β-ocimene; 4, cis-δ-terpineol; 5, α-terpinene; 6, fenchone; 7, linalool; 8, camphor; 9, cis-verbenol; 10, borneol; 11, trans-verbenol; 12, α-terpineol; 13, methylchavicol; 14, chavicol; 15, neral; 16, geranial; 17, bornyl acetate; 18, methyl nerolate; 19, α-cubebene; 20, eugenol; 21, neryl acetate; 22, copaene; 23, E-methylcinnamate; 24, methyleugenol; 25, β-caryophyllene; 26, α-bergamotene; 27, α-humulene; 28, E-β-farnesene; 29, muurola-4,5-diene; 30, germacrene D; 31, β-selinene; 32, α-selinene + bicyclogermacrene; 33, α-bulnesene; 34, β-bisabolene; 35, γ-cadinene; 36, δ-cadinene; 37, α-bisabolene; 38, Z-nerolidol; 39, cubenol; 40, α-cadinol. See Table 1 for a summary of the major compounds produced in each basil line.

Table 1.   Major phenylpropanoid and terpenoid metabolites produced by glandular trichomes of the four basil lines evaluated in this investigation
Line | Phenylpropanoids | Monoterpenoids | Sesquiterpenoids
SW | Eugenol | Large amounts of linalool; smaller amounts of 1,8-cineole | Large amounts of α-bergamotene, germacrene D, α-selinene + bicyclogermacrene, γ-cadinene, and α-cadinol
MC | Methylcinnamate | 1,8-Cineole | Small amounts of E-β-farnesene, germacrene D, γ-cadinene, and α-cadinol
EMX-1 | Methylchavicol | Small amounts of 1,8-cineole and fenchone | Only trace amounts, mainly α-humulene and α-bisabolene
SD | Small amounts of methylchavicol | Large amounts of citral (neral + geranial) | Large amounts of β-caryophyllene, germacrene D, α-selinene + bicyclogermacrene, and α-bisabolene

The basil glandular trichome proteome

The proteome from the glands of each basil line was divided into microsomal and cytosolic fractions, which were analyzed by GeLC-MS/MS, and the cytosolic fraction was additionally analyzed using MudPIT. A custom peptide sequence database, containing both translated EST sequences from peltate glands of sweet basil and plant protein sequences from UniProt, was used in this analysis (see Experimental procedures). Due to the nature of this analysis and the stringency of the identification criteria, only the most abundant proteins or those proteins which yielded peptides that were very amenable to ionization in an electrospray ion source were identified.

The basil glandular trichome proteome dataset consisted of nearly 2000 non-redundant protein identifications; probability assignment and validation of this set of proteins using the Trans-Proteomic Pipeline (TPP) yielded a set of 881 unique proteins that were identified with high confidence (see Table S1). Rubisco, the most abundant leaf protein, was not found in any of our trichome protein samples, although the large and the small subunits of this protein were by far the most abundant polypeptides in total leaf protein extracts from basil leaves (see Table S2) and this protein has been readily identified in other proteomic investigations (Hajheidari et al., 2005; Koller et al., 2002; Porubleva et al., 2001). Other proteins related to photosynthesis were also abundant in the total leaf protein extract, but were absent from the glandular trichome proteomes. Moreover, the most abundant proteins identified in the basil glandular trichome proteomes often corresponded well to the most abundant ESTs in the different basil lines (see Figure 2 for some examples relevant to the shikimate, phenylpropanoid, and terpenoid pathways). These results indicate that the gland preparations used for protein isolation and proteomic analysis were not contaminated by other cell types from basil leaves and that the protein samples analyzed do indeed represent the proteins present in the glands themselves.

Figure 2.   Comparison of transcript, peptide and corresponding metabolite levels for selected genes/enzymes in the shikimate/phenylpropanoid and terpenoid pathways.
(a) 3-Deoxy-d-arabino-heptulosonate-7-phosphate synthase (DAHPS) versus total volatile phenylpropanoids.
(b) All genes/enzymes in shikimate pathway versus total volatile phenylpropanoids.
(c) Phenylalanine ammonia lyase (PAL) versus total volatile phenylpropanoids.
(d) p-Coumarate/coniferyl alcohol acetyltransferase (CAAT) versus volatile phenylpropenes.
(e) p-Coumaroyl-5-O-shikimate 3′-hydroxylase (C3′H) versus 3-hydroxylated phenylpropenes.
(f) 1,8-Cineole synthase versus 1,8-cineole (the only derivative of cineole produced by basil).
Relative standard errors of <10% were observed for metabolite data. Bar colors for different basil lines are: black, SW; dark grey, MC; light grey, EMX-1; white, SD.

The proteomes of the four basil lines were found to be highly diverse, as indicated by large differences in their most abundant proteins. This contrasts with what might be expected, given that the same cell type from varieties of the same species was used in this investigation. When analogous proteins from the four different basil lines were considered to be the same protein, using the criteria that they most likely catalyzed the same reaction or served the same function in the plant, a total of 492 proteins were identified. Strikingly, of these only 71 (14.4%) were common to all four lines and 245 (49.8%) were unique to individual basil lines (Figure 3a and Table S1). For this non-redundant protein set, 118 proteins (24%) did not have a corresponding match in the basil trichome EST database (Figure 3b). Given the low false positive rate (<0.75%) of the protein identification, these observations might suggest that the EST database used in this study, although it represents a relatively large number of genes for a single cell type (7963 unigenes total), still gives a relatively incomplete picture of all of the genes being actively transcribed in the glandular trichomes, with several proteins that are either abundant or have easily ionized peptides not having EST support. In many of these cases, however, a BLAST search back against the basil database using the amino acid sequence of the UniProt protein hit revealed one or more basil EST contigs with strong similarity to the UniProt protein hit for the proteomics peptide. Some of these UniProt protein identifications may therefore represent false negative search results against the basil EST database, that is, failures of the data processing methods and search algorithms, suggesting that the data processing software may require further refinement.

Figure 3.   Venn diagrams showing (a) overlap among proteomes of four basil cultivars and (b) overlap of protein entries identified as basil glandular trichome expressed sequence tag (EST) entries or as plant protein IDs from UniProt.

We looked more closely at the proteins that were common to the glandular trichome proteomes of at least three of the four basil lines, because we surmised that this strong conservation of protein expression could be indicative of an essential role in glandular trichome cell biology or in common metabolic pathways. A total of 119 non-redundant proteins were identified that met these criteria (Table S3). The majority of these (72 proteins) are indeed enzymes essential for the metabolic processes involved in specialized metabolism, including members of the 2-C-methyl-d-erythritol 4-phosphate (MEP)/terpenoid and shikimate/phenylpropanoid pathways.
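The overlap counts discussed above (proteins common to all four lines, found in at least three lines, or unique to one line) can be tallied with simple set operations. The following is a minimal sketch only, not the procedure used in this investigation: the function names are hypothetical, and the presence/absence assignments in the example are invented for illustration and do not reproduce Table S1 or Table S3.

```python
from collections import Counter


def line_counts(proteomes):
    """Map each protein to the number of basil lines in which it was detected."""
    counts = Counter()
    for proteins in proteomes.values():
        counts.update(set(proteins))  # set() guards against duplicate entries
    return counts


def summarize_overlap(proteomes):
    """Summarize how the non-redundant proteins are shared among lines."""
    counts = line_counts(proteomes)
    n_lines = len(proteomes)
    return {
        "total": len(counts),
        "in_all_lines": sum(1 for c in counts.values() if c == n_lines),
        "in_at_least_three": sum(1 for c in counts.values() if c >= 3),
        "unique_to_one_line": sum(1 for c in counts.values() if c == 1),
    }


# Hypothetical presence/absence data (illustration only, not measured values).
example = {
    "SW": {"PAL", "4CL", "DXS", "GPPS"},
    "MC": {"PAL", "4CL", "GPPS", "CCMT"},
    "EMX-1": {"PAL", "4CL", "CVOMT"},
    "SD": {"PAL", "DXS", "GPPS", "CVOMT", "DXR"},
}

summary = summarize_overlap(example)
print(summary)
```

Dividing each count by the total gives percentages of the kind reported for Figure 3a.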

Spearman’s rank correlation coefficient determination and protein levels

To evaluate the relationship between transcript and protein levels for the basil glandular trichome transcriptome and proteome at large, we calculated Spearman’s rank correlation coefficient (rs) for the 164 genes with the highest transcript levels using the raw total spectra count (TSC) (Liu et al., 2004) as a measure of protein abundance and the raw total EST number (TEN) from the EST database as the measure of transcript level. Genes that did not meet minimal abundance criteria for either the peptide spectra or the cDNA count level were excluded from the analysis to ensure that comparisons would not reflect stochastic qualitative differences due to library sampling or peptide ionization effects.
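As an illustration of the calculation described above, the following minimal sketch computes Spearman's rank correlation coefficient between transcript counts (TEN) and spectral counts (TSC). It assumes that tied values receive the average of their ranks and that rs is the Pearson correlation of the two rank vectors; the source does not state how ties were handled. The function names and the example counts are hypothetical and are not data from this study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs computed as the Pearson correlation of average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rs is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical counts for five genes in one basil line (illustration only).
ten = [120, 45, 30, 8, 60]   # total EST number per gene
tsc = [200, 50, 80, 5, 40]   # total spectra count per gene
print(round(spearman_rs(ten, tsc), 2))
```

Because only ranks enter the calculation, raw counts can be used without normalization, which is consistent with the use of raw TSC and TEN values here.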

More than half of the genes included in the analysis are biosynthetic enzymes contributing directly to specialized metabolite biosynthesis. The rest of the genes included housekeeping and structural proteins as well as apparent enzymes whose roles are unknown. The rs and associated probability values presented in Table 2 suggest that regulation of transcript levels contributes significantly to control of the production of enzymes involved in specialized metabolite biosynthesis as well as housekeeping and structural proteins in this plant cell type.

Table 2.   Correlation of transcript and protein abundance for the 164 most highly expressed genes in basil glandular trichomes and for genes involved in the biosynthesis of terpenoid and phenylpropanoid compounds
Pathway/line | rs* | Probability, P-value | No. genes
Most abundant genes in EST database
 SW | 0.49 | <3.54 × 10^−11 | 164
 MC | 0.47 | <2.08 × 10^−10 |
 EMX-1 | 0.57 | <1.72 × 10^−15 |
 SD | 0.42 | <2.66 × 10^−8 |
Primary metabolism
One-carbon metabolism
Phenylpropanoid metabolism
Terpenoid metabolism

*Spearman’s rank correlation coefficient (rs) was calculated for the same sets of genes expressed in basil glandular trichomes of four different basil lines.

We also evaluated the overall correlation between protein and transcript levels for the genes related to phenylpropanoid and terpenoid production. The rs for genes of primary/core metabolism and for one-carbon metabolism were significant and similar for all four basil lines. Interestingly, these values suggested a very strong correlation for genes involved in generating one-carbon units for S-adenosyl-l-methionine (AdoMet)-dependent methylation reactions, indicating that this process in the trichomes may be closely regulated at a transcriptional level. The rs for genes of phenylpropanoid metabolism suggested a significant relationship between transcript and protein levels for the pathway as a whole only in line MC, while significant correlation in terpenoid metabolism was only observed for lines SD and SW, which produce the highest levels of terpenoid compounds.

Differential expression of specific enzymes controls the chemical diversity of basil lines

Comparisons of normalized TSC and total EST number (TEN) data for specific known enzymes involved in the production of phenylpropanoids and terpenoids are shown in Table S4. Enzymes from core metabolism essential for the production of precursors for the shikimate/phenylpropanoid and terpenoid pathways, including sucrose synthase, 6-phosphogluconate dehydrogenase, transaldolase, transketolase, and glyceraldehyde-3-phosphate dehydrogenase, had high TSC values in all basil lines, in keeping with their central roles in all metabolic processes in the glandular trichomes. Variation of the TSCs between the four lines [average relative standard deviation (RSD) 44.1%] for these core enzymes was much smaller than that of enzymes in more specialized pathways such as the phenylpropanoid and terpenoid pathways (average RSD 110.0%). Many enzymes in these latter pathways displayed great variation in protein levels between basil lines, which often, but not always, matched the variation in transcript levels (Figures 2, 4, and 5).
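The relative standard deviation used above to compare variation between enzyme groups can be sketched as follows. This is a minimal illustration under stated assumptions: the sample standard deviation (n − 1 denominator) is assumed because the source does not specify which form was used, the function names are hypothetical, and the spectral counts are invented rather than taken from Table S4.

```python
from statistics import mean, stdev


def rsd_percent(counts):
    """Relative standard deviation (%) of one enzyme's TSC across basil lines.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100 * stdev(counts) / m


def average_rsd(enzyme_counts):
    """Average RSD (%) over a group of enzymes."""
    return mean(rsd_percent(c) for c in enzyme_counts.values())


# Hypothetical TSC values per enzyme, ordered SW, MC, EMX-1, SD.
core = {
    "transketolase": [40, 55, 30, 60],
    "transaldolase": [22, 18, 25, 30],
}
specialized = {
    "CVOMT": [0, 2, 35, 70],
    "DXS": [12, 8, 0, 45],
}

print(round(average_rsd(core), 1), round(average_rsd(specialized), 1))
```

A larger average RSD for the specialized group than for the core group corresponds to the pattern described in the text.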

Figure 4.   Comparison of peptide and transcript profiles for enzymes in the phenylpropanoid pathway. Peptide, enzyme activity, and mRNA levels are indicated by solid black, solid grey, and hollow bars, respectively.

Figure 5.   Comparison of peptide and transcript profiles for enzymes in the 2-C-methyl-d-erythritol 4-phosphate (MEP) and mevalonate (MVA) pathways. Peptide, enzyme activity, and mRNA levels are indicated by solid black, solid grey, and hollow bars, respectively.

The most interesting difference observed for the enzymes in the shikimate pathway was that these enzymes appeared to be expressed at very low levels in line SD (Figure 2a,b), which produces low levels of volatile phenylpropanoids, compared with their high expression in line EMX-1, which produces high levels of the phenylpropene methylchavicol. These two basil lines are very similar in morphology and growth habit, suggesting that they are close relatives, despite their different chemotypes with respect to production of phenylpropanoids versus terpenoids. This difference can be partially explained by the observed differences in transcript levels for the shikimate pathway relative to the terpenoid pathway. The only volatile phenylpropanoid compound produced at appreciable levels by line SD is methylchavicol, although it is produced at much lower levels than are observed for line EMX-1. The lower level of production of methylchavicol in line SD relative to line EMX-1, even though there were higher levels of transcripts and peptides for chavicol O-methyltransferase (CVOMT; the enzyme directly responsible for formation of methylchavicol) in line SD, may be explained by the very low level of expression in line SD of most of the phenylpropanoid pathway genes, based on mRNA transcript levels and on the absence of any proteomics support (Figure 4 and Table S4). This is especially true for cinnamoyl-CoA reductase and p-coumaryl/coniferyl alcohol acetyltransferase, which are important enzymes directly upstream from chavicol production and which were apparently expressed at only very low levels in line SD (too low to be detected in our experiments). On the other hand, CVOMT was expressed in line SD at about twice the level (measured by both RNA transcript and protein levels) observed for line EMX-1 (the high methylchavicol accumulator). Thus, limited availability of chavicol from the phenylpropanoid pathway may be responsible for the lack of methylchavicol production in line SD. This apparent discrepancy is addressed further below.

In contrast, many enzymes related to the production of terpenoids were expressed at high levels in line SD (Figure 5). These included most of the enzymes in the MEP pathway (Rodriguez-Concepcion and Boronat, 2002; Rohmer, 1999; Rohmer et al., 1996), isopentenyl-diphosphate delta-isomerase (IDI), and all of the prenyltransferases identified. This protein expression pattern can help explain the chemical profile of line SD relative to the other lines, where reduced production of phenylpropanoids and the increased biosynthesis of terpenoids appear to have resulted from reduced flux into the shikimate/phenylpropanoid pathway and diversion of upstream metabolite intermediates into the MEP and subsequent terpenoid pathways due to differential levels of transcripts for pathway entry point genes and corresponding differences in levels of the encoded proteins. Our results also suggest that the low level of total terpenoid production observed for line EMX-1 (see Figure 1 and Table 1) can be attributed to the expression levels of the first two enzymes in the MEP pathway. Among the four lines, EMX-1 had the lowest mRNA and undetectable protein levels for 1-deoxy-d-xylulose-5-phosphate (DOXP) synthase (DXS) and DOXP reductoisomerase (DXR), the first two enzymes in the MEP pathway (Figure 5 and Figure S1). Assays for DXR activity (see Figure 5) further support these findings. Of the 13 enzymes that could play roles in production of the precursors of terpenoid (isoprenoid) biosynthesis – isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) – only the eight MEP pathway enzymes were detected in our proteomics dataset. Of these, seven enzymes were found to be highly expressed in line SD (Figure 5 and Table S4). The mevalonate pathway appeared to be practically inactive in all basil lines, as indicated by very low TEN values and the absence of peptides for any of the enzymes in the pathway (Figure 5 and Table S4).

Enzymes downstream from IPP and DMAPP production, including two important prenyl transferases – geranyl diphosphate synthase (GPPS) and farnesyl diphosphate synthase (FPPS) – were found in lines SW, SD, and EMX-1 (Table S4). Line MC contained peptides for GPPS but none for FPPS. These results may explain why line MC produces appreciable levels of monoterpenoids but only very low levels of sesquiterpenoids. In addition, 12 different terpene synthases (TPSs) were found in the EST database. Peptides for eight of these proteins were detected. It is interesting to note that the protein and mRNA levels of TPSs were not particularly high in SD, which was not consistent with previous results regarding their enzymatic activities (Iijima et al., 2004a), suggesting that post-translational activation of specific TPSs may occur in sweet basil.

These results, taken together with the results described above for the control of phenylpropanoid pathway-derived compounds and the apparently high level of genetic similarity between basil lines SD and EMX-1, suggest that the shikimate/phenylpropanoid and the MEP/terpenoid pathways compete for carbon and that carbon flow into these two pathways is tightly regulated in the glandular trichomes of basil. These results also suggest that efforts to alter production of specific compounds in one pathway may very well have an effect on the production of metabolites from the other pathway. The different metabolic phenotypes of these basil lines may be due to a small number of genetic perturbations that occurred during their breeding and development. Therefore, it may be possible to experimentally test the ability to switch flux between the terpenoid and phenylpropanoid metabolic branches by altering the expression of key differentially expressed genes in basil.

Transcriptional and post-transcriptional regulation of metabolism in basil glandular trichomes

For many enzymes observed in this investigation, differential protein levels among the different basil lines appeared to correspond to parallel differences in mRNA levels. This provided us with an opportunity to validate the proteomic and transcriptomic results against each other. Although ‘omics’-level analyses are generally considered to be hypothesis-generating rather than hypothesis-testing approaches, conclusions derived from such analyses are more reliable when consistent results are obtained from investigations at different levels (Ge et al., 2003). This is because independent experiments are extremely unlikely to yield the same false result repeatedly. For example, it is noteworthy that the mRNA levels of all five enzymes in the mevalonate (MVA) pathway were consistently very low, confirming a previous report (Iijima et al., 2004a), and their proteins were undetectable. These observations, obtained from eight separate omics-level datasets, substantiate the conclusion that the MVA pathway does not play a significant role in the production of terpenoids in basil glandular trichomes and that terpenoid precursors are (perhaps) exclusively supplied by the MEP pathway in this single cell type (Figure 5). Moreover, the MEP pathway is localized to the plastids (Rodriguez-Concepcion and Boronat, 2002), but farnesyl diphosphate synthase (FPPS) and sesquiterpene synthases, required for the synthesis of sesquiterpenoids, are cytosolic enzymes (Dudareva et al., 2005; Steele et al., 1998; Szkopinska and Plochocka, 2005). The production of sesquiterpenoids in basil glands from MEP pathway products suggests transport of terpenoid precursors (IPP or DMAPP) out of the plastids and into the cytosol (Figure 5). Experiments that evaluate the incorporation of labeled precursors in the MEP versus MVA pathways into mono- and sesquiterpenoids in basil glands will be able to further support or refute this hypothesis. However, our results are clearly in line with other reports that have proposed near-exclusive involvement of the MEP pathway in production of specialized (‘secondary’) metabolites in other plant species (Dudareva et al., 2005; Kasahara et al., 2002). Indeed, our results support the hypothesis that perhaps only steroid-derived terpenoids are produced by the MVA pathway in plants. Furthermore, having both the shikimate and MEP pathways localized to the plastid allows for direct crosstalk within a single subcellular compartment and reciprocal control of these pathways, which are involved in the production of precursors for the phenylpropanoids and terpenoids, respectively, produced by basil.

Several enzymes evaluated in this investigation reside at important regulatory points in their respective pathways. In all such cases, mRNA and protein levels were consistent with the metabolic profiles in the respective basil line. For example, 3-deoxy-d-arabino-heptulosonate-7-phosphate synthase (DAHPS), phenylalanine ammonia lyase (PAL), and DXS/DXR, the entry and control points for shikimate, phenylpropanoid, and terpenoid pathways (Bate et al., 1994; Carretero-Paulet et al., 2002; Rodriguez-Concepcion, 2006), respectively, had both protein and mRNA expression profiles that were consistent with the production of phenylpropanoids and terpenoids in the glandular trichomes of specific basil lines (Figures 2, 4, and 5). Similar profiles were observed for other control point enzymes, such as p-coumarate/cinnamate carboxymethyltransferase (CCMT) and CVOMT. Enzyme activity assays for several of these enzymes (we measured activity for PAL, DXR, CCMT, and CVOMT) also confirmed that enzyme activity levels were often associated with protein levels (see Figures 4 and 5). Similar relationships were observed for many other important enzymes, such as 4-coumaroyl-CoA ligase (4CL), caffeoyl-CoA O-methyltransferase (CCOMT), and IDI (Table S4). The good correlation among mRNA, protein, enzyme activity, and metabolite levels in these cases suggests that these pathway nodes are regulated at the transcriptional level and that control of gene expression plays an essential role in the control of the direction of carbon flow between major metabolic pathways in basil glandular trichomes.

In addition, many enzymes with consistent protein and mRNA levels were found to be co-regulated with each other. This co-regulation was most clearly seen in line SD, where almost all of the enzymes related to the production of phenylpropanoids were downregulated, and those related to the production of terpenoids were upregulated at both the mRNA and protein levels. In all lines, PAL and 4CL showed remarkable coordination in this regard, which was consistent with previous reports (Logemann et al., 1995; Reinold and Hahlbrock, 1997). Similar co-regulation was observed for DXS and IDI. This co-regulation implies common mechanisms for transcriptional control of these enzymes. In this regard, there are over 150 different transcription factors and similar regulatory proteins in the basil EST database. However, the role of specific regulatory proteins in controlling the transcription of any enzyme in basil glandular trichomes is yet to be determined.

On the other hand, the protein and mRNA levels of other groups of enzymes differed greatly from each other. In these cases, the proteomic data were more consistent with the metabolic profiling data. Examples include p-coumaroyl-5-O-shikimate 3′-hydroxylase (C3′H) and cinnamate 4-hydroxylase (C4H), among others (Figures 2 and 4, Table S4). The major phenylpropanoid product of line MC (methylcinnamate) does not have a 4-hydroxyl group, and thus its biosynthesis requires no C4H activity. This is consistent with the TSC data, which suggested that the C4H protein, although easily detected in the other basil lines, was not present at detectable levels in line MC (Figure 4). This contrasts with the TEN data, which showed C4H mRNA levels in MC to be the highest among all lines (Figure 4 and Table S4). The same pattern in line MC was not observed for the related cytochrome P450, C3′H, demonstrating that the discrepancy in C4H protein and mRNA levels was not due to inefficient membrane protein analysis in line MC. Indeed, the production of eugenol, the major compound in SW, requires C3′H and eugenol synthase activities, and the levels of these proteins, rather than those of the respective mRNAs, were more consistent with the eugenol levels in this line compared with other lines. Based on these results, it appears that post-transcriptional regulation of these particular enzymes may be more important for controlling their activities than is transcription itself.

A third pattern was also observed that included enzymes for which neither the respective transcript nor metabolite levels appeared to have any correlation with protein levels observed for the different basil lines. Excellent examples of this included R-linalool synthase and geraniol synthase. Deviations from direct correlation between transcript, peptide, and metabolite levels could be the result of technical factors, such as differences in specific protein extraction efficiencies between the different lines, differential peptide ionization efficiencies due to residue polymorphisms for equivalent peptides from different lines, ion suppression for a peptide in a particular line due to the presence of abundant peptides not present in the other line, etc., or they may reflect real biological differences in protein abundance due to differences in protein stability in particular lines (e.g. due to targeted degradation) or post-translational regulation and modifications of the enzymes themselves.

Indeed, our results demonstrated the existence of post-translational modifications (PTMs) of several proteins (phosphorylation, ubiquitination, and arginine monomethylation; see Table S5). Many of the 28 proteins identified with high confidence as possessing PTMs are enzymes in the pathways leading to the production of terpenoids and phenylpropanoids in basil glands, including phosphoglucomutase, glucose-6-phosphate isomerase, phosphoglycerate mutase, methionine synthase, S-adenosyl-l-methionine synthetase, phenylalanine ammonia-lyase, and CVOMT, among others. An additional 25 PTM-modified peptides were identified, but at a lower confidence level (Table S5). The differential presence of these PTMs also correlated in many instances with differences in metabolite production. For example, as introduced above, line SD displayed high expression levels for CVOMT mRNA and protein (based on TEN and TSC), but it produced only small amounts of methylchavicol and it had much lower CVOMT enzyme activity than line EMX-1. One possible explanation for this observation could be a lack of precursors for chavicol production due to the downregulation of enzymes in the phenylpropanoid pathway, as discussed above. However, ubiquitinated CVOMT was found in line SD but not in the other lines, which provides an alternative explanation, because ubiquitination would lead to rapid degradation of this enzyme and lowering of enzyme activity. In addition, phosphorylated peptides of PAL were identified only in line SD, and the phosphorylation of PAL has been reported to alter its enzymatic activity (Allwood et al., 1999; Bolwell, 1992), even though phosphorylated versions of PAL have never been isolated from plants until now. Clearly, elucidation of the biological significance of the PTMs mentioned in this paper will require further investigation. Nevertheless, these findings suggest complex, multilevel regulation of metabolism in basil glandular trichomes.


Our results suggest that carbon flow can be readily redirected between the phenylpropanoid and terpenoid pathways in specific cell types. Nevertheless, multiple layers of regulation control the production of specific compounds in this single cell type. We expect that our findings are not the exception but the norm in plant metabolism. These findings have significant implications for attempts to metabolically engineer plants to produce desired outcomes. Clearly, much more research must be done to uncover all levels of regulation of metabolic pathways in specific plant cell types before rational design of specific metabolite profiles in plants can be expected to produce desired results.

Experimental procedures

Plant material and isolation of peltate trichomes

Plants of four lines of sweet basil, EMX-1, MC, SD, and SW, were grown in parallel under controlled conditions. All basil plants were grown in a growth chamber at 34°C under a photoperiod of 16-h light/8-h dark with an illumination intensity of 260 μmol m−2 sec−1 and watered daily with 20-20-20 nutrient solution (Tindara, Georgetown, MA, USA). Peltate glands were isolated from young basil leaves (2 cm in length or smaller) from 6-week-old plants using a method described elsewhere (Gang et al., 2001), with some modifications. The leaf soak buffer consisted of 5 mM 2-amino-2-(hydroxymethyl)-1,3-propanediol (TRIS)–HCl, pH 7.5, with 14 mM β-mercaptoethanol. Ribonuclease inhibitor in the gland isolation and wash buffers was replaced by 1.0 mM phenylmethylsulfonyl fluoride (PMSF) during gland isolation for proteomic analysis. Glands used for protein isolation for enzyme activity assays were isolated from plants grown in the same manner, and PMSF was not included in the buffers used for these experiments.

GC-MS analysis of basil leaves

Triplicate samples of 0.5 g of pooled whole young basil leaves (of the same developmental stage as used for trichome isolation) were extracted for metabolite analysis by shaking overnight at room temperature in 2.0 ml of ethyl acetate (EtOAc) containing 1,3,4-trichlorobenzene as internal standard, using a previously described method (Jiang et al., 2006). Filtered extracts were used for GC-MS analysis, which was performed on a Thermo Electron Trace DSQ Ultra (http://www.thermo.com/) as previously described (Jiang et al., 2006). The mass spectrometer was equipped with an Alltech (http://www.alltech.com/) Econo-Cap®-EC®-5 (30 m × 0.25 mm internal diameter × 0.25 μm) capillary column and a 5-m guard column. Ultrapure helium was used as the carrier gas at a flow rate of 1.2 ml min−1. The injection volume was 2 μl, and the split ratio was 10:1. The temperatures of the injector, transfer line, and ion source were set at 220, 250, and 200°C, respectively. After a 2-min hold at 40°C, the column oven temperature was programmed to increase to 100°C at 8°C min−1, then to 300°C at 3°C min−1 and held at 300°C for 3.5 min. The electron voltage was set to 70 eV. Eluted compounds were identified by comparison of their MS fragmentation patterns with the NIST Mass Spectral Library Version 2.0 (http://www.nist.gov/) and by elution time compared with authentic standards.
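
As a quick check on the oven program described above, the total GC run time follows from the holds and ramps. The sketch below is illustrative only: the function and variable names are hypothetical, and only the setpoints are taken from the text.

```python
# Illustrative sketch: total GC run time from the oven program in the text.
# Function and argument names are hypothetical, not part of the original method.

def oven_program_minutes(start_temp, initial_hold, ramps, final_hold):
    """Return the total run time in minutes.

    ramps is a list of (target_temp_C, rate_C_per_min) tuples applied in order.
    """
    total = initial_hold
    temp = start_temp
    for target, rate in ramps:
        total += (target - temp) / rate  # time needed to reach the next setpoint
        temp = target
    return total + final_hold


run_time = oven_program_minutes(
    start_temp=40,
    initial_hold=2.0,
    ramps=[(100, 8.0), (300, 3.0)],  # 8 C/min to 100 C, then 3 C/min to 300 C
    final_hold=3.5,
)
print(f"Total oven program: {run_time:.1f} min")
```

Under these settings each injection occupies the oven for roughly 80 min, excluding cool-down, which the text does not specify.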

Enzyme activity assays

Total soluble protein was isolated from glandular trichome samples by diluting isolated glands five-fold with enzyme extraction buffer (100 mM TRIS–HCl, 10% glycerol and 14 mM β-mercaptoethanol, pH 7.5). Glands were disrupted by sonication and the lysate was centrifuged at 15 000 g and 4°C for 15 min. Protein concentrations of the supernatants were determined using the Bio-Rad Protein Assay kit (http://www.bio-rad.com/). Assays for PAL, CVOMT, and CCMT were performed using radiometric assays as previously described (Gang et al., 2001; Kapteyn et al., 2007) using 200 ng to 2.5 μg of protein in 50-μl assays. The DXR assays were performed by spectrophotometrically monitoring the oxidation of NADPH as previously described (Takahashi et al., 1998) using 5–10 μg of protein in a 500-μl assay volume. All assays were performed in quadruplicate with controls consisting of assays with substrates withheld or assays without protein.
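
For the spectrophotometric DXR assay, a rate of absorbance change can be converted to a specific activity with the Beer-Lambert law. The sketch below is a minimal illustration, not the authors' calculation: it assumes detection at 340 nm (molar extinction coefficient of NADPH 6220 M−1 cm−1), a 1-cm path length, and a blank-corrected rate. The absorbance value in the example is hypothetical.

```python
# Illustrative sketch: specific activity from the rate of NADPH oxidation.
# Assumptions (not stated in the text): 340 nm detection, 1-cm cuvette,
# and a rate already corrected against the no-substrate/no-protein controls.

NADPH_EXTINCTION = 6220.0  # M^-1 cm^-1 at 340 nm


def specific_activity(delta_a_per_min, assay_volume_ul, protein_ug, path_cm=1.0):
    """Return nmol NADPH oxidized per min per mg protein."""
    molar_per_min = delta_a_per_min / (NADPH_EXTINCTION * path_cm)  # mol L^-1 min^-1
    nmol_per_min = molar_per_min * (assay_volume_ul * 1e-6) * 1e9   # nmol min^-1
    return nmol_per_min / (protein_ug * 1e-3)                       # per mg protein


# Hypothetical reading: 0.00622 absorbance units lost per minute,
# 500-ul assay volume and 10 ug of protein as in the text.
activity = specific_activity(delta_a_per_min=0.00622, assay_volume_ul=500, protein_ug=10)
print(f"{activity:.1f} nmol min^-1 mg^-1")
```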

Protein purification and fractionation

Isolated basil peltate glands (70 μl settled glands) were diluted 10-fold in ice-cold sonication buffer (5 mM TRIS–HCl, 1 mM PMSF, pH 8), and disrupted by sonication. The lysate was centrifuged at 10 000 g and 4°C for 15 min. The supernatant was centrifuged at 100 000 g at 4°C for 1 h using an Optima TL ultracentrifuge (Beckman, http://www.beckmancoulter.com/). The pellet (microsomal proteins) was washed with 250 μl of the sonication buffer, and again centrifuged at 100 000 g at 4°C for 1 h. The supernatant of the first ultracentrifugation step (cytosolic proteins) was precipitated with 20% (v/v) trichloroacetic acid (TCA), incubated on ice for 15 min, and centrifuged at 16 000 g at 4°C for 15 min. The pellet was washed twice with 0.1% β-mercaptoethanol in acetone at −20°C for 20 min. The protein concentrations of the cytosolic and microsomal protein fractions were determined as described above.

GeLC-MS/MS analysis of proteins

The protein pellets were resuspended in 40 μl of protein sample buffer (8.0% SDS, 30% glycerol, 1.0% β-mercaptoethanol, 0.02% bromophenol blue, 250 mM TRIS–HCl, pH 6.8), and then loaded on a TRIS–HCl Ready Gel (4–20% linear gradient; Bio-Rad). After electrophoresis, the sample gels were silver stained using a method adapted from Shevchenko et al. (1996). The sample lanes were divided into 32 equally sized pieces (∼2 mm each). Proteins in the gel pieces were digested with trypsin (Wilm et al., 1996), and extracted with 5% formic acid/50% CH3CN using a Multiprobe-II liquid handling system (PerkinElmer, http://www.perkinelmer.com/). Peptide extracts were concentrated to 10 μl using a SpeedVac vacuum centrifuge (Savant; http://www.thermo.com).

The peptides were introduced into a modified microbore HPLC system (Surveyor; ThermoFinnigan, http://www.thermo.com/) using an autosampler. This system was modified by a simple T-piece flow-splitter to operate at capillary flow rates. Sample (5 μl) was loaded on a nanocapillary reverse phase (RP)-LC column (6 cm × 100 μm internal diameter) packed with 5 μm Zorbax C18 resin. Buffers were 0.1% formic acid (A) and 0.1% formic acid in acetonitrile (ACN) (B), and the flow rate was 400 nl min−1. After a 10-min initial wash with buffer A, the separation of peptides was achieved with a linear gradient from 5% to 50% buffer B over 30 min, followed by 50–98% buffer B over 5 min, a 5-min wash at 98% B, 98–5% buffer B for 5 min, and 5% buffer B for 5 min. An LCQ-Deca XP Plus ion trap mass spectrometer (Thermo Electron, http://www.thermo.com/) with a nanospray source (electrospray voltage 1.8 kV) was used as the detector. The mass scan range was 400–1500 m/z, and data-dependent scanning was used to acquire the MS/MS spectra of the three most abundant ions in a precursor ion scan (Andon et al., 2002).
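
The data-dependent selection described above can be pictured as choosing, from each precursor scan, the three most intense ions within the 400–1500 m/z window. The sketch below is a simplified illustration with hypothetical peak values. It does not model instrument features such as dynamic exclusion or charge-state filtering, whose settings are not given in the text.

```python
# Illustrative sketch of top-N precursor selection in data-dependent scanning.
# Peak values are invented; dynamic exclusion and similar settings are not modelled.

def select_precursors(scan, top_n=3, mz_range=(400.0, 1500.0)):
    """Return the m/z values of the top_n most intense ions inside mz_range.

    scan is a list of (mz, intensity) tuples from one precursor ion scan.
    """
    low, high = mz_range
    in_range = [(mz, inten) for mz, inten in scan if low <= mz <= high]
    in_range.sort(key=lambda peak: peak[1], reverse=True)  # most intense first
    return [mz for mz, _ in in_range[:top_n]]


example_scan = [
    (350.2, 9e5),   # intense, but below the scan range
    (445.7, 2e5),
    (612.3, 8e5),
    (733.9, 5e5),
    (901.4, 1e5),
    (1600.0, 7e5),  # intense, but above the scan range
]
selected = select_precursors(example_scan)
print(selected)
```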

MudPIT analysis of proteins

The protein samples were analyzed by MudPIT with methods adapted from Breci et al. (2005). Briefly, protein pellets containing about 100 μg of proteins were resuspended in 100 μl of a urea buffer (8 M urea, 100 mM ammonium bicarbonate). Prior to trypsin digestion, the protein samples were mixed with reagents and incubated as follows: 2 μl of 100 mM dithiothreitol, 15 min; 3 μl of 100 mM iodoacetamide, 15 min in the dark; 12 μl of 0.3 μg μl−1 endoproteinase Lys-C, 37°C for 3 h; then 350 μl of fresh 100 mM ammonium bicarbonate, 50 μl of ACN, 4.5 μl of 100 mM calcium chloride, and 6 μl of 0.1 μg μl−1 trypsin (Promega, http://www.promega.com/), 37°C overnight. Tryptic peptides were purified using a Spec PT C18 solid phase extraction pipette tip (Varian, http://www.varian.com/), concentrated to near-dryness under vacuum, and dissolved in 50 μl of 0.5% formic acid. The digested protein sample was introduced into the same nano LC-MS/MS system described in the previous section with the following modifications: the column was packed with 6 cm of 100 Å, 5 μm Zorbax C18 resin and then 3 cm of 100 Å, 5 μm polyhydroxyethyl-A strong cation exchange resin (PolyLC; TheNestGroup; http://www.nestgrp.com). Buffers were 0.1% formic acid (A), 0.1% formic acid in ACN (B), 250 mM ammonium bicarbonate (C), and 1.5 M ammonium bicarbonate (D). Chromatographic separation was performed using the following method: step 1, linear gradient of 0% to 100% buffer B in buffer A over 60 min; step 2, variable ratio of buffer C in A for 4 min; step 3, linear gradient of 5% to 50% buffer B in A over 60 min; step 4, linear gradient of 50% to 98% buffer B in A over 5 min; and step 5, 98% buffer B in A for 5 min. Steps 2 to 5 were repeated with 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100% of buffer C in A for step 2. The column was finally washed with 100% buffer B for 60 min.
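
The multi-step MudPIT program above can be laid out as an explicit schedule, which makes the number of steps and the total instrument time per sample easy to check. The sketch below is illustrative: the function name and step labels are hypothetical, and it assumes no additional equilibration time between cycles, since none is stated in the text.

```python
# Illustrative sketch: expand the MudPIT program into an ordered list of steps.
# Assumes no extra equilibration time between cycles (none is stated in the text).

def mudpit_schedule(salt_steps=range(10, 101, 10)):
    """Return a list of (description, minutes) tuples for one MudPIT run."""
    steps = [("step 1: 0-100% B in A", 60)]
    for pct_c in salt_steps:
        steps.append((f"step 2: {pct_c}% C in A (salt step)", 4))
        steps.append(("step 3: 5-50% B in A", 60))
        steps.append(("step 4: 50-98% B in A", 5))
        steps.append(("step 5: 98% B in A", 5))
    steps.append(("final wash: 100% B", 60))
    return steps


schedule = mudpit_schedule()
total_min = sum(minutes for _, minutes in schedule)
print(f"{len(schedule)} steps, {total_min} min ({total_min / 60:.1f} h)")
```

With ten salt steps of 74 min each, plus the initial gradient and the final wash, one sample occupies the instrument for 860 min under these assumptions.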

Protein identification and TSC calculation

The acquired MS/MS spectra were analyzed using SEQUEST (Eng et al., 1994; Yates et al., 1995) and searched against a custom protein database. This database contained peptide sequences from three sources: (i) translated peptide sequences based on EST sequences from glandular trichomes of basil, (ii) plant protein sequences from the UniProt Knowledgebase (Release 6.9), and (iii) peptide sequences representing common protein contamination in proteomics experiments. Static modification of Cys with a m/z shift of 57 and differential modification of Met with a m/z shift of 16 were considered during the searching.

After searching against the database, the Trans-Proteomic Pipeline (TPP, version 2.9.5; GALE, Institute for Systems Biology, Seattle, WA, USA; Keller et al., 2002; Nesvizhskii and Aebersold, 2005; Nesvizhskii et al., 2003) was used to filter the proteomics datasets. Based on the TPP results, a minimal list of proteins that are sufficient to explain all peptides observed during the database searching was generated using MS Access. A representative protein ID was selected for each of the protein groups when the identification of a single protein was not conclusive. Proteins identified as common contaminants and proteins with a protein probability of <0.95 were discarded, which reduced the false positive rate to <0.75%.
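
The idea of a minimal protein list that explains all observed peptides can be illustrated as a set-cover problem. The sketch below uses a simple greedy heuristic on hypothetical data. It is a swapped-in technique for illustration only: it is not the MS Access procedure used here, and a greedy pass gives a small list but is not guaranteed to give the smallest possible one.

```python
# Illustrative sketch: greedy selection of a small protein list that explains
# every observed peptide. This is NOT the authors' MS Access procedure; the
# protein and peptide identifiers below are invented.

def minimal_protein_list(protein_to_peptides):
    """Return proteins chosen greedily until all peptides are explained.

    protein_to_peptides maps a protein ID to the set of peptides matched to it.
    """
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # Pick the protein explaining the most still-unexplained peptides;
        # iterating in sorted order makes ties resolve alphabetically.
        best = max(
            sorted(protein_to_peptides),
            key=lambda prot: len(protein_to_peptides[prot] & uncovered),
        )
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


candidates = {
    "P1": {"a", "b", "c"},
    "P2": {"b", "c"},       # fully explained by P1, so never needed
    "P3": {"d"},
    "P4": {"c", "d"},
}
proteins = minimal_protein_list(candidates)
print(proteins)
```

In this toy case P3 and P4 explain the remaining peptide equally well, which mirrors the situation in which a representative ID has to be chosen for a protein group.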

Sequences of known plant enzymes were obtained from Expasy (http://www.expasy.org/enzyme) based on their EC numbers. These sequences were used as queries in BLAST searches against a protein sequence database containing all identified proteins with a protein probability of >0.95. The resultant protein list was validated manually, and the spectral count number of each protein was parsed from TPP result files using a Perl script (Find_peptides.pl), available through the Gang lab web page at http://ag.arizona.edu/research/ganglab/links.htm. The TSC was calculated as the sum of spectral count numbers of proteins identified as having the same identity or function. The TSC value for each gene in a specific line was normalized as a proportion of the total TSC value for all proteins from the same line. The same approach was used to obtain normalized TENs from the EST database.
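
The TSC calculation and normalization described above can be sketched as follows. This is a minimal illustration and not the Find_peptides.pl script: the data structures, function name, and spectral counts are hypothetical, and only the arithmetic (sum per function, then divide by the line total) follows the text.

```python
# Illustrative sketch of the TSC normalization; not the authors' Perl script.
# All protein IDs and spectral counts below are invented for the example.
from collections import defaultdict


def normalized_tsc(spectral_counts, protein_function):
    """Return {line: {function: normalized TSC}}.

    spectral_counts:  {line: {protein_id: spectral count}}
    protein_function: {protein_id: assigned identity or function}
    """
    result = {}
    for line, counts in spectral_counts.items():
        line_total = sum(counts.values())  # total TSC for all proteins in this line
        tsc = defaultdict(int)
        for protein, count in counts.items():
            tsc[protein_function[protein]] += count  # sum over proteins of the same function
        result[line] = {fn: value / line_total for fn, value in tsc.items()}
    return result


counts = {
    "SD": {"prot1": 6, "prot2": 4, "prot3": 10},
    "MC": {"prot1": 1, "prot3": 9},
}
functions = {"prot1": "CVOMT", "prot2": "CVOMT", "prot3": "PAL"}
tsc = normalized_tsc(counts, functions)
print(tsc)
```

Because each value is a proportion of the line total, lines with different overall numbers of identified spectra can be compared directly. The same function could be applied to EST counts to obtain normalized TENs.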

Identification of post-translational modifications

A Perl script Append.pl (http://proteomics.arizona.edu/toolbox.html) was used to concatenate all DTA files generated during SEQUEST searching. The combined files were analyzed by X!tandem (Fenyo and Beavis, 2003) using mass shifts and neutral losses associated with one of the following three PTMs: (i) phosphorylation (mass shift of +80 on STY and neutral loss of −98 on ST; DeGnore and Qin, 1998), (ii) arginine methylation (mass shift of +14 on R; Brame et al., 2004), and (iii) lysine ubiquitination (mass shift of +144 on K; Peng et al., 2003). A downsized database was used for the X!tandem searching. It contained peptide sequences from three sources: (i) proteins identified by SEQUEST searching and TPP processing, including all proteins with a group probability >0.10, (ii) plant proteins with the specific PTM in the UniProt database, and (iii) the translated EST sequences analogous to (ii). The output xml files were submitted to the GPM server (http://h319.thegpm.org/tandem/thegpm_upview.html), and the spectral matching results of all peptides with the target PTM were manually reviewed.
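
The three PTM search definitions above amount to a small lookup table of target residues and mass shifts. The sketch below lists the candidate modification sites in a peptide for each PTM. It is illustrative only and is not X!tandem configuration syntax: the table layout, function name, and example peptide are hypothetical, and the mass shifts are copied as reported in the text.

```python
# Illustrative sketch: candidate PTM sites in a peptide sequence.
# Mass shifts and target residues are those reported in the text; the
# dictionary layout and the example peptide are hypothetical.

PTM_RULES = {
    "phosphorylation": {"residues": "STY", "mass_shift": 80,
                        "neutral_loss": 98, "neutral_loss_residues": "ST"},
    "arginine methylation": {"residues": "R", "mass_shift": 14},
    "lysine ubiquitination": {"residues": "K", "mass_shift": 144},
}


def candidate_sites(peptide, ptm):
    """Return (position, residue) pairs that could carry the given PTM.

    Positions are 1-based, counted from the peptide N-terminus.
    """
    residues = PTM_RULES[ptm]["residues"]
    return [(i + 1, aa) for i, aa in enumerate(peptide) if aa in residues]


peptide = "ASTKYR"  # invented sequence
sites = {ptm: candidate_sites(peptide, ptm) for ptm in PTM_RULES}
for ptm, found in sites.items():
    print(ptm, found)
```

A search engine would test each candidate site against the MS/MS spectrum. The manual review described above is still needed to confirm which site, if any, is supported by the fragment ions.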


We wish to thank Paul Haynes for advice on the shotgun proteomics experimental design, Linda Breci, George Tsaprailis, and Fatimah Hickman at the Arizona Proteomics Consortium for assistance with proteomics sample analysis and data mining, and the National Science Foundation (MCB-0210170) and the NRI of the USDA CSREES (ARZT-329100-G-25-532) for financial support. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the National Science Foundation.