Profiles of volatile metabolites in basil lines
Major volatile compounds in the leaves of basil lines SW, MC, EMX-1, and SD were analyzed using GC-MS (see Figure 1 for examples of typical results). The phenylpropenes eugenol and methylchavicol are the major constituents in SW and EMX-1, respectively. In contrast, line MC contains almost no phenylpropenes, but primarily accumulates methylcinnamate in its glands. The major volatile constituents of line SD are terpenoids, including the monoterpenoids neral and geranial and several sesquiterpenoids, such as β-caryophyllene, germacrene D, and α-bisabolene. The major terpenoid in both EMX-1 and MC is the monoterpenoid 1,8-cineole. Line SW produces large amounts of linalool in addition to 1,8-cineole and an array of sesquiterpenoids, whereas MC produces only trace levels of sesquiterpenoids. Line EMX-1 produces only low levels of terpenoids compared with its major phenylpropanoid pathway-derived volatile constituent, methylchavicol. These results (see Table 1 for a summary) are consistent with previous findings (Gang et al., 2001; Iijima et al., 2004a).
Figure 1. Representative total ion GC-MS chromatograms showing different metabolic profiles of basil lines. Ethyl acetate extracts of basil leaves were analyzed. All vertical axis scales are normalized to peak area of the internal standard (ITS, 1,3,4-trichlorobenzene). Major compounds identified are: 1, 1,8-cineole; 2, limonene; 3, E-β-ocimene; 4, cis-δ-terpineol; 5, α-terpinene; 6, fenchone; 7, linalool; 8, camphor; 9, cis-verbenol; 10, borneol; 11, trans-verbenol; 12, α-terpineol; 13, methylchavicol; 14, chavicol; 15, neral; 16, geranial; 17, bornyl acetate; 18, methyl nerolate; 19, α-cubebene; 20, eugenol; 21, neryl acetate; 22, copaene; 23, E-methylcinnamate; 24, methyleugenol; 25, β-caryophyllene; 26, α-bergamotene; 27, α-humulene; 28, E-β-farnesene; 29, muurola-4,5-diene; 30, germacrene D; 31, β-selinene; 32, α-selinene + bicyclogermacrene; 33, α-bulnesene; 34, β-bisabolene; 35, γ-cadinene; 36, δ-cadinene; 37, α-bisabolene; 38, Z-nerolidol; 39, cubenol; 40, α-cadinol. See Table 1 for a summary of the major compounds produced in each basil line.
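The normalization to the internal standard mentioned in the Figure 1 legend can be sketched as follows. This is a minimal illustration, assuming that each peak area is simply divided by the internal-standard peak area from the same run; the function name and the peak areas are hypothetical and are not measured data.

```python
def normalize_to_internal_standard(peak_areas, standard_name):
    """Divide every peak area by the internal-standard peak area of the same run."""
    standard_area = peak_areas[standard_name]
    if standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    # The standard itself is dropped from the result; it is always 1 by definition.
    return {
        compound: area / standard_area
        for compound, area in peak_areas.items()
        if compound != standard_name
    }


# Hypothetical peak areas (arbitrary units) for one leaf extract; not real data.
example_run = {"ITS": 2.0e6, "eugenol": 9.0e6, "linalool": 5.0e6, "1,8-cineole": 1.0e6}
relative = normalize_to_internal_standard(example_run, "ITS")
```

Expressing each compound relative to the standard makes chromatograms from different extracts comparable even when injection volume or detector response differs between runs.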
Table 1. Major phenylpropanoid and terpenoid metabolites produced by glandular trichomes of the four basil lines evaluated in this investigation
| Basil line | Phenylpropanoids | Monoterpenoids | Sesquiterpenoids |
|---|---|---|---|
| SW | Eugenol | Large amounts of linalool; smaller amounts of 1,8-cineole | Large amounts of α-bergamotene, germacrene D, α-selinene + bicyclogermacrene, γ-cadinene, and α-cadinol |
| MC | Methylcinnamate | 1,8-Cineole | Small amounts of E-β-farnesene, germacrene D, γ-cadinene, and α-cadinol |
| EMX-1 | Methylchavicol | Small amounts of 1,8-cineole and fenchone | Only trace amounts, mainly α-humulene and α-bisabolene |
| SD | Small amounts of methylchavicol | Large amounts of citral (neral + geranial) | Large amounts of β-caryophyllene, germacrene D, α-selinene + bicyclogermacrene, and α-bisabolene |
The basil glandular trichome proteome
The proteome from the glands of each basil line was divided into microsomal and cytosolic fractions, which were analyzed by GeLC-MS/MS, and the cytosolic fraction was additionally analyzed using MudPIT. A custom peptide sequence database, containing both translated EST sequences from peltate glands of sweet basil and plant protein sequences from UniProt, was used in this analysis (see Experimental procedures). Due to the nature of this analysis and the stringency of the identification criteria, only the most abundant proteins or those proteins which yielded peptides that were very amenable to ionization in an electrospray ion source were identified.
The basil glandular trichome proteome dataset consisted of nearly 2000 non-redundant protein identifications; probability assignment and validation of this set of proteins using the Trans-Proteomic Pipeline (TPP) yielded a set of 881 unique proteins that were identified with high confidence (see Table S1). Rubisco, the most abundant leaf protein, was not found in any of our trichome protein samples, although the large and small subunits of this protein were by far the most abundant polypeptides in total leaf protein extracts from basil leaves (see Table S2) and this protein has been readily identified in other proteomic investigations (Hajheidari et al., 2005; Koller et al., 2002; Porubleva et al., 2001). Other proteins related to photosynthesis were also abundant in the total leaf protein extract, but were absent from the glandular trichome proteomes. Moreover, the most abundant proteins identified in the basil glandular trichome proteomes often corresponded well to the most abundant ESTs in the different basil lines (see Figure 2 for some examples relevant to the shikimate, phenylpropanoid, and terpenoid pathways). These results indicate that the gland preparations used for protein isolation and proteomic analysis were not contaminated by other cell types from basil leaves and that the protein samples analyzed do indeed represent the proteins present in the glands themselves.
Figure 2. Comparison of transcript, peptide and corresponding metabolite levels for selected genes/enzymes in the shikimate/phenylpropanoid and terpenoid pathways. (a) 3-Deoxy-d-arabino-heptulosonate-7-phosphate synthase (DAHPS) versus total volatile phenylpropanoids. (b) All genes/enzymes in shikimate pathway versus total volatile phenylpropanoids. (c) Phenylalanine ammonia lyase (PAL) versus total volatile phenylpropanoids. (d) p-Coumarate/coniferyl alcohol acetyltransferase (CAAT) versus volatile phenylpropenes. (e) p-Coumaroyl-5-O-shikimate 3′-hydroxylase (C3′H) versus 3-hydroxylated phenylpropenes. (f) 1,8-Cineole synthase versus 1,8-cineole (the only derivative of cineole produced by basil). Relative standard errors of <10% were observed for metabolite data. Bar colors for different basil lines are: black, SW; dark grey, MC; light grey, EMX-1; white, SD.
The proteomes of the four basil lines were found to be highly diverse, as indicated by great differences in the most abundant proteins. This contrasts with what might be expected, given that the same cell type from varieties of the same species was used in this investigation. When analogous proteins from the four different basil lines were considered to be the same protein, using the criterion that they most likely catalyzed the same reaction or served the same function in the plant, a total of 492 proteins were identified. Strikingly, of these only 71 (14.4%) were common to all four lines and 245 (49.8%) were unique to individual basil lines (Figure 3a and Table S1). For this non-redundant protein set, 118 proteins (24%) did not have a corresponding match in the basil trichome EST database (Figure 3b). Given the low false positive rate (<0.75%) of the protein identification, these observations might suggest that the EST database used in this study, although it represents a relatively large number of genes for a single cell type (7963 unigenes total), still gives a relatively incomplete picture of all of the genes being actively transcribed in the glandular trichomes, with several proteins that are either abundant or have easily ionized peptides lacking EST support. In many of these cases, however, a BLAST search back against the basil database using the amino acid sequence of the UniProt protein hit revealed one or more basil EST contigs with strong similarity to that UniProt protein. Some of these UniProt identifications may therefore represent false negatives in the search against the basil EST database, that is, failures of the data processing methods and search algorithms, suggesting that the data processing software may require further refinement.
Figure 3. Venn diagrams showing (a) overlap among proteomes of four basil cultivars and (b) overlap of protein entries identified as basil glandular trichome expressed sequence tag (EST) entries or as plant protein IDs from UniProt.
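The overlap counts summarized in the Venn diagram can be obtained with simple set operations. The sketch below is only an illustration of that bookkeeping: the function name is hypothetical, and the protein memberships assigned to each line are invented toy values, not the contents of Table S1.

```python
from collections import Counter


def summarize_overlap(proteomes):
    """Count proteins shared by every line and proteins found in exactly one line."""
    all_proteins = set().union(*proteomes.values())
    shared_by_all = set.intersection(*proteomes.values())
    # Number of lines in which each protein was detected.
    lines_per_protein = Counter(p for members in proteomes.values() for p in members)
    unique_to_one = {p for p, n in lines_per_protein.items() if n == 1}
    total = len(all_proteins)
    return {
        "total": total,
        "shared_by_all": len(shared_by_all),
        "unique_to_one_line": len(unique_to_one),
        "percent_shared": 100.0 * len(shared_by_all) / total,
        "percent_unique": 100.0 * len(unique_to_one) / total,
    }


# Toy memberships for illustration only (hypothetical, not from Table S1).
toy_proteomes = {
    "SW": {"PAL", "DXS", "EGS", "CVOMT"},
    "MC": {"PAL", "CCMT", "DXS"},
    "EMX-1": {"PAL", "CVOMT", "4CL"},
    "SD": {"PAL", "DXS", "GES", "CVOMT"},
}
summary = summarize_overlap(toy_proteomes)
```

Counting how many lines contain each protein also makes it straightforward to apply other thresholds, for example proteins present in at least three of the four lines.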
We looked more closely at the proteins that were common to the glandular trichome proteomes of at least three of the four basil lines, because we surmised that this strong conservation of protein expression could be indicative of an essential role in glandular trichome cell biology or in common metabolic pathways. A total of 119 non-redundant proteins were identified that met these criteria (Table S3). The majority of these (72 proteins) are indeed enzymes essential for the metabolic processes involved in specialized metabolism, including members of the 2-C-methyl-d-erythritol 4-phosphate (MEP)/terpenoid and shikimate/phenylpropanoid pathways.
Spearman’s rank correlation coefficient determination and protein levels
To evaluate the relationship between transcript and protein levels for the basil glandular trichome transcriptome and proteome at large, we calculated Spearman’s rank correlation coefficient (rs) for the 164 genes with the highest transcript levels using the raw total spectra count (TSC) (Liu et al., 2004) as a measure of protein abundance and the raw total EST number (TEN) from the EST database as the measure of transcript level. Genes that did not meet minimal abundance criteria for either the peptide spectra or the cDNA count level were excluded from the analysis to ensure that comparisons would not reflect stochastic qualitative differences due to library sampling or peptide ionization effects.
More than half of the genes included in the analysis are biosynthetic enzymes contributing directly to specialized metabolite biosynthesis. The rest of the genes included housekeeping and structural proteins as well as apparent enzymes whose roles are unknown. The rs and associated probability values presented in Table 2 suggest that regulation of transcript levels contributes significantly to control of the production of enzymes involved in specialized metabolite biosynthesis as well as housekeeping and structural proteins in this plant cell type.
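For readers unfamiliar with the statistic, Spearman's rs is the Pearson correlation of the ranks of the two variables, with tied values given the mean of their ranks. The following is a minimal sketch of that calculation for paired TEN and TSC counts. The function names, the abundance cutoffs `min_ten` and `min_tsc`, and the example counts are all assumptions made for illustration; the actual minimal abundance criteria are not stated in this section, and the P-value calculation is omitted.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rs(ten, tsc, min_ten=0, min_tsc=0):
    """Spearman's rs for genes that pass both (hypothetical) abundance cutoffs."""
    pairs = [(t, s) for t, s in zip(ten, tsc) if t >= min_ten and s >= min_tsc]
    ten_ranks = average_ranks([t for t, _ in pairs])
    tsc_ranks = average_ranks([s for _, s in pairs])
    return pearson(ten_ranks, tsc_ranks)


# Hypothetical EST counts (TEN) and spectral counts (TSC) for six genes.
example_ten = [120, 45, 45, 10, 3, 80]
example_tsc = [60, 30, 12, 8, 1, 25]
example_rs = spearman_rs(example_ten, example_tsc, min_ten=5, min_tsc=2)
```

Because only ranks enter the calculation, raw counts can be used directly, without assuming a linear relationship between EST number and spectral count.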
Table 2. Correlation of transcript and protein abundance for the 164 most highly expressed genes in basil glandular trichomes and for genes involved in the biosynthesis of terpenoid and phenylpropanoid compounds
| Pathway/line | rs* | Probability, P-value | No. genes |
|---|---|---|---|
| Most abundant genes in EST database | | | |
| SW | 0.49 | <3.54 × 10⁻¹¹ | 164 |
| MC | 0.47 | <2.08 × 10⁻¹⁰ | 164 |
| EMX-1 | 0.57 | <1.72 × 10⁻¹⁵ | 164 |
| SD | 0.42 | <2.66 × 10⁻⁸ | 164 |
We also evaluated the overall correlation between protein and transcript levels for the genes related to phenylpropanoid and terpenoid production. The rs for genes of primary/core metabolism and for one-carbon metabolism were significant and similar for all four basil lines. Interestingly, these values suggested a very strong correlation for genes involved in generating one-carbon units for S-adenosyl-l-methionine (AdoMet)-dependent methylation reactions, indicating that this process in the trichomes may be closely regulated at a transcriptional level. The rs for genes of phenylpropanoid metabolism suggested a significant relationship between transcript and protein levels for the pathway as a whole only in line MC, while significant correlation in terpenoid metabolism was only observed for lines SD and SW, which produce the highest levels of terpenoid compounds.
Differential expression of specific enzymes controls the chemical diversity of basil lines
Comparisons of normalized TSC and total EST number (TEN) data for specific known enzymes involved in the production of phenylpropanoids and terpenoids are shown in Table S4. Enzymes from core metabolism essential for the production of precursors for the shikimate/phenylpropanoid and terpenoid pathways, including sucrose synthase, 6-phosphogluconate dehydrogenase, transaldolase, transketolase, and glyceraldehyde-3-phosphate dehydrogenase, had high TSC values in all basil lines, in keeping with their central roles in all metabolic processes in the glandular trichomes. Variation of the TSCs between the four lines [average relative standard deviation (RSD) 44.1%] for these core enzymes was much smaller than that of enzymes in more specialized pathways such as the phenylpropanoid and terpenoid pathways (average RSD 110.0%). Many enzymes in these latter pathways displayed great variation in protein levels between basil lines, which often, but not always, matched the variation in transcript levels (Figures 2, 4, and 5).
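The relative standard deviation (RSD) used above to compare variability between enzyme groups is the standard deviation divided by the mean, expressed as a percentage. A minimal sketch follows, assuming the sample standard deviation across the four lines is used (the source does not say whether sample or population SD was applied). The enzyme names and TSC values are hypothetical.

```python
import statistics


def relative_sd_percent(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


def average_rsd(tsc_by_enzyme):
    """Mean RSD over a group of enzymes, each with one TSC value per basil line."""
    return statistics.mean(relative_sd_percent(v) for v in tsc_by_enzyme.values())


# Hypothetical TSC values in the order SW, MC, EMX-1, SD; not data from Table S4.
core_group = {"core_enzyme_A": [10, 12, 8, 10], "core_enzyme_B": [20, 20, 20, 20]}
specialized_group = {"specialized_enzyme_C": [0, 0, 0, 40]}

core_avg = average_rsd(core_group)
specialized_avg = average_rsd(specialized_group)
```

An enzyme detected in only one of four lines gives an RSD of 200% under this definition, which illustrates how group averages above 100% can arise for specialized pathways.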
Figure 4. Comparison of peptide and transcript profiles for enzymes in the phenylpropanoid pathway. Peptide, enzyme activity, and mRNA levels are indicated by solid black, solid grey, and hollow bars, respectively.
Figure 5. Comparison of peptide and transcript profiles for enzymes in the 2-C-methyl-d-erythritol 4-phosphate (MEP) and mevalonate (MVA) pathways. Peptide, enzyme activity, and mRNA levels are indicated by solid black, solid grey, and hollow bars, respectively.
The most interesting difference observed for the enzymes in the shikimate pathway was that they appeared to be expressed at very low levels in line SD (Figure 2a,b), which produces low levels of volatile phenylpropanoids, compared with their high expression in line EMX-1, which produces high levels of the phenylpropene methylchavicol. These two basil lines are very similar in morphology and growth habit, suggesting that they are close relatives despite their different chemotypes (phenylpropanoid versus terpenoid production). This difference can be partially explained by the observed differences in transcript levels for the shikimate pathway relative to the terpenoid pathway. The only volatile phenylpropanoid produced at appreciable levels by line SD is methylchavicol, although at much lower levels than in line EMX-1. This is the case even though chavicol O-methyltransferase (CVOMT; the enzyme directly responsible for formation of methylchavicol) was expressed in line SD at about twice the level, measured by both transcript and protein, observed for line EMX-1 (the high methylchavicol accumulator). The lower methylchavicol production in line SD may be explained by the very low expression in that line of most of the phenylpropanoid pathway genes, as indicated by low mRNA transcript levels and the absence of any proteomics support (Figure 4 and Table S4). This is especially true for cinnamoyl-CoA reductase and p-coumaryl/coniferyl alcohol acetyltransferase, which are important enzymes directly upstream of chavicol production and which were apparently expressed in line SD at levels too low to be detected in our experiments. Thus, limited availability of chavicol from the phenylpropanoid pathway may be responsible for the low methylchavicol production in line SD.
This apparent discrepancy is addressed further below.
In contrast, many enzymes related to the production of terpenoids were expressed at high levels in line SD (Figure 5). These included most of the enzymes in the MEP pathway (Rodriguez-Concepcion and Boronat, 2002; Rohmer, 1999; Rohmer et al., 1996), isopentenyl-diphosphate delta-isomerase (IDI), and all of the prenyltransferases identified. This protein expression pattern can help explain the chemical profile of line SD relative to the other lines, where reduced production of phenylpropanoids and the increased biosynthesis of terpenoids appear to have resulted from reduced flux into the shikimate/phenylpropanoid pathway and diversion of upstream metabolite intermediates into the MEP and subsequent terpenoid pathways due to differential levels of transcripts for pathway entry point genes and corresponding differences in levels of the encoded proteins. Our results also suggest that the low level of total terpenoid production observed for line EMX-1 (see Figure 1 and Table 1) can be attributed to the expression levels of the first two enzymes in the MEP pathway. Among the four lines, EMX-1 had the lowest mRNA and undetectable protein levels for 1-deoxy-d-xylulose-5-phosphate (DOXP) synthase (DXS) and DOXP reductoisomerase (DXR), the first two enzymes in the MEP pathway (Figure 5 and Figure S1). Assays for DXR activity (see Figure 5) further support these findings. Of the 13 enzymes that could play roles in production of the precursors of terpenoid (isoprenoid) biosynthesis – isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) – only the eight MEP pathway enzymes were detected in our proteomics dataset. Of these, seven enzymes were found to be highly expressed in line SD (Figure 5 and Table S4). The mevalonate pathway appeared to be practically inactive in all basil lines, as indicated by very low TEN values and the absence of peptides for any of the enzymes in the pathway (Figure 5 and Table S4).
Enzymes downstream from IPP and DMAPP production, including two important prenyl transferases – geranyl diphosphate synthase (GPPS) and farnesyl diphosphate synthase (FPPS) – were found in lines SW, SD, and EMX-1 (Table S4). Line MC contained peptides for GPPS but none for FPPS. These results may explain why line MC produces appreciable levels of monoterpenoids but only very low levels of sesquiterpenoids. In addition, 12 different terpene synthases (TPSs) were found in the EST database. Peptides for eight of these proteins were detected. It is interesting to note that the protein and mRNA levels of TPSs were not particularly high in SD, which was not consistent with previous results regarding their enzymatic activities (Iijima et al., 2004a), suggesting that post-translational activation of specific TPSs may occur in sweet basil.
These results, taken together with the results described above for the control of phenylpropanoid pathway-derived compounds and the apparently high level of genetic similarity between basil lines SD and EMX-1, suggest that the shikimate/phenylpropanoid and the MEP/terpenoid pathways compete for carbon and that carbon flow into these two pathways is tightly regulated in the glandular trichomes of basil. These results also suggest that efforts to alter production of specific compounds in one pathway may very well have an effect on the production of metabolites from the other pathway. The different metabolic phenotypes of these basil lines may be due to a small number of genetic perturbations that occurred during their breeding and development. Therefore, it may be possible to experimentally test the ability to switch flux between the terpenoid and phenylpropanoid metabolic branches by altering the expression of key differentially expressed genes in basil.
Transcriptional and post-transcriptional regulation of metabolism in basil glandular trichomes
For many enzymes observed in this investigation, differential protein levels among the different basil lines appeared to correspond to parallel differences in mRNA levels. This provided us with an opportunity to validate the proteomic and transcriptomic results against each other. Although ‘omics’-level analyses are generally considered to be hypothesis-generating rather than hypothesis-testing approaches, conclusions derived from such analyses are more reliable when consistent results are obtained from investigations at different levels (Ge et al., 2003). This is because independent experiments are extremely unlikely to yield the same false result repeatedly. For example, it is noteworthy that the mRNA levels of all five enzymes in the mevalonate (MVA) pathway were consistently very low, confirming a previous report (Iijima et al., 2004a), and their proteins were undetectable. These observations, obtained from eight separate omics-level datasets, substantiate the conclusion that the MVA pathway does not play a significant role in the production of terpenoids in basil glandular trichomes and that terpenoid precursors are (perhaps) exclusively supplied by the MEP pathway in this single cell type (Figure 5). Moreover, the MEP pathway is localized to the plastids (Rodriguez-Concepcion and Boronat, 2002), but farnesyl diphosphate synthase (FPPS) and sesquiterpene synthases, required for the synthesis of sesquiterpenoids, are cytosolic enzymes (Dudareva et al., 2005; Steele et al., 1998; Szkopinska and Plochocka, 2005). The production of sesquiterpenoids in basil glands from MEP pathway products suggests transport of terpenoid precursors (IPP or DMAPP) out of the plastids and into the cytosol (Figure 5). Experiments that evaluate the incorporation of labeled precursors of the MEP versus MVA pathways into mono- and sesquiterpenoids in basil glands will be able to further support or refute this hypothesis.
However, our results are clearly in line with other reports that have proposed near-exclusive involvement of the MEP pathway in production of specialized (‘secondary’) metabolites in other plant species (Dudareva et al., 2005; Kasahara et al., 2002). Indeed, our results support the hypothesis that perhaps only steroid-derived terpenoids are produced by the MVA pathway in plants. Furthermore, having both the shikimate and MEP pathways localized to the plastid allows for direct crosstalk within a single subcellular compartment and reciprocal control of these pathways, which are involved in the production of precursors for the phenylpropanoids and terpenoids, respectively, produced by basil.
Several enzymes evaluated in this investigation reside at important regulatory points in their respective pathways. In all such cases, mRNA and protein levels were consistent with the metabolic profiles in the respective basil line. For example, 3-deoxy-d-arabino-heptulosonate-7-phosphate synthase (DAHPS), phenylalanine ammonia lyase (PAL), and DXS/DXR, the entry and control points for the shikimate, phenylpropanoid, and terpenoid pathways, respectively (Bate et al., 1994; Carretero-Paulet et al., 2002; Rodriguez-Concepcion, 2006), had both protein and mRNA expression profiles that were consistent with the production of phenylpropanoids and terpenoids in the glandular trichomes of specific basil lines (Figures 2, 4, and 5). Similar profiles were observed for other control point enzymes, such as p-coumarate/cinnamate carboxymethyltransferase (CCMT) and CVOMT. Enzyme activity assays for several of these enzymes (we measured activity for PAL, DXR, CCMT, and CVOMT) also confirmed that enzyme activity levels were often associated with protein levels (see Figures 4 and 5). Similar relationships were observed for many other important enzymes, such as 4-coumaroyl-CoA ligase (4CL), caffeoyl-CoA O-methyltransferase (CCOMT), and IDI (Table S4). The good correlation among mRNA, protein, enzyme activity, and metabolite levels in these cases suggests that these pathway nodes are regulated at the transcriptional level and that control of gene expression plays an essential role in the control of the direction of carbon flow between major metabolic pathways in basil glandular trichomes.
In addition, many enzymes with consistent protein and mRNA levels were found to be co-regulated with each other. This co-regulation was most clearly seen in line SD, where almost all of the enzymes related to the production of phenylpropanoids were downregulated, and those related to the production of terpenoids were upregulated at both the mRNA and protein levels. In all lines, PAL and 4CL showed remarkable coordination in this regard, which was consistent with previous reports (Logemann et al., 1995; Reinold and Hahlbrock, 1997). Similar co-regulation was observed for DXS and IDI. This co-regulation implies common mechanisms for transcriptional control of these enzymes. In this regard, there are over 150 different transcription factors and similar regulatory proteins in the basil EST database. However, the role of specific regulatory proteins in controlling the transcription of any enzyme in basil glandular trichomes is yet to be determined.
On the other hand, the protein and mRNA levels of other groups of enzymes differed greatly from each other. In these cases, the proteomic data were more consistent with the metabolic profiling data. Examples include p-coumaroyl-5-O-shikimate 3′-hydroxylase (C3′H) and cinnamate 4-hydroxylase (C4H), among others (Figures 2 and 4, Table S4). The major phenylpropanoid product of line MC (methylcinnamate) does not have a 4-hydroxyl group, and thus its biosynthesis requires no C4H activity. This is consistent with the TSC data, which suggested that the C4H protein, although easily detected in the other basil lines, was not present at detectable levels in line MC (Figure 4). This contrasts with the TEN data, which showed C4H mRNA levels in MC to be the highest among all lines (Figure 4 and Table S4). The same pattern in line MC was not observed for the related cytochrome P450, C3′H, demonstrating that the discrepancy between C4H protein and mRNA levels was not due to inefficient membrane protein analysis in line MC. Indeed, the production of eugenol, the major compound in SW, requires C3′H and eugenol synthase activities, and the levels of these proteins, rather than of the respective mRNAs, were more consistent with the eugenol levels in this line compared with other lines. Based on these results, it appears that post-transcriptional regulation of these particular enzymes may be more important for controlling their activities than is transcription itself.
A third pattern was also observed that included enzymes for which neither the respective transcript nor metabolite levels appeared to have any correlation with protein levels observed for the different basil lines. Excellent examples of this included R-linalool synthase and geraniol synthase. Deviations from direct correlation between transcript, peptide, and metabolite levels could be the result of technical factors, such as differences in specific protein extraction efficiencies between the different lines, differential peptide ionization efficiencies due to residue polymorphisms for equivalent peptides from different lines, ion suppression for a peptide in a particular line due to the presence of abundant peptides not present in the other line, etc., or they may reflect real biological differences in protein abundance due to differences in protein stability in particular lines (e.g. due to targeted degradation) or post-translational regulation and modifications of the enzymes themselves.
Indeed, our results demonstrated the existence of post-translational modifications (PTMs) of several proteins (phosphorylation, ubiquitination, and arginine monomethylation; see Table S5). Many of the 28 proteins identified with high confidence as possessing PTMs are enzymes in the pathways leading to the production of terpenoids and phenylpropanoids in basil glands, including phosphoglucomutase, glucose-6-phosphate isomerase, phosphoglycerate mutase, methionine synthase, S-adenosyl-l-methionine synthetase, phenylalanine ammonia-lyase, and CVOMT, among others. An additional 25 modified peptides were identified, but at a lower confidence level (Table S5). The differential presence of these PTMs also correlated in many instances with differences in metabolite production. For example, as introduced above, line SD displayed high expression levels for CVOMT mRNA and protein (based on TEN and TSC), but it produced only small amounts of methylchavicol and had much lower CVOMT enzyme activity than line EMX-1. One possible explanation for this observation could be a lack of precursors for chavicol production due to the downregulation of enzymes in the phenylpropanoid pathway, as discussed above. However, ubiquitinated CVOMT was found in line SD but not in the other lines, which provides an alternative explanation, because ubiquitination would lead to rapid degradation of this enzyme and thus to lower enzyme activity. In addition, phosphorylated peptides of PAL were identified only in line SD. Phosphorylation of PAL has been reported to alter its enzymatic activity (Allwood et al., 1999; Bolwell, 1992), although phosphorylated forms of PAL had not previously been isolated from plants. Clearly, elucidation of the biological significance of the PTMs mentioned in this paper will require further investigation. Nevertheless, these findings suggest complex, multilevel regulation of metabolism in basil glandular trichomes.