The biosynthesis of Caryophyllaceae-like cyclic peptides in Saponaria vaccaria L. from DNA-encoded precursors




Cyclic peptides (CPs) are produced in a very wide range of taxa. Their biosynthesis generally involves either non-ribosomal peptide synthetases or ribosome-dependent production of precursor peptides. Plants within the Caryophyllaceae and certain other families produce CPs which generally consist of 5–9 proteinogenic amino acids. The biological roles for these CPs in the plant are not very clear, but many of them have activity in mammalian systems. There is currently very little known about the biosynthesis of CPs in the Caryophyllaceae. A collection of expressed sequence tags from developing seeds of Saponaria vaccaria was investigated for information about CP biosynthesis. This revealed genes that appeared to encode CP precursors which are subsequently cyclized to mature CPs. This was tested and confirmed by the expression of a cDNA encoding a putative precursor of the CP segetalin A in transformed S. vaccaria roots. Similarly, extracts of developing S. vaccaria seeds were shown to catalyze the production of segetalin A from the same putative (synthetic) precursor. Moreover, the presence in S. vaccaria seeds of two segetalins, J [cyclo(FGTHGLPAP)] and K [cyclo(GRVKA)], which was predicted by sequence analysis, was confirmed by liquid chromatography/mass spectrometry. Sequence analysis also predicts the presence of similar CP precursor genes in Dianthus caryophyllus and Citrus spp. The data support the ribosome-dependent biosynthesis of Caryophyllaceae-like CPs in the Caryophyllaceae and Rutaceae.


Cyclic peptides (CPs) represent a diverse class of natural products showing broad distribution in bacteria, fungi, plants and animals (Pomilio et al., 2006; Tan and Zhou, 2006; Craik et al., 2007; Cascales and Craik, 2010; Morita and Takeya, 2010). Many CPs are of pharmaceutical interest. In their simplest form, CPs are cyclic polyamides formed from a small number of proteinogenic amino acids (e.g. see Figure 1). More complex CPs may be polycyclic and may include non-proteinogenic amino acids and other variations in structure.

Figure 1.

 Segetalin A, a homomonocyclopeptide containing six proteinogenic amino acids which accumulates in the seeds of Saponaria vaccaria.

Cyclic peptides occur in 26 families of flowering plants and are particularly common in the Caryophyllaceae, Rhamnaceae and Violaceae (Picur et al., 2006; Tan and Zhou, 2006; Cascales and Craik, 2010; Morita and Takeya, 2010). Plant CPs include the so-called homocyclopeptides, having rings formed from peptide bonds. Two well-studied groups in this category are the cyclotides and the Caryophyllaceae-like CPs. Cyclotides, which appear to serve an anti-insect role, occur mainly in the Violaceae (Craik et al., 2007). They typically comprise 28–37 proteinogenic amino acids and are polycyclic by virtue of disulfide bonds, which give rise to a knotted structure.

Caryophyllaceae-like CPs are homocyclopeptides, with a single ring formed with peptide bonds, of two or between five and 12 α-amino acids (usually L-isomers; Tan and Zhou, 2006; Morita and Takeya, 2010). As expected from the name, Caryophyllaceae-like CPs are particularly common in the Caryophyllaceae family, but they are also found in nine other families. In the 1990s, considerable efforts were made to characterize the CPs which occur in various members of the Caryophyllaceae (Ding et al., 1999; Tan and Zhou, 2006). One of the species investigated was Saponaria vaccaria (syn. Vaccaria segetalis, Vaccaria hispanica; cowcockle). Eight CPs, called segetalin A to segetalin H, were isolated and shown to comprise from five to nine proteinogenic amino acids. The sequences of these are shown in Figure 2(a). A number of these were shown to have estrogen-like and/or vasorelaxant activity (Morita et al., 2006; Tan and Zhou, 2006). These observations are in keeping with the use of S. vaccaria in traditional Chinese medicine as a treatment for amenorrhea, regulating blood flow, and increasing lactation (Itokawa et al., 1995). The biological role of segetalins in the plant is less clear, but may include antibiotic or anti-feedant activity.

Figure 2.

 Predicted amino acid sequences of cDNAs encoding putative cyclic peptide precursors (CPPs).
Manual alignment of predicted amino acid sequences of cDNAs encoding putative CPPs of (a) Saponaria vaccaria, (b) Dianthus caryophyllus, and (c) Citrus spp. Known mature CP sequences are shown in reverse type; predicted CP sequences are in italics. Presegetalin name, GenBank accession numbers, and/or Citrus species names are shown at the right.

There are basically two biosynthetic routes involved in generating the sequence of amino acids in CPs. In one case, ribosomes are involved in the initial ordering of mRNA-encoded amino acids to form a linear peptide precursor (Schmidt et al., 2005). Alternatively, the amino acids are ordered without direct ribosome involvement, by non-ribosomal peptide synthetases (NRPSs; Finking and Marahiel, 2004). While evidence for ribosome-dependent biosynthesis of cyclotides has been presented, relatively little is known about the biosynthesis of other classes of CPs in plants. Anderson and co-workers have found cDNAs from Oldenlandia affinis encoding cyclotide precursors (Gillon et al., 2008). When a corresponding precursor gene is expressed in tobacco, mature cyclotides are produced. An asparaginyl protease in tobacco has been implicated in the biosynthetic process (Saska et al., 2007). Thus, cyclotides are biosynthesized from ribosome-derived linear precursors which are apparently trimmed and cyclized with the aid of at least one protease-like enzyme. In terms of Caryophyllaceae-like CPs, Jia et al. (2006) have reported the cyclization of a linear version of heterophyllin B in an incompletely described extract from Pseudostellaria heterophylla (Caryophyllaceae).

Within the Caryophyllaceae, S. vaccaria provides a suitable system with which to study CP biosynthesis. As indicated above, considerable phytochemical work has elucidated a number of CPs present in S. vaccaria seeds. Also, the development of a relevant expressed sequence tag (EST) collection from S. vaccaria (Meesapyodsuk et al., 2007) provides a starting point for a molecular genetic investigation of the biosynthesis of Caryophyllaceae-like CPs. In this paper, we report the discovery of cDNAs encoding CP precursors (CPPs) and studies which confirm the role of the corresponding genes in CP biosynthesis.


Cyclic peptide precursor genes from S. vaccaria

In an effort to understand the biosynthesis of natural products in S. vaccaria, approximately 14 000 ESTs have been generated from cDNA libraries derived from roots and developing seed RNA (Meesapyodsuk et al., 2007). This led to the isolation of one of the first cDNAs encoding a triterpenoid glycosyltransferase, namely UGT74M1, which is involved in saponin biosynthesis. We investigated the S. vaccaria EST collection for information about CP biosynthesis. A search for ESTs which matched circular permutations of mature CP amino acid sequences revealed nucleotide sequences encoding short 30–40 amino acid peptides (see Figure 2a). The ESTs in this group are highly abundant in the developing seed collection, comprising 14% of the total sequences. The corresponding peptide sequences showed highly conserved N- and C-terminal domains which flank the mature CP sequences. These data suggest that CPs in S. vaccaria are biosynthesized ribosomally as linear precursors (presegetalins or CPPs) which are then processed to mature CPs. Thus, it would appear that segetalin A is formed from the presegetalin A1 peptide encoded by a presegetalin A1 gene.

In order to understand the relationships among the CPP cDNA sequences, they were collected and reassembled into contigs. Putative presegetalin cDNA sequences were first collected based on the presence of nucleotide sequences encoding mature CP sequences. Added to this collection was an additional group of sequences which showed a high degree of similarity to members of the above collection. The collection was clustered with parameters which favoured the clustering of sequences encoding the same mature CP sequences, but not sequences encoding other mature CP sequences (see Experimental Procedures). Due to the large number of sequences involved and the possibility of sequencing errors, contigs smaller than five ESTs were ignored in the sequence analysis. In general, more than one cluster was obtained for each segetalin. For example, for segetalin D, four clusters were found to have distinct cDNA sequences, which encode three distinct amino acid sequences, all of which include the same circular permutation of the mature segetalin D amino acid sequence. This gave rise to the nomenclature detailed in Table 1. Thus, Sgd3b is a gene corresponding to the second of two cDNAs with distinct nucleotide sequences which encodes the third (SgD3) of three putative segetalin D precursors. SgD3 is presumed to give rise to segetalin D.

Table 1. Saponaria vaccaria genes encoding segetalin precursors inferred from expressed sequence tag (EST) data. The contig size for the developing seed EST collection (Meesapyodsuk et al., 2007) is indicated
Columns: Segetalin; Segetalin precursor; Gene; Contig size (no. of ESTs); Representative cDNA clone from SVAR04NG library; GenBank accession number

Interestingly, the sequence analysis revealed cDNAs which (i) showed predicted amino acid sequence similarity to the putative precursors of known segetalins and (ii) appeared to encode the precursors of additional segetalins, which were named segetalin J, K, and L (see Table 1 and Figure 2a). Based on the sequence analysis, there appear to be at least 15 S. vaccaria genes (or alleles) encoding 12 (precursor) amino acid sequences, which include the sequences of six known (mature, cyclic) segetalins and three putative segetalins. The known segetalins represented are A, B, D, F, G, and H. This matches well with the segetalins which have been detected chemically in the Pink Beauty variety (A, B, D, F, G, H; J. J. Balsevich, unpublished data). In addition, there appear to be cDNAs in our S. vaccaria library which encode precursors of segetalins which have not previously been detected by chemical analysis. By comparison with the precursor sequences of the known segetalins, the unknown segetalins J, K, and L, were predicted to have the sequences FGTHGLPAP, GRVKA, and GLPGWP, respectively (Figure 2a and Table 1).

Based on sequence similarities, the putative precursor sequences can be divided into two classes which we call ‘A’ and ‘F’ (Figure 2a). The class A precursors are distinguished by an initial Gly in the mature CP sequence and a Phe immediately following the mature sequence. The only two class F precursors, presegetalin F and presegetalin J1, have Phe and Ile in the respective positions. The flanking sequences differ somewhat in length and sequence between the two classes.

Quantitative RT-PCR analysis indicates that genes encoding certain presegetalins, namely A, J, and H, are highly expressed in developing and mature seeds relative to other tissues tested (see Figure 3). This is consistent with the occurrence of segetalins A and H in S. vaccaria seed. The Sgl1 gene, whose putative CP product has not been reported, did not show high expression in developing seed.

Figure 3.

 Expression levels of five presegetalin genes.
Relative expression levels of five presegetalin genes [Sga1, black; Sgg1, dark grey; Sgh1, light grey; Sgj1, white; Sgl1, dotted (not visible)] were quantified in six different tissues by quantitative PCR relative to Actin1. The maximum expression in germinating seed, root, leaf and flower did not exceed 0.4% of the expression of Sga1 in developing seed. Standard deviations are represented by error bars (n = 3).

Segetalin A is derived from presegetalin A1 in vivo and in vitro

To test the hypothesis that S. vaccaria CPs are synthesized from ribosomally produced precursors, transformed root cultures of S. vaccaria (Schmidt et al., 2007) were generated which express presegetalin A1. The variety White Beauty was used, since it was found not to produce segetalin A naturally (J. J. Balsevich, unpublished data). Segetalin A production was assayed by liquid chromatography/mass spectrometry (LC/MS; single quadrupole or ion trap) using single ion monitoring. The mass spectrum of segetalin A is shown in Figure S1. Based on LC/MS (single quadrupole), three independent hairy root lines which were not engineered to express presegetalin A1 did not contain detectable amounts of segetalin A (see Figures 4 and 5). On the other hand, multiple independent hairy root lines expressing presegetalin A1 were found to contain segetalin A in the range of 0.1–5 μg g−1 fresh weight (Figures 4 and 5). Thus, S. vaccaria roots appear to have all of the biochemical requirements for processing CPPs to mature CPs.

Figure 4.

 Expression of presegetalin A1 in transformed roots of Saponaria vaccaria results in segetalin A formation.
Single ion chromatograms [single-quadrupole liquid chromatography/mass spectrometry (LC/MS), positive electrospray ionization (ESI+), m/z 610, M + 1] are shown for (a) segetalin A standard, (b) hairy root line JC003-6 (expressing presegetalin A1), (c) hairy root line pK7-OE-9 (control), and (d) a control hairy root line transformed with wild-type Agrobacterium rhizogenes LBA9402. Chromatograms were normalized on a fresh weight basis.

Figure 5.

 Production of segetalin A in transformed Saponaria vaccaria White Beauty root cultures.
Production of segetalin A in transformed S. vaccaria White Beauty root cultures generated using Agrobacterium rhizogenes harboring pJC003 (for presegetalin A1 expression) or pK7WG2D (empty vector, denoted by pK7-OE). Plasmid and root culture line numbers are indicated. Segetalin A was determined by single-quadrupole liquid chromatography/mass spectrometry (LC/MS) using an external standard. Means and standard deviations are indicated (n = 3).

In order to investigate the biochemical machinery involved in CPP processing, extracts of developing S. vaccaria seed were examined. Synthetic presegetalin A1 was incubated with extracts of developing seed and the assay samples were analyzed by LC/MS (ion trap) using single ion monitoring of m/z = 851.5 for presegetalin A1 and m/z = 610.3 for segetalin A (Figure S2). As indicated in Figures S2 and 6, presegetalin A1 was converted to segetalin A in a time- and presegetalin A1-dependent manner. While a substantial fraction of presegetalin A1 was consumed in the reaction, the extent of conversion to segetalin A was quite low. No segetalin A was formed in the absence of presegetalin A1 (Figure S2c). Thus, for presegetalin A1, the biochemical machinery for processing to the mature cyclic segetalin is present in developing seeds of S. vaccaria.

Figure 6.

In vitro conversion of presegetalin A1 to segetalin A.
A time course is shown for the production of segetalin A in crude developing seed extract assays containing 15 mM 2-amino-2-(hydroxymethyl)-1,3-propanediol (TRIS; pH 8), 100 mM NaCl, 2 mg ml−1 BSA, 2 mM DTT, 4.0 μg total protein with 2.5 μg presegetalin A1 in a total volume of 100 μl at 30°C. Samples were analyzed by liquid chromatography/mass spectrometry (LC/MS) (ion trap) using external standardization curves for segetalin A and presegetalin A1. Means and standard deviations are indicated (n = 3).

Detection of segetalins J and K in S. vaccaria seeds

In order to test the prediction that segetalins J, K, and L are produced in S. vaccaria, the CP content of Pink Beauty seeds was investigated. After extraction with 70% methanol and column chromatography, fractions were analyzed by LC/MS (single quadrupole). Analysis of fractions collected with 50–60% methanol was consistent with a mixture of flavonoids, monodesmosidic saponins and CPs. Two compounds were identified as the known segetalins F and G (MW 954 and 518) based on their mass spectra. In addition, a compound was identified with a mass spectrum consistent with a CP structure, a relative molecular mass of 511 and a sequence of GRVKA (segetalin K; Figure 7a).

Figure 7.

 Detection of segetalin J- and segetalin K-like compounds in extracts of Saponaria vaccaria seeds.
Mass spectra [positive electrospray ionization (ESI+), single quadrupole] of compounds found in the 65–70% methanol (a) and 50–60% methanol (b) fractions from column chromatographic purification of seed extracts are shown. These spectra are consistent with the cyclic peptides indicated.

Later fractions collected with 65–70% methanol contained a mixture of mono- and bisdesmosidic saponins as well as CPs. The known segetalins A and B (MW 609 and 484) were identified, as was another compound with a mass spectrum consistent with a CP structure. In this case, the MS data were consistent with a relative molecular mass of 877 and a sequence of FGTHGLPAP, that of segetalin J (Figure 7b). The above chemical analysis strongly supports the presence of segetalins J and K in S. vaccaria seed and the predictions represented in Figure 2, based on sequence analysis.
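The relative molecular masses quoted for segetalins J and K can be checked directly from the predicted sequences. The following is a minimal sketch, assuming head-to-tail cyclization (so the mass is simply the sum of the residue masses, with no terminal water) and standard monoisotopic residue masses; the function names are illustrative and are not part of the original analysis.

```python
# Monoisotopic residue masses (Da) of the 20 proteinogenic amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728  # mass added on single protonation (ESI+)


def cyclic_peptide_mass(sequence):
    """Monoisotopic mass of a head-to-tail cyclic peptide.

    A cyclic peptide has no free termini, so no water is added to the
    sum of the residue masses.
    """
    return sum(RESIDUE_MASS[aa] for aa in sequence)


def protonated_mz(sequence):
    """Expected m/z of the singly protonated ion, [M + H]+."""
    return cyclic_peptide_mass(sequence) + PROTON


if __name__ == "__main__":
    for name, seq in (("segetalin J", "FGTHGLPAP"), ("segetalin K", "GRVKA")):
        print(f"{name}: M = {cyclic_peptide_mass(seq):.2f}, "
              f"[M+H]+ = {protonated_mz(seq):.2f}")
```

Under these assumptions the calculated nominal masses agree with the values of 877 (segetalin J) and 511 (segetalin K) given above.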

A general role for ribosome-dependent CPP production in the biosynthesis of Caryophyllaceae-like CPs

Given the evidence for CP biosynthesis from ribosome-generated precursors, it was of interest to investigate related taxa. With this in mind, sequence similarity searches of the GenBank EST database were performed with the presegetalin A1 amino acid sequence. This revealed cDNA sequences of Dianthus caryophyllus (carnation) which showed a high degree of similarity (see Figure 2b). Based on sequence similarity and the information presented herein for S. vaccaria, it seems very reasonable to interpret the D. caryophyllus sequences as representing genes encoding CPPs which give rise to mature CPs with sequences GPIPFYG and GYKDCC. Thus, we would predict the presence of these CPs in D. caryophyllus. Incidentally, the authors are aware of only one report of CPs in D. caryophyllus, which provides evidence for the presence of caryophllusin [sic] A [cyclo-(GPYFT)] and delavayin B [cyclo-(GSIFFA)] (Morita et al., 1997; Li et al., 2008).

Expressed sequence tags from the genus Citrus, in the family Rutaceae, also provide evidence for genes encoding CPPs (see Figure 2c). A search of translated ESTs from GenBank for exact matches to the known CPs of Citrus aurantium [cyclo-(GLLPPFPG) and cyclo-(GLVLPS)] (Matsumoto et al., 2002) and Citrus natsudaidai [cyclo-(GYLLPPS)] (Morita et al., 2007), revealed sequences encoding putative CPPs. As with the putative CPPs from the Caryophyllaceae, the flanking regions of the Citrus sequences showed a very high degree of similarity. Further searches revealed gene products from Citrus sinensis which were previously reported as DNA-binding proteins for which expression was stress-induced (Mozoruk et al., 2008; Figure 2c, last four sequences). It is notable that all of the Citrus CPP sequences have Gly as the first amino acid of the mature CP and Ser immediately following the mature sequence. These data are consistent with the ribosome-dependent biosynthesis of CPs via CPPs in the genus Citrus.


Previous phytochemical work has indicated that the so-called Caryophyllaceae-like CPs occur not only in the Caryophyllaceae but also in other families, for example the Rutaceae and Linaceae. These CPs are homomonocyclopeptides generally derived from proteinogenic amino acids. To date, very little is known about their biosynthesis. Certainly, the amino acid composition is consistent with the involvement of translation on ribosomes. Other systems provide some insight into how the process of CP formation might occur.

Regarding the biosynthesis of CPs in plants, the cyclotides of the Violaceae, Rubiaceae, and Cucurbitaceae are the best understood. For certain cyclotides, cDNA sequences encoding precursors have been isolated. Evidence, particularly from transgenic tobacco studies, indicates that the corresponding mRNAs are translated on ribosomes (Saska et al., 2007; Gillon et al., 2008). The mature cyclotide domain is excised from the precursor and cyclized. This appears to involve an unidentified protease which cleaves at the N-terminus of the mature domain and an asparaginyl protease which both removes the C-terminus of the precursor and results in cyclization to the mature cyclotide.

A well-characterized example from cyanobacteria shows many parallels to cyclotide biosynthesis. Patellamides from the cyanobacterial genus Prochloron are CPs which are derived from ribosomally synthesized precursors (Jones et al., 2009; Lee et al., 2009). The protease PatA is involved in the cleavage of the precursor at the N-terminus of the amino acid sequence destined for the mature peptide. PatG is a protease-like enzyme which catalyzes a transamidation to give a cyclic product.

In the fungal genus Amanita, the bicyclic peptide Amanita toxins have recently been shown to be biosynthesized ribosomally (Walton et al., 2010). The resulting precursors are cleaved at two proline residues by a prolyl oligopeptidase to give linear intermediates. The mechanism of cyclization by peptide bond formation is unclear; however, it may be related to the formation of a Trp–Cys (tryptathionine) linkage.

The above examples provide precedents for the biosynthesis of CPs from ribosome-derived precursors through the action of protease-like enzymes. The data from S. vaccaria are certainly consistent with this scenario and with many of the features of the plant cyclotide, bacterial, and fungal systems. There is a high degree of conservation in the flanking regions of the presegetalin sequences. This may be important for recognition by the processing enzymes. Further work is required to characterize the enzyme activity present in developing seed extracts and to determine the number and nature of the enzymes involved in presegetalin processing. Given the similarities to the D. caryophyllus and Citrus systems, it is likely that this will give further general insight into the biosynthesis of Caryophyllaceae-like CPs. Furthermore, the ability to predict CP sequences from nucleic acid sequence data can serve as an aid to the detection of novel CPs in various plant families.

Experimental Procedures


Presegetalin A1 was chemically synthesized at the Sheldon Biotechnology Centre, McGill University, Montreal, Canada (MW 3400.30 and purity ≥75%). Segetalin A was isolated from S. vaccaria seed by the method of Morita et al. (1994).

Plant material

Saponaria vaccaria ‘Pink Beauty’ and ‘White Beauty’ seeds were obtained from CN Seeds Ltd. Plants were grown under a daily regime of 16 h light (150 μE m−2 sec−1) at 24°C and 8 h dark at 20°C. Stage 2 developing seeds were harvested according to the following scheme: Stage 1, seed white, pod green; Stage 2, seed tan; Stage 3, seed copper, pod partially desiccated; Stage 4, seed dark brown, pod desiccated.

Saponaria vaccaria EST analysis

An S. vaccaria Pink Beauty developing seed EST collection developed previously (Meesapyodsuk et al., 2007) was investigated for sequences relating to segetalin biosynthesis. Initially, six reading-frame translations of the S. vaccaria EST database were searched for exact matches to all circular permutations of segetalin amino acid sequences. The presence of numerous cDNA sequences appearing to encode different segetalin precursors, but showing a high degree of similarity, required reclustering using special parameters. Each set of ESTs containing sequences that corresponded to a single circular permutation of a given segetalin amino acid sequence was first collected and then separately clustered with CAP3 (Huang and Madan, 1999) using a minimum percentage identity (p) of 97 and an overlap cutoff (o) of 50. To check the EST database for precursors of previously unknown segetalins, a TBLASTN search was conducted using the consensus amino acid sequence for the precursor of presegetalin A1.
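The initial search step (exact matches to all circular permutations of a segetalin sequence in six reading-frame translations) can be illustrated as follows. This is a minimal sketch only, assuming the standard genetic code; the function names and the toy nucleotide sequence are hypothetical and do not come from the EST collection or from the software actually used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Yield the three forward and three reverse-strand translations."""
    dna = dna.upper()
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            yield translate(strand[offset:])


def circular_permutations(peptide):
    """All rotations of a cyclic peptide sequence written linearly."""
    return {peptide[i:] + peptide[:i] for i in range(len(peptide))}


def encodes_cyclic_peptide(dna, peptide):
    """True if any reading frame contains any rotation of the peptide."""
    rotations = circular_permutations(peptide)
    return any(rot in frame
               for frame in six_frame_translations(dna)
               for rot in rotations)


if __name__ == "__main__":
    toy_est = "AAAGCTGGTCGTGTT"  # hypothetical sequence encoding KAGRV
    print(encodes_cyclic_peptide(toy_est, "GRVKA"))
```

Searching all rotations is necessary because the residue at which a cyclic peptide is conventionally written need not coincide with the first residue of the mature sequence within the linear precursor.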

Presegetalin gene expression in S. vaccaria

Total S. vaccaria Pink Beauty RNA was isolated from germinating seed, roots, leaves and flowers using the RNeasy Plant Mini kit (Qiagen). For mature and developing seeds, RNA was isolated by the method of Wang and Vodkin (1994). First-strand cDNA synthesis was performed using Superscript II reverse transcriptase (Invitrogen), with 2 μg of total RNA as the template. Quantitative RT-PCR was performed using an Applied Biosystems Step One real-time PCR system with Power SYBR® Green PCR Master Mix. The thermal cycling conditions were as follows: 50°C for 2 min, 95°C for 4 min, followed by 40 cycles of 95°C for 30 sec, 57°C for the indicated annealing time (see Table S1), and 72°C for 30 sec.

Expression of presegetalin genes was compared with an S. vaccaria Actin1 gene. The clone SVAR04NG_064_H08 was found to contain an insert with a nucleotide sequence representing a partial open reading frame (ORF) which was highly similar to other plant actins including an actin 7-like protein from Pelargonium × hortorum (96% identity over 171 amino acids) and an actin from Suaeda maritima (98% over 140 amino acids). The corresponding gene was named Actin1. Table S1 indicates the oligonucleotide primers, PCR product sizes and annealing times used for the reference gene Actin1 and the presegetalin genes.
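The text does not state how expression relative to Actin1 was calculated. One widely used approach is the 2^(−ΔCt) method, which assumes close to 100% amplification efficiency for both the target and the reference gene. The sketch below illustrates that calculation only; the function names and the threshold cycle (Ct) values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene.

    Uses 2^(-dCt) with dCt = Ct(target) - Ct(reference), which assumes
    that both amplicons double in every cycle.
    """
    return 2.0 ** (ct_reference - ct_target)


def summarize(ct_pairs):
    """Mean and sample standard deviation over replicate (target, reference) Ct pairs."""
    values = [relative_expression(t, r) for t, r in ct_pairs]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Hypothetical triplicate Ct values (target gene, Actin1), n = 3.
    replicates = [(18.1, 20.0), (18.3, 20.1), (18.0, 19.9)]
    avg, sd = summarize(replicates)
    print(f"relative expression = {avg:.2f} +/- {sd:.2f}")
```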

Preparation of pJC003, the Sga1 plant expression plasmid

The Sga1 ORF was amplified from the clone SVAR04NG_004_E02 from a previously prepared S. vaccaria Pink Beauty developing seed cDNA library (Meesapyodsuk et al., 2007) using Vent DNA polymerase (New England Biolabs) and the primers JC1 (5′-CACCATGTCTCCAATCCTC-3′) and JC2 (5′-TTACACAGGGGCTGAAGC-3′). The 103-bp PCR product was gel-purified using QIAEXII (Qiagen) and cloned into the Gateway entry vector pENTR/D-TOPO (Invitrogen). The DNA sequence was verified using the BigDye terminator cycle sequencing kit (Applied Biosystems) with an ABI3730 DNA sequencer. LR Clonase II (Invitrogen) was used to transfer the insert into the binary plant transformation vector pK7WG2D (Karimi et al., 2002). After DNA sequence verification, the resultant plasmid, pJC003, was used to transform electrocompetent cells of Agrobacterium rhizogenes LBA9402. Agrobacterium rhizogenes LBA9402 was also transformed with pK7WG2D alone. Polymerase chain reaction was used to confirm transformation (see below).

Transformed S. vaccaria roots

Sterile leaf explants of S. vaccaria White Beauty (which does not contain segetalin A; J. J. Balsevich, unpublished data) were transformed separately with either pJC003 or pK7WG2D and hairy roots were regenerated as described previously (Schmidt et al., 2007). Rapidly growing lines that showed kanamycin resistance and green fluorescence with no bacterial contamination were used to establish single hairy root lines. All transformed root lines originated from independent green fluorescent adventitious roots.

Hairy root DNA extraction and PCR analysis

DNA was extracted from a 100–200 mg sample of each root culture using the DNeasy Plant Mini kit (Qiagen) and subjected to multiplex PCR analysis to simultaneously score for the presence or absence of the rolC, virD, egfp, and nptII genes as described previously (Schmidt et al., 2007). To confirm that kanamycin-resistant and egfp-positive hairy roots were transformed, the presence of the Sga1 gene was verified by PCR. The PCR reaction mixture (25 μl) contained 1 μl of DNA, as prepared above, in 1× PCR reaction buffer, 2.5 mM MgCl2, 0.2 mM of each dNTP, 0.4 μM of each primer [JC3 5′-CCGACAGTGGTCCCAAAGATG-3′ (vector-specific) and JC4 5′-GCCTGAAAAGCCCAAACTGG-3′ (gene-specific)] and 5 U Taq DNA polymerase (Invitrogen). Amplification was performed in a Stratagene Robocycler Gradient 96 using the following program: 94°C for 10 min, 30 cycles of 94°C for 30 sec, 62°C for 40 sec, and 72°C for 50 sec, followed by 72°C for 10 min. The expected size of the PCR fragment was 398 bp.

Single quadrupole LC/MS analysis of hairy roots

For each transformed root line, 1.2–2.2 g fresh weight of hairy roots was added to 5 ml methanol in a 10 ml glass screw-top tube and homogenized using a Polytron (Kinematica). The sample was sonicated for 20 min using a Branson 2510 ultrasonic cleaner (Branson Ultrasonic Corporation), centrifuged at 1400 g for 3 min and the supernatant was transferred to a new tube. An additional 5 ml of methanol was added to the pellet and sonicated, centrifuged, and decanted, as above. This step was repeated once. A tube containing the combined supernatants was placed in a heating block at 30–35°C and the methanol was evaporated under a nitrogen stream. The sample was resuspended in 1 ml distilled H2O, transferred to a 1.5 ml tube, and centrifuged at 12 000 g for 5 min. The supernatant was then placed in a Costar SPIN-X® (0.22 μm cellulose acetate) centrifuge filter unit (Corning) and centrifuged at 12 000 g for 1 min. The filtrate was then used for analysis by LC/MS (single quadrupole) using a 2695 Alliance chromatography system with inline degasser, coupled to a ZQ mass detector and a 2996 photodiode array detector (Waters). MassLynx software was used for data acquisition and analysis. The column used was a Waters Sunfire 3.5-μm RP C-18 150 × 2.1 mm. The flow rate was 0.15 ml min−1. The column was maintained at 35°C during analysis. The binary solvent system consisted of 90:10 v/v water/acetonitrile containing 0.12% acetic acid (solvent A) and acetonitrile containing 0.12% acetic acid (solvent B). The gradient program used was 0–8 min, 95:5 A/B; 8–31 min, 95:5 to 50:50 A/B; 31–33 min, 50:50 to 0:100 A/B; 33–48 min, 0:100 A/B. Voltage parameters for positive electrospray ionization (ESI+) were: capillary, 3.50 kV; cone, ramped from +15 to +45 V; extractor, 6.00 V; RF lens, 0.9 V.
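The gradient program described above is piecewise linear, and the solvent composition at any time point can be obtained by interpolation. The following is a minimal sketch assuming linear ramps between the stated breakpoints; the names are illustrative. Because solvent A itself contains 10% acetonitrile, the percentage of solvent B is not the same as the overall acetonitrile content, so both are calculated.

```python
# (time in min, % solvent B) breakpoints taken from the gradient program.
GRADIENT = [(0.0, 5.0), (8.0, 5.0), (31.0, 50.0), (33.0, 100.0), (48.0, 100.0)]


def percent_b(t_min):
    """Percentage of solvent B at time t_min, by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-48 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_acetonitrile(t_min):
    """Overall % acetonitrile (v/v), ignoring volume changes on mixing.

    Solvent A is 90:10 water/acetonitrile; solvent B is acetonitrile.
    """
    b = percent_b(t_min)
    return 0.10 * (100.0 - b) + b


if __name__ == "__main__":
    for t in (0, 8, 19.5, 31, 32, 40):
        print(f"{t:>5} min: {percent_b(t):5.1f}% B, "
              f"{percent_acetonitrile(t):5.1f}% acetonitrile")
```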

In vitro processing of presegetalin A1

‘Stage 2’ developing seed from S. vaccaria (var. White Beauty) was homogenized manually with a plastic pestle in a 1.5 ml low protein binding microtube. One gram of seeds was ground for 2 min in 4 × 250 μl 20 mM 2-amino-2-(hydroxymethyl)-1,3-propanediol (TRIS) buffer (pH 8) on ice followed by centrifugation at 13 000 g for 5 min. The supernatant was removed and another 250 μl of buffer was added and the grinding and centrifugation were repeated. The supernatant fractions were pooled, and this crude extract supernatant was used for enzyme assays. The crude extract protein was measured using Bradford reagent (Bio-Rad) with BSA as a calibration standard. The in vitro assay contained 15 mM TRIS (pH 8), 100 mM NaCl, 2 mM DTT, 0.2 mg BSA, and 25 μg ml−1 presegetalin A1 and was initiated by the addition of crude extract supernatant, equivalent to 4.0 μg protein, in a total reaction volume of 100 μl. The assay was incubated at 30°C for up to 5 h and stopped by placing reactions in dry ice. The assays were lyophilized, re-suspended in methanol, evaporated and re-suspended in 50:50 v/v methanol/water for LC/MS analysis. Ion trap ESI+ LC/MS/MS analysis was used to detect production of segetalin A using an Agilent 6320 Ion Trap LC/MS system under default Smart Parameter settings. The analyzer and ion optics were adjusted to achieve proper resolution (Agilent Installation Guide #G2440-90105) using the ESI Tuning Mix (Agilent #G2431A). The mass spectrometer scanned from 50 to 2200 mass units at 8100 mass units sec−1 with an expected peak width of ≤0.35 atomic mass units. For auto MS/MS, the trap isolation width was 4 atomic mass units. The associated Agilent 1200 LC was fitted with a Zorbax 300 EXTEND-C18 column (150 × 2.1 mm, 3.5 μm particle size) maintained at 35°C. The binary solvent system consisted of 90:10 v/v water/acetonitrile containing 0.1% formic acid and 0.1% ammonium formate (solvent A) and 10:90 v/v water/acetonitrile containing 0.1% formic acid and 0.1% ammonium formate (solvent B). The separation gradient was 90:10 A/B to 50:50 A/B in 3 ml over 20 min.
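For interpreting the extent of conversion in the in vitro assay, it can help to express the substrate on a molar basis. The sketch below uses only values given in the text (2.5 μg presegetalin A1 of MW 3400.30 in 100 μl, and MW 609 for segetalin A) and assumes one segetalin A molecule per precursor molecule; the function names and the observed value in the usage example are hypothetical.

```python
PRESEGETALIN_A1_MW = 3400.30  # g/mol, as stated for the synthetic peptide
SEGETALIN_A_MW = 609.0        # g/mol, as stated for segetalin A
SUBSTRATE_UG = 2.5            # 25 ug/ml in a 100 ul assay
ASSAY_VOLUME_UL = 100.0


def nmol_from_ug(mass_ug, mw):
    """Convert a mass in micrograms to nanomoles."""
    return mass_ug / mw * 1000.0


def micromolar(nmol, volume_ul):
    """Concentration in uM for an amount in nmol and a volume in ul."""
    return nmol / volume_ul * 1000.0


def max_product_ng(substrate_nmol, product_mw):
    """Theoretical product yield in ng, assuming 1:1 stoichiometry."""
    return substrate_nmol * product_mw


def percent_conversion(observed_ng, theoretical_ng):
    """Observed product as a percentage of the theoretical maximum."""
    return observed_ng / theoretical_ng * 100.0


if __name__ == "__main__":
    substrate = nmol_from_ug(SUBSTRATE_UG, PRESEGETALIN_A1_MW)
    ceiling = max_product_ng(substrate, SEGETALIN_A_MW)
    print(f"substrate: {substrate:.3f} nmol "
          f"({micromolar(substrate, ASSAY_VOLUME_UL):.2f} uM)")
    print(f"maximum segetalin A: {ceiling:.0f} ng")
    # Hypothetical measurement, for illustration only.
    print(f"conversion at 10 ng observed: {percent_conversion(10.0, ceiling):.1f}%")
```

On this basis the assay contains roughly 0.74 nmol (about 7.4 μM) of precursor, corresponding to a theoretical ceiling of about 0.45 μg segetalin A per assay.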

Analysis of CPs in S. vaccaria seed

S. vaccaria seed var. Pink Beauty (100 g) was finely ground and extracted with 70% methanol (3 × 300 ml). The combined methanolic extract was concentrated in vacuo to approximately 200 ml and diluted with an equal volume of water. The aqueous solution was applied to an open column (approximately 500 ml) of Amberchrom CG300M in 0.01% acetic acid (aq.). The column was eluted with a methanol-0.01% acetic acid (aq.) gradient from 0 to 100% methanol. Fractions were analyzed by LC/MS (single quadrupole) in ESI+ mode as described above.


We are grateful to Greg Bishop, Carla Barber, Dustin Cram, and the PBI Bioinformatics and DNA Technology Units for technical support, to Sean Hemmingsen and Jon Page for reviewing the manuscript, and to the National Research Council of Canada’s GHI2, CEHH, PHW, and PPHS programs for funding.

Accession Numbers:

JF297947-JF297960, JF303653.