Genome-wide analysis of a land plant-specific acyl:coenzyme A synthetase (ACS) gene family in Arabidopsis, poplar, rice and Physcomitrella


Author for correspondence:
Carl J. Douglas
Tel: +1 604 822 2618
Fax: +1 604 822 6089


  • The plant enzyme 4-coumarate:coenzyme A ligase (4CL) is part of a family of adenylate-forming enzymes present in all organisms. Analysis of genome sequences shows the presence of ‘4CL-like’ enzymes in plants and other organisms, but their evolutionary relationships and functions remain largely unknown.
  • 4CL and 4CL-like genes were identified by blast searches in Arabidopsis, Populus, rice, Physcomitrella, Chlamydomonas and microbial genomes. Evolutionary relationships were inferred by phylogenetic analysis of aligned amino acid sequences. Expression patterns of a conserved set of Arabidopsis and poplar 4CL-like acyl-CoA synthetase (ACS) genes were assayed.
  • The conserved ACS genes form a land plant-specific class. Angiosperm ACS genes grouped into five clades, each of which contained representatives in three fully sequenced genomes. Expression analysis revealed conserved developmental and stress-induced expression patterns of Arabidopsis and poplar genes in some clades.
  • Evolution of plant ACS enzymes occurred early in land plants. Differential gene expansion of angiosperm ACS clades has occurred in some lineages. Evolutionary and gene expression data, combined with in vitro and limited in vivo protein function data, suggest that angiosperm ACS enzymes play conserved roles in octadecanoid and fatty acid metabolism, and play roles in organ development, for example in anthers.


The enzyme 4-coumarate:CoA ligase (4CL) plays important roles in phenylpropanoid metabolism by generating CoA esters of hydroxycinnamic acids. These cinnamyl CoA esters are used as intermediates in the biosynthesis of a large array of phenolic secondary natural products, including monolignols and flavonoids (Hahlbrock & Scheel, 1989). The first 4CL gene was cloned from parsley (Douglas et al., 1987), and 4CL enzymes are encoded by multigene families in all vascular plants examined to date (Lee & Douglas, 1996; Allina et al., 1998; Hu et al., 1998; Ehlting et al., 1999; Lindermayr et al., 2002; Kumar & Ellis, 2003; Hamberger & Hahlbrock, 2004; Tuskan et al., 2006; Hamberger et al., 2007). Analysis of enzymatic properties of recombinant enzymes has revealed that 4CL isoenzymes have differential activity towards different hydroxycinnamyl substrates (Lee & Douglas, 1996; Allina et al., 1998; Hu et al., 1998; Ehlting et al., 1999; Stuible & Kombrink, 2001; Lindermayr et al., 2002; Hamberger & Hahlbrock, 2004). The analysis of the 4CL gene family in the fully sequenced Arabidopsis (Ehlting et al., 1999; Hamberger & Hahlbrock, 2004), poplar (Tsai et al., 2006; Tuskan et al., 2006; Hamberger et al., 2007) and rice (Tuskan et al., 2006; Hamberger et al., 2007) genomes showed that 4CL is encoded by four to five genes in these plants. Differential 4CL gene expression patterns in Arabidopsis and poplar, coupled with 4CL isoenzyme substrate utilization preferences, suggest that 4CL genes and enzymes have undergone subfunctionalization for the biosynthesis of different classes of phenylpropanoid-derived compounds (Ehlting et al., 1999; Harding et al., 2002), with the phylogenetically distinct Class I and Class II clades of 4CLs specialized for monolignols and flavonoid biosynthesis, respectively.

The 4CL enzymes are members of the adenylate-forming enzyme superfamily, whose members share a common reaction mechanism involving formation of an adenylate intermediate (Becker-Andre et al., 1991; Schneider et al., 2005) and include enzymes involved in fatty acid chain elongation (Shockey et al., 2003). Following the generation of genome sequence data from Arabidopsis, a number of genes encoding adenylate-forming enzymes were annotated as being closely related to true 4CLs but of unknown specific biochemical function, and these may function in diverse pathways in plant metabolism and natural product biosynthesis. For example, initial Arabidopsis genome sequence data revealed the presence of four genes with distant similarity to 4CL genes (Cukovic et al., 2001). In a subsequent analysis, eight members of a larger set of Arabidopsis 4CL-like adenylate-forming enzymes annotated in the completed Arabidopsis genome were classified as 4CL-like genes because of their close phylogenetic relationship to bona fide 4CLs (Raes et al., 2003), with which they share structural similarities such as a conserved substrate binding domain and conserved Box I and II domains (Ehlting et al., 2001; Schneider et al., 2003). As summarized by Ehlting et al. (2005), a total of nine genes encoding enzymes most closely related to genes encoding 4CL have been annotated as 4CL or 4CL-like by different groups (Costa et al., 2003; Raes et al., 2003; Shockey et al., 2003). Interestingly, in contrast to 4CL proteins, most 4CL-like enzymes are predicted to be targeted to the peroxisome because of the presence of C-terminal peroxisome targeting sequence 1 (PTS1) sequences, and such localization has been experimentally verified in three cases (Schneider et al., 2005; Koo et al., 2006). According to the analysis of Reumann et al. (2004), nine major PTS1 tripeptides ([SA][RK][LM]> without AKM> plus SRI> and PRL>) located at the C-terminus of plant proteins are strong indicators of peroxisomal localization, with SRL, SRM, SKL, ARL and PRL being among the most common sequences, while other less frequently observed sequences such as SNL are also functional.
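
The nine major PTS1 tripeptides summarized above can be written as a simple pattern match on the last three residues of a protein. The following is a minimal sketch for illustration only: the function names and the example peptide strings are hypothetical, and the pattern covers only the nine major tripeptides, so less frequent but functional signals such as SNL are deliberately not detected.

```python
import re
from itertools import product

# Major PTS1 tripeptides as summarized in the text:
# [SA][RK][LM]> without AKM>, plus SRI> and PRL> ('>' marks the C-terminus).
_MAJOR_PTS1 = re.compile(r"(?:[SA][RK][LM]|SRI|PRL)$")


def has_major_pts1(protein_seq):
    """Return True if the sequence ends in one of the nine major PTS1 tripeptides."""
    # Normalize case and drop a trailing stop-codon symbol if present.
    seq = protein_seq.strip().upper().rstrip("*")
    if seq.endswith("AKM"):  # explicitly excluded from the [SA][RK][LM] set
        return False
    return _MAJOR_PTS1.search(seq) is not None


def major_pts1_tripeptides():
    """Enumerate every tripeptide accepted by has_major_pts1 (sanity check)."""
    amino_acids = "ACDEFGHIKLMNPQRSTVWY"
    candidates = ("".join(t) for t in product(amino_acids, repeat=3))
    return sorted(t for t in candidates if has_major_pts1(t))


if __name__ == "__main__":
    # Hypothetical C-terminal fragments, not sequences from this study.
    for fragment in ("MEKSGYSKL", "MEKSGYAKM", "MEKSGYSNL"):
        print(fragment, has_major_pts1(fragment))
    print(major_pts1_tripeptides())
```

A match of this kind is only a prediction; as noted in the text, peroxisomal localization still has to be verified experimentally.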

Unlike true 4CL genes, expression of 4CL-like genes is not associated with lignification or flavonoid biosynthesis (Raes et al., 2003; Ehlting et al., 2005), suggesting that they may not encode enzymes with 4CL activity and may lack activity towards known 4CL hydroxycinnamate substrates (p-coumaric, caffeic, ferulic, 5-hydroxyferulic and sinapic acids). Indeed, in vitro testing of seven 4CL-like recombinant proteins (At5g63380, At4g05160, At4g19010, At3g48990, At1g20510, At1g62940 and At5g38120) with phenylpropanoid pathway intermediates did not result in any measurable catalytic activity (Costa et al., 2005). Independently, the enzymes encoded by several 4CL-like genes were shown to be acyl-coenzyme A synthetases that accept medium to long-chain fatty acids and in some cases the cyclopentenone 12-oxo-phytodienoic acid (OPDA) and/or OPDA derivatives as in vitro substrates (Schneider et al., 2005; Kienow et al., 2008). OPDA is an intermediate in the octadecanoid pathway leading to jasmonic acid (JA) biosynthesis. The later steps of this pathway, formation of OPDA-CoA thioesters followed by beta-oxidation of OPDA-CoA resulting in acyl chain shortening, occur in the peroxisome (Li et al., 2005; Koo et al., 2006). Recently, the Arabidopsis 4CL-like gene At1g20510 has been shown to encode a peroxisomal OPDA:CoA ligase involved in JA biosynthesis, and is now designated OPCL1 (Koo et al., 2006; Kienow et al., 2008). However, the biological functions of the remainder of the Arabidopsis genes originally annotated as 4CL-like are still unknown. Based on the activities of several 4CL-like enzymes as acyl-CoA synthetases, and their lack of activity against hydroxycinnamate substrates, we use the name acyl-CoA synthetase (ACS) for the remainder of the 4CL-like genes in Arabidopsis without known in vivo functions, and for homologues of these genes in other plants.

Completion and annotation of the rice (Yuan et al., 2005), poplar (Tuskan et al., 2006) and Physcomitrella genomes open the door to comparative genomics approaches aimed at understanding the evolution and potential functions of genes such as those in the 4CL-like/ACS family, using Arabidopsis as a reference genome (Arabidopsis Genome Initiative, 2000).

In this study, we used Arabidopsis 4CL sequences for blast searches of eukaryotic and prokaryotic genome databases to identify over 100 4CL-related genes in diverse taxa, and identified and annotated ACS and genes encoding related adenylate-forming enzymes in the fully sequenced poplar, rice, and Physcomitrella genomes. In addition, we retrieved full-length ACS sequences from a maize genome database and publicly available plant genome databases. Phylogenetic reconstructions based on amino acid sequence alignments showed that ACS genes, including the nine canonical Arabidopsis 4CL-like ACS genes, belong to a land plant-specific clade most closely related to true 4CLs, and that each fully sequenced angiosperm plant genome has representatives in each of the five well-defined ACS clades, four of which contain proteins predominantly predicted to be peroxisomally localized. This suggests that ACS enzymes perform important, conserved roles in land plant metabolism. We profiled the developmental and stress-induced expression of Arabidopsis ACS genes and selected poplar homologues, representing all five clades. The combination of phylogenetic reconstruction, gene expression, and in silico co-expression analyses provided insights into the evolutionary diversification of ACS function in angiosperm lineages, subfunctionalization of duplicated ACS genes, identification of putative poplar orthologues of Arabidopsis ACS genes, and potential functions of representative ACS genes.

Materials and Methods

Plant growth conditions

Arabidopsis thaliana (Arabidopsis) seedlings were transferred from MS agar plates to soil (Sunshine mix 5; Sungrow Horticulture, Saba Beach, Alberta, Canada) and cultivated at 20°C under long-day conditions (18 h light) until maturity. Poplar (Populus trichocarpa × Populus deltoides genotype H11-11) plants were grown indoors at the University of British Columbia Horticulture Greenhouse, Vancouver, Canada, as described in Ralph et al. (2006).

Gene expression analysis

Arabidopsis total RNA was isolated from the specified tissues frozen in liquid nitrogen, ground to a fine powder, and RNA extracted using the Trizol reagent (Gibco BRL, Grand Island, NY, USA) following manufacturer's instructions. The quality of the RNA samples was assessed by visual inspection of the rRNA bands on a 1% agarose gel. RNA samples were quantified spectrophotometrically and 2 µg DNAse I-treated RNA per 20 µl reaction was used to generate cDNA using Superscript II Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA) following the manufacturer's protocol. Poplar RNA was extracted from various poplar organs and cDNA prepared as described (Ralph et al., 2006).

For the quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) analysis of Arabidopsis gene expression, 10 ng of cDNA was incubated with 10 µl QuantiTect SYBR Green PCR mastermix (Qiagen Inc., Valencia, CA, USA) and 30 nmol each of a forward and a reverse primer in a total volume of 20 µl. After an initial denaturation step at 95°C for 15 min, 40 cycles at 95°C for 15 s, 55°C for 30 s and 68°C for 45 s were followed by a fluorescence reading. After a final incubation at 68°C for 5 min, a melting curve was generated ranging from 90°C to 60°C. Threshold cycles (CT) were adjusted manually, and the CT values for a housekeeping control amplified in parallel on each plate were subtracted from CT values obtained for each gene of interest, thus generating normalized CT values (ΔCT). The relative starting quantities of each gene were determined by setting as a base value the gene with the highest CT value within a tissue panel or treatment series, and relative quantities were calculated using the ΔΔCT method, as described in Hietala et al. (2003). The ΔCT values were calculated after normalization using the following control genes: Arabidopsis adenine phosphoribosyltransferase (APT1, At1g27450) for all Arabidopsis expression experiments and poplar eukaryotic translation initiation factor 5A-1/eIF-5A-1 (GenBank accession CV251327; Ralph et al., 2006) for all poplar experiments. The ΔΔCT value was calculated using the following reference tissues: the highest expressing tissue for the developmental expression panel (see Fig. 3); and unwounded leaf tissues in the stress treatment panels (see Figs 4 and 5). Intron-spanning primers used to amplify each gene are given in the Supporting Information, Table S1. Selected qRT-PCR reaction products were sequenced to ensure fidelity of the amplifications.
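
The normalization steps described above can be illustrated with a minimal sketch. It assumes the widely used 2^(−ΔΔCT) form, in which product doubles every cycle; that efficiency value, the function names and the CT numbers are illustrative assumptions and are not taken from this study or from Hietala et al. (2003).

```python
def delta_ct(ct_target, ct_reference):
    """Normalize a gene of interest to the housekeeping control (delta CT)."""
    return ct_target - ct_reference


def relative_quantity(dct_sample, dct_calibrator, efficiency=2.0):
    """Relative expression versus a calibrator tissue (delta-delta CT).

    efficiency=2.0 assumes perfect doubling per PCR cycle (an assumption).
    """
    ddct = dct_sample - dct_calibrator
    return efficiency ** (-ddct)


def percent_of_max(quantities):
    """Express each value relative to the highest one, set at 100%."""
    top = max(quantities)
    return [100.0 * q / top for q in quantities]


if __name__ == "__main__":
    # Hypothetical CT values: (gene of interest, housekeeping control).
    calibrator = delta_ct(26.0, 20.0)   # e.g. unwounded leaf
    treated = delta_ct(24.0, 20.0)      # e.g. wounded leaf
    fold_change = relative_quantity(treated, calibrator)
    print(fold_change)                  # two cycles earlier: 2 ** 2
    print(percent_of_max([1.0, fold_change]))
```

A lower ΔCT means more transcript, so a sample whose ΔCT is two cycles below the calibrator is reported as a fourfold increase under the doubling assumption.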

Figure 3.

Developmental expression profiles of ACS (acyl-CoA synthetase) genes in Arabidopsis and poplar. Expression levels were determined by quantitative reverse transcriptase polymerase chain reaction (qRT-PCR), and expression is represented relative to the tissue with the highest level of expression, set at 100%. Adjacent bars represent technical replicates using the same tissue samples. The level of expression of each gene relative to the Arabidopsis or poplar reference gene, set arbitrarily at 1.0, is given above the bars for the organs or tissue in which the gene was most highly expressed. Locations of ACS phylogenetic clades A–E for which expression data are shown are indicated by tinted circles. (a) Clade A genes; (b) clade B genes; (c) clade C genes; (d) clade D genes; (e) clade E genes. Abbreviations for Arabidopsis organs: 7 d, 7-d-old seedlings; F, flower; ML, mature leaf; MS, mature shoot; R, root; YL, young leaf; YS, young stem. Abbreviations for poplar organs and tissues: B, bark; FF, female flower; MF, male flower; ML, mature leaf; MR, mature root; P, petiole; Ph, phloem; X, xylem; YL, young leaf; YR, young root.

Figure 4.

Wound responsiveness of Arabidopsis and poplar peroxisomal ACS (acyl-CoA synthetase) genes in clades B–E. (a) Histochemical assay of glucuronidase (GUS) activity in transgenic Arabidopsis plants expressing ACS promoter–GUS gene fusions. Fully expanded rosette leaves were wounded 1 h before staining for GUS activity. (b) Quantitative reverse transcriptase polymerase chain reaction analysis of ACS gene expression in response to wounding in Arabidopsis and poplar. Data are expressed as fold-change in expression (y-axis) relative to unwounded control leaves. Adjacent bars represent technical replicates using the same tissue samples. Time (in hours) after wounding is given on the x-axis.

Figure 5.

Stress responsiveness of poplar ACS (acyl-CoA synthetase) genes in clades B–E. Quantitative reverse transcriptase polymerase chain reaction analyses were carried out on RNA isolated from poplar leaves after simulated herbivory (SH, mechanical wounding plus insect regurgitant), herbivory by the forest tent caterpillar (Malacosoma disstria; FTC), and exposure to methyl jasmonate (MJ) for 2 h (closed bars), 6 h (dark tinted bars), and 24 h (light tinted bars). Data are expressed as fold-change in expression relative to untreated control leaves or leaves sprayed with the solvent Tween20, in the case of MJ treatment. Adjacent bars of the same shading represent duplicate determinations using the same tissue samples. The reference gene used was that encoding eukaryotic translation initiation factor 5A (GenBank accession CV251327).

Recombinant DNA methods

Plasmid DNA was prepared using Qiagen spin Miniprep and Midiprep kits, following the manufacturer's instructions. DNA sequencing was performed by the University of British Columbia Nucleic Acid and Protein Service unit, using BigDye 3.0 and a Prism Sequencer (Applied Biosystems, Foster City, CA, USA).

Arabidopsis ACS promoter regions were amplified from genomic DNA using the pwo polymerase (Roche) and cloned into the NcoI site of the pCambia 1305.1 vector, preserving the start codon of the GUS (beta-glucuronidase) reporter gene (Jefferson et al., 1987). Primers containing the compatible restriction site sequence are given in the Supporting Information, Table S2.

The coding sequences of both AtOPCL1 and PoptrACS5 were polymerase chain reaction (PCR) amplified using cDNA derived from organs where these genes were most highly expressed using Phusion high fidelity enzyme (Finnzymes, Espoo, Finland) according to the manufacturer's protocol. The PCR primers are given in the Supporting Information, Table S3. The PCR products were cloned into a Gateway (Invitrogen) compatible entry vector using TOPO-TA cloning kit (Invitrogen) and subsequently recombined into the destination vector using LR Clonase II enzyme mix according to the manufacturer's instructions (Invitrogen). The PCR products were cloned in-frame N-terminal to the GFP gene driven by the CaMV35S promoter.

Bioinformatic and phylogenetic reconstruction methods

The set of Arabidopsis genes encoding 4CL enzymes (Ehlting et al., 1999) was used in blast homology searches to identify potential 4CL-like/ACS genes in the Arabidopsis genome, using the database maintained at The Arabidopsis Information Resource. Poplar homologues were identified by reciprocal blast searches of the poplar genome assembly (Joint Genome Institute, Populus trichocarpa v.1.1) using At4CL and AtACS sequences as queries. The poplar gene models (from automated ab initio gene-calling programs; Tuskan et al., 2006) assigned for a given locus were evaluated, annotated manually and revised as necessary (Table 1 and the Supporting Information, Table S4). All annotated candidates corresponded to loci anchored to poplar linkage groups or to sequence scaffolds, as described in Tuskan et al. (2006). Corresponding rice homologues (Table 1 and Table S4) were identified in the rice genome using blast searches of the rice genome annotation at The Institute for Genomic Research (TIGR). Physcomitrella sequences were identified using the JGI website. All sequences are given in Table 1 and Table S4. Selected microorganism sequences were obtained by blast searches using At4CL and AtACS sequences as queries at the NCBI website. The accession numbers of these sequences are given in Table S4. Protein sequences were aligned using the Genomatix Dialign program, and the multiple protein sequence alignments were manually optimized. To reconstruct phylogenetic trees, maximum likelihood analyses with 100 bootstrap replicates were carried out using phyml v. 2.4.4 (Guindon & Gascuel, 2003) with the JTT model of amino acid substitution.
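
One common way to formalize a reciprocal blast search is the reciprocal best hit criterion, sketched below. This is an illustration under stated assumptions rather than the procedure used here: the text does not specify score thresholds or tie handling, the hit tuples stand in for parsed blast output, and all identifiers in the example are hypothetical placeholders.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query_id, subject_id, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_hits, reverse_hits):
    """Return (query, subject) pairs that are each other's best hit."""
    forward = best_hits(forward_hits)   # e.g. Arabidopsis queries vs poplar
    reverse = best_hits(reverse_hits)   # e.g. poplar queries vs Arabidopsis
    return sorted(
        (query, subject)
        for query, subject in forward.items()
        if reverse.get(subject) == query
    )


if __name__ == "__main__":
    # Hypothetical identifiers and scores, for illustration only.
    forward = [
        ("At_query1", "Poptr_model1", 500.0),
        ("At_query1", "Poptr_model2", 300.0),
        ("At_query2", "Poptr_model2", 250.0),
    ]
    reverse = [
        ("Poptr_model1", "At_query1", 480.0),
        ("Poptr_model2", "At_query1", 310.0),
    ]
    print(reciprocal_best_hits(forward, reverse))
```

In gene families that have expanded by duplication, a strict one-to-one criterion like this misses paralogues, which is one reason candidate gene models are also evaluated manually and placed on a phylogenetic tree.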

Table 1.  Poplar, rice, Physcomitrella and Arabidopsis 4CL (4-coumarate:coenzyme A ligase) and ACS (acyl-CoA synthetase) genes
Name        Clade¹   Gene model                                  Predicted PTS1 targeting sequence²
Poptr4CL1   4CL      estExt_fgenesh4_pg.C_1210004 scaffold3
Poptr4CL2   4CL      gw1.XVIII.2818.1 LG_XVIII
Poptr4CL3   4CL      grail3.0100002702 LG_I
Poptr4CL4   4CL      grail3.0099003002 LG_IX
Poptr4CL5   4CL      fgenesh4_pg.C_LG_III001773 LG_III

β-Glucuronidase (GUS) and green fluorescent protein (GFP) reporter gene activity

To generate transgenic Arabidopsis Col-0 plants expressing the GUS reporter under control of ACS gene promoters, plants were transformed following the floral dip protocol (Clough & Bent, 1998). Five to ten transgenic lines per construct were surveyed for histochemical GUS activity at various stages of development, and wound-induced GUS activity was assayed in representative lines.

The GUS histochemical assay solution was prepared by mixing an aqueous solution of 100 mM Na3PO4, pH 7.0, 0.5% X-GLUC (bromochloroindolyl-β-glucuronide) with an aqueous solution of 2 mM K3Fe(CN)6 and 2 mM K4Fe(CN)6 in 0.1% Triton X. Young Arabidopsis leaf blades were wounded with scissors, cut from the plant after 1 h and placed in cold GUS assay solution, vacuum infiltrated for 15 min and incubated at 37°C for 2 h or until a blue colour could be observed. Chlorophyll was removed from the tissue by incubation in 95% ethanol overnight. Results were documented using a Leica dissecting microscope equipped with a Spot32 camera at the UBC BioImaging Facility, Vancouver, Canada.

Agrobacterium strains carrying GFP::ArathACS4 and GFP::PoptrACS5 were used to generate eight independent transgenic tobacco lines by leaf disc transformation (Hauffe et al., 1991) and plants were screened for fluorescence indicating GFP expression under an epifluorescence microscope (Zeiss Axioplan 2). Plants with both high and low levels of GFP fluorescence were selected for analysis by confocal microscopy (Zeiss Meta Confocal). Wild-type tobacco plants were used as negative controls. Chloroplast autofluorescence was excited with a 488 nm argon laser and detected after passage through a long-pass 650 nm emission filter. The GFP fluorescence was excited with a 488 nm laser and detected after passage through a band pass 505–530 nm emission filter. Images were reconstructed using the imagej software suite (Rasband, 1997–2007).

Results and Discussion

Genome-wide phylogenetic analysis of acyl:CoA synthetase-related genes

The adenylate-forming enzyme superfamily that uses a common reaction mechanism includes members from all organisms studied to date, including prokaryotes and eukaryotes (Becker-Andre et al., 1991; Shockey et al., 2003; Schneider et al., 2005). We identified 88 full-length genes encoding adenylate-forming enzymes related to 4CL from genomic databases, using in silico similarity searches based on the amino acid sequences of Arabidopsis 4CL proteins (Ehlting et al., 1999; Hamberger & Hahlbrock, 2004). In this analysis, we focused on three angiosperms with complete genome sequences available (Arabidopsis, poplar and rice), the genomes of maize, Physcomitrella, Chlamydomonas and the genomes of selected other microorganisms (fungi and bacteria) for which complete or substantial genome sequence data is available.

A phylogenetic analysis of over 100 sequences from these various organisms, including bona fide 4CL sequences from Arabidopsis, poplar, and rice, is shown in Fig. 1. This analysis revealed two general groups of adenylate-forming proteins. One large group contained representatives from all organisms analysed, including bacteria, fungi, Chlamydomonas, Physcomitrella and angiosperm plants (large arc in Fig. 1). These probably represent adenylate-forming enzymes with metabolic functions that are conserved in many lineages. As an example of possible functions, one clade in this group contains the Arabidopsis ACN1 gene, which encodes an acetate:CoA ligase that functions as an entry point to the glyoxylate cycle during seed germination (Turner et al., 2005). The clade containing ACN1 is enriched for angiosperm plant sequences, and includes Arabidopsis genes encoding the acyl activating enzymes (AAE) AAE4 and AAE6, which may perform housekeeping functions related to fatty acid metabolism (Shockey et al., 2003), as well as two Chlamydomonas genes. A sister clade contains a Saccharomyces cerevisiae gene encoding the FAT2 peroxisomal acyl-CoA synthetase, as well as the Arabidopsis AAE3 gene and several poplar, rice, and Physcomitrella genes, all of unknown function. The ACS protein encoded by the bacterium Streptomyces coelicolor A3(2), ScCCL, has been shown to have high activity against 4-coumaric and cinnamic acids and is thus designated as a cinnamic acid-CoA ligase (Kaneko et al., 2003). This raises the possibility that other bacterial and fungal enzymes in this clade have activities against phenolic substrates. Based on their similarity to Arabidopsis AAE genes, we designated the annotated poplar, rice, and Physcomitrella genes in this large group as AAEL (acyl activating enzyme-like) genes (Table S4).

Figure 1.

Phylogenetic tree of 104 proteins related to 4-coumarate:CoA ligase (4CL). 4CL-related (acyl-CoA synthetase (ACS)) sequences corresponding to translated nucleotide sequences from full-length cDNAs and expressed sequence tags (ESTs) were aligned, and an unrooted phylogenetic tree was generated. Nodes with bootstrap values above 70% are indicated by asterisks. Triangles represent the subclades of land plant-specific ACSs most closely related to true 4CL enzymes. The large dashed arc indicates a large group of genes found in all organisms from all lineages investigated (land plants, algae, fungi, protists and prokaryotes); the dashed oval indicates the clade of land plant-specific ACS genes, as discussed in the text. Solid triangle, true 4CL enzymes; stippled triangle, nonperoxisomal ACSs in clade A; hatched triangles, ACS clades B–E as described in the text. A Physcomitrella clade of land-plant ACSs is indicated. Gene name prefixes: At, Arabidopsis thaliana; Os, Oryza sativa; Poptr, Populus trichocarpa; Pp, Physcomitrella patens. Gene names and identifiers are given in Table 1 and the Supporting Information, Table S4. The scale represents 0.1 amino acid changes.

A second group of adenylate-forming proteins (Fig. 1, highlighted by an oval) contained both bona fide 4CL proteins and previously annotated Arabidopsis 4CL-like (Costa et al., 2003; Raes et al., 2003; Shockey et al., 2003; Ehlting et al., 2005) acyl-CoA synthetase (ACS) proteins. Strikingly, this ACS group is apparently land plant-specific. All angiosperm genomes, as well as the genome of the moss Physcomitrella patens, contained genes encoding proteins in this group, while no representatives from other eukaryote lineages were found. It is interesting, however, that Penicillium and Dictyostelium contain genes relatively closely related to this land plant group. Inclusion of full-length gymnosperm ACS gene sequences, when they become available, in this analysis will allow definitive testing of the interpretation that this clade is restricted to land plants.

Within the apparent land plant-specific group, 5 clades containing angiosperm ACSs could be further delineated (Fig. 1; clades A–E). The ACS genes encoding proteins in these clades are phylogenetically closely related to bona fide 4CLs, which form a sister clade to clades A–E. As presented in more detail later, each of clades A–E contains at least one sequence representative of each of the four angiosperm plant species analysed (Arabidopsis, poplar, rice and maize), demonstrating that these proteins are evolutionarily conserved in the angiosperm lineage and that a common ancestor of these clades was present before the divergence of monocots and eudicots. The Arabidopsis, poplar, and rice proteins represented in the bona fide 4CL clade have been described and annotated (Ehlting et al., 1999; Hamberger & Hahlbrock, 2004; Tsai et al., 2006; Tuskan et al., 2006; Hamberger et al., 2007). The previously and currently annotated Arabidopsis, poplar, rice and Physcomitrella 4CL and ACS genes are given in Table 1.

Analysis of the Physcomitrella genome revealed four putative 4CL genes encoding proteins that grouped in the bona fide 4CL clade (Table 1; de Azevedo Souza et al., unpublished), as well as five ACS genes falling outside the angiosperm ACS clades B–E. One Physcomitrella ACS sequence grouped into clade A (PpACS6; Table 1). This suggests that 4CL and clade A ACS genes originated early during the evolution of land plants before divergence of tracheophytes (vascular plants) and bryophytes, consistent with a possible role for these enzymes in the biosynthesis of phenylpropanoids and/or the extracellular matrix, a key innovation for adaptation to the land environment (Bowman et al., 2007). Interestingly, Physcomitrella PpACS5 is basal to the ACS clades B–E, suggesting that it could represent an ancestral ACS gene retained in Physcomitrella. The remaining four Physcomitrella ACS sequences formed a distinct clade, suggesting bryophyte-specific evolution and diversification of this ACS gene family, which could encode proteins with functions distinct from those of angiosperm ACSs.

Almost all sequences in clades B, C, D and E, as well as Physcomitrella sequences PpACS1–4, contain a consensus PTS1 peroxisomal target sequence (Table 1, Fig. 2; Reumann et al., 2004) at their C-termini, which suggests they are targeted to this organelle. Localization of ACS proteins to the peroxisome has been experimentally verified for Arabidopsis proteins in clade E (At4g05160, ACS6 and At5g63380, ACS9; Schneider et al., 2005) and the Arabidopsis OPCL1 (At1g20510) protein in clade D (Koo et al., 2006). Interestingly, all fungal ACS-related enzymes identified and shown in Fig. 1 also have peroxisomal target signals. Given that peroxisomes play a major role in fatty acid metabolism in both plants and fungi, and that acyl:CoA ligases are widely used in modification of these molecules, it is possible that ACSs and 4CLs were recruited from fatty acid metabolism early in land plant evolution, to perform their current functions. None of the ACSs in clade A, most closely related to bona fide 4CLs, contained the PTS1 sequence, suggesting that loss of this sequence may have played a role in the acquisition of 4CL and subclade A functions. Furthermore, Physcomitrella PpACS5 located at a position basal to ACS clades B–D also lacks a PTS1 targeting sequence, suggesting a potentially distinct biochemical function for this enzyme, relative to the peroxisomally targeted ACSs.

Figure 2.

Phylogenetic relationships of plant-specific acyl-CoA synthetases (ACSs) from three fully sequenced angiosperm genomes. Translated nucleotide sequences corresponding to ACS genes from Arabidopsis, poplar and rice were aligned and an unrooted phylogenetic tree generated. Nodes with bootstrap values above 70% are shown by stars. The 4-coumarate:CoA ligase (4CL) and ACS clades A–E discussed in the text are circled and contain at least one representative of each plant species. Protein names in shaded boxes contain the PTS1 peroxisomal target signal. Bar represents 0.1 amino acid changes.

The angiosperm-specific ACS clades were next analysed in more detail, focusing on complete gene families from Arabidopsis, poplar, and rice, for which whole genome sequence information is available. As shown in detail in Fig. 2 and Table 1, all three species contained ACS proteins in each of clades A–E. While the number of ACS genes within each genome was similar (13 in poplar, 12 in rice, and 9 in Arabidopsis; Table 1 and Fig. 2), the number in each clade varied between species. Certain ACS clades were greatly enriched for proteins from a particular species. For example, clade D is an Arabidopsis gene-rich clade, with five Arabidopsis representatives, two from poplar and only one rice member. By contrast, clade E is poplar rich with seven genes, three from rice and only one from Arabidopsis. Clade A, unique in containing ACS genes lacking the PTS1 targeting signal, is the only clade that contained a single representative from each species.

These data clearly show that ACS genes in different clades have undergone differential expansion in each angiosperm lineage, perhaps reflecting different events in the evolution of their genomes and differences in life histories that placed varying selective pressures on the elaboration of biochemical pathways requiring ACS activity. Four of the five Arabidopsis ACS genes in clade D are located in tandem on chromosome 1, suggesting tandem duplication and selection for retention of the duplicated copies. Two of the poplar ACS genes in clade E (PoptrACS11 and PoptrACS12) appear to have arisen by tandem duplication on linkage group XII. However, other members of this and other clades for which the poplar gene models are anchored to linkage groups are physically unlinked and on different linkage groups. This suggests that tandem gene duplication has not played a major role in diversification of the poplar ACS gene family. Many of the poplar ACS genes may rather have been retained after the salicoid whole genome duplication in the poplar lineage, in which chromosome doubling and subsequent rearrangement is thought to have increased the poplar chromosome complement from n = 10 to the current n = 19 (Tuskan et al., 2006). For example, in the poplar-rich clade E, PoptrACS10 and PoptrACS11/12, which are located on duplicated homoeologous linkage groups XII and XV, and PoptrACS6 and PoptrACS7, which are located on duplicated homoeologous linkage groups XIII and X, are likely to have arisen in this manner (Tuskan et al., 2006). Also noteworthy among the poplar clade E ACS proteins is the loss of C-terminal PTS1 peroxisomal targeting sequences in two members (PoptrACS6 and PoptrACS12), suggesting that functional diversification may have taken place at the level of enzyme localization in the poplar lineage after gene duplication. Taken together, these data show conservation of ACS gene number in all three lineages for some clades (A, B and C, with one or two members from each lineage), suggesting possible conservation of function. Conversely, gene numbers in clades D and E have diversified in a lineage-specific manner, suggesting possible lineage-specific diversification of function.

Developmental expression of poplar and Arabidopsis ACS genes

To gain insights into the possible functions of ACS proteins, we examined the developmental gene expression patterns of all Arabidopsis ACS genes, as well as representative poplar genes in clades A–E, by qRT-PCR. Expression profiles in different Arabidopsis and poplar organs and tissues are shown in Fig. 3, relative to the organ or tissue with the highest expression. To evaluate absolute expression levels, we also calculated the expression level of each gene tested relative to the respective Arabidopsis or poplar control gene in the organ or tissue where the gene was most highly expressed (Fig. 3). In general, Arabidopsis and poplar ACS genes from the same clade tended to have similar relative developmental expression patterns, and similar levels of expression. This is especially evident in those clades with single Arabidopsis and poplar ACS representatives. A striking example is clade A, in which AtACS5 expression was strongly flower specific, and PoptrACS13 had a similar pattern of flower-preferred expression. Interestingly, PoptrACS13 expression was specific to male flowers, and a putative AtACS5 orthologue in tobacco shows an anther-preferred expression pattern (Varbanova et al., 2003), suggesting a role for ACS enzymes of this clade in a biochemical pathway important in anther and/or pollen development. Another example is the predominant expression of both Arabidopsis and poplar representatives of clade C in leaves, with lower expression in stem/xylem and flowers. Both the Arabidopsis and poplar representatives were expressed at low levels relative to the respective control genes, suggesting a potential specialized function (for example, expression in restricted cell types). Clade B contains two poplar genes and a single Arabidopsis member. Genes in this clade were expressed in all organs, but both poplar PoptrACS2 and AtACS6 showed highest expression in mature leaves, and lower expression in other organs. The pattern of PoptrACS1 expression differed from that of PoptrACS2, with highest expression in flowers, bark and young leaves, suggesting subfunctionalization in expression patterns, as predicted for genes retained after gene duplication events (Duarte et al., 2006). The Arabidopsis and poplar genes in this clade were expressed at similar, high levels relative to the control genes, but expression of PoptrACS2, the most highly expressed poplar ACS gene, was much higher than that of PoptrACS1, supporting the subfunctionalization of these genes at the level of expression.
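The two normalizations described above (expression relative to the organ with the highest expression, and peak expression relative to the control gene in that same organ) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used for Fig. 3: the function names and all numeric values are hypothetical, and the inputs are assumed to be already-quantified transcript abundances on a common linear scale.

```python
def relative_to_max(expression):
    """Scale each organ's value to the organ with the highest expression."""
    peak = max(expression.values())
    if peak <= 0:
        raise ValueError("peak expression must be positive")
    return {organ: value / peak for organ, value in expression.items()}


def peak_relative_to_control(expression, control):
    """Return the peak organ and the gene/control ratio in that organ."""
    peak_organ = max(expression, key=expression.get)
    return peak_organ, expression[peak_organ] / control[peak_organ]


# Hypothetical abundances for one gene and one control gene (arbitrary units).
gene = {"leaf": 2.0, "flower": 8.0, "root": 4.0}
control = {"leaf": 4.0, "flower": 16.0, "root": 4.0}

print(relative_to_max(gene))
print(peak_relative_to_control(gene, control))
```

The first function yields the relative profile across organs, while the second gives a single number per gene that allows expression levels of different genes to be compared.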

More complex expression patterns were observed in clades where expansion of gene family members in either Arabidopsis or poplar has occurred. In clade D, the duplicated and highly similar poplar genes PoptrACS4 and PoptrACS5 appeared to have similar expression patterns across a range of organs (Fig. 3), and shared similar, high expression levels. However, the transcribed portions of the two poplar genes were so similar that cross-detection cannot be excluded. By contrast, for the five Arabidopsis members of clade D, distinct and complementary expression patterns were observed throughout the majority of organs tested, and the genes were expressed at remarkably different levels (from 0.51 relative to the control gene level for AtACS2, to 1482 relative to the control gene for OPCL1), which suggests expression subfunctionalization of these genes. The demonstration that the Arabidopsis OPCL1 gene in this clade encodes an OPDA-CoA ligase (Koo et al., 2006) is consistent with the very high developmental expression of OPCL1 in flowers, where JA plays a role in anther development (Ito et al., 2007). Given their phylogenetic relationships to OPCL1, other genes in this clade could share this activity, especially since some of them are wound and methyl jasmonate (MJ) inducible (Koo et al., 2006; see below), and since the opcl1 mutant retains the ability to make JA both developmentally and in response to wounding (Koo et al., 2006). The distinct developmental expression patterns and expression levels of the Arabidopsis clade D ACS genes could therefore suggest specialization of at least some of the genes for developmental biosynthesis of JA in different organs.

The only Arabidopsis representative in clade E, AtACS9, was most highly expressed (at a high absolute level) in seedlings, followed by flowers. The three poplar homologues most closely related to AtACS9 (PoptrACS10, PoptrACS11 and PoptrACS12) displayed expression patterns largely complementary to each other, covering leaves, roots and male flowers, although all had high expression in roots. As with AtACS9, expression levels of the poplar genes were generally high, although they varied over 10-fold among the genes. Thus, expansion of this family of ACS genes in poplar appears to have been accompanied by subfunctionalization for developmental expression.

Stress-induced expression of poplar and Arabidopsis ACS genes

The stress-responsiveness of the set of ACS genes, except the flower-specific genes in clade A (which are not expressed in vegetative organs in Arabidopsis), was tested at various times after mechanical wounding in both Arabidopsis and poplar. ACS promoter-GUS fusions were constructed using genomic sequences between 1.5 kb and 2 kb in length upstream of the ATG start codon. Transgenic lines were generated, and promoter activity following wounding was assayed in representative lines. Wound-induced transcription of Arabidopsis and poplar genes was also tested by qRT-PCR. In addition, expression of the poplar ACS genes was tested by qRT-PCR after each of the following treatments: herbivory by the forest tent caterpillar (Malacosoma disstria; FTC), simulated herbivory (SH; wounding plus Malacosoma disstria regurgitant) and exposure to MJ.

Wound-induced expression data are shown in Fig. 4, and the responses of poplar genes to herbivory, SH and MJ treatments are shown in Fig. 5. Arabidopsis 4CL2, known to be wound inducible (Ehlting et al., 1999), was used as a positive control for Arabidopsis wound treatments, and was upregulated over fivefold at 4 h after wounding (data not shown). The endogenous AtACS6 gene and AtACS6 promoter-GUS transgene were not wound inducible (Fig. 4), and PoptrACS2 in the same clade (B) was unresponsive both to wounding and other stress treatments (Figs 4 and 5). However, PoptrACS1, also in clade B, was upregulated 1.6-fold at 4 h after wounding, and was strongly and transiently upregulated by SH and FTC treatments (Fig. 5). In a separate microarray expression profiling experiment, AtACS6 expression was not activated by diamondback moth herbivory (J. Ehlting and J. Bohlmann, unpublished). These data suggest that AtACS6 and PoptrACS2 do not function in wound or herbivory stress-related biochemical pathways, but that PoptrACS1 may have gained this function after duplication of the gene in the poplar lineage, and that the biochemical pathway using the product of the enzyme encoded by PoptrACS1 may play a role in defence against herbivory.

In clade C, AtACS7 expression was downregulated by wounding to less than half the level of the unwounded control, and a similar result was obtained for the single poplar homologue in this group, PoptrACS3. Interestingly, PoptrACS3 was strongly and transiently upregulated by SH but not other stresses. These results suggest that enzymes in this clade are not likely to play roles in biochemical pathways related to herbivory and similar stresses, although the strong response of PoptrACS3 to SH treatment warrants further investigation.

Arabidopsis and poplar genes in clade D were particularly responsive to wounding stress. Histochemical assays of transgenic promoter-GUS lines showed that the Arabidopsis ACS2, ACS3 and OPCL1 promoters were strongly responsive to wounding (Fig. 4a). Accumulation of ACS2 and ACS3 mRNA was also strongly upregulated by wounding, with ACS3 levels 14-fold above the untreated control within 1 h of wounding (Fig. 4b). While wound-induced accumulation of OPCL1 and AtACS8 mRNA was not obvious under the conditions used here, Koo et al. (2006) previously showed that expression of OPCL1 is strongly, and ACS8 weakly, wound inducible. OPCL1 has been shown to encode a wound and MJ-inducible OPDA:CoA ligase involved in JA biosynthesis (Koo et al., 2006). Our data are consistent with potential roles of other Arabidopsis ACS genes in this clade in JA biosynthesis, in addition to OPCL1, perhaps as members of an OPCL gene family, as proposed by Koo et al. (2006). In support of this, AtACS8, as well as OPCL1, is activated by diamondback moth herbivory based on microarray expression profiling (J. Ehlting and J. Bohlmann, unpublished data). Furthermore, analysis of public microarray gene expression data using the Bio-Array Resource (Toufighi et al., 2005) showed that expression of OPCL1, AtACS2 and AtACS8 is induced by MJ treatment in 7-d-old seedlings. AtACS3 is not represented in the microarray probe sets used in these experiments, but was strongly wound inducible in our experiments (Fig. 4). Unlike the other Arabidopsis members of clade D, there is no evidence that AtACS1 is wound or MJ inducible (Fig. 4), and it lacks apparent activity against JA precursors in vitro (Kienow et al., 2008), suggesting that it is less likely to be involved in JA biosynthesis. However, further analyses will be required to determine if Arabidopsis clade D proteins other than AtOPCL1 have in vivo activities against OPDA, or preferentially use other substrates.

The expression of the poplar genes in this clade, PoptrACS4 and PoptrACS5, was induced up to fivefold 4 h after wounding, and remained high 24 h after treatment (Fig. 4b). PoptrACS4 and PoptrACS5 were also strongly upregulated by herbivory, SH and MJ, with the last treatment leading to a transient 20-fold increase in expression by 2 h, with mRNA returning to control levels by 24 h (Fig. 5). These two highly similar poplar genes are most similar to Arabidopsis OPCL1 based on phylogenetic reconstruction (Fig. 2), and like OPCL1 are highly expressed in a developmental context (Fig. 3d). These data are consistent with PoptrACS4 and PoptrACS5 encoding poplar OPDA:CoA ligases, a function that OsACS4 (Os03g04000), the only rice gene in clade D (Fig. 2), could share. Interestingly, if this is the case, the two duplicated and highly similar poplar OPCL genes, and the single rice gene, contrast sharply in number with the expanded clade D family of potential OPCLs in Arabidopsis, suggesting that the regulation of JA biosynthesis in Arabidopsis may be more complex than in these other two angiosperm lineages.

Expression of the single Arabidopsis clade E gene, AtACS9, was downregulated after 1 h in response to wounding, and expression stayed below the control levels up to 24 h. Of the three poplar homologues most related to AtACS9, PoptrACS12 showed the most similar expression pattern, whereas PoptrACS10 and PoptrACS11 showed little or no change in expression in response to wounding. The poplar genes in clade E had similar responses to FTC, SH, and MJ treatments (PoptrACS12 was downregulated; PoptrACS10 and PoptrACS11 showed only minor fluctuations). The enzyme encoded by AtACS9 has been shown to be a fatty acyl:CoA synthetase, with activity in vitro with fatty acids and OPDA, a precursor in JA biosynthesis (Schneider et al., 2005; Kienow et al., 2008). It is localized to the peroxisome and has been suggested to be a potential OPDA:CoA ligase (Schneider et al., 2005; Kienow et al., 2008). However, the lack of wound, herbivory and MJ activation of AtACS9 and of the most closely related poplar ACS genes, as well as the phylogenetic distance between this gene and clade D containing the Arabidopsis OPCL1 gene, do not support a role for these enzymes in the stress-induced synthesis of JA, and the in vivo substrate of AtACS9 remains to be clarified.

The clade E gene PoptrACS12 has distinct expression patterns relative to the most closely related genes in poplar, PoptrACS10 and PoptrACS11. It shows downregulated expression in response to wounding and related stresses (Figs 4 and 5), and shows a contrasting developmental expression pattern, with highest expression in roots, phloem and bark (Fig. 3e). Interestingly, the PoptrACS12 protein lacks the PTS1 peroxisomal targeting sequence (Fig. 2), suggesting that, while PoptrACS10 and PoptrACS11 may have enzymatic functions similar to the AtACS9 protein as peroxisomally localized fatty acyl:CoA synthetases, PoptrACS12 may have acquired a new function specific to the poplar lineage, as a nonperoxisomal CoA ligase.

Subcellular localization of PoptrACS5

We hypothesized that PoptrACS5 is an orthologue of the Arabidopsis OPCL1 gene. To test this, we attempted to express PoptrACS5 heterologously in Escherichia coli, but were unable to generate a strain expressing sufficient PoptrACS5 protein to test its activity in vitro. To test the subcellular localization of PoptrACS5, we generated N-terminally tagged PoptrACS5-GFP and Arabidopsis OPCL1-GFP fusions, generated transgenic tobacco lines, and assayed reporter fluorescence in mesophyll and epidermal cells (Fig. 6). This analysis showed that both proteins were localized to similar subcellular structures consistent with a peroxisomal localization, as previously demonstrated by Koo et al. (2006) for AtOPCL1. Mesophyll cell peroxisomes in which PoptrACS5-GFP accumulated appeared to be physically associated with chloroplasts (Fig. 6b). This is consistent with the necessity of shuttling intermediates in JA biosynthesis between the plastids, where OPDA is produced, and peroxisomes, where OPDA:CoA thioesters are generated in preparation for acyl chain shortening to produce JA (Li et al., 2005; Koo et al., 2006).

Figure 6.

Subcellular localization of AtOPCL1 and the poplar homologue PoptrACS5 in guard cells of transgenic tobacco lines. (a) Green fluorescent protein (GFP) fluorescence in lines expressing N-terminal AtOPCL1:GFP and PoptrACS5:GFP proteins. An untransformed line was used as negative control. Red signals derive from chlorophyll autofluorescence and GFP fluorescence is green. (b) Detailed view of PoptrACS5:GFP showing close proximity of peroxisomes (showing GFP fluorescence) with chloroplasts (red autofluorescence).


Our data show that the 4CL-like ACS enzymes are a conserved, land plant-specific group of adenylate-forming enzymes most closely related to true 4CL enzymes among a larger group of plant and microbial acyl activating and acyl:CoA synthetase enzymes. The fact that ACS representatives, as well as true 4CL enzymes, are found in the moss Physcomitrella suggests that both groups of enzymes evolved early in the transition of plants to the terrestrial environment and were present in the most recent common ancestor of bryophytes and vascular plants. The ACS enzymes are largely predicted to be peroxisomally located and are related to peroxisomal enzymes found in fungi. This suggests that the progenitors of land plant 4CL and the related ACS enzymes may have originally performed peroxisomal functions and that bona fide 4CL enzymes and clade A ACS enzymes evolved from such progenitor enzymes by loss of the PTS1 peroxisomal targeting sequence and acquisition of new functions in other cellular compartments. It appears that loss of peroxisomal targeting sequences can occur relatively easily in this group of enzymes, since we found two apparently independent examples among the duplicated poplar ACS genes (PoptrACS6 and PoptrACS12, both in clade E), as well as the Physcomitrella PpACS5 gene. The nonperoxisomal poplar proteins have presumably acquired new functions in the poplar lineage, subsequent to the gene duplication events.

The in vivo biological functions of the angiosperm ACS enzymes are for the most part still unknown, although such functions may be conserved within the conserved angiosperm clades A–E, and in many cases they have expression patterns that suggest functions in developmental and/or stress-related biochemical pathways not related to phenylpropanoid metabolism. In addition to Arabidopsis OPCL1 and its poplar homologues, other stress-induced ACS enzymes in clade D may have acyl-activating functions associated with chain-shortening reactions in JA biosynthesis. AtACS6 (At4g05160; clade B) and AtACS9 (At5g63380; clade E) have activity against a variety of acyl substrates including OPDA or its derivatives and long-chain fatty acids (Schneider et al., 2005; Kienow et al., 2008), although in vivo substrates have not been identified. Thus, many or all of the ACS proteins may accept a diversity of different fatty acid or other acyl substrates. Additional data from surveys of potential substrates (Schneider et al., 2005; Kienow et al., 2008), 4CL and ACS structural information, and in vivo functional assays (for example using reverse genetic approaches) will be necessary to determine functions of Arabidopsis ACS enzymes.

We targeted Arabidopsis ACS genes in two of the clades, A and E, with single Arabidopsis representatives (AtACS5 and AtACS9, respectively, Fig. 2) for reverse genetic analysis using T-DNA insertion null-mutant lines (data not shown). While no metabolic or morphological phenotypes were evident for a homozygous acs9 mutant line (data not shown), the acs5 mutant showed a striking male sterility phenotype (data not shown; C. de Azevedo Souza, C. Douglas et al., unpublished). This is consistent with roles for AtACS5 and the poplar homologue PoptrACS13 in anther development predicted from their developmental expression patterns (Fig. 3a). Further comparative biochemical studies on enzymes encoded by the corresponding poplar, rice, and Physcomitrella ACS genes will help to shed light on the conservation of pathways utilizing ACS enzymes in land plant lineages.


We thank Jürgen Ehlting for discussions and advice, Keith Turner and colleagues at the British Columbia Institute of Technology (BCIT) for generating transgenic tobacco plants, and John D’Auria, Max-Planck Institute for Chemical Ecology, Jena, for Arabidopsis APT1 primer sequences. This work was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant to C.J.D., and by Genome Canada and the Province of British Columbia (Treenomix I project) funds to C.J.D. and J.B.