Cofactome analyses reveal enhanced flux of carbon into oil for potential biofuel production




To identify the molecular basis of carbon partitioning between starch and oil, we conducted 454 pyrosequencing followed by custom microarray profiling of gene expression throughout endosperm development in two closely related oat cultivars that differ in oil content at the expense of starch, as determined by several approaches including non-invasive magnetic resonance imaging. Comparative transcriptome analysis in conjunction with metabolic profiling displays a close coordination between energy metabolism and carbon partitioning pathways, with increased demands for energy and reducing equivalents in kernels with a higher oil content. These studies further expand the repertoire of networks regulating carbon partitioning to those involved in metabolism of cofactors, suggesting that an elevated supply of cofactors, here called cofactomes, contributes to the allocation of higher carbon pools for production of oils and storage proteins. These data highlight a close association between cofactomes and carbon partitioning, thereby providing a biotechnological target for conversion of starch to oil.


Endosperm is the major starch storage organ that constitutes up to 95% of mature cereal seeds (Bartels and Thompson, 1986; Radchuk et al., 2009). Although endosperm of some cereals, including barley and wheat, contains low levels of oil (∼3%), oil accumulation is predominantly confined to the embryo and scutellum, organs with ∼15% oil but insignificant contribution to the total grain weight (Price and Parsons, 1975; Hargin et al., 1980; Neuberger et al., 2009). Exhaustive breeding and engineering efforts directed at the development of high-oil maize only led to increases in the size and oil concentration of the embryo and scutellum without a significant impact on the endosperm oil levels, maintained at <1% (Leng, 1961; Alexander et al., 1967; Shen et al., 2010; Alonso et al., 2011). Oat (Avena sativa) is unique among cereals, as there are oat varieties that contain up to 18% oil (Alexander et al., 1967; Price and Parsons, 1975; Alonso et al., 2011), mainly (up to 90%) deposited in the endosperm (Price and Parsons, 1979; Banas et al., 2007; Heneen et al., 2008; Heneen et al., 2009). Detailed examination of two closely related oat cultivars, cv. Matilda and Freja, demonstrated their similarities in grain weight, and their differences in the composition of storage contents. Specifically, in contrast to Freja, Matilda contains higher oil and protein levels at the expense of starch (Banas et al., 2007; Ekman et al., 2008), supporting the earlier findings that in oat the oil content is positively correlated with protein and is negatively correlated with starch content (Peterson and Wood, 1997). The differential developmental regulation of oil production between the endosperm of these two cultivars was established by demonstrating that in Freja oil deposition is limited up to the mid-developmental-stage whereas in Matilda this process continues to maturity (Ekman et al., 2008).

The fatty acid and starch biosynthetic pathways and their coordination have been studied extensively, and yet our understanding of the nature of the regulatory coordinating components is still fragmentary. Oat endosperm offers a suitable platform to identify the metabolic switches responsible for the coordination of carbon partitioning between these two pathways.

Oils are the most abundant form of reduced carbon chains in nature, having diverse utility in food and industrial applications (Thelen and Ohlrogge, 2002). The rising cost of petroleum, together with increased environmental concerns, has recently led to an unprecedented demand for sustainable oils as renewable alternatives to fossil fuels and other oil-based industrial applications. However, one of today’s biggest challenges in plant oil production is the restricted feedstock, since the supply of vegetable oils relies upon limited amounts of a few crops (Durrett et al., 2008).

The knowledge derived from using oat as a platform for identification of regulatory network switches involved in carbon partitioning may bring to light new targets for the generation of designer oil crops capable of redirecting carbon flux from starch to oil, thereby overcoming the current constraints in producing economically viable levels of plant oil. Towards this goal, we have utilized a combination of 454 sequencing and custom microarray profiling in parallel with metabolic profiling at developmental periods with near-linear phase oil deposition in Matilda and Freja endosperm. Collectively, these analyses established a close association between energy metabolism and carbon allocation, and have expanded the repertoire of regulatory networks involved in carbon partitioning to cofactor metabolism as a biotechnological target for conversion of starch to oil.


Nuclear magnetic resonance studies display differential lipid distribution patterns in oat endosperm

The development of Matilda and Freja seeds was compared by phenotypic and metabolic analyses from early stages to near maturity (Fig. 1a–g). The phenotypic comparison at various stages of seed development (previously classified as stages A to H) (Ekman et al., 2008), was combined with fresh weight measurements at the corresponding stages (Fig. 1a). These data established that the general course of growth is extremely similar between the seeds of these two cultivars, and therefore here we only present the data obtained from Matilda seeds (Fig. 1a). However, the composition of the storage compounds differs significantly between them, as Matilda seeds accumulate higher levels of proteins and lipids and lower levels of starch compared with Freja (Fig. 1b).

Figure 1.

 Phenotypic and metabolic profiling combined with NMR analysis display differential lipid distribution patterns and amino acid levels in oat seeds.
(a) Oat seed phenotypes at sequential developmental stages days after pollination (DAP) and their corresponding fresh weight measurements (mg per seed).
(b) Levels (%w/w) of starch, lipid and proteins at three developmental stages (C–E) represented in bold and highlighted by the grey box.
(c) Analysis of amino acid levels [μmol g−1 dry weight (DW)] at developmental stages C–E in Matilda (orange) and Freja (blue) seeds. The measurement scale varies for each amino acid; actual amounts are presented in Table S2.
(d) Cross and (e) longitudinal views of seeds at developmental stage D/E, displaying chlorenchyma (ch), embryo (em), endosperm (en), nucellar projection (np) and vascular region (vs). The dashed red line indicates the cross-sectional area selected for presentation of the NMR data.
(f) Cross-section of lipid distribution patterns in Freja (upper panel) and Matilda (lower panel) seeds at developmental stage C/D by non-invasive NMR.
(g) Longitudinal view of lipid distribution in Matilda and Freja seeds by NMR analysis. Lipid content is color coded as shown in color bar.

To determine whether the higher total protein level in Matilda as compared with Freja is supported by an increase in the levels of all or only a selected group of amino acids, we examined the levels of free amino acids using the whole oat kernel at the three developmental stages (C–E). These analyses determined that the higher protein level in Matilda is well supported by the elevated levels of 16 out of the 19 free amino acids examined. The levels of glutamate and glutamine remained unchanged between the two, while cysteine exhibited higher levels in Freja than in Matilda (Fig. 1c). The developmental decline in the level of most free amino acids determined here may be representative of a general pattern, as similar observations are reported for other seed species (Rolletschek et al., 2002; Rolletschek et al., 2004b). This suggests that the increased level of total protein is sustained through ongoing storage throughout the developmental stages.

Next we examined whether the presence of different levels of lipid in these two cultivars is solely a reflection of a difference in the cellular content and/or is a manifestation of differential lipid accumulation in different cell layers. For these studies, we examined the lipid distribution patterns within the intact seeds using non-invasive nuclear magnetic resonance (NMR) imaging as described previously (Neuberger et al., 2008, 2009). These NMR experiments were performed on cross and longitudinal views of seeds at developmental stage D/E, a stage with completely formed and distinctly organized tissue structures (Fig. 1d). The analysis of the cross-sections presented here is focused on the central part of the seeds marked by the dashed red line (Fig. 1e), whereas the longitudinal views display the entire length of the seeds. Subsequent to data reconstruction, images of lipid distribution patterns were generated in both cross and longitudinal views of the intact seeds (Fig. 1f,g) as previously described (Neuberger et al., 2009). These data depict endosperm as the predominant site of lipid accumulation in both cultivars and highlight their distinct lipid distribution patterns. In Freja, lipid deposits are detected exclusively in the outermost layers of endosperm (i.e. the aleurone and sub-aleurone), while in Matilda these deposits are present in both the central and outermost layers of endosperm (Fig. 1f,g).

Parallel time-indexed analysis reveals discordance between transcript and core metabolite levels in sequential stages of seed development

To capture the most complete transcriptional profiles of oat endosperm during the full cascade of developmental events, we employed a combination of 454 pyrosequencing and custom microarrays (Fig. 2). Initially, we focused on 454 pyrosequencing of Matilda endosperm transcripts at the two previously identified developmental stages corresponding to the time-index with maximum and minimum rates of oil deposition (Ekman et al., 2008). From these data, 29 575 contigs were constructed using custom PERL scripts. These contigs were subsequently tested as probes for high-resolution custom microarray profiling, and a total of 27 570 were found suitable by the described criteria (see Experimental Procedures and Fig. 2). These contigs were employed in the preparation of the final microarrays hybridized with cDNAs derived from RNA isolated from 15 sequential stages spanning from anthesis to mature endosperms of both cultivars (Fig. 2). A combination of bioinformatics and mixed-model ANOVA-based statistical analysis determined that ∼10% of these contigs are significantly and differentially expressed between Matilda and Freja (Table S1). Among the differentially expressed contigs, 55% remained unassigned and the remaining 45% were assigned to transcripts of genes associated mainly with metabolic processes, protein synthesis, chromatin assembly and transport.

Figure 2.

 Flowchart of the experimental design.
The flowchart depicts the developmental stages employed for 454 pyrosequencing, followed by microarray-assisted probe validation and subsequent high-resolution microarray analysis performed on Freja and Matilda endosperm throughout the full developmental cascade. Steady-state metabolite analyses were performed on three developmental stages, C–E. The levels of malonyl-CoA at the three developmental stages of Freja (blue) and Matilda (orange) seeds are presented as an example. Two-factor mixed ANOVA was utilized to identify the metabolites that depend upon the interaction of genotype and development.

In parallel with the expression profiling approach, we examined the steady-state levels of selected core pathway metabolites with known involvement in starch and lipid metabolism during the three mid-developmental stages (C–E) (Figs 2 and 3). The selection of these developmental stages was based on a combination of the results from the current as well as several previous studies that established near-linear phase oil deposition in the oat endosperm during this period (Ekman et al., 2008), and illustrated the prominence of the enzymes involved in sucrose assimilation, glycolysis and de novo fatty acid synthesis at these developmental stages in a number of oilseeds (Hajduch et al., 2005; Hajduch et al., 2006; Hajduch et al., 2007).

Figure 3.

 Comparative transcriptome and metabolome analyses display discordance at sequential developmental stages of oat seeds.
Parallel transcriptome (endosperm) and metabolome (whole seed) of Freja (blue) and Matilda (orange) were analyzed at sequential stages of seed development. Expression levels, presented on log2 graphs, are measurements of signal intensities derived from 15 serial microarray profiles for each putatively characterized transcript per cultivar over the course of endosperm development. Gray shading in the transcriptome boxes depicts the developmental stages employed for metabolite analysis. Steady-state levels of metabolites at the three developmental stages C–E are displayed bound in blue (higher in Freja), orange (higher in Matilda) or black (no difference) boxes. The red star indicates genes that are significantly differentially expressed between the two oat cultivars by ANOVA, with a P-value < 0.05 after adjusting for false discovery rate (FDR) at the 0.2 level. Full names of genes and metabolites are provided in Table S4.
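The false discovery rate adjustment mentioned in the legend can be illustrated with a short sketch. The text does not name the adjustment procedure, so the Benjamini-Hochberg step-up method used here is an assumption, as are the function names and the example P-values, which are purely illustrative and not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order.

    Assumption: the source does not state which FDR procedure was used;
    BH is shown only as a common choice.
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        candidate = p_values[i] * m / rank
        running_min = min(running_min, candidate)
        adjusted[i] = running_min
    return adjusted


def flag_contigs(p_values, fdr_level=0.2):
    """Flag tests whose adjusted P-value is at or below the FDR level."""
    return [q <= fdr_level for q in benjamini_hochberg(p_values)]


# Hypothetical raw P-values for five contigs (illustrative only).
example_p = [0.001, 0.04, 0.03, 0.5, 0.2]
example_q = benjamini_hochberg(example_p)
example_flags = flag_contigs(example_p)
```

In this sketch a contig is flagged when its adjusted value falls at or below the chosen FDR level (0.2, the level quoted in the legend); how the authors combined the 0.05 and 0.2 thresholds is not specified in the text.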

Subsequent to the metabolic analysis, we utilized the unsupervised statistical method, principal components analysis (PCA), and tested the major sources of variance within the combined transcriptome and metabolome dataset available for the three developmental stages (C–E) of the two oat genotypes Matilda and Freja. This analysis showed that the first two PCA vectors contained over 99% of the total variance, with PCA1 alone accounting for 97.5% of the variance. However, neither vector was able to independently partition genotype or developmental stage, suggesting that the majority of information present in the dataset is only obtainable by investigating the interaction of genotype and developmental stage (Fig. S1a). Similar results were obtained when solely focusing on the transcripts or metabolites in stages C, D and E or transcripts at all stages (Fig. S1b,c, respectively). Thus, we proceeded to conduct two-factor mixed ANOVA to identify the metabolites that depend upon the interaction of genotype and development. The ANOVA-based comparative analyses indicate that the expression levels and patterns of only a few transcripts within the core metabolic pathways are statistically different between the two cultivars, while the steady-state levels of several of the corresponding metabolites differ significantly between them (Fig. 3). This discordance is most notable between the levels of metabolites and the expression of their respective genes in the glycolytic pathway and in the citric acid cycle. Among the several outstanding examples are higher levels of the glycolytic intermediates glucose, fructose and UDP-glucose in Matilda as compared with Freja, while the expression level of the gene encoding sucrose synthase (SuSy), a sucrose-hydrolyzing enzyme primarily associated with cereal endosperm (Emes et al., 2003), is similar between the two.
However, the expression level of the gene encoding a second hydrolyzing enzyme, invertase (Inv), is lower in Matilda, hypothetically an indication that sucrose degradation in Matilda is mainly catalyzed by the energy-conserving route provided via SuSy. In addition, the differences in the levels of hexose phosphates between the two cultivars do not correspond to the changes in the transcript profiles of their respective genes encoding fructokinase (FK), hexokinase (HK), and UDP-glucose pyrophosphorylase (UGP) enzymes. Moreover, the transcript patterns of the gene encoding ADP-glucose pyrophosphorylase (AGP) do not match the steady-state profile of the respective product, ADP-glucose, that is metabolized from glucose 1-phosphate (glucose 1-P). This discrepancy is evident throughout the glycolytic pathway to the penultimate step, conversion of 2-phosphoglycerate (2-PG) to phosphoenolpyruvate (PEP), catalyzed by enolase. This discordance is also well displayed between the expression profile of the gene encoding pyruvate dehydrogenase, the enzymatic link between glycolysis and the citric acid cycle, and levels of the respective product, acetyl-CoA. This trend is further extended to the apparent discordance between transcript and metabolite levels of various steps of the citric acid cycle. Higher steady-state levels of acetyl-CoA and several other citric acid cycle metabolites, including isocitrate, 2-oxoglutarate and succinyl-CoA, in Freja as compared with Matilda suggest a more rapid turnover of these metabolites in Matilda. These data further imply a higher reliance of Matilda endosperm on mitochondrial oxidative phosphorylation driven by oxidation of acetyl-CoA to meet the high energy demands for enhanced accumulation of the energetically costly storage product, oil. Subsequent measurements of oxygen uptake rates lend support to the notion that Matilda seeds do indeed tend to display a higher respiration rate than Freja seeds.
Specifically, the mean values of oxygen uptake in oat seeds [measured as nmol (gfw min)−1; fw = fresh weight] at the three developmental stages (C–E) were respectively 470, 348 and 423 in Freja, versus 649, 321 and 523 in Matilda.
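The oxygen uptake values quoted above can be compared stage by stage with a few lines of arithmetic. The sketch below uses only the six mean values given in the text; the variable names and the unweighted averaging across stages are illustrative choices, and no replicate-level data or statistics are implied.

```python
# Mean oxygen uptake, nmol (g FW min)^-1, at stages C, D and E (values from the text).
stages = ["C", "D", "E"]
freja = [470, 348, 423]
matilda = [649, 321, 523]

# Matilda-to-Freja ratio at each developmental stage.
ratios = {s: m / f for s, f, m in zip(stages, freja, matilda)}

# Simple unweighted averages over the three stages (an illustrative summary).
freja_mean = sum(freja) / len(freja)
matilda_mean = sum(matilda) / len(matilda)

for s in stages:
    print(f"stage {s}: Matilda/Freja = {ratios[s]:.2f}")
print(f"mean Freja = {freja_mean:.0f}, mean Matilda = {matilda_mean:.0f}")
```

On these figures the ratio exceeds 1 at stages C and E and falls slightly below 1 at stage D, which fits the cautious wording that Matilda seeds "tend to" respire faster.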

Despite similar expression levels and patterns of most fatty acid biosynthetic genes encoding the plastidial and endoplasmic reticulum (ER) localized enzymes in both cultivars, the steady-state levels of the primary substrates for fatty acid synthesis, acetyl-CoA and malonyl-CoA, are higher in Freja than in Matilda, suggesting a higher turnover rate for enhanced oil accumulation in the latter. The notion of a higher turnover rate of primary fatty acid substrates corresponds well to the depleted levels of some of their precursors, including the glycolytic intermediates fructose-1,6-diphosphate, glyceraldehyde-3-phosphate (GAP), dihydroxyacetone phosphate (DHAP) and phosphoenolpyruvate (PEP), in Matilda. These data in part point to a prominent role for metabolic efficiency, rather than altered expression of core metabolic pathway genes, in determining the oil content of oat endosperm; such efficiency could be achieved by post-transcriptional modifications affecting the steady-state levels of proteins and by post-translational modifications that ultimately improve enzymatic activities.

Parallel transcriptional and metabolic network analysis highlights the association of cofactomes with carbon partitioning

Energy metabolism is tailored to the specific demand of the cell. Given the notable differences in the oil and protein content of the two cultivars, we therefore examined the energy state of the seeds by measuring the steady-state levels of energy and reducing equivalents at the three developmental stages (C–E). Intriguingly, significantly lower levels of di- and triphosphate nucleotides, namely UDP/UTP, ADP/ATP, and CDP/CTP, combined with higher levels of UMP, AMP, and CMP are present in high-oil Matilda as compared with Freja (Fig. 4). The transcriptional profiling of genes encoding enzymes involved in purine and pyrimidine biosynthesis supported the metabolic data. Specifically, the most notable highly expressed gene in Matilda versus Freja is that encoding aspartate carbamoyltransferase, the enzyme catalyzing the first step in UMP synthesis. The next most differentially expressed genes in the two cultivars with a higher expression in Matilda are those encoding phosphoribosyl aminoimidazole (AIR) synthase and adenylosuccinate synthase, enzymes of the de novo purine biosynthesis pathway.

Figure 4.

 Parallel transcriptional and metabolic network analysis highlights association of cofactomes with carbon partitioning.
A simplified schematic representation of the interconnectedness of cofactor biosynthetic and recycling pathways with their respective selected transcript and metabolite levels presented in different color schemes for each pathway. Dashed arrows represent pathways with additional steps to those graphically represented. Larger boxes represent metabolites measured at three developmental stages (C–E), and smaller boxes represent the Freja (blue) and Matilda (orange) log2 graphed transcriptome signal intensity of each corresponding contig matching the listed gene from 15 serial microarrays. Gray boxes within transcriptome graphs represent the corresponding time period selected for metabolic analysis. Only transcripts whose alterations are statistically significant by ANOVA are presented. The red star indicates the cultivar with higher transcript levels of the designated genes with a P-value < 0.05 after adjusting for FDR at the 0.2 level. Full names of genes and metabolites are provided in Table S4.

Steady-state levels of NADH and NADPH are notably lower in Matilda compared with Freja, in contrast to the levels of their oxidized equivalents, NAD+ and NADP+, that are higher in Matilda than Freja (Fig. 4). This high demand for oxidation of NADH may potentiate a higher citric acid cycle flux, consistent with the greater demand for energy and reducing equivalents for synthesizing lipid and protein at the expense of starch in Matilda (Fig. 4).

In addition to the differences noted above, these cultivars also differ in the levels of transcripts and metabolites involved in cofactor metabolism, most notably those in the S-methylmethionine (SMM), S-adenosylmethionine (SAM) and folate biosynthetic pathways. The gene encoding the thiamine biosynthetic enzyme (Thi1) is also significantly differentially expressed over the span of endosperm development between the two cultivars (Fig. 4).

Within the folate biosynthetic pathway, the expression level of the gene encoding serine hydroxymethyltransferase (SHMT), the enzyme that catalyzes the conversion of unsubstituted tetrahydrofolate to 5,10-methylenetetrahydrofolate (5-methyl-THF) (Ravanel et al., 2001), is higher in Matilda than in Freja. It is noteworthy that, in contrast to Freja, this higher expression level is maintained throughout endosperm development in Matilda. Metabolic analyses further validated the expression data, as the level of the 5-methyl-THF metabolite is also higher in Matilda than in Freja, suggesting an increased rate of synthesis or a decreased rate of degradation potentially coupled to an enhanced demand for methyl groups required for a myriad of methylation reactions. Furthermore, as compared with Freja, Matilda displays higher expression levels of the gene encoding homocysteine methyltransferase, the enzyme that transfers a methyl group to homocysteine to generate methionine (Ranocha et al., 2001). The data also correspond well to the enhanced levels of methionine as well as the activated methyl cycle products, including SAM and S-adenosylhomocysteine (AdoHcy), in Matilda with respect to Freja.

The methionine-recycling pathway in Matilda is also distinct from that of Freja, as evidenced by higher expression levels of the gene encoding 5′-methylthioadenosine/S-adenosylhomocysteine nucleosidase (MTAN), a methionine-recycling enzyme (Siu et al., 2011), as well as higher steady-state levels of the respective product deoxyadenosine. In contrast, however, Matilda displays lower expression levels of the gene encoding S-adenosylmethionine decarboxylase, the enzyme that catalyzes the decarboxylation of SAM to S-adenosyl-methioninamine, the substrate used for production of polyamines (Roje, 2006).

Lastly, the expression level of Thi1 is initially low in both cultivars, with notable increases detected only at developmental stage C of Matilda endosperm, suggesting that this increase is a response to metabolic events in the developing Matilda endosperm, rather than contributing to them. Interestingly, however, despite this differential expression, the level of the resulting product thiamine diphosphate, the active form of vitamin B1 and a key cofactor of the essential enzymes involved in carbon metabolism, is similar between the two cultivars.


To dissect the molecular basis of the metabolic shift resulting in enhanced levels of oil and protein synthesis in the oat cultivar Matilda as compared with the closely related cultivar Freja (Fig. 1), we employed global transcriptional analysis capturing expression profiles of oat endosperm genes during the full cascade of developmental events. In parallel, we measured the steady-state levels of metabolites in the time period of a near-linear phase of oil deposition coinciding with the prominence of the core metabolic enzymes in a number of oilseeds (Hajduch et al., 2005, 2006, 2007; Ekman et al., 2008). Collectively, these data illustrate that Matilda accumulates higher protein and oil and lower starch levels compared with Freja, confirming the previous findings (Banas et al., 2007; Ekman et al., 2008). These higher protein levels are supported by the elevated levels of 16 of the 19 free amino acids examined (Fig. 1c). The lower levels of cysteine in Matilda compared with Freja are in part balanced by higher levels of methionine (Fig. 4), and further suggest potential compositional differences between the two cultivars with regard to cysteine-enriched polypeptides.

The NMR-based analysis further verified endosperm as the predominant site of oil deposition, and established that the difference in oil levels reflects not only the cellular content but also the number of cell layers containing oil. Specifically, lipid deposition in Freja is restricted to the outer layers, whereas in Matilda lipids are present in both the central and the outermost cell layers of the endosperm. These distinct oil distribution patterns reflect a tightly and delicately balanced coordination between carbon partitioning and developmental regulation.

Comparative analyses displayed an unexpected discordance between the levels of a number of core metabolites and their respective transcript levels in the endosperm of Matilda and Freja. Surprisingly, the maintenance of steady expression profiles of most core metabolic genes throughout endosperm development in oat is in stark contrast to other seeds including coffee (Joët et al., 2009), brassicas (Niu et al., 2009), Medicago (Gallardo et al., 2007), maize (Prioul et al., 2008), barley (Sreenivasulu et al., 2004) and wheat (McIntosh et al., 2007), where the expression levels peak at the onset of endosperm development and decline gradually throughout the development. Collectively, these findings suggest that oat has lost these distinct transcriptional regulatory programs within the core metabolic pathways in favor of alternative strategies for improved metabolic efficiency through concomitant alteration of cofactors and energy metabolism status potentially achieved by various mechanisms including post-transcriptional modifications shown to impact the steady-state levels of proteins (Hajduch et al., 2010), as well as post-translational modification and improved enzymatic activities.

Several studies have demonstrated that in conjunction with enhanced oil accumulation in seeds there is an improvement in substrate availability (acetyl-CoA, PEP and malate) (Vigeolas et al., 2003; Rolletschek et al., 2004b; Rolletschek et al., 2005a; Rolletschek et al., 2005b; Borisjuk and Rolletschek, 2009). However, the evidence provided here shows the exhaustion of some glycolytic intermediates that serve as typical inputs for oil biosynthesis (DHAP, GAP and PEP) as well as lower levels of immediate precursors, acetyl-CoA and malonyl-CoA. Specifically, the lower levels of PEP in Matilda as compared with Freja were unexpected. The sequential conversion of PEP to pyruvate and acetyl-CoA by the plastidic pyruvate kinases (PK) and the pyruvate dehydrogenase (PDH) complex, respectively (Reid et al., 1977; Lernmark and Gardestrom, 1994; Andre and Benning, 2007; Andre et al., 2007; Flügge et al., 2011), constitute the two key steps toward committing sufficient carbon to support fatty acid synthesis utilized in part for triacylglycerol production (Slabas and Fawcett, 1992; Ohlrogge and Jaworski, 1997; Rawsthorne, 2002). In fact, because of the established role of the plastidic pyruvate kinase in committing sufficient carbon to support fatty acid synthesis, this enzyme is now a target for the engineering of endosperm for production of oil (Alonso et al., 2011). Therefore, it is possible that higher levels, and/or enzymatically superior PK and PDH enzymes, may have resulted in a more efficient conversion of PEP to pyruvate and acetyl-CoA in Matilda. In addition to lower PEP, as compared with Freja, Matilda seeds contain lower levels of the immediate precursors of fatty acids, acetyl-CoA and malonyl-CoA. This lower level of glycolytic intermediates and fatty acid precursors is suggestive of a rapid turnover of these substrates for oil biosynthesis. 
This rapid turnover has to be supported by the energy state, as the synthesis of fatty acids requires stoichiometric amounts of ATP and acetyl-CoA, and NADPH and NADH for each C2 addition to a growing acyl chain in the reactions catalyzed by acetyl-CoA carboxylase and fatty acid synthetase (Slabas and Fawcett, 1992). In chloroplasts, light energy is used to provide the required ATP and reductant, and to allow the operation of a Rubisco bypass flux that leads to more efficient conversion of hexose to oil (Schwender et al., 2004; Goffman et al., 2005). However, plastids of heterotrophic tissues such as endosperm (Neuhaus and Emes, 2000; Olsen, 2001, 2004) need to import these cofactors or to generate them intraplastidially through carbohydrate oxidation or metabolite shuttles (Rawsthorne, 2002) in order to convert their carbon supply with high efficiency despite substantial futile cycles (Alonso et al., 2011). Thus, the notion of a rapid turnover of the substrates for oil biosynthesis is supported by the energy state that regulates the rate of sucrose import (Vigeolas et al., 2003) and lipid biosynthesis in developing seeds (Rolletschek et al., 2003; Vigeolas et al., 2003; Rolletschek et al., 2004b; Rolletschek et al., 2005b, 2007). Indeed, compared with Freja, the levels of energy and reducing equivalents (ATP, NADH and NADPH) are much lower in Matilda, while AMP, NAD+ and NADP+ levels are elevated, suggesting that rapid consumption of these molecules in conjunction with the rapid turnover of precursors of fatty acid biosynthesis support the increased production of oil in Matilda seed. These data therefore suggest the existence of a mechanism(s) within the Matilda endosperm that, compared with Freja, partitions a larger share of the products of respiratory metabolism towards the production of energetically costly storage product, oil, at the expense of starch.
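The per-cycle stoichiometry stated at the start of this paragraph can be turned into a simple bookkeeping sketch. It assumes, as in the text, one ATP (consumed by acetyl-CoA carboxylase), one NADPH and one NADH for each C2 addition, and it deliberately ignores desaturation, acyl activation, transport and triacylglycerol assembly. The function name and output keys are hypothetical.

```python
def fatty_acid_cofactor_demand(chain_length):
    """Cofactor demand for de novo synthesis of one saturated, even-chain acyl group.

    Simplifying assumption: each C2 addition costs one ATP, one NADPH and
    one NADH, as stated in the text; all other costs are ignored.
    """
    if chain_length < 4 or chain_length % 2:
        raise ValueError("chain_length must be an even number >= 4")
    # Elongation cycles that follow the initial acetyl primer.
    c2_additions = chain_length // 2 - 1
    return {
        "acetyl_CoA": c2_additions + 1,  # one primer plus one per malonyl-CoA
        "ATP": c2_additions,
        "NADPH": c2_additions,
        "NADH": c2_additions,
    }


palmitate = fatty_acid_cofactor_demand(16)  # C16 acyl chain
stearate = fatty_acid_cofactor_demand(18)   # C18 acyl chain
```

Under these assumptions a C16 chain requires eight acetyl-CoA units and seven each of ATP, NADPH and NADH, which illustrates why sustained oil synthesis draws heavily on the pools of energy and reducing equivalents discussed here.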

Our findings further highlight a close association between the metabolism of cofactors and carbon partitioning. This notion is in agreement with the elevated levels of gene expression and their corresponding metabolites involved in the recycling and de novo biosynthetic pathways of cofactors in Matilda. Amongst them are the SMM and SAM pathways, followed closely by and interconnected with those involved with folate biosynthesis, suggesting that efficient metabolic processes supporting higher oil and protein in Matilda are strongly associated with higher demands for cofactors such as methyl groups. Another example is the metabolism of THF, a mediator of one-carbon metabolism, a process central to a large number of essential cellular activities including methyl group biogenesis and the synthesis of nucleotides, vitamins, and some amino acids. One-carbon units are derived from the catabolism of three donor molecules, namely serine, glycine and formate, which are subsequently activated and compartmentalized by attachment to THF for biosynthetic processes. In most organisms, however, serine is the principal one-carbon donor contributing to the pool of 5,10-methylene-THF by the action of the enzyme SHMT (Prabhu et al., 1996; Piper et al., 2000; Li et al., 2003). The elevated expression levels of SHMT in conjunction with increased levels of serine lend support to the potential functional importance of THF-mediated metabolism in supporting elevated levels of lipid and protein in Matilda as compared with Freja. To a lesser extent, the metabolic efficiency is correlated with pyrimidine and purine biosynthesis pathways, as the building blocks for nucleic acid synthesis, as the keys to energy metabolism, and for the continued synthesis of many biosynthetic products such as phospholipids (Traut and Jones, 1996; Stasolla et al., 2003; Stasolla et al., 2004; Kafer et al., 2004).

Another metabolite of interest is thiamine diphosphate, a cofactor in major metabolic pathways such as the citric acid cycle, the pentose phosphate cycle and glycolysis, and thus an essential constituent of all living cells (Suzuki et al., 2010). In plant cells, thiamine diphosphate also acts as a cofactor for the plastid localized isozymes pyruvate dehydrogenase and transketolase (Chabregas et al., 2001). Similar levels of thiamine diphosphate in Matilda and Freja, therefore, may allude to the role of this cofactor in the activation of enzymes involved in carbohydrate metabolism, as well as to its functional participation in oil accumulation in an otherwise starch-accumulating organ, the endosperm.

Two of the key coenzymes, lipoic acid and biotin, found in all three domains of life require the fatty acid synthetic pathway for their synthesis (Cronan and Lin, 2011). Fatty acid biosynthesis is also dependent on biotin, because two of the four subunits of acetyl-CoA carboxylase, an enzyme catalyzing ATP-dependent carboxylation of acetyl-CoA to malonyl-CoA, are biotinylated (Li and Cronan, 1992; Thelen et al., 2000; Li et al., 2011). Biotin synthase, the last enzyme for biotin synthesis, belongs to the family of AdoMet-dependent enzymes that reductively cleave AdoMet into a deoxyadenosyl radical (Marquet et al., 2001). Lipoate synthase, an enzyme that catalyzes the formation of two C–S bonds from octanoic acid, has a very high sequence similarity with, and is mechanistically related to, biotin synthase (Marquet et al., 2001). Elevated levels of AdoMet in Matilda may allude to higher levels of and/or activities of biotin synthase and lipoate synthase. Our attempts to accurately measure biotin levels were not successful; however, heightened levels of lipoic acid in Matilda as compared with Freja may be one consequence of enhanced synthesis of fatty acids in this cultivar.

Thus far manipulation of enzyme levels through transcriptional modulation of genes has had limited success in enhancing oil levels in seeds (Roesler et al., 1997; Thelen and Ohlrogge, 2002; Hills, 2004; Rolletschek et al., 2004a; Vigeolas et al., 2011). The close association of an improved cofactor pool, herein designated as the cofactome, with the metabolic shift directed toward enhanced oil production provides a novel insight into the regulatory networks involved in carbon partitioning and offers a biotechnological target for the conversion of starch to oil.

Experimental procedures

Plant material

Two closely related oat cultivars, Matilda and Freja (Svalöv Weibull AB), were grown under a light intensity of 200 μmol m−2 sec−1 in a 16-h light/8-h dark cycle at 21/18°C temperature and 70% humidity. Seed stages were determined by the size and weight, color, texture and endosperm consistency as previously described (Ekman et al., 2008). Whole kernels as well as endosperm at designated developmental stages (Fig. 1 and Fig. S1) were harvested, immediately frozen in liquid nitrogen and stored at −80°C until use.

RNA extraction and high-throughput pyrosequencing

Total RNA was extracted using PureLink® Plant RNA Reagent (catalog no. 12322-012; Invitrogen) as recommended by the manufacturer. Removal of genomic DNA from RNA samples was performed using the Plant RNeasy kit (catalog no. 74904, 79254; Qiagen). The RNA for pyrosequencing and for preliminary 384k microarray analysis (Roche NimbleGen) was prepared from a pool of endosperm collected from 40 to 100 kernels from stage D/E [25–35 mg, 10–14 days post-anthesis (d.p.a.)] and stage G (48–52 mg, 20–22 d.p.a.), followed by double-stranded cDNA synthesis (SuperScript Double-Stranded cDNA synthesis kit, catalog no. 11917-010; Invitrogen).

High-throughput pyrosequencing was performed at the W. M. Keck Center for Comparative and Functional Genomics (University of Illinois, Urbana, IL, USA), using the Roche/454 GSFLX titanium platform.

Pyrosequencing fragments were organized using custom PERL scripts, and sequences were aligned using the tgicl and cap3 programs after the removal of contaminating sequences from vector and cDNA construction primers. To further check the quality of the sequences, we used a custom PERL script and performed NCBI Blast to identify and remove any remaining contaminating sequences. Subsequently, the sequences were clustered together using tgicl and cap3. BlastX analysis was carried out using oat target sequences searched against all rice and Arabidopsis databases.

The custom 384 k microarray probes comprised 13 60mer probes for each of the 29 522 contigs obtained from 454 pyrosequencing data analysis, using Roche NimbleGen recommended methods.

Design of the Roche NimbleGen custom microarrays

Initially, four 384 k feature microarrays were hybridized with cDNA generated from the same RNA pool used for the 454 pyrosequencing experiments. The details of labeling, hybridization, scanning and normalization of the data are as described by NimbleGen. Contigs that showed no signal on these microarrays and had no Blast references, as well as probes whose values differed by two standard deviations from the mean, were discarded. The probes that showed the most consistent expression near the mean were then selected using a custom PERL script. The number of probes assigned to each contig depended on its length. Specifically, contigs with a length of less than 200 bp were assigned three 60mer probes, 200–300 bp had four probes, 300–400 bp had five probes, 400–500 bp had six probes, 500–600 bp had seven probes, 600–700 bp had eight probes and >700 bp had nine probes. Finally, the selected probes representing the 27 575 contigs were printed on 12 × 135k feature arrays.
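The length-based probe allocation described above can be sketched as a small lookup function. This is only an illustration, not the custom PERL script used in the study: the function name is hypothetical, and because the text does not say which bin a contig of exactly 300, 400, ... or 700 bp falls into, the sketch assumes that each bin includes its lower bound.

```python
def probes_for_contig(length_bp: int) -> int:
    """Return the number of 60mer probes assigned to a contig of a given length.

    Assumption (not stated in the text): each 100-bp bin includes its lower
    bound, so a 300-bp contig receives five probes and a 700-bp contig nine.
    """
    if length_bp <= 0:
        raise ValueError("contig length must be positive")
    if length_bp < 200:
        return 3
    # Four probes for 200-299 bp, one more per additional 100 bp, capped at nine.
    return min(9, 4 + (length_bp - 200) // 100)


for length in (150, 250, 450, 650, 1200):
    print(length, probes_for_contig(length))
```

Under these assumptions the loop reports three, four, six, eight and nine probes for the five example lengths.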

The final 12 × 135K arrays were processed for expression profiling using RNA samples isolated from the endosperm of kernels at sequential developmental stages (Fig. S1). Sample preparation, labeling, and array hybridizations were performed at UCSF Shared Microarray Core Facilities using Agilent Technologies. Arrays were scanned using the Agilent microarray scanner and raw signal intensities were extracted with NimbleScan v2.6 software (NimbleGen). The presented experimental data consist of a file containing the value for each individual probe per contig (Table S2). The values are log2 normalized (e.g. 2^0–2^16, or 1–65 536, becomes 0–16) and graphs represent the values between 6 and 16.
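The log2 scaling of the raw signal intensities can be illustrated with a minimal sketch. The function name is hypothetical, and the sketch shows only the transformation itself, not any background correction or between-array normalization that the extraction software may apply.

```python
import math


def log2_signal(raw_intensity: float) -> float:
    """Convert a raw probe intensity to the log2 scale."""
    if raw_intensity <= 0:
        raise ValueError("raw intensity must be positive")
    return math.log2(raw_intensity)


print(log2_signal(1))      # lower end of the raw range
print(log2_signal(65536))  # upper end of the raw range
```

A raw value of 1 maps to 0 and a raw value of 65 536 maps to 16, which matches the 0–16 range given in the text.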

Statistical analysis

To integrate the transcriptome and metabolome datasets, we conducted PCA using the combined metabolite and transcript profile data obtained from developmental stages C, D and E, as these were the only stages to have both transcript and metabolite data. To make the data comparable, the metabolite concentrations and transcript levels were averaged among samples at the same developmental stage for each genotype. This was also done to allow combination of the metabolite and transcript data, which were collected from different samples. The PCA was run using the prcomp function in the R statistical analysis program suite (R Development Core Team, 2011). Eigenvalues from this analysis were extracted, with the first six vectors having 97.5, 1.8, 0.6, 0.06, 0.009 and 0.002% of the total variance. The first two principal components were plotted on a two-dimensional graph.
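Two steps of this procedure lend themselves to a short illustration: averaging replicate measurements per genotype and developmental stage, and converting the component standard deviations reported by a PCA into percentages of total variance. The sketch below is written in Python rather than R, the function names and input layout are hypothetical, and it does not perform the PCA itself.

```python
from collections import defaultdict
from statistics import mean


def average_by_group(samples):
    """Average values over replicates sharing a genotype and stage.

    `samples` is an iterable of (genotype, stage, value) tuples; this
    layout is an assumption made for illustration.
    """
    grouped = defaultdict(list)
    for genotype, stage, value in samples:
        grouped[(genotype, stage)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


def percent_variance(sdev):
    """Convert component standard deviations to percent of total variance."""
    variances = [s * s for s in sdev]
    total = sum(variances)
    return [100.0 * v / total for v in variances]


replicates = [("Matilda", "C", 1.0), ("Matilda", "C", 3.0), ("Freja", "C", 2.0)]
print(average_by_group(replicates))
print(percent_variance([3.0, 4.0]))
```

In R, the standard deviations would come from the `sdev` element of the object returned by `prcomp`; squaring them gives the eigenvalues from which the percentages above are derived.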

To estimate developmental progression of the samples, we utilized the adjusted per tissue weight within each sample as a numerical description of development. The development of seeds within the two genotypes followed a similar linear progression. To adjust for a 5% lower per tissue weight at each developmental stage, we multiplied the weight of the Matilda genotype by 1.05 to bring the two genotypes to the same adjusted weight scale. To test for differential expression within a gene, we implemented a mixed-model ANOVA in R. Gene expression for each gene contig (y) was modeled as a function of the variance that can be attributed to the independent probe per contig (P), oat genotype (G), and adjusted weight (W) as follows:


Accounting for each probe within a contig in the model accounts for differential hybridization among the individual probes. Within the mixed model, probe and genotype were fixed effects while adjusted weight was a random effect. As we were only interested in testing for differential expression for a gene, we did not explicitly test the probe × genotype interaction and let this potential error collapse into the random error term. This provides a more conservative estimate of differential gene expression due to genotype or development.
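One additive form consistent with this description is sketched below. It is an assumption rather than the authors' stated equation: the subscripts, the grand mean \mu and the absence of any interaction terms are all introduced here for illustration, with P and G treated as fixed effects, W as a random effect, and \varepsilon as the residual error.

```latex
y_{pgw} = \mu + P_p + G_g + W_w + \varepsilon_{pgw},
\qquad
W^{\mathrm{Matilda}}_{\mathrm{adj}} = 1.05 \times W^{\mathrm{Matilda}}_{\mathrm{measured}}
```

In this form any probe × genotype variation is absorbed into \varepsilon_{pgw}, which is what makes the test for a genotype effect conservative.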

Nuclear magnetic resonance analysis

The NMR experiments were carried out as previously described (Neuberger et al., 2008, 2009) using a wide-bore (inner diameter 89 mm) 17.6 T superconducting magnet (Bruker BioSpin GmbH) with a custom-built NMR coil and an actively shielded gradient system (maximum strength 1 T m−1). Reconstruction and visualization of the data were performed using in-house software (written in Java), while segmentation of individual seeds and the determination of their total NMR-signal content were achieved using Amira (Visage Imaging GmbH).

Metabolite analysis

Metabolite analyses were performed on freeze-dried caryopses of stages C–E as previously described (Rolletschek et al., 2005a). Briefly, the analyses of soluble sugars and free amino acids were performed by ion chromatography and HPLC as previously described (Rolletschek et al., 2004a). The remaining metabolites were measured by mass spectrometry coupled to liquid chromatography according to previous methods (Rolletschek et al., 2004a; Bajad et al., 2006) but modified as outlined (Table S3). Data were normalized on the basis of the external standard 13C-succinate (Cambridge Isotope Laboratories) added to the samples during the extraction (Table S3).


Respiration was measured as total oxygen uptake of kernels kept in the dark. Individual kernels were submerged in buffer [50 mM sucrose, ¼ MS medium, 10 mM Gln, 10 mM Asn, 10 mM 2-(N-morpholino)ethanesulfonic acid (MES), pH 6.8] in gas-tight closed 10 ml vessels equipped with an SP-PSt3 oxygen sensor and connected to a Fibox 3 oxygen meter (PreSens Precision Sensing GmbH). Oxygen concentration in the samples was recorded over a period of 60 min. From the recorded data, the respiration rate of seeds was calculated by linear regression.
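The regression step can be sketched as an ordinary least-squares fit of oxygen concentration against time, with the respiration rate taken as the negative of the slope. The function name, the example readings and their units are hypothetical; the text does not specify the sampling interval or any normalization to kernel weight.

```python
def respiration_rate(times_min, oxygen):
    """Estimate oxygen uptake as the negative least-squares slope.

    `times_min` are time points in minutes and `oxygen` the matching
    oxygen concentrations (units are an assumption of this sketch).
    """
    n = len(times_min)
    if n != len(oxygen) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times_min) / n
    mean_o = sum(oxygen) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in zip(times_min, oxygen))
    slope = sxy / sxx
    # Oxygen falls as the kernel respires, so uptake is the negated slope.
    return -slope


# Hypothetical readings over a 60-min recording.
print(respiration_rate([0, 20, 40, 60], [250.0, 240.0, 230.0, 220.0]))
```

For the hypothetical readings shown, oxygen drops by 10 units every 20 min, so the estimated uptake is 0.5 units per minute.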


We would like to thank Matt Wong for his contribution to bioinformatics, N. Heinzel (IPK) for help with mass spectrometry, and J. Fuchs (IPK) for NMR analysis. We also are grateful to Chevron for supporting this work by the grant awarded to KD and SS. ÅG and SS are grateful for financial support from Vinnova and FORMAS.