Metabolomics is a rapidly developing technology. Major approaches currently used in plant metabolomics research include metabolic fingerprinting, metabolite profiling and targeted analysis (Fiehn 2002, Halket et al. 2005, Shulaev 2006). Depending on the question asked in each particular study, specific metabolomics approaches or their combination are used. Some of these are described below.
Metabolic fingerprinting is largely used to identify metabolic signatures or patterns associated with a particular stress response without identification or precise quantification of all the different metabolites in the sample. Pattern recognition analysis is then performed on the data to identify features specific to a fingerprint. Fingerprinting can be performed with a variety of analytical techniques, including NMR (Krishnan et al. 2005), MS (Goodacre et al. 2003), Fourier transform ion cyclotron resonance mass spectrometry or Fourier transform infrared (FT-IR) spectroscopy (Johnson et al. 2003).
One of the limitations of NMR spectroscopy is its low sensitivity, which makes it difficult to detect low-abundance cellular metabolites. MS has an advantage over NMR in terms of resolving power, providing higher sensitivity and a lower limit of detection. However, MS generates more complex spectra because of the formation of product ions and adducts, and its results come in the form of discriminant ions. This can pose a significant challenge for data validation. Using MS with different classification tools, a larger subset of metabolites associated with the phenotype can be identified.
Metabolic fingerprints can be analyzed with a variety of pattern recognition and multivariate statistical techniques (Sumner et al. 2003). Both unsupervised and supervised algorithms have been used in fingerprinting, although supervised techniques generally show greater discrimination power. Unsupervised techniques most often used with metabolomics data include principal component analysis (PCA), self-organizing maps (SOMs) and hierarchical clustering, while supervised algorithms include discriminant function analysis (DFA), partial least squares (PLS) and analysis of variance (ANOVA). Most metabolomics data sets are underdetermined, meaning they contain many more variables than samples (Kohane et al. 2003), and for proper statistical analysis, it is important to reduce the number of variables to obtain uncorrelated features in the data. This can be achieved by using evolutionary algorithms such as genetic algorithms (GAs) or genetic programming (Pena-Reyes and Sipper 2000). For metabolomics applications, evolutionary algorithms are typically combined with a secondary algorithm (e.g. DFA or PLS) (Goodacre 2005).
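To make the notion of an underdetermined data set concrete, the following is a minimal sketch of PCA computed by singular value decomposition, assuming a fingerprint matrix with samples in rows and m/z bins in columns. The matrix is synthetic, the function name `pca_scores` and the group layout are illustrative assumptions, and the sketch is not the procedure of any study cited here.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples onto the leading principal components via SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # mean-center each variable (m/z bin)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates
    explained = (s ** 2) / np.sum(s ** 2)             # variance fractions
    return scores, explained[:n_components], s

# Synthetic, illustrative data: 6 samples x 500 m/z bins (far more
# variables than samples, i.e. an underdetermined data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 500))
X[3:, :50] += 5.0  # hypothetical "stressed" group differs in 50 bins

scores, explained, singular_values = pca_scores(X)
```

After mean-centering, six samples can support at most five components with non-zero variance regardless of how many m/z bins are measured, which is one reason variable reduction precedes supervised analysis.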
To increase sample throughput, mass spectra are usually obtained by direct infusion of the analytical sample into a mass spectrometer, i.e. without fractionation. However, direct infusion has problems, mostly because of a phenomenon known as cosuppression, in which the signal of many analytes can be lost at the mass spectrometer interface.
To minimize the cosuppression effect, samples can be separated using very rapid gradients with a short chromatographic column, and the HPLC-MS data can then be analyzed using multivariate analysis to identify the discriminant ions. To confirm the fingerprinting results, samples are then re-analyzed with a long HPLC gradient. This two-step fingerprinting/validating strategy was used to characterize the wound response in Arabidopsis (Grata et al. 2007).
In our laboratory, we have been employing a similar approach to metabolic fingerprinting, in which we carry out a chromatographic or electrophoretic separation prior to MS. This is similar to what is performed for metabolite profiling, except that we do not attempt to identify all the molecules responsible for the peaks in the separation; rather, we focus on those that prove to be discriminant between groups. This approach significantly reduces the cosuppression effect seen in direct infusion MS, the dominant method used for fingerprinting. Fig. 1 compares the summary mass spectra obtained following chromatographic separation on a capillary column and following direct infusion. The distribution of m/z within the acquisition mass range of 100–1500 atomic mass units using chromatography prior to MS shows ions at m/z 404, 579, 636, 740, 824, 1173, 1343 and 1392, corresponding to important plant metabolites including flavonoids and anthocyanins. These and many other ions are almost undetectable in the mass spectrum obtained by direct infusion because of the matrix suppression effect. Following data acquisition, we have a data cube consisting of thousands of mass spectra at different elution times. This is then transformed into a single cumulative mass spectrum that is equivalent to what a direct infusion mass spectrum would be minus the cosuppression interference. This cumulative spectrum is then used for sample discrimination using statistical and machine-learning algorithms. Because the original data cube containing the separation data is kept for later analysis, we can inspect its details and identify specific molecules of interest against a library without the need for additional experiments.
Figure 1. Summary mass spectrum of Arabidopsis leaf extract following either chromatographic separation (A) or direct infusion (B). Ions were detected by positive ionization full-scan MS. Chromatography was performed on a 0.1 × 450 mm monolithic C18 column. The summary mass spectrum, which is derived by adding up all mass scans over the chromatographic run, shows the distribution of m/z within the acquisition mass range of 100–1500 atomic mass units for ions with S/N > 6.
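The construction of a summary spectrum by adding up all mass scans over a chromatographic run can be illustrated with a minimal sketch. It assumes the scans are already binned onto a common m/z axis and stored as a two-dimensional array (scans by m/z bins), and that signal-to-noise is approximated as summed intensity divided by a single user-supplied noise estimate. The function name `summary_spectrum`, the noise model and the intensities are assumptions for illustration; only the m/z labels are taken from the text.

```python
import numpy as np

def summary_spectrum(scans, mz_axis, noise_level, min_sn=6.0):
    """Sum all scans of a run into one spectrum and keep bins above an S/N cutoff.

    scans:       array of shape (n_scans, n_mz_bins), one row per elution time
    mz_axis:     m/z value of each bin
    noise_level: assumed single noise estimate for the summed spectrum
    """
    scans = np.asarray(scans, dtype=float)
    mz_axis = np.asarray(mz_axis, dtype=float)
    summed = scans.sum(axis=0)            # add up all scans over the run
    keep = summed / noise_level > min_sn  # simple S/N filter (assumption)
    return mz_axis[keep], summed[keep]

# Invented intensities for three m/z bins across three scans.
mz_axis = [404.0, 579.0, 636.0]
scans = [[0.0, 10.0, 1.0],
         [0.0, 30.0, 2.0],
         [1.0, 20.0, 0.0]]

kept_mz, kept_intensity = summary_spectrum(scans, mz_axis, noise_level=1.0)
```

Because the two-dimensional array is left untouched, the elution profile of any ion retained in the summary spectrum can still be recovered from the corresponding column of `scans`.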
Metabolite profiling is aimed at a simultaneous measurement of all or a set of metabolites in a sample. Multiple analytical techniques can be used for metabolite profiling (Shulaev 2006, Sumner et al. 2003). These techniques include NMR, GC-MS, liquid chromatography–mass spectrometry (LC-MS), capillary electrophoresis–mass spectrometry (CE-MS) and FT-IR spectroscopy. The advantages and disadvantages of each technique for metabolite profiling were previously discussed (Shulaev 2006, Sumner et al. 2003).
To date, GC-MS is the most developed analytical platform for plant metabolite profiling. Historically, it was one of the first techniques used for high-throughput metabolite profiling in plants (Roessner et al. 2000). GC-MS is generally performed using electron impact (EI) quadrupole or time-of-flight (TOF) mass spectrometry (Fiehn et al. 2000, Roessner et al. 2000). Using GC-MS, it is possible to profile several hundred compounds belonging to diverse chemical classes including sugars, organic acids, amino acids, sugar alcohols, aromatic amines and fatty acids.
The major advantage of GC-MS for metabolomics is the availability of both commercial and public EI spectral libraries (Halket et al. 2005). The limitation of GC-MS profiling is that it can only analyze volatile compounds or compounds that can be volatilized following chemical derivatization.
For non-volatile compounds, LC-MS and CE-MS provide a better alternative. The use of LC-MS in metabolomics is steadily increasing, especially since the recent adoption of ultra performance liquid chromatography technology, which can dramatically increase separation efficiency and decrease analysis time (Giri et al. 2007, Granger et al. 2007). CE-MS provides a viable alternative for metabolite profiling because of its high resolving power, low sample volume requirements and ability to separate cations, anions and uncharged molecules simultaneously (Soga et al. 2003).
Untargeted metabolite profiling is often equated with metabolomics because it is the approach most often used in metabolomics studies. It is particularly useful for obtaining a global view of cellular metabolism or for identifying new metabolites and pathways. A substantial drawback of untargeted profiling is that it is semiquantitative, i.e. it provides relative concentration data based on the same ‘surrogate’ internal standard. These semiquantitative data have to be further validated using targeted quantitative assays. Targeted profiling is used when it is necessary to determine the precise concentration of a limited number of known metabolites, and it provides a very low limit of detection. Targeted analysis has been widely used to follow the dynamics of a limited number of metabolites known to be involved in a particular stress.
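The semiquantitative character of untargeted profiling can be shown with a small sketch in which every peak area is divided by the area of one surrogate internal standard, and samples are then compared as ratios of those normalized responses. The function names, metabolite names and peak areas below are hypothetical and chosen only for illustration.

```python
def relative_response(peak_areas, internal_standard):
    """Normalize each peak area to the area of one surrogate internal standard."""
    is_area = peak_areas[internal_standard]
    if is_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return {
        name: area / is_area
        for name, area in peak_areas.items()
        if name != internal_standard
    }

def fold_change(treated, control, internal_standard):
    """Ratio of normalized responses (treated / control) for shared metabolites."""
    t = relative_response(treated, internal_standard)
    c = relative_response(control, internal_standard)
    return {name: t[name] / c[name] for name in t if name in c and c[name] > 0}

# Hypothetical peak areas; "ribitol" plays the role of the surrogate standard.
control = {"ribitol": 1000.0, "proline": 200.0, "sucrose": 500.0}
stressed = {"ribitol": 800.0, "proline": 1600.0, "sucrose": 400.0}

changes = fold_change(stressed, control, internal_standard="ribitol")
```

The result is a set of unitless fold changes rather than concentrations, because the response of each metabolite relative to the surrogate standard is unknown. This is why such data call for validation by targeted assays.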
Targeted analysis can also be used for comparative metabolite profiling of a large number of known metabolites. For example, highly parallel targeted assays based on selected reaction monitoring (SRM) can be used for very sensitive simultaneous analysis of over 100 metabolites in a single chromatographic run [see review by Bajad and Shulaev (2007)].
For truly quantitative measurement, targeted compounds should be available in a pure form, preferably labeled with a stable isotope. This poses a significant challenge for plant stress research because many plant metabolites involved in stress responses, and their intermediates, are not available in a pure form. A joint effort of the plant community and the chemical industry is required to synthesize these compounds and make them available to researchers.
An alternative approach for quantitative profiling may be provided by in vivo enrichment of metabolites with stable isotopes such as 13C and 15N. This can be achieved by growing plants or plant cells in liquid media containing 15N-labeled inorganic nitrogen sources (K15NO3, 15NH415NO3) or 13C-labeled carbon dioxide or glucose (Hegeman et al. 2007, Huege et al. 2007). This approach allows for in vivo synthesis of stable isotope-labeled plant metabolites that can be used for quantitative metabolite analysis using the stable isotope dilution method. Furthermore, an extract of a fully labeled plant can be used as a complex internal standard for simultaneous quantitative profiling of a large number of known metabolites. Uniform metabolic labeling combined with MS has been successfully used for quantitative metabolic profiling in microorganisms (Lafaye et al. 2005, Mashego et al. 2004, Wu et al. 2005). In addition, in vivo stable isotope enrichment followed by metabolite analysis over a time-course experiment can provide information on metabolic fluxes and the overall dynamics of metabolism (Hellerstein 2003, Huege et al. 2007, Kleijn et al. 2007, Matsuda et al. 2003). This information is essential for mathematical modeling of metabolic networks.
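The arithmetic behind stable isotope dilution can be sketched as follows, under simplifying assumptions: the labeled and unlabeled forms of a metabolite are fully resolved by mass, they co-elute, and they respond equally in the detector unless a response factor is supplied. The function name, the `response_factor` parameter and the numbers are illustrative and not taken from the cited studies.

```python
def isotope_dilution_conc(area_unlabeled, area_labeled, labeled_conc,
                          response_factor=1.0):
    """Estimate analyte concentration from the unlabeled/labeled peak-area ratio.

    labeled_conc:    known concentration of the isotope-labeled standard
                     after spiking (same units as the returned value)
    response_factor: assumed unlabeled/labeled response ratio (1.0 = equal)
    """
    if area_labeled <= 0:
        raise ValueError("labeled standard peak area must be positive")
    if response_factor <= 0:
        raise ValueError("response factor must be positive")
    return (area_unlabeled / area_labeled) * labeled_conc / response_factor

# Hypothetical peak areas and spike concentration (e.g. in micromolar).
estimated = isotope_dilution_conc(area_unlabeled=3000.0,
                                  area_labeled=1500.0,
                                  labeled_conc=5.0)
```

With a fully labeled plant extract as the internal standard, the same ratio could in principle be computed for every known metabolite in a run, provided the concentration of each labeled compound in the extract has been determined separately.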