A decade of plant proteomics and mass spectrometry: Translation of technical advancements to food security and safety issues

Authors


Correspondence to: G. K. Agrawal,

Research Laboratory for Biotechnology and Biochemistry (RLABB), P.O. Box 13265, Sanepa, Kathmandu, Nepal.

E-mail: gkagrawal123@gmail.com

R. Rakwal,

GGEC Program Office, Seino A 205, Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba 305-8572, Ibaraki, Japan.

E-mail: plantproteomics@gmail.com

Abstract

Tremendous progress in plant proteomics, driven by mass spectrometry (MS) techniques, has been made since 2000, when few proteomics reports were published and plant proteomics was in its infancy. These achievements include the refinement of existing techniques and the search for new techniques to address food security, safety, and health issues. It is projected that by 2050 the world's population will reach 9–12 billion people, demanding an increase of 34–70% over today's food production (FAO, 2009). Providing food for such a demand in a sustainable and environmentally committed manner, without threatening natural resources, requires that agricultural production increase significantly and that postharvest handling and food manufacturing systems become more efficient, with lower energy expenditure, fewer postharvest losses, less waste generation, and food with a longer shelf life. There is also a need to look for alternatives to animal-based protein sources (i.e., plant-based sources) to meet the increase in protein demand by 2050. Thus, plant biology has a critical role to play as a science capable of addressing such challenges. In this review, we discuss proteomics, especially MS, as a platform that has been utilized in plant biology research for the past 10 years and has the potential to expedite the understanding of plant biology for human benefit. The increasing application of proteomics technologies in food security, analysis, and safety is emphasized. We are aware, however, that no single approach or technology is capable of addressing the global food issues. Proteomics-generated information and resources must be integrated and correlated with other omics-based approaches, information, and conventional programs to ensure sufficient food and resources for human development now and in the future. © 2013 Wiley Periodicals, Inc. Mass Spec Rev 32:335–365, 2013.

I. INTRODUCTION

Genome sequencing of plants in the 21st century will help to drive a revolution in how we approach issues of food security, safety, and human health. A few plants have already been sequenced and annotated, such as Arabidopsis and rice, and many more are in the pipeline (Feuillet et al., 2010). Particularly, crop genome sequencing is bringing a paradigm shift in the approach to plant biology and crop breeding to meet future global food demand through crop improvement (Flavell, 2010). The information from plant genomes enables the use of high-throughput technologies [such as transcriptomics (gene expression), proteomics (protein expression), and metabolomics (metabolites)] providing a platform to systematically reveal the function of each gene in the genome in combination with other functional genomics tools (Fukushima et al., 2009; Weckwerth, 2011). These technologies are playing an important role in: (i) better understanding biology at the whole plant level, (ii) expediting the process of high-throughput development of molecular markers to assist crop improvement through a combination of modern technology and breeding programs, (iii) developing plant biomarkers for human health and food security, and (iv) food analysis and safety issues.

In the present review, we envision a united approach towards the solution of global food security and safety, focusing mainly on the potential of “plant proteomics and mass spectrometry (MS)” advances, the exploitation of which can increase the present agriculture capacity to feed the world of tomorrow. As the content of this review is broad, encompassing multiple research disciplines, we provide a glimpse on advancements of proteomics and MS technology in plants in the past decade followed by some examples of their utilization in addressing various issues of food security, analysis, safety, and human health/nutrition. We also briefly discuss the importance of integrating the proteomics-generated information with other scientific disciplines (such as functional genomics, biotechnology, and molecular breeding) and various scientific and non-scientific organizations for food security.

II. PLANT PROTEOMICS AND MASS SPECTROMETRY: A DECADE

A. Historical Perspectives: Then and Now

In 2015, it will be 40 years since the advent of proteomics, which was revolutionized by the introduction of two-dimensional gel electrophoresis (2-DGE) (Klose, 1975; O'Farrell, 1975; Scheele, 1975) and later refined with the introduction of immobilized pH gradients (IPGs) (Bjellqvist et al., 1982; Righetti et al., 2008; Gianazza & Righetti, 2009; Görg et al., 2009) and MS (Aebersold & Mann, 2003; Yates, Ruse, & Nakorchevsky, 2009). Forty years can be considered a long time, but the study of proteins under the term proteomics (Wilkins et al., 1995) is still quite young, fluid, and diversifying as a technology. Proteomics is one of the three young high-throughput omics technologies, genomics (transcriptomics), proteomics, and metabolomics, which are now allied to high-throughput phenotyping (phenomics) and are being amalgamated into the field of systems biology (Ward & White, 2002; Bradshaw & Burlingame, 2005; Bradshaw, 2008; Souchelnytskyi, 2008; Coruzzi, Rodrigo, & Guttierrez, 2009). The relative youth of plant proteomics becomes apparent from its widespread application in the isolation, identification, and cataloguing of proteins and in addressing biological questions from 2000 to now, more than a decade of research in plant proteomics (for reviews and books see Finnie, 2006; Samaj & Thelen, 2007; Thiellement, 2007; Agrawal & Rakwal, 2008a; Ranjithakumari, 2008; Weckwerth et al., 2008; Agrawal et al., 2011) (Fig. 1).

Figure 1.

Timeline of plant proteomics development. Details are in the main text.

As per publications on plant proteomics in PubMed, the progress in plant proteomics can be divided into three phases: pre, initial, and progressive (Fig. 1). The pre phase can be considered the beginning of proteomics, when one-dimensional gel electrophoresis (1-DGE) and 2-DGE techniques were applied to separate proteins, which were then identified using N-terminal Edman sequencing. The initial phase started with the genome revolution from the year 2000 onwards. Since the publication of the draft genome sequences of two plants, Arabidopsis thaliana (weed and dicot model) (The Arabidopsis 2000) and rice (Oryza sativa L., cereal crop and monocot model: Goff et al., 2002; Yu et al., 2002) in 2000 and 2002, respectively, plant proteomics research has seen rapid growth. In this initial phase we could also see an effort by the Arabidopsis scientific community to start working toward the proteome of this model plant via the establishment of a Multinational Arabidopsis Steering Committee Proteomics subcommittee (MASCP, www.masc-proteomics.org). Since then, plant proteomics has moved into the progressive phase, in which researchers have enriched the scientific community through concerted efforts to publish review series on rice, plants, and protein phosphorylation, and through the publication of five books on plant proteomics. The initial years of this decade also saw the development of an idea for a global initiative on plant proteomics that led to the establishment of the International Plant Proteomics Organization (INPPO, www.inppo.com). With more plant genomes being sequenced, from model to non-model species (Feuillet et al., 2010; Agrawal et al., 2011), there is no turning back from the utilization of proteomics approaches in various aspects of plant biology research.

The biggest hurdles faced by the scientific community and the population in general are the issues of food security, human health, and our changing environment. Dealing with these issues is one of the visions behind the global movement on plant proteomics, starting from Arabidopsis (Jones et al., 2008; Wienkoop, Baginsky, & Weckwerth, 2010) and MASCP in the early 2000s to INPPO in 2011. At INPPO, we have defined ten initiatives that we hope to move forward on with the support of plant biologists around the world (Agrawal et al., 2011). We can also refer to it as the Global Action Plan on Plant Proteomics in the 21st century (GAPs-21). As the acronym symbolizes, there is indeed a gap to be bridged between plant proteomics researchers worldwide, who need to engage in more cooperative research, break boundaries, and adopt an open-door policy to tackle the pressing need for translational proteomics, that is, from the lab to the field (Agrawal et al., 2012a, 2012b). With this background, we discuss below some of the recent advancements that are relevant to shaping plant proteomics research tomorrow and to tackling the issues raised in this review, namely food security and safety.

B. Technical Progression

The proteome of a cell or tissue at a specific time point is extremely complex and diverse. Each current technique is only able to focus on a subfraction of proteins due to the complex chemical nature of proteins and their large dynamic range. Since no polymerase chain reaction (PCR) equivalent exists for multiplying proteins, each technique has a strong bias toward the most abundant proteins. The biggest challenge is to develop techniques to measure the deep proteome and to create a workflow to handle the data. It is currently impossible to study the whole proteome in one single experiment. Therefore, a proteomics experiment has to be designed very carefully according to the biological questions that have to be answered. An array of approaches has been developed to address proteome analysis. There are two main complementary approaches in proteomics: a gel-based approach (also known as the protein-based approach) and a gel-free approach (also called the peptide-based approach). 2-DGE coupled to spot picking and tandem mass spectrometry (MS/MS) is the cornerstone of proteome analysis and has an unequalled resolving power for the separation of complex protein mixtures. It is a complete methodology that provides a qualitative and quantitative high-resolution image of intact proteins and gives a good overview of different isoforms and posttranslational modifications (PTMs). However, 2-DGE is difficult to automate, depends greatly on the skills of the scientist, and has a limited throughput. Most of the gel-free approaches are bottom-up approaches, in which intact proteins are digested into peptides prior to separation. The physical and chemical properties of tryptic peptides are more homogeneous than those of protein extracts, which facilitates automated separation via multidimensional liquid chromatography (MDLC).
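The digestion step of the bottom-up approach can be illustrated with a minimal in silico sketch. It assumes the commonly used simplified trypsin rule (cleavage C-terminal to K or R, except when the next residue is P); the function name, the missed_cleavages parameter, and the example sequence are hypothetical and are not taken from any of the cited studies.

```python
def tryptic_digest(protein, missed_cleavages=0):
    """Simplified trypsin rule: cut after K or R unless the next residue is P."""
    pieces, start = [], 0
    for i, residue in enumerate(protein):
        at_end = i == len(protein) - 1
        # Close the current piece at the protein end or at a cleavage site.
        if at_end or (residue in "KR" and protein[i + 1] != "P"):
            pieces.append(protein[start:i + 1])
            start = i + 1
    # Join neighbouring pieces to model up to `missed_cleavages` skipped sites.
    peptides = []
    for n in range(missed_cleavages + 1):
        for j in range(len(pieces) - n):
            peptides.append("".join(pieces[j:j + n + 1]))
    return peptides


if __name__ == "__main__":
    # Hypothetical sequence chosen only to exercise the K/R and proline rules.
    print(tryptic_digest("MKRPAKLLR"))
```

Real search engines apply additional rules (for example, enzyme-specific exceptions and length filters), so this sketch only shows why tryptic peptides form a more uniform population than intact proteins.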

1. Protein Extraction

Protein sample preparation is a critical step and is essential for obtaining good results. Most plant tissues are not a ready source of proteins and need specific precautions. When a particular group of proteins is of interest, a prefractionation technique needs to be applied. Various strategies have been developed over the years to fractionate proteins into subproteomes based on biochemical, biophysical, and cellular properties, and are discussed elsewhere (Ephritikhine, Ferro, & Rolland, 2004; Rose et al., 2004; Righetti et al., 2006; Bodzon-Kulakowska et al., 2007; Barkla et al., 2009).

The majority of plant protocols introduce a precipitation step to concentrate the proteins and to separate them from interfering compounds. In the eighties, much effort was invested in the establishment of robust 2-DGE and sample preparation methods for plant tissue (Damerval et al., 1986; Granier, 1988; Meyer et al., 1988), but proteomics at that time still had many technical limitations and was not yet widely applied. Later on, researchers picked up those “old” techniques, compared them, and optimized them (Saravanan & Rose, 2004; Carpentier et al., 2005; Isaacson et al., 2006; Mechin, Damerval, & Zivy, 2007). Protein extraction differs between a protein-based approach and a peptide-based approach, and both methods are complementary. In the first dimension of 2-DGE, none of the denaturants should interfere with the intrinsic charge of proteins, which excludes strong detergents such as SDS (sodium dodecyl sulfate). For the extraction of proteins for a gel-free approach, detergents cannot be used unless they are removed before separation and MS analysis by, for example, the filter-aided sample preparation method (Manza et al., 2005; Wisniewski et al., 2009; Vertommen et al., 2011b) or are chemically broken down (e.g., acid-labile surfactants, ALSs) (Yu et al., 2003). Currently, several ALSs are commercially available, among others RapiGest (Waters, Milford, MA), PPS (Protein Discovery, San Diego, CA), and ProteaseMAX (Promega, Madison, WI). A disadvantage of cleavable detergents is the co-precipitation of hydrophobic proteins together with the degradation products. Therefore, a commercially available detergent (Invitrosol™ LC/MS Protein Solubilizer Kit, Invitrogen, Grand Island, NY) has been developed that elutes in three peaks that are well separated from the elution times of most peptides (Invitrogen).

2. Protein Separation

The most widely used 2-DGE protocol separates denatured proteins according to two independent properties: isoelectric point (pI) and molecular mass (Mr). In the original procedure, the first dimension, isoelectric focusing, was executed in thin polyacrylamide gel rods inside glass tubes (O'Farrell, 1975). Bjellqvist et al. (1982) introduced IPGs generated by buffering acrylamide derivatives containing carboxylic and tertiary amino groups (Immobilines). Nowadays, IPG gels or “strips” are commercially available and have numerous advantages. High resolution can be obtained by using IPG strips with narrow pH ranges (e.g., 1 pH unit) (Hoving, Voshol, & Oostrum, 2000; Hoving et al., 2002).

A technical advancement in the second dimension is the development of polyacrylamide gradient gels that are immobilized on a low-fluorescence plastic backing. This ensures maximum resolving capacity and gives robust gels that can be sent for automated spot picking, while the low-fluorescence background enables the use of fluorescent dyes. In the past, gels were mostly run separately on a horizontal tray. Vertical electrophoresis equipment was then introduced, in which up to 12 gels can now be run simultaneously. Horizontal electrophoresis is currently being reintroduced. A new high-performance electrophoresis (HPE) system was recently developed in which the second-dimension separation is performed at higher voltages and up to four low-fluorescence plastic-backed flatbed gels can be run in parallel (Serva, Heidelberg, Germany). Becher et al. (2011) describe its testing in Bacillus subtilis.

Classical 2-DGE is not suitable for analyzing native proteins, protein complexes, and hydrophobic proteins. For an overview of the alternatives to classical 2-DGE, the reader is referred to Miller, Eberini, and Gianazza (2010). For a recent overview of the techniques used to analyze hydrophobic proteins in plants, readers are referred to Vertommen et al. (2011b) and Kota and Goshe (2011).

3. Peptide Separation

Two-dimensional (2-D) LC separation prior to MS analysis was introduced more than 10 years ago based on strong cation exchange (SCX) and reversed phase (RP) chromatography (Wolters, Washburn, & Yates, 2001; Washburn et al., 2002). The separation of complex mixtures of tryptic peptides profited enormously from the development of commercially available capillary columns suitable for RP chromatography (Tomer et al., 1994; Carr, 2002). The development of nano-columns with smaller particles and of higher pressure ultra-performance LC (UPLC) systems (Wilson et al., 2005) was also a critical factor. Gilar et al. (2005) introduced an alternative system based on RP-RP, in which peptides are separated using two different pH values for elution from the first and second RP column. According to the authors, the advantages of this approach are: (i) after optimization, the number of peptides divided over several consecutive fractions is very limited; (ii) the peptide losses in the first dimension are smaller compared to SCX separation; and (iii) the mobile phases are salt free. A commercial hardware platform for 2-D RP-RP nano-UPLC (2-D nano Acquity) has been introduced by Waters and has already been used successfully by Vertommen et al. (2011a) to analyze plasma membrane proteins. To take advantage of the properties of both gel-based protein separation and gel-free peptide separation, the two techniques can be combined in a geLC approach. Proteins are first separated according to their molecular size through SDS–PAGE. After removal of the detergent and in-gel digestion of the proteins, the resulting peptides are separated using an RP column that can be coupled on-line to a mass spectrometer. The main advantages are that the strong solubilizing power of SDS is used for protein solubilization and that a semi-automatic workflow is achieved. The complexity of the final peptide mixture present in one LC run is significantly reduced, which increases the chance of identifying low-abundance proteins. On the other hand, the method is less suitable for quantitative studies due to artifacts in electrophoresis and in-gel digestion.

4. Gel-Based Protein Quantification

One of the biggest advances in protein staining in terms of reproducibility and throughput was the development of succinimidyl ester derivatives of different cyanine fluorescent dyes that modify free amino groups of proteins prior to separation (Unlu, Morgan, & Minden, 1997). Succinimidyl ester derivatives react with nucleophilic primary amines, subsequently releasing the N-hydroxysuccinimide group. At a specific pH (8.5), these reagents react almost exclusively with the ϵ-amino group of lysine to form stable amide linkages that are highly resistant to hydrolysis. Originally, only two different cyanine dyes were included (Cy3 and Cy5), but the concept was extended with a third dye (Cy2), which opened the way for a completely new experimental design that further exploits the sample multiplexing capabilities of the dyes by including an internal standard (Alban et al., 2002, 2003). The internal standard is a mixture of equal amounts of each sample; it guarantees a high accuracy of protein quantification, reduces the variability considerably, and justifies the use of powerful parametric statistics after transformation of the standardized volume (Karp & Lilley, 2005). For an overview of protein staining and quantification, the reader is referred to Miller, Crawford, and Gianazza (2006). It should be noted here that protein identification is usually done by MS.
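The role of the internal standard can be sketched numerically. The following minimal example assumes that the standardized volume of a spot is the ratio of its sample-channel volume (Cy3 or Cy5) to the internal-standard (Cy2) volume of the same spot on the same gel, and that a log10 transformation is applied before parametric statistics; the choice of log10, the function name, and the spot volumes are illustrative assumptions rather than details given in the text.

```python
import math


def standardized_log_abundance(sample_volume, standard_volume):
    """log10 of the sample spot volume relative to the internal standard."""
    if sample_volume <= 0 or standard_volume <= 0:
        raise ValueError("spot volumes must be positive")
    return math.log10(sample_volume / standard_volume)


if __name__ == "__main__":
    # Hypothetical volumes of one spot on two gels with different overall
    # intensity: (sample channel, internal-standard channel).
    gel_a = (2000.0, 1000.0)
    gel_b = (3000.0, 1500.0)
    for sample, standard in (gel_a, gel_b):
        print(round(standardized_log_abundance(sample, standard), 4))
```

Because each spot is referenced to the standard on its own gel, the two hypothetical gels yield the same standardized value even though their raw volumes differ, which is the gel-to-gel variability the design is meant to remove.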

5. Mass Spectrometry-Based Protein Quantification

Recent advances in plant proteomics have been largely made possible by developments in biological MS. Since its commercial introduction in 2005, the orbitrap mass analyzer has become the instrument of choice for many proteomics applications because of its ability to deliver low-ppm mass accuracy and extremely high resolution, all within a time scale compatible with nano-LC separations. One powerful and highly specific type of MS analysis is based on selected reaction monitoring (SRM), also called multiple reaction monitoring (MRM). Lehmann et al. (2008) called this approach Mass Western and used it to analyze the specific isoforms of sucrose phosphate synthase in Arabidopsis. In an SRM experiment, a predefined precursor ion and one of its fragments are selected by the two mass filters of a triple quadrupole instrument and monitored over time for precise quantification (Lange et al., 2008).
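The precursor/fragment pair monitored in an SRM experiment (a "transition") can be computed from the peptide sequence. The sketch below uses standard monoisotopic residue masses and considers only unmodified residues and y-type fragment ions; the function names are hypothetical, and the peptide in the usage example is borrowed from Figure 2 purely for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added to the residue sum of a peptide or y ion
PROTON = 1.007276   # mass of one charge-carrying proton


def peptide_mz(sequence, charge):
    """m/z of an unmodified peptide carrying `charge` protons."""
    mass = sum(MONO[aa] for aa in sequence) + WATER
    return (mass + charge * PROTON) / charge


def y_ion_mz(sequence, n, charge=1):
    """m/z of the y ion formed by the n C-terminal residues."""
    return peptide_mz(sequence[-n:], charge)


def srm_transition(sequence, precursor_charge, y_index):
    """(Q1, Q3) m/z pair for one precursor / singly charged y-ion transition."""
    return peptide_mz(sequence, precursor_charge), y_ion_mz(sequence, y_index)


if __name__ == "__main__":
    q1, q3 = srm_transition("VALEACVQAR", precursor_charge=2, y_index=5)
    print(f"Q1 = {q1:.3f}, Q3 = {q3:.3f}")
```

In practice, cysteine alkylation and other modifications shift these values, and several transitions per peptide are usually monitored; both points are outside the scope of this sketch.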

Most peptide-based quantitative proteomic analyses are comparative (relative) and rely on universal approaches that require chemical labeling of peptides, which makes them suitable for plant proteins and peptides. These include, among others, isobaric tags for relative and absolute quantitation (iTRAQ), isotope-coded affinity tags (ICAT), and labeling with H₂¹⁸O during digestion of the protein samples. In contrast to relative quantification, absolute quantification of peptides (AQUA) relies on the synthesis of isotopically labeled peptide standards, for example in SRM experiments. The need to synthesize isotopically labeled peptides somewhat limits the scale of quantitative analysis. These techniques and their applications to quantitative plant proteomics are reviewed in Oeljeklaus, Meyer, and Warscheid (2009), Kline and Sussman (2010), Schulze and Usadel (2010), and Bindschedler and Cramer (2011a, 2011b). However, an increasing number of experiments are now being attempted using label-free approaches for plant proteome research (America & Cordewener, 2008; Griffin et al., 2010; Matros et al., 2011). For an overview of the latest developments in peptide-based separation and label-free quantification, the reader is referred to Matros et al. (2011).

An alternative to chemical labeling and label-free quantitation is metabolic labeling. While stable isotope labeling with amino acids in cell culture (SILAC) is highly successful in animal systems, it is less applicable to plant cell cultures because amino acid incorporation is poor and labeling is only partial, due to the ability of green cells to fix carbon. However, plant-specific quantitative methods such as HILEP and SILIP (Palmblad, Bindschedler, & Cramer, 2007; Bindschedler, Palmblad, & Cramer, 2008; Schaff et al., 2008; Kline, Barrett-Wilt, & Sussman, 2010; Bindschedler & Cramer, 2011a, 2011b) take full advantage of plant metabolism and long-established culture practices. In such approaches, whole plants or plant cells are grown in hydroponic or cell cultures with 15N inorganic salts as the sole nitrogen source, so that their proteins become labeled, as reviewed in Bindschedler and Cramer (2011b) and Arsova, Kierszniowska, and Schulze (2012). Apart from the nearly complete, convenient, and cost-effective labeling of Arabidopsis with 15N salts, it was shown that other plants such as tomato (Schaff et al., 2008) and the woodland strawberry Fragaria vesca (Bindschedler, Dunwell, Jambagi, & Cramer, unpublished data) can be labeled this way. For instance, Figure 2 shows that F. vesca grown hydroponically was amenable to quantitative proteomic analysis using isotopically different nitrogen salts in the growth medium. F. vesca leaves from 14N- and 15N-grown plants were pooled in a 1:1 ratio. Proteins were extracted, digested, and analyzed by nLC-ESI-MS/MS. Quantitation was then performed by comparing the peak area ratios of the 14N and 15N isotopic envelopes of the corresponding co-eluting 14N and 15N peptide doublets.

Figure 2.

Metabolic 15N-labeling is suitable for relative quantitation of proteomes of Fragaria vesca (wild strawberry), as shown for the doubly charged peptide VALEACVQAR. A ratio of 1 is expected, as equivalent amounts of 14N-grown leaves were pooled with 15N-grown leaves. A: Total and extracted ion chromatograms of an nLC-ESI MS analysis. 14N and 15N extracted ion chromatograms show co-elution of the 14N and 15N peptide isotopologues. B: MS spectrum at 30.03 min with details of the 14N and 15N doubly charged isotopologues of VALEACVQAR, showing a 14N/15N ratio close to 1. C: MS/MS spectrum of the 14N isotopologue. D: Identification and quantitation details for the doubly charged peptide VALEACVQAR. XCalibur (ThermoFisher Scientific, Hemel Hempstead, UK) and Mascot Distiller (MatrixScience, London, UK) software were used for data presentation and analysis as described previously (Bindschedler & Cramer, 2011b).
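The spacing of a 14N/15N peptide doublet and the resulting ratio can be estimated with a short sketch. It assumes complete 15N incorporation and counts nitrogen atoms from the unmodified residue compositions; the extra_nitrogens parameter (for example, one additional N per carbamidomethylated cysteine if iodoacetamide alkylation was used) and the peak areas in the usage example are hypothetical and are not values from Figure 2.

```python
# Nitrogen atoms per amino acid residue (backbone N plus side-chain N).
N_ATOMS = {
    "A": 1, "R": 4, "N": 2, "D": 1, "C": 1, "E": 1, "Q": 2, "G": 1,
    "H": 3, "I": 1, "L": 1, "K": 2, "M": 1, "F": 1, "P": 1, "S": 1,
    "T": 1, "W": 2, "Y": 1, "V": 1,
}
DELTA_15N = 0.9970349  # mass difference between 15N and 14N in Da


def nitrogen_count(peptide, extra_nitrogens=0):
    """Number of N atoms in the peptide; extra_nitrogens covers modifications."""
    return sum(N_ATOMS[aa] for aa in peptide) + extra_nitrogens


def mz_shift(peptide, charge, extra_nitrogens=0):
    """m/z distance between the 14N and the fully 15N-labeled peptide ion."""
    return nitrogen_count(peptide, extra_nitrogens) * DELTA_15N / charge


def light_heavy_ratio(area_14n, area_15n):
    """Relative abundance from the summed peak areas of the two envelopes."""
    return area_14n / area_15n


if __name__ == "__main__":
    peptide = "VALEACVQAR"
    print(nitrogen_count(peptide), round(mz_shift(peptide, charge=2), 3))
    # Hypothetical peak areas for a 1:1 mixture.
    print(light_heavy_ratio(1.02e7, 1.00e7))
```

Incomplete labeling broadens the heavy envelope toward lower m/z, so real workflows fit the whole isotopic envelope rather than a single peak; that refinement is omitted here.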

6. Posttranslational Modifications Analysis

Posttranslational modifications (PTMs) are covalent processing events that determine a protein's activity state, localization, turnover, and interaction with other proteins (Mann & Jensen, 2003). Despite the pivotal roles of PTMs in cellular functions, studies on PTMs are cumbersome. PTMs can be analyzed at both the protein level and the peptide level. Once modified proteins have been identified, their PTMs are subsequently characterized by MS (Jensen, 2004). Developments in affinity-based enrichment and MS should bring new insights into the dynamics and spatio-temporal control of protein activities by PTMs and reveal their roles in biological processes (Jensen, 2004). Mass spectrometers continue to evolve towards increased sensitivity, higher mass accuracy and resolving power, improved duty cycle, and more efficient fragmentation of peptides in MS/MS. However, mapping PTMs is quite a challenging task due to the low abundance and changed physicochemical characteristics of the modified peptides. Many modifications are labile in the gas phase and are easily lost under vibrational (collisional or infrared) excitation without revealing their positions, which lowers their detection efficiency. The alternative, non-vibrational excitation technique electron capture dissociation (ECD) has shown on several occasions its profound potential for PTM characterization (Kjeldsen et al., 2003). With ECD, the positions of phosphorylation, N- and O-glycosylation, sulfation, and γ-carboxylation can be easily established, whereas these modifications are rapidly lost upon vibrational excitation. In an approach termed “reconstructed molecular mass analysis” (REMMA), the molecular mass distribution of the intact protein is measured first, which reveals the extent and heterogeneity of modifications. The protein is then digested with one or several enzymes, and the peptides are separated by reversed-phase HPLC and analyzed by Fourier transform MS (FTMS) (Kjeldsen et al., 2003). When a measured peptide molecular mass indicates the possibility of a PTM, vibrational excitation is applied to determine, via characteristic losses, the type and eventually the structure of the modification, while ECD determines the PTM site. ECD enables efficient sequencing of phosphopeptides, glycopeptides, and other types of modified peptides.

Bond, Row, and Dudley (2011) describe the protocols for mapping plant PTMs. Among the studied PTMs are phosphorylation, glycosylation, ubiquitination, methylation, acetylation, sulfonation, sumoylation, myristoylation, palmitoylation, prenylation/farnesylation, and the redox proteome. N-terminal myristoylation plays a vital role in membrane targeting and signal transduction in plant responses to environmental stress. Podell and Gribskov (2004) developed a new method, based on a plant-specific training set and a probability-based hidden Markov model, for predicting N-terminal myristoylation sites specifically in plants. PhosPhAt, P3DB, PlantsP, and plantsUPS are plant-specific databases useful for investigating PTMs. PhosPhAt is the Arabidopsis Protein Phosphorylation Site Database developed by Heazlewood et al. (2008). Moreover, Gao et al. (2009) developed a plant-specific protein phosphorylation database, P3DB. PlantsP is a server developed by the Purdue University Database Group that offers several tools, among them the prediction of phosphorylation and myristoylation sites. The most extensively studied PTM is phosphorylation (Kersten et al., 2006; Kersten et al., 2009; Schulze, 2010; Bond, Row, & Dudley, 2011). There is a variety of techniques and methodologies for phosphoprotein analysis (radioactive labeling, immunoprecipitation, affinity chromatography, and chemical derivatization). Advances in analytical techniques and the evolution of various high-resolution mass spectrometers during the last decade have accelerated the large-scale screening of PTMs from various biological sources.

C. Proteogenomics and Genomic/Proteomic Databases

Facilitated by the speed and decreased cost of third-generation DNA sequencing, genome-wide sequencing of plant species, in particular the main food crops, is on the rise after a decade of sequencing of A. thaliana and the Indica and Japonica rice subspecies. In 2005, the first map-based sequence of the annotated rice genome was also completed (International Rice Genome Sequencing Project, 2005). Entire genomes are now becoming available for some of the major crops, such as maize (Schnable et al., 2009), sorghum (Paterson et al., 2009), potato (Xu et al., 2011), tomato, soybean, domesticated apple (Velasco et al., 2010), and banana (D'Hont et al., 2012), as recently reviewed (Feuillet et al., 2010; Miller, Eberini, & Gianazza, 2010; Sonah et al., 2011) and listed at http://en.wikipedia.org/wiki/List_of_sequenced_eukaryotic_genomes#cite_note-42. With the explosion in the amount of available data, it is increasingly difficult to provide a complete and updated picture of genome availability. Thus, this review has to restrict itself to the main model and crop species and some basic aspects of their genomic and proteomic sequence acquisition and availability.

In general, genome sequence assembly and annotation of crops are challenging tasks due to large genome sizes and the fact that typically over 80% of the genome is constituted by repetitive transposable elements, as it is the case for the 2.3 billion bases large maize genome (Schnable et al., 2009), barley, and wheat. Polyploidy is another challenge to overcome for many cultivated crops, for example, wheat, potato, tomato, oil seed rape Brassica napus, and even fruit crops (such as banana or strawberry), thus requiring independent sequencing of the various wild-type haplotypes (Shulaev et al., 2011).

Genome annotation, which gives information on gene function, relies predominantly on the prediction of protein-encoding genes based on sequence comparison or in silico gene prediction. However, validation of predicted open reading frames (ORFs) depends on extensive transcriptomics sequence data, such as the recently published tens of thousands of unique cDNAs that were sequenced and assembled for barley (Matsumoto et al., 2011) and maize (Soderlund et al., 2009). Alternatively or in combination, a proteogenomics approach using large-scale shotgun proteomics has proven to be extremely powerful in discovering unpredicted ORFs in the extensively and intensively annotated genomes of model organisms, such as fly, human, and Arabidopsis (Castellana et al., 2008; Castellana & Bafna, 2010). For Arabidopsis, this is illustrated by the 13% new ORFs that were identified in an in-depth proteo(geno)mics study (Castellana et al., 2008).

Improved MS-based proteomic workflows now allow proteogenomics to become the method of choice for validating exon–intron structures of ORFs by mapping the identified peptides to the genome and grouping these peptides into proteins (Ansong et al., 2008; Armengaud, 2010). Such an approach has already been described extensively not only for Arabidopsis (Castellana et al., 2008) but also for rice (Helmy, Tomita, & Ishihama, 2011) and for fungal wheat and barley pathogens (Bringans et al., 2009; Bindschedler et al., 2011). Proteogenomics can use imperfect genomic databases to identify proteins by proteomic means (Ansong et al., 2008; Castellana et al., 2010; Agrawal et al., 2011; Bindschedler et al., 2011) and can help to annotate short or species-specific ORFs. Therefore, newly assembled and (poorly) annotated crop genomes still enable proteomic investigations. This is quite important because in protein sequence databases such as UniProtKB, the plant protein entries are well behind the entries for species of other kingdoms. For instance, Viridiplantae is underrepresented in UniProtKB, with only 32,666 of 536,789 total entries, that is, about 6% of all protein entries (http://www.uniprot.org/program/plants/statistics - accessed in July 2012). Of these plant protein entries, one third are Arabidopsis entries (10,617) and over 2,000 are from rice.
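The core idea of mapping identified peptides onto a genome can be illustrated with a six-frame translation sketch. It assumes the standard genetic code and exact string matching, and it ignores introns, so it does not reproduce any of the cited pipelines; the function names and the toy DNA sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def map_peptide(genome, peptide):
    """Return (strand, frame, nucleotide offset) for every exact match.

    Offsets on the '-' strand are counted along the reverse complement.
    """
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                hits.append((strand, frame, frame + 3 * pos))
                pos = protein.find(peptide, pos + 1)
    return hits


if __name__ == "__main__":
    # Toy sequence: ATG AAA GTT TGA (M K V stop) preceded by two bases.
    print(map_peptide("CCATGAAAGTTTGA", "MKV"))
```

A peptide that maps to a region without an annotated gene is the kind of evidence used to propose a new or corrected ORF; peptides spanning exon junctions require spliced or gapped matching, which this sketch does not attempt.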

Concomitant with the emerging plant genomic and proteomic information (Armengaud, 2010; Renuse, Chaerkady, & Pandey, 2011), new bioinformatic tools are being developed to automatically map identified peptides on whole genomes (Sanders et al., 2011; Specht et al., 2011) and assign function to unknown proteins (Bindschedler et al., 2011). Such technological developments will make proteogenomic approaches even more popular and suitable for plant proteomics and complement quantitative plant proteomic measurements (Bindschedler & Cramer, 2011a) by providing the necessary data on protein identity and function for the investigation of plant proteomes.

Resources for crop proteomics such as genomic and proteomic databases are still in their infancy. Most resources, with the exception of some maize and rice databases (Tables 1 and 2), have been developed for the model plant system Arabidopsis as reviewed by Weckwerth et al. (2008).

Table 1. A non-exhaustive list of some genomic and proteomic databases and websites for food crops
Table 2. Proteomics resources

D. Preparing the Stage for Systems Biology: Data Integration at Systems Level

Messenger RNA (mRNA)-based approaches are extremely powerful and highly automated, allowing massive screening of many genes at once. However, it is important to recognize that there may be a discrepancy between the messenger (transcript) and its final effector (mature protein). As most biological functions in a cell are executed by proteins and metabolites rather than by mRNA, transcript profiling does not always provide pertinent information for the description of a biological system. Expression studies on prokaryotic as well as lower and higher eukaryotic organisms revealed in certain cases a poor correlation between mRNA transcript level and protein abundance (Gygi et al., 1999; Griffin et al., 2002; Corbin et al., 2003; Greenbaum et al., 2003; MacKay et al., 2004; Tian et al., 2004; Carpentier et al., 2008b) or enzyme activity (Gibon et al., 2004). If transcripts are only an intermediate on the way to functional proteins, and proteins in turn regulate metabolite abundances, why measure mRNA? A correlation between mRNA and protein abundance clearly exists in many cases, several studies did find a correlation between mRNA and metabolites (Goossens et al., 2003; Hirai et al., 2004), and all networks in the cell are connected. Furthermore, each approach has its own biases and drawbacks. Hence, biological variables from transcripts, proteins, and metabolites need to be integrated to understand a system as a whole, and this integration will lead to new insights. Saito, Hirai, and Yonekura-Sakakibara (2008) review the strategy of combining transcriptome and metabolome studies as a powerful tool for helping the annotation of plant genomes.
However, integrating data from different biological variables to understand the dynamic phenotype of a plant is even more challenging: (i) good algorithms and statistics are needed to extract significant information and to cope with the high dimensionality of the data, (ii) the data have to be of good quality, and (iii) the experimental set-up with the plants should not be too complicated. Wienkoop et al. (2008) present an approach to investigate the combined covariance structure of metabolite and protein dynamics in a systemic response to abiotic temperature stress in Arabidopsis wild-type plants. High-dimensional data profiling followed by multivariate statistics for dimensionality reduction and covariance structure analysis is a powerful strategy for studying the systems biology of a plant under particular conditions. The systematic integration of transcript, protein, and metabolite profiling needs to be modeled over time to find the correlations between the different levels, which is a challenge for bioinformaticians and statisticians (Weckwerth, 2011).
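As a rough illustration of dimensionality reduction on a combined protein and metabolite data set, the sketch below z-scores each variable, computes the covariance matrix, and extracts the first principal component by power iteration. This is a generic principal component analysis written for illustration, not the specific procedure of Wienkoop et al. (2008); the variable names and values are hypothetical, and the code assumes that no variable is constant across samples.

```python
import math


def zscore_columns(rows):
    """Center each column and scale it to unit sample variance."""
    n = len(rows)
    scaled = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scaled.append([(v - mean) / sd for v in col])
    return [list(r) for r in zip(*scaled)]


def covariance_matrix(rows):
    """Sample covariance matrix of the columns of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvalue and eigenvector of cov by power iteration."""
    p = len(cov)
    v = [float(j + 1) for j in range(p)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to the data; change it")
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    return eigenvalue, v


# Hypothetical time course: columns are protein_A, protein_B, metabolite_X.
samples = [[1, 5, 2], [2, 4, 4], [3, 3, 6], [4, 2, 8], [5, 1, 10]]
cov = covariance_matrix(zscore_columns(samples))
eigenvalue, loadings = first_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(round(explained, 3), [round(x, 3) for x in loadings])
```

Loadings with the same sign indicate variables that change together across samples, whereas opposite signs indicate variables that change in opposite directions. With real data, more components and a dedicated numerical library would normally be used.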

III. GLOBAL FOOD SECURITY AND SAFETY: CHALLENGING OUR SURVIVAL

“Civilization as it is known today could not have evolved, nor can it survive, without an adequate food supply.”—Dr. Norman Ernest Borlaug (A Nobel Laureate)

Food security is defined as a situation in which all people at all times have physical and economic access to sufficient, safe, and nutritious food to meet their dietary needs and live an active healthy life (World Food Summit, 1996). Over the past couple of decades, food security and its related issues have been among the most discussed topics around the world, and will likely remain a prime concern for the next 50 years and beyond. Many factors affect global food security, including global population, climate change, and exploitive agriculture (Fig. 3). Despite global efforts in this direction, the number of hungry or undernourished people worldwide increased by 75 million in 2007 alone, and a report of the US Department of Agriculture predicted that the number will rise sharply to 1.2 billion by 2017 (FAO, 2010).

Figure 3.

The vicious cycle of the global food security crisis. Food security is a multifaceted problem, which is affected by and affects multiple factors worldwide. We show here the major issues that are the cause of food crisis, as well as those created by the food crisis. These problems are both social and scientific in nature, and therefore, we need to come together and act united toward a hunger free and sustainable world.

Improvement of crop and horticultural plants has long been the major focus for addressing and solving food security related issues. The “green revolution” of the 1960s and 1970s is striking evidence: higher wheat production is credited with saving almost 1 billion people from severe hunger worldwide (Hesser, 2006). This first revolution also marked recognition of the first global, united effort toward a single aim, “food security.” Through breeding techniques, Dr. Norman Ernest Borlaug, among others, introduced important traits into wheat, such as disease resistance and dwarf height. Combined with proper irrigation and fertilization, these traits showed the world that wheat yield could be doubled in Mexico, the USA, and major South Asian countries (such as India, Pakistan, and Bangladesh) (Hesser, 2006). In a recent commentary published in the journal “Nature,” Dr. Jason Clay pointed to some of the approaches needed for global farming (Clay, 2011), which, if taken together, might enable present agriculture to perform in a more sustainable way (Herdt, 2006).

In the years after the “green revolution,” scientists fighting global food insecurity became convinced that conventional technologies alone would not be able to feed the world of tomorrow. Thus, the search for modern technologies [e.g., plant biotechnology, especially crop biotechnology (Herdt, 2006)] took center stage. Crop biotechnology has been of great importance in improving global agricultural production and reducing the environmental impact associated with pesticide use and soil erosion, in both developed and developing countries (Barfoot & Brookes, 2007; Brookes & Barfoot, 2009). In developing countries alone, biotech crops have proven beneficial to more than 12 million farmers (James, 2009). Plant breeders have also used the technology to improve the nutritional value of crops (Newell-McGloughlin, 2008). However, there is awareness that not only crop improvement but also a thorough postharvest management system and smart food processing technologies are necessary to guarantee the availability of food in the coming years. Food analysis and safety are growing issues that have received attention in recent years. Over the past few years, regulatory agencies have introduced and defined the term “food contamination” (Council Regulation 315/93, 1993). Moreover, agencies have recommended the use of appropriate analytical tools to properly identify and quantify very low levels of food contaminants. One such tool is proteomics. In the section below, we discuss the translation of discoveries and advancements in plant proteomics and MS toward solving global issues of food security and safety, using examples from model and non-model plants, crops, biofuels, biotic and abiotic stresses, postharvest technology, foodstuff analysis, genetically modified crops, and allergens.

IV. TRANSLATING PLANT PROTEOMICS KNOWLEDGE TOWARDS FOOD SECURITY AND SAFETY ISSUES

Food security, analysis, safety, and human health include multiple research areas across various disciplines from crop improvement to postharvest technologies. In this section, we provide some recent examples and progress where proteomics and the use of MS are playing important roles in addressing these issues.

A. Model Versus Non-Model Plants

In past decades, research communities have focused on species that facilitate experimental laboratory research because their size, generation time, and undemanding growth requirements make them amenable to high-throughput analysis (e.g., Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Mus musculus). The first classical plant model, A. thaliana, is ideal for laboratory studies. However, Arabidopsis is phylogenetically related to only a limited number of agriculturally important species, which restricts the extrapolation of results to other species (translational proteomics; see also Agrawal et al., 2012a). Brachypodium is a relative of a large number of temperate cereal crops and grasses and, with its small nuclear genome, a life cycle of less than 4 months, and its small size, has emerged as an interesting new model plant (Draper et al., 2001). Arabidopsis research has provided an enormous quantity of data used in genomics, transcriptomics, proteomics, and metabolomics, and Brachypodium is beginning to be integrated into proteomics studies (Larré et al., 2010; Wang et al., 2010). However, the plant kingdom has an enormous richness in biodiversity, providing a wealth of possibilities to meet the specific needs of nutritionists, breeders, plant physiologists, agricultural researchers, and food scientists. Research on models has proven to be of great value for fundamental research, but models need to be validated, and some models are simply too far removed from their application.

In the past decade, with the rapid development of sequencing techniques and the rapid drop in their price, the number of genome-sequencing projects has increased, and more and more plants, specifically crops, are being sequenced and put forward as “representatives.” Several “representatives” now have fully sequenced genomes (Feuillet et al., 2010) (find the latest data at http://www.plantgdb.org/). However, one should take into account that the genomes of those “economically important representatives” might not be annotated and are often quite complex because of their size and ploidy level. Genome duplication, polyploidy, and allopolyploidy have played an important role in the evolution of plants, including important crops (Soltis & Soltis, 1999). This increase in genetic diversity enables plants to diverge and specialize and to survive stress conditions. However, its occurrence considerably complicates the analysis of real crops (Carpentier et al., 2011). The proteomics approach has great potential for studying non-model species (Carpentier et al., 2008a, 2008b, 2011; Vertommen et al., 2011a). Proteins are well conserved, which makes the high-throughput identification of non-model gene products by comparison to well-known orthologous proteins quite efficient (cross-species identification) (Liska & Shevchenko, 2003). While some of the studies done in “the classical model” plants relied entirely on genome sequences to obtain identifications based only on peptide mass fingerprints (PMF), most studies nowadays use MS/MS to improve the confidence of protein identification and reduce the occurrence of false identifications. Cross-species identification is therefore an important issue and an extremely useful tool to study “orphan” species whose genomes are not yet sequenced or for which only a few expressed sequence tag (EST) sequences are available.
Numerous reviews on the subject have recently been published (Bräutigam et al., 2008; Carpentier et al., 2008a; Remmerie et al., 2011; Vertommen et al., 2011a, 2011b). Considering only the identification of proteins from orphan species, two main strategies have been followed in the past: protein homology or de novo sequencing (Remmerie et al., 2011). In the first case, databases of proteins or ESTs closely related to the considered species (e.g., same species, same family, or same taxonomical group) are created and used with search engines. In the second, peptide sequences are derived directly from the MS/MS spectra in a database-free search. The two approaches can also be combined (e.g., as provided by Shevchenko et al., 2001; MSBLAST: http://dove.embl.de/Blast2/msblast.html). Studies based purely on de novo sequencing have become very rare in the last 5 years. Alternative approaches are also used. Given the relatively high protein sequence homology between even very distant plants, databases containing as much information as possible are often used instead of sequences from close relatives only. However, this approach comes at the cost of computation time.
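The combination of de novo sequencing with homology searching can be sketched as follows. This is a deliberately simplified illustration and not the MS BLAST algorithm: it scores ungapped alignments of de novo sequence tags against proteins from related species, tolerates a fraction of substitutions, and treats leucine and isoleucine as equivalent because they have identical masses. All sequences, names, and the 80% identity threshold are hypothetical.

```python
def normalize(seq):
    """Upper-case a sequence and merge the isobaric residues I and L."""
    return seq.upper().replace("I", "L")


def best_ungapped_match(tag, protein):
    """Return (identities, offset) of the best ungapped placement of tag."""
    tag, protein = normalize(tag), normalize(protein)
    best = (0, -1)
    for offset in range(len(protein) - len(tag) + 1):
        window = protein[offset:offset + len(tag)]
        identities = sum(a == b for a, b in zip(tag, window))
        if identities > best[0]:
            best = (identities, offset)
    return best


def cross_species_search(tags, database, min_identity=0.8):
    """Rank database proteins by the number of tags matched above threshold."""
    results = {}
    for name, protein in database.items():
        matched = 0
        for tag in tags:
            identities, _ = best_ungapped_match(tag, protein)
            if tag and identities / len(tag) >= min_identity:
                matched += 1
        if matched:
            results[name] = matched
    return sorted(results.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical tags from an orphan species and a toy database of relatives.
tags = ["AVDLGSK", "ELISNASR"]
database = {
    "ortholog_A": "MSAVDIGTKPELLSNASRW",
    "unrelated_B": "MKKKKKKKKKKKKKKKKK",
}
print(cross_species_search(tags, database))
```

Requiring several independent tags to match the same database protein is one simple way to compensate for the lower stringency of each individual match. The larger the database, the more comparisons are needed, which reflects the computation-time cost mentioned above.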

B. The Application of Proteomics and Mass Spectrometric Tools in the Study of Biofuel Crops

In recent years, public concern over the rising environmental and financial costs associated with the consumption of fossil fuels has led to an intensified search for cleaner and more sustainable energy sources. Biofuels constitute one such alternative that has attracted significant global interest. From 2000 to 2009, global bioethanol production increased from 16.9 to 72.0 billion liters, while biodiesel output surged from a low level of 0.8 to 14.7 billion liters (Sorda, Banse, & Kemfert, 2010). These impressive gains have had the welcome corollary effect of generating new economic opportunities in the rural sector, whilst promoting energy security and self-sufficiency in a more environmentally sustainable way. In spite of these advantages, however, the growth of the biofuels industry, which has essentially relied on food crops (such as cereals, oily seeds, and sugarcane/sugarbeet) as feedstock, has also been blamed for rising food prices and instability in the food supply market. To reduce the negative impact that biofuel production may have on food security, greater emphasis is now being placed on the development of second-generation bioethanol and biodiesel. Here, some of the latest developments in biofuels research are presented, with a special focus on the contributions made by proteomics and MS. Attention is drawn not only to important biofuel crops, such as Sorghum and Jatropha, but also to how proteomics and MS techniques are being applied more broadly to second-generation biofuel research with the goal of improving the efficiency of the biomass-to-biofuel production process. As the examples in this section show, proteomics and MS technologies continue to make a vital contribution to the successful development of a sustainable global biofuels industry.

To start, we consider the latest work on Sorghum bicolor. The controversy surrounding the use of maize and other food crops for biofuel production has drawn attention to an African grain plant, sorghum. In addition to its importance as a source of food, Sorghum (particularly sweet stem varieties) has attracted significant interest in recent years as a promising energy crop (Munns & Tester, 2008). Sorghum can grow in a wide range of marginally arable geographical areas as it requires relatively less fertilizer and water compared to other grain crops (Kasuga et al., 1999). The combination of the natural drought tolerance traits of S. bicolor and its recent genome sequencing milestone (Paterson et al., 2009) makes it one of the most logical model plants for both proteomics and genomics research in cereals. Sorghum proteome analysis profiled the 2-DGE protein patterns of the total soluble proteins and secreted culture filtrate proteins, and culminated in a comprehensive mapping and characterization of the sorghum cell suspension culture secretome (Ngara & Ndimba, 2011). More recently, this group completed a study profiling and identifying salt-stress-responsive proteins of sorghum seedlings (Ngara et al., 2012).

Another important biofuel crop species that has been studied using proteomics is Jatropha curcas, a non-edible oilseed crop. Similar to Sorghum, Jatropha is a hardy plant that can withstand arid and semi-arid conditions (Sudhakar Johnson, Eswaran, & Sujatha, 2011). Proteomics has mainly been applied to Jatropha to foster a better understanding of the factors involved in maintaining its high seed oil content. In a study conducted by Popluechai et al. (2011), a comprehensive characterization of the proteome of Jatropha seed oil bodies was achieved using LC-MS/MS, revealing the major contribution of three main types of oleosins.

In terms of second-generation biofuels, the difficulty and high cost of breaking down lignocellulosic materials into easily fermentable sugars is the main stumbling block preventing the widespread uptake of this form of biofuel production. Researchers in the field, however, are making important progress by beginning to narrow in on the natural process of biomass degradation, as accomplished by specialist microorganisms. In a study by Tolonen et al. (2011), quantitative MS analysis of both the proteome and secretome of Clostridium phytofermentans allowed for a systems-wide analysis of the various proteins implicated in the efficient fermentation of cellulosic biomass. In this study, over 2,500 proteins were identified, which will serve to direct the identification and engineering of targets for second-generation biofuel production. In a similar vein, the comprehensive analysis of the secretome of the fungus Phanerochaete chrysosporium, using iTRAQ LC-MS/MS, allowed researchers to quantitatively profile the expression of over 300 proteins involved in the degradation of several types of agricultural and forestry wastes (Adav, Ravindran, & Sze, 2012). This marked an improvement of up to sevenfold compared to previous studies and will provide important clues as to how enzymatic cocktails can be optimized for improved biofuel production.

C. Crop Disease Proteomics: Reducing Pathogen Damage to Agricultural Crops

An estimated 10% of the developed world's food is lost due to plant pathogens annually (Strange & Scott, 2005), and much more during epidemics. The greatest losses suffered by agricultural crops today are caused by insect damage and plant diseases, with plant diseases being the most devastating. Serious plant diseases are caused by bacteria, viruses, nematodes, as well as fungi and fungus-like Protozoa, but fungi probably cause the most severe losses around the world because there are more genetically diverse plant pathogenic fungi than there are plant pathogenic bacteria or viruses. Clearly, molecular plant pathology is an area in which a scientific approach to enhancing disease resistance can make a significant impact on crop productivity. Many of the benefits of this research would eventually reach the developing world, where agricultural losses to pathogens tend to be higher. Losses also occur where pathogens contaminate grain or other edible produce with mycotoxins. Deoxynivalenol contamination of wheat by Fusarium graminearum alone has cost the US and Canada an estimated $3 billion since 1990 (Ward et al., 2008); in Southern Africa, where subsistence farmers are often forced to consume contaminated grain, the cost in human suffering must also be considered (Ncube et al., 2011). The best strategy for controlling plant diseases includes using resistant crop cultivars, and for plant breeders to develop these, an understanding of plant–pathogen interactions is essential. Since proteins are important players in plant–pathogen communication, proteomics is a logical choice for dissecting the molecular events that frequently lead to plant disease.

A great deal of progress has been made in model plant–pathogen systems, notably with Arabidopsis and bacterial elicitors, and this has advanced the understanding of events at the molecular level which lead either to disease progression or limited pathogen growth in the case of disease resistance (Nishimura & Dangl, 2010). Thus, the gene-for-gene theory, advanced by Flor (1971), based on the interaction between plant resistance R-genes and pathogen avirulence avr-genes in flax–flax rust, has been confirmed using a proteomics approach (Dodds et al., 2006) and has evolved into the zigzag model described by Jones and Dangl (2006). This model seeks to explain molecular events that occur upon infection of a plant by a pathogen and although it was built up with experimental evidence from A. thaliana, it is generally applicable to other plant–pathogen systems. In brief, the plant must overcome or neutralize the actions of various pathogen elicitors, which it first recognizes as microbe-associated molecular patterns (MAMPs). If the pathogen is successful, expression of avirulence proteins results and these interact with plant R-gene products the majority of which are nucleotide-binding site leucine-rich repeat (NBS-LRR) proteins (Boller & Felix, 2009). On the other hand, if the plant is able to overcome the action of the avr proteins, a hypersensitive response results. This causes a halt to the invading pathogen with little penalty to the host. If it cannot, then the pathogen completes its life cycle and causes disease. A real-life example of this interaction is presently playing out in East Africa where the wheat R gene Sr31, the main line of defense used by breeders to combat stem rust caused by Puccinia graminis, has been overcome by isolate Ug99 with devastating results for the region (Singh et al., 2008). The impact of Ug99 in the Western hemisphere would be less, as P. graminis can be well managed by fungicides. 
The tools available for crop proteomics cannot be compared to those available for Arabidopsis research. For example, with the exception of rice, no major crop plant has a fully sequenced (i.e., mature and nicely annotated) genome; in many cases creation of genetic mutants—for example, through the use of transgenics—is difficult; only a few R genes have been cloned and characterized, and far fewer avr genes.

The contribution of studies that employed model plant–pathogen systems and MS-based proteomic techniques is comparable to that of other conventional approaches that have successfully characterized plant–pathogen interactions to date. Expression profiling at the protein level represents the core of current proteomic approaches. However, the majority of comparative proteomics studies detected and identified only a handful of responding proteins (Rampitsch & Bykova, 2012). The most successful studies, which identified dozens and even hundreds of proteins, involved subcellular fractionation and LC separation in addition to gel electrophoresis. These studies reported that many plant proteins were differentially expressed upon pathogen attack or elicitor challenge, but only a few responding proteins were identified on the pathogen side. One reason for the difficulty in studying protein expression of the invading pathogen is the lack of established methodologies for recovering pathogens and/or enriching pathogen proteins postinfection. Another important observation is that the transcript (mRNA) levels of some genes did not correlate with the protein expression pattern, and some proteins were found to be regulated at the translational and posttranslational level. Based on proteomics-derived information, our current understanding of the biological processes occurring during plant–pathogen interaction remains rudimentary. More in-depth analysis involving the spatial and temporal distribution of responding proteins and metabolites will help elucidate the details of pathogen invasion strategies and the complex interplay between pathogen and host. Although many proteins were found to be differentially expressed either in the plant or in the pathogen postinfection, the roles of individual proteins and the mechanisms involved in a particular disease have to be validated by further experiments.

In spite of these challenges, good progress has been made in a number of crop-pathogen systems (Quirino et al., 2010). Most preliminary studies, especially those on poorly characterized interactions, employ 2-DGE coupled with MS/MS and homology-based matching. These studies tend to report increased expression of antioxidant enzymes, fungal cell-wall degrading enzymes, pathogenesis related (PR) proteins, and certain metabolic enzymes (e.g., glyceraldehyde-3-phosphate dehydrogenase). There are many examples of such published studies, for example barley-F. graminearum (Yang et al., 2010), pea-downy mildew (Peronospora viciae) (Amey et al., 2008), and rapeseed-blackleg (Leptosphaeria maculans) (Sharma et al., 2008). Although this indicates that diverse plant pathogen systems share common response pathways, it also highlights some of the limitations of the 2-DGE approach when applied to whole, unenriched tissues. More recently, gel-free techniques such as quantitative MDLC have been applied. For example, Marsh et al. (2010) used iTRAQ to study the response of susceptible V. vinifera (grape) to powdery mildew caused by Erysiphe necator. Their results indicated that the plant was able to mount a basal defense response, but could not overcome the avr gene products, thus leading to disease. A study of bean (P. vulgaris) inoculated with either virulent or avirulent races of bean rust (Uromyces appendiculatus), conducted by multidimensional chromatographic separation of the proteome and quantification by spectral counting, indicated that proteins of the basal response did not accumulate to higher levels in the virulent pathogen interaction compared to the avirulent pathogen interaction (Lee et al., 2009). The authors concluded that basal and R-gene-mediated responses occur together and that in the resistant cultivar the R-gene products repair the basal defense system, which is inherently strong, rather than acting independently of the effector-mediated response (Lee et al., 2009).
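Quantification by spectral counting can be made concrete with a small sketch. It uses the normalized spectral abundance factor (NSAF), a commonly used length-corrected normalization, which is an assumption made here for illustration and not necessarily the normalization applied by Lee et al. (2009). Protein names, spectral counts, and lengths are hypothetical, and the code assumes every protein has a nonzero count in both conditions.

```python
import math


def nsaf(spectral_counts, lengths):
    """NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j) for one sample."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def log2_ratio(nsaf_a, nsaf_b, protein):
    """Log2 abundance ratio of one protein between two samples."""
    return math.log2(nsaf_a[protein] / nsaf_b[protein])


# Hypothetical protein lengths (residues) and spectral counts per condition.
lengths = {"PR1": 160, "GAPDH": 340, "RBCL": 480}
avirulent = {"PR1": 32, "GAPDH": 34, "RBCL": 96}
virulent = {"PR1": 8, "GAPDH": 34, "RBCL": 96}

nsaf_avirulent = nsaf(avirulent, lengths)
nsaf_virulent = nsaf(virulent, lengths)
for protein in lengths:
    ratio = log2_ratio(nsaf_avirulent, nsaf_virulent, protein)
    print(protein, round(ratio, 2))
```

Dividing by protein length corrects for the tendency of longer proteins to yield more spectra, and dividing by the sample total makes values comparable between runs. Replicates and a statistical test would be needed before calling any protein differentially abundant.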

All of the important “calorie crops” are monocots, and all are susceptible to fungal pathogens, among which the rusts and powdery mildews occupy an important place. Although much can be gained from studying Arabidopsis pathology, it is a dicotyledonous plant with no rust pathogens. On the other hand, the oomycetes Albugo laibachii and A. candida may prove themselves suitable models (Thines et al., 2009). Progress in crop-rust, crop-mildew, and crop-oomycete proteomics would certainly benefit from research conducted in a robust model system. Numerous proteomics studies using non-crop model systems have contributed tremendously to discovering response patterns in plant–pathogen interactions, which have to be further validated in plants with market importance.

D. Crop Improvement Against Abiotic Stress: Finding a Way Out

Understanding the crop response to climate variations has always been a focal theme for agriculturists, agro-meteorologists, plant biologists, and environmentalists, as it directly affects food security (Porter & Semenov, 2005). In search of higher yield and better grain quality, breeders have developed numerous cultivars through conventional breeding programs. Diverse studies during the last two to three decades have revealed that these cultivars are not robust enough to cope with the present rate of climate change. Wheat production must continue to increase by at least 2% annually until 2020 to meet future demand, but the major components of climate change [like increases in ozone (O3), carbon dioxide (CO2), ultraviolet-B (UV-B), etc.], increasing drought and salinity, changing soil nutritional dynamics, etc., do not seem to let that happen (Cho et al., 2011; Zargar et al., 2011). Keeping this realistic picture in mind, Dr. Norman Borlaug concluded that the next era of crop biotechnology would be governed by identifying genotypes that could maximally exploit the future environment for yield enhancement vis-a-vis improved stress-tolerance traits (Reynolds & Borlaug, 2006). Engineering crops for the future requires a basic understanding of the detailed network of induced biomolecular changes in different genotypes of present crops (Ainsworth et al., 2008). Plant acclimation to stress is associated with profound changes in proteome composition. Since proteins are directly involved in the plant stress response, proteomic studies can contribute significantly to unraveling the possible relationships between protein abundance and plant stress acclimation. Protein response pathways are not always shared by different plant species under various stress conditions. However, studies have shown that the major damage to the photosynthetic machinery and primary metabolism pathways is quite similar in all plants under various stresses (Cho et al., 2011; Zargar et al., 2011).

1. New Hints From the Proteome of Resurrection Plants

Climate variability in Southern Africa poses a direct danger to food security in the region, particularly with respect to maize, which is relatively sensitive to drought (Tschirley et al., 2004). Vegetative desiccation tolerance is a specific trait found in certain species of bryophytes, lichens, ferns, and in a small group of angiosperms known as “resurrection plants” (Gaff, 1989; Oliver, Wood, & O'Mahony, 1998). Resurrection plants can tolerate more than 95% loss of their cellular water during dehydration, remain in this state for extended periods, and then regain full metabolic activity upon rehydration (reviewed in Farrant & Moore, 2011).

To date, few proteomics studies of resurrection plants during dehydration and rehydration have been reported (Ingle et al., 2007; Jiang et al., 2007; Abdalla, Baker, & Rafudeen, 2010; Oliver et al., 2011). Furthermore, there are few studies in which functional assessment, using biochemical, physiological, and structural studies of proteins, is performed to characterize the roles of such products in planta (reviewed in Moore et al., 2009; Farrant & Moore, 2011). Here, we discuss proteomics studies conducted on one indigenous (South Africa) monocotyledonous Xerophyta (Velloziaceae) species, namely X. viscosa, as an example. Proteomics of X. viscosa leaf tissue revealed marked changes in protein expression in two phases (Ingle et al., 2007): the first occurring upon drying to 65% relative water content (RWC) and the second, more dramatic change occurring when leaves were dried to 35% RWC (Table 3). These stages (referred to as “early” and “late” stages of protection) correspond to physiological and biochemical changes in Xerophyta species at similar RWCs (Illing et al., 2005; Farrant & Moore, 2011). Differentially accumulated proteins included proteins involved in antioxidant metabolism, PSII stabilizers, chaperonins, and RNA-binding proteins.

Table 3. Identified proteins

Proteomics analysis of the X. viscosa nucleus (Abdalla, Thomson, & Rafudeen, 2009; Abdalla, Baker, & Rafudeen, 2010) followed because of the importance of this organelle in gene expression and signaling responses (Table 3). The nuclear protein profile of the late dehydration stage was analyzed by a 2-DGE-based approach (Abdalla, Baker, & Rafudeen, 2010). MS analysis of eighteen differentially expressed nuclear protein spots resulted in the identification of proteins associated with gene transcription and regulation, cell signaling, molecular chaperone and proteolysis type activities, protein translation, and energy metabolism, as well as novel proteins. This finding suggests that expression of appropriate stress–response proteins in the nucleus was sufficient to protect the cellular structures during dehydration and in the dried state.

This resurrection plant, although lacking a sequenced genome, is an ideal source of novel drought-inducible proteins that might be exploited to improve drought tolerance of crop plants (Moore et al., 2009; Farrant & Moore, 2011).

2. Protein Oligomerization

Protein oligomerization is one of the PTMs (Witze et al., 2007). Oligomeric proteins—unless they exist solely or mostly as oligomers—come to exist under the influence of changed concentration, temperature, pH or through other stimuli such as binding to small or large molecules (nucleotides included) and other PTMs such as phosphorylation or glycosylation, etc. (Ali & Imperiali, 2005).

More than 60% of proteins function in the cell in some kind of a complex with themselves or other proteins. However, what is important to be considered here is conditional change in the oligomerization status of a particular protein that leads to change in function or activity. Abiotic stresses such as drought, heat, high light, chilling, flooding, and salinity quickly change the factors regulating in vivo protein oligomerization, such as temperature, pH, redox status, water content, nutrients, and enzyme substrates. Therefore, before transcriptional responses start and before major energy requirement/consumption changes are noticed, protein oligomerization may be one of the quickest responses to stress conditions, which can change the active enzyme and chaperone complement of the proteome.

For conventional breeding or genetic engineering-mediated improvement in crop plants towards food security issues in the future, the main emphasis will lie on abiotic stress tolerance or yield under various abiotic stresses in general and under drought and salinity in particular. This is especially true for rice because the climate change scenarios predict these two abiotic stresses to be the main reasons for decline in rice yields in future. Hence a comprehensive understanding of plant response to such stress conditions is imperative. Factors such as differential oligomerization mediated changes in the enzyme complement can only be captured through targeted and advanced proteomic approaches, including the study of redox proteomes. Unless methods and protocols are standardized for capturing differential oligomerization of proteins, especially those that do not necessarily change at the transcript level under stress, a part of our understanding of plant response and hence our capacity to modulate it may remain incomplete.

E. Postharvest Proteomics

The burden of meeting the increasing food demand has been placed mainly on agriculture through crop improvement. While crop improvement undoubtedly has a pivotal role to play in meeting this demand, broader strategies can also play a significant role. For example, especially in developing countries, postharvest losses account for a significant share of total production (Floros, Newsome, & Fisher, 2010), mainly due to inadequate postharvest management and processing practices. The predicted need to double food production by 2050 can probably be partially offset by complementary strategies, such as management of postharvest losses and use of appropriate food processing techniques. In this section, we focus on current applications of postharvest proteomics to understand and reduce postharvest produce losses. The application of proteomics in postharvest research dates back several years. After harvest, horticultural crops are constantly exposed to stresses of different nature (e.g., mechanical, physical) (Gómez-Galindo et al., 2007) during handling and transport from the centers of production throughout the whole food supply chain. To extend shelf life, many crops are stored in controlled and modified atmospheres (Pedreschi et al., 2010). But postharvest physiological problems cannot be ruled out (Casado-Vela, Selles, & Bru Martinez, 2005). These physiological problems result in huge food and economic losses, which are due to improper handling during harvest and postharvest management. Proteomics has been a useful tool to understand physiological disorders in pears, apples, peaches, and citrus fruits (Lliso et al., 2007; Pedreschi et al., 2007, 2009; Nilo et al., 2010).

There is awareness in the postharvest community that the only way to understand physiological disorders and other postharvest physiological events affecting quality is through the application of holistic approaches. This would allow early decisions on how to manage the product (e.g., early sale, food processing, etc.). Given the biological complexity involved in understanding the different events for example during ripening (e.g., respiratory changes, volatile compound changes, and ethylene synthesis), and postharvest management (e.g., gas and mass transfer events, temperature control), a systems biology approach (Hertog et al., 2011) is recommended to integrate not only physiological data but biophysical data and models. The potential of systems biology in postharvest applications such as regulation of processes and responses, quality prediction, plant improvement, and virtual crop has been recently emphasized (Hertog et al., 2011).

F. Food Proteomics: Food Analysis and Traceability

Nutrigenomics and nutrigenetics are two emerging disciplines (Bagchi, Lau, & Bagchi, 2010) aiming to elucidate how nutrients modulate gene expression, protein synthesis, and metabolism. The difference between the two is subtle but not trivial. Whereas nutrigenomics investigates the impact of nutrients on gene regulation, nutrigenetics studies the effect of genetic variations on individual differences in response to specific food components. In other words, the latter discipline should, in a way, resemble personalized medicine, a future goal of physicians in tailoring drug dosage and drug types to the individual genetic background, as modulated, for example, by single nucleotide polymorphism (SNP). Both disciplines have resulted in the commercial launch of “Nutraceuticals” (a neologism combining the words nutrients and pharmaceuticals) and “functional foods” that could regulate health effects based on individual genetic profiles. Such nutraceuticals should contribute to the prevention of diseases, such as cancer, cardiovascular disease, obesity, and type II diabetes. Japan is well advanced in studies on “functional food,” that is, those foods that could improve human health. The Ministry of Health, Labour and Welfare of Japan has already approved 820 such products in total and allowed them to carry labels that make health claims. These health foods have been divided into 11 categories, ranging from modulation of gastrointestinal conditions to cholesterol levels, blood pressure, and the like. These authors have explored a number of these foods via Gene Chips and report specific modulation of several genes after intake of the various functional food categories, ranging from soy protein isolates to sesame seeds, cocoa, royal jelly, and the like. Many people might find in such studies information that could be useful in treating their life-long minor dysfunctions.

Curiously, in many recent scientific reports on food analysis, proteomic data are scarce, most of the work being focused on metabolites, small molecules mimicking drug function, and aromas. Only recently has proteomic science been applied to the exploration of protein components (especially those present in traces) in various types of dietary products. This section will offer a survey of such work, with the proviso that it will be limited to the application of combinatorial peptide ligand libraries (CPLLs) to selected foodstuff and beverages in search of trace species that might affect (positively or negatively) human health. Our selection of the CPLL technique is entirely based on a literature survey, in which this technique was found to be one of the most promising among the enrichment techniques for identifying low-abundance proteins, digging deeper into proteomes, and handling a wide range of biological samples derived from different organisms, including plants. This statement does not necessarily mean that other proteomic techniques have not been utilized for studying food proteins, nor is it in any way publicity for CPLL. Moreover, space limitation is another constraint.

Combinatorial peptide ligand library (CPLL) is the most recent sample treatment process (Fig. 4) that, from an initial stage of curiosity, is today largely used for the detection of very-low-abundance proteins from a variety of biological extracts. Vast literature on CPLL (reviewed in Righetti et al., 2006; Guerrier, Righetti, & Boschetti, 2008; Righetti & Boschetti, 2008; Righetti, Fasoli, & Boschetti, 2011) indicates the potential of CPLL as an emerging new tool for proper identification of the very low level of food contaminants, as per the requirements of the regulatory agencies.

Figure 4.

As little as 1 µg/L of casein added as fining agent can be efficiently detected in white wines via capture with combinatorial peptide ligand libraries. MM: molecular weight standards. Tracks: 1 and 2: 2 and 5 µg casein standards; 3: 1 µg casein detected in a fined wine from the Veneto Region (Soave). SDS–PAGE gel, silver stained (modified from Cereda et al., 2010, by permission).

1. Alcoholic Beverages

In modern times, it has become customary to fine wines so as to remove residual grape proteins that might flocculate during storage. Among the fining agents, one of the most popular is casein derived from bovine milk. However, caseins are also known as major food allergens and therefore, according to Directive 2007/68/EC of the European Community (EC), “any substance used in production of a foodstuff and still present in the finished product” must be declared on the label, especially if it originates from allergenic material. Wine producers have never honored this directive, on the grounds that any residual casein would be below the detection limit set by the EC (200 µg/L via indirect ELISA assay). Via the CPLL technique, Cereda et al. (2010) and D'Amato et al. (2010) have been able to detect such residual casein down to as low as 1 µg/L. Thus, it turns out that, if wines have been treated with casein (or with egg albumen), residues of such additives will always be there and detectable via CPLLs, something that EC regulators should be aware of. This is what can be found in “treated” wines, but what would happen with “untreated” wines? The exploration of the global proteome of wine products would have a quite ambitious aim: to see if, by assessing the global content of a given wine from a producer, one could obtain a proteomic signature (proteo-typing) that might enable its identification against counterfeit products. Especially in the case of “grands crus,” counterfeit products are reported more and more frequently to invade the market, with severe damage for both producers and consumers.
This has been attempted with a Recioto (a dessert wine produced in the Veneto region in Italy with Garganega grapes): 106 unique gene products could be typed via CPLL capture (D'Amato et al., 2011a), something unique considering that in untreated Champagne (from Reims, France) only nine species could be identified (Cilindre et al., 2008) and, in a Chardonnay white wine from Puglia, Italy, just 28 glycoproteins could be detected (Palmisano, Antonacci, & Larsen, 2010). Whether or not some 100 or so proteins might enable us to distinguish different crus remains to be seen. At the moment, research is progressing on typing of Champagne made with single grapes versus Champagne produced with three grape varieties (Righetti, Cilindre et al., work in progress). By the same token, the beer proteome has been explored via the CPLL methodology, permitting the identification of no less than 22 barley proteins, two maize proteins (this lager beer had been produced with a mixture of the two grains), and 40 S. cerevisiae proteins (Fasoli et al., 2010), the latter present in minute traces and thus escaping detection by conventional techniques.

2. Non-Alcoholic Beverages

Supermarket shelves, especially in the USA, are heavily colonized by a huge variety of non-alcoholic beverages claiming the presence of any possible plant or fruit extract. Yet, nobody has ever analyzed their proteome content in order to see if such beverages contain at least traces of the plant material from which they are claimed to be produced. Just as an example, ginger drinks (including ales) can be found with an incredible variety of brands and names. Yet, would they contain any ginger root extract? Two types of drinks, very popular throughout the Mediterranean countries, have been recently analyzed with the CPLL technique: almond milk and orgeat syrup (Fasoli et al., 2011). In the first product, 137 unique protein species were identified. In the second beverage, a handful of proteins (just 13) were detected, belonging to a bitter almond extract. In both cases, the genuineness of such products was verified, as well as the fact that almond milk, judging from the total protein and fat content, must have been produced with 100 g ground almonds per liter of beverage, as required by food authorities. On the contrary, cheap orgeat syrups produced by local supermarkets and sold as their own brands were found not to contain any residual proteins, suggesting that they were produced only with synthetic aromas and no natural plant extracts. By the same token, a commercial coconut milk beverage was analyzed via CPLLs (D'Amato, Fasoli, & Righetti, 2012). A grand total of 314 unique gene products could be listed: 200 discovered via CPLL capture, 146 detected in the control, untreated material, and 32 species in common between the two sets of data. This unique set of data could be the starting point for nutritionists and researchers involved in nutraceuticals for elucidating some proteins responsible for the cornucopia of unique beneficial health effects attributed to coconut milk.
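The coconut milk tallies quoted above follow simple inclusion-exclusion, which also shows how many species were seen only after enrichment. The following is a minimal sketch in Python; the variable names are illustrative, and only the three counts come from the text.

```python
# Counts quoted in the text for the coconut milk beverage
# (D'Amato, Fasoli, & Righetti, 2012).
cpll_capture = 200  # gene products found after CPLL capture
control = 146       # gene products found in the untreated control
shared = 32         # gene products common to both data sets

# Inclusion-exclusion: size of the union = |A| + |B| - |A and B|
total_unique = cpll_capture + control - shared

# Gene products seen only after CPLL enrichment, and only in the control
cpll_only = cpll_capture - shared
control_only = control - shared

print(total_unique, cpll_only, control_only)
```

Under these counts, more than half of the listed gene products appear only in the CPLL-treated sample, which is consistent with the enrichment role described for the technique.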

Perhaps one of the most striking results was obtained by analysis of a cola drink, produced by an English company and stated to be produced with cola nut as well as Agave tequilana extracts. Indeed, a few proteins in the Mr 15- to 20-kDa range could be identified by treating large beverage volumes (1 L) and performing the capture with CPLLs at very acidic pH values (pH 2.2) under conditions mimicking reverse-phase adsorption (D'Amato et al., 2011b). Ascertaining the presence of proteins deriving from plant extracts has confirmed the genuineness of such beverage and suggests the possibility of certifying whether soft drinks present on the market are indeed made with vegetable extracts or only with artificial chemical flavoring. As a negative control, Coca Cola was analyzed as well. It is generally believed that present-day Coca Cola is indeed a fully artificial beverage, in which neither the coca leaf nor cola nuts extracts are utilized. In fact, when looking at the following drinks at the official Coca Cola web site [Diet Coke, Sprite, Coke 20, Coke Classic, Coca Cola Classic, Coke Zero, Cherry Coke, Fresca, Diet Sprite Zero (partial list)], one can easily obtain this information: protein content, zero; and fat content, zero. An impressive parade of “zeroes,” suggesting that no vegetable extracts should be present. As a matter of fact, when applying the same procedure to one liter of Coca Cola beverage, D'Amato et al. (2011b) could not detect any trace of proteins.

In the most recent investigation, the same group of authors has analyzed the proteome of white-wine vinegar (Di Girolamo, D'Amato, & Righetti, 2011). A total of 27 unique gene products were identified. The most abundant species detected, on the basis of spectral counts, was seen to be the whole genome shotgun sequence of line PN40024, scaffold_22 (a protein of the glycosyl hydrolase family). Curiously, up to the present, no information had been available on the vinegar proteome. These authors also speculated on the possible structure and amino acid composition leading to the survival of just these 27 grape proteins in such an acidic environment (pH 2.2), which should in general lead to protein denaturation and precipitation.

3. Selected Foodstuff

Even prior to the work on beverages, CPLLs were applied to analysis of some foodstuff of daily consumption. The first one regarded egg white and was deemed a good challenge for CPLLs, due to the fact that a few proteins dominated the landscape and massively masked the signal of low-abundance species; additionally, only a dozen or so proteins had been known in this type of nutriment. By using two types of hexapeptide libraries, terminating either with a primary amine or modified with a terminal carboxyl group, D'ambrosio et al. (2008) identified 148 unique protein species. In a subsequent report, Farinazzo et al. (2009) analyzed the chicken egg yolk cytoplasmic proteome, this time by using also a third CPLL, terminating with a tertiary amine. The results were most exciting: 255 unique protein species were brought to the limelight. These two articles formed the basis of an extensive study on the interactomics of egg white and yolk proteomes (D'Alessandro et al., 2010).

In a further report, the cow's whey proteome was investigated via CPLLs (D'Amato et al., 2009). That study identified a total of 149 unique protein species, of which 100 were not described in any previous proteomics studies. A polymorphic alkaline protein, found only after treatment with CPLLs, was identified as an immunoglobulin (Ig), a minor allergen that had been largely amplified. Donkey's milk was analyzed as well. This milk is today categorized among the best mother's milk substitutes for allergic newborns, due to its much reduced or absent allergenicity, coupled with excellent palatability and nutritional value. By exploiting CPLLs, and treating large volumes (up to 300 mL) of defatted, decaseinized (whey) milk, Cunsolo et al. (2011) have been able to identify 106 unique gene products. Due to poor knowledge of the donkey's genome, only 10% of the proteins could be identified by consulting the database of Equus asinus; the largest proportion (70%) could be identified by homology with the proteins of Equus caballus.

G. Food Safety: Hints From Proteomics

The uniqueness of qualitative/quantitative protein biomarkers might also become pivotal in food testing to determine both food safety and authenticity. On one hand, there is interest in knowing which product has been used to produce a specific food, and thus in identifying the production origin of a food of certified and guaranteed “controlled origin.” On the other hand, there is concern to evaluate the edibility of the product through biochemical assessment of its purity, from both a chemical and a microbiological standpoint. As these two concepts are intimately intertwined, proteomics might provide valuable shortcuts to address both interests.

The recent epidemic of mutant E. coli, which affected the North of Germany and almost paralyzed vegetable commerce within Europe at the beginning of June 2011, might represent a warning sign. Nonetheless, big strides in the application of MS to microbiology are driving an ongoing revolution that might spread its benefits to the food safety endeavor as well: the introduction of Bruker Daltonics' Matrix Assisted Laser Desorption/Ionization (MALDI)-Biotyper (Seng et al., 2009). This technology allows for rapid and accurate identification of bacteria and other microorganisms (and region-specific substrains) cultured from routine clinical samples through the identification of species-specific proteins (Fig. 5). Its application to the field of food safety might prove more than a suggestive perspective, helping to reduce the risks arising from the consumption of unsafe food. Quality control becomes a founding principle in the food industry when it comes to GM organisms. Over recent years, it has become clear that food and feed plants carry an inherent risk of contaminating our food supply (Ahmad et al., 2010). The current procedures to assess the safety of food and feeds derived from modern biotechnology include the investigation of possible unintended effects. To improve the probability of detecting unintended effects, proteomics has been utilized as a complementary analytical tool to the existing safety assessment; details are given under the subtitle “Assessing the Equivalence of Genetically Modified Crops.”

Figure 5.

Workflow of bacteria identification through the MALDI Biotyper technology. Food safety (A). Manual or automated cultures can be performed starting from food products (B). Cells are then analyzed through MALDI-based MS platforms (C), such as MALDI Biotyper, which associates mass spectra to each bacterium strain and/or sub-strain (D). The concept underpinning this approach is that a peculiar MALDI spectrum profile is associated to each bacterium strain, in a fingerprint like fashion (E), thus it is possible to ultimately identify (F) bacteria through a rapid application of this workflow. Automation of each and every phase of the workflow allows for increasing confidence of the identification of the bacterium strain in a fast and reliable way.
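The fingerprint-matching idea in steps (D) to (F) can be illustrated with a minimal Python sketch. It is not the Biotyper algorithm, whose scoring is not described here: the binning scheme, the 2,000 to 20,000 m/z window, the cosine score, the strain names, and all peak values below are assumptions made purely for illustration.

```python
import math

def bin_spectrum(peaks, bin_width=10.0, mz_min=2000.0, mz_max=20000.0):
    """Convert (m/z, intensity) peaks into a fixed-length intensity vector."""
    n_bins = int((mz_max - mz_min) / bin_width)
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            vector[int((mz - mz_min) / bin_width)] += intensity
    return vector

def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors (0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def identify(query_peaks, reference_library):
    """Return the best-matching reference name and its similarity score."""
    query = bin_spectrum(query_peaks)
    scores = {name: cosine_similarity(query, bin_spectrum(peaks))
              for name, peaks in reference_library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical reference fingerprints (invented peak lists, not real spectra).
library = {
    "strain_A": [(4365.0, 100), (5096.0, 80), (6255.0, 60), (9742.0, 40)],
    "strain_B": [(3445.0, 90), (5380.0, 100), (7270.0, 50), (10300.0, 30)],
}
# Hypothetical spectrum acquired from a cultured food isolate.
query = [(4366.0, 95), (5097.0, 85), (6254.0, 55), (9741.0, 35)]

best, score = identify(query, library)
print(best, round(score, 3))
```

Binning tolerates small mass shifts between runs; a production system would add peak picking, calibration, and a score threshold below which no identification is reported.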

H. Assessing the Equivalence of Genetically Modified Crops

Improving agronomic traits of crops has been a major objective of conventional breeding and modern biotechnology. Plant varieties derived from conventional methods of plant breeding have been commercialized for many years without premarket regulation and assessment. However, GM crops remain highly controversial, notably in Europe. The major issue of concern is the safety of the introduction of new desired traits for humans and animals. The questions arise: (i) does the introduction of genes into a crop recipient cause unintended effects? and (ii) do crops harboring foreign genes have any negative impacts on health? (Cellini et al., 2004; Ruebelt et al., 2006a-2006c; García-Cañas et al., 2011).

The concept of “substantial equivalence” is the most popular principle for safety assessment of GM crops (OECD, 1993). According to this concept, GM crops should be compared to their conventional counterparts to evaluate whether they have substantially equivalent components. A primary evaluation method is targeted analysis to uncover variations in selected analytes for each crop variety. However, targeting only some specific analytes does not expose all the unintended effects caused by transgenes (Millstone, Brunner, & Mayer, 1999). Proteomics has emerged as a very useful technique for evaluating unintended effects (Kuiper, Kok, & Engel, 2003; Barros et al., 2010; Herrero et al., 2012). Proteins are of great interest in food or feed safety assessment because of their involvement in metabolism and cellular development. In some cases they can also negatively impact human or animal health, behaving as toxins, anti-nutrients, or allergens. Accordingly, investigation of the proteome would increase the chance of obtaining more information about the unintended effects (Lovegrove, Salt, & Shewry, 2009).

2-DGE combined with MS is the most widely used technique in proteomics aimed at GM food safety. Protein profiles of seeds of 12 transgenic Arabidopsis lines expressing different transgenes and their parental line were found to be substantially equivalent (Ruebelt et al., 2006a-2006c). In another study, no consistent proteome differences were found in twelve transgenic Arabidopsis lines expressing the bar gene either (Ren et al., 2009). GM crops have also been widely investigated from a proteomic point of view. GM maize MON810 expressing Cry1Ab is one of the most studied GM crops in safety assessment. Coll et al. (2011) showed that protein patterns were virtually identical between maize MON810 and non-GM lines. However, it was observed that 43 proteins displayed altered abundance levels in maize MON810 (T6) compared to non-transgenic plants (WT6), which could be specifically related to the expression of the transgene (Zolla et al., 2008). Furthermore, by using 2-DGE, Batista and Oliveira (2010) documented the role of natural plant-to-plant variability in observable differences between MON810 and control maize plants. These results disclosed that some of the differences encountered between pools of plants (GM vs. non-GM plants) could be the result of a high plant-to-plant natural variability, which emphasized the importance of assessing natural variability in safety assessment. When GM pea was studied by 2-DGE combined with MS, 33 proteins were found to be differentially expressed in αAI1-containing lines compared with control lines, and 16 proteins were identified by MALDI-time of flight (TOF)-TOF (Chen et al., 2009). Many other GM plants were also subjected to proteomic analysis, such as GM wheat (Di Luccia et al., 2005), potato (Lehesranta et al., 2005), tomato, and tobacco (Corpillo et al., 2004). In contrast, there has been little 2-DGE-proteomic information on GM rice to date. Wang et al. (2008) performed a proteomic study on mature embryos of hybrid rice based on 2-DGE and MALDI-TOF-MS analyses, and identified 54 differentially expressed proteins between hybrid rice and parental lines.

Leaf proteomes of scFv(G4)-expressing tomato and scFv(B9)-expressing tobacco were compared with corresponding non-transgenic plants by 2D-differential in-gel electrophoresis (DIGE) (Di Carli et al., 2009). For the differentially expressed proteins (10 for tomato and 8 for tobacco), PCA showed no well-defined separation between transgenic plants and controls. It was concluded that the proteomic differences between transgenic and non-transgenic plants were more likely due to physiological variations. This conclusion was confirmed by another study of the effects of transgenic αAI on proteomes of two pea cultivars carried out by 2D-DIGE (Islam et al., 2009). These authors found that, even when transformed with the same gene, two transgenic cultivars showed no (or at least few) common alterations of protein profiles. Teshima et al. (2010) applied 2D-DIGE to proteomic phenotyping of natural variants in 10 varieties of rice, and showed extensive natural variability of the rice seed proteome resulting from different genetic backgrounds. Shotgun proteomics has also been used to monitor the protein profiles of natural mutant rice RCN and its wild-type control (Lee et al., 2011). iTRAQ-based shotgun proteomics was used to compare and quantify seed proteomes of transgenic rice expressing hGM-CSF and wild-type control (Luo et al., 2009).

I. Plant Biomarkers for Human Health: Examples on Food Allergens

Significant and rapid advancements have been made in the field of biomarker discovery for human health. With the phenomenal advancement in proteomics technology along with the increased potential of bioinformatics tools, biomarkers have gone from being physical measurements of a particular health or disease state to being precise molecular indicators. Moreover, whereas before individual proteins were used as biomarkers, with some degree of success, now discriminating patterns of proteins observed in a particular proteome are allowing for much earlier and accurate detection of a disease (Johann et al., 2004). A further advantage is that these multiplexed biomarkers compensate for both patient and disease state heterogeneity helping to achieve sufficient clinical efficacy. While much of the progress in molecular biomarker breakthrough employing proteomics has occurred in the fields of cancer and heart disease predisposition and detection (Srinivas, Kramer, & Srivastava, 2001; Vasan, 2006), it is also being applied successfully in the areas of plant-specific biomarkers for human health and food security. Promising applications are in the field of biomarkers for plant food allergen identification and detection.

In the past decade, there has been a marked increase in food allergies of plant origin recorded around the world, but particularly in developed countries, with approximately 5% of infants in the USA suffering from some form of food allergy (Hadley, 2006), including the well-known baker's asthma (Tatham & Shewry, 2008). These allergies can present a severe health risk due to the potential of some plant allergens to trigger life-threatening immunological reactions resulting in anaphylaxis. It is therefore essential to develop sensitive methods for identifying and quantifying potential plant food allergens, detecting the presence of trace amounts of allergens in processed food and, importantly, detecting both the native form of the allergen and the form resulting from food processing practices.

Many different plant food allergens have now been identified and characterized by MS-based proteomic analysis, including but not limited to allergens from peanuts, soybean, fenugreek, hazelnut, and wheat flour (Houston et al., 2005; Akagawa et al., 2007; Chassaigne, Norgaard, & Arjon, 2007; Chassaigne et al., 2009; Weber et al., 2009; Faeste et al., 2010). Proteomics techniques for quantification of the biomarkers have also been employed. Houston et al. (2011) directly quantitated the allergens in different soybean varieties by spectral counting. Other quantitative proteomics techniques, including DIGE, have been employed to determine the allergen biomarker variation between varieties of peanuts (Schmidt et al., 2009), to select for low allergen containing varieties.
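Label-free quantification by spectral counting, as mentioned above, can be sketched as follows. The normalization shown (the normalized spectral abundance factor, NSAF, which divides each count by protein length before scaling) is one common choice and is an assumption here; the source does not say which variant was used. The protein names, counts, and lengths are invented for illustration.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor: (SpC / L) / sum over proteins of (SpC / L)."""
    saf = {protein: spectral_counts[protein] / lengths[protein]
           for protein in spectral_counts}
    total = sum(saf.values())
    return {protein: value / total for protein, value in saf.items()}

# Hypothetical data for one sample: matched MS/MS spectra per protein
# and protein length in residues (all values invented).
counts = {"allergen_1": 120, "allergen_2": 60, "housekeeping": 50}
lengths = {"allergen_1": 600, "allergen_2": 200, "housekeeping": 500}

abundance = nsaf(counts, lengths)
for protein, value in sorted(abundance.items(), key=lambda kv: -kv[1]):
    print(f"{protein}: {value:.3f}")
```

Dividing by length matters because longer proteins yield more peptides: in this toy data set "allergen_2" has half the raw counts of "allergen_1" but the higher length-normalized abundance.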

Disclosure of the potential allergenicity of food products is essential to ensure human food safety, and this is particularly important when a new food product is introduced to the market. Detection of allergenic proteins in foods has traditionally depended on costly immunochemical techniques such as Enzyme-Linked Immuno-Sorbent Assay (ELISA). However, the reliability of this technique was poor: it was conditional on the specificity and stability of the antibodies employed, and recognition was strongly influenced by changes induced in the antigen by processing treatments (Picariello et al., 2011). Differences in antibody specificity also make it difficult to quantify the amounts of the contaminating antigen present in the food, which is essential for safe food labeling practice. Multi-allergen detection and quantification of food allergens at trace levels is the current aim (Heick, Fischer, & Popping, 2011; Heick et al., 2011), both to ensure food safety for allergic consumers and to reinforce current legislation on the subject (Johnson et al., 2011).

Much of the focus in this area has been on the identification of nut allergens, and in particular peanut allergens, in food products, due to the prevalence and seriousness of this allergy in humans and the wide use of nuts/peanuts as a source of protein in food products. LC-MS/MS has been used to confirm the presence of a major peanut allergen, Ara h1, in ice cream (Shefcheck & Musser, 2004) as well as to detect the presence of this allergen in dark chocolate (Shefcheck, Callahan, & Musser, 2006). Other peanut allergens, including Ara h2 and Ara h3/4, were detected by LC-MS/MS, employing a triple quadrupole mass analyzer, in food products such as rice crispy and chocolate-based snack foods (Careri et al., 2007). Several different nut allergens, including Ana o 2 from cashew-nut, Cor a 9 from hazelnut, Pru 1 from almond, Ara h3/4 from peanut, and Jug r 4 from walnut, were identified in cereals and biscuits employing LC–linear ion trap MS/MS (Bignardi et al., 2010). Capillary LC/Q-TOF (MS/MS) was also determined to be a valid approach to detect peanut allergens in processed peanuts (Chassaigne, Norgaard, & Arjon, 2007). Multiallergen detection of seven allergenic foods (five of plant origin) was feasible through shotgun and targeted SRM MS-based approaches (Heick, Fischer, & Popping, 2011; Heick et al., 2011).
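Targeted SRM methods such as those cited above start from the expected precursor m/z of a marker peptide. The minimal Python sketch below computes that value from standard monoisotopic residue masses. The sequence "PEPTIDE" is a placeholder, not an allergen marker peptide from the cited studies, and modifications and fragment-ion (transition) selection are deliberately left out.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of the charge-carrying proton

def monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER

def precursor_mz(sequence, charge):
    """m/z of the [M + zH]z+ ion for the given charge state."""
    return (monoisotopic_mass(sequence) + charge * PROTON) / charge

# Placeholder sequence used only to exercise the functions.
peptide = "PEPTIDE"
for z in (1, 2):
    print(peptide, z, round(precursor_mz(peptide, z), 4))
```

In a real SRM assay, the precursor m/z is paired with selected fragment-ion m/z values to define the transitions monitored on a triple quadrupole instrument.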

The application of proteomics to food allergy biomarker discovery and detection provides cost-effective and sensitive techniques that will help to improve food allergy diagnosis, therapy, and allergenic risk assessment.

V. HOW WE ARE PIPING TODAY: AN ORGANIZATIONAL AND COMMUNITY-BASED APPROACH

Active involvement of organizations (scientific, government, and non-government organizations) is needed to tackle the threatening problem of global food security and safety. In the following section, we discuss organizational approaches that deal with food security issues within their own capabilities.

A. International Bodies Related to Food Security Issues

As the food security crisis is a trans-disciplinary social problem, the premier organizations directly working in this field are mainly societal in framework, that is, inter- and intra-governmental organizations and non-governmental organizations. The largest among them is the United Nations (UN), which is dedicated to confronting the challenges of the food security crisis at a worldwide level through numerous organizational approaches, such as the World Food Programme (WFP), World Bank (WB), Food and Agriculture Organization (FAO), and World Health Organization (WHO). There are several other initiatives too, and the majority of them are non-governmental in nature, such as the Rockefeller Foundation, William and Flora Hewlett Foundation, and Bill and Melinda Gates Foundation.

The Consultative Group on International Agricultural Research (CGIAR) established a unique worldwide network of agricultural research centers that coordinate and collaborate on activities towards the improvement of global agriculture. The first two institutes established by CGIAR were the International Maize and Wheat Improvement Center (CIMMYT; http://www.cimmyt.org/) and International Rice Research Institute (IRRI; http://www.irri.org/). These organizations have specific scientific interests, but their impact on global food security is commendable, starting with the Green Revolution innovations of the late 20th century that reduced the fraction of the world's hungry from half to less than a sixth, even as the population doubled from 3 to 6 billion. Rice is the most important food crop of the developing world, and Asia accounts for 90% of global rice production. Over the past 50 years, the research centers of IRRI have worked continuously and successfully to improve the quantity and quality of rice production worldwide. In an opportune move, in November 2010 IRRI launched a program called the Global Rice Science Partnership (GRiSP; http://irri.org/our-science/global-rice-science-partnership-grisp), an exemplary, trail-blazing initiative that coordinates rice research at a global level for optimal inputs and maximal gains towards enhancing rice productivity for food security and poverty alleviation. The core goal of GRiSP is captured in Figure 6, which illustrates basic and applied research, development, extension, and policy promotion centered around the three pillars of sustainability: economic, environmental, and societal benefits. The International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) is another organization that conducts agricultural research for development in Asia and sub-Saharan Africa with a wide array of partners throughout the world.
The World Vegetable Center, previously known as the Asian Vegetable Research and Development Center (AVRDC), is the international organization specifically dedicated to vegetable research and development. Two organizations working towards agricultural biodiversity and tropical agriculture are also worth noting, namely Bioversity International (http://www.bioversityinternational.org/) and CIAT (International Center for Tropical Agriculture, http://www.ciat.cgiar.org/Paginas/index.aspx).

Figure 6.

Core goal of GRiSP enabling farmers to enter a virtuous circle (adapted from the GRiSP document by Dr. Berta Miro; http://www.ciat.cgiar.org/cgiar/Documents/GRiSP.pdf).

B. International Bodies Related to Proteomics Issues

In terms of the organizational initiatives in this specific domain, the Human Proteome Organization (HUPO) deserves to be mentioned first (www.hupo.org) (Legrain et al., 2011). Over the last 10 years, HUPO has been instrumental in formulating and initiating diverse globally coordinated programs for the sake of human health, including the Human Proteome Project (HPP) and, most recently, an initiative on model organism proteomes (iMOP) (for details, see www.hupo.org). However, the global community of plant biologists has been conspicuously absent from such initiatives until recently. For several years, the MASCP was the only global initiative in plant proteomics, albeit specifically focused on the model plant Arabidopsis (www.mascproteomics.com). While the Plant Proteomics in Europe (EuPP) group, under the European Proteomics Association (EuPA), has been a popular and successful initiative, it is concentrated in the European region. To address this point and formulate a global platform for the plant proteomics scientific community, the INPPO (www.inppo.com) was conceived in 2008 and formally founded in 2011 with 10 initiatives (Agrawal et al., 2011, 2012b). Among the major aims of the organization, food security related issues are of top priority. We hope that in the future, INPPO-initiated projects and programs can help society to find sustainable solutions for global food security and safety (see also Weckwerth, 2011).

C. Networking: Time to Move United

United We Stand, Divided We Fall

The proverb is probably one of the oldest, but it is the ultimate truth behind human civilization. The concepts of unity and humanity induce us to organize and develop activities and approaches aimed at solving problems or issues faced by the population, whether they be scientific or societal in nature. In the previous sections, we have discussed some of the approaches that, in their own capacities, can have a significant impact on global food security and safety, and also sustain human development into the future. However, despite these initiatives, the global food crisis is increasing every day. We need to unite the existing approaches, accelerate their activities, and create a global “network” with a singular aim: global food security and sustainability (Fig. 7). In this cyber age, the meaning of “networking” has become much broader. Different social, professional, and scientific networking sites on the World Wide Web have given people an enormous virtual space, where they can share not only their data, but their insight and experiences too. To combat food security issues at an organizational level, we need more than the existing physical network, that is, a virtual network. It is true that most of the above-discussed organizations are focused on their own issues, objectives, agendas, and activities. It might not be easy to establish a physical connection, collaboration, or network between them, but a virtual network for sharing knowledge, experiences, and activities is feasible, as all of them, at specific phases, share a common issue, “food security and safety” (Fig. 7).

Figure 7.

There are multiple organizations working, each in its own capacity, toward fundamental, scientific, and societal solutions to the global food crisis. However, the increasing crisis in global food storage directly shows that this approach is not enough. We need something more, probably a network of all the organizational approaches. We believe that in this age of Internet communication, we should depend more on networking and sharing our knowledge. Details are in the main text.

VI. CONCLUDING REMARKS

The ever-growing human population and its demand for food were largely met by a “green revolution,” and all the biotechnological implications therein to improve seed quality and crop yield. These innovations, however, introduced new and critical problems, such as the isolation or dominance of high-yielding cultivars, in modern-day agriculture. A higher demand for food production has forced the so-called “exploitive” agriculture to use more irrigation, fertilizers, and pesticides, which degrades natural soil fertility and species biodiversity. As a result, new threats to future food security, sustainability, and safety have appeared. Thus, there is a need to set a long-term goal capable of maintaining a balance between human numbers and the human capacity to produce food of adequate quantity, quality, and variety. The “evergreen revolution,” a term coined by one of the pioneers of the green revolution, Dr. M. S. Swaminathan, relies on crop biotechnology.

A broader vision, able to integrate the advances in crop biotechnology, postharvest management, and smart food processing technologies, might be the answer for a “sustainable green revolution” and food security. Advancements in proteomics technologies in the past decade have seen tremendous application to pressing issues in food security, analysis, safety, and human health, as exemplified in this review. The application of proteomics approaches and their integration into systems biology approaches seem to be the goal to aim for. It is time to come together, form an interdisciplinary global network, share our knowledge, and move together toward developing an efficient strategic roadmap for securing safe and nutritious food while increasing food production without damaging either biodiversity or the environment. Working together on multiple strategies simultaneously around the globe will be critical for a new and better future.

VII. ABBREVIATIONS

ALSs: acid-labile surfactants

AQUA: absolute quantification

CF: culture filtrate

CPLLs: combinatorial peptide ligand libraries

Cy: cyanine

1-DGE: one-dimensional gel electrophoresis

2-DGE: two-dimensional gel electrophoresis

DIGE: differential in-gel electrophoresis

ECD: electron capture dissociation

ELISA: enzyme-linked immunosorbent assay

EST: expressed sequence tag

EuPA: European Proteomics Association

EuPP: Plant Proteomics in Europe

FTMS: Fourier transform mass spectrometry

HPE: high-performance electrophoresis

HPP: Human Proteome Project

HUPO: Human Proteome Organization

ICAT: isotope-coded affinity tag

iMOP: initiative on model organism proteomes

INPPO: International Plant Proteomics Organization

IPG: immobilized pH gradient

iTRAQ: isobaric tag for relative and absolute quantitation

LC: liquid chromatography

MALDI: matrix-assisted laser desorption/ionization

MAMPs: microbe-associated molecular patterns

MASCP: Multinational Arabidopsis Steering Committee Proteomics Subcommittee

MDLC: multidimensional liquid chromatography

MRM: multiple reaction monitoring

MS: mass spectrometry

MS/MS: tandem mass spectrometry

NBS-LRR: nucleotide-binding site leucine-rich repeat

NMR: nuclear magnetic resonance

OGE: off-gel electrophoresis

ORF: open reading frame

PCR: polymerase chain reaction

PMF: peptide mass fingerprints

PR: pathogenesis related

PTMs: posttranslational modifications

REMMA: reconstructed molecular mass analysis

RP: reverse phase

SCX: strong cation exchange

SDS–PAGE: sodium dodecyl sulfate–polyacrylamide gel electrophoresis

SILAC: stable isotope labeling with amino acids in cell culture

SNP: single nucleotide polymorphism

SRM: selected reaction monitoring

TSP: total soluble proteins

UPLC: ultra-performance liquid chromatography

ACKNOWLEDGMENTS

G.K.A. thanks the Japan Society for the Promotion of Science (JSPS; ID Number S-10182) for supporting his stay and research at the Plant Genome Research Unit (NIAS, Tsukuba, Japan). A.S. acknowledges financial help from the Council of Scientific & Industrial Research (CSIR), New Delhi, India, in the form of a CSIR Senior Research Fellowship during his stay and work at BHU. B.J.B. acknowledges DGAPA (grant IN212410) and CONACyT (grant 49735), which support proteomics studies in her laboratory. The research work in the laboratory of R.C. has been supported by the BBSRC (grant BB/H001948/1). R.C. and L.V.B. thank Jim Dunwell and Shridar Jambargi for providing hydroponically grown and labeled F. vesca plants. L.Z. thanks COFIN-PRIN 20087ATS57 “Food allergenes”. T.W. was supported by the Chinese transgenic project (grant 2008ZX08012-002). R.R. acknowledges the great support of Professors Yoshihiro Shiraiwa (Chairperson, Faculty of Life and Environmental Sciences, University of Tsukuba) and Seiji Shioda and Dr. Tetsuo Ogawa (Department of Anatomy I, Showa University School of Medicine) in promoting interdisciplinary research and for their unselfish encouragement. The authors acknowledge the INPPO platform for this initiative in bringing together scientists of different disciplines in constructing this review. Finally, given the vast amount of research in these disciplines and space limitations, many works could not be cited and discussed in this review.
