Recent progress in research on small post-translationally modified peptide signals in plants


  • Communicated by: Mitsuhiro Yanagida


Peptide signaling plays a major role in various aspects of plant growth and development, as has been shown in recent biochemical, genetic and bioinformatic studies. Over a dozen secreted peptides that regulate cellular functions have been recognized in plants. To become functional, these secreted peptide signals often undergo post-translational modifications, such as tyrosine sulfation, proline hydroxylation and hydroxyproline arabinosylation, followed by proteolytic processing. These ‘small post-translationally modified peptide signals’ are one of the major groups of peptide signals found in plants. In parallel with the discovery of peptide signals, specific receptors for such peptide signals have been identified as membrane-localized leucine-rich repeat receptor kinases. This short review highlights the recent progress in research on small post-translationally modified peptide signals, including our own research.


Recent biochemical, genetic and bioinformatic analyses have shown that secreted peptides are important components of intercellular signaling in plants as well as in animals. In higher plants, a number of genes encoding small secreted peptides have been identified in the genome (Lease & Walker 2006; Silverstein et al. 2007; Ohyama et al. 2008), and some candidates for intercellular signals are expected to be among their products.

From a structural point of view, secreted peptide signals can be categorized into two groups: the ones generated by the addition of complex post-translational modifications that are followed by extensive proteolytic processing, and the ones characterized by the presence of multiple intramolecular disulfide bonds. The former peptides are called ‘small post-translationally modified peptides’, and the latter peptides are defined as ‘cysteine-rich peptides’. Small post-translationally modified peptides are a structurally characteristic group of peptide signals, including phytosulfokine (PSK) (Matsubayashi & Sakagami 1996), tracheary element differentiation inhibitory factor (TDIF) (Ito et al. 2006), CLAVATA3 (CLV3) (Fletcher et al. 1999; Ohyama et al. 2009) and root meristem growth factor (RGF) (Matsuzaki et al. 2010). These peptides are characterized by the small size of mature peptides (<20 amino acids) and the presence of post-translational modifications such as tyrosine sulfation, proline hydroxylation and arabinosylation. Post-translational modification is thought to affect peptide conformation through steric interactions with the peptide backbone, thereby modulating the binding affinity and specificity of peptides to target receptors.

Here, I review the progress of recent research on small post-translationally modified peptide signals, including our own research, focusing on their structural characteristics, their physiological functions and the mechanisms of modification (Table 1).

Table 1.   List of structurally characterized small post-translationally modified peptide signals in Arabidopsis
Peptide          Function
PSK              Involvement in growth and development (pleiotropic)
CLV3             Regulation of stem cell fate in shoot apical meristem
CLE41/44/TDIF    Regulation of vascular stem cell fate
RGF1             Maintenance of root stem cell niche
PSY1             Involvement in cellular proliferation and expansion
CEP1             Involvement in root development?

Structural characteristics of small post-translationally modified peptides

Based on our own analysis, there are 979 putative genes for secreted peptides (SignalP score > 0.75) with an open reading frame size between 50 and 150 amino acids in The Arabidopsis Information Resource (TAIR7) genome annotation release (Ohyama et al. 2008). Although it is difficult to estimate the total percentage of secreted peptides that function as bioactive peptide signals, the presence of many ‘orphan receptors’ among receptor-like kinases and receptor-like proteins in Arabidopsis (Shiu & Bleecker 2001, 2003; Wang et al. 2008) suggests that there remains a substantial number of uncharacterized signals.
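
The selection criteria quoted above can be written as a simple filter. The following is a minimal sketch, assuming that a SignalP score has already been computed for each predicted protein; the record layout, the field names and the example entries are hypothetical, and the length bounds are treated as inclusive, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class PredictedProtein:
    gene_id: str          # hypothetical identifier, not a real locus
    length_aa: int        # open reading frame size in amino acids
    signalp_score: float  # precomputed SignalP score (assumed available)


def is_candidate_secreted_peptide(p, min_len=50, max_len=150, min_score=0.75):
    """Apply the thresholds quoted in the text (length bounds assumed inclusive)."""
    return p.signalp_score > min_score and min_len <= p.length_aa <= max_len


def screen(proteins):
    """Return the identifiers of all records that pass both criteria."""
    return [p.gene_id for p in proteins if is_candidate_secreted_peptide(p)]


# Made-up records for illustration only
EXAMPLE = [
    PredictedProtein("geneA", 86, 0.93),
    PredictedProtein("geneB", 420, 0.98),  # rejected: too long
    PredictedProtein("geneC", 101, 0.40),  # rejected: score too low
    PredictedProtein("geneD", 150, 0.76),
]

print(screen(EXAMPLE))  # ['geneA', 'geneD']
```

A filter of this kind only narrows the candidate list; it does not indicate which of the remaining genes encode bioactive signals.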

Accumulating evidence suggests that secreted peptide signals can be divided into two major groups based on their structural characteristics and biosynthetic pathways (Fig. 1). In general, secreted peptide signals are initially translated as precursors, after which the N-terminal signal peptide is cleaved by signal peptidases to produce propeptides. The cleavage sites of N-terminal signal peptides can be predicted by SignalP software with a high degree of accuracy (Bendtsen et al. 2004).

Figure 1.

 Pathways of secreted peptide signal biosynthesis categorized according to their structural characteristics. There are two major groups of secreted peptide signals: small post-translationally modified peptides and cysteine-rich peptides. Small post-translationally modified peptides involve complex post-translational modifications followed by extensive proteolytic processing. In contrast, cysteine-rich peptides involve multiple intramolecular disulfide bonds, and several of these peptides also undergo proteolytic processing.

Small post-translationally modified peptides are characterized structurally by the presence of post-translational modifications mediated by specific modification enzymes and by their small size (<20 amino acids), a result of proteolytic processing (Fig. 1). These peptide signals include PSK (Matsubayashi & Sakagami 1996), CLV3 (Fletcher et al. 1999; Ohyama et al. 2009), CLAVATA3/ESR-related (CLE) peptides (Cock & McCormick 2001; Ohyama et al. 2009), TDIF (Ito et al. 2006), RGF1 (Matsuzaki et al. 2010), PSY1 (Amano et al. 2007) and C-terminally encoded peptide 1 (CEP1) (Ohyama et al. 2008).

Interestingly, these peptides share common structural features: they are encoded by multiple paralogous genes whose primary products are cysteine-poor secreted peptides of approximately 70–110 amino acids that show significant sequence diversity, with the exception of the conserved C-terminal domain, which corresponds to the mature peptide domain (Fig. 2). The low sequence conservation outside the mature peptide domain implies that the regions removed by proteolysis are not under strong selection pressure; thus, genes encoding paralogous secreted peptides with similar features may encode small post-translationally modified peptides. Indeed, CEP1 and RGF1 were identified in part by in silico screening of the peptide families with these characteristics (Ohyama et al. 2008; Matsuzaki et al. 2010).
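
The contrast between a conserved C-terminal domain and a divergent remainder can be quantified in a simple way. The sketch below is an illustration under stated assumptions, not the screening procedure used in the cited studies: it compares sequences residue by residue counting from the C-terminus (no gapped alignment), and the three toy sequences are invented for the example.

```python
def identity_from_c_terminus(seqs, first, last):
    """Fraction of positions in the range first..last, counted from the
    C-terminus (1 = last residue), at which all sequences share one residue.
    Every sequence must be at least `last` residues long."""
    identical = 0
    for offset in range(first, last + 1):
        residues = {s[-offset] for s in seqs}
        if len(residues) == 1:
            identical += 1
    return identical / (last - first + 1)


# Invented toy family: divergent 10-residue region + shared 6-residue C-terminus
TOY_FAMILY = [
    "MKTLAVGQSE" + "DYHPPG",
    "MRSFLIAHNT" + "DYHPPG",
    "MAVVKGLPRD" + "DYHPPG",
]

c_terminal = identity_from_c_terminus(TOY_FAMILY, 1, 6)   # last 6 residues
upstream = identity_from_c_terminus(TOY_FAMILY, 7, 16)    # preceding 10 residues
print(c_terminal, upstream)  # 1.0 0.1
```

For real paralogous families, a proper multiple sequence alignment would be needed before computing per-column identity, because precursor lengths and domain positions vary.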

Figure 2.

 Structures and amino acid sequences of precursor polypeptides and their mature peptides of several representative small post-translationally modified peptides. Deduced amino acid sequences are shown for (A) PSK, (B) CLV3 and CLE peptides including TDIF, (C) RGF1, (D) PSY1 and (E) CEP1, and their homologs. Underlined: domains encoding mature peptides; Black highlight: identical amino acid residues; Gray highlight: similar amino acid residues.

Another major group of secreted peptide signals is the ‘cysteine-rich peptides’, which are characterized by the presence of an even number of cysteine residues that participate in the formation of intramolecular disulfide bonds (Fig. 1). This peptide signal group is represented by rapid alkalinization factor (4 cysteine residues), a peptide of 49 amino acids suggested to be involved in various aspects of plant development (Pearce et al. 2001b; Wu et al. 2007; Covey et al. 2010); stomagen (6 cysteine residues), a peptide of 45 amino acids that positively regulates stomatal density (Hunt et al. 2010; Kondo et al. 2010; Sugano et al. 2010); LUREs, which are involved in pollen guidance (Okuda et al. 2009); and epidermal patterning factors (EPF1 and EPF2), which regulate epidermal cell patterning (Hara et al. 2007, 2009; Hunt & Gray 2009).

Post-translational modifications

In plants, there have been three types of post-translational modifications identified in small secreted peptide signals to date, namely tyrosine sulfation, proline hydroxylation and hydroxyproline arabinosylation.

Tyrosine sulfation

Tyrosine sulfation is a post-translational modification for peptides and proteins synthesized through the secretory pathway of most eukaryotes, including higher plants. This modification is mediated by tyrosylprotein sulfotransferase (TPST), a specific enzyme that catalyzes the transfer of sulfate from 3′-phosphoadenosine 5′-phosphosulfate (PAPS) to the phenolic group of tyrosine (Moore 2003). Although the peptide tyrosine sulfation motif has not been completely elucidated, the presence of an aspartic acid residue N-terminally adjacent to a tyrosine residue, namely the Asp–Tyr sequence, is known to be a minimum requirement for tyrosine sulfation in plants. Sulfation can also be enhanced with the presence of multiple acidic amino acids near this tyrosine residue (Hanai et al. 2000b).
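
The Asp–Tyr requirement and the effect of nearby acidic residues lend themselves to a simple sequence scan. The following is a minimal sketch: the window of five residues on either side of the tyrosine is an arbitrary assumption, the toy sequence is invented, and a hit indicates only that the minimum requirement is met, not that sulfation occurs.

```python
ACIDIC = {"D", "E"}  # Asp and Glu


def find_asp_tyr_sites(seq, window=5):
    """Return (tyr_position, n_acidic_nearby) for each Tyr directly preceded
    by Asp. Positions are 1-based. n_acidic_nearby counts Asp/Glu within
    `window` residues on either side of the Tyr, including the preceding Asp."""
    sites = []
    for i in range(1, len(seq)):
        if seq[i] == "Y" and seq[i - 1] == "D":
            lo = max(0, i - window)
            hi = min(len(seq), i + window + 1)
            n_acidic = sum(1 for j in range(lo, hi) if j != i and seq[j] in ACIDIC)
            sites.append((i + 1, n_acidic))
    return sites


# Invented toy sequence: one Asp-Tyr site in an acidic context, one Tyr without Asp
print(find_asp_tyr_sites("AGDYEEDLKTYGS"))  # [(4, 4)]
```

Ranking candidate sites by the number of neighboring acidic residues is one possible heuristic; the text states only that such residues can enhance sulfation.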

As shown by our affinity purification studies using a PSY1-immobilized column, Arabidopsis TPST (AtTPST) is a 62-kDa transmembrane protein localized in the Golgi (Komori et al. 2009), and although it is expressed throughout the plant body, it is most highly expressed in the root apical meristem. Orthologs of AtTPST have been found in other higher plants, including rice and maize, and in the moss Physcomitrella patens, but not in yeast or animals. Although both animal and plant TPSTs catalyze identical sulfate transfer reactions using the same cosubstrate, PAPS, they have no sequence similarity. In addition, AtTPST is a type I transmembrane protein with a C-terminal transmembrane domain, whereas animal TPSTs are type II transmembrane proteins with N-terminal transmembrane domains (Beisswanger et al. 1998; Ouyang et al. 1998). These findings strongly suggest that the AtTPST and animal TPST genes evolved from separate ancestral origins through convergent evolution. Sulfated peptides play diverse roles in plant growth and development, as shown by studies of an AtTPST-deficient mutant (tpst-1), which displays a marked dwarf phenotype with stunted roots, loss of root stem cell maintenance, a considerable decrease in meristematic activity, pale green leaves, a reduction in higher order veins and early senescence (Komori et al. 2009; Zhou et al. 2010; Matsuzaki et al. 2010).

Proline hydroxylation

Proline hydroxylation is mediated by prolyl 4-hydroxylase (P4H). P4H, a member of a family of 2-oxoglutarate-dependent dioxygenases that requires 2-oxoglutarate and O2 as cosubstrates (Myllyharju 2003), is a type II membrane protein with an N-terminal transmembrane domain and is localized in both the ER and Golgi. There have been 13 P4H genes identified in Arabidopsis to date (Hieta & Myllyharju 2002; Tiainen et al. 2005; Yuasa et al. 2005; Velasquez et al. 2011). Although some sequence motifs have been reported for efficient proline hydroxylation (Shimizu et al. 2005), there has been no consensus sequence determined for proline hydroxylation of secreted peptide signals in plants.

Hydroxyproline arabinosylation

Our biochemical analysis showed that hydroxyproline residues of several secreted peptide signals are further modified with an O-linked l-arabinose chain (tri-arabinoside) via β-1,2-bonds (Amano et al. 2007; Ohyama et al. 2009). In general, O-glycosylation involves glycosyltransferase-catalyzed successive addition of nucleotide-activated sugars in the Golgi. Biosynthesis of Hyp-bound β-1,2-linked triarabinoside is suggested to involve two distinct arabinosyltransferases: one that mediates the formation of a β-linkage with a hydroxy group of Hyp (hydroxyproline arabinosyltransferase), and one that mediates the β-1,2-linkage between arabinose residues (arabinosyltransferase). Although recent chemical genetic screening results have indicated that XEG113 (At2g35610) may encode β-1,2-arabinosyltransferase (Gille et al. 2009), there have been no reports on hydroxyproline arabinosyltransferase.

Tyrosine-sulfated peptide signals

Tyrosine-sulfated peptide signals have been found in both plants and animals (for details of animal peptides, see the review by Moore 2003). To date, three tyrosine-sulfated peptide signals, PSK, PSY1 and RGF, have been identified in plants. PSK, a 5-amino-acid secreted peptide containing two sulfated tyrosines (Fig. 2A), was identified by our group as a growth-promoting signal involved in the ‘conditioning effect’ of plant cell cultures (Matsubayashi & Sakagami 1996). In general, plant cell proliferation is suppressed under low-density culture conditions. However, addition of ‘conditioned medium’ derived from a separate, rapidly growing high-density culture has been shown to promote cell proliferation, suggesting that individual cells in culture secrete a growth-promoting signal into the medium. This growth-promoting signal was purified and shown to be produced by post-translational sulfation by TPST and proteolytic processing of ≈80-amino-acid precursor peptides (Fig. 2A) (Yang et al. 1999; Matsubayashi et al. 2006). This small sulfated peptide was named PSK. It was later shown that PSK also promotes in vitro tracheary element differentiation of Zinnia mesophyll cells (Matsubayashi et al. 1999), somatic embryogenesis (Kobayashi et al. 1999; Hanai et al. 2000a; Igasaki et al. 2003) and pollen germination (Chen et al. 2000). PSK genes are widely expressed in a variety of tissues and have been found to be up-regulated by wounding (Matsubayashi et al. 2006). Furthermore, biochemical analysis showed that PSK is recognized by a membrane-localized leucine-rich repeat receptor kinase (LRR-RK), PSKR1 (Matsubayashi et al. 2002), and that disruption of PSKR1 and its two homologs in Arabidopsis causes pleiotropic growth defects, such as short roots, smaller leaves and early senescence (Matsubayashi et al. 2006; Amano et al. 2007).

The second sulfated peptide, PSY1, is a secreted glycopeptide of 18 amino acids containing one sulfated tyrosine residue (Fig. 2D) and was identified by our group through exhaustive analysis of tyrosine-sulfated peptides in plant cell culture media (Amano et al. 2007). The expression pattern and physiological activity of PSY1 are similar to those of PSK. PSY1 is expressed in various Arabidopsis tissues and promotes cellular proliferation and expansion at nanomolar concentrations.

The third sulfated peptide, RGF, is a secreted peptide of 13 amino acids that is involved in root stem cell niche maintenance in Arabidopsis (Matsuzaki et al. 2010). RGFs are produced by post-translational sulfation and proteolytic processing of precursor peptides of ≈100 amino acids (Fig. 2C). RGF1 was identified by our group through a search for sulfated peptides that could rescue the root meristem defects of tpst-1 mutants. As mentioned earlier, tpst-1 mutants show a stunted root phenotype with a loss of root stem cell maintenance and a considerable decrease in meristematic activity. This approach assumes that the tpst-1 mutant phenotype reflects a deficiency in the biosynthesis of all functional tyrosine-sulfated peptides. Importantly, PSK and PSY1, two sulfated peptide signals, promoted cell elongation in tpst-1 mutant roots but did not restore meristematic activity, which indicated that tyrosine-sulfated peptides other than PSK and PSY1 are involved in root stem cell maintenance and the regulation of meristematic activity.

Root meristem growth factor family peptides are expressed mainly in the stem cell area and the innermost layer of central columella cells and diffuse through the apoplast into the meristematic region. RGF peptides regulate root development by stabilizing PLETHORA transcription factor proteins, which are specifically expressed in root meristem and mediate root stem cell niche patterning (Galinha et al. 2007).

Proline-hydroxylated peptide signals

Hydroxyproline (Hyp) residues have been found in defense-related peptide TobHypSys (Pearce et al. 2001a), PSY1 (Amano et al. 2007), TDIF (Ito et al. 2006), CEP1 (Ohyama et al. 2008), CLV3 (Kondo et al. 2006; Ohyama et al. 2009), CLE2 (Ohyama et al. 2009) and RGF1 (Matsuzaki et al. 2010). Among these, further modification with l-arabinose occurs for Hyp residues of PSY1, CLV3 and CLE2, as will be described later.

Tracheary element differentiation inhibitory factor, a peptide of 12 amino acids with two hydroxylated proline residues (Fig. 2B), was initially identified as an inhibitory factor of transdifferentiation of dispersed Zinnia (Zinnia elegans L.) mesophyll cells into tracheary elements (the main conductive cells of the xylem) (Ito et al. 2006). The TDIF peptide promotes cell division in vitro and can suppress xylem cell development at subnanomolar concentrations. The TDIF sequence is homologous to the C-terminal 12 amino acids of Arabidopsis CLE41 and CLE44. In vivo studies have shown that the TDIF/CLE41/CLE44 peptides, which are expressed in phloem and neighboring cells, are recognized by TDR/PXY, an LRR-RK located in the plasma membrane of procambial cells, and control procambial cell fate (Hirakawa et al. 2008). This signal suppresses the differentiation of procambial cells into xylem cells and promotes their proliferation.

Another Hyp-containing peptide is CEP1, a peptide of 15 amino acids with two Hyp residues (Fig. 2E), which was initially identified by our group through in silico gene screening for the structural features of prepropeptides of known small post-translationally modified peptides (Ohyama et al. 2008). All known small post-translationally modified peptide signals are encoded by multiple paralogous genes, and the primary products of these genes are cysteine-poor secreted peptides of approximately 70–110 amino acids with significant sequence diversity, with the exception of the short conserved domains of the mature peptide sequences. CEP1 family peptides form one such group of small post-translationally modified peptides; they are mainly expressed in the lateral root primordia and significantly arrest root growth when overexpressed or externally applied. CEP1 is therefore a strong candidate for a novel peptide signal with possible involvement in root development.

Hydroxyproline-arabinosylated peptide signals

Hydroxyproline arabinosylation is a post-translational modification that is unique to plants. The first structurally characterized arabinosylated peptide signal was PSY1, a secreted glycopeptide of 18 amino acids containing one sulfated tyrosine residue (as described above) and two Hyp residues, one of which is further modified with three l-arabinose residues (Amano et al. 2007) (Fig. 2D).

The second arabinosylated peptide, CLV3, is a 13-amino-acid peptide signal that regulates stem cell fate in the Arabidopsis shoot apical meristem (Fletcher et al. 1999; Ohyama et al. 2009) (Fig. 2B). The mature CLV3 peptide was identified by our group in the culture medium of whole-plant submerged cultures of CLV3-overexpressing Arabidopsis plants. CLV3 acts as a negative regulator of stem cell maintenance by repressing WUS, which encodes a homeodomain transcription factor that is expressed in the organizing center and promotes the identity of stem cells (Fletcher et al. 1999). CLV3 is recognized by three functionally redundant receptors: CLV1 (Clark et al. 1997; Ogawa et al. 2008), the CLV2/CORYNE complex (Kayes & Clark 1998; Miwa et al. 2008; Muller et al. 2008; Guo et al. 2010) and RPK2 (Kinoshita et al. 2010). The arabinosylated CLV3 peptide interacts more strongly with CLV1 than the nonarabinosylated forms do (Ohyama et al. 2009).

CLAVATA3 belongs to a family of secreted peptides, designated CLE peptides, which possess a conserved 14-amino-acid domain (called the CLE domain) at or near their C-terminus (Cock & McCormick 2001). The mature structure of one such CLE peptide, CLE2, was determined by our group to be a peptide of 12 amino acids with three l-arabinose residues (Ohyama et al. 2009) (Fig. 2B). Interestingly, the CLE2 glycopeptide binds CLV1 with nanomolar affinity. These data suggest that CLE2 functions as a ligand for CLV1, although the physiological function of CLE2 in Arabidopsis plants has yet to be fully elucidated. In legume plants, the CLE2 orthologs LjCLE-RS1 and LjCLE-RS2 are thought to act as root-derived mobile signals that systemically regulate nodule numbers through the CLV1 ortholog, the HAR1 receptor kinase (Okamoto et al. 2009). Binding of the CLE2 glycopeptide to CLV1 strongly supports the molecular basis of this autoregulation model.

Proteolytic processing

In animals and yeasts, biosynthesis of small peptide signals often involves the proteolytic processing of precursor polypeptides to produce mature functional peptides. Examination of the primary sequences of many animal peptide signals has shown that cleavage of the precursor polypeptide occurs on the C-terminal side of paired basic amino acids. In animals, subtilisin/kexin-like prohormone convertases catalyze this cleavage (Rehemtulla & Kaufman 1992).
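
The paired-basic rule described above for animal precursors can be illustrated with a short pattern search. This is a minimal sketch under stated assumptions: basic residues are taken to be Lys (K) and Arg (R) only, every pair is reported regardless of sequence context, and the toy sequence is invented. It does not model the specificity of any particular convertase.

```python
import re

# Lookahead pattern so that overlapping pairs (e.g. "RRK") are all reported
DIBASIC = re.compile(r"(?=([KR][KR]))")


def dibasic_cleavage_sites(seq):
    """Return 1-based positions of the second residue of each Lys/Arg pair.
    Cleavage on the C-terminal side of the pair would occur after this residue."""
    return [m.start() + 2 for m in DIBASIC.finditer(seq)]


# Invented toy precursor with a KR pair and an overlapping RRK stretch
print(dibasic_cleavage_sites("AGSKRTLEPRRKGA"))  # [5, 11, 12]
```

Applied to plant precursors, a scan of this kind would be expected to be a poor predictor of the mature peptide boundary, for the reasons discussed in the paragraphs that follow.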

Proteolytic processing is also critical in plants for the biosynthesis of small post-translationally modified peptides; however, the peptide processing mechanisms differ between plants and animals. First, no paired basic amino acid motif is found adjacent to the mature peptide domain within the precursor polypeptides (Fig. 2); instead, a processing enzyme activity that cleaves on the N-terminal side of a single Arg residue of the CLV3 precursor polypeptide has been detected in crude cauliflower extract in vitro (Ni & Clark 2006; Ni et al. 2011). This suggests that the substrate specificity of plant subtilases is distinct from that of animal subtilisin/kexin-like prohormone convertases. Second, AtSBT1.1, one of the Arabidopsis subtilases, is responsible for the initial processing of PSK4 in vivo (Srivastava et al. 2008), but its processing site is upstream of the mature peptide domain, suggesting that basic amino acids may be the initial processing sites but do not always directly define the boundary of the mature peptide domain. Plant subtilases are structurally more similar to prokaryotic degrading-type subtilases than to animal processing-type subtilases, most likely due to the lack of the P-domain, which is characteristic of animal enzymes (Berger & Altmann 2000). Proteolytic processing of plant peptide signals may involve a number of complex steps, such as initial cutting and further trimming of the peptides.

How the final processing site is defined by this seemingly ambiguous proteolysis system in plants remains unclear. One possibility is that the mature peptide domain is resistant to proteolytic digestion because of the presence of post-translational modifications or the multiple Pro residues that it contains. The regulatory mechanisms of proteolytic processing of peptide signals may be more complex in plants than in animals.

Future perspective

There are several possible approaches for identifying novel small peptide signals in plants. From a structural point of view, small post-translationally modified peptide signals are produced by post-translational modification and subsequent proteolytic processing of peptides in the secretory pathway. These processes involve, at the least, a post-translational modification enzyme and a proteolytic processing enzyme. Cosubstrates are also required for post-translational modification, including the sulfate donor PAPS and the arabinose donor UDP-l-arabinose, both of which are synthesized using ATP. Thus, considerably more energy is required for the biosynthesis of small post-translationally modified peptides than for normal proteins and peptides. Nevertheless, many post-translationally modified peptides have been conserved throughout evolution, suggesting that these energy-expensive modified peptides confer physiological benefits that justify their high energy cost in plants. As such, post-translational modifications, as well as proteolytic processing, can be indicative of biologically active peptides. Indeed, a peptidomics approach targeting sulfated peptides has successfully identified a novel peptide signal, PSY1 (Amano et al. 2007), and an in silico gene screening approach has revealed a Hyp-containing peptide signal candidate, CEP1 (Ohyama et al. 2008), demonstrating that post-translationally modified peptides are good candidates for peptide signals.

Phenotypic analysis of loss-of-function mutants of post-translational modification enzymes is another approach to research on post-translationally modified peptides. Post-translational modification enzymes recognize particular sequences within multiple target peptides; thus, their loss of function should result in defective biosynthesis of modified peptides, which could reveal the presence of novel post-translationally modified peptide signals. Indeed, a novel peptide signal, RGF1, was successfully identified in this way by a search for sulfated peptides that could rescue the root meristem defects of the tpst-1 mutant (Matsuzaki et al. 2010). In this context, identification of hydroxyproline arabinosyltransferase and phenotypic analysis of loss-of-function mutants of this enzyme could facilitate elucidation of the functions of arabinosylated peptide signals in plants.


Our research was supported in part by the Funding Program for Next Generation World-Leading Researchers from the Japan Society for the Promotion of Science (No. GS025).