Physical, chemical and biological stress factors, such as microbial infection, upregulate the transcription levels of a number of plant genes, coding for the so-called pathogenesis-related (PR) proteins. For PR proteins of class-10 (PR-10), the biological function remains unclear, despite two decades of scientific research. PR-10 proteins have a wide distribution throughout the plant kingdom and the class members share size and secondary structure organization. Throughout the years, we and other groups have determined the structures of a number of PR-10 proteins, both in the crystalline state by X-ray diffraction and in solution by NMR spectroscopy. Despite the accumulating structural information, our understanding of PR-10 function is still limited. PR-10 proteins are rather small (~ 160 amino acids) with a fold consisting of three α helices and seven antiparallel β strands. These structural elements enclose a large hydrophobic cavity that is most probably the key to their functional relevance. Also, the outer surface of these proteins is of extreme interest, as epitopes from a PR-10 subclass cause allergic reactions in humans.
phenolic oxidative coupling protein from Hypericum perforatum
leucine-rich-repeat protein 1
major latex protein
type 2C protein phosphatase
steroidogenic acute regulatory protein
steroidogenic acute regulatory protein-related lipid transfer
vegetative storage protein
Plants have developed several defense mechanisms for protection against ubiquitous pathogenic activity. Among other mechanisms [1, 2], they can produce antibiotic compounds, called phytoalexins [3-8], and the expression of several genes is induced by various types of pathogens, including viruses, bacteria and fungi, or alternatively by chemicals such as ethylene and salicylic acid, which emulate a pathogen infection, thereby inducing a stress response. The induced genes encode the so-called pathogenesis-related (PR) proteins, which participate in a general defense mechanism [10, 11]. The current classification divides the PR proteins into 17 classes, PR-1–PR-17 [12, 13]. This organization clusters proteins with similar biological activity or physicochemical properties and sequence homology. These proteins do not constitute a superfamily but rather a collection of unrelated proteins commonly involved in the defense system.
Despite numerous studies, the function of some PR representatives remains elusive. In particular, the role of PR-10 members is very poorly understood. Since they are expressed when plants encounter pathogenic or environmental stresses, they are suggested to have a protective role. However, it is important to point out that some PR-10 members are also constitutively expressed, which is indicative of a more general biological role in plant development [15-17]. PR-10 proteins are coded by multigene families. This is probably the basis of the multifunctional aspect of these proteins, which arose early in evolution and have had time to acquire mutations and different functions in a process called protein promiscuity [18-20]. To date, over 100 PR-10 members have been identified in both monocotyledonous and dicotyledonous flowering plants of more than 70 species [11, 21, 22]. They are small (154–163 amino acids), slightly acidic and resistant to proteases. Generally, PR proteins either are localized in the vacuoles or are extracellular; however, the PR-10 members are generally intracellular and cytosolic [23-27]. Consequently, they are also called intracellular PR proteins, although this may not always be true [28, 29].
There is no evidence of a particular role in the plant cell system for PR-10 proteins but their conserved sequence motifs and the fact that they are spread throughout the plant kingdom suggest a general and indispensable function. PR-10 proteins are structurally unrelated to any other class of PR proteins, despite their initial description as PR-1 proteins [31-33].
The first published mention of a pr-10 gene referred to the results of treatment of parsley cell suspensions with a fungal elicitor [23, 31]. Shortly thereafter, based on ~ 50% sequence similarity, common allergens present in birch pollen were also included in the PR-10 class as a subclass (Bet v 1). Subsequently, other subclasses were reported and their members were characterized structurally. All structural information available to date will be analyzed in this review. Detailed structural characterization of a set of related proteins combined with elusive biological function is not typical in structural biology. Usually, the biochemical role is well established and only later are structural models elucidated to confirm or explain the physiological action. With the PR-10 proteins, the situation is exactly the opposite. While a large number of NMR and high-resolution crystal structures are known, some in complexes with ligands, the function of the PR-10 proteins is still under debate and requires further studies.
Several review articles have been published, including papers on the PR-10/Bet v 1 subclass analysis [89, 90] and on the expression and phylogenetic clustering of PR-10 proteins with speculations about their possible biological functions. However, a detailed overview of the structural features of these proteins has not been presented. Such a summary is very much warranted, especially since many PR-10 homologs have been characterized structurally, including the very recent structures of members of additional subclasses [92-94]. This new information has prompted us to prepare this structurally oriented review, with emphasis on possible insights into the functional aspects of this mysterious class of proteins.
PR-10 taxonomy: subclasses
The first reference to a pr-10 gene from parsley established the PR-10 class of PR proteins [23, 31]. The parsley protein and members of the same subclass are referred to as ‘classic’ PR-10 proteins.
Shortly afterwards, based on sequence homology to classic PR-10 proteins (~ 50% identity), common allergens found in birch pollen, celery, apple and other fruits and vegetables were also included in the PR-10 class. Some are induced by pathogens.
Another group of homologs, called major latex proteins (MLPs), found in the latex of some plants including opium poppy [97, 98], bell pepper, melon, raspberry, soybean, strawberry, cucumber, sugar beet, Arabidopsis thaliana, peach and ginseng, were also classified as PR-10 proteins despite low (~ 25%) sequence identity. Classic PR-10 proteins and MLPs generally do not coexist in the same plant species.
Cytokinin-specific binding proteins (CSBPs) were structurally confirmed as a PR-10 subclass despite marginal (<20%) sequence identity. Moreover, classic PR-10 proteins were found to form complexes with cytokinins [110, 111], flavonoids and brassinosteroid analogs.
Recently, there has been a surge of interest in the biochemistry of hormonal control in the plant kingdom, resulting in the discovery of receptors for abscisic acid (ABA) [113, 114] and for gibberellins [115-117], as well as in the identification of other phytohormone-binding proteins. In this context, the PR-10 proteins have emerged in the spotlight again, as their folding canon was found in the ABA receptor family known as PYR/PYL/RCAR (pyrabactin resistance/PYR-like/regulatory component of ABA response).
Two additional proteins with reported enzymatic function were also classified as PR-10 proteins, namely (S)-norcoclaurine synthase (NCS) and the phenolic oxidative coupling protein (Hyp-1) from Hypericum perforatum. NCS enzymes are involved in benzylisoquinoline alkaloid biosynthesis, catalyzing a Pictet–Spengler condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine. The proteins share 28%–38% sequence identity with classic PR-10 proteins [119, 122]. Hyp-1 was reported to catalyze the condensation of two emodin molecules to the bioactive naphthodianthrone hypericin, but recently such a reaction has been questioned [94, 123, 124]. Hyp-1 shows ~ 40% sequence identity with classic PR-10 proteins.
The PR-10 members are similar in size, total charge and to some extent amino acid sequence. The deduced sequence ranges from 154 to 163 residues with molecular masses around 17 kDa [22, 125]. Pr-10 genes usually consist of two exons interrupted by a conserved positional intron of 76–359 bp [22, 69, 77]. The open reading frame of pr-10 genes ranges from 465 to 480 bp [22, 125].
Amino acid sequence alignments of PR-10 proteins clearly show the most divergent and most conserved segments (Fig. 1). The most variable area, located at the C terminus, is particularly conspicuous in the N-terminal half of helix α3 of even very close homologs (Fig. 1B). The most conserved glycine-rich loop L4 is preserved even in distant homologs, e.g. CSBP (Fig. 1A). The glycine-rich loop with the sequence EG(D/N)GG(V/P)G(T/S), positions 45–52 in parsley PR-10.1, therefore constitutes a signature motif of PR-10 proteins.
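The signature motif quoted above can be turned into a simple sequence scan. The following is only a minimal sketch, assuming plain one-letter amino acid strings as input; the function name and the demonstration fragment are hypothetical and were made up for illustration (the fragment is not a real PR-10 sequence). A strict pattern of this kind would not be expected to detect the MLPs, whose glycine-rich loop is less conserved.

```python
import re

# Signature motif quoted in the text: EG(D/N)GG(V/P)G(T/S),
# written as a regular expression over one-letter amino acid codes.
GLY_RICH_LOOP = re.compile(r"EG[DN]GG[VP]G[TS]")


def find_glycine_rich_loop(sequence):
    """Return (start, end) positions, 1-based and inclusive, of each motif match."""
    seq = sequence.upper()
    # m.start() is 0-based; m.end() is exclusive, so it equals the
    # 1-based inclusive end position.
    return [(m.start() + 1, m.end()) for m in GLY_RICH_LOOP.finditer(seq)]


# Hypothetical fragment made up for illustration; not a real PR-10 sequence.
demo_fragment = "MGVQKSEGDGGVGTIKK"
print(find_glycine_rich_loop(demo_fragment))  # [(7, 14)]
```

For a genuine PR-10 sequence such as parsley PR-10.1, a hit would be expected at positions 45–52, as stated in the text.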
MLPs show less conservation in the glycine-rich loop. In all proteins except the A. thaliana At1g24000.1 protein, where the loop is absent altogether, the conservation is limited to GxxxxxG, where the third residue in this motif is usually a tryptophan. Nevertheless, MLP genes are interrupted by an intron, which confirms their classification as PR-10 members. Another divergence from the standard PR-10 profile is the existence of two-domain MLP homologs. For example, MLP28 has two domains that share 93% identity and are predicted to have arisen from a tandem gene duplication within A. thaliana. Its paralog MLP34 is another member with two domains. A search for cytokinin binding proteins in moss revealed a two-domain UBP34 protein with the domains displaying 45% sequence identity.
Multiple sequence alignments reveal the uniqueness of the NCS sequences due to the presence of N- and C-terminal extensions (Fig. 1A). While PR-10 proteins have been associated with cytosolic localization, the N-terminal extension of NCS is a putative signal peptide and suggests a different subcellular localization. Benzylisoquinoline alkaloid biosynthetic enzymes are known to localize with the endoplasmic reticulum, and the NCS N-terminal extension also targets this enzyme to this compartment [119, 127].
Overview of PR-10 proteins with known 3D structure
A growing number of PR-10 proteins have their 3D structure determined (Table 1). At present, two subclasses are structurally more thoroughly characterized, namely classic PR-10 proteins from yellow lupine and the Bet v 1 group (allergen subclass) from white birch. Overall, however, all the PR-10 subclasses have at least one representative with known structure. The classic PR-10 and Bet v 1 structural examples, together with the remaining structures of CSBP, MLP, NCS and Hyp-1, form a broad platform for structural comparisons.
Table 1. PR-10 protein structures in the PDB. NDSB, 3-[benzyl(dimethyl)ammonio]propane-1-sulfonate
The PR-10 fold consists of a conserved, highly curved, seven-stranded antiparallel β-sheet embracing in a palm-like grip a long C-terminal α helix (α3). Two additional consecutive short helices (α1, α2) complete the scaffold creating a hydrophobic void where most globular proteins have a hydrophobic core (Fig. 2). The β-sheet edges are formed by strands β1 and β2, which are separated in a right-handed crossover by the two short α helices. The consecutive strands β2–β7 are connected by five hairpin loops bringing β7 back into contact with β1. The presence of hairpin connections leads to an accumulation of loops at both ends of the β-sheet. Of particular importance are the odd-numbered loops L3–L9, bordering the main entrance to the internal cavity. A secondary entrance is located on the other side of helix α3, where it forms a lip with strand β1. Helices α1 and α2 form a V-shaped support for the C-terminal part of helix α3, which crosses the β-sheet scaffold from the point of its entry at L9 (Fig. 2B). MLPs show slight variations from this canonical pattern as described below.
The internal cavity is very large, up to 3900 Å³ (Table 1). Its largely hydrophobic interior is decorated with a number of polar residues. In addition to the two entrances, there are also smaller openings leading to the lumen of the pocket.
The regularity of the β-sheet is distorted by β-bulges which endow it with a baseball-glove shape. There are from three to eight conserved β-bulges in different PR-10 proteins, confirming their importance for the structure of the β-sheet. The overall geometric result is a stable, rigid and highly curved β-sheet (Figs 2B and 3A).
The two MLP structures known so far, MLP28 and At1g24000.1, have different β-sheet organization. MLP28 lacks strand β2 (i.e. it has a six-stranded β-sheet) due to the presence of a long, flexible loop that connects helix α1 with strand β3, thereby eliminating helix α2 from MLP28 topology (Fig. 2A). The differences between At1g24000.1 and other PR-10 homologs are even more prominent, due to a unique sequence profile where 33 residues following α1 are missing, eliminating helix α2 and strands β2 and β3; At1g24000.1 therefore contains a five-stranded β-sheet (Fig. 1A).
The C-terminal α helix α3
In contrast to the structurally invariant β-sheet, the C-terminal α helix shows large conformational variability despite being a key element of the PR-10 fold. When all PR-10 structures in the Protein Data Bank (PDB) are compared, it is evident that the conformation and coiling of helix α3 differ among the homologs (Fig. 3A).
The most pronounced differences can be summarized as (a) a different ‘angle of entrance’ linked with variable conformation of the connecting loop L9, (b) bending in the middle of the helix (up to 60°) leading to its collapse into the cavity, (c) axial rotation, which can be gauged by a well-conserved C-terminal aromatic residue (Tyr148 in LlPR-10.1A, Fig. 1), and (d) sliding along its axis. It has been postulated that the bending deformations result from conformational changes that occur in order to accommodate/release ligands, and an interesting example of this is noted in a comparison of the LlPR-10.2A/LlPR-10.2B pair. The coil sliding effect is possibly generated by the insertion of additional residues in the connecting loop L9, which is visible in pairwise sequence alignments of Bet v 1 with LlPR-10.1A or with Gly m 4. Another difference is found at the C-terminal helix of NCS, which is longer and divided into two independent sections connected by an extended five-residue stretch (Fig. 1A).
Helices α1 and α2
The two short helices α1 and α2 close the hydrophobic cavity defined by the remaining structural elements, i.e. the β-sheet and the C-terminal helix α3. They are present in all homologs of known structure (Fig. 3A) except for the MLPs where the α2 helix is absent. Most PR-10 homologs have a highly conserved proline residue near the beginning of helix α1 and one or two around the end of helix α2.
The internal cavity
The most intriguing feature of the 3D structure of PR-10 proteins is the internal cavity, existing despite a relatively compact fold and short sequence. The cavities are in general hydrophobic but some polar residues are also pointing into their lumen. The cavity can be reached by several openings. Not all proteins have similar entrances, and it is not completely clear if the differences reflect different gating properties or are consequences of crystal packing. The main entrances localize between α3 and the wedge of loops L3, L5 and L7, and between α3 and β1. Some homologs have an additional small entrance between loops L2 and L4. Api g 1 has a unique entrance between β2 and α2. In the LlPR-10.2B structure in complex with trans-zeatin (hereafter zeatin) the larger entrance is located close to and connected with another hole between β5, L7 and β6. This additional opening is clearly enlarged by disruption of the antiparallel β-sheet structure by a water molecule, two hydrogen bonds away from the end of the β-sheet. The larger of the entrances is in most cases framed by α3 and the cluster of loops L3, L5, L7, although some homologs do not use one of the surrounding loops. The entrance can be gated by a hydrogen bond(s), as in LlPR-10.1B (Arg137…Ser64).
Several homologs were structurally described in complexes with various ligands, and in all cases ligand molecules were found inside the cavity. There are a few cases where additional ligand molecules are found outside the cavity, at intermolecular sites, but they are most probably crystallographic artifacts. Where good resolution was achieved, water molecules were found in the interior of the cavity, both in apo structures and in protein–ligand complexes, where they mediate the hydrogen bonding networks. Interestingly, NMR chemical shift mapping indicates that the MLP protein At1g24000.1, despite its shorter sequence and lack of two β strands and one α helix, binds progesterone, a steroid-type molecule, inside its cavity.
Due to the large conformational and sequence variability of the C-terminal helix, the cavities vary in shape and have remarkably different volumes (Table 1). The best example is the large difference observed between the two close homologs LlPR-10.2A and LlPR-10.2B. Despite very high 91%/96% sequence identity/similarity, they have cavities (1050 and 3830 Å³, respectively) with a volume difference of ~ 2800 Å³ (Table 1).
It is not clear if the conformational arrangement of the C-terminal helix contributes directly to the alteration of the cavity or if it only reflects changes effected by the presence of ligands. In the case of LlPR-10.2B, which was co-crystallized with two different ligands, it seems that indeed the presence and identity of ligands alter the shape and volume of the cavity.
It was realized early on that PR-10 sequences are quite diverse. However, one particular section is highly conserved in all homologs. This segment, loop L4 linking strands β2 and β3, contains several glycine residues and is termed the glycine-rich loop (Fig. 1). It shows a remarkable sequence similarity to the P-loop, also known as the phosphate-binding loop, found in nucleotide-binding proteins. However, the PR-10 proteins lack any affinity for ATP, and in fact the glycine-rich loop is conformationally different from the P-loop. Strikingly, despite the high glycine content, the glycine-rich loop is the most rigid structural element in the PR-10 fold.
A suggestion that L4 could be a putative cytokinin binding site [129, 132] was later shown to be incorrect [109, 110, 112, 132]. New findings regarding lipid binding may explain the conservation of the glycine-rich loop. Mattila and Renkonen modeled several putative amphiphilic and lipid ligands in the Bet v 1 cavity and showed that lipid binding could position polar heads in the vicinity of the glycine-rich loop.
Considering that glycine endows polypeptide chains with high flexibility, it is remarkable that the L4 loop is structurally well defined and shows extraordinary rigidity, as demonstrated by excellent electron density and low B factors. The rigidity of the loop is maintained by a pattern of three hydrogen bonds between the Oγ1 atom of Thr/Ser52 and the main-chain N–H groups of Asn46, Gly47 and Gly48 (Fig. 4A).
In Cα superpositions of the entire molecules, the glycine-rich loop as a whole appears to be slightly shifted in different structures. However, when only the loop region is aligned, the match is excellent (Fig. 4A). This is best illustrated by a comparison of LlPR-10.1A and molecule B of holo NCS, where the loop is shifted by 5.4 Å when the whole molecules are aligned, while the rmsd value is only 0.4 Å for the loop atoms alone.
Involvement in metal coordination
Loops L3 and L9 have been reported to bind metal ions in some PR-10 structures. Usually only one of the loops is engaged, but in the complex of LlPR-10.2B with the synthetic cytokinin diphenylurea (DPU) both loops coordinate Na+ cations (Fig. 4B). The structures show a clear preference for Na+, which appears to be the cation with the highest affinity for PR-10 proteins. With high-to-atomic resolution diffraction data, identification of the metal type by the bond valence method [134, 135] is usually very reliable. A calcium ion has been found in only one structure so far, namely at the L3 loop of the LlPR-10.2B complex with zeatin. Since Ca2+ was not present in any of the purification or crystallization buffers, it had to be acquired at the protein expression stage. The fact that it remained bound to the protein throughout the purification process may be indicative of its high affinity for LlPR-10.2B.
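The idea behind the bond valence method can be illustrated with a short calculation. This is only a sketch of the general approach and is not claimed to be the exact procedure of refs [134, 135]. Each metal–oxygen bond of length R contributes a valence exp((R0 − R)/b), and the sum over all ligands should approach the formal charge of the correct cation. The R0 values and the constant b = 0.37 Å used below are assumptions taken from commonly tabulated cation–oxygen parameters and should be checked against a current parameter table before any real use. The function names are hypothetical.

```python
import math

# Assumed bond-valence parameters R0 (in Å) for cation-oxygen bonds, taken
# from commonly tabulated values; verify them before any real use.
R0_OXYGEN = {"Na+": 1.803, "Ca2+": 1.967}
FORMAL_CHARGE = {"Na+": 1, "Ca2+": 2}
B = 0.37  # empirical constant in Å (assumed, commonly used value)


def bond_valence_sum(distances, r0, b=B):
    """Sum the valence contributions exp((r0 - d) / b) of all metal-oxygen bonds."""
    return sum(math.exp((r0 - d) / b) for d in distances)


def rank_candidate_ions(distances):
    """Return (ion, bond valence sum) pairs, best match to formal charge first."""
    scored = []
    for ion, r0 in R0_OXYGEN.items():
        bvs = bond_valence_sum(distances, r0)
        scored.append((abs(bvs - FORMAL_CHARGE[ion]), ion, bvs))
    return [(ion, bvs) for _, ion, bvs in sorted(scored)]


# Hypothetical coordination sphere: five oxygen ligands at 2.45 Å.
for ion, bvs in rank_candidate_ions([2.45] * 5):
    print(ion, round(bvs, 2))
```

Because Na+ and Ca2+ have similar ionic radii, the two candidates are separated mainly by how closely the valence sum matches the formal charge, which is why accurate bond lengths from high-resolution data matter.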
Stabilization of the N terminus
Loop L9 is also utilized to stabilize the N terminus of PR-10 proteins. It was observed that the N-terminal –NH3+ group, in addition to being involved in the interactions that bring strand β7 back into hydrogen-bond contact with β1, is also engaged in hydrogen bonding with residues in loop L9 (Fig. 4C). One exception is observed in the LlPR-10.2B protein in complex with DPU, where the Na+ coordination by residues of loop L9 pushes the first two residues of the β1 strand away from the β-sheet. Nevertheless, the electron density shows no disorder for the N-terminal part of the protein.
Potential gates to the cavity
The cavities enclosed within PR-10 structures can be reached by multiple openings. Several loops are engaged in these openings, perhaps functioning as gates. In particular loops L3, L5 and L7 shape the larger entrance, but also loops L2 and L4 are involved.
Structural comparisons of various PR-10 structures
Structural information regarding PR-10 proteins has been accumulating very quickly in recent years, creating a large base for detailed comparisons. Taking into account that in several cases more than one protein molecule is found in the asymmetric unit, the amount of structural information is by now very large (Table 1).
Despite the same folding canon, superposition of the PR-10 structures reveals very significant structural differences. They are mainly visible at the C-terminal helix α3, which displays different axial shifts as well as a variable degree of deformation at the center and at its N-terminal connection with loop L9 (Fig. 3A). Also, the volume of the internal cavity, formed with the participation of α3, displays a remarkable variability (Table 1).
For the purpose of this analysis, a uniform set of rmsd values characterizing pairwise Cα superposition has been generated using the EMBL-EBI server. The volume of the cavities was calculated by the SURFNET program, using a 1.3 Å probe. The first model from the NMR ensembles and all chains from the crystallographic structures (after removal of double conformations and hydrogen atoms) were used for the calculations.
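For readers unfamiliar with the rmsd statistic, the sketch below shows how the minimal rmsd of two equally sized Cα coordinate sets can be obtained after optimal rigid-body superposition. It is an illustration only and does not reproduce the algorithm of the server used here, which also has to establish the residue correspondence; the sketch assumes that the atoms are already paired one-to-one. It uses the quaternion formulation of the least-squares fit, in which the best attainable rmsd follows from the largest eigenvalue of a 4 × 4 matrix, estimated here by simple power iteration. The function names are hypothetical.

```python
import math


def _centered(coords):
    """Shift a list of (x, y, z) tuples so that its centroid is at the origin."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


def superposed_rmsd(coords_a, coords_b, iterations=300):
    """Minimal rmsd of two paired coordinate sets after rigid-body superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two equal-length, non-empty coordinate lists")
    a, b = _centered(coords_a), _centered(coords_b)
    n = len(a)
    ga = sum(x * x + y * y + z * z for x, y, z in a)
    gb = sum(x * x + y * y + z * z for x, y, z in b)
    # 3x3 correlation matrix: s[i][j] = sum over atoms of a_i * b_j
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Symmetric 4x4 matrix whose largest eigenvalue gives the best overlap
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # All eigenvalues lie within [-shift, shift]; adding shift * I makes the
    # matrix positive semidefinite so power iteration finds the largest one.
    shift = math.sqrt(ga * gb)
    best = -shift
    for start in range(4):  # four starts, so one always overlaps the top vector
        q = [0.0] * 4
        q[start] = 1.0
        for _ in range(iterations):
            w = [sum(key[i][j] * q[j] for j in range(4)) + shift * q[i]
                 for i in range(4)]
            norm = math.sqrt(sum(v * v for v in w))
            if norm == 0.0:
                break
            q = [v / norm for v in w]
        lam = sum(q[i] * sum(key[i][j] * q[j] for j in range(4))
                  for i in range(4))
        best = max(best, lam)
    msd = max(0.0, (ga + gb - 2.0 * best) / n)
    return math.sqrt(msd)
```

Passing only the loop atoms instead of all Cα atoms gives a local fit, which is the difference between whole-molecule and loop-only comparisons such as the one described for the glycine-rich loop.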
Differences between subclasses
Structural superposition of representatives of all PR-10 subclasses indicates that, with the exception of MLPs, all other members retain the canonical structure with seven β strands and three α helices. On the other hand, NCS displays a longer C-terminal helix composed of two helical segments joined by a linker. In Cα superpositions, the PR-10 structures revealed lower rmsd values within subclasses (Fig. 3B and Table S1).
Apo versus holo PR-10 proteins: conformational changes upon ligand binding
There are only a few cases allowing a comparison of the free-form and natural-ligand-bound structures. In one of those examples, crystals of the NCS protein were soaked with the dopamine substrate or with the nonreactive substrate analog 4-hydroxybenzaldehyde. Comparison with the apo structure shows no significant conformational changes, with an rmsd of ~ 0.4 Å (Table S1). Since the residues involved in ligand binding were identified as necessary for the enzymatic reaction, it is possible that the NCS structure is not affected by the presence of the ligands in the cavity. Recently, the structures of Bet v 1a in complex with the natural ligands naringenin (flavonoid) and kinetin (cytokinin) were solved. The authors also report an ‘apo’ structure for Bet v 1a in the same communication, although two MPD ((4S)-2-methyl-2,4-pentanediol) molecules are present inside the protein cavity. For this review and in line with our structure of Hyp-1, where polyethylene glycol (PEG) molecules were found in the cavity, we shall consider these cases as non-specific complexes which nonetheless accentuate the binding capability of PR-10 proteins. Fortunately, the Bet v 1a complexes with naringenin and kinetin can be compared with a proper apo Bet v 1 structure reported previously. They show rmsd values of 0.70 and 0.65 Å, respectively (Table S1). The most visible changes are in loops L7 and L9, with moderate changes also in loops L4, L5 and L8 (Fig. 3C,D). The protein scaffold of the two complexes is very similar (rmsd 0.2 Å) and the cavity volume is unchanged upon ligand binding (Table 1).
Bet v 1a was also studied in complex with two molecules of deoxycholate, which is structurally similar to the plant brassinosteroid hormones. The structure has an rmsd of 0.9 Å compared with the apo form (Table S1), mainly due to rearrangements in loops L9 and L7, with virtually no change in the cavity volume (Table 1). Another comparison can be made between apo Bet v 1a and Bet v 1l in complex with two copies of deoxycholate. These homologs show 94%/97% sequence identity/similarity [11, 138]. The structure is only minimally influenced by the presence of the ligands, as illustrated by an rmsd of 0.9 Å. There is also a small expansion of the cavity (from 2960 to 3130 Å³) upon ligand binding (Table 1).
The most interesting comparison in this context is between the two close lupine homologs, LlPR-10.2A and LlPR-10.2B (91%/96% sequence identity/similarity). The latter protein was co-crystallized with the plant cytokinin hormone zeatin, while the former one was characterized in its apo form. The two structures have very different cavities. In the apo structure the cavity (1050 Å³) is much smaller than in the complex (3830 Å³). The major structural change is observed in the middle of helix α3, where in the absence of ligands there is a strong inward kink (~ 60°) or collapse of the helix into the cavity (Fig. 3E). This large conformational change, together with other rearrangements, visible at α2, β2, β3, β4, L3, L5, L7, L8 and L9, all contribute to the large difference between the Cα traces (rmsd of 2.03 Å).
The three zeatin molecules found in the cavity of LlPR-10.2B form hydrogen bonds with six residues and make van der Waals contacts with 14 additional residues. The residues involved in hydrogen bonds with the ligands are all conserved (Fig. 1C). Among the 14 residues involved in van der Waals contacts, only two are dissimilar, namely at position 10 (Tyr in LlPR-10.2B versus Ser in LlPR-10.2A) and at position 57 (Phe versus Leu). The sequence of the N terminus of helix α3 is well conserved in the LlPR-10.2A/LlPR-10.2B pair. Since none of the substitutions is engaged in ligand binding, and since the sequence around Phe142, where the α3 helix is kinked in the LlPR-10.2A structure, is exactly the same, it is logical to conclude that helix α3 has changed its conformation due to the presence of the ligands in the cavity.
As an extension of the above comparison, it is also possible to include the complex of LlPR-10.2B with DPU (Fig. 3F). The Cα traces of the LlPR-10.2A and LlPR-10.2B/DPU structures have an rmsd of 1.96 Å, similar to the LlPR-10.2A–LlPR-10.2B/zeatin pair (Table S1), although the cavity of the LlPR-10.2B/DPU complex is smaller (3030 Å³) than in the LlPR-10.2B/zeatin case (Table 1). Nevertheless, the differences are remarkable and indicate that the presence and identity of ligands in the cavity do modulate the structure.
Holo versus holo PR-10 proteins
Bet v 1a versus Bet v 1l in complex with deoxycholate
Although the two deoxycholate molecules in the Bet v 1a and Bet v 1l complexes locate at similar places within the cavity, the ‘inner’ molecule is rotated ~ 180° along an axis perpendicular to the sterane C ring, resulting in a different binding pattern (Fig. 1C). The two complexes show an rmsd of 0.53 Å, with the largest displacements at loop L7. The two protein isoforms (of which Bet v 1a binds IgE strongly and Bet v 1l weakly) differ at five hallmark residues, of which only the Phe/Val change at position 30 directly influences the topology of the hydrophobic cavity. The authors speculate that physiological and immunologically relevant ligands could distinguish high from low IgE-binding Bet v 1 isoforms.
Cytokinins in complex with LlPR-10.2B
The lupine LlPR-10.2B protein was co-crystallized in complex with zeatin (three ligand molecules inside the cavity) and with DPU (four ligand molecules inside the cavity), which is a synthetic cytokinin analog. Structural superposition of the two LlPR-10.2B chains [110, 111] reveals that, although the fold is the same, their Cα traces show small but significant differences (Fig. 3G), with an rmsd of 0.87 Å. The major folding differences are seen in loops L7 (maximum deviation 13.5 Å) and L9 (maximum deviation 11.1 Å). The conformational change of loop L9 is evidently caused by Na+ coordination in the LlPR-10.2B/DPU structure (Fig. 4B). The same Na+ cation is also responsible for pushing the positively charged N terminus away from its typical β-sheet association with strand β7. The large conformational change of loop L7 cannot be explained by different crystal packing as in both cases the lattice interactions of this loop are weak.
The ligand molecules occupy completely different positions in the protein cavity (Fig. 3G). There is no direct overlap between the ligands but DPU1–3 and Zea1–3 occupy the same general area. DPU4 does not overlap with any zeatin molecule. A comparison of the residues involved in interactions with the ligands in the two complexes shows that they have distinct binding modes (Fig. 1C). Each zeatin molecule is bound to the protein by at least one direct hydrogen bond and they are additionally stabilized by numerous van der Waals interactions. The DPU ligands on the other hand are anchored to the protein exclusively by van der Waals contacts with the exception of a single water-mediated hydrogen bond of DPU3 (Fig. 1C).
A simple Cα alignment is not sufficient to fully appreciate the structural changes induced by different ligands in the same protein cavity. A superposition of all the atoms would be more useful in that it would pinpoint smaller changes as well. Such a comparison, calculated for the LlPR-10.2B/zeatin–LlPR-10.2B/DPU pair using the LSQKAB program, is characterized by an rmsd of 1.81 Å. This relatively large value is directly correlated with a significant difference of the cavity volume, which is reduced in LlPR-10.2B/DPU by 800 Å³ (Table 1), and with rearrangements in loop L7.
The different stoichiometry as well as orientation and binding mode of the ligand molecules in the LlPR-10.2B complexes results in a rearrangement of the side chains of the residues pointing into the interior of the cavity. Some side chains are shifted as a consequence of the Cα movements where there are slight backbone rearrangements. There are some, however, which have a clearly different conformation in the two ligand complexes. An interesting aspect of the cavity rearrangement is the number and identity of residues modeled with double conformation. Comparing only the residues with side chains pointing into the cavity, the LlPR-10.2B/zeatin complex has one residue with double conformation (Val66) and the LlPR-10.2B/DPU complex has two such residues: Leu55 and His68. The disorder of the cavity-forming residues is especially remarkable when one considers the significant cargo bound in the cavity of these complexes, including water molecules.
Zeatin complex with LlPR-10.2B versus VrCSBP
LlPR-10.2B and mung bean (Vigna radiata) VrCSBP were both co-crystallized with zeatin (Table 1, Fig. 3H). Superposition of their Cα atoms shows a large rmsd of ~ 1.9 Å, mainly because in VrCSBP the C-terminal helix α3 is less separated from the β-grip and the remaining two helices, α1 and α2, are positioned closer to the center of the protein. Such rearrangements result in a significantly smaller cavity in VrCSBP (1240 versus 3830 Å³ in the LlPR-10.2B/zeatin complex).
The binding mode of the zeatin ligands in the LlPR-10.2B complex is also different from the zeatin binding mode by VrCSBP. In contrast to the three ligand molecules found in the LlPR-10.2B/zeatin complex, in the VrCSBP/zeatin complex a maximum of two zeatin molecules (‘inner’ and ‘outer’) were found in the protein cavity . In addition to the surprisingly higher cytokinin binding ability of the LlPR-10.2B protein, the crystal structures show that the ligands are located at different sites within the cavity (Fig. 3H). When the Cα atoms of the LlPR-10.2B molecule are aligned with the four protein molecules found in the asymmetric unit of the VrCSBP structure, it is evident that the inner zeatin ligand in the VrCSBP complex, which is conserved in all four VrCSBP copies, does not coincide with any ligand molecule in the LlPR-10.2B/zeatin structure. In particular, while the inner zeatin ligand in the VrCSBP complex is aligned along the α3 helix, the innermost zeatin molecule in the LlPR-10.2B complex (Zea1) is aligned with helices α1 and α2. Considering the position of the outer zeatin molecule in the VrCSBP structure, which is present only in three of the four copies of the protein molecule, it is noticeable that it corresponds to the Zea3 ligand in the LlPR-10.2B structure, although only a poor overlap is observed with the outer zeatin ligand in molecule C of the VrCSBP protein. The outer zeatin ligands in molecules A and D of the VrCSBP structure are rotated by 180° with the aliphatic tail pointing in the opposite direction.
Although both proteins employ five residues for hydrogen bonding with the ligands, only one of these occupies a structurally equivalent position (Fig. 1C). Even at this position the residue type is not conserved, as it is His68 in LlPR-10.2B and Glu69 in VrCSBP. It is of note that an Asp residue at the same position is used by Bet v 1l to form a hydrogen bond with the deoxycholate ligand. His68 in the LlPR-10.2B protein is also responsible for a van der Waals contact with a DPU molecule. The remaining hydrogen bonds in the LlPR-10.2B and VrCSBP complexes are formed by residues at different spatial positions. The main forces responsible for the stabilization of the zeatin molecules in the LlPR-10.2B cavity are weak hydrogen bonds mediated by water molecules and van der Waals interactions. Again, the amino acid residues responsible for the latter interactions are generally different in the two proteins (Fig. 1C).
Cytokinins in complex with LlPR-10.2B, VrCSBP and Bet v 1a
Kinetin and zeatin are both adenine-type cytokinins with distinct N6 substitutions. To investigate if there is a conserved binding mode of cytokinins to PR-10 proteins, we have superposed the Bet v 1a protein in complex with kinetin and LlPR-10.2B or VrCSBP in complex with zeatin. It is clear that the ligand molecules locate at different places within the protein cavity (Fig. 3I,J) and display different binding modes (Fig. 1C).
Differences between multiple molecules within the asymmetric unit
To assess the flexibility of a protein chain, one can compare its multiple copies found in the asymmetric unit of the same crystal structure. Fortunately this situation is frequently encountered with PR-10 proteins (Table 1).
Comparison of the two LlPR-10.1B molecules present in the asymmetric unit reveals that they are in fact very similar, with a Cα rmsd of 0.56 Å. The greatest deviations are observed for a fragment containing the strands β2, β3, β4 and the two connecting loops L4 and L5. Loop L4 is the glycine-rich loop and loop L5 is disordered in the crystal structure . The LlPR-10.2A structure also contains two monomers in the asymmetric unit but in this case the rmsd is higher, 0.84 Å. The most significant differences are observed in loops L5, L7 and L9. Loop L9 is actually disordered in both monomers and loop L5 is disordered in molecule B .
Two major allergen proteins, Api g 1 and Dau c 1, crystallize with more than one molecule per asymmetric unit. The two chains in the Api g 1 structure are virtually identical, with an rmsd of 0.33 Å. Also the four monomers in the asymmetric unit of the Dau c 1 structure are virtually identical, with rmsd values < 0.1 Å (Table S1), but these numbers may be artificially low because of the use of non-crystallographic symmetry restraints during refinement.
The crystal structure of MLP At1g24000.1 has two monomers in the asymmetric unit superposable with an rmsd of 0.24 Å. The most significant difference between the two molecules is located in the loop between the first helix and the subsequent strand (bear in mind the different topology of MLPs) and at loop L8.
The two monomers in the asymmetric unit of NCS have an rmsd of 0.55 Å, with the largest discrepancies at loops L1, L6 and L9. After soaking the crystals with NCS ligands, the rmsd decreased to 0.45 Å, with the largest differences at the same loops as in the apo structure.
The Hyp-1 structure has two monomers in the asymmetric unit which superpose poorly, with an overall Cα rmsd of 1.21 Å, the most variable elements being helix α2 and loops L3 and L5. Loop L5 exists in two conformations in molecule A, which is indicative of its intrinsic flexibility .
As mentioned earlier, the VrCSBP crystal structure has four independent monomers and their arrangement inside the asymmetric unit is particularly interesting. The four protein molecules are organized into two pairs related by pseudo-twofold symmetry; within the pairs the monomers form analogous interactions through the short helical regions α1–α2, although in solution VrCSBP appears to be monomeric. All chains have the same fold with small variances located mainly at loops L3 and L5. Their pairwise Cα superpositions are characterized by rmsd values between 0.34 and 0.63 Å (Table S1).
Different ligand binding modes are observed for the same PR-10 homolog, even among multiple complex molecules in the same asymmetric unit. In the SPE16 structure with two protein molecules in the asymmetric unit, two 8-anilino-1-naphthalene sulfonate (ANS) molecules are present per protein chain. One of them, at a cavity entrance, is found in the same position, while the ligands present inside the cavity have different orientations. For those inner ANS molecules, the sulfonate group is practically at the same position while the anilinonaphthalene moiety is inverted (Fig. 5A). In the NCS structure in complex with ligands, only molecule B has both the substrate dopamine and the nonreactive substrate analog 4-hydroxybenzaldehyde bound. Molecule A binds only 4-hydroxybenzaldehyde in more than one orientation. None of the conformations completely overlaps with the molecule present in complex B, although they are located in the same region of the cavity (Fig. 5B).
The VrCSBP structure reveals three very different zeatin binding modes by the four monomers in the asymmetric unit. In all the monomers, the inner zeatin molecule is present in identical orientation. The outer zeatin, found in three copies of the protein, has two distinct binding modes (Fig. 5C). In two cases, the inner and outer zeatins are bound head-to-head, with their adenine rings stacked. In one complex molecule, the outer zeatin, while still perfectly ordered, is flipped and fills the cavity in a head-to-tail fashion, pointing at the inner zeatin with its isoprenoid tail. This pattern seems to indicate a lower affinity for the outer zeatin site. This conclusion is inconsistent, however, with a backsoaking experiment, in which the single-occupancy inner zeatin was washed out before any of the outer ones .
Structural similarity to other folds
A search against all entries in the PDB with the Dali server has revealed several proteins with a PR-10-like fold. The most similar are structures of the START domain, the ABA binding proteins and tetracenomycin aromatase/cyclase (TcmN ARO/CYC).
Steroidogenic acute regulatory protein (StAR), present both in animals and plants, plays a crucial role in steroidogenesis . Cholesterol, in addition to being an essential component of cellular membranes, is the starting point for the biosynthesis of steroids and bile acids. The StAR protein facilitates its translocation from the outer mitochondrial membrane through the aqueous intermembrane space to the inner mitochondrial membrane where the first step in steroid biosynthesis takes place [144, 145].
StAR has an N-terminal mitochondrial import sequence followed by a steroidogenic acute regulatory protein-related lipid transfer (START) domain that plays a role in the binding and transfer of cholesterol . The START domain has ~ 210 residues, binds hydrophobic ligands, and frequently occurs at the C terminus in multidomain proteins . The 3D structure of StAR protein is unknown but X-ray crystal structures of three START-related proteins, MLN64 (metastatic lymph node protein 64), StAR D4 and the phosphatidylcholine transfer protein, are available [148-150]. The overall structure of these START domains consists of a highly curved nine-stranded antiparallel β-sheet and four α helices. The N terminus is formed by a long α helix that rests against the back of the β-sheet. The β-sheet forms the floor of a hydrophobic cavity, although the two initial β strands have a limited contribution while the remaining three α helices act as the ceiling.
The START domains align well with PR-10 proteins, both at the secondary and tertiary structure level. Yet the alignments clearly show that the larger START domains have secondary structural elements preceding the PR-10 fold. Despite very low sequence homology (14% identity), the structures align with an rmsd of 2.66 Å, when MLN64 and LlPR-10.2B are compared, for example.
The search for ABA receptors has been difficult and not without controversy [113, 114, 151, 152]. Recently, several structures have been elucidated. The most conclusive findings reveal not only the structure of an ABA receptor (PYL2) in its apo form and in complex with ABA, but also of a ternary complex between the receptor in its ligand-bound state and a type 2C protein phosphatase (PP2C), previously understood to function downstream in the ABA signaling pathway . The structure of PYL2, also in its ligand and HAB1 (a PP2C family member) complex, reveals a typical PR-10 fold with its three α helices and seven antiparallel β strands, preceded by an additional N-terminal α helix . Indeed, the structures of PYL2 and PYL1 (PYL2 homolog with 51% sequence identity) were solved by molecular replacement with Bet v 1 as a model. The set of presented structures allows us to understand the mechanism, in which conformational changes induced by ABA docking to PYL2 facilitate binding to the PP2C active site.
The structure of the 1 : 1 PYL2 : ABA complex shows the hormone tightly bound in the PYL2 cavity. In addition to several hydrophobic interactions, the hormone is anchored by a few hydrogen bonds (including water mediation) at its polar groups. Mutations of several residues forming the cavity reduce the ability of the protein to bind ABA. The binding mode of ABA explains the basis of stereoselectivity, as flipping (+)-ABA to the (−)-isomer would cause collisions with protein atoms. ABA binds two loops in the PYL2 structure that act in a ‘gate and latch’ mechanism. The ‘gate’ loop between strands β3 and β4 (L5 in PR-10) experiences remarkable conformational changes with a shift of up to 9 Å towards another loop located between strands β5 and β6 (L7) that acts as a ‘latch’ that does not change considerably in the apo and ligand-bound states. This change seals the ABA ligand inside the cavity and closes the cavity entrance. Another remarkable difference between the two states occurs at a glutamate residue that immediately precedes the latch motif. The side chain in the apo structure points into the cavity and prevents the closure of the gate; however, upon ABA binding the glutamate rotates by 150°, pointing out of the cavity. When the ‘gate’ and ‘latch’ loops come together, a new interface is formed, which is the binding interface for PP2C, thereby triggering the ABA signaling cascade.
The cyclization of secondary metabolite polyketides is promoted by aromatase/cyclase (ARO/CYC) enzymes. TcmN ARO/CYC is the best studied single-domain ARO/CYC, but it was unclear how the enzyme, alone or in complex with other proteins, could catalyze cyclization of polyketides. The crystal structure of the enzyme reveals a PR-10-like fold . Despite barely 9% sequence identity, the apo structure of TcmN ARO/CYC can be aligned with VrCSBP with an rmsd of 2.16 Å. Docking, mutagenesis and in vivo assays demonstrate that the cavity of TcmN ARO/CYC is capable of binding a 20-carbon polyketide substrate. Moreover, the study indicates the molecular determinants that lead to the selectivity of ring closure. In particular, Tyr35 and Arg69 appear to be important for the orientation of the substrate and as an acid/base active site that promotes the first-ring cyclization and aromatization .
The aspect of PR-10 oligomerization has not been covered sufficiently well. Generally, PR-10 proteins are shown to exist in monomeric form in solution, although a few exceptions exist with no clear functional importance. For instance, both monomeric and dimeric forms of Bet v 1 were detected and reported to possess RNase activity  although a simultaneous publication described Bet v 1 as a monomeric protein . A similar situation was later observed for birch PR-10c protein . SPE16 homodimers were also detected in solution as displaying RNase activity . Recently, dynamic light scattering was used to show that Bet v 1 is monomeric only under very restricted conditions and generally forms a mixture of monomers, dimers and higher order oligomers . Dimerization was also reported for Mal d 1, AmPR-10 and CSBP [160-162]. NCS was characterized as a dimer and its sigmoidal dopamine binding kinetics are consistent with such an oligomerization state . Ginseng PgPR-10.1 and PgPR-10.2 proteins were also detected in dimeric form; moreover, yeast two-hybrid analysis confirms that the PgPR-10 proteins can interact with each other forming homodimers and heterodimers . In the crystal structure of Hyp-1 two protein molecules are covalently linked by a disulfide bond, but this intermolecular oxidation seems to be an artifact of the crystallization process. Parenthetically, it can be mentioned that sulfur-containing residues are quite rare in PR-10 sequences and are nearly totally absent from the amino acid composition of classic PR-10 members.
Possible biochemical functions
PR-10 proteins have been reported to have several functions but there is no general function common to all members of this class. PR-10 proteins can be involved in enzymatic processes, including RNase activity and secondary metabolite biosynthesis, in antimicrobial processes, storage, membrane binding, as well as in phytohormone and other hydrophobic ligand binding, storage and transport. The majority of the experiments exploring PR-10 functions were done in vitro.
The first PR-10 protein with ribonuclease activity was isolated from callus cell culture of Panax ginseng . The protein has 60–70% sequence identity with two intracellular PR proteins from parsley, but remarkably does not show homology with other ribonucleases. This initial result prompted other groups to investigate the RNase activity of other PR-10 proteins, and such activity was detected in some of them, namely Bet v 1 [155, 165], LaPR-10 , LlPR-10.1B , BpPR-10c , GaPR-10 , SPE16 [88, 158], CaPR-10 , SsPR-10 , AhPR-10 , PsPR-10.1, PsPR-10.4 [169-171], AmPR-10 , Pru p 1.01, Pru p 1.06D , TcPR-10 , ZmPR-10, ZmPR-10.1 [174, 175], CsPR-10 , PgPR-10s  and JcPR-10a .
Since the initial sequence analyses revealed that the most conserved region was a glycine-rich segment with apparent similarity to the P-loop known from nucleotide-binding proteins , it was speculated that this region interacts with the RNA phosphate groups. However, as the subsequent structural studies showed, the PR-10 glycine-rich loop is very rigid and has a conformation that is different from typical P-loops (vide supra) .
Analysis of the conserved residues revealed that Glu96, Glu148 and Tyr150 (ginseng ribonuclease sequence) should be important for the RNase function. The 3D structures of PR-10 proteins showed that Glu96 is localized at the N terminus of strand β6, while Glu148 and Tyr150 are located far from it, at two opposing sides of the C-terminal helix. To investigate the relevance of the highlighted residues for RNase activity, several groups performed site-directed mutagenesis of those residues and also at some positions within the glycine-rich loop . The RNase activity of SPE16 and GaPR-10 is affected to a larger extent when residues of the C-terminal helix are substituted, while for AhPR-10 larger effects are seen with mutagenesis at the glycine-rich loop. The activity of PsPR-10.4 is elevated when Glu148 is mutated to alanine and is decreased with an H69L mutation .
The RNase activity of CaPR-10 is increased on interaction with leucine-rich-repeat protein 1 (LRR1) . Zeatin was shown to have negative influence on the RNase activity of Pru p 1.01 but it had no effect on the closely related Pru p 1.06D isoform . Similarly puzzling results were reported for the yellow lupine LlPR-10.1A and LlPR-10.1B proteins. Despite the high level (76.8%) of identity and sequence conservation at the RNase-relevant positions, RNase activity was observed only for LlPR-10.1B . It must thus be concluded that even if some PR-10 members have a low level of RNase activity this is not a general property of this class.
Phenolic oxidative coupling protein (Hyp-1)
The St John's wort Hyp-1 protein was reported to catalyze the synthesis of hypericin from the emodin precursor in an enzymatic condensation reaction . The purported multistep synthesis involves three oxidative coupling events with unknown order . Our recent crystallographic study of the Hyp-1 protein [94, 177] confirmed the structural classification of Hyp-1 in the PR-10 folding class (Fig. 3A). However, we were unable to reproduce the Hyp-1-catalyzed conversion of emodin to hypericin in vitro . Two other studies also raise doubts about this biological function, since no correlation was found between mRNA transcripts for Hyp-1 and the emodin substrate localizations , or between the presence of the hyp-1 gene in species of the genus Hypericum and the production of hypericin . These results question the enzymatic ability of Hyp-1 and raise the hypothesis that the protein may be used for storage or transport of hypericin, since there is evidence that these molecules can interact in solution . In addition, the identification and isolation of an H. perforatum endophytic fungus  led to the finding that the isolated fungus is able to produce both emodin and hypericin . The fungus has no gene similar to hyp-1 from the host plant, suggesting that Hyp-1 is not involved in the biosynthesis of hypericin or that the hypericin pathway in the endophyte is different .
(S)-Norcoclaurine synthase (NCS)
NCS has been confirmed to have a PR-10 fold. The NCS sequences are unique in the PR-10 group as they possess noncatalytic N- and C-terminal extensions (Fig. 1A), with the former acting as a putative signal peptide, suggesting localization in a cellular organelle. NCS activity could not be detected in recombinant PR-10 proteins from other subclasses (e.g. Bet v 1, CSBP, Hyp-1). However, a new classic PR-10 from Coptis japonica (CjPR-10.1A) does catalyze the NCS reaction, which indicates that this enzymatic activity is not exclusive to the NCS branch of PR-10 proteins.
Papain inhibitory activity
A PR-10 protein from Crotalaria pallida, CpPRI, was shown to inhibit papain, a model cysteine protease. The inhibition follows noncompetitive kinetics and is characterized by a Ki of 0.15 × 10⁻⁹ M. The papain inhibitory property could be part of the plant defense system, as cysteine proteases are often part of the armory used by invading pathogens.
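To make the meaning of noncompetitive kinetics concrete, the following sketch evaluates the textbook rate law for pure noncompetitive inhibition, in which the inhibitor lowers the apparent Vmax by the factor 1 + [I]/Ki and leaves Km unchanged. Only the Ki value (read here as 0.15 × 10⁻⁹ M) comes from the text; the function name, Vmax, Km and substrate concentration are hypothetical placeholders, and the original study may have used a different fitting procedure.

```python
KI_CPPRI = 0.15e-9  # M, inhibition constant quoted in the text for CpPRI/papain


def noncompetitive_rate(s, i, vmax, km, ki):
    """Initial rate under pure noncompetitive inhibition.

    s, i, km and ki must share one concentration unit (e.g. M);
    the result has the unit of vmax.
    """
    if s < 0 or i < 0 or vmax <= 0 or km <= 0 or ki <= 0:
        raise ValueError("concentrations must be >= 0 and vmax, km, ki > 0")
    # Michaelis-Menten term scaled down by the inhibitor-dependent factor.
    return vmax * s / ((km + s) * (1.0 + i / ki))


# Illustrative values only (not from the source): vmax = 1.0, km = 1.0, s = 1.0.
uninhibited = noncompetitive_rate(1.0, 0.0, 1.0, 1.0, KI_CPPRI)
half_inhibited = noncompetitive_rate(1.0, KI_CPPRI, 1.0, 1.0, KI_CPPRI)
```

In this model an inhibitor concentration equal to Ki halves the rate at any substrate concentration, which is what distinguishes it from competitive inhibition, where excess substrate can restore the rate.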
Some PR-10 homologs were shown to possess antifungal activity and, remarkably, AhPR-10 and TcPR-10 were shown to internalize in fungi [24, 173].
Flores et al.  showed that the PR-10 homolog ocatin inhibits fungus growth in a dose-dependent manner. Antifungal activity was also detected for SsPR-10 , CsPR-10 , CaPR-10 , maize PR-10 proteins [174, 175] and JcPR-10a .
In addition to antifungal activity, some PR-10 proteins also show antibacterial activity. Ocatin inhibits the growth of phytopathogenic bacteria, such as Agrobacterium tumefaciens, Agrobacterium radiobacter, Serratia marcescens and Pseudomonas aureofaciens . The maize ZmPR-10 and ZmPR-10.1 proteins have inhibitory activity against Pseudomonas syringae .
CaPR-10 was shown to degrade viral RNA . The production of tobacco mosaic virus in bugang leaves was severely reduced after inoculation with recombinant CaPR-10.
A recent report implicates CpPRI as a nematicide factor . It has long been known that plants from C. pallida species are resistant to nematodes, although the mechanism of resistance remains unknown . One possible explanation could be the inhibition of papain-like enzymes present in the digestive tube and in the body, especially in the cuticles, of the pathogens . In addition to papain inhibition, CpPRI internalizes and diffuses over the entire body of juvenile Meloidogyne incognita nematodes .
Possible role in the sporopollenin pathway
Sporopollenin is a complex polymer that constitutes the outer (exine) layer of the spore and pollen wall, acting as a biological protector for the male gametes. Its biosynthesis is linked to the tapetum. The discovery of a PR-10 homolog that has a unique organ/tissue-specific expression in the tapetal cells during anther development suggests a potential role in the sporopollenin pathway for these proteins [27, 186]. The presence of this homolog in the middle layer cells provides further evidence for such a function, since these cells are known to produce sporopollenin.
Possible role as antifreeze
The PR-10 homologs WAP18 (winter accumulating 18 kDa proteins) from mulberry (Morus bombycis Koidz) are detected at maximum levels in mid-winter. This period also corresponds to the maximum freeze tolerance in cortical parenchyma cells of the mulberry tree. Moreover, these proteins exhibit cryoprotective activity in vitro, which suggests that some PR-10 proteins might function in frost-tolerance mechanisms. Vegetative storage protein (VSP) from white clover (Trifolium repens L.), another PR-10 homolog, also accumulates under autumn and winter conditions, and thus may endow the plants with tolerance to chilling. Moreover, PR-10 proteins are overexpressed in Oxytropis (Fabaceae) species adapted to the Arctic as opposed to temperate species.
Storage protein function
The PR-10 homolog ocatin from the Andean crop oca is expressed specifically in the tubers where it accounts for 40–60% of total soluble proteins . The protein level gradually increases with the development of the tubers from 20 to 100 days, and decreases at later stages, under storage and upon sprouting. The high levels, developmental regulation and tissue specificity of ocatin could mean that the protein has a storage function. In white clover, VSP is mobilized after winter in the plant re-growth process, which is strongly dependent on its nitrogen reserves .
Membrane binding function
Immunocytochemical localization of a pine PR-10 homolog, Pin m III , shows that the protein binds to the cell wall of the fungus Cronartium ribicola that causes blister rust disease in five-needle pines . Mogensen et al.  carried out biophysical analyses of these interactions and reported that Bet v 1 interacts with synthetic phospholipid vesicles.
Renkonen et al. proposed an interesting hypothesis of caveolae-dependent uptake and transport of Bet v 1 in the epithelium. They reported on the binding of the Bet v 1 pollen allergen to conjunctival and epithelial cells of allergic but not healthy individuals [192, 193]. Moreover, Bet v 1 traveled through the epithelium with caveolae to mast cells. Affinity chromatography with covalently matrix-bound TAP-tagged Bet v 1 and nasal epithelial cell lysates unveiled 16 proteins, present only in allergic patients, that associate with Bet v 1. Six of them are caveolar proteins. The underlying mechanism of caveolar allergen transport remains unclear.
The size and lining of the PR-10 internal cavity was identified in the first structure published, Bet v 1, and described as the ‘most unusual feature’ of the protein structure . The presence of such an internal cavity suggests a possible role as a binding site for hydrophobic ligands. This hypothesis is supported by the structural similarity between PR-10 members and the START domain of the human MLN64 protein (vide supra).
The ligand binding propensity of PR-10 proteins was initially investigated for the major cherry allergen, Pru av 1 . The cholesterol binding properties of the structurally homologous StAR domain of the human MLN64 led the authors to investigate phytosteroids as putative ligands for Pru av 1. Despite the hydrophobicity of virtually all physiologically relevant steroids, the authors, by means of NMR measurements, were able to gather experimental evidence that Pru av 1 interacts with a specific phytosteroid called homocastasterone, which is a plant brassinosteroid. Molecular modeling showed that the Pru av 1 cavity, as well as the cavity of Bet v 1, is so large that it can accommodate two phytosteroid molecules .
After confirmation that ANS binds in the protein cavity, displacement experiments coupled with fluorescence measurements were used to identify a number of physiologically relevant ligands that bind to Bet v 1 . Fatty acids, flavonoids and cytokinins were shown to bind, albeit generally with low micromolar affinity.
Cytokinins were used as probes to detect CSBPs. Nagata et al. identified the presence of CSBPs in the soluble fraction of etiolated mung bean seedlings and reported affinities for their ligands in the range 10⁻⁹–10⁻¹⁰ M⁻¹, but these estimates were later re-determined to a more modest range, 10⁻⁴–10⁻⁶ M⁻¹. Based on low-level sequence homology, the VrCSBP protein was tentatively classified as having a PR-10 fold, a prediction confirmed by a subsequent crystallographic analysis. The crystal structure revealed, for the first time, cytokinin (zeatin) binding by a PR-10-like protein. The structure showed that VrCSBP can accommodate up to two zeatin molecules within the protein cavity. The high resolution and excellent electron density for the ligands significantly increased our understanding of the binding mode of these partners, which involves hydrogen bonding and extensive van der Waals contacts. Later, the LlPR-10.2B protein was also reported to bind cytokinins, with comparable affinity, and for the first time the same protein was crystallized with two different ligands, namely both a natural and a synthetic cytokinin [110, 111]. Cytokinin molecules also interact with birch PR-10c and peach Pru p 1.01. The interaction of Pru p 1.01 with zeatin was analyzed by isothermal titration calorimetry and a dissociation constant of 9.4 μM and 1 : 1 stoichiometry were reported.
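A dissociation constant in the micromolar range can be translated into an expected occupancy of the binding site. The sketch below uses the simplest 1 : 1 binding isotherm with the value reported for Pru p 1.01 and zeatin (read here as 9.4 micromolar). It assumes that the free ligand concentration is known or that ligand is in large excess over protein; the function name and the example concentrations are hypothetical and are not taken from the calorimetric study.

```python
KD_PRU_P_1_ZEATIN = 9.4e-6  # M, dissociation constant quoted in the text


def fraction_bound(free_ligand, kd):
    """Fraction of protein carrying ligand for simple 1:1 binding.

    free_ligand and kd must share one concentration unit (e.g. M).
    """
    if free_ligand < 0 or kd <= 0:
        raise ValueError("free_ligand must be >= 0 and kd must be > 0")
    # Single-site binding isotherm: theta = [L] / (Kd + [L]).
    return free_ligand / (kd + free_ligand)


# Occupancy at a few illustrative free zeatin concentrations (in M).
for conc in (1e-6, 9.4e-6, 1e-4):
    print(f"{conc:.1e} M -> {fraction_bound(conc, KD_PRU_P_1_ZEATIN):.2f}")
```

By construction the site is half occupied when the free ligand concentration equals the dissociation constant, so a micromolar Kd implies substantial binding only at micromolar or higher free cytokinin concentrations.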
In the crystal structure of Bet v 1l the cavity is occupied by two deoxycholate molecules. This compound is not a plant metabolite but it is structurally similar to brassinosteroids, which are plant steroid hormones [196, 197]. Deoxycholate was used as a solubilizing agent in the purification of the recombinant proteins and, possibly because of its high affinity for Bet v 1l, remained bound throughout the purification process. Mass spectrometry was used to show that the protein can bind genuine brassinosteroids, brassinolide and 24-epicastasterone.
Koistinen et al. investigated the possible binding of emodin by PR-10c after the reported Hyp-1-catalyzed conversion of emodin to hypericin [120, 132]. Although binding of two emodin molecules to the PR-10c protein was confirmed, no formation of hypericin could be detected. The experimentally visualized cavity of Hyp-1  is large enough to accommodate two emodin molecules or one hypericin molecule, and thus the lack of catalytic activity suggests that Hyp-1 is not an enzyme but could be a hypericin binder. In the crystal structure, the Hyp-1 molecules are covalently linked into dimers by an intermolecular S–S bond and contain multiple PEG molecules in the cavity. PEG ligands were also found in the crystal structure of Dau c 1 .
The crystal structure of norcoclaurine synthase shows the binding mode of dopamine and the nonreactive substrate analog 4-hydroxybenzaldehyde, which adopt a stacked configuration with their aromatic rings lying in nearly parallel planes  (Fig. 5B).
A subclass of PR-10 proteins displays strong allergenic properties. They constitute a large group of common food and pollen allergens [199, 200]. An important source of allergens are pollens, with birch-related pollens studied most extensively . Over 90% of birch pollen allergic patients develop specific IgE towards Bet v 1 . Thirteen different isoforms were identified with the isoform Bet v 1a exhibiting the highest and Bet v 1l the lowest allergenic activity [203, 204]. Additionally, Bet v 1 is known to cross-react with other food allergens . The apo structure of Bet v 1a and its Fab complex [138, 205] identified an important surface epitope of Bet v 1 and revealed that, since its area corresponds to ~ 10% of the total exposed surface of the allergen, it is plausible that more than one IgE molecule could bind simultaneously to the same allergen protein . The complex structure also shows that there is very little structural rearrangement of the allergen upon antibody binding, with a Cα rmsd of 0.65 Å. The sequentially discontinuous epitope is composed of a stretch between residues 42 and 52, which includes the glycine-rich loop flanked on one side by Arg70, Asp72, His76, Ile86 and Lys97. The interaction with the BV16 Fab is a combination of van der Waals interactions and hydrogen bonding. The latter involves half of the epitope residues, all of which are situated within the 42–52 stretch . The side chain of a central residue of the epitope, Glu45, points directly into a groove in the Fab BV16 surface, forming two hydrogen bonds with main-chain N atoms. The importance of Glu45 was confirmed by mutating it to serine, which completely abolished antibody binding . Serine was chosen since it is present at position 45 in several PR-10 proteins, e.g. from asparagus  and parsley . Also none of the Bet v 1 isoforms has serine at this position and it was shown to produce no structural changes in the crystal structure . 
Moreover, the E45S mutant also showed reduced potency for IgE binding (20%–50%). Similar structural conservation and reduced IgE binding was observed for Pru av 1 when mutated at Glu45 [194, 210], which confirms the importance of this residue. Large-scale mutagenesis experiments with four and nine amino acid substitutions produced Bet v 1 mutants that, while maintaining the overall fold, also displayed reduced IgE binding and induced strong Bet v 1 specific IgG responses in mice .
Two reports investigated the influence of Bet v 1 oligomerization on allergenicity [159, 212]. Using skin tests in Bet v 1 allergic mice, it was shown that the dimeric but not the monomeric Bet v 1a elicited positive reactions. Dynamic light scattering was used to compare the hyperallergenic Bet v 1a and the hypoallergenic Bet v 1d. The studies revealed that the latter protein had a high tendency to oligomerize, probably due to a serine to cysteine exchange at residue 113. A C113S mutation of Bet v 1d resulted in Bet v 1a-like behavior.
To study the structural basis of cross-reactivity, two crystal structures were solved for the food allergens Api g 1 from celery  and Dau c 1 from carrots , both of which lack the Bet v 1 specific Glu45 residue. Api g 1 and Dau c 1 have a sequence variation of only 30 amino acids that are distributed evenly between the protein surface and interior. It is therefore conceivable that both share the same cross-reactive epitopes with Bet v 1 . The epitope at the glycine-rich loop of Api g 1 and Dau c 1 diverges considerably from that of Bet v 1. Despite the fact that the glycine-rich loop is structurally very conserved, the preceding segment of Bet v 1 is quite different from that in the celery and carrot allergens, mainly as a result of the change at position 45 where the negative charge of Glu45 is replaced by a positively charged Lys residue in Api g 1 and Dau c 1 . The differences indicate that the Bet v 1 glycine-rich epitope is probably not important in IgE binding of Api g 1 and Dau c 1 . Moreover, a K45E substitution in the Api g 1 sequence leads to increased IgE binding . The structures of Api g 1 and Dau c 1 facilitate the identification of other epitopes at the protein surface responsible for the cross-reactivity of these allergens with Bet v 1 and also of epitopes which are not common between these allergens [198, 213].
Recently, the structure of an important food allergen, Gly m 4 from soybean, has been described, showing a typical PR-10 fold and epitopes similar to those in other allergens. In detailed comparisons, however, Gly m 4 turned out to be more similar to yellow lupine PR-10 structures than to the other allergens. For instance, the L5 loop of Gly m 4 is shorter and the terminal helix α3 curves slightly and slides by almost one turn along its axis. This relation is supported by phylogenetic analysis, which places this allergen close to classic PR-10 proteins (Fig. 3B).
A thermodynamic study of PR-10 allergens, in which the conformational stability of Bet v 1, Api g 1 and Dau c 1 was analyzed by determining the difference in Gibbs free energy between folded and unfolded molecules, indicates a relatively low stability, most probably due to the reduced hydrophobic core of the PR-10 fold.
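For orientation, the quantity compared in such stability studies can be written down explicitly. The expression below is a textbook sketch that assumes a simple two-state equilibrium between folded (F) and unfolded (U) molecules. This assumption is made here for illustration only and is not taken from the cited study; the symbols K_unf, R and T are likewise introduced here.

```latex
\Delta G_{\mathrm{unf}} \;=\; G_{\mathrm{U}} - G_{\mathrm{F}} \;=\; -RT \,\ln K_{\mathrm{unf}},
\qquad
K_{\mathrm{unf}} \;=\; \frac{[\mathrm{U}]}{[\mathrm{F}]}
```

In this convention, a smaller positive value of ΔG_unf corresponds to a less stable folded state, which is the sense in which the PR-10 allergens are described as having relatively low stability.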
Several studies investigated the effect of pH and temperature on the stability of PR-10 proteins [159-161, 172, 191, 214-220]. The results indicate that different homologs behave differently when their stability is analyzed under different conditions. To our knowledge, there has been no systematic thermodynamic study of all isoforms from one PR-10 subclass. Such a study, especially for a subclass (such as Bet v 1) with well characterized structure and allergenic properties, could provide information on possible folding intermediates relevant to allergic response. It was shown, for example, that a single Y120W mutation changes the refolding behavior of Bet v 1 by abolishing the accumulation of one intermediate state. The yellow lupine PR-10 proteins are also of interest in this regard, as small variations in sequence and cavity volume could be correlated with overall protein stability.
In a different approach, the endolysosomal degradation profile of Bet v 1 was investigated, using extracts of dendritic cells obtained from patients with birch pollen allergy. Remarkably, Bet v 1 showed resistance to endolysosomal degradation, with a high proportion of the protein remaining intact after 24 h. Nevertheless, the first fragments to emerge, derived from the N terminus (1–20), the central region (84–97) and the C terminus (146–157), contained frequently recognized T-cell-activating regions. The relative proteolytic resistance of Bet v 1 may provide a continuous supply of intact protein for the generation of proteolytic fragments, which may contribute to its high allergenic potential. Recently, simulation of gastric proteolysis for Api g 1, Mal d 1, Pru p 1 and Cor a 1 via pepsinolysis showed a rapid digestion of these proteins. The proteolysis leads to a mixture of peptides with masses below 2 kDa and results in lower affinity for their cognate IgE. Trypsinolysis was shown earlier to be slower for Api g 1 and Mal d 1, and significantly slower for Cor a 1.
The first PR-10 protein reported to undergo post-translational modification was the birch PR-10c homolog. PR-10c shares 70–74% sequence identity with the pollen allergen Bet v 1, but is not constitutively expressed and can be glutathionylated. PR-10c displays RNase activity, which is unaffected by the post-translational modification. The modification occurs at the unique PR-10c Cys82, which is poorly conserved among PR-10 proteins. Nevertheless, cysteine is found at an identical position in a small number of homologs within the Betula family. Apart from this family, a conserved cysteine is found in the major hazelnut allergen Cor a 1.0401 [157, 223], the major Asian hazel allergen Cor h 1, the major European chestnut allergen Cas s 1 and Hyp-1.
Park et al. established that the hot pepper CaPR-10 protein can be phosphorylated and demonstrated that this modification enhances the RNase activity. The phosphorylation is likely to endow the protein with specificity for certain RNAs, thus preventing a potentially dangerous unspecific RNase activity within the plant cell. A recent report shows that phosphorylation of CaPR-10 is enhanced by LRR1. Also, some PR-10 proteins from Arachis hypogaea were shown to be phosphorylated. In the cocoa TcPR-10 protein, phosphorylation has no influence on the RNase activity or substrate specificity. Phosphorylation of StAR was shown to modulate its steroidogenic activity.
In two MLP proteins from A. thaliana, acetylation of the N terminus was detected. It has long been known that the removal of the N-terminal methionine of proteins (by methionyl aminopeptidase) depends on the size of the side chain of the amino acid in the penultimate position. In the same way, the efficiency of N-terminal acetylation also depends on the nature of the N-terminal amino acid, with a preference for alanine and serine. The two MLPs in the above study both have alanine residues at the N terminus. Interestingly, all 15 MLP-related protein sequences found in the NCBI database have a highly conserved N terminus; thus N-terminal acetylation of these proteins is expected.
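The two rules of thumb described above can be condensed into a short illustrative sketch. The Python snippet below is not a validated predictor. The set of "small" penultimate residues that permit methionine removal (Gly, Ala, Ser, Cys, Thr, Pro, Val) is a commonly quoted approximation assumed here, the acetylation preference covers only the alanine and serine mentioned in the text, and the function name and example sequences are hypothetical.

```python
# Illustrative sketch only: a simplified rule of thumb, not a validated predictor.

# ASSUMPTION: penultimate residues with small side chains that commonly allow
# removal of the initiator Met by methionyl aminopeptidase.
SMALL_PENULTIMATE = set("GASCTPV")

# From the text: N-terminal acetylation shows a preference for Ala and Ser.
ACETYLATION_PREFERRED = set("AS")


def predict_n_terminus(seq: str) -> dict:
    """Apply the two simplified N-terminal processing rules to a protein sequence.

    `seq` is a one-letter amino acid sequence that starts with the initiator Met.
    """
    seq = seq.upper()
    if len(seq) < 2 or seq[0] != "M":
        raise ValueError("sequence must start with Met and have at least two residues")

    # Rule 1: Met is removed only if the penultimate residue is small.
    met_removed = seq[1] in SMALL_PENULTIMATE
    mature = seq[1:] if met_removed else seq

    # Rule 2: acetylation is considered favoured only for an Ala or Ser N terminus.
    return {
        "met_removed": met_removed,
        "n_terminal_residue": mature[0],
        "acetylation_favoured": mature[0] in ACETYLATION_PREFERRED,
    }


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(predict_n_terminus("MAKLV"))
    print(predict_n_terminus("MKLVA"))
```

Under these assumptions, a sequence with alanine in the penultimate position, as in the two MLPs discussed above, is flagged as likely to lose its methionine and to be favoured for N-terminal acetylation.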
AmPR-10 isolated from the roots of Astragalus mongholicus was reported to be glycosylated, with a carbohydrate content of 13.7%. High performance anion exchange chromatography revealed that the carbohydrates consist mostly of arabinose (73%), glucose (15%) and fructose (4.8%).
The PR-10 class members are relatively small proteins with a characteristic fold consisting of a highly curved β-sheet and three α helices. On the surface of some homologs there are allergenic epitopes that cause IgE-mediated type I allergy. The secondary structure elements are assembled to create a large cavity of variable volume in the core of the protein. The cavity is predominantly hydrophobic but there are also polar side chains pointing into its lumen. The size and character of the cavity are mostly determined by the highly variable C-terminal helix α3. Such a cavity is an unusual feature, particularly in view of the small size of PR-10 proteins. Ligand-binding studies, fold similarity to START and ABA receptors, and above all the crystal structures of PR-10 homologs in complex with small molecules, all suggest a ligand-binding role for PR-10 proteins. PR-10 proteins seem to be rather non-specific binders, with the same homologs displaying affinity for different ligands and with the same ligand bound in a variety of well-defined ways. Interestingly, the binding of different ligands appears to induce conformational changes in the scaffold of the cavity. Moreover, the complexes characterized structurally so far indicate that the binding partners may be adaptable as well.
PR-10 proteins are widely distributed throughout the plant kingdom and are thus deemed to have crucial, albeit still elusive, functions. PR-10 homologs were shown to be involved in plant development and defense systems. They are typically encoded by multigene families and close homologs most probably perform similar functions, complicating interpretation of single-gene knockout experiments. Also, the lack of coordinated cellular expression makes assignment of a unique function difficult. It has been demonstrated that some PR-10 proteins are involved in various enzymatic reactions or possess antimicrobial activities, but these properties are certainly not shared by all PR-10 members. The most common feature seems to be the ability to bind (and store) small-molecule ligands, especially plant hormones. However, the low affinities characterizing such complexes, together with an amazing variability of the binding modes, make even this tentative assignment of a physiological role intriguing. On the other hand, perhaps the modest affinity coupled with binding versatility guarantees that the ligands can be delivered to their final receptors. In any case, despite the accumulated experimental data, PR-10 proteins are still a mystery and will certainly remain a fascinating object for structural and functional studies in the near future.
The work was supported in part by a grant (N N301 003739) from the Ministry of Science and Higher Education to KM.