Characterization of wheat puroindoline proteins


L. Day, Food Science Australia, 671 Sneydes Road, Werribee, VIC 3030, Australia
Fax: +61 3 97313250
Tel: +61 3 97313233


Puroindoline proteins were purified from selected UK-grown hexaploid wheats. Their identities were confirmed on the basis of capillary electrophoresis mobilities, relative molecular mass and N-terminal amino acid sequencing. Only one form of puroindoline-a protein was found in those varieties, regardless of endosperm texture. Three allelic forms of puroindoline-b protein were identified. Nucleotide sequencing of cDNA produced by RT-PCR of isolated mRNA indicated that these were the ‘wild-type’, found in soft wheats, puroindoline-b containing a Gly→Ser amino acid substitution (position 46) and puroindoline-b containing a Trp→Arg substitution (position 44). The latter two were found in hard wheats. Microheterogeneity, due to short extensions and/or truncations at the N-terminus and C-terminus, was detected for both puroindoline-a and puroindoline-b. The type of microheterogeneity observed was more consistent for puroindoline-a than for puroindoline-b, and may arise through slightly different post-translational processing pathways. A puroindoline-b allele corresponding to a Leu→Pro substitution (position 60) was identified from the cDNA sequence of the hard variety Chablis, but no mature puroindoline-b protein was found in this or two other European varieties known to possess this puroindoline-b allele. Wheats possessing the puroindoline-b proteins with point mutations appeared to contain lower amounts of puroindoline protein. Such wheats have a hard endosperm texture, as do wheats from which puroindoline-a or puroindoline-b are absent. Our results suggest that point mutations in puroindoline-b genes may confer hard endosperm texture through accumulation of allelic forms of puroindoline-b proteins with altered functional properties and/or through lower amounts of puroindoline proteins.


Abbreviations: CE, capillary electrophoresis; CMC, carboxymethyl cellulose; TX-114, Triton X-114.

Puroindolines (PINs) are low molecular weight, highly basic, highly surface-active, quantitatively relatively minor, non-gluten proteins of the wheat caryopsis. They occur in two major isoforms, PIN-a and PIN-b, with PIN-a being the predominant isoform [1]. The relative molecular masses of PINs, as calculated from amino acid sequences deduced from nucleotide sequencing of cloned genes, are in the range of 12 000–13 000 [2]. A distinguishing feature of these proteins is that they contain high levels of the amino acid tryptophan (Trp). Some of the Trp residues are present in an amphiphilic domain or loop flanked by two cysteine (Cys) residues forming a disulfide bond [3]. As well as the two Cys residues flanking the Trp-rich domain, PINs contain a further eight Cys residues, which are also involved in four disulfide bonds. The disulfide structure of PINs resembles that of another class of small proteins, plant non-specific lipid transfer proteins [3–5]. According to the deduced amino acid sequences [2], PIN-a and PIN-b have 55% homology in their primary structures. Significant differences occur in the Trp-rich domain, where the PIN-a domain (Trp-Arg-Trp-Trp-Lys-Trp-Trp-Lys) includes five Trp and three basic residues, whereas the PIN-b domain (Trp-Pro-Thr-Lys-Trp-Trp-Lys) contains only three Trp and two basic residues.
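The residue counts quoted for the two Trp-rich domains can be checked directly from the sequences given above. The following is a minimal sketch; the function name and the choice to treat Arg, Lys and His as the basic residues are illustrative assumptions rather than part of the original work.

```python
# Trp-rich domain sequences as quoted in the text (three-letter codes).
PIN_A_DOMAIN = "Trp-Arg-Trp-Trp-Lys-Trp-Trp-Lys"
PIN_B_DOMAIN = "Trp-Pro-Thr-Lys-Trp-Trp-Lys"

# Assumption: residues counted as basic in this sketch.
BASIC_RESIDUES = {"Arg", "Lys", "His"}


def domain_composition(domain: str) -> dict:
    """Count total, Trp and basic residues in a hyphen-separated sequence."""
    residues = domain.split("-")
    return {
        "length": len(residues),
        "Trp": residues.count("Trp"),
        "basic": sum(1 for r in residues if r in BASIC_RESIDUES),
    }


pin_a = domain_composition(PIN_A_DOMAIN)
pin_b = domain_composition(PIN_B_DOMAIN)
print("PIN-a domain:", pin_a)
print("PIN-b domain:", pin_b)
```

The counts agree with the text: five Trp and three basic residues for PIN-a, and three Trp and two basic residues for PIN-b.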

Research on PINs over the last decade has focused on their possible role in controlling the endosperm texture (hardness or softness) of wheat grain, an important characteristic that has a profound effect on the technological performance of wheat [6]. A strong link has been shown, largely through analysis of genomic DNA sequences, between allelic differences among PIN genes and variation in wheat grain endosperm texture [7–12]. Soft endosperm texture is considered to be the wild phenotype, and soft wheats contain both PIN-a (genetic allele Pina-D1a) and wild-type PIN-b (allele Pinb-D1a) proteins [7]. Several PIN variants that are associated with endosperm hardness have now been identified at the molecular genetic level. Some hard wheats have a null form of PIN-a (allele designation Pina-D1b) plus wild-type PIN-b [8], whereas others have the normal form of PIN-a plus one or other of several allelic PIN-b genes that have been identified. The deduced amino acid sequences indicate that the more common allelic PIN-b genes encode either a Gly→Ser amino acid substitution in PIN-b at position 46 (Pinb-D1b)[7], a Leu→Pro substitution at position 60 (Pinb-D1c), or a Trp→Arg substitution at position 44 (Pinb-D1d) [9]. Other relatively rare allelic PIN-b genes contain stop codons at either position 39 (originally Trp), position 44 (Trp) or position 56 (Cys), possibly resulting in truncated PIN-b proteins [10]. More recently, a further PIN-b variant associated with endosperm hardness was identified among Chinese hard wheats, again through analysis of genomic DNA sequences [11,12]. That PIN-b variant involved a single nucleotide (adenine) deletion at position 42, leading to a shift in the ORF and disrupting the last part of the Trp-rich domain. Several hard wheats have also recently been shown to have null forms of both PIN-a and PIN-b [12].

An additional indicator of the probable role of PINs in controlling endosperm texture is the demonstration that the locus corresponding to the PIN-a gene is closely linked to the hardness gene locus, Ha, on the short arm of chromosome 5D [13]. The Ha gene locus is known to be responsible for a high proportion (approximately 60%) of the variation in wheat endosperm texture [14,15]. No PIN genes have been found in the ultra-hard durum wheat used for pasta making, Triticum durum, which lacks the D genome [2]. The evidence indicates clearly that PINs, despite being relatively minor components of wheat endosperm, are intimately involved in the phenomenon of variation in endosperm texture in a manner that is still not clearly or fully defined at the genetic, biochemical or physicochemical levels.

The mature PIN-a protein has been isolated, and its amino acid sequence and relative molecular mass have been determined [1], but only the N-terminal amino acid sequence has been determined for the purified wild-type PIN-b protein [16]. None of the ‘hard’ allelic forms of PIN-b protein have been isolated and characterized to date. Such research is needed to facilitate further understanding of the biochemical and physical functionalities of these proteins, and to establish the importance of these parameters in the control of endosperm texture in wheats. The aim of this work, therefore, was to develop procedures for isolating PIN proteins from wheats differing in endosperm texture, and to characterize the purified PIN proteins using capillary electrophoresis (CE) and MS as a basis from which to gain further understanding of the structure–function relationships of PIN proteins.


Identification of PIN proteins using CE

Five UK-grown hexaploid (Triticum aestivum) wheat varieties differing in endosperm texture were chosen for the study. Two of the varieties, Riband and Consort, have a soft endosperm texture, whereas the other three, Hereward, Soissons and Chablis, have a hard endosperm texture. CE analysis was performed using the capillary zone electrophoresis mode of separation in low-pH phosphate buffer. This mode of CE provides a powerful means of discriminating between closely related protein species. Differences in electrophoretic mobility between proteins are due to various factors, among them relative molecular mass, conformation, charge and solvation properties. The CE profiles (electrophoregrams) of the high-mobility, basic components among the Triton X-114 (TX-114) extractable proteins from their flours are shown in Fig. 1. An unresolved envelope of slower-moving, less basic components is not shown; this material probably comprises higher molecular weight proteins, mainly γ-gliadins, which have been characterized in such extracts [16,17]. Normally, four major peaks were detected by CE, and, by comparison with the mobilities of protein standards, these were identified as α-purothionins and β-purothionins (the first two peaks), as well as PIN-a and PIN-b (Fig. 1). It should be noted that, in our earlier publication [18], the CE peaks for PIN-a and PIN-b were misidentified (wrong way round) owing to inadvertent mislabeling of PIN-a and PIN-b standards (supplied by D Marion, INRA-Nantes, France) used as electrophoretic markers. The correct identities of the peaks and standard protein samples were established here by MS and N-terminal amino acid sequencing (results not shown).

Figure 1.

 CE separations of TX-114-extractable proteins (0.5 mg·mL−1) from wheat flours with different endosperm textures: (A,B) The soft varieties Riband and Consort. (C) The hard variety Hereward. (D) The hard variety Soissons. (E) The hard variety Chablis. Peaks: 1, α-purothionin; 2, β-purothionin; 3, PIN-b; 4, PIN-a.

The varieties Riband and Consort (soft endosperm texture) both contained the wild-type PIN-a and PIN-b proteins (Fig. 1A,B). The other three varieties (all hard) contained PIN-a proteins with the same mobility as the wild-type, but they differed in their PIN-b protein patterns. The PIN-b protein from Hereward (Fig. 1C) exhibited a slightly faster mobility in the CE profile than the PIN-b proteins from Riband and Consort, whereas the PIN-b protein peak from Soissons (Fig. 1D) exhibited an even shorter migration time. No peak that could be attributed unequivocally to PIN-b was observed for the variety Chablis (Fig. 1E).

As well as having PIN-b components with CE mobilities different from those of wheat varieties with a soft endosperm texture, it was evident that the three hard varieties had lower amounts of PIN-b protein than the two soft varieties on the basis of CE peak sizes (Fig. 1). CE analysis of PIN proteins from a large number of other varieties has provided similar findings (results not shown). The amounts of PIN-b protein recovered from hard wheats in the purification work described below were also less than from soft wheats. Additionally, it appeared that, as the degree of hardness increased among the hard wheat varieties, the CE PIN-b peak size decreased. Thus, the variety Chablis, which is the hardest of the varieties examined, contained no PIN-b protein. The variety Soissons, which is somewhat less hard than Chablis, had a small PIN-b peak. The variety Hereward, which is less hard even than Soissons, had a PIN-b peak larger than that of Soissons but smaller than those of the soft varieties Riband and Consort. The PIN-a peak sizes were similar in all cases. These observations may indicate an effect of total PIN protein amount on endosperm texture, softer wheats having more total PIN protein than hard wheats, although more rigorous quantification and analysis of a much greater number of samples is required before firm conclusions can be drawn.

Isolation of PIN proteins using carboxymethyl cellulose

PIN proteins were extracted from flour and separated into PIN-a and PIN-b protein-enriched fractions using a combination of methods reported previously [1,19], prior to purification on a Mono-S cation exchange chromatography column. During the development of the purification procedure, CE was used for monitoring the high-mobility protein components at each step (Fig. 2). The preparation of the PIN-a- and PIN-b-enriched fractions involved an initial selective extraction with the detergent TX-114, which was shown by CE to extract both PIN-a and PIN-b, as well as purothionins (Fig. 2A). Very basic proteins, including PIN-b but not PIN-a, were adsorbed from the TX-114 extract onto carboxymethyl cellulose (CMC) at acidic pH and were subsequently eluted with 1 M NaCl (Fig. 2C). A PIN-a-enriched fraction was produced by phase separation at 30 °C of the TX-114-extracted proteins that were not adsorbed onto CMC, followed by precipitation of PIN-a and some other proteins from the detergent-rich phase using ethanol/diethyl ether (Fig. 2B).

Figure 2.

 CE separations of: (A) total TX-114-extractable proteins from the soft variety Riband; (B) PIN-a-enriched fraction, i.e. protein remaining in the TX-114 solution that did not bind to CMC, which was recovered by phase partitioning and solvent precipitation; (C) TX-114-extractable proteins selectively bound to CMC and eluted by 1 M NaCl; (D) purothionin-enriched fraction of CMC-bound proteins; (E) PIN-b-enriched fraction. Peaks: 1, α-purothionin; 2, β-purothionin; 3, PIN-b; 4, PIN-a.

Further separation of PIN-b protein from purothionins, both of which were found to bind to CMC, was necessary, as they were coeluted during cation exchange chromatography using a Mono-S column [20]. According to previous reports [1,16], the separation of PIN-b from purothionins should have been achievable by selectively precipitating PIN-b using 2 M NaCl. CE analysis of the fractions obtained using 2 M NaCl revealed that about one-third of the PIN-b protein had not precipitated and remained in solution with purothionins. An increased NaCl concentration of 4 M was required to achieve complete separation of PIN-b from purothionins. CE analyses of the proteins remaining in the 4 M NaCl supernatant (purothionin-enriched fraction) and in the precipitate (PIN-b-enriched fraction) are shown in Fig. 2D,E. The ability to purify different forms of the polypeptide purothionin using the protocol established here is a side benefit of the present research. Previous methods for purifying purothionin have involved the use of hazardous flammable organic solvents [21], which the present method avoids.

Purification and characterization of PIN-a proteins

It was necessary to separate the PIN-a in the enriched fractions from other proteins, which were mostly less basic (lower CE mobility) and which probably included γ-gliadins [16,17]. A Mono-S cation exchange chromatography elution profile of the PIN-a-enriched fraction from the wheat variety Riband is shown in Fig. 3A. The main protein peak was eluted at an ammonium acetate concentration of approximately 0.5 M. The protein had a migration time of 7.5 min, as determined by CE (Fig. 3B). N-terminal amino acid sequencing of this protein showed complete homology over the first 10 amino acid residues, Asp-Val-Ala-Gly-Gly-Gly-Gly-Ala-Gln-Gln, with PIN-a [1]. Very similar Mono-S elution profiles were obtained for all five varieties. The CE migration times of the five purified PIN-a proteins differed by less than 0.02 min, which is within the generally accepted migration time variability in CE. Minor heterogeneity occurred for PIN-a protein, as shown by both cation exchange chromatography and CE (Fig. 3). The Mono-S cation exchange chromatography PIN-a peak showed an obvious trailing shoulder, whereas in CE one or two minor PIN-a peaks were observed as well as one major peak. ES MS analysis of purified PIN-a proteins from three varieties showed that each contained one major isoform with a relative molecular mass of 12 743 (± 2) and one minor isoform with a relative molecular mass of 12 908 (± 3) (Table 1). These values are similar to (but slightly lower than) the reported mass for PIN-a protein [1] and that calculated from the deduced amino acid sequence [2]. The lower relative molecular mass corresponds to a PIN-a protein containing 115 amino acids and with a Thr residue at the C-terminus (theoretical relative molecular mass 12 749), whereas the higher relative molecular mass corresponds to the same protein with an Ile-Gly dipeptide extension after the C-terminal Thr residue (theoretical relative molecular mass 12 918) [22].

Figure 3.

  (A) Mono-S cation exchange HPLC elution profile of PIN-a-enriched fraction from the soft variety Riband. (B) CE separation of the major protein that was eluted at about 0.55 M ammonium acetate.

Table 1. Relative molecular masses of purified PIN proteins determined by ES MS and MALDI-TOF MS. (a) Main isoform, (b) minor isoform.

Wheat variety      Relative molecular masses of the major proteins in order of abundance
                   Purified PIN-a              Purified PIN-b
Riband (soft)      (a) 12 741, (b) 12 904      (a) 13 076 (13 076a)
Consort (soft)     -                           (a) (13 076a)
Hereward (hard)    (a) 12 743, (b) 12 908      (a) 13 103 (13 106a), (b) 12 975
Soissons (hard)    (a) 12 744, (b) 12 904      (a) 13 045, (b) 13 240

a MALDI-TOF MS results.

cDNA was prepared by RT-PCR of mRNA extracted from developing endosperms of five wheat varieties with different endosperm textures (the soft variety Riband, and the hard varieties Buster, Shamrock, Soissons and Chablis) using a PIN-a-specific primer. The varieties Buster and Shamrock were used instead of Hereward, owing to the unavailability of growing plants of the latter variety. CE analysis (not shown) confirmed that the PIN-a proteins from all three varieties were identical. Nucleotide sequencing of those cDNA preparations confirmed that the genes encoding PIN-a are the same regardless of whether the endosperm texture of the wheat is soft or hard (results not shown).

The recovered amount of purified PIN-a protein ranged between 15% and 18% (w/w) of the CMC-unbound protein, equivalent to 0.02% (w/w) of flour for these five varieties, which is somewhat higher than the level (0.015%) reported previously [1].

Purification and characterization of different allelic forms of PIN-b proteins

Differences between the PIN-b proteins isolated from wheat varieties differing in endosperm texture were observed in the Mono-S cation exchange chromatography elution profiles of their PIN-b-enriched fractions and the subsequent CE analyses of the purified proteins (Fig. 4). For the soft varieties Riband and Consort, a major protein peak was eluted at an ammonium acetate concentration of approximately 0.55 M (Fig. 4A); it had a CE migration time of 6.61 min (Fig. 4B). The N-terminal amino acid sequence of both of these purified proteins was Glu-Val-Gly-Gly-Gly-Gly-Gly-Ser-Gln-Gln, which is identical to the PIN-b amino acid sequence deduced from the genomic PIN-b nucleotide sequence [2]. In the present work, cDNA was prepared by RT-PCR of mRNA extracted from developing endosperm of Riband, using a PIN-b-specific primer. The nucleotide sequence (results not shown) of this cDNA showed complete homology with that reported for the wild-type PIN-b, Pinb-D1a [2,7].

Figure 4.

 Mono-S cation exchange HPLC elution profiles of PIN-b-enriched fractions, and CE electrophoregrams of corresponding major protein peaks, from: (A,B) the soft variety Riband; (C,D) the hard variety Hereward; and (E,F) the hard variety Soissons.

A relative molecular mass of 13 076 was determined by ES MS and MALDI-TOF MS (Table 1) for PIN-b proteins from the two soft varieties Riband and Consort, which is greater than the calculated relative molecular mass of 12 935 for PIN-b with 114 amino acid residues [2]. The difference of 141 units between the measured and the calculated relative molecular mass indicates that, as in the case of PIN-a, the main isoform of PIN-b is likely to have a dipeptide (Ser-Gly) extension at the C-terminus; that is, it comprises 116 amino acid residues (theoretical relative molecular mass 13 079).

The Mono-S cation exchange chromatography elution profile of the PIN-b-enriched fraction from Hereward, a hard variety, differed from those of soft varieties. The main peak was eluted at an ammonium acetate concentration of approximately 0.58 M (Fig. 4C), and the protein in this peak had a CE migration time of 6.59 min (Fig. 4D).

The relative molecular mass of the main isoform of the purified PIN-b protein from Hereward was 13 103 by ES MS and 13 106 by MALDI-TOF MS (Table 1). This relative molecular mass is greater than that for the PIN-b proteins from the soft varieties by 30 mass units, indicating that the purified PIN-b protein from this hard variety is likely to have an amino acid substitution. This relative molecular mass difference of 30 mass units is consistent with the Gly→Ser substitution (allele Pinb-D1b) found in the genes encoding PIN-b from some hard wheats (theoretical relative molecular mass 13 109) [7]. A second minor PIN-b isoform was found in Hereward, and had a relative molecular mass of 12 975. This mass corresponds to PIN-b containing the Gly→Ser substitution from which the N-terminal Glu residue has been lost, resulting in a PIN-b protein containing 115 amino acid residues (theoretical relative molecular mass 12 980).

Owing to the unavailability of growing plant material, cDNA sequencing could not be carried out with Hereward. Instead, the varieties used were Buster and Shamrock, which have PIN-b proteins with the same mobilities and relative peak intensities as that from Hereward, as determined by CE of total TX-114 extracts [18]. The cDNA sequencing results (not shown) revealed that a single nucleotide base change of guanine to adenine occurred in the PIN-b sequence of both Buster and Shamrock, causing an amino acid change from Gly to Ser at position 46 in the expressed protein. The fact that the Gly→Ser substitution in PIN-b has been observed only at position 46 makes it reasonable to assume that Hereward contains the same Gly→Ser substitution in PIN-b as in Buster and Shamrock.

A peak that was eluted in the region expected for PIN-b was also observed in the chromatogram for Soissons, although it was eluted at a higher ammonium acetate concentration of approximately 0.60 M (Fig. 4E). The protein had a CE migration time of 6.49 min (Fig. 4F). The relative molecular mass of this protein was determined by ES MS as 13 045 (Table 1), which is less than the relative molecular mass of the wild-type PIN-b protein by 31 mass units. This mass is consistent with the PIN-b protein containing 116 amino acid residues and having a Trp→Arg substitution (theoretical relative molecular mass 13 049). That the Soissons PIN-b did contain the Trp→Arg mutation (Pinb-D1d), which is found in some northern European hard wheats [9], was confirmed by nucleotide sequence analysis of the cDNA from the same variety. A single nucleotide base change of thymine to adenine was detected, which would result in a Trp→Arg change at position 44 of the PIN-b protein. The Trp→Arg substitution results in a relative molecular mass reduction of 30 mass units, as well as a greater net positive charge on the protein, as arginine is a basic amino acid and has the highest pI value, 10.8, of all the amino acid residues. This was clearly shown by the migration time shift in CE. A difference of 0.12 min was observed between the purified PIN-b proteins from Soissons and Riband, but only a 0.02 min difference between the purified PIN-b proteins from Hereward and Riband (Fig. 4B,D,F). Therefore, the net charge of the PIN-b protein from Soissons (Pinb-D1d) is greater than that of the PIN-b protein from Hereward (Pinb-D1b), and that of the PIN-b protein from Riband (Pinb-D1a).

The second PIN-b isoform present in Soissons had a relative molecular mass of 13 240, i.e. 195 mass units greater than the first isoform. By calculation, this could be accounted for by a Tyr-Tyr dipeptide extension (+ [2 × 163] mass units) at the C-terminus (after the Ser-Gly extension), which is consistent with the nucleotide sequence of the PIN-b gene [2], and a loss of Glu (− 129 mass units) at the N-terminus at the same time, resulting in a protein containing 117 amino acid residues (theoretical relative molecular mass 13 246). A similar loss of N-terminal Glu also appeared to occur in the Hereward minor PIN-b component. This kind of microheterogeneity has been observed for the mature PIN-a protein [1]. Therefore, it seems reasonable to suggest that PIN-b has similar C-terminal and N-terminal modifications.
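The theoretical masses quoted for the PIN-b isoforms follow from simple residue-mass arithmetic, sketched below. The average residue masses are standard textbook values rather than figures from this work, and the function names are illustrative assumptions. The two starting masses (12 935 for the 114-residue protein and 13 079 for the 116-residue wild type) are those quoted in the text.

```python
# Average residue masses (Da); standard values, not taken from the paper.
RESIDUE_MASS = {
    "Gly": 57.05, "Ser": 87.08, "Trp": 186.21,
    "Arg": 156.19, "Glu": 129.12, "Tyr": 163.18,
}


def substitution_shift(old: str, new: str) -> float:
    """Mass change when residue `old` is replaced by residue `new`."""
    return RESIDUE_MASS[new] - RESIDUE_MASS[old]


def terminal_shift(added=(), removed=()) -> float:
    """Mass change from residues added to or removed from the termini."""
    gained = sum(RESIDUE_MASS[r] for r in added)
    lost = sum(RESIDUE_MASS[r] for r in removed)
    return gained - lost


PIN_B_114 = 12935.0  # calculated mass quoted for the 114-residue PIN-b
PIN_B_116 = 13079.0  # theoretical mass quoted for the 116-residue wild type

predicted = {
    "wild type + Ser-Gly": PIN_B_114 + terminal_shift(added=["Ser", "Gly"]),
    "Gly->Ser": PIN_B_116 + substitution_shift("Gly", "Ser"),
    "Gly->Ser, -Glu": PIN_B_116 + substitution_shift("Gly", "Ser")
    + terminal_shift(removed=["Glu"]),
    "Trp->Arg": PIN_B_116 + substitution_shift("Trp", "Arg"),
    "Trp->Arg, +Tyr-Tyr, -Glu": PIN_B_116 + substitution_shift("Trp", "Arg")
    + terminal_shift(added=["Tyr", "Tyr"], removed=["Glu"]),
}

for name, mass in predicted.items():
    print(f"{name}: {mass:.1f}")
```

Rounded to whole mass units, these predictions reproduce the theoretical values given in the text (13 079, 13 109, 12 980, 13 049 and 13 246), which can then be compared with the measured masses in Table 1.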

MS and CE analyses of purified PIN-b proteins have provided important new information about the structural properties of PIN-b proteins. There are differences in their charges as well as their relative molecular masses. CE analyses of the purified PIN-b proteins revealed clear differences in PIN-b protein migration times: 6.61 min for the PIN-b from both the soft varieties Riband and Consort, 6.59 min for that from Hereward (hard), and 6.49 min for that from Soissons (hard). The differences were small, especially between the soft varieties and the Hereward group of hard varieties, in which the PIN-b proteins differ only in terms of the Gly→Ser amino acid substitution, i.e. with no alteration in charge. Nevertheless, when a mixture of three types of the purified PIN-b proteins was analyzed, all three were separated, although not completely; the PIN-b proteins from Hereward and Riband formed a doublet peak (Fig. 5). This clearly demonstrated that there are true mobility differences between the allelic forms of the PIN-b protein.

Figure 5.

 CE electrophoregram of a mixture of the PIN-b proteins purified from the soft variety Riband and the hard varieties Hereward and Soissons. The amount of Soissons protein in the mixture was half that used for Riband and Hereward.

The amount of PIN-b purified from the soft varieties accounted for about 16% (w/w) of the PIN-b-enriched fraction or 0.005% (w/w) on a flour weight basis. As for the hard varieties, the purified PIN-b protein accounted for 12% (w/w) of the PIN-b-enriched fraction, equivalent to 0.003% (w/w) of the flour of the variety containing the Gly→Ser substitution and 0.002% (w/w) of the flour of the variety containing the Trp→Arg substitution.
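The yield figures above can be put on a common basis with a short calculation. The sketch below uses only the percentages stated in the text; the variable and function names are illustrative, and the back-calculated size of each PIN-b-enriched fraction is an inference from those percentages, not a measured value.

```python
# (purified PIN-b as % of flour, purified PIN-b as % of enriched fraction),
# as quoted in the text.
YIELDS = {
    "soft (wild type)": (0.005, 16.0),
    "hard (Gly->Ser)": (0.003, 12.0),
    "hard (Trp->Arg)": (0.002, 12.0),
}


def enriched_fraction_pct_of_flour(pct_of_flour: float, pct_of_fraction: float) -> float:
    """Back-calculate the enriched fraction as % (w/w) of flour."""
    return pct_of_flour / (pct_of_fraction / 100.0)


soft_yield = YIELDS["soft (wild type)"][0]
relative_yield = {k: v[0] / soft_yield for k, v in YIELDS.items()}
fraction_size = {k: enriched_fraction_pct_of_flour(*v) for k, v in YIELDS.items()}

for name in YIELDS:
    print(f"{name}: {relative_yield[name]:.0%} of soft-wheat yield, "
          f"enriched fraction ~{fraction_size[name]:.3f}% of flour")
```

On a flour basis, the recovered PIN-b from the two hard varieties is roughly 60% and 40% of that recovered from the soft varieties.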

PIN-b containing the Leu→Pro mutation (Pinb-D1c)

Although a small amount of the TX-114-extracted fraction (nominally ‘PIN-b-enriched’) was produced from the hard variety Chablis, the quantity was less than those obtained from any of the other four varieties studied. Upon examination by CE, no peak was observed within the migration time region of 6.5–7.0 min expected for PIN-b (Fig. 6A). Further purification of the TX-114 extract by Mono-S cation exchange chromatography also did not show any protein peaks where the other allelic forms of PIN-b protein were eluted (Fig. 6B). Two possible explanations for these observations that occurred to us initially were either that Chablis represented a novel (at the time) null PIN-b variant or that its PIN-b gene contained one of the known stop codon mutations [10,11]. Somewhat surprisingly, these possibilities were ruled out by the observation that a cDNA product, the same size as the cDNA of other allelic forms of PIN-b, was produced after RT-PCR of the mRNA isolated from the developing endosperm of Chablis. Nucleotide sequencing of this product revealed that it was a complete PIN-b gene, but with a single base change, from thymine to cytosine, which would result in a Leu→Pro change at position 60 in the deduced amino acid sequence (results not shown). Thus, Chablis contains a PIN-b gene with the Leu→Pro substitution (Pinb-D1c allele).
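The three single-base changes reported for PIN-b (guanine to adenine for Gly→Ser, thymine to adenine for Trp→Arg, and thymine to cytosine for Leu→Pro) can be checked against the standard genetic code. The sketch below enumerates every codon in which the stated base change produces the stated amino acid change. It assumes the changes are read on the coding strand, and it lists the compatible codons only; it does not claim which codon actually occurs in the PIN-b gene.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_routes(from_aa, to_aa, from_base, to_base):
    """List (codon, mutant codon, codon position) tuples for which changing
    one `from_base` to `to_base` converts `from_aa` into `to_aa`."""
    routes = []
    for codon, aa in CODON_TABLE.items():
        if aa != from_aa:
            continue
        for i, base in enumerate(codon):
            if base != from_base:
                continue
            mutant = codon[:i] + to_base + codon[i + 1:]
            if CODON_TABLE[mutant] == to_aa:
                routes.append((codon, mutant, i + 1))
    return routes


gly_ser = single_base_routes("G", "S", "G", "A")
trp_arg = single_base_routes("W", "R", "T", "A")
leu_pro = single_base_routes("L", "P", "T", "C")
print("Gly->Ser (G to A):", gly_ser)
print("Trp->Arg (T to A):", trp_arg)
print("Leu->Pro (T to C):", leu_pro)
```

Each reported base change is compatible with the corresponding amino acid substitution: Trp has a single codon, so the Trp→Arg change can only arise at its first position, whereas the Leu→Pro change requires a second-position change in one of the four CTN codons.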

Figure 6.

  (A) CE electrophoregram and (B) Mono-S cation exchange HPLC elution profile of PIN-b-enriched fraction from the hard variety Chablis.

The fact that, in the present research, an mRNA corresponding to this mutation was shown to be present in the developing endosperm shows that the corresponding PIN-b gene is present and is transcribed. Unlike the other allelic PIN-b proteins with single amino acid substitutions, the PIN-b protein with the Leu→Pro mutation was not found in the expected fractions. This observation was verified by CE analysis and protein purification using flours from two European wheat varieties, Portal and Avle. Both contain PIN-b genes with the Leu→Pro mutation [9]. Patterns of chromatography and CE profiles similar to those of Chablis were obtained; that is, no clearly identifiable PIN-b protein was found in either variety (results not shown).

Control experiments were carried out to determine whether the Leu→Pro substitution might have altered the properties of the protein such that it behaved unexpectedly in the fractionation procedure used for isolating PIN-b. Thus, extensive efforts were made to determine whether PIN-b occurred in any of the fractions obtained in the fractionation procedure other than that in which PIN-b was expected. Despite these strenuous efforts, which were carried out with the cultivars Portal and Avle, as well as Chablis, no PIN-b protein was found. Analysis of total TX-114 extracts without the CMC adsorption, precipitation with NaCl to remove purothionins and Mono-S ion-exchange chromatography also failed to demonstrate a PIN-b protein in the variety Chablis (results not shown). Furthermore, two other studies [12,23], in which extraction procedures were used that were different from those used here, also failed to find this PIN-b protein. We conclude, therefore, that the Leu→Pro PIN-b is not present in the flours of wheat cultivars containing this mutation.


The research reported here has resulted in the development of an effective purification protocol that enables substantial amounts (milligram amounts) of highly purified PIN-a and PIN-b proteins to be prepared from wheats with varying endosperm textures. This has facilitated further characterization of these proteins by CE, MS and N-terminal amino acid sequencing. Although PIN purification protocols have been reported previously [1,16,19], this is the first time that allelic forms of PIN-b protein have been isolated and characterized.

Only one type of PIN-a protein (genotype Pina-D1a) was found in the five varieties examined here, regardless of the endosperm textures of the wheats. So far, no additional allelic forms of PIN-a protein have been reported with the use of either SDS/PAGE [7–10], polyacrylamide gel electrophoresis at acid pH (A-PAGE) [23,24], or in our laboratory by CE analyses of several hundred different wheat varieties [18]. However, in some hard varieties, a null form of the PIN-a gene occurs [8–10]. The PIN-a null form is relatively common in North American, Chinese and Australian wheats, but is rare in or absent from European wheats [8–11,25]. The PIN-a null form has not been observed so far in any UK bread wheats analyzed by CE [20].

The results obtained here were somewhat different from those described by Kooijman et al. [19] regarding PIN-a protein purification, in that they reported that PIN-a, along with PIN-b, was adsorbed onto CMC. In our research, the PIN-b protein was shown to be adsorbed onto the CMC, but PIN-a was not. The same results were obtained for all five wheat varieties examined here, which covered a range of hardness. The reason for this difference in the behavior of PIN-a is not known, but, as the PIN-a protein itself does not seem to vary from flour to flour, it is likely to be due to differences in experimental conditions from those used by Kooijman et al. [19]. For example, in the present research, the CMC step was carried out directly with the 4% TX-114 extracts of flour, compared with about 12% (v/v) TX-114 in the published procedure [19]. Such physicochemical differences may have affected the strength of the binding interaction between the CMC and PIN-a, although why this should have affected PIN-a but not PIN-b is not known.

PIN-b proteins were purified from four of the five varieties examined in detail here. The CE and MS results showed that there were three allelic forms of PIN-b proteins among these four varieties containing PIN-b. The PIN-b proteins from the soft varieties Riband and Consort had the same relative molecular mass of 13 076, which is consistent with the theoretical relative molecular mass [2] for the wild-type PIN-b protein, genotype Pinb-D1a, of 116 amino acids. A PIN-b protein with a relative molecular mass of 13 106 was found in the hard variety Hereward. The relative molecular mass difference, the CE mobility shift and the nucleotide sequence of the cDNA from the mRNA of varieties having a PIN-b with the same CE mobility were consistent with the conclusion that this PIN-b protein also contained 116 amino acids but possessed a single Gly→Ser amino acid substitution at position 46. The protein is the product of the Pinb-D1b allele. The main PIN-b protein from the hard cultivar Soissons had a relative molecular mass of 13 045, and its CE mobility differed from that of the wild-type PIN-b protein as well as from that of the PIN-b containing the Gly→Ser substitution found in the hard cultivar Hereward. The nucleotide sequence of the cDNA from the mRNA of Soissons showed that the Soissons PIN-b protein, which is the protein product of the Pinb-D1d allele, had a Trp→Arg mutation at position 44. Mobility differences between the three main allelic PIN-b proteins from Riband, Hereward and Soissons were observed by CE (Fig. 5). Therefore, the single amino acid substitutions Gly→Ser and Trp→Arg in the PIN-b proteins alter the relative molecular mass and/or the hydrodynamic conformation and/or the net charge and/or the solvation properties of these proteins to an extent sufficient to produce differences in mobility using the capillary zone electrophoresis mode of separation. 
Evidence of microheterogeneity was obtained for the PIN-b proteins from both Hereward and Soissons from the relative molecular mass results. A minor form of the PIN-b from Hereward appeared to have lost the N-terminal Glu residue that was present in the major isoform. In the case of Soissons, a minor PIN-b isoform appeared also to have lost the N-terminal Glu residue, but additionally it appeared to contain a C-terminal dipeptide (Tyr-Tyr) extension. The origin of this microheterogeneity is not known.
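The mass arguments above can be checked with simple residue arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes standard average residue masses (rounded to two decimal places) and takes the measured wild-type relative molecular mass of 13 076 as the baseline. The names `RESIDUE_MASS`, `ALLELES` and `substitution_shift` are illustrative only.

```python
# Average residue masses in Da (standard reference values, rounded to 2 d.p.).
RESIDUE_MASS = {
    "Gly": 57.05, "Ser": 87.08, "Trp": 186.21, "Arg": 156.19,
    "Leu": 113.16, "Pro": 97.12, "Glu": 129.12, "Tyr": 163.18,
}

WILD_TYPE_MR = 13076  # measured Mr of wild-type PIN-b (Pinb-D1a)


def substitution_shift(old: str, new: str) -> float:
    """Mass change (Da) when residue `old` is replaced by residue `new`."""
    return RESIDUE_MASS[new] - RESIDUE_MASS[old]


# Allele label -> (substitution, measured Mr reported in the text, or None).
ALLELES = {
    "Pinb-D1b": (("Gly", "Ser"), 13106),
    "Pinb-D1d": (("Trp", "Arg"), 13045),
    "Pinb-D1c": (("Leu", "Pro"), None),  # no mature protein was detected
}

for allele, ((old, new), measured) in ALLELES.items():
    predicted = WILD_TYPE_MR + substitution_shift(old, new)
    note = f"measured {measured}" if measured is not None else "not measured"
    print(f"{allele} {old}->{new}: predicted {predicted:.0f}, {note}")

# Terminal microheterogeneity: loss of the N-terminal Glu and gain of a
# C-terminal Tyr-Tyr extension each change the mass by whole residues.
glu_loss = -RESIDUE_MASS["Glu"]
tyr_tyr_gain = 2 * RESIDUE_MASS["Tyr"]
print(f"Loss of N-terminal Glu: {glu_loss:+.2f} Da")
print(f"Glu loss plus Tyr-Tyr extension: {glu_loss + tyr_tyr_gain:+.2f} Da")
```

Under these assumptions the predicted shifts are about +30 for Gly→Ser and about −30 for Trp→Arg, which agree with the measured values of 13 106 and 13 045 to within one mass unit.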

Although a cDNA sequence corresponding to the Leu→Pro mutation of the PIN-b gene was found in reverse transcripts of mRNA from the variety Chablis, no mature PIN-b protein was found in this hard wheat variety or in two other European varieties, Portal and Avle, known to have the Leu→Pro Pinb-D1c gene [9]. The Leu→Pro mutation in the putative PIN-b protein corresponding to this gene occurs outside the Trp-rich domain of PIN-b, between the second and third α-helical domains [3]. Like the Gly→Ser mutation, the Leu→Pro mutation would have no effect on the charge of the putative PIN-b protein, unlike the Trp→Arg mutation. The effect of the Leu→Pro mutation on the relative molecular mass of the putative PIN-b protein would be similar to those of the Gly→Ser or Trp→Arg mutations. Strenuous attempts to detect the protein in fractions other than those in which it was expected to occur were unsuccessful. As a cDNA encoding the Leu→Pro substitution was produced by RT-PCR of mRNA, it was clear that the corresponding gene was transcribed to mRNA. Analysis of genomic DNA sequences from varieties containing the Leu→Pro substitution has not so far indicated any defect that might prevent mRNA translation. Assuming that the mRNA for the PIN-b containing the Leu→Pro substitution is translated, it appears that, for some reason, as yet unknown, the protein does not accumulate in the endosperm. The ubiquitin/proteasome protein degradation pathway is known to be very active in plants [26], and it seems reasonable to speculate that the mRNA for the putative PIN-b protein containing the Leu→Pro substitution may be translated but that the structure of the newly synthesized protein may be altered sufficiently that the protein is degraded by proteases and thus does not accumulate in the endosperm. Ikeda et al. [12] also reported recently that no PIN-b protein could be detected in any of 10 Japanese and North American cultivars with the Pinb-D1c genotype, i.e. the Leu→Pro substitution.

We have put forward a hypothesis, the ‘friabilin hypothesis’ [27,28], which attempts to explain variation in endosperm texture on the basis of a protein named ‘friabilin’, which is now known to comprise mainly puroindolines [29–31], interacting more or less efficiently with the starch granule surface. When the interaction occurs at a low level or is absent, the starch granule surface may adhere strongly to the surrounding matrix of gluten proteins and other components in the endosperm. In this case, the endosperm will be hard. When the PIN–starch granule surface interaction occurs at a high level, as when both PIN-a and wild-type PIN-b are present, the PINs hypothetically act as a ‘nonstick’ layer at the granule surface. This prevents adhesion of the granule surface to other endosperm components, thus creating a plane of weakness and rendering the endosperm soft. The fact that both PIN-a and wild-type PIN-b must be present simultaneously to produce endosperm softness suggests that endosperm softness may be a two-gene effect. Neither the presence of PIN-a together with forms of PIN-b other than the wild-type, nor the presence of wild-type PIN-b in the absence of PIN-a, is sufficient to produce endosperm softness. Exactly how PIN-a and wild-type PIN-b interact to produce endosperm softness is not known.

The PIN–starch granule surface interaction may hypothetically involve the PIN Trp-rich domain binding to the starch granule surface through hydrophobic parallel stacking of indole rings with glucose rings [32]. The substitution of one of the Trp residues in the PIN-b Trp-rich domain with an Arg residue will reduce the hydrophobicity of the Trp-rich domain of PIN-b considerably and may reduce not only the potential for hydrophobic stacking significantly but also the lipid-binding ability of the protein. PINs are considered to be lipid-binding proteins [3], and there is some evidence that the interaction of PINs with the starch granule surface may be complex, involving polar lipids as well as the starch and protein components [33,34]. The Trp→Arg amino acid substitution may therefore be more significant than the Gly→Ser substitution in terms of endosperm texture. On the basis of the friabilin hypothesis, the Trp→Arg substitution in particular might weaken the interaction of PIN-b with the starch granule surface, allowing greater adhesion between the granule surface and the surrounding matrix of endosperm components and resulting in hard wheat grain endosperm texture. The variety Soissons, characterized here and in other work [24] as containing PIN-b with the Trp→Arg substitution, is, in fact, harder than varieties with the Gly→Ser substitution (unpublished results), as would be expected in the light of the above discussion.

Possible quantitative effects of different amounts of PIN proteins on endosperm texture, as well as qualitative effects of different allelic forms of PIN-b, also need to be considered. When both wild-type PIN-a and PIN-b are present, as in soft hexaploid (T. aestivum) wheats, the amount of PIN protein is at its greatest, and endosperm softening occurs to the greatest extent. Wheats containing allelic forms of PIN-b appear to have reduced amounts of PIN-b protein on the basis of peak sizes in CE electrophoregrams (Fig. 1); the amounts of purified PIN proteins recovered in our PIN isolation work were also lower for these wheats (see above). PIN protein levels in PIN-a null wheats are also reduced [18]. Ikeda et al. [12] have recently reached similar conclusions, based on the intensity of spots in two-dimensional electrophoregrams of protein-enriched fractions obtained by phase separation of TX-114 extracts. The hardest wheat variety examined in the present research (Chablis) appeared to have the lowest amount of PIN proteins as judged by the size of the PIN-a peak and the absence of a PIN-b peak in the CE electrophoregrams. Wheats lacking the genes encoding PIN-a also have harder grains than those with allelic forms of PIN-b [35,36]. This again is consistent with variation in PIN protein amount being important in relation to grain hardness, as the amount of PIN-a is substantially greater than that of PIN-b in those varieties in which both PIN proteins are present. Furthermore, the kernels of durum wheats (T. durum), which are harder still than the hardest hexaploid (T. aestivum) wheat kernels, are totally lacking in PIN proteins because of the absence of the D chromosome set, within which the genes encoding PIN proteins are located (i.e. on the short arm of chromosome 5D).

Research reported by Igrejas et al. [37] apparently contradicts this possible effect of PIN protein amount on endosperm texture variation. In that study, no correlation was observed between either PIN-a or PIN-b protein content, measured using an immunoassay (ELISA), and grain hardness. The lack of correlation between PIN-a content and endosperm texture is not surprising, given the results obtained in the present research and those obtained in other unpublished research from our laboratory showing that PIN-a CE peak sizes are relatively constant among different wheat varieties, except for PIN-a null genotypes, none of which were included in the study by Igrejas et al. [37]. Previous work with French wheats had, however, shown a negative correlation of PIN-a content with grain hardness when PIN-a null genotypes were included in the sample set [38]. Furthermore, although the polyclonal antibodies used in the research by Igrejas et al. [37] appeared to be specific under the conditions used in the ELISA, it is not clear, first, against which allelic form of PIN-b the anti-(PIN-b) sera were raised, and, second, whether or not the reactivity of the polyclonal antibodies was the same against different allelic forms of PIN-b. Indeed, the PIN-b alleles of the varieties/advanced breeding lines used were not determined/reported. It should be borne in mind also that the proportion of the variation in endosperm texture among bread wheats controlled by the Ha locus on the short arm of chromosome 5D (i.e. where the PIN genes are located) has been estimated at about 60% [14,15]. Therefore, a substantial, but minor, proportion of the variation in endosperm texture is controlled by genes other than those encoding PIN proteins. Whether or not the possible importance of PIN protein amount may have been masked as a result of one or more of these factors in the research by Igrejas et al. [37] is not clear.

There are several unresolved questions concerning the functional role of PIN proteins in controlling variation in endosperm texture in wheat grains. The availability of the protocol described here for purifying substantial quantities of both PIN-a and the different allelic forms of PIN-b may provide a number of new opportunities for investigating the mechanism of the different effects of PIN proteins on endosperm texture. These might include in vitro binding studies to elucidate the potential role of tryptophan/sugar hydrophobic stacking in the binding of PINs to the starch granule surface, as well as studies on the role of polar lipid binding in the control of endosperm texture [27,33,34].


A purification procedure and a CE method were established for the preparation and identification, respectively, of PIN proteins from wheats differing in their endosperm texture. The relative molecular masses of the purified PIN-a and PIN-b proteins of different allelic forms were determined by MS. One form of PIN-a protein (allele designated Pina-D1a) was identified in the five varieties examined, regardless of their endosperm texture. Three allelic forms of PIN-b proteins were identified. They are: the wild-type (Pinb-D1a) PIN-b with a relative molecular mass of 13 076, found in wheats with soft endosperm texture; PIN-b, with a relative molecular mass of 13 106, corresponding to a Gly→Ser substitution (allele Pinb-D1b), found in hard wheats; and PIN-b, with a relative molecular mass of 13 045, corresponding to a Trp→Arg substitution (allele Pinb-D1d), also found in hard wheats. Although mRNA corresponding to a Leu→Pro substitution (allele Pinb-D1c) was found in the developing endosperm of the hard wheat variety Chablis, no PIN-b protein was identified in this variety or in two other North European varieties known to possess this PIN-b gene sequence. The present research has provided opportunities to elucidate further the biochemical mechanism of endosperm texture variation in wheat, and to investigate the physicochemical functional properties of these lipid-binding proteins in flour processing.

Experimental procedures

Chemicals and reagents

TRIZMA base [tris(hydroxymethyl)aminomethane], Triton X-114, purothionin and horse myoglobin protein standards were purchased from Sigma Aldrich (Poole, UK). Diethyl ether, ethanol and acetonitrile were obtained from Rathburn (Walkerburn, UK). CMC (pre-swollen microgranular, CM52) was obtained from Whatman (Maidstone, UK). Methylhydroxypropyl cellulose [viscosity of a 2% (w/v) aqueous solution 40 000–60 000 cP] was obtained from Hercules (Salford, UK). The water used for preparing chemical solutions and buffers was from a Milli-Q Plus purification system (Millipore, Bedford, MA, USA). The Mono-S HR10/10 cation exchange column was obtained from Amersham Pharmacia Biotech (St Albans, UK). The RNeasy Plant Mini Kit and the OneStep RT-PCR Kit were purchased from Qiagen (Crawley, UK). Dye Reagent Concentrate for the Bradford dye-binding protein assay was purchased from Bio-Rad (Hemel Hempstead, UK). Other chemicals were obtained from BDH (Poole, UK).

Wheat flour samples

Hexaploid (T. aestivum) wheat samples from the 1999 UK harvest were of pure varieties grown at various sites under the UK Recommended List Trials conducted by the National Institute of Agricultural Botany, Cambridge, UK. Wheat grain was milled into ‘straight-run’ white flour on a laboratory-scale Bühler MLU 202 mill (Bühler, Uzwil, Switzerland) using the Campden & Chorleywood Food Research Association (CCFRA) Tes-CM-0001 milling protocol (details of which can be obtained from CCFRA). The Bühler mill settings were chosen to achieve flour yields and starch damage levels as close as possible to current commercial practice in the UK. Flours milled from the German variety Portal and the Swedish variety Avle, both grown in Norway, were kindly provided by K. Tronsmo, E. Merethe Magnus and E. Mosleth Færgestad, Norwegian Food Research Institute MATFORSK, Ås, Norway.

Preparation of crude protein fractions

Wheat flour (240 g) was initially stirred with the PEK extraction buffer (0.05 M sodium phosphate, pH 7.6, containing 0.05 M KCl and 0.005 M EDTA, 1.2 L) at 4 °C for 1 h and centrifuged (1000 g, 10 min, Sorvall RC5B, DuPont, Wilmington, DE, USA). The resulting pellet was extracted with 4% (v/v) TX-114 in the PEK buffer (1.2 L), mixed at 4 °C for 1 h, and centrifuged (13 000 g, 10 min, Sorvall RC5B, DuPont). The pH of the supernatant was then adjusted to 4.5 with glacial acetic acid. CMC (12 g) was added to the TX-114 extract and mixed continuously with an overhead stirrer at 4 °C overnight. The CMC pellet, containing CMC-bound proteins, was then separated from the TX-114 supernatant, containing CMC-unbound proteins, by centrifugation (13 000 g, 10 min, Sorvall RC5B, DuPont).

The CMC pellet was washed twice with 0.05 M acetic acid (0.4 L). CMC-bound proteins were then eluted from the CMC resin with 1 M NaCl in 0.05 M acetic acid (0.08 L). The CMC pellet was removed by centrifugation (13 000 g, 10 min, Sorvall RC5B, DuPont), and solid NaCl (14 g) was dissolved in the supernatant to a final concentration of 4 M, mixed at 4 °C for 1 h, and centrifuged (13 000 g, 10 min, Sorvall RC5B, DuPont). The pellet was redissolved in 0.05 M acetic acid (0.02 L), dialyzed (relative molecular mass cut-off 8000) against 0.05 M acetic acid (2 × 5 L, over 48 h) and freeze-dried (crude PIN-b fraction). The supernatant was also dialyzed against 0.05 M acetic acid (2 × 5 L, over 48 h) and freeze-dried (crude purothionin fraction).
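The salt addition in this step can be verified with a short calculation. The sketch below assumes that the full 0.08 L of eluate is recovered, that dissolving the solid NaCl does not change the volume appreciably, and that the molar mass of NaCl is 58.44 g/mol; the function name is illustrative.

```python
NACL_MOLAR_MASS = 58.44  # g/mol


def final_nacl_molarity(volume_l: float, initial_molar: float, added_g: float) -> float:
    """Final NaCl concentration (mol/L), ignoring any volume change on dissolution."""
    total_moles = volume_l * initial_molar + added_g / NACL_MOLAR_MASS
    return total_moles / volume_l


# 0.08 L of eluate already at 1 mol/L NaCl, plus 14 g of solid NaCl.
concentration = final_nacl_molarity(volume_l=0.08, initial_molar=1.0, added_g=14.0)
print(f"Final NaCl concentration: {concentration:.2f} mol/L")
```

Under these assumptions the result is very close to the stated final concentration of 4 mol/L.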

The TX-114 supernatant was heated at 30 °C for 1 h, and centrifuged (13 000 g, 10 min, Sorvall RC5B, DuPont) to achieve phase separation. The upper detergent-poor phase was discarded. The lower TX-114-rich phase was mixed with ethanol/diethyl ether (3 : 1 v/v, 1.2 L), and proteins were precipitated overnight at −20 °C. The protein pellet was recovered by centrifugation (13 000 g, 10 min, Sorvall RC5B, DuPont), and washed with ethanol/diethyl ether (0.4 L) at −20 °C to remove traces of the TX-114 detergent. The pellet was then vacuum dried briefly, redissolved in 0.05 M acetic acid (0.08 L), dialyzed against 0.05 M acetic acid (2 × 5 L, over 48 h) and freeze-dried (crude PIN-a fraction).

Protein purification by cation exchange chromatography

The dry crude protein fractions were dissolved in 0.05 M ammonium acetate, apparent pH 5.5, in 25% (v/v) acetonitrile (i.e. 1 : 3 acetonitrile/water by volume), heated at 30 °C for 30 min, and centrifuged (11 600 g, 4 min, Eppendorf 5415 D, Eppendorf AG, Hamburg, Germany). The resulting supernatant was applied to a Mono-S HR10/10 column (cation exchanger). Proteins were eluted with a 0.05 M to 1 M concentration gradient of ammonium acetate (pH 5.5) containing 25% (v/v) acetonitrile. The flow rate was 3 mL·min⁻¹. The absorbance was monitored at 280 nm, and fractions were collected, dialyzed against water (2 × 5 L, over 48 h) and freeze-dried. Fractions containing PINs were identified by CE.


CE analyses were carried out using a Beckman P/ACE MDQ system (Beckman Coulter, Inc., Fullerton, CA, USA), largely as described previously [18]. Freeze-dried or solvent-precipitated crude proteins, or protein fractions collected after the cation exchange chromatography, were dissolved in 0.05 M acetic acid to give an approximate protein concentration of 0.2–0.5 mg·mL⁻¹ by Bradford protein assay. The protein sample solution was injected into the CE system at the anode using pressure (0.01 MPa, 20 s). The capillary zone electrophoresis mode of separation was carried out in the electrophoresis buffer (0.1 M sodium phosphate containing 0.05% w/v methylhydroxypropyl cellulose, pH 2.5) with a constant voltage of +15 kV at 25 °C and monitored by UV detection at 200 nm. Before each run, the capillary was rinsed with 1 M phosphoric acid (0.14 MPa, 1 min), followed by water (0.14 MPa, 1 min) and then the electrophoresis buffer (0.14 MPa, 3 min).


ES MS analyses were carried out using a VG Quattro tandem mass spectrometer (Waters Ltd, Elstree, UK). Freeze-dried purified proteins (0.5 mg) were dissolved in Milli-Q water (0.5 mL). Acetonitrile (0.5 mL) and formic acid (2 µL) were added to produce a sample suitable for ES MS. The sample (20 µL) was injected into a solvent flow of 50% (v/v) aqueous acetonitrile at a flow rate of 20 µL·min⁻¹, using a standard injection loop. Data were acquired for 4 min after the initial injection, using multichannel analysis mode with a scan time of 5.5 s (interscan 0.10 s·decade⁻¹, where a decade is the mass range covering a factor of 10) and over the mass/charge range 600–2000 m/z. Data were analyzed using MassLynx software, version 1.2 (Waters Ltd). The relative molecular mass information for the protein was obtained using MaxEnt software (Micromass, Manchester, UK) (output mass of 10 000–15 000, resolution of 1.00 mass units per channel). The mass spectrometer was calibrated using a solution of native horse heart myoglobin (0.2 mg·mL⁻¹) in 50% (v/v) aqueous acetonitrile containing 0.2% (v/v) formic acid.
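To illustrate why a protein of about 13 000 relative molecular mass is observed within a 600–2000 m/z window, the sketch below lists the protonation states of the wild-type PIN-b (13 076) that would fall inside the scan range, and shows the textbook two-peak estimate of mass from adjacent charge states. This is not the maximum entropy deconvolution performed by the instrument software; it is a simplified illustration that assumes ions of the form [M + zH]z+ and a proton mass of 1.00728 Da. All function names are illustrative.

```python
PROTON_MASS = 1.00728  # Da


def mz(mass: float, z: int) -> float:
    """m/z of an ion carrying z protons, i.e. (M + z*H) / z."""
    return (mass + z * PROTON_MASS) / z


def charge_states_in_window(mass: float, low: float = 600.0,
                            high: float = 2000.0, max_z: int = 60) -> list[int]:
    """Charge states whose m/z lies inside the scanned window."""
    return [z for z in range(1, max_z + 1) if low <= mz(mass, z) <= high]


def mass_from_adjacent_peaks(mz_low_charge: float, mz_high_charge: float) -> float:
    """Estimate M from two adjacent peaks carrying z and z + 1 protons.

    mz_low_charge is the peak at higher m/z (charge z);
    mz_high_charge is its neighbour at lower m/z (charge z + 1).
    """
    z = round((mz_high_charge - PROTON_MASS) / (mz_low_charge - mz_high_charge))
    return z * (mz_low_charge - PROTON_MASS)


states = charge_states_in_window(13076)
print(f"Charge states in window: {states[0]}+ to {states[-1]}+")

# Round trip: simulate the 10+ and 11+ peaks, then recover the mass.
recovered = mass_from_adjacent_peaks(mz(13076, 10), mz(13076, 11))
print(f"Recovered mass: {recovered:.1f}")
```

Under these assumptions, charge states from 7+ to 21+ fall within the scanned range, so a series of multiply charged peaks is available for mass determination.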

The purified PIN-b proteins from Riband, Consort and Hereward were also analyzed using MALDI-TOF (Voyager-DE™ STR Workstation, Applied Biosystems, Warrington, UK) by the Michael Barber Centre for Mass Spectrometry, University of Manchester, UK. The instrument was equipped with a 337 nm nitrogen laser. The samples were mixed with an equal volume of UV-absorbing matrix solution and subsequently dissolved in water/acetonitrile (2 : 1, v/v) containing 0.1% (v/v) trifluoroacetic acid. The prepared sample was then spotted and dried onto the target for analysis. The accelerating voltage was 25 kV in linear mode and 23.6 kV in reflection mode; the signal digitization rates were 250 and 500 MHz, respectively.

N-terminal amino acid sequencing

The phenyl isothiocyanate method used was that described by Bloch et al. [31]. N-terminal amino acid sequences were determined by M. Naldred, John Innes Centre for Plant Science Research, Norwich, UK, using a pulsed-liquid amino acid sequencer (model 477A, Applied Biosystems, Foster City, CA, USA) equipped with an online phenylthiohydantoin amino acid analyzer (model 120A).

cDNA preparation and nucleotide sequencing

Greenhouse-grown plants, kindly supplied by P. Shewry and H. Darlington [Institute of Arable Crops Research (IACR), Long Ashton Research Station, Bristol, UK], were used for the isolation of RNA. The heads of the plants were tagged on the day when anthesis started and collected between 14 and 18 days after anthesis. The developing endosperms were harvested, immediately frozen in liquid nitrogen, and stored at −80 °C until required. Total RNA from developing endosperms was isolated using the Qiagen RNeasy Plant Mini Kit according to the kit protocol. Reverse transcription of RNA and PCR amplification of cDNA were carried out using the Qiagen OneStep RT-PCR Kit. All components required for reverse transcription and PCR amplification were added into one tube during set-up, according to the kit protocol. Both reactions were carried out sequentially in the same tube. Primers specific for PIN-a or PIN-b previously published by Gautier et al. [2] were synthesized by MWG-Biotech AG (Milton Keynes, UK) and used for the RT-PCR amplification of either PIN-a or PIN-b cDNA. Reactions were performed in a 96-well GeneAmp® PCR System 9600 (Applied Biosystems). The samples were first incubated at 50 °C for 30 min to complete the reverse transcription reaction. The reaction temperature was then raised to 95 °C for 15 min to activate the HotStarTaq DNA polymerase as well as to inactivate the reverse transcriptase and to denature the cDNA template. This was followed by 30 cycles of denaturation at 94 °C for 30 s, annealing at 58 °C for 1 min and extension at 72 °C for 2 min. The reaction was completed with a final extension at 72 °C for 10 min. The sizes of the cDNA products were confirmed using an Agilent 2100 Bioanalyzer (Palo Alto, CA, USA) with a DNA LabChip® and a DNA 1000 assay kit. Nucleotide base sequencing was carried out by Lark Technologies Inc. (Saffron Walden, UK), using dye-terminator sequencing methodologies on an automated PE Biosystems (Warrington, UK) sequencer.
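The one-tube RT-PCR temperature programme described above can be written down as a simple data structure, which makes the step order and the total programmed hold time easy to check. This is an illustrative sketch only: the `Step` class and the helper function are hypothetical and do not correspond to any thermocycler software, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Steps run once before cycling.
PRE_CYCLE = [
    Step("reverse transcription", 50, 30 * 60),
    Step("polymerase activation / RT inactivation", 95, 15 * 60),
]

# Steps repeated in every amplification cycle.
CYCLE = [
    Step("denaturation", 94, 30),
    Step("annealing", 58, 60),
    Step("extension", 72, 2 * 60),
]
N_CYCLES = 30

# Steps run once after cycling.
POST_CYCLE = [Step("final extension", 72, 10 * 60)]


def total_minutes() -> float:
    """Total programmed hold time in minutes, excluding temperature ramps."""
    seconds = sum(s.seconds for s in PRE_CYCLE)
    seconds += N_CYCLES * sum(s.seconds for s in CYCLE)
    seconds += sum(s.seconds for s in POST_CYCLE)
    return seconds / 60


print(f"Programmed hold time: {total_minutes():.0f} min")
```

With the values given in the text, the programmed hold time is 160 min, of which 105 min is spent in the 30 amplification cycles.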


This work was supported by Biotechnology and Biological Sciences Research Council (BBSRC) Cooperative Awards in Science and Engineering (CASE) to LD and SAL. We thank Professor P.R. Shewry and Dr H. Darlington (IACR, Long Ashton Research Station, Bristol, UK) for providing plant materials and many useful scientific discussions. We also thank Dr K. Tronsmo, Dr E. M. Magnus and Dr E. M. Færgestad (Norwegian Food Research Institute, MATFORSK, Ås, Norway) for providing Portal and Avle grain and flour samples, Dr M. Naldred (John Innes Centre for Plant Science Research, Norwich, UK) for carrying out N-terminal amino acid sequencing, and J. Connolly (University of Manchester, UK) for performing the MALDI-TOF MS analysis.