Keywords:

  • disulfide bond;
  • Kazal;
  • protease–inhibitor complex;
  • specificity;
  • thermostability

Abstract

Greglin is an 83-residue serine protease inhibitor purified from the ovaries of the locust Schistocerca gregaria. Greglin is a strong inhibitor of subtilisin and human neutrophil elastase, acting at sub-nanomolar and nanomolar concentrations, respectively; it also inhibits neutrophil cathepsin G, α-chymotrypsin and porcine pancreatic elastase, but to a lesser extent. In the present study, we show that greglin resists denaturation at high temperature (95 °C) and after exposure to acetonitrile and acidic or basic pH. Greglin is composed of two domains consisting of residues 1–20 and 21–83. Mass spectrometry indicates that the N-terminal domain (1–20) is post-translationally modified by phosphorylations at three sites and probably contains a glycosylation site. The crystal structure of the region of greglin comprising residues 21–78 in complex with subtilisin was determined at 1.75 Å resolution. Greglin represents a novel member of the non-classical Kazal inhibitors, as it has a unique additional C-terminal region (70–83) connected to the core of the molecule via a supplementary disulfide bond. The stability of greglin was compared with that of an ovomucoid inhibitor. The thermostability and inhibitory specificity of greglin are discussed in light of its structure. In particular, we propose that the C-terminal region is responsible for non-favourable interactions with the autolysis loop (140-loop) of serine proteases of the chymotrypsin family, and thus governs specificity.

Database

The atomic coordinates and structure factors for the greglin–subtilisin complex have been deposited with the RCSB Protein Data Bank under accession number 4GI3.


Abbreviations
AEI, Anemonia elastase inhibitor
CID, collision-induced dissociation
HRMS, high-resolution mass spectrometry
OMTKY3, third domain of turkey ovomucoid
RSL, reactive site loop

Introduction

Serine proteases are involved in many biological processes. In addition to their role in digestion, they have highly regulated functions in embryonic development, the immune response and blood coagulation [1]. In a healthy organism, the proteolytic activity of these enzymes is regulated by natural endogenous inhibitors. An imbalance between proteases and inhibitors may lead to severe dysfunctions. Serine protease inhibitors are classified as canonical inhibitors, non-canonical inhibitors and serpins based on their mechanism of action [2]. Approximately 20 families of canonical inhibitors have been defined based on sequence similarities, disulfide patterns, location of the active site and topological similarity of the overall structures, but the classification is continually being revised (http://merops.sanger.ac.uk) [3]. All the families share the canonical conformation of the inhibitory binding loop (also called the reactive site loop), but each family has its own specific three-dimensional fold. The best-studied families are Kazal, Kunitz (bovine pancreatic trypsin inhibitor), ecotin, pacifastin, Bowman–Birk and eglin.

The Kazal family is well characterized, especially at the structural level [4–6]. A typical Kazal domain comprises 40–60 amino acids that include six cysteine residues capable of forming three intra-domain disulfide bridges with the pattern 1–5, 2–4 and 3–6, resulting in a characteristic three-dimensional structure comprising one α-helix surrounded by an adjacent three-stranded anti-parallel β-sheet. The reactive site P1–P1′, as defined by Schechter and Berger [7], is located between the second and third cysteines. The latter is always located at the start of the central strand (β1) of the β-sheet. The three disulfide bonds stabilize the reactive site loop (RSL) and anchor the termini to the secondary structural elements that form the core of the molecule. Two disulfide bonds, namely 1–5 and 2–4, participate in the so-called cystine-stabilized α-helix motif. Classifications have been proposed based on the spacing between the cysteines of this motif [8]. Classical and non-classical Kazal domains, comprising sub-groups 1 and 2, have been defined based on the values of m and n in the frameworks Cys#1-(X)m-Cys#2 and Cys#4-(X)n-Cys#5 [8, 9]. We propose that the classification primarily depends on the value n, which indicates the relative position of the two cysteines present in the helix (Cys#4 and Cys#5), and that the value m is less important. Classical Kazal domains are defined by a two-residue spacing between Cys#4 and Cys#5, and are represented by ovomucoid inhibitors, such as the third domain of turkey ovomucoid (OMTKY3). Non-classical Kazal inhibitors contain a spacer region between Cys#4 and Cys#5 of between three and seven residues. Of all the Kazal inhibitors of known structure, only the ascidian trypsin inhibitor displays four disulfide bonds, with three cysteines in the α-helix [10].

The discovery of novel inhibitors of serine proteases may lead to applications ranging from pest management to drug therapy. Many organisms, including plants, fungi and insects, are potential reservoirs of as yet unknown inhibitors. In a previous study, a novel serine protease inhibitor named greglin was purified from the ovaries of the locust Schistocerca gregaria and biochemically characterized [11]. Greglin is 83 residues long, and its sequence revealed no striking homology except for a distant relationship with Kazal inhibitors. Greglin is a fast-acting and tight-binding inhibitor of human neutrophil elastase, with a Ki of 3.6 nM, and of subtilisin, with a Ki of 0.68 nM; it also inhibits neutrophil cathepsin G, α-chymotrypsin and porcine pancreatic elastase with Ki values of 153, 26 and 58 nM, respectively. In the same study, it was shown that greglin resists proteolysis and denaturation at high temperature (90 °C) and after exposure to acetonitrile and to acidic or basic pH [11]. To uncover the mechanisms underlying its biophysical properties, we determined the structure of greglin isolated from the ovaries of S. gregaria in complex with subtilisin Carlsberg. The structural data were complemented by detailed MS characterization. In addition, the stability of greglin was compared with that of chicken ovomucoid inhibitor by measuring their inhibitory activities after treatment under various conditions of temperature (up to 95 °C), pH and solvent.

Results

Initially, a synthetic gene encoding greglin was constructed according to the amino acid sequence published previously [11]. Greglin was cloned for over-expression in bacteria and Drosophila S2 cells. Using these two expression systems, a recombinant protein was obtained, but was either not soluble or showed only a residual inhibitory activity despite numerous optimization trials (C. Derache, A. Roussel & C. Kellenberger, unpublished data). Because these results could not be improved, we pursued an alternative strategy. We isolated the inhibitor from its natural source, the ovaries of the locust S. gregaria, as performed in the previous study [11]. Purified greglin was obtained using a three-step isolation procedure involving salt precipitation and reversed-phase HPLC, and its purity was assessed by polyacrylamide gel electrophoresis and mass spectrometry.

As greglin was previously reported to be a strong inhibitor of subtilisin, with an inhibition constant (Ki) of 0.68 nM, a complex between greglin and subtilisin Carlsberg was formed and purified by size-exclusion chromatography. The resulting complex was crystallized and its structure was determined.

Structure of the complex between greglin and subtilisin

The crystals of the complex between greglin and subtilisin belonged to the space group C2 and diffracted to 1.75 Å resolution. The structure was solved by the molecular replacement method using the subtilisin structure (Protein Data Bank ID 1YU6) as the search model. The structure of greglin was built manually in Turbo-Frodo [12] using clear difference Fourier maps, and refined using Refmac [13] and Buster [14]. The model was refined to a final Rfactor of 16.5% (Rfree = 20.3%), with good stereochemical parameters (Table 1). The final refined model consisted of residues 21–78 of greglin and residues 1–275 of subtilisin (Fig. 1A).

Table 1. Data collection and refinement statistics for the subtilisin/greglin complex

Data collection statistics
  Radiation source                    ESRF ID23-EH2
  Wavelength (Å)                      0.87260
  Space group                         C2
  Cell dimensions                     a = 131.1 Å, b = 39.7 Å, c = 59.8 Å, β = 98.1°
  Total measured reflections          116 178
  Unique reflections                  31 070
  Completeness (%)                    100 (100)ᵃ
  Redundancy                          3.7 (3.7)ᵃ
  Rmeasured ᵇ                         0.091 (0.541)ᵃ
  Rp.i.m. ᶜ                           0.047 (0.279)ᵃ
  I/σ(I)                              11.1 (2.7)ᵃ

Refinement and model statistics
  Resolution (Å)                      32.5–1.75
  Number of reflections used          31 066
  Rwork ᵈ                             0.166
  Rfree ᵈ                             0.202
  Average B values (Å²)
    All atoms                         23.2
    Subtilisin                        18.2
    Greglin                           35.2
    Water atoms                       34.6
  Root mean square deviation from ideality
    Bond lengths (Å)                  0.010
    Bond angles (°)                   1.09
    Torsion angles (°)                3.27
  Number of atoms
    Protein                           2361
    Water                             368
  Ramachandran plot
    Favoured region (%)               98.5 (322/327)
    Allowed region (%)                100 (327/327)
  MolProbity score [14a]              1.19 (99th percentile)

ᵃ Values in parentheses refer to data for the highest-resolution shell (1.84–1.75 Å).
ᵇ Rmeasured = Rr.i.m. = Σ √[n/(n − 1)] |Ihl − <Ih>| / Σ |<Ih>|, where Ihl is the observed intensity and <Ih> is the mean intensity from n observations (symmetry-related and duplicate measurements of a unique reflection).
ᶜ Rp.i.m. = Σ √[1/(n − 1)] |Ihl − <Ih>| / Σ |<Ih>|.
ᵈ R = Σ ||Fobs| − |Fcalc|| / Σ |Fobs|. Rfree is the R value for a subset of 5% of the reflection data that were not included in the crystallographic refinement.
Rr.i.m., multiplicity-weighted Rmeasured; Rp.i.m., precision-indicating merging R value.

Figure 1. Overall structure of the greglin/subtilisin complex. (A) Complex between subtilisin Carlsberg, represented in grey surface, and greglin, in cartoon representation. The Ser221 residue in the active site of subtilisin is shown in red. In greglin, the P3–P3′ canonical binding loop (residues 25–30), encompassing the P1 residue (Leu27) represented by sticks, is shown in yellow. The C-terminal extension (residues 70–78) is shown in cyan. The S atoms of the cysteines are shown in pink. (B) Greglin alone after a rotation of approximately 70° along the horizontal axis compared with (A). The eight cysteines are numbered. The four disulfide bonds (1–5, 2–4, 3–6 and S1–S2) are indicated, as well as the P1 residue. The letters N and C indicate the N- and C-termini, respectively. The secondary structural elements are labelled. The figures were prepared using PyMOL.

The structure of subtilisin superposes well on other subtilisin structures [15, 16]. The structure consists of a central seven-stranded parallel β-sheet decorated by two α-helices on one side and four α-helices on the other side. The catalytic triad comprises residues Asp32, His64 and Ser221, which are located in the so-called substrate-binding cleft.

The structure of greglin region 21–78 consists of an anti-parallel β-sheet composed of three strands (β1 = 32–36, β2 = 39–44, β3 = 61–63). The β1 strand is inserted between the β2 and β3 strands. The β2 and β3 strands are connected by an α-helix (residues 47–57) (Fig. 1B). Residues Pro65 to Cys69 adopt a 3₁₀ turn. The architecture of the fold is stabilized by four disulfide bonds at Cys21–Cys55, Cys25–Cys48, Cys33–Cys69 and Cys53–Cys76. No electron density was observed for residues 1–20 and 79–83. To determine whether these two fragments are disordered or absent, crystals were analysed by mass spectrometry. Two major peaks at 6725.1 and 6954.2 Da were observed, corresponding to the truncated forms of greglin residues 19–78 and 19–80, respectively. The N-terminal region is therefore not present in the crystals.

Revision of the greglin protein sequence

The resolution of the crystal structure of greglin, in combination with MS analysis, showed discrepancies with the previously published sequence [11] (Fig. 2A). The most notable differences occur at positions 21 and 25, where two cysteines were built in the electron density map instead of Leu21 and Ser25, as previously determined by Edman degradation. These findings were confirmed by mass spectrometry. The presence of eight cysteines was deduced from the 8 Da difference between the non-reduced form of greglin (observed monoisotopic mass 9813.06 Da) and the reduced form (observed monoisotopic mass 9821.08 Da) (Fig. 2), with up to three phosphorylations accounting for mass shifts upon dephosphorylation. De novo sequencing of dephosphorylated tryptic peptides by nanoLC coupled with high-resolution mass spectrometry (HRMS) fragmentation (nanoLC-MS/MS) confirmed the presence of cysteines at positions 21 and 25 (Fig. S1) and identified a glutamine at position 39 instead of a serine (Fig. S2). The quality of the electron density map did not allow confirmation of the identity of Gln39. The determined mass of peptide 1–27 using nanoLC-MS/MS is 472.1464 Da greater than its theoretical mass, indicating either a potential post-translational modification or an N-terminal extension of the sequence. The mass of intact greglin after reduction and dephosphorylation (Fig. 2) was consistent with the corrected sequence and the 472 Da mass difference for peptide 1–27. A singly charged m/z 473.1535 ion, readily visible in the MS/MS spectra, may correspond to a protonated form of the species that gives rise to the experimentally determined mass difference. If so, this species is completely labile during the collision-induced dissociation (CID) fragmentation process (Fig. S1), which precludes its localization to a residue side chain. 
To assess prompt fragmentation at a proline residue, leading to internal peptide fragments, pseudo-MS3 experiments on this species were performed using a combination of ion source decay with CID in HRMS (Fig. S3). Product ions of m/z 473.1531 suggest the presence of hexose (neutral loss of 162.0529 Da), while neutral losses of 18.0109 Da indicate the presence of alcohol and/or acid groups. The observed fragmentation pattern thus appears to exclude the presence of a few extra N-terminal amino acids in the sequence in favour of a previously undescribed post-translational modification. Based on phosphorylations observed on intact greglin (9733.08, 9813.06 and 9893.04 Da), tryptic phosphopeptides were enriched using immobilized metal-affinity chromatography. MALDI-TOF MS analysis confirmed the presence of three phosphorylations on peptide 1–27 with the 472 Da modification (Fig. S4). A phosphorylation site on Ser8 was identified unambiguously using high-resolution nanoLC-MS/MS (Fig. S5). Overall, the present mass spectrometry analysis is in agreement with the masses obtained in the previous study, which were 9600–9844 Da instead of the theoretical molecular mass of 9229 Da [11]. All post-translational modifications observed thus far are located on the N-terminal region, and therefore could not be confirmed by the crystal structure.
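
The mass arithmetic behind these assignments can be checked with a short sketch. It uses only the masses quoted above; the monoisotopic constants are standard values rather than figures from this study, and the function names are hypothetical helpers introduced for illustration.

```python
# Monoisotopic constants in Da (standard values, not taken from the paper).
H_ATOM = 1.007825   # hydrogen atom
PROTON = 1.007276   # proton
HPO3 = 79.966331    # net mass added by one phosphorylation


def disulfide_count(reduced_mass: float, non_reduced_mass: float) -> int:
    """Reducing one disulfide bond adds two hydrogen atoms to the protein."""
    return round((reduced_mass - non_reduced_mass) / (2 * H_ATOM))


def phospho_steps(masses: list[float]) -> list[int]:
    """Phosphate groups separating consecutive intact-mass peaks."""
    ordered = sorted(masses)
    return [round((b - a) / HPO3) for a, b in zip(ordered, ordered[1:])]


def protonated_mz(neutral_mass: float) -> float:
    """m/z of the singly protonated ion of a neutral species."""
    return neutral_mass + PROTON


# Observed monoisotopic masses quoted in the text.
bridges = disulfide_count(9821.08, 9813.06)
steps = phospho_steps([9733.08, 9813.06, 9893.04])
mz = protonated_mz(472.1464)
print(bridges, steps, round(mz, 4))
```

Under these assumptions the 8 Da shift corresponds to four disulfide bridges, each spacing between the intact-mass peaks corresponds to one phosphate group, and the calculated m/z for a protonated 472.1464 Da species lies within about 0.001 of the observed 473.1535 ion.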

Figure 2. Sequence analysis of greglin. (A) Annotated corrected sequence. The corrections of the published sequence are shown in green for the amino acids determined by mass spectrometry, and in red (Cys21 and Cys25) for the amino acids determined by both crystallography and mass spectrometry. The disulfide bonds determined by crystallography are shown in two colours: blue for the three classical bonds of Kazal inhibitors and red for the additional disulfide bond. The secondary structural elements are represented as arrows for the strands and a cylinder for the helix. The phosphorylation determined by mass spectrometry is indicated by an encircled P. (B) High-resolution QTOF MS characterization of intact greglin. Deconvoluted mass spectra are shown for intact greglin (top), reduced greglin (middle) and reduced and dephosphorylated greglin (bottom). The mass shifts upon reduction indicate that greglin contains four disulfide bridges. The mass shifts upon dephosphorylation are consistent with up to three phosphorylations.

A BLAST search (http://blast.ncbi.nlm.nih.gov) using the corrected sequence of greglin (residues 21–78) was performed. Although the program did not detect putative conserved domains, it retrieved sequences producing significant alignments. These sequences are from Kazal-type serine protease inhibitors and proteins with annotated Kazal domains (follistatin and agrin) without inhibitory activity. Greglin possesses a unique Cys-(X)4-Cys-X-Cys motif that is not observed in any of the retrieved protein sequences. Another BLAST search was performed against the Protein Data Bank (PDB). The first protein that was retrieved was the Anemonia elastase inhibitor (AEI) with PDB ID 1Y1B, which belongs to the non-classical Kazal family [9].

Reactive site loop of greglin

The reactive site loop of greglin binds subtilisin from P5 (Ser23) to P2′ (Tyr29) in a substrate-like manner by establishing 10 hydrogen bonds (Fig. S6). Typically, Kazal inhibitors form approximately 50% of the hydrogen-bonding interactions with the active-site residues through the P1 residue. For greglin, the P1 residue, Leu27, forms four hydrogen bonds with the active-site residues of subtilisin. The carbonyl O atom forms two hydrogen bonds, one with the amide N atom of Ser221 and one with the ND2 atom of Asn155. The amide N atom of P1 forms two hydrogen bonds, one with the OG atom of Ser221 and one with the carbonyl O atom of Ser125 (Fig. S6). The presence of a Leu residue at P1 provides an explanation for the inability of greglin to inhibit PR3 [11], an elastase-related neutrophil protease, whose S1 pocket is narrower than that of elastase and cathepsin G and is thus unable to accommodate the large leucyl side chain [17]. The position P2 is occupied by Pro26, in which the CB atom forms hydrophobic interactions with the CD, CE and CG atoms in the imidazole group of His64 in subtilisin. Identical interactions occur in a complex between domain 1 of CrSPI (a two-headed Kazal inhibitor) and subtilisin [16]. In OMTKY3, the P2 residue is Thr17, in which the CG2 atom also forms hydrophobic contacts [15]. For the majority of serine protease inhibitors, the P3 position is occupied by a cysteine, which is involved in a disulfide bond. This is the case for greglin, in which Cys25 forms a disulfide bond with Cys48. Tyr29 in the P2′ position forms a stacking interaction with Phe189 of subtilisin. An identical interaction also occurs in the CrSPI–subtilisin complex [16]. The RSL of greglin adopts a canonical conformation from P2 to inline image that is entirely superposable on that of inhibitors from various families (Fig. S7). The rigidity of the RSL is maintained by disulfide bonds involving the cysteines at the P3 and P6′ positions and by internal hydrogen bonds. The latter comprise four hydrogen bonds between the residues at positions P2 and inline image (Asn46 and Glu49). In particular, the carbonyl O atom of Pro26 (P2) forms a hydrogen bond with the amide N atom of Ile28 (P1′); this interaction is absent in OMTKY3 but present in CrSPI.
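
The Schechter and Berger labels used in this section follow directly from the residue numbering once the P1 residue is fixed. The sketch below illustrates that mapping for greglin, assuming P1 = Leu27 as stated above; `sb_label` is a hypothetical helper name, and an ASCII apostrophe stands in for the prime symbol.

```python
def sb_label(residue: int, p1: int) -> str:
    """Schechter-Berger label of a residue, given the residue number of P1.

    Residues N-terminal to the scissile bond are P1, P2, P3, ...;
    residues C-terminal to it are P1', P2', P3', ...
    """
    if residue <= p1:
        return f"P{p1 - residue + 1}"
    return f"P{residue - p1}'"


# Greglin residues discussed in the text, with Leu27 as P1.
GREGLIN_P1 = 27
for name, number in [("Ser23", 23), ("Cys25", 25), ("Pro26", 26),
                     ("Leu27", 27), ("Ile28", 28), ("Tyr29", 29)]:
    print(name, sb_label(number, GREGLIN_P1))
```

With this numbering, Ser23 is P5, Cys25 is P3, Pro26 is P2, Ile28 is P1′ and Tyr29 is P2′, in line with the assignments given for the greglin–subtilisin interface.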

Structural similarity searches with greglin

A structural similarity search using the DALI program [18] indicated that the structure of greglin is similar to that of Kazal domains. This search retrieved follistatin (PDB ID 2ARP) [19], a non-inhibitory protein containing Kazal domains, and the following Kazal-type inhibitors: rhodniin (PDB ID 1TBQ) [20], PSTI (PDB ID 1CGJ) [21], dipetalin* (PDB ID 1KMA) [22], OMTKY3 (PDB ID 1YU6) [23], infestin (PDB ID 2F3C) [24], AEI* (PDB ID 1Y1B) [9], SPINK4* (PDB ID 1PCE) [25] and CrSPI (PDB ID 3QTL) [16]. The three structures marked with an asterisk were determined by NMR. Two other Kazal-related structures, not retrieved by the DALI search, were added to the comparison: ascidian trypsin inhibitor (PDB ID 1IW4) [10] and oryctin (PDB ID 2KSW) [26]. These two structures were also determined by NMR.

CrSPI domain 1 represents the minimal Kazal fold, with only two disulfide bonds (2–4 and 3–6) and a small helix (seven amino acids). Thus, a rigid body superposition of all the structures onto that of greglin was performed using Cys#2, Cys#3, Cys#4 and Cys#6 as fitting points and with a cut-off distance of 2.5 Å for the refinement procedure in the program Turbo-Frodo. From this superposition, it appears that the third disulfide bond (1–5), which connects the N-terminus to the α-helix, is not structurally conserved. The observed variability originates from the spacing between Cys#4 and Cys#5, which are both located within the α-helix (Fig. 3). The number of residues between these two cysteines may be two (for ovomucoid), three (for rhodniin), six (for AEI) or seven (for oryctin). With a six-residue spacing between Cys48 (Cys#4) and Cys55 (Cys#5), greglin resembles AEI. Indeed, the three disulfide bonds 1–5, 2–4 and 3–6 are superposable in the two inhibitors, as illustrated by close-up views of the α-helices of greglin and AEI (Fig. 3). The additional disulfide bond of greglin (between Cys53 and Cys76, denoted S1–S2) connects the C-terminal extension (residues 70–78) to the α-helix. To our knowledge, ascidian trypsin inhibitor is the only other single-domain Kazal inhibitor of known structure with four disulfide bonds [10]. Three cysteines, with the motif Cys-(X)2-Cys-Cys, are located in the α-helix. The Cys#4–Cys#5 spacing is identical to that of classical Kazal OMTKY3, and the supplementary disulfide bond, involving cysteines that are located after Cys#1 and Cys#5, reinforces the 1–5 disulfide bond (Fig. 3).
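
The spacing criterion discussed here lends itself to a small sketch. It assumes the thresholds stated in the Introduction (n = 2 for classical domains, n = 3–7 for non-classical domains, where n is the number of residues between Cys#4 and Cys#5); the function names are hypothetical, and only the greglin positions come from this study.

```python
def cys_spacing(cys4: int, cys5: int) -> int:
    """Number of residues strictly between Cys#4 and Cys#5."""
    return cys5 - cys4 - 1


def kazal_class(cys4: int, cys5: int) -> str:
    """Classify a Kazal domain by the Cys#4-(X)n-Cys#5 spacing n."""
    n = cys_spacing(cys4, cys5)
    if n == 2:
        return "classical"
    if 3 <= n <= 7:
        return "non-classical"
    return "unassigned"


# Greglin: Cys48 (Cys#4) and Cys55 (Cys#5), as given in the text.
print(cys_spacing(48, 55), kazal_class(48, 55))

# Hypothetical positions illustrating a two-residue spacing.
print(cys_spacing(10, 13), kazal_class(10, 13))
```

For greglin the spacing is six residues, which places it with AEI among the non-classical domains; the second call uses made-up positions purely to show the classical case.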

Figure 3. Comparison of greglin with classical and non-classical Kazal inhibitors. The four inhibitors are shown in two orientations (top and bottom). In the bottom view, greglin is rotated approximately 90° along the vertical axis compared with Fig. 1B. The top view results from a rotation of 90° along the horizontal axis compared with the view at the bottom. The three other inhibitors are shown in an identical orientation. From left to right: AEI (Anemonia elastase inhibitor); greglin; OMTKY3 (third domain of turkey ovomucoid inhibitor); ATI (ascidian trypsin inhibitor). The S atoms of the cysteines are shown in pink. Numbering of the disulfide bonds 1–5 and 2–4 is shown in boxes, and dashed lines indicate their localization on the molecules. The supplementary disulfide bonds in greglin and ATI are circled in red. Visible N- and C-termini in these orientations are labelled N and C, respectively.

Comparison of the stability of greglin and an ovomucoid inhibitor

The stability of greglin was compared with that of a commercially available ovomucoid inhibitor (from chicken). The two inhibitors were subjected to various conditions (solvent, temperature and pH) as previously described [11]. The structural integrity of greglin and the ovomucoid inhibitor was assessed by measuring their inhibitory activity towards subtilisin and trypsin, respectively. The activity of both inhibitors remained unaltered after treatment with 70% acetonitrile or after exposure to acidic pH. Under harsher conditions, however, the two inhibitors differed: greglin retained complete activity after prolonged exposure to pH 12 and after 60 min at 95 °C (Fig. 4), whereas the chicken ovomucoid inhibitor retained approximately 20% activity.

Figure 4. Comparison of the stability of greglin and chicken ovomucoid. Residual inhibitory activities of greglin and ovomucoid on constant amounts of subtilisin and trypsin, respectively, were measured after a 1 h exposure of both inhibitors to (A) temperatures varying from 25 to 95 °C in 50 mM Hepes buffer, pH 7.4, and (B) various pH values from 2.5 to 12. The final concentration of inhibitors was adjusted such that the control (untreated) inhibitor inhibited 90% of protease activity.

Discussion

Greglin is a non-classical Kazal inhibitor with an extra N-terminal domain bearing an unusual combination of post-translational modifications

Greglin is an inhibitor composed of two distinct domains consisting of residues 1–20 and 21–83. At present, the function of the first domain (1–20), which does not possess inhibitory properties, remains unknown. The absence of this domain in the determined crystal structure correlates with its sensitivity to proteolysis, as previously reported [11]. However, this domain displays a combination of post-translational modifications detected by mass spectrometry, i.e. three phosphorylations and a probable glycosylation. This situation is not commonly found in Kazal inhibitors, and may be linked to a specific biological activity, such as control of the formation of multi-protein complexes, localization or stability of the protein.

The region comprising residues 21–83 of greglin is a non-classical Kazal domain containing four disulfide bonds, three of which superpose on those of AEI, which belongs to the non-classical Kazal sub-group 1 [9]. Compared with other Kazal members, greglin possesses a unique additional C-terminal region (residues 70–83) that folds back and is connected to the α-helix via a supplementary disulfide bond (Cys53–Cys76).

The overall fold of the Kazal domain of greglin is highly resistant to denaturation

Greglin (region 21–78) displays a higher thermostability than ovomucoid inhibitors. Our study shows that greglin retains complete inhibitory activity after 1 h at 95 °C, whereas the chicken ovomucoid inhibitor displays only residual activity (Fig. 4). Similarly, OMTKY3 was shown by circular dichroism to be unfolded at 90 °C, with a Tm of 58 °C [27]. Several structural features may explain the thermostability of greglin. Factors contributing to protein stability include additional intramolecular interactions. Vogt et al. demonstrated that the thermostability of proteins is correlated with the number of hydrogen bonds [28]. Moreover, disulfide bridges, another type of intramolecular interaction, are believed to stabilize proteins primarily through an entropic effect, by decreasing the entropy of the unfolded state of the protein [29]. In addition, based on a comparative analysis of 20 complete genomes of thermophilic and mesophilic bacteria, Thompson and Eisenberg proposed the existence of a natural strategy for enhancing protein thermostability through truncations of exposed loop regions to lower the entropy of unfolding [30]. Interestingly, all of these features are found in the greglin structure: (a) the length of the loop between Cys#1 and Cys#2 is shortened, (b) the size of secondary structural elements is enlarged, with greglin possessing the longest β-sheet, stabilized by 11 hydrogen bonds, and one of the longest α-helices among the Kazal family, and (c) the number of disulfide bonds is increased, which includes a fourth bridge that connects the C-terminal extension to the α-helix.

A possible role for the C-terminal extension (70–78) of greglin in inhibitory selectivity

Another feature of greglin is its preference for subtilisin (S8 family) over proteases of the S1 family, whereas OMTKY3 displays a broad inhibition range, with similar inhibitory constants (10⁻¹¹–10⁻¹² M) against proteases from the S1 and S8 families [31]. The largest difference between greglin and OMTKY3 occurs for α-chymotrypsin, with Ki values of 2.6 × 10⁻⁸ M and 5.5 × 10⁻¹² M, respectively.
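
To put these constants on a common scale, the sketch below converts Ki ratios into fold differences and approximate differences in binding free energy. It is an illustration only: it assumes that the Ki values can be treated as equilibrium dissociation constants and that the temperature is 25 °C (298.15 K), neither of which is stated in the text, and the function names are hypothetical.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def fold_difference(ki_weak: float, ki_tight: float) -> float:
    """Ratio of two inhibition constants expressed in the same units."""
    return ki_weak / ki_tight


def ddg_kj_per_mol(ki_weak: float, ki_tight: float,
                   temperature_k: float = 298.15) -> float:
    """Free-energy difference RT ln(Ki_weak / Ki_tight), in kJ/mol.

    Assumes both Ki values behave as dissociation constants and that the
    temperature (default 25 degrees C) is appropriate; both are assumptions.
    """
    return R_GAS * temperature_k * math.log(ki_weak / ki_tight) / 1000.0


# alpha-Chymotrypsin: greglin (2.6e-8 M) versus OMTKY3 (5.5e-12 M).
chymo_fold = fold_difference(2.6e-8, 5.5e-12)
chymo_ddg = ddg_kj_per_mol(2.6e-8, 5.5e-12)

# Greglin alone: alpha-chymotrypsin (26 nM) versus subtilisin (0.68 nM).
greglin_fold = fold_difference(26.0, 0.68)

print(round(chymo_fold), round(chymo_ddg, 1), round(greglin_fold, 1))
```

Under these assumptions, greglin binds α-chymotrypsin several thousand-fold less tightly than OMTKY3 does, which corresponds to roughly 21 kJ/mol, whereas the preference of greglin for subtilisin over α-chymotrypsin is about 38-fold.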

Although the P1 position is the predominant determinant of specificity, its influence may be overestimated, as illustrated by OMTKY3, which inhibits a panel of proteases to a similar degree. In greglin, the P1 residue is Leu27, identical to the P1 of OMTKY3, and thus does not explain the differences in the inhibitory constants. OMTKY3 is one of the best-studied protein inhibitors, particularly at the structural level. We took advantage of the wealth of crystal structures to explore the binding to α-chymotrypsin. A fine superposition of the four complexes subtilisin–greglin, subtilisin–OMTKY3 (PDB ID 1R0R), α-chymotrypsin–OMTKY3 (PDB ID 1CHO) and human leukocyte elastase (HLE)–OMTKY3 (PDB ID 1PPF) was performed for the RSLs in order to obtain the smallest root mean square deviation from residues P3 to P3′ (Fig. S8). The RSLs of OMTKY3 from the three complexes are perfectly superposed, with maximal deviations of 0.33 Å between the Cα atoms. The extremities of three side chains are reoriented: Leu18 (P1), with a small shift of 1.25 Å at the CG atom, Glu19 (P1′), with a shift of 3.25 Å at the CD atom, and Arg21 (P3′), with a shift of 3.25 Å at the CZ atom. Greglin superposes particularly well on OMTKY3 in complex with subtilisin. Because the side chains at P1′ and P3′ are shorter in greglin than in OMTKY3, their possible reorientation is expected to occur without any steric clash. Therefore, the structure of greglin from P3 to P3′ is entirely compatible with interaction with chymotrypsin. The reason for its preference for subtilisin must reside in other regions of greglin. For example, in the pacifastin family, we demonstrated a role for the P10–P6 region of the inhibitors in modulation of the selectivity of inhibition [32, 33].

Although their active sites comprise superposable catalytic triads, the proteases from the S1 and S8 families display completely different folds and different loops surrounding the entrance of the active site. In the S1 family, the active-site cleft is shaped by several insertion loops, such as the 30-loop, the 60-loop, the 140-loop (the so-called autolysis loop) and the 220-loop, which have been described as involved in the specificity [34, 35]. The binding of greglin to chymotrypsin was modelled using the crystal structures of three complexes: subtilisin–OMTKY3 [15], chymotrypsin–OMTKY3 [6] and chymotrypsin–PMP-D2v (PDB ID 1GL0) [32]. As shown in Fig. 5 (right), in the modelled complex, the C-terminal extension of greglin points toward the autolysis loop of chymotrypsin. The distance between the Cα atoms of residue 73 in greglin and residue 149 in chymotrypsin is <2.0 Å. For comparison, in the subtilisin complex (Fig. 5, left), the smallest Cα–Cα distance in this region is 6.74 Å, and involves residue 73 of greglin and residue 157 of subtilisin. Therefore, in the modelled greglin–chymotrypsin complex, a steric clash is predicted to occur in the absence of reorganization of the loops. The length and amino acid composition of the autolysis loop are highly variable; for example, it is one residue longer in chymotrypsin than in elastases. The autolysis loop may adopt various conformations [36] to modulate the specificity of the enzyme, as demonstrated in the case of coagulation proteases [37]. Thus, owing to the flexibility of the autolysis loop, binding of greglin to proteases of the S1 family remains possible but is energetically less favourable, as illustrated by the moderate inhibition of not only chymotrypsin but also elastase and cathepsin G.

Figure 5. Comparison of the interaction of greglin with proteases from the S8 (left) and S1 (right) families. Greglin is represented as in Fig. 1. Left: crystal structure of the complex between greglin and subtilisin (of the S8 family), represented as a grey surface. Right: an identical orientation showing the modelled complex between greglin and chymotrypsin (chymotrypsin taken from the complex with PDB ID 1GL0). Chymotrypsin is a member of the S1 family, and is shown in surface representation, coloured in light brown, with its autolysis loop shown in orange. The figures were prepared using PyMOL.

In conclusion, this structural study of greglin using both X-ray crystallography and mass spectrometry resulted in a correction of the protein sequence, which allows definition of a new sub-family of Kazal inhibitors, and provides insight into the structure–function relationship in terms of thermostability and selectivity. However, further studies are necessary to understand the role of the N-terminal domain.

Experimental procedures

Isolation of greglin from Schistocerca gregaria ovaries

The ovaries from 160 adult females of S. gregaria (Arbiotech, Saint-Brieuc, France) were dissected and immediately immersed in 200 mL of cold (4 °C) aqueous 150 mM NaCl solution. The purification procedure was adapted from the procedure described previously [11]. Briefly, after mechanical tissue disruption and 50–80% ammonium sulfate precipitation, greglin was purified to homogeneity by reversed-phase HPLC on a semi-preparative column (C8 Vydac, Grace Vydac, CA, USA, 1 × 25 cm) eluted at 2.5 mL·min−1 using a 30 min linear acetonitrile gradient (0–60%) in 0.075% trifluoroacetic acid. The collected fractions were analysed for their ability to inhibit neutrophil elastase and/or commercial subtilisin Carlsberg (Sigma-Aldrich, St Louis, MO, USA) using the Förster resonance energy transfer (FRET) substrates Abz-APQQIMDDQ-EDDnp and Abz-TPFSGQ-EDDnp, respectively, where Abz is o-aminobenzoic acid, and EDDnp is N-(2,4-dinitrophenyl)-ethylenediamine. Inhibitory fractions were pooled and then re-chromatographed on an identical reversed-phase HPLC semi-preparative column, and eluted under identical conditions as before. The purity of inhibitory fractions was assessed by SDS/PAGE, reversed-phase HPLC on a C8 cartridge (Waters, Guyancourt, France) and MALDI-TOF mass spectrometry. The concentration of purified greglin was determined by titration with purified titrated human neutrophil elastase [38].

Purification of the greglin–subtilisin complex

Greglin (100 μM) was dialysed against 50 mM Hepes, pH 7.4/150 mM NaCl and then incubated with a fourfold molar excess of subtilisin for 40 min at 37 °C. To remove excess subtilisin, the complex was purified on a Superdex 200 10/300 GL column (GE Healthcare, Vélizy, France) equilibrated in the same buffer and eluted at 0.5 mL·min−1 in 1 mL fractions. The purified complex was used for crystallization.

Enzyme assays

Subtilisin, human neutrophil elastase and trypsin activities were measured in 50 mM Hepes, pH 7.4/150 mM NaCl supplemented with 0.05% v/v Igepal CA-630 (Sigma Aldrich, Saint Quentin Fallavier, France) using the substrates Abz-TPFSGQ-EDDnp, Abz-APQQIMDDQ-EDDnp and Abz-TPRSALQ-EDDnp, respectively. The hydrolysis of Abz-peptidyl-EDDnp substrates was followed by measuring the fluorescence at λex = 320 nm and λem = 420 nm in 96-well microplates using a SpectraMax Gemini reader (Molecular Devices, Saint Grégoire, France). Greglin (0.5 μM) and trypsin inhibitor from chicken egg white (ovomucoid) (Sigma-Aldrich) (0.5 μM) were incubated for 20 h at pH 2.5 (0.5 M glycine/HCl), pH 7.4 (50 mM Hepes buffer), pH 12 (Tris base/NaOH) and in 70% acetonitrile, and for 60 min at temperatures of 25, 70, 80 and 95 °C. The inhibitory activities of the two inhibitors were measured using subtilisin (2 nM final) and trypsin (2 nM final) as target proteases for greglin and ovomucoid, respectively. The final inhibitor concentration of the untreated inhibitor was adjusted to reach 90% inhibition after 30 min incubation at 37 °C, and the same concentration was used to measure residual inhibitory activities of treated inhibitors under the same experimental conditions.
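
As an illustration of how a residual inhibitory activity could be derived from such measurements, the following minimal sketch assumes that inhibition is computed from initial hydrolysis rates and that the residual activity is expressed relative to the untreated inhibitor. The function names, the formulas and the example rates are assumptions made for illustration; the exact calculation is not specified in the text above.

```python
def percent_inhibition(rate_with_inhibitor, rate_without_inhibitor):
    """Percentage of protease activity suppressed by the inhibitor."""
    if rate_without_inhibitor <= 0:
        raise ValueError("uninhibited rate must be positive")
    return 100.0 * (1.0 - rate_with_inhibitor / rate_without_inhibitor)


def residual_inhibitory_activity(treated_inhibition, untreated_inhibition):
    """Inhibition retained after treatment, as a percentage of the untreated control."""
    if untreated_inhibition <= 0:
        raise ValueError("untreated inhibition must be positive")
    return 100.0 * treated_inhibition / untreated_inhibition


# Hypothetical initial rates (arbitrary fluorescence units per minute).
uninhibited = 100.0
untreated = percent_inhibition(10.0, uninhibited)  # inhibitor without treatment
treated = percent_inhibition(28.0, uninhibited)    # inhibitor after a treatment

print(round(untreated, 1), round(residual_inhibitory_activity(treated, untreated), 1))
```

Normalizing to the untreated control, which is set near 90% inhibition, keeps the readout sensitive to a loss of inhibitor, because the assay is not saturated with inhibitor.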

Crystallization of the greglin–subtilisin complex

The complex between greglin and subtilisin was concentrated to a level producing an absorbance of 5 at 280 nm. Initial crystallization trials were performed using the Wizard and MDL screens (Qiagen, Courtaboeuf, France) on a Cartesian robot. For each condition, three drops (100 nL of screen buffer + 100, 200 or 300 nL protein) were formed. Initial crystals (needles) were obtained under the C12 condition of the Wizard screen (1 M trisodium citrate, 0.1 M imidazole, pH 8) in drop 2 only. Optimization was then performed in two steps: (a) varying the pH and trisodium citrate concentration of the C12 condition, and (b) performing a ‘re-screen’ that consisted of mixing 100 μL of sample obtained using the best conditions from step (a) (0.95 M trisodium citrate, pH 8.95) with 50 μL of the anion or cation screens (Qiagen). Drops consisted of 100 nL reservoir buffer and 200 or 300 nL protein. The final crystallization conditions were 0.68 M sodium citrate, 0.36 M sodium formate and 0.066 M Tris, pH 8.5. The crystals were cryo-protected by soaking in a 1.4 M trisodium citrate solution. X-ray diffraction data to 1.75 Å resolution were collected on beam line ID23-EH2 at the European Synchrotron Radiation Facility (Grenoble, France) using the helical data collection option. The dataset was processed using the XDS [39] and SCALA [40] programs. The crystals contain one complex per asymmetric unit.

Structure resolution and refinement

The structure of the greglin–subtilisin complex was determined with the molecular replacement method using the AMoRe program [41] and the subtilisin Carlsberg coordinates (PDB ID 1YU6). The rotation function yielded one solution, and the translation function yielded a unique solution, with a correlation coefficient of 62.4% and an Rfactor of 35.5% for data between 10 and 4 Å. After rigid body refinement, the correlation coefficient was 62.6% with an Rfactor of 33.8%. A weak electron density was observed for the inhibitor that was sufficient to begin building the inhibitor structure. After performing several cycles of refinement using the Refmac [13] and Buster [14] programs, and manual replacement and building for the inhibitor on the graphic display using the TURBO-FRODO program [12], the Rfactor decreased to 16.6% (Rfree 20.2%). All representations of the structure in the figures were prepared using the program PyMOL (http://www.pymol.org/). Coordinates for the structure of the greglin–subtilisin complex have been deposited in the Protein Data Bank under accession number 4GI3.

Determination of the number of disulfide bridges and phosphorylation stoichiometry by mass spectrometry

The cystines in greglin were reduced by incubating the protein at a concentration of 40 μM for 30 min at 50 °C, followed by incubation for 30 min at room temperature in a buffer consisting of 50 mM Tris/HCl pH 7.5, 1 mM Tris(2-carboxyethyl)phosphine hydrochloride, 5 mM dithiothreitol and 0.1 mM EDTA. Reduced greglin (24 μM) was incubated at 30 °C for 1 h with or without 40 units of lambda phosphatase (Sigma) per mL of reaction volume in the manufacturer's recommended buffer (50 mM Tris/HCl, pH 7.5, 5 mM dithiothreitol and 0.1 mM EDTA). The samples were acidified using formic acid at a final concentration of 1% v/v, prior to desalting on a C18 ZipTip (Millipore, Billerica, MA, USA). Desalted samples were analysed by direct infusion into a 4 GHz MaXis high-resolution quadrupole time-of-flight (QTOF) mass spectrometer (Bruker Daltonics, Bremen, Germany) equipped with an electrospray ionization (ESI) source. Acquisitions were performed in positive-ion MS mode at 1 Hz frequency. ESI-HRMS spectra were processed and charge-deconvoluted using DataAnalysis 3.1 software (Bruker Daltonics) and the MaxEnt algorithm.
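
The principle behind counting disulfide bridges and phosphate groups from deconvoluted masses can be sketched as follows. The sketch assumes monoisotopic masses, a gain of two hydrogen atoms (2.01565 Da) per reduced disulfide and a loss of HPO3 (79.96633 Da) per removed phosphate. The function names and the example masses are hypothetical and are not measured values from this study.

```python
H2_MASS = 2.01565     # two hydrogen atoms gained per reduced disulfide (Da, monoisotopic)
HPO3_MASS = 79.96633  # HPO3 lost per removed phosphate group (Da, monoisotopic)


def count_disulfides(native_mass, reduced_mass):
    """Number of disulfide bridges inferred from the mass gain on reduction."""
    return round((reduced_mass - native_mass) / H2_MASS)


def count_phosphates(phospho_mass, dephospho_mass):
    """Number of phosphate groups inferred from the mass loss on phosphatase treatment."""
    return round((phospho_mass - dephospho_mass) / HPO3_MASS)


# Hypothetical deconvoluted masses (Da), chosen only to exercise the functions.
native = 9000.000
reduced = native + 4 * H2_MASS
dephosphorylated = reduced - 3 * HPO3_MASS

print(count_disulfides(native, reduced), count_phosphates(reduced, dephosphorylated))
```

Rounding to the nearest integer tolerates small mass errors, but the disulfide count depends on a mass accuracy well below 1 Da, which is why a high-resolution instrument is suited to this measurement.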

De novo sequencing and phosphorylation site identification by MS

Greglin cysteine reduction/alkylation was performed in 50 mM ammonium bicarbonate buffer by treatment with 5 mM Tris(2-carboxyethyl)phosphine hydrochloride for 30 min at 50 °C, followed by carbamidomethylation with 25 mM iodoacetamide for 30 min at room temperature in the dark. Alkylated greglin was proteolysed for 2 h at 37 °C with bovine trypsin at a 1:5 (w/w) trypsin:protein ratio.

For de novo sequencing, trypsin was inactivated by incubation for 5 min at 95 °C followed by 5 min on ice before lambda phosphatase treatment. The primary sequence of greglin was investigated using high-resolution nanoLC-nanoESI QTOF on dephosphorylated tryptic peptides. De novo sequence information was derived by manual interpretation of spectra combined with software interpretation using PEAKS and BioTools (Bruker, Bremen, Germany).

NanoLC-nanoESI high-resolution QTOF MS/MS analysis

Experiments were performed using an UltiMate 3000 NanoRSLC System (Dionex, Sunnyvale, CA, USA) connected to a Bruker MaXis HR high-resolution QTOF mass spectrometer equipped with an online nano-ESI ion source. The LC-MS setup was controlled by Bruker HyStar software version 3.2. Phosphopeptides were pre-concentrated online on a Dionex Acclaim PepMap100 C18 reverse-phase pre-column (inner diameter 100 μm × length 2 cm, particle size 5 μm, pore size 100 Å), and separated on a nanoscale Acclaim PepMap100 C18 column (inner diameter 75 μm × length 25 cm, particle size 2 μm, pore size 100 Å) at a flow rate of 450 nL·min−1 using a 2–35% gradient of acetonitrile in 0.1% formic acid.

Mass spectra were acquired in positive-ion mode from m/z 50–3000 for MS and MS/MS. Data-dependent CID MS/MS was performed on the two most abundant doubly or triply charged MS ions. NanoLC-nanoESI high-resolution QTOF MS/MS spectra were processed using DataAnalysis 3.1 software (Bruker Daltonics).

For phosphorylation site identification, the tryptic phosphopeptides were enriched from the cleavage products by immobilized metal-affinity chromatography using iron-coated PHOS-Select metal chelate beads (Sigma) in accordance with the manufacturer's instructions. Bound peptides were eluted using 1% phosphoric acid, and analysed in parallel by MALDI-TOF MS on a Bruker UltraFlex I and by nanoLC-nanoESI ultra-high-resolution (UHR) QTOF MS on a MaXis mass spectrometer.

MALDI-TOF MS analysis

2,5-Dihydroxybenzoic acid (20 mg·mL−1), prepared in 50% acetonitrile and 1% phosphoric acid, was used as a matrix. A 0.5 μL aliquot of enriched phosphopeptides was mixed with 0.5 μL matrix solution, spotted on a stainless steel plate, and allowed to air dry. MS experiments were performed using an UltraFlex I mass spectrometer (Bruker Daltonics) equipped with a 337 nm nitrogen laser and a grid-less delayed extraction ion source. Spectra were acquired in reflectron positive-ion mode by accumulation of 1000 laser shots. The instrument was controlled using Bruker flexControl software. Calibration was performed externally using Pepmix I from Bruker Daltonics. MALDI-TOF MS spectra were processed using flexAnalysis 2.0 software (Bruker Daltonics).

Acknowledgements

The Region Centre and the Fonds Européen de Développement Régional supported this work (Projet INFINHI) and the funding of high-resolution mass spectrometry (#2699-33931, SyMBioMS project). We thank the European Synchrotron Radiation Facility at Grenoble, France, in particular the beamline ID23 staff for their assistance.

References

Supporting Information

Filename: febs12033-sup-0001-FigureS1-S8.zip (format: Zip archive; size: 1412K)

Fig. S1. Ultra-high-resolution QTOF CID fragmentation of dephosphorylated greglin (1–27).

Fig. S2. Ultra-high-resolution QTOF CID fragmentation of greglin (28–51).

Fig. S3. Pseudo-MS3 ultra-high-resolution QTOF fragmentation of dephosphorylated greglin (1–27).

Fig. S4. MALDI-TOF MS of immobilized metal-affinity chromatography-enriched tryptic greglin phosphopeptides.

Fig. S5. UHR-QTOF CID fragmentation of mono-phosphorylated [1-27]* greglin peptide.

Fig. S6. Hydrogen bonds between the RSL of greglin and subtilisin.

Fig. S7. Superposition of greglin with canonical inhibitors from various families.

Fig. S8. Superposition of the RSLs of greglin with those of OMTKY3 in complex with subtilisin, chymotrypsin and human neutrophil elastase.
