E. Gherardi, MRC Centre, Hills Road, Cambridge CB2 2QH, UK Fax: +44 1223215308 Tel: +44 1223215308 E-mail: firstname.lastname@example.org
Hepatocyte growth factor like/macrophage stimulating protein (HGFl/MSP) and hepatocyte growth factor/scatter factor (HGF/SF) define a distinct family of vertebrate-specific growth factors structurally related to the blood proteinase precursor plasminogen and with important roles in development and cancer. Although the two proteins share a similar domain structure and mechanism of activation, there are differences between HGFl/MSP and HGF/SF in terms of the contribution of individual domains to receptor binding. Here we present a crystal structure of the 30 kDa β-chain of human HGFl/MSP, a serine proteinase homology domain containing the high-affinity binding site for the RON receptor. The structure describes at 1.85 Å resolution the region of the domain corresponding to the receptor binding site recently defined in the HGF/SF β-chain, namely the central cleft harboring the three residues corresponding to the catalytic ones of active proteinases (numbers in brackets define the sequence position according to the standard chymotrypsinogen numbering system) [Gln522 (c57), Gln568 (c102) and Tyr661 (c195)] and an adjacent loop flanking the S1 specificity pocket and containing residues Asn682 (c217) and Arg683 (c218) previously shown to be essential for binding of HGFl/MSP to the RON receptor. The study confirms the concept that the serine proteinase homology domains of HGFl/MSP and HGF/SF bind their receptors in an ‘enzyme-substrate’ mode, reflecting the common evolutionary origin of the plasminogen-related growth factors and the proteinases of the clotting and fibrinolytic pathways. However, analysis of the intermolecular interactions in the crystal lattice of β-chain HGFl/MSP fails to show the same contacts seen in the HGF/SF structures and does not support a conserved mode of dimerization of the serine proteinase homology domains of HGFl/MSP and HGF/SF responsible for receptor activation.
HGFl/MSP, hepatocyte growth factor like/macrophage stimulating protein
RTK, receptor tyrosine kinase(s)
Hepatocyte growth factor–like/macrophage stimulating protein (HGFl/MSP) and hepatocyte growth factor/scatter factor (HGF/SF) control the growth, movement and morphogenesis of a variety of cell types in vertebrate organisms (reviewed in [1,2]). Unlike most growth factors, which are small proteins with a relatively simple domain structure, HGFl/MSP and HGF/SF are high molecular weight glycoproteins with a complex multidomain architecture related to that of the proenzyme plasminogen. Both HGFl/MSP and HGF/SF consist of six domains: (a) an N-terminal domain homologous to the plasminogen preactivation peptide; (b) four copies of the kringle domain; and (c) a C-terminal serine proteinase homology domain that lacks enzymatic activity due to mutations of critical residues at the catalytic site and S1 specificity pocket. Both proteins are synthesized as single-chain precursors and subsequently processed through cleavage of a long linker peptide connecting the fourth kringle domain and the serine proteinase homology domain. This yields two-chain (α/β) proteins held together by a single disulfide bond: the larger α-chain contains the N-terminal and the four kringle domains, and the smaller β-chain corresponds to the serine proteinase homology domain.
The receptors for HGFl/MSP and HGF/SF are the receptor tyrosine kinases (RTKs) RON and MET, respectively [4–6]. RON and MET have a complex multidomain architecture and, upon activation, they transduce cell signals crucial for embryo development [8–11], wound healing [12,13] and cancer growth and spreading (reviewed in [2,14]).
In contrast to their well studied biological activities, the mechanism of receptor binding and activation by plasminogen-related growth factors is less well understood and is currently under intense investigation. There is strong evidence that the presence of both the α- and β-chains is required for receptor activation by both HGF/SF and HGFl/MSP [15–17]. However, in HGF/SF the primary (high affinity) receptor binding site is located in the α-chain with the β-chain contributing a lower affinity site [15,16] whereas in the case of HGFl/MSP the high affinity site is located in the β-chain [17–19].
Insights into the mechanism of binding have recently been provided by a crystal structure at 3.3 Å resolution of the complex of the β-chain of HGF/SF and the MET receptor. This structure defines in detail the regions of the β-chain of HGF/SF and the β-propeller domain of MET responsible for the formation of a 1 : 1 complex, and the conclusions derived from the crystallographic analysis have also been corroborated by extensive mutagenesis studies.
Here we report a crystal structure of a fragment of HGFl/MSP consisting of the serine proteinase homology domain and the last 19 residues of the α-chain. Further, in order to understand better the structural basis of proteolytic activation in receptor binding, we also present a model of the serine proteinase homology domain of HGFl/MSP in its precursor, inactive form. Analysis of the experimental structure and the model confirms that HGFl/MSP, like HGF/SF, has retained an enzyme-like mode of receptor binding involving the area corresponding to the active site of bona fide proteinases. It also shows that the process of proteolytic activation leads to a major rearrangement of the ‘active site’ region that may influence receptor binding.
Results and Discussion
Description of the structure
The structure of the β-chain of HGFl/MSP contains 4 residues of the α-chain (residues Cys468 to Arg471), 225 residues of the β-chain (residues Val484 to Met708) (c16-c242) and 154 water molecules (Table 1 and Fig. 1). The 15 remaining residues of the α-chain and the hexahistidine sequence fused at the C-terminus of the protein are not visible in the structure, presumably due to disorder.
Table 1. Crystallographic statistics for the structure of the β-chain of HGFl/MSP.
a Rsym = Σ_h |I_h − ⟨I⟩| / Σ_h I_h, where I_h is the intensity of reflection h and ⟨I⟩ is the mean intensity of all symmetry-related reflections. b Rcryst = Σ ||Fobs| − |Fcalc|| / Σ |Fobs|, where Fobs and Fcalc are the observed and calculated structure factor amplitudes. c Rfree is calculated as for Rcryst using a random subset of the data (around 5%) excluded from the refinement. d Estimated coordinate error based on the R-value as calculated by REFMAC. e Calculated with PROCHECK.
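The two residual definitions in the footnotes above can be illustrated with a short calculation. The following is a minimal sketch only, assuming toy intensities and amplitudes invented for the example; the function names r_sym and r_factor are hypothetical and are not taken from REFMAC, CNS or any other program used in this work.

```python
def r_sym(groups):
    """Rsym = sum_h |I_h - <I>| / sum_h I_h.

    `groups` is a list of lists; each inner list holds the measured
    intensities of one set of symmetry-related reflections.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """Rcryst = sum ||Fobs| - |Fcalc|| / sum |Fobs| over paired amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Toy data (hypothetical values, for illustration only).
toy_groups = [[100.0, 110.0], [50.0, 50.0]]
toy_f_obs = [10.0, 20.0]
toy_f_calc = [9.0, 22.0]

print(round(r_sym(toy_groups), 4))
print(round(r_factor(toy_f_obs, toy_f_calc), 4))
```

Rfree uses the same expression as Rcryst, applied only to the reflections held out of refinement.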
The overall fold closely resembles that of serine proteinases of the chymotrypsinogen family, consisting of two antiparallel six-stranded β-barrels forming two lobes, at the junction of which lies the region corresponding to the active site cleft of the enzymes (Fig. 1A). The disulfide bond connecting Cys468 of the α-chain and Cys588 (c122) of the β-chain is clearly defined, as are the five remaining intradomain disulfide bonds of the β subunit, two in the N-terminal lobe: Cys507-Cys523 (c42-c58) and Cys527-Cys562 (c62-c96) and three in the C-terminal one: Cys602-Cys667 (c135-c201), Cys632-Cys646 (c168-c182) and Cys657-Cys685 (c191-c220) (Fig. 1A). A Cys672 (c206) to Ser mutation was introduced in order to prevent the unpaired Cys672 from forming an aberrant disulfide bond with Cys588 (c122), which would disrupt formation of the correct disulfide bond between the α- and β-chains. The structure of the region corresponding to the active site of bona fide proteinases is conserved, with the three residues replacing the catalytic His, Asp and Ser [Gln522 (c57), Gln568 (c102) and Tyr661 (c195)] aligned alongside the cleft (Fig. 1A). In contrast, significant differences are apparent in the structure of the region corresponding to the S1 specificity pocket. While the upper side of the pocket is substantially preserved, with the exception of the Ser to Tyr661 (c195) mutation, in the lower part Pro681 replaces a Trp residue found in most catalytically active serine proteinases and generates a turn in loop L13 (680–691) (c214-c225) which allows greater accessibility to Tyr661 (c195) (Fig. 1A). However, Tyr661 (c195) sterically reduces access to the pocket, suggesting that receptor binding of the β-chain of HGFl/MSP may only involve the entrance of the S1 specificity pocket (Fig. 1B, and below).
Comparison with the HGF/SF and plasmin β-chains and structural consequences of proteolytic activation
The β-chain of HGFl/MSP displays a high level of sequence identity (41% and 39%, respectively) and structural similarity with HGF/SF and plasminogen. Superposition of the structure of the β-chain of HGFl/MSP with those of HGF/SF and plasmin yields rmsd values of 2.13 and 2.48 Å over 224 and 209 amino acids, respectively. A structure-based sequence alignment for the three proteins is shown in Fig. 2A and outlines the strand and loop nomenclature used in this report. Figure 2B shows a ribbon representation of the superposition (HGFl/MSP in blue, HGF/SF in red, plasmin in grey) and demonstrates that the structures of the three proteins are closely conserved in the central region while deviating considerably in certain surface loops, for example L5, L8 and L11, but not in others, such as L13.
The availability of high-resolution crystal structures of both plasmin [22,23] and plasminogen [24,25], i.e. a homologous protein in its precursor and active forms, allowed modeling of the single-chain form of the serine proteinase homology domain of HGFl/MSP and analysis of the structural changes that may result from proteolytic activation. A superposition of the model of the single-chain form and the crystal structure of the two-chain form is shown in Fig. 3A. The comparison illustrates that, as observed with plasminogen, the newly formed N-terminus (Val484) (c16) folds into a hydrophobic pocket, forming an ionic interaction with Asp660 and causing a movement in loops L11 and L13, the latter being disulfide bonded to L11 (Fig. 3A). Loop L13 contains a cluster of three positively charged arginine residues (Arg683, Arg687, Arg689) (c230, c234 and c236), one of which (Arg683) (c230) is known to play a major role in RON binding, while a further triple arginine cluster is found in loop L10 (Arg637, Arg639, Arg641) (c184, c186 and c188). Together these two loops generate an extended positively charged patch on the surface of β-chain MSP that is repositioned as a result of proteolytic activation and is notably absent in the homologues plasmin and HGF/SF (data not shown). Proteolytic activation of HGFl/MSP also appears to affect the position of loops L4 and L5 but to a lesser extent than L8, L11 and L13 (Fig. 3A).
The binding site for the RON receptor
The recent crystal structure of a complex between the β-chain of HGF/SF and a fragment of the MET receptor (PDB accession: 1SHY) showed that the binding of the β-chain of HGF/SF to MET involves an extended area centered around the ‘active site’ of the homologous enzymes and involves the residues corresponding to the catalytic ones [Gln534 (c57), Asp578 (c102) and Tyr673 (c195)] on both sides of the central cleft as well as several residues in L13: Val692 (c215), Pro693 (c216), Gly694 (c217), Arg695 (c218) and Gly696 (c219), which form a continuous binding surface [20,21]. The position of the three residues of HGFl/MSP that correspond to Gln534, Asp578 and Tyr673 in HGF/SF, namely Gln522 (c57), Gln568 (c102) and Tyr661 (c195), is shown in Fig. 3B. Figure 3B also shows the position of two residues in loop L13, Asn682 (c217) and Arg683 (c218), that were shown previously to be important or essential for binding of HGFl/MSP to the RON receptor. Therefore, although the mutagenesis data are limited compared to HGF/SF and the individual contributions of Gln522 (c57), Gln568 (c102), Tyr661 (c195), Ile680 (c215), Pro681 (c216) and Val684 (c219) remain to be confirmed, the structural data presented here indicate that the receptor binding sites of the β-chains of HGFl/MSP and HGF/SF are highly conserved and that the binding specificity of the two growth factors for their cognate receptors depends on local sequence variation and not on the utilization of different areas of the domain surface.
Mapping the location of amino acids Gln522 (c57), Gln568 (c102), Tyr661 (c195), Asn682 (c217) and Arg683 (c218), of HGFl/MSP onto the surface of the model of the single chain form of the domain (Fig. 3C) illustrates dramatically the putative effect of proteolytic activation of the domain and may provide a basis for the different binding affinities reported for the single chain and two chain forms of the proteinase homology domains of HGFl/MSP and HGF/SF (see for example ).
Implication for biological activity
The high-resolution, crystal structure of the two-chain form of the serine proteinase homology domain of HGFl/MSP reported here and the model of the corresponding single-chain (precursor) form discussed above have highlighted a role for the opening of the S1 pocket and a rearrangement of loops L8, L11, L13 in domain activation (Fig. 3A). Given the conservation of HGF/SF and HGFl/MSP, these results imply that domain activation may involve similar changes in HGF/SF.
However, it is well known that the binding affinity of the β-chain of HGFl/MSP for the RON receptor (Kd ≈ 10⁻⁹ M) is approximately one hundred-fold higher than the affinity of the β-chain of HGF/SF for MET (Kd ≈ 10⁻⁷ M). Does this imply that the β-chain domain of HGFl/MSP uses a larger binding interface than the one defined for the HGF/SF domain? A conclusive answer to this question awaits cocrystal structures of ligand-receptor complexes and more extensive mutagenesis data, but the available evidence argues against a more extensive interface. The binding affinity of the β-chain of HGF/SF for MET is weak (Kd ≈ 10⁻⁷ M) and yet the binding surface is very extensive [20,21] and could readily allow the extra hydrogen bonding required to bring the affinity into the nanomolar range in the case of HGFl/MSP and RON.
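To put the hundred-fold affinity difference discussed above on an energy scale, the dissociation constants can be converted to standard binding free energies with ΔG° = RT ln Kd. The sketch below is illustrative only and assumes a temperature of 25 °C and a 1 M standard state, neither of which is specified in the text; the function name binding_free_energy is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd, relative to 1 M.

    The default temperature of 298.15 K is an assumption for illustration.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


dg_msp_ron = binding_free_energy(1e-9)   # HGFl/MSP beta-chain and RON
dg_hgf_met = binding_free_energy(1e-7)   # HGF/SF beta-chain and MET
ddg = dg_msp_ron - dg_hgf_met            # extra stabilization, kcal/mol

print(f"dG (Kd = 1e-9 M): {dg_msp_ron:.2f} kcal/mol")
print(f"dG (Kd = 1e-7 M): {dg_hgf_met:.2f} kcal/mol")
print(f"difference:       {ddg:.2f} kcal/mol")
```

Under these assumptions the two affinities differ by roughly 2.7 kcal/mol, a modest amount that is consistent with the argument that a few additional favourable contacts within the same interface could account for the tighter binding.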
There are indications from the cocrystal structure of the HGF/SF β-chain – MET complex that the β-chain of HGF/SF may mediate domain dimerization and, as a result, dimerization of two 1 : 1 HGF/SF-MET complexes to form an active signaling unit. This hypothesis is based on the presence of conserved intermolecular interactions involving the N-terminal sequence and residues from loops 8 (c140) and 11 (c180) in the crystal structures of the β-chain of HGF/SF alone or in complex with the β-propeller domain of MET. Given the structural conservation of the β-chains of HGF/SF and HGFl/MSP and their receptor-binding surfaces ([20,21] and Fig. 3B), we analyzed the intermolecular interactions due to crystal packing in the structure of the HGFl/MSP β-chain (Fig. 4). Each molecule buries 2192 Å² of surface area in contacts with six adjacent molecules through loops L4, L5, L11, L13 and helix 1, but the diverse contacts seen in the crystal structure of the β-chain of HGFl/MSP (Fig. 4) do not include the one seen in the HGF/SF structures [20,21]. These results therefore do not support a functional role for the contact seen in the HGF/SF structures, or for any other set of contacts for that matter, although final elucidation of this important point will clearly require cocrystal structures of full length HGFl/MSP or HGF/SF in complex with the RON or MET receptor and/or mutagenesis experiments.
All the structural and biochemical data currently available provide strong evidence for the concept that the serine proteinase homology domains of HGF/SF and HGFl/MSP have retained the proteolytic mechanism of activation and the enzyme-substrate mode of binding seen in the catalytically active serine proteinases. Although it is well known that the large α-chains of the complex proteinases of the clotting and fibrinolytic cascades play a role in substrate binding, it is equally clear that the serine proteinase domains themselves can bind substrate and cofactors with regions outside the S1 specificity pocket (reviewed in ). This versatility of the serine proteinase domain in mediating further protein–protein interactions may also be at work with the plasminogen-related growth factors, and the structure of the β-chain of HGFl/MSP reported here should facilitate further analysis not only of the RON binding site but also of other areas of the domain possibly involved in receptor oligomerization or interaction with coreceptor molecules.
A Cys672Ser mutant of HGFl/MSP used for these studies was obtained by site-directed mutagenesis using the Pfu Turbo polymerase (Stratagene, La Jolla, CA, USA) and the following mutagenic oligonucleotides: 5′-TGCTTTACCCACAACTCATGGGTCCTGGAAGGA-3′ and 5′-TCCTTCCAGGACCCATGAGTTGTGGGTAAAGCA-3′. Plasmid pBSKSII containing the mutated cDNA was used as a template for PCR amplification of the β-chain MSP fragment by using primers 5′-CGGGATCCCAGTTTGAGAAGTGTGGCAAGAGGG-3′ and 5′-AGCTCTCTAGAATCTACTAGTGGTGATGATGGTGATGACCCAGTCTCATGACCT-3′. The primers introduce a BamHI restriction site at the AUG and a His6 tag and an XbaI restriction site 3′ of the new stop codon. The PCR product was digested with BamHI and XbaI, N-terminally fused to the leader sequence of a human immunoglobulin variable chain (VL-HuLys11) and subcloned into plasmid pA71d at a unique SmaI site. NS0 cells were grown in Dulbecco's modified Eagle's medium supplemented with 10% (v/v) fetal bovine serum. For stable expression of β-chain MSP, 1.5 × 10⁷ cells were transfected with 10 µg of linearized β-chain MSP cDNA by electroporation and placed in complete medium with 0.8% (v/v) hygromycin B (Stratagene, La Jolla, CA, USA). After selection and screening for expression by slot blot, single wells were cloned, and lines with the highest expression levels were selected. Production of the large quantities of protein required for crystallization experiments was carried out in roller bottles in 20 L batches in Dulbecco's modified Eagle's medium supplemented with 1.25% (v/v) fetal bovine serum.
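The two mutagenic oligonucleotides listed above are expected to anneal to opposite strands of the same site, so one should be the exact reverse complement of the other. The following minimal sketch checks this for the sequences as printed; the helper name reverse_complement is hypothetical and the check makes no assumption about the reading frame or the wild-type sequence.

```python
# Watson-Crick complement for each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


# Mutagenic oligonucleotides as given in the text (5' to 3').
forward_oligo = "TGCTTTACCCACAACTCATGGGTCCTGGAAGGA"
reverse_oligo = "TCCTTCCAGGACCCATGAGTTGTGGGTAAAGCA"

are_complementary = reverse_complement(forward_oligo) == reverse_oligo
print(len(forward_oligo), len(reverse_oligo), are_complementary)
```

The same helper can be applied to any primer pair designed for this style of site-directed mutagenesis before ordering the oligonucleotides.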
Purification and crystallization
Cell cultures were harvested by centrifugation and the supernatant was dialyzed against phosphate-buffered saline prior to loading onto an IMAC Ni-NTA Superflow column (Qiagen, Hilden, Germany) for partial purification. Subsequent cation exchange chromatography on a MonoS column (Amersham Biosciences UK Ltd., Chalfont St Giles, Bucks, UK) with a NaCl gradient yielded over 95% pure HGFl/MSP β-chain. Deglycosylation of the purified protein was performed by overnight incubation at 30 °C with the glycoamidase PNGaseF in a 1 : 25 (w/w) enzyme to substrate ratio. After final gel filtration purification of the reaction mixture, the HGFl/MSP β-chain was concentrated to 10 mg·mL⁻¹ in a buffer containing 20 mM MES pH 6.0, 100 mM NaCl and used for crystallization using a sitting drop vapour diffusion method. The crystallization drops contained 1 µL of protein mixed with 1 µL of precipitant solution and were equilibrated against 0.75 mL of the latter in 24-well plates (Molecular Dimensions Ltd, Soham, Cambridge, UK) at 19 °C. The initial crystallization condition, corresponding to condition #27 from Wizard Screen I™ (Emerald Biosystems, Bainbridge Island, WA, USA), was then optimized by varying the concentrations of crystallizing agents. The best crystals had the appearance of thick rods with a square cross-section and grew in 1.4 M NaH2PO4/0.93 M K2HPO4, CAPS pH 10.5, 0.1 M Li2SO4, final pH 6.1, and reached a maximum size of 350 × 40 × 40 µm in 24 h.
X-ray data collection
Fully grown crystals of HGFl/MSP β-chain were soaked for 5–10 s in a cryoprotectant solution containing 20% glycerol in the precipitant solution listed above. After soaking, the crystals were mounted into rayon cryo-loops (Molecular Dimensions Ltd), flash-cooled in liquid nitrogen and stored in liquid nitrogen until used for X-ray diffraction experiments at the ESRF synchrotron in Grenoble, France (beam station ID 13). The diffraction data were recorded using a Quantum4 CCD detector (Area Detector Systems Corp., Poway, CA, USA) and were indexed, integrated, scaled and reduced using the HKL diffraction data processing suite. All subsequent calculations were carried out using the CCP4 crystallographic suite. The crystals of the HGFl/MSP β-chain diffracted to a maximum resolution of 1.85 Å and crystallographic data collection statistics are given in Table 1.
Structure solution and refinement
Calculation of the Matthews coefficient suggested the presence of only one molecule of the HGFl/MSP β-chain in the asymmetric unit, resulting in a solvent content of about 50%. The structure was solved by molecular replacement using the crystal structure of microplasmin (PDB accession code: 1BUI, 39% sequence identity) as a search probe. Molecular replacement calculations were performed with AMoRe. The rotation function produced a clear peak with signal-to-noise ratio of 0.7 σ (the resolution range 8–3 Å was used for all calculations). This in turn produced a clear peak in the translation function with a correlation coefficient and Rcryst between observed and calculated structure factor amplitudes of 25.4% and 54.9%, respectively. The rigid body refinement performed in AMoRe improved both the correlation coefficient and the R-factor to 37% and 52.7%, respectively.
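The Matthews coefficient estimate mentioned above can be sketched as follows. The unit cell volume and the number of asymmetric units used here are hypothetical placeholders chosen for illustration (the actual values belong in Table 1); the molecular mass of about 30 kDa is taken from the text, and the constant 1.23 is the conventional value used to convert the Matthews coefficient into an approximate solvent fraction.

```python
def matthews_coefficient(cell_volume_a3, n_asym_units, n_molecules, mass_da):
    """V_M in cubic angstroms per dalton: cell volume per unit of protein mass."""
    return cell_volume_a3 / (n_asym_units * n_molecules * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M (conventional 1.23 constant)."""
    return 1.0 - 1.23 / v_m


# Hypothetical crystal parameters, for illustration only.
CELL_VOLUME_A3 = 295200.0   # assumed unit cell volume
N_ASYM_UNITS = 4            # assumed number of asymmetric units per cell
MASS_DA = 30000.0           # approximate mass of the beta-chain (30 kDa)

for n_molecules in (1, 2):
    v_m = matthews_coefficient(CELL_VOLUME_A3, N_ASYM_UNITS, n_molecules, MASS_DA)
    print(n_molecules, round(v_m, 2), round(solvent_fraction(v_m), 2))
```

With these placeholder numbers a single molecule per asymmetric unit gives a solvent fraction of about 50%, whereas two molecules would leave essentially no room for solvent, which is the kind of comparison used to choose the number of molecules to search for in molecular replacement.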
The initial model obtained was then subjected to several rounds of crystallographic refinement using the CNS refinement package and manual rebuilding. Simulated annealing protocols as implemented in CNS were utilized in the first rounds of refinement and were replaced by a Powell minimization protocol in the last rounds. The temperature factor refinement included restrained individual B-factor refinement. Manual rebuilding was performed in the XtalView suite using sigmaA weighted 2Fo-Fc, Fo-Fc and annealed omit maps. The automated model-rebuilding program ARP/wARP was also employed in the early stages of refinement to aid the manual rebuilding procedure. Most water molecules were picked using the XtalView internal subroutine and additional ones were placed manually using the following criteria: a peak of at least 2.5 σ in the Fo-Fc map, a peak of at least 1 σ in the 2Fo-Fc map, and reasonable intermolecular interactions. Final refinement statistics are shown in Table 1.
Coordinates and structure factors have been deposited in the Protein Data Bank, accession code 2ASU.
Work in EG's laboratory is supported by MRC Programme Grant G9704528. TLB thanks the Wellcome Trust Programme Grant (046073) and the BBSRC Structural Biology Initiative for support. Lauris Kemp and David Pratt are gratefully acknowledged for critical reading of the manuscript.