The acquisition of function is often associated with destabilizing mutations, giving rise to the stability–function tradeoff hypothesis. To test whether function is also accommodated at the expense of foldability, fibroblast growth factor-1 (FGF-1) was subjected to a comprehensive φ-value analysis at each of the 11 turn regions. FGF-1, a β-trefoil fold, represents an excellent model system with which to evaluate the influence of function on foldability: because of its threefold symmetric structure, analysis of FGF-1 allows symmetry-related regions of the protein that are associated with function to be compared directly with those that are not; thus, a structural basis for regions of foldability can potentially be identified. The resulting φ-value distribution of FGF-1 is highly polarized, with the majority of positions described as either folded-like or denatured-like in the folding transition state. Regions important for folding are shown to be asymmetrically distributed within the protein architecture; furthermore, regions associated with function (i.e., heparin-binding affinity and receptor-binding affinity) are localized to regions of the protein that fold after barrier crossing (late in the folding pathway). These results provide experimental support for the foldability–function tradeoff hypothesis in the evolution of FGF-1. Notably, the results identify the potential for folding redundancy in symmetric protein architecture with important implications for protein evolution and design.
Protein function typically relies on the precise alignment of specific main-chain or side-chain groups within the folded structure. Active conformations often require that functional residues be constrained by the global fold to adopt energetically suboptimal arrangements, such as solvent-exposed hydrophobic patches,1 regions of high-charge density, strained conformations,2 and buried polar/charged groups.3, 4 Numerous researchers have reported successful enzyme stabilization by active-site redesign at the expense of function.5–8 Careful analysis of active site mutagenesis data has led to the hypothesis that optimization of functional activity is often accommodated at the expense of thermostability; that is, there exists a fundamental “stability–function tradeoff.”6, 7 In this viewpoint, enzymes preorganize a small subset of residues for efficient function by enforcing structurally strained conformations within the active site, and this is offset by favorable interactions distributed over the remaining majority of the protein structure.3, 4, 9–12
Computational studies have provided strong support for an additional but distinct foldability–function tradeoff hypothesis, which suggests that native state strain is only one manifestation of the burden imposed by the acquisition of function. In this viewpoint, the requirements of function enforce structural features that are unlikely to be optimal to nucleate protein folding (i.e., they will fold after barrier crossing and be denatured-like in the folding transition state) or are more likely to form intermediates before folding properly (requiring folding “back-tracking”). For example, Gō-type simulations paired with molecular dynamics simulations revealed that the functional β-bulge of interleukin-1β (IL-1β) undergoes significant backtracking (proper folding preceded by a misfolding/unfolding event) before folding correctly, a computational result that was subsequently confirmed using real-time refolding NMR experiments.13–15 Thus, the foldability–function tradeoff hypothesis posits that regions contributing to specific function in a protein are likely to be segregated from regions contributing to efficient folding. Folding and stability are entangled properties: stability describes the thermodynamics of folding but not its kinetics, whereas folding describes the kinetics, from which the thermodynamic stability can nonetheless be derived.
Experimental support for the stability–function tradeoff has been well-established,1–9 in part owing to the relative simplicity of characterizing protein stability and functional activity in response to active site mutation. Similar experimental support for a foldability–function tradeoff, however, remains scarce, reflecting a more recent postulate as well as the more significant demands of characterizing a protein's folding pathway. φ-Value analysis,16–18 although labor-intensive, remains one of the few direct experimental techniques with which to identify key positions contributing to the formation of the folding transition state; thus, the foldability–function tradeoff hypothesis can be tested directly using φ-value analysis in combination with functional data. Furthermore, by selecting a protein fold having internal (i.e., rotational) symmetry, regions of function can be compared directly to symmetry-related positions of the protein that are not associated with function but adopt the same general structure to ask whether they make equivalent contributions to the folding transition state. In essence, the protein provides an internal standard that explicitly demonstrates the folding potential (or lack thereof) of a given structural element. If symmetry-related positions do not fold concurrently, with one folding early and another folding late, it suggests that sequence divergence between symmetry-related subdomains (either by evolution of function or neutral drift) has diminished the foldability at specific sites. Furthermore, probing the foldability of symmetry-related regions in symmetric protein architecture can elucidate the potential for folding redundancy. Thus, although φ-value studies have been performed on a variety of proteins, analysis of symmetric protein folds provides unique and valuable information about protein folding and its symmetric redundancy, with important implications for protein evolution and design.
Fibroblast growth factor-1 (FGF-1) is a heparin-binding protein that adopts the β-trefoil fold, a common protein fold that exhibits a threefold rotational symmetry at the tertiary structure level.19, 20 This architecture is composed of three repeating “trefoil-fold” subdomains (40–50 amino acids in length) each composed of a pair of antiparallel β-hairpin structures. Thus, within the structure are a total of 12 β-strands (numbered #1–12) and 11 reverse turns (numbered #1–11; Fig. 1). The primary structure of FGF-1, however, does not directly reflect the higher-order (i.e., tertiary structure) symmetry of the β-trefoil architecture: only a single amino acid position is conserved across the three repeating trefoil-fold subdomains. Such sequence asymmetry may reflect not only genetic drift but also functional and folding regions distributed asymmetrically in the primary structure.
Turns are the principal secondary structure element that permits 180° changes in the polypeptide direction and are thus a requirement for globular protein architecture (i.e., α-helix and β-strand secondary structure are linear structural elements). For efficient folding, approximate turn structure is likely formed early in the folding pathway to allow for structural collapse, and several studies have characterized the importance of turns to both folding (as folding nuclei)21–23 and stability.24 We report a comprehensive φ-value analysis for each of the 11 turns of FGF-1, involving a total of 44 residue positions within the 140 amino acid FGF-1 protein. The folding transition state of FGF-1 is shown to be highly polarized, with the majority of turns adopting either native-like or denatured-like structure in the folding transition state. However, symmetry-related turns do not fold concurrently, indicating that structures important for foldability are asymmetrically distributed over the protein structure. Turns associated with heparin-binding functionality and receptor-binding sites appear largely unstructured in the folding transition state. Functionally important residues are therefore highly segregated from the regions of the protein optimized for foldability, in support of the foldability–function tradeoff hypothesis.
The isothermal equilibrium denaturation data for all turn mutations used in the φ-value analysis of FGF-1 are given in Table I. FGF-1 is a weak mesophile as regards thermostability (ΔGunfolding = 21.1 kJ/mol),25, 28 and certain turn mutations can destabilize the protein by a magnitude approaching the overall ΔGunfolding; thus, stabilizing background mutations were necessary for a subset of mutations. Of the 44 positions evaluated, 28 yielded |ΔΔG| ≥ 2.0 kJ/mol (i.e., >3σ for ΔΔG measurements) and were therefore of sufficient magnitude to permit accurate φ-value analysis.16 Complete folding and unfolding kinetic data were collected for this set of mutations (Table II). The ΔΔG values determined from isothermal equilibrium denaturation (ΔΔGiso) and from folding kinetic data (ΔΔGkin) are in good agreement (Fig. 2, upper panel), supporting the two-state denaturation assumption.25 A plot of the derived φ-values (Fig. 2, lower panel) indicates that FGF-1 has a highly polarized transition state, with the vast majority of evaluated positions having φ-values clustered around 1.0 or 0.0 (i.e., these positions are either fully native-like or denatured-like, respectively, in the folding transition state). Thus, a specific subset of turn positions in FGF-1 makes a critical contribution to the folding pathway, while others make little contribution. Despite the threefold internal symmetry characteristic of the β-trefoil architecture (involving three repeated trefoil-fold subdomains of 42–45 amino acids having an r.m.s.d. for main chain atoms of ≤1.3 Å29), the key regions identified for folding are asymmetrically distributed in the FGF-1 structure (Fig. 3, upper panel). For example, turns #1, #5, and #9 are related by the threefold internal symmetry; however, turn #1 is unstructured in the folding transition state, while turn #5 is natively structured, and turn #9 is partially structured. Notably, there is no set of symmetry-related positions that fold concurrently.
The general regions that contribute to the folding transition state appear essentially contained within the second half of the first trefoil-fold subdomain and most of the second trefoil-fold subdomain (i.e., comprising ∼50% of the protein structure). In contrast, the other regions (including essentially all the third trefoil-fold subdomain) contribute little to the folding transition state (and therefore fold late in the folding pathway).
Table I. Isothermal Equilibrium Denaturation Data for FGF-1 Turn Mutants
Heparin and membrane-bound heparan (heparan sulfate proteoglycan; HSPG), with a large negative charge density, are known to bind FGF-1 with high affinity.34, 35 The role of HSPG binding appears crucial for the proper biological function of FGF-1. Increased tissue levels of HSPG restrict the distribution of signaling molecules such as FGF-1, can regulate concentration gradients of such signaling molecules, and may play a role in pattern formation in embryogenesis; conversely, decreased levels of HSPG in vasculature promote long-range transport of such signaling molecules.36 Thus, HSPG binding is postulated to be a key determinant of the pharmacokinetic properties of FGF-1.37–39 Furthermore, the competent signal transduction complex of FGF-1 involves a ternary interaction between FGF-1, FGF receptor, and HSPG.30–33 The addition of soluble heparin to FGF-1 confers resistance to thermal denaturation, chemical denaturation, and proteolysis,28, 40, 41 and inclusion of heparin in the formulation of FGF-1 greatly improves its potency, stability, storage, and reconstitution properties.28 Thus, heparin binding represents a key functionality that regulates the tissue distribution, pharmacokinetics, and receptor signaling of FGF-1.
A large body of published work, involving numerous investigators and wide-ranging methodologies, has unambiguously identified amino acid positions associated with heparin-binding and receptor-binding functionality in FGF-1. For example, there are nine different molecular structures (seven X-ray and two NMR) of FGF-1 in complex with receptor and/or heparin analogues (PDB accession 1EVT, 1DJS, 1E0O, 1RY7, 2ERM, 1HKN, 1RML, 1AFC, and 1AXM) that serve to identify structural details of the regions of FGF-1 associated with both receptor and heparin binding. A large number of functional studies (involving chemical modification, point mutations, deletion mutations, homologous substitution mutations, and peptide-binding competition studies) in combination with analytical ultracentrifugation, surface plasmon resonance, and affinity chromatography validate the above structural data.42–46 Heparin binds a specific cluster of basic residues with positive charge density, contributed mostly by the first β-hairpin and the last two-thirds of the third trefoil-fold subdomain (Fig. 3, lower panel). Furthermore, within such regions, heparin-binding and receptor-binding functionalities are associated with local structural deviations from ideal threefold symmetry. For example, positions 120–122 in turn #11 (which contribute significantly to heparin-binding functionality46) represent a structural insertion in comparison with the symmetry-related turns #3 and #7 (which are not involved in heparin-binding function).
Additionally, residue positions 104–106 in turn #9 (which contribute to the low-affinity receptor-binding site) are another apparent structural insertion in comparison with the symmetry-related turns #1 and #5.31, 43 Previous studies on forms of FGF-1 mutated to have increased primary sequence symmetry have observed a significant stability–function tradeoff in this region: deletion of the insertions at positions 104–106 and 120–122 (part of turns #9 and #11, respectively) increases protein stability by a substantial 16 kJ/mol (i.e., increasing by 50% the ΔGunfolding of the protein) but diminishes heparin-binding affinity (i.e., KD for sucrose octasulfate) by an order of magnitude.46 Notably, deletion of residues in these heparin-binding regions increases the folding rate constant by a factor of 20, while the unfolding rate constant is largely unaffected.46 Thus, heparin-binding functionality is accommodated at the expense of thermostability, and the kinetic basis of this loss of thermostability lies principally in foldability (i.e., folding kinetics).
Strikingly, the regions associated with heparin affinity are, without exception, observed to be denatured-like in the folding transition state (Fig. 3). Sites known to support receptor binding and positions identified as being denatured in the transition state are largely coincident, with limited exceptions. Notably, turns that are related by symmetry to the heparin-binding site, but that do not participate in heparin-binding or receptor-binding functionality, are observed to be folded in the transition state (e.g., turn #5 compared to turns #1 or #9; turn #2 compared to turn #10; and turn #7 compared to turn #11), critically demonstrating that the inherent architecture of these positions is compatible with foldability. On the basis of the above-mentioned data, we conclude that formation of the heparin-binding site, and the majority of the receptor-binding site, necessitates specific physicochemical properties (i.e., positive charge density repulsion or strain associated with structural insertions that provide for molecular recognition), which preclude such regions from also efficiently participating in protein folding nucleation. The folding nucleus of FGF-1 appears largely localized to the second trefoil-fold subdomain, a region of the protein with a role that appears principally structural rather than functional. Taken together, our results show a general asymmetric segregation (despite the threefold symmetric tertiary structure) of critical folding and functional elements in FGF-1, thus supporting the foldability–function tradeoff hypothesis.
Seven of 11 total turn regions in FGF-1 are observed by φ-value analysis to possess native structure in the folding transition state, a result that underpins the importance of turn formation early in the folding of FGF-1. As highlighted by Meiering and coworkers,47 the folding of FGF-1 was thought to differ from the folding pathways of both hisactophilin and IL-1β. The basis for this conclusion was a report that probed the folding of FGF-1 using NMR H/D exchange studies and identified the association of the N- and C-termini as the first step in the folding pathway.48 The φ-value analysis presented here is inconsistent with this proposed folding pathway: both turns #1 and #11 are observed to fold after barrier crossing, and neither turn is part of the folding nucleus (which approximately spans turns #2–#7). In the absence of significant residual structure, termini closure as the first step in folding seems unlikely from an entropic standpoint, as it would require association of the two most distal structural elements of the protein. Indeed, the success of relative contact order,49 a measure of topological complexity, in predicting folding rates is based on the preference of nearest-neighbor interactions to form first, which then support the formation of more distal interactions (so-called “zipping and assembly”50). Finally, refinement of anisotropic displacement parameters from a 1.10 Å X-ray diffraction dataset suggests that the N- and C-terminal regions of FGF-1 do not form a rigid body; instead, the N- and C-termini appear to be sliding past one another, suggestive of a tenuous interaction (even when stabilized by the folded structure).51 Thus, our analysis of the folding of FGF-1 is inconsistent with the published report of Yu and coworkers,48 but is in general agreement with the folding studies of other β-trefoil proteins in which the central strands of the protein are observed to fold faster than those on the periphery.47, 52, 53
Only a key subset of structural elements (turns #2–#5 and turn #7, spanning ∼50% of the overall protein) appears necessary to confer efficient foldability to FGF-1. The entire protein is not (and apparently does not need to be) optimized for foldability; the regions not contributing to formation of the folding transition state instead segregate to regions of HSPG and receptor-binding functionalities. In the case of FGF-1, the structural regions that are observed to fold late (i.e., have decreased foldability) do so not because of a structurally intrinsic inability to efficiently fold: in every case, at least one symmetry-related position is observed to be part of the folding nucleus (and folds early). Therefore, the inability of regions to efficiently fold appears due to differences in the primary structure or the presence of a symmetry-breaking structural insertion. Thus, if the requirement for function is relaxed, regions associated with functionality could instead be optimized for foldability (e.g., by substitution of the primary structure of the symmetry-related regions that comprise the folding nucleus). The resulting protein would exhibit exact primary sequence symmetry and contain a foldable structural element at each of the symmetry-related positions. It has been postulated that such a protein might exhibit inefficient folding (due to folding frustration involving regions of identical primary structure),54, 55 or conversely, might possess highly redundant, overlapping folding nuclei, thereby promoting folding co-operativity.56 We note that a protein with folding pathway redundancy might, in principle, retain efficient folding despite a localized deleterious mutation due to compensation by the folding-competent symmetry-related positions. 
Recent reports of de novo designed purely symmetric β-trefoil proteins that efficiently fold and are hyperthermophilic in stability57–59 demonstrate that pure primary structure symmetry does not necessarily result in folding frustration and therefore support the hypothesis of redundant foldability. If folding redundancy is a property of purely symmetric proteins (i.e., as would result from gene duplication and fusion replication errors60–62), it highlights a critical advantage of symmetric protein architecture over asymmetric architecture in protein evolution and de novo design; namely, an intrinsic ability to tolerate diverse functional mutation and retain efficient foldability.
Materials and Methods
Mutant construction and protein purification
Ala point mutations in FGF-1 for φ-value analysis were constructed using the QuikChange® site-directed mutagenesis method and wild-type FGF-1 as a template. Kinetic parameters and φ-values of several Ala mutations were previously reported using wild-type FGF-1, H93G, or K12V/C117V/P134V as a template depending on the requirement of a stabilizing background. Mutant protein expression and purification procedures were reported previously.46 Purified mutant protein was exchanged into 20 mM N-(2-acetamido)iminodiacetic acid (ADA), 0.1 M NaCl, pH 6.6 (“ADA buffer”) with the addition of 2 mM DTT. An extinction coefficient of E280 nm (0.1%, 1 cm) = 1.26 (ref. 63) was used for FGF-1 and mutants thereof.
Isothermal equilibrium denaturation
Isothermal equilibrium denaturation by guanidine HCl (GuHCl) for Ala point mutations was performed as previously described64 using fluorescence as the spectroscopic probe. Briefly, fluorescence data were collected on a Cary Eclipse fluorescence spectrophotometer (Agilent Technology, Santa Clara, CA) equipped with a Peltier controlled-temperature regulator at 298 K and using a 1.0-cm path length cuvette. Protein samples (about 5.0 μM) were equilibrated overnight in ADA buffer at 298 K in 0.1 M increments of GuHCl. Samples were excited at 295 nm, and emission was measured from 304 to 500 nm. Scans were collected in triplicate, averaged, and buffer-subtracted. Isothermal equilibrium denaturation of the G71A mutation was measured by circular dichroism (CD) due to its abnormal fluorescence profile. CD data of 25 μM samples were collected on a Jasco model 810 CD spectrophotometer (Jasco, Easton, MD) equipped with a Peltier controlled-temperature regulator at 298 K and using a 1-mm path length cuvette. The unfolding process was monitored by quantifying the change in CD signal at 227 nm with increasing GuHCl. Data were analyzed using the general purpose nonlinear least-squares fitting program DataFit (Oakdale Engineering, Oakdale, PA) implementing a six-parameter, two-state model.65 The effect of a given mutation upon the stability of the protein (ΔΔG) was calculated by taking the difference between the midpoint of denaturation (Cm value) for reference and mutant proteins and multiplying by the average of the m values, as described by Pace and Scholtz;66 a negative value indicates that the mutation is stabilizing relative to the reference protein.
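The ΔΔG calculation described above can be sketched in Python as follows. This is a minimal illustration and not the DataFit implementation used in the study: the function names, the parameter names, and the sign convention (ΔG(D) = ΔG0 − m[D] with m positive, so that Cm = ΔG0/m) are assumptions introduced here, and the numbers in the usage lines are placeholders rather than measured values.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def two_state_signal(d, f_n0, s_n, f_d0, s_d, dg0, m, temp=298.0):
    """Six-parameter, two-state denaturation curve (assumed form).

    f_n0, s_n: intercept and slope of the native baseline
    f_d0, s_d: intercept and slope of the denatured baseline
    dg0, m:    unfolding free energy at 0 M denaturant (kJ/mol) and its
               linear dependence on denaturant, dG(d) = dg0 - m * d
    """
    k_unf = math.exp(-(dg0 - m * d) / (R_KJ * temp))  # [denatured]/[native]
    native = f_n0 + s_n * d
    denatured = f_d0 + s_d * d
    return (native + denatured * k_unf) / (1.0 + k_unf)


def ddg_from_midpoints(cm_ref, cm_mut, m_ref, m_mut):
    """ddG = (Cm_ref - Cm_mut) * mean(m); a negative value is stabilizing."""
    return (cm_ref - cm_mut) * 0.5 * (m_ref + m_mut)


# Placeholder values for illustration only (Cm in M, m in kJ/(mol*M)).
ddg = ddg_from_midpoints(cm_ref=1.0, cm_mut=0.8, m_ref=20.0, m_mut=18.0)
print(f"ddG = {ddg:.2f} kJ/mol")
```

With this convention, a mutant whose Cm is lower than that of the reference gives a positive ΔΔG (destabilizing), in line with the statement that a negative value indicates a stabilizing mutation.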
Folding/unfolding kinetic analysis
Folding and unfolding kinetic measurements of FGF-1 Ala point mutations followed previously described methods.26 Briefly, denatured protein samples for folding kinetics measurements were prepared by adding GuHCl to 2.0 M followed by overnight incubation to permit equilibration. All folding kinetic data were collected using an Applied Photophysics SX20 stopped-flow system (Applied Photophysics, Surrey, United Kingdom) at 298 K with excitation wavelength at 295 nm and emission at 350 nm. Folding was initiated by a 1:10 dilution of 40 μM denatured protein into ADA buffer with denaturant concentrations increasing in increments of 0.05 M up to the midpoint of denaturation as determined by isothermal equilibrium denaturation measurements. The data collection strategy was designed to span approximately five half-lives or >97% of the expected fluorescence signal change between the fully denatured and native states. Because of the comparatively slower kinetics, unfolding kinetics measurements for Ala point mutations were performed using manual mixing. Protein samples (∼30 μM) were dialyzed against ADA buffer overnight at 298 K. Unfolding was initiated by a 1:10 dilution into ADA buffer with a final GuHCl concentration of 1.5–5.5 M in 0.5 M increments. All unfolding data were collected using a Cary Eclipse fluorescence spectrophotometer (Agilent Technology) equipped with a Peltier-controlled temperature unit at 298 K. Data collection times for each protein were designed so as to quantify the fluorescence signal over three to four half-lives or >93% of the total expected amplitude.
The kinetic rates and amplitudes versus denaturant concentration were calculated from the time-dependent change in fluorescence using a single exponential model (or double exponential model under low-GuHCl concentrations). Folding and unfolding rate constant data were fit to a global function describing the contribution of both rate constants to the observed kinetics as a function of denaturant (chevron plot) as described by Fersht67:
$$\ln k_{\mathrm{obs}} = \ln\left[k_{f0}\,\exp\left(m_{kf}[\mathrm{D}]\right) + k_{u0}\,\exp\left(m_{ku}[\mathrm{D}]\right)\right]$$
where kf0 and ku0 are the folding and unfolding rate constants, respectively, extrapolated to 0 M denaturant and mkf and mku are the slopes of the linear refolding and unfolding arms, respectively, of the chevron plot. Changes in the free energy barrier to folding, ΔΔGf, and unfolding, ΔΔGu, were calculated from the global fit of the kinetic data:
$$\Delta\Delta G_f = RT\,\ln\left(k_f^{\mathrm{ref}}/k_f^{\mathrm{mut}}\right), \qquad \Delta\Delta G_u = RT\,\ln\left(k_u^{\mathrm{ref}}/k_u^{\mathrm{mut}}\right)$$
where kf and ku are calculated at the average midpoint of denaturation for the reference and mutant proteins.
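The chevron analysis described above can be sketched in Python as follows. This is a hedged illustration rather than the fitting code used in the study. It assumes the standard two-state chevron form, ln k_obs = ln[kf0·exp(mkf[D]) + ku0·exp(mku[D])], and a barrier change defined as mutant minus reference, RT·ln(k_ref/k_mut). All function names and the numerical parameters are hypothetical placeholders, not values from Table II.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)
TEMP_K = 298.0   # temperature of the kinetic measurements


def rate_constant(d, k0, m_k):
    """Rate constant at denaturant concentration d: ln k = ln k0 + m_k * d."""
    return k0 * math.exp(m_k * d)


def ln_k_obs(d, kf0, m_kf, ku0, m_ku):
    """Two-state chevron: ln of the sum of folding and unfolding rates."""
    return math.log(rate_constant(d, kf0, m_kf) + rate_constant(d, ku0, m_ku))


def barrier_change(k_ref, k_mut, temp=TEMP_K):
    """Change in activation free energy (mutant minus reference), kJ/mol.

    Assumed convention: positive when the mutant rate is slower.
    """
    return R_KJ * temp * math.log(k_ref / k_mut)


def kinetic_ddg(kf_ref, kf_mut, ku_ref, ku_mut, temp=TEMP_K):
    """ddG_kin = ddG_f - ddG_u; positive means the mutant is destabilized."""
    return barrier_change(kf_ref, kf_mut, temp) - barrier_change(ku_ref, ku_mut, temp)


# Placeholder fit parameters, evaluated at a hypothetical average midpoint.
d_mid = 1.0
kf_ref = rate_constant(d_mid, k0=4.0, m_k=-6.0)
kf_mut = rate_constant(d_mid, k0=1.0, m_k=-6.0)
ddg_f = barrier_change(kf_ref, kf_mut)
print(f"ddG_f = {ddg_f:.2f} kJ/mol")
```

Evaluating both proteins at the same denaturant concentration (the average midpoint) keeps the comparison of rate constants free of extrapolation to 0 M denaturant, which is the stated reason for reporting values at that point.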
Alanine scanning was performed at each position of the 11 turns that comprise the β-trefoil architecture of FGF-1. Only mutants with a |ΔΔG| > 2.0 kJ/mol were considered suitable for φ-value analysis (i.e., >3σ in the error of the ΔΔG measurement). φ-Values were calculated following the procedure established by Fersht et al.68:
$$\varphi = \Delta\Delta G_f / \Delta\Delta G$$
where ΔΔG is derived from equilibrium experiments as described earlier. Note that all reported φ-values are at the average midpoint of denaturation for the mutant and reference proteins.24
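The φ-value calculation and the |ΔΔG| cutoff can be sketched in Python as follows. This is a minimal illustration assuming the conventional definition φ = ΔΔGf/ΔΔG, with ΔΔGf from the kinetic fit and ΔΔG from equilibrium denaturation, both in kJ/mol and with the same sign convention. The function name and the example numbers are hypothetical.

```python
def phi_value(ddg_f, ddg_eq, cutoff=2.0):
    """Return phi = ddG_f / ddG, or None when |ddG| does not exceed the cutoff.

    ddg_f:  change in the folding barrier from kinetics (kJ/mol)
    ddg_eq: change in stability from equilibrium denaturation (kJ/mol)
    cutoff: minimum |ddG| regarded as reliable for phi analysis (kJ/mol)
    """
    if abs(ddg_eq) <= cutoff:
        return None  # perturbation too small for a meaningful ratio
    return ddg_f / ddg_eq


# Placeholder inputs for illustration only.
print(phi_value(4.0, 4.0))  # mutation fully felt in the transition state
print(phi_value(0.0, 5.0))  # mutation not felt in the transition state
print(phi_value(1.0, 1.5))  # below the cutoff, so no phi is reported
```

A φ near 1 indicates that the mutated position is native-like in the folding transition state, and a φ near 0 indicates that it is denatured-like. The cutoff matters because a small denominator amplifies measurement error in the ratio.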
The authors declare no competing interests that could undermine the objectivity or integrity of this work.