In most thermodynamic experiments only ΔG and ΔH are measured independently, ΔS being obtained by subtraction. If |ΔG| < |ΔH|, which is very often the case, then the high correlation between errors in ΔH and ΔS can produce linear ΔS-ΔH plots with a high correlation coefficient. This possibility was originally discussed by Lumry and Rajender (1970) in a detailed review of compensation. A concise discussion of this was presented by Krug et al. (1976), who showed that a simple statistical test can be used to determine the significance of such ΔS versus ΔH plots. The confidence interval for Tc is determined from a linear regression analysis of this plot and we ask whether the experimental temperature T (or harmonic mean experimental temperature if data are obtained at different temperatures) lies outside this confidence interval. Thus, the correlation would not be significant at the 95% confidence level if
$$T_c - 2\sigma < T < T_c + 2\sigma$$
where σ is the estimated standard error in Tc from the fit. Krug et al. (1976) used this test to examine 38 data sets purportedly showing high ΔS-ΔH compensation. For only three of these, according to Krug et al. (1976), “was the hypothesis rejected that the observed compensation pattern can be explained as an artifact.” Unfortunately, the simple test proposed by Krug et al. (1976) does not appear to be widely known or applied, perhaps because of the predominantly negative results it produced. Below I reexamine three recent sets of protein thermodynamic data using this test.
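The significance test described above can be sketched numerically. The following is a minimal illustration, not the procedure of Krug et al. (1976) in detail: it assumes ΔH is regressed on ΔS by ordinary least squares so that the slope is Tc directly, and it takes Tc ± 2σ as the approximate 95% confidence interval. The function name `compensation_test` and the synthetic data (a narrow ΔG window around -8 kcal/mole combined with a wide ΔH range) are hypothetical.

```python
import numpy as np

def compensation_test(dH, dS, T_exp, n_sigma=2.0):
    """Fit dH = a + Tc*dS by least squares.

    Returns (Tc, sigma, artifact_possible), where sigma is the standard
    error of the slope and artifact_possible is True when T_exp lies
    inside Tc +/- n_sigma*sigma (compensation not significant).
    """
    dH = np.asarray(dH, dtype=float)
    dS = np.asarray(dS, dtype=float)
    n = dH.size
    x = dS - dS.mean()
    Sxx = np.sum(x ** 2)
    Tc = np.sum(x * (dH - dH.mean())) / Sxx          # slope
    intercept = dH.mean() - Tc * dS.mean()
    resid = dH - (intercept + Tc * dS)
    s2 = np.sum(resid ** 2) / (n - 2)                # residual variance
    sigma = np.sqrt(s2 / Sxx)                        # standard error of slope
    artifact_possible = abs(T_exp - Tc) < n_sigma * sigma
    return Tc, sigma, artifact_possible

# Hypothetical data: dG confined to a 1 kcal/mole window, dH spanning 40.
T_exp = 298.0                                        # K
dH_demo = np.linspace(-20.0, 20.0, 21)               # kcal/mole
dG_demo = -8.0 + 0.5 * (-1.0) ** np.arange(21)       # kcal/mole
dS_demo = (dH_demo - dG_demo) / T_exp                # kcal/mole/K, by subtraction

Tc, sigma, artifact_possible = compensation_test(dH_demo, dS_demo, T_exp)
print(f"Tc = {Tc:.1f} K, sigma = {sigma:.1f} K, artifact possible: {artifact_possible}")
```

Because ΔS is obtained by subtraction and ΔG barely varies, the fitted Tc in this synthetic case lands close to the experimental temperature, which is the situation the test is designed to flag.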
Unfolding of cytochrome c
Milne et al. (1999) measured the hydrogen-exchange protection factors for amide protons in oxidized and reduced cytochrome c at various temperatures. From the T dependence of these protection factors, they determined ΔS and ΔH for opening at each amide and found a high correlation. They found no reason for this correlation, involving, as it does, a mixture of local protein fluctuations and larger unfolding events at different parts of the protein.
ΔS versus ΔH plots for these three experiments are shown by the solid symbols in Figure 1A–C, respectively. In each case, an impressive linear plot resulted, with a high correlation coefficient (Table 1, Col. 2). This table also shows, however, that for the first two experiments T falls within the 95% confidence limits of Tc (Col. 4), whereas for the third set, T falls just outside the confidence limits. Thus, in spite of the linearity, the S-H correlations range from clearly not significant to barely significant. To show these conclusions in a more graphic and forceful way, I regenerated each ΔS versus ΔH plot using the actual ΔG data combined with random ΔH values spanning about the same range. The randomly selected ΔHs give equally good or slightly better fits with statistically indistinguishable slopes (Fig. 1A–C, open symbols; Table 1).
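The random-ΔH control described above is easy to reproduce in outline. The sketch below is illustrative only: the ΔG values, the ΔH range, the seed, and the helper name `random_dH_control` are assumptions, not the data or code behind Figure 1. It pairs a fixed set of ΔG values with uniformly random ΔH values, obtains ΔS by subtraction, and then reports the correlation coefficient and fitted slope.

```python
import numpy as np

def random_dH_control(dG, T, dH_range, seed=0):
    """Pair observed dG values with random dH values; dS follows by subtraction."""
    rng = np.random.default_rng(seed)
    dG = np.asarray(dG, dtype=float)
    dH = rng.uniform(dH_range[0], dH_range[1], size=dG.size)
    dS = (dH - dG) / T                      # from dG = dH - T*dS
    return dH, dS

# Hypothetical inputs: a 2 kcal/mole dG window and a 40 kcal/mole dH range.
T_exp = 298.0                               # K
dG_obs = np.linspace(-9.0, -7.0, 15)        # kcal/mole
dH_rand, dS_rand = random_dH_control(dG_obs, T_exp, (-20.0, 20.0))

r = np.corrcoef(dH_rand, dS_rand)[0, 1]     # correlation coefficient
Tc_fit = np.polyfit(dS_rand, dH_rand, 1)[0] # slope of dH versus dS, in K
print(f"r = {r:.4f}, fitted Tc = {Tc_fit:.1f} K (T = {T_exp} K)")
```

Since the random ΔH values carry no physical information, any strong linearity in this output comes solely from the narrow ΔG window, which is the point of the control.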
Lest the impression arise that these two tests would yield a negative result for any set of ΔS-ΔH data, data for entropy and enthalpy of solvation of the linear alkane series pentane through hexadecane (Ben-Naim and Marcus 1984) are presented in Figure 1D and Table 1. Here, T falls many standard deviations outside the Tc confidence limits, and the random ΔHs produce a clearly inferior fit with a significantly different slope (Tc = 330K). However, this is simply an example of additivity producing a linear ΔS-ΔH plot (category 3a above).
In each of the three protein data sets, reasonable arguments can be made that prior experimental or biological factors constrain the range of ΔGs that can be observed. For calcium-binding proteins, the range of affinities must lie within some biologically functional window determined by in vivo calcium levels, a requirement to modulate activity by removing calcium and such. For small protein unfolding, ΔG must have some minimum value (<6 kcal/mole) to have any stability, whereas the requirement for equilibrium reversible calorimetry in aqueous solution puts an upper bound on the stability of proteins for which one can easily obtain measurements. Thus, the contribution to stability per residue for easily measurable proteins is constrained to be a few tenths of a kcal/mole. For hydrogen exchange data on cytochrome c, the maximum protection factor has an upper bound given by the protein's global stability (≈13 kcal/mole for the oxidized form), whereas the difficulty of measuring rapidly exchanging hydrogens puts a lower bound on the measurable free energy values (≈6 kcal/mole). Thus for all three sets of experimental data the a priori range of observable ΔGs is small compared to the average values (Table 1, Col. 6) or the average enthalpy. Other examples of this no doubt can be found in the literature. For example, another recent study of hydrogen exchange in lysozyme by Dixon et al. (2000) also produced an impressive ΔS versus ΔH plot, which was attributed to an even more narrow observation window for ΔG (a range of only ≈2.3 kcal/mole). In contrast, Gallicchio et al. (1998) provide examples of non-compensation, where S and H change linearly but with opposite sign.
In each of the three protein data sets, it is statistically likely that the compensation is produced by the high correlation between the errors in estimating H and S. One cannot, however, rule out the possibility that compensation with Tc = T would be seen in these systems if the H and S were measured more precisely, but on balance, nothing in the data examined so far provides convincing evidence of extra-thermodynamic compensation.
It has been suggested that compensation is an intrinsic property of complex systems that have many soft modes of fluctuation, which would include aqueous solutions and soluble proteins (Lumry and Rajender 1970; Weber 1995; Qian and Hopfield 1996; Qian 1998). Below I examine this using a simple statistical mechanical model of a complex system to explore whether and under what conditions extra-thermodynamic S-H compensation can occur, and what it can tell us about the system. For this purpose, I adopt the following working description of a complex system: It is composed of many atoms, and hence many degrees of freedom, governed by some potential energy function (Hamiltonian). This Hamiltonian includes many kinds of interactions (van der Waals, electrostatic, torsions, etc.). These interactions have different functional forms and distance dependencies, which result in a complex multidimensional energy surface with a very large number of closely spaced minima (energy states). Let the number of states with an energy within some small range U to U + δU be ω(U)δU, where ω(U) is the density of states. The configurational part of the partition function Q is obtained by integrating over all energy levels
$$Q = \int \omega(U)\, e^{-\beta U}\, dU \tag{2}$$
where β = 1/kT, T is the temperature, and k is the Boltzmann constant. To model the effect of a perturbation, it is assumed that the energy levels in the range U′ to U′ + δU are significantly perturbed by raising them by an amount ΔU. The partition function of the perturbed system Q′ can be written in terms of the unperturbed system as
$$Q' = Q + \omega(U')\,\delta U \left[ e^{-\beta (U' + \Delta U)} - e^{-\beta U'} \right] \approx Q - \omega(U')\, e^{-\beta U'}\, \delta U \tag{3}$$
where the second equality here defines a significant perturbation to mean ΔU > 3kT, so the first exponential term is small compared to the second. The change in free energy produced by the perturbation is then given by
$$\Delta A = -kT \ln\frac{Q'}{Q} = -kT \ln\left[ 1 - P(U')\,\delta U \right] \tag{4}$$
where P(U′) = ω(U′)e^{−βU′}/Q is the unperturbed probability distribution, that is, the probability of finding the system in a state with energy U′ in the unperturbed system. In a complex system with many closely spaced energy levels, the probability of being in any small range of energy levels is small, that is, P(U′)δU << 1, and a linear expansion of the logarithmic term gives
$$\Delta A \approx kT\, P(U')\,\delta U \tag{5}$$
The mean energy in the unperturbed system is
$$E = \frac{1}{Q} \int U\, \omega(U)\, e^{-\beta U}\, dU = \int U\, P(U)\, dU \tag{6}$$
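(The difference between mean energy E and enthalpy H in most biochemical experiments is small and no distinction is made between them here.) With the above definition of a significant perturbation, the mean energy in the perturbed system can be written as
$$E' = \frac{1}{Q'} \left[ \int U\, \omega(U)\, e^{-\beta U}\, dU - U' \omega(U')\, e^{-\beta U'}\, \delta U \right] = \frac{Q}{Q'} \left[ E - U' P(U')\,\delta U \right] \tag{7}$$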
Using equation 3 to substitute for Q′, the ratio of the partition functions in equation 7 may be written
$$\frac{Q}{Q'} = \frac{1}{1 - P(U')\,\delta U} \approx 1 + P(U')\,\delta U \tag{8}$$
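The last equality again uses the fact that P(U′)δU << 1, allowing a linear expansion of the reciprocal. Substituting equation 8 into equation 7 and subtracting the energy of the unperturbed system gives the change in mean energy
$$\Delta E = E' - E = \left[ 1 + P(U')\,\delta U \right] \left[ E - U' P(U')\,\delta U \right] - E \approx (E - U')\, P(U')\,\delta U \tag{9}$$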
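In the last equality the quadratic term U′(P(U′)δU)^2 has been dropped because it is negligible compared to the linear terms. Finally, an expression for the entropy change may be obtained using TS = E − A:
$$\Delta S = \frac{\Delta E - \Delta A}{T} \approx \frac{(E - U' - kT)\, P(U')\,\delta U}{T} \tag{10}$$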
It should be noted that equations 5, 9, and 10 are general. They do not depend on the particular distribution of states, provided the following two assumptions are satisfied: (1) A small number of the states are perturbed and (2) the size of the perturbation is significant. Moreover, the magnitude of the perturbation in A, E, and S depends on the initial energy of the perturbed state(s) and how many are perturbed but not on how much the states are perturbed. These equations also apply to the situation in which the perturbation results in a lowering of the energy levels simply by exchanging the role of perturbed and unperturbed states. This results in free energy, entropy, and enthalpy changes of equal magnitude and opposite sign. In this case U′ refers to the final energy of the perturbed state(s), rather than the initial energy.
One may define a compensation temperature in this model by
$$T_c = \frac{\Delta E}{\Delta S} = T\, \frac{E - U'}{E - U' - kT} \tag{11}$$
Clearly this is not a constant, but it depends on where the energy of the perturbed states lies with respect to the mean energy.
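To make this dependence concrete, a few limiting cases can be worked out. This is a sketch that assumes the linearized forms implied by the derivation above, namely ΔE ≈ (E − U′)P(U′)δU and TΔS ≈ (E − U′ − kT)P(U′)δU, so that the occupancy factor cancels in the ratio:

```latex
\begin{align*}
T_c &= \frac{\Delta E}{\Delta S}
     = T\,\frac{E - U'}{E - U' - kT}
     = \frac{T}{1 - kT/(E - U')} \\[4pt]
U' = E &: \quad \Delta E = 0,\ \ T\Delta S = -kT\,P(U')\,\delta U
        \ \Rightarrow\ T_c = 0 \\
U' \to E - kT &: \quad \Delta S \to 0
        \ \Rightarrow\ |T_c| \to \infty \\
|E - U'| \gg kT &: \quad T_c \to T
\end{align*}
```

Under these assumptions Tc approaches the experimental temperature only when the perturbed levels lie many kT from the mean energy, and it can take any value, including negative ones for E − kT < U′ < E, closer in.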
To provide a concrete example of the entropy–enthalpy behavior in this model, it is useful to use some specific distribution of energy levels, but it should be stressed that the general conclusions do not depend on the specific form. Given the high dimensionality and complexity of the energy surface of a protein it seems reasonable that as one changes any given conformational degree of freedom, there are relatively few minima that can be accessed that are either very low or very high in energy. Most will cluster around some mean value. In particular I will assume a Gaussian density of such minima ω(U) of mean Uo and width σ, which permits an analytical treatment. Because the statistical mechanical behavior is independent of the zero point of the energy scale, the presentation can be further simplified, without loss of generality, by setting Uo = 0.
It also seems reasonable that because many competing interactions from many different atoms contribute to the energy, there is, on average, little correlation in the change produced by simultaneously perturbing any two degrees of freedom. If the first produces an increase in energy, the second is, on average, as likely to decrease the energy as increase it. In other words, each degree of freedom can be treated independently and the total configurational partition function Q can be approximated as the product of the partition functions for each degree of freedom, Q = ∏j qj, and equations 5, 9, and 10 will apply to each qj. I will consider below whether correlations between different degrees of freedom affect the conclusion to be drawn from this model.
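For a Gaussian density of states, the configurational part of the partition function for a given degree of freedom qj is
$$q_j = \int \frac{n_j}{\sigma\sqrt{2\pi}}\, e^{-U^2/2\sigma^2}\, e^{-\beta U}\, dU = n_j\, e^{\beta^2 \sigma^2 / 2} \tag{12}$$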
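where β = 1/kT, and nj is the total number of states accessible to the jth degree of freedom. The probability distribution in the unperturbed system is (see Fig. 2A)
$$P(U) = \frac{1}{\sigma\sqrt{2\pi}}\, \exp\left[ -\frac{(U + \beta\sigma^2)^2}{2\sigma^2} \right] \tag{13}$$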
Using this probability distribution in equations 9 and 10, a plot of ΔS versus ΔE can be generated for a systematic series of perturbations, corresponding to a series of experimental manipulations that affect different energy levels, by varying U′. The result is shown in Figure 3. The resulting plot is ellipsoidal with the major axis aligned along the Tc = T direction. To check the validity of the mathematical approximations made in the derivation of equations 9 and 10, the exact partition function for the perturbed system (equation 3) was evaluated numerically for the Gaussian model. The same values of ΔE and ΔS, to within numerical precision, were obtained as from the approximate analytical results, equations 9 and 10.
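A numerical check of this kind can be sketched as follows. This is not the original calculation: it assumes the linearized forms ΔA ≈ kT·P(U′)δU, ΔE ≈ (E − U′)P(U′)δU, and TΔS ≈ (E − U′ − kT)P(U′)δU for equations 5, 9, and 10, works in units where kT = 1, and uses illustrative values for σ, δU, and ΔU. With Uo = 0 the unperturbed distribution is a Gaussian of width σ centred on E = −βσ², so the exact perturbed quantities can be written in closed form with the error function.

```python
import math
import numpy as np

kT = 1.0                 # assumption: energies measured in units of kT
beta = 1.0 / kT
sigma = 2.0              # width of the Gaussian density of states (illustrative)
dU = 1e-3                # width of the perturbed band of levels (illustrative)
DU = 20.0 * kT           # upward shift of the band, well above 3 kT
E = -beta * sigma ** 2   # unperturbed mean energy for U0 = 0

def P(U):
    """Unperturbed probability density: Gaussian centred on E, width sigma."""
    return math.exp(-(U - E) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def approx_changes(Up):
    """Linearized dA, dE, T*dS for a band starting at Up."""
    occ = P(Up) * dU
    return kT * occ, (E - Up) * occ, (E - Up - kT) * occ

def exact_changes(Up):
    """dA, dE, T*dS from the full perturbed partition function."""
    cdf = lambda u: 0.5 * (1.0 + math.erf((u - E) / (sigma * math.sqrt(2))))
    p = cdf(Up + dU) - cdf(Up)                      # occupancy of the band
    m = E * p - sigma ** 2 * (P(Up + dU) - P(Up))   # integral of U*P(U) over band
    f = math.exp(-beta * DU)                        # Boltzmann factor of the shift
    ratio = 1.0 - p * (1.0 - f)                     # Q'/Q
    dA = -kT * math.log1p(-p * (1.0 - f))
    dE = (E - m + f * (m + DU * p)) / ratio - E
    return dA, dE, dE - dA

# Sweep the perturbed energy U' across the distribution.
Ups = np.linspace(E - 4 * sigma, E + 4 * sigma, 81)
approx = np.array([approx_changes(u) for u in Ups])
exact = np.array([exact_changes(u) for u in Ups])
worst = np.max(np.abs(approx - exact)) / np.max(np.abs(approx))
print(f"largest deviation relative to peak value: {worst:.2e}")
```

Plotting `exact[:, 2]` against `exact[:, 1]` traces the closed loop of TΔS versus ΔE as U′ is varied; the deviation printed at the end indicates how well the linearized expressions hold for the chosen δU.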
The ellipsoidal profile can be explained with reference to equations 9 and 10 and Figure 2A. Consider the effect on E of a perturbation of a given set of states at U′. This is the product of the following two terms: (1) an occupancy term, P(U′)δU, which is always positive, and (2) an energy difference term, E − U′. If we perturb an energy level that is lower than the mean energy of the unperturbed system (U′ < E), then ΔE is positive because on average the system spends more time in higher energy states. The reverse is true for U′ > E. For E = U′ the second term is zero, and there is no effect on the mean energy. The occupancy term is small at very low and very high energies because the probability density of these states is small, hence ΔE → 0. Thus, as increasingly higher energy states are perturbed, ΔE first increases as the occupancy term increases, drops to zero, and becomes negative as the second term changes sign, and finally returns to zero as the probability density decreases (Fig. 2B). The behavior of ΔS is governed by the same equation, except E is offset by −kT. It follows the same profile: It is increasingly positive at low U′, decreasing through zero to negative values and finally returning to zero at high U′ but with a phase shift (Fig. 2B). The resulting plot of ΔS versus ΔE is ellipsoidal.
What can be extracted from this ΔS-ΔE plot? The key parameter in the Gaussian density of states model is σ, the width of the density distribution, that is, the spacing of the energy levels. A more elongated ellipse indicates a greater spacing of energy levels. The direction of the major axis remains unchanged and provides no specific information. The discussion in the previous paragraph indicates that the criteria for ellipsoidal-type ΔE-ΔS plots are rather broad, requiring only that P(U′) tends to zero at high and low U′; that is, any peaked density of states distribution will produce very similar behavior.
Is such behavior seen in experimental systems? Data presented by Eftink and Biltonen (1983) for nucleotide binding to RNase show such behavior. Hooked plots of ΔH versus ΔS for linked binding-conformational change equilibria are observed, which resemble portions of an ellipse. The parameter changed in these plots is the equilibrium constant for the conformational change, that is, the energy gap between the two protein conformations. Thus, their experimental situation is effectively a two-energy-level version of the model presented here. However, in any given system, it is unlikely that a sufficient range of perturbations of U′ is experimentally realizable, so complete elliptical plots are not likely to be seen. A restricted range of perturbations would effectively manifest some portion of the ellipse, as seen by Eftink and Biltonen (1983). Depending upon which portion and how much is accessed, quasi-linear ΔH-ΔS plots of widely varying slope (Tc) could result. In these situations, the particular value of Tc would reveal nothing specific about the energy-level distribution. Examples of non-compensation discussed by Gallicchio et al. (1998) may represent portions of the ellipse with negative slope.
The Gaussian density of states model describes the contribution from a single degree of freedom (DOF) qj. Thus, the net perturbation in ΔE or TΔS is rather small. (In fact, from equations 9 and 10, it must be <kT.) An actual experiment represents the net effect of perturbations to many DOFs. If the perturbations are not correlated, then one would expect many of the contributions to ΔE or TΔS to cancel, resulting in small net values. Thus, larger experimental ΔE or TΔS values presumably result from correlated perturbations from many degrees of freedom. In effect this would produce ΔE and ΔS values corresponding to a summation of similar elliptical curves all aligned along Tc = T. Combined with finite experimental precision, this would most likely result in a fat line with slope Tc ≈ T. This would be difficult to distinguish from the self-evident compensation of case 3. This model suggests that extra-thermodynamic compensation, if it exists, is unlikely to be observed in real experimental systems and difficult to interpret if it is.
In summary, if the range of ΔGs measured in a series of experiments is much smaller than the range of ΔHs, then with respect to ΔH, ΔG ≈ constant. Linear ΔH-ΔS compensation follows immediately from the relationship ΔG = ΔH − TΔS. The question then is whether this arises from (1) larger errors in determining ΔH than ΔG, (2) some extra-experimental constraint that a priori restricts the range of observable ΔGs, or (3) some extra-thermodynamic mechanism of ΔH-ΔS compensation. For the three data sets examined here, the statistical tests strongly suggest, although they cannot prove, the first explanation. I argue that this is because extra-experimental constraints a priori restrict the range of observable ΔGs to less than the precision in ΔH measurements, even though the latter may be carefully measured. Nevertheless, without knowing the molecular origin of the entropy and enthalpy components and from statistical tests alone, one cannot rule out some type of extra-thermodynamic compensation of the type seen in the model presented here.