Protein Science

Edited By: Brian W. Matthews

Online ISSN: 1469-896X

Protein Folding: Short Question-Long Answer

Introduction to the virtual issue

Brian W. Matthews


This Virtual Issue includes selected articles from Protein Science that relate to the folding of proteins. Because this is, arguably, the most common subject discussed in the journal, the selection was not easy.

As a compromise, it was decided to include articles in two general categories. Articles in the first group were chosen to provide coverage of different areas, and all can be considered “citation classics”. Each of these reports has had a major influence in the protein folding field.

The second group of selected articles were all published recently, none earlier than 2008. These have already generated considerable interest and can be considered as pointers to future developments.

In the Beginning

How do proteins fold? The question is simple but finding the full answer has occupied protein scientists for decades with no sign that the end is near.

Kauzmann, in his 1959 review1, is usually given credit for the widely accepted principle that it is the hydrophobic effect that drives folding. Twenty years earlier, however, Bernal2 had the same understanding.

Likewise, Anfinsen’s 1961 paper3 is widely cited for demonstrating that the amino acid sequence determines the three-dimensional structure of a protein, although as early as 1932, Northrop4 had shown that crystalline trypsin lost its activity on heating, and regained it on cooling, suggesting that purified enzymes could be reversibly unfolded and refolded. The first thoroughly convincing experiment showing that a fully unfolded protein could spontaneously refold was carried out by White, in Anfinsen’s laboratory. In two papers in JBC,5,6 published in 1960 and 1961, he showed that ribonuclease A, reduced to break its four disulfide bridges, and fully unfolded, could spontaneously refold in the presence of atmospheric oxygen to form the active structure. In a third paper, the only one of the three for which Anfinsen was a coauthor, the kinetics of formation of the active ribonuclease were described.3 This third report was perhaps especially convincing because the evidence suggested that the eight cysteines formed disulfide bridges at random, and then reshuffled through disulfide exchange to yield the native configuration. It strongly suggested that protein folding is driven by thermodynamics.

Principles of Protein Folding

There is no place better to start than the now classic review from Ken Dill’s group7 “Principles of protein folding – A perspective from simple exact models”. One of the striking conclusions was that unique, compact chain conformations do not necessarily require complicated sequences of amino acids; simplified sequences in which at least one member of the sequence is solvent-averse may be sufficient.

As a complement to this bold theoretical prediction we include an example of Michael Hecht’s experimental studies on de novo proteins.8 In this report the Hecht group shows that a polypeptide sequence based on simple combinations of polar and non-polar amino acids forms a four-helix bundle whose structure could be determined by NMR.

Although simple models can often yield remarkable insights, increased computing power has made atom-based approaches more and more prevalent. We include an example from Ma and Nussinov9 in which molecular dynamics simulations of alanine-rich β-sheet oligomers were used to investigate the basis of seeding and seed growth in amyloid formation.

Folding Energetics

As is well known, folded proteins are only marginally stable. This may be a requirement for function, but having large, opposing energy terms in play does complicate the theoretical analysis. When a protein is unfolded with a denaturant such as urea, it is common to determine the m value, i.e., the dependence of the free energy of unfolding on denaturant concentration. In the article included, Myers, Pace and Scholtz10 showed that m values correlate very strongly with the area of protein surface exposed to the solvent on unfolding. Their work also confirmed that the change in accessible surface area of a protein on unfolding is the main structural factor in determining the heat capacity change.
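As an illustration of how an m value is typically obtained, the sketch below assumes the commonly used linear extrapolation model, in which the free energy of unfolding varies linearly with denaturant concentration: ΔG = ΔG(H2O) − m[denaturant]. The function name, the units and the data points are hypothetical and are not taken from the article by Myers, Pace and Scholtz.

```python
def fit_linear_extrapolation(denaturant_molar, delta_g_unfolding):
    """Least-squares fit of dG = dG_water - m * [denaturant].

    Returns (dG_water, m_value). The linear model is an assumption;
    real data are normally fitted only in the transition region.
    """
    n = len(denaturant_molar)
    if n != len(delta_g_unfolding) or n < 2:
        raise ValueError("need at least two paired measurements")
    mean_x = sum(denaturant_molar) / n
    mean_y = sum(delta_g_unfolding) / n
    sxx = sum((x - mean_x) ** 2 for x in denaturant_molar)
    if sxx == 0:
        raise ValueError("denaturant concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(denaturant_molar, delta_g_unfolding))
    slope = sxy / sxx
    dg_water = mean_y - slope * mean_x  # intercept: stability without denaturant
    m_value = -slope                    # m is reported as a positive number
    return dg_water, m_value


# Hypothetical, noise-free data (urea in mol/L, dG in kcal/mol).
urea_molar = [3.0, 3.5, 4.0, 4.5, 5.0]
dg_kcal_per_mol = [2.0, 1.0, 0.0, -1.0, -2.0]

dg_water, m_value = fit_linear_extrapolation(urea_molar, dg_kcal_per_mol)
midpoint_molar = dg_water / m_value  # concentration at which dG = 0
```

Under this model the midpoint of the unfolding transition is simply ΔG(H2O)/m, which is why a larger m value corresponds to a sharper transition for a given stability.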

Robert (Buzz) Baldwin and his group have used the helix-forming ability of relatively short alanine-based peptides to obtain key insights into helix stability. In the classic report included here11 they conclude that the formation of peptide H-bonds can largely offset the unfavorable entropy change caused by fixing the peptide backbone. A more recent approach, using amide-to-ester backbone mutagenesis to measure backbone-backbone H-bonding in helices, is described in the article by Gao and Kelly.12

Even when high-resolution structures are available, the accurate calculation of protein stability remains extremely challenging. Difficulties include the complexity of protein structures, the presence of solvent and imperfect potential functions. Recently Zhou and Zhou13 have introduced new structure-derived potentials which were of superior accuracy in predicting the stabilities of 895 mutant proteins. There has been considerable interest in this approach.

The use of single-molecule approaches to study protein folding and function is rapidly becoming popular. Here we include just one intriguing example in which Guoliang Yang and coworkers show that macromolecular crowding enhances the mechanical stability of protein molecules.14

Kinetics of Folding

Although the details of protein folding are complicated, Plaxco, Simons and Baker showed that for single-domain proteins there is a striking correlation between the rate of folding and a simple topological parameter called the relative contact order. In an extension of that work, the included article15 uses topological principles to predict folding rates for both two-state and multistate folding proteins as well as short peptides. In addition, Makarov and Plaxco16 introduce the topomer search model, which posits that the rate-limiting step in folding of small single-domain proteins is the search for unfolded conformations with grossly correct topology.
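The relative contact order mentioned above is usually defined as the average sequence separation between contacting residues, normalized by chain length: CO = (1/(L·N)) Σ ΔS_ij, where L is the number of residues, N is the number of contacts and ΔS_ij is the sequence separation between contacting residues i and j. The following minimal sketch assumes that contacts are already supplied as pairs of residue indices. How contacts are identified (for example, by a distance cutoff between atoms) is left out, and the function name and example values are hypothetical.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order: mean sequence separation of contacts / chain length.

    contacts     -- iterable of (i, j) residue-index pairs that are in contact
    chain_length -- total number of residues in the chain
    """
    contacts = list(contacts)
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    if not contacts:
        raise ValueError("at least one contact is required")
    total_separation = sum(abs(j - i) for i, j in contacts)
    return total_separation / (chain_length * len(contacts))


# Hypothetical 20-residue chain with three contacts.
example_contacts = [(1, 5), (2, 10), (3, 20)]
example_co = relative_contact_order(example_contacts, chain_length=20)
```

A chain dominated by contacts between residues that are close in sequence gives a low value, whereas one dominated by contacts between residues that are far apart in sequence gives a high value.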

By way of contrast the report of Courtemanche and Barrick17 describes a protein that folds by a simple two-state mechanism but significantly more slowly than expected from contact order. The structure consists of seven tandem leucine-rich repeats with each repeat contributing to an overall seven-strand β-sheet. Notwithstanding its modular structure it folds in a two-state manner without apparent intermediates.

The report from Peter Wright’s group18 describes a very different situation, in which partially folded intermediates can be identified and characterized structurally.

The Folded Structure

There are now over 66,000 structures in the Protein Data Bank. A cynic might say that many of these are “boring”, but now and again a structure determination illustrates a novel aspect of protein folding which heretofore had been neither anticipated nor appreciated. Domain swapping is such an example. In 1995 Bennett, Schlunegger and Eisenberg19 reviewed domain swapping in the context of oligomer assembly. This ability of a single polypeptide chain to participate in two distinct structures is now seen in a multitude of contexts including amyloid-like fibril formation.

The ever-increasing number of well-determined high-resolution protein structures has also been indispensable in developing better methods for structure prediction and validation. Building on the appreciation that protein cores are well-packed, Sheffler and Baker20 have developed “RosettaHoles”, a rapid algorithm to discriminate between real and computationally-generated structures, to predict incorrect regions in models and to identify problematic structures in the Protein Data Bank.


Although Anfinsen and his group had shown that the sequence of a protein defines its structure, it became apparent that folding inside the cell, especially for multi-domain proteins, was susceptible to misfolding and aggregation. The review of Fenton and Horwich21 was especially influential in explaining the role of the double-ring GroEL structure as a chaperone in preventing protein aggregation.

Other examples of proteins acting as chaperones have become increasingly common.22,23 In the case of SecB, a chaperone involved in protein export, Linda Randall and coworkers23 recently showed that there is a distinct region on the surface of the protein which serves as a site of interaction for a variety of different unfolded polypeptide ligands.

We also include an example of chaperone action of a different type, namely the use of the maltose binding protein to facilitate protein expression and purification. Kapust and Waugh24 show the benefit of fusing an aggregation-prone polypeptide to a partner which is soluble and well behaved. They found that the E. coli maltose binding protein was especially effective, acting as a general molecular chaperone to promote the solubility and folding of its fusion partner. It goes without saying that this finding has been of enormous practical benefit.


The eye lens consists predominantly of a high concentration of the so-called crystallins. These proteins form hetero-oligomers that assemble into ordered structures which promote transparency. Among their other attributes, the crystallins are chaperones and as such are thought to minimize protein misfolding and aggregation in the eye lens.

In one of the recent reports included here, Kirsten Lampi and coworkers25 show how deamidation destabilizes and triggers aggregation of βA3-crystallin. This is very likely one of the contributors to cataract formation.

In the second report, David Eisenberg’s group26 uses crystal structures of truncated crystallins to infer how polymorphic interfaces enforce polydispersity important for the function of the lens. It might be noted that this article includes examples of Interactive Images, a new feature being introduced by Protein Science. The cover image for this Virtual Issue is also taken from the same report.26

1*. Kauzmann W (1959) Some factors in the interpretation of protein denaturation. In: Anfinsen CB, Anson ML, Bailey K, Edsall JT, Eds. Advances in protein chemistry, Vol. 14, New York: Academic Press, pp. 1-63.

2*. Bernal JD (1939) Structure of proteins. Nature 143:663-667.

3*. Anfinsen CB, Haber E, Sela M, White FH Jr (1961) The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proc Natl Acad Sci USA 47:1309-1314.

4*. Northrop JH (1932) Crystalline trypsin. IV. Reversibility of the inactivation and denaturation of trypsin by heat. J Gen Physiol 16:323-337.

5*. White FH Jr (1960) Regeneration of enzymatic activity by air-oxidation of reduced ribonuclease with observations on thiolation during reduction with thioglycolate. J Biol Chem 235:383-389.

6*. White FH Jr (1961) Regeneration of native secondary and tertiary structures by air oxidation of reduced ribonuclease. J Biol Chem 236:1353-1360.

7. Dill KA, Bromberg S, Yue K, Fiebig KM, Yee DP, Thomas PD, Chan HS (1995) Principles of protein folding-A perspective from simple exact models. Protein Sci 4:561-602.

8. Go A, Kim S, Baum J, Hecht MH (2008) Structure and dynamics of de novo proteins from a designed superfamily of 4-helix bundles. Protein Sci 17:821-832.

9. Ma B, Nussinov R (2002) Molecular dynamics simulations of alanine rich β-sheet oligomers: Insight into amyloid formation. Protein Sci 11:2335-2350.

10. Myers JK, Pace CN, Scholtz JM (1995) Denaturant m values and heat capacity changes: Relation to changes in accessible surface areas of protein unfolding. Protein Sci 4:2138-2148.

11. Chakrabartty A, Kortemme T, Baldwin RL (1994) Helix propensities of the amino acids measured in alanine-based peptides without helix-stabilizing side-chain interactions. Protein Sci 3:843-852.

12. Gao J, Kelly JW (2008) Toward quantification of protein backbone-backbone hydrogen bonding energies: An energetic analysis of an amide-to-ester mutation in an α-helix within a protein. Protein Sci 17:1096-1101.

13. Zhou H, Zhou Y (2002) Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci 11:2714-2726.

14. Yuan J-M, Chyan C-L, Zhou H-X, Chung T-Y, Peng H, Ping G, Yang G (2008) The effects of macromolecular crowding on the mechanical stability of protein molecules. Protein Sci 17:2156-2166.

15. Ivankov DN, Garbuzynskiy SO, Alm E, Plaxco KW, Baker D, Finkelstein AV (2003) Contact order revisited: Influence of protein size on the folding rate. Protein Sci 12:2057-2062.

16. Makarov DI, Plaxco KW (2003) The topomer search model: A simple, quantitative theory of two-state protein folding kinetics. Protein Sci 12:17-26.

17. Courtemanche N, Barrick D (2008) Folding thermodynamics and kinetics of the leucine-rich repeat domain of the virulence factor Internalin B. Protein Sci 17:43-53.

18. Schwarzinger S, Mohana-Borges R, Kroon GJA, Dyson HJ, Wright PE (2008) Structural characterization of partially folded intermediates of apomyoglobin H64F. Protein Sci 17:313-321.

19. Bennett MJ, Schlunegger MP, Eisenberg D (1995) 3D domain swapping: A mechanism for oligomer assembly. Protein Sci 4:2455-2468.

20. Sheffler W, Baker D (2009) RosettaHoles: Rapid assessment of protein core packing for structure prediction, refinement, design, and validation. Protein Sci 18:229-239.

21. Fenton WA, Horwich AL (1997) GroEL-mediated protein folding. Protein Sci 6:743-760.

22. Johansson H, Eriksson M, Nordling K, Presto J, Johansson J (2009) The Brichos domain of prosurfactant protein C can hold and fold a transmembrane segment. Protein Sci 18:1175-1182.

23. Lilly AA, Crane JM, Randall LL (2009) Export chaperone SecB uses one surface of interaction for diverse unfolded polypeptide ligands. Protein Sci 18:1860-1868.

24. Kapust RB, Waugh DS (1999) Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci 8:1668-1674.

25. Takata T, Oxford JT, Demeler B, Lampi KJ (2008) Deamidation destabilizes and triggers aggregation of a lens protein, βA3-crystallin. Protein Sci 17:1565-1575.

26. Laganowsky A, Benesch JLP, Landau M, Ding L, Sawaya MR, Cascio D, Huang Q, Robinson CV, Horwitz J, Eisenberg D (2010) Crystal structures of truncated alphaA and alphaB crystallins reveal structural mechanisms of polydispersity important for eye lens function. Protein Sci 19:1031-1043.

*References 1-6 are not included in the Virtual Issue.