Mechanisms of amyloid fibril formation – focus on domain-swapping

Authors


E. Žerovnik, Department of Biochemistry and Molecular and Structural Biology, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Fax: + 386 1 477 3984
Tel: + 386 1 477 3753
E-mail: eva.zerovnik@ijs.si

Abstract

Conformational diseases constitute a group of heterologous disorders in which a constituent host protein undergoes changes in conformation, leading to aggregation and deposition. Numerous in vitro and in vivo studies, on both model and pathologically relevant proteins, have been performed to elucidate the molecular mechanisms of amyloid fibril formation. Knowledge of the molecular details of these processes is of major importance for understanding neurodegenerative diseases and could contribute to more effective therapies. Many models have been proposed to describe the mechanism by which proteins undergo ordered aggregation into amyloid fibrils. We classify these as: (a) templating and nucleation; (b) linear, colloid-like assembly of spherical oligomers; and (c) domain-swapping. In this review, we stress the role of domain-swapping and discuss the role of proline switches.

Abbreviations
1D: one-dimensional
AFM: atomic force microscopy
CO: critical oligomers
DA: dipole assembly
DCF: double-concerted fibrillation
IDPs: intrinsically disordered proteins
MDC: monomer-directed conversion
NCC: nucleated conformational conversion
NDP: nucleation-dependent polymerization
NP: nucleated polymerization
OFF: off-pathway folding
TA: templated assembly
TEM: transmission electron microscopy
TFE: 2,2,2-trifluoroethanol

Introduction

The ordered aggregation of proteins to amyloid fibrils is at the core of systemic diseases such as diabetes type II and immunoglobulin light-chain amyloidosis, and also prevalent in localized diseases, particularly in neurodegenerative disorders such as Alzheimer’s, Parkinson’s, Huntington’s disease, several other dementias, motor neuron disease, different ataxias and prion-related diseases [1–4]. Increasing evidence suggests that aberrant folding of the mutated protein and its aggregation might be the initial trigger of such diseases, followed by other consequences, such as Ca2+ and metal ion imbalance, oxidative stress, and the overload of chaperone and ubiquitin proteasome systems [1,5,6]. The primary trigger in sporadic cases is still a matter of debate. Proteins and lipids become damaged by oxidative stress and by excessive metal interactions, which, in turn, could both promote protein aggregation [7,8]. Aging by itself influences the performance of the ubiquitin–proteasome system [9] and autophagy [10], with a concomitant decline in protein degradation capability. In addition, mitochondrial energy production becomes less efficient with age [11]. All these factors could contribute to the accumulation of protein aggregates.

Understanding the rules governing protein folding should lead to a better understanding of protein ‘misfolding’ (i.e. folding to an alternative, often multimeric state). The conversion to the cross-β structure observed in mature amyloid fibrils takes place starting from an intermediate conformation which, in the case of globular proteins, forms after partial unfolding and, in natively unfolded proteins, after partial folding [12,13].

Dobson [14] proposed that any protein can be transformed into amyloid fibrils. Many disease-related and nonpathological proteins have been studied in an attempt to reveal the molecular mechanism of their aggregation into ordered, β-sheet rich amyloid fibrils. In this review, we focus on the possible mechanisms of amyloid-fibril formation and search for common grounds. We also discuss the interface between folding and aggregation.

The field of protein aggregation into amyloid fibrils combines physicochemical and structural studies, cellular and animal models, and clinical studies. In addition to providing a basic understanding of the processes of protein folding and aggregation, such data help towards translational approaches in medicine.

Structural and morphological data

Pre-amyloid, oligomeric intermediates, at the cross-roads between protein folding and aggregation, possess some common structure, regardless of their amino acid sequence, because polyclonal antibodies raised against one can bind to most such oligomers of different amyloid proteins [15]. It remains to be clarified whether the structure of the prefibrillar oligomers is indeed all β-sheet or whether the α-helical parts are the ones that cross the membranes. As revealed by atomic force microscopy (AFM), the structure of such annular oligomers embedded in lipid bilayers resembles that of the well ordered bacterial toxins [15–17]. It still remains for us to capture and image the annular oligomers in their cellular environment where they are inserted in cellular membranes. We envisage that two-photon fluorescence correlation spectroscopy [18] may soon make this possible. However, the common structural details of the oligomers and their mode of toxic action remain unknown [4] and would profit from innovative research approaches.

Mature amyloid fibrils are long and straight, usually comprising four to six filaments. They specifically bind certain dyes such as Congo red and thioflavin T, and they demonstrate a characteristic cross-β pattern on X-ray diffraction, reflecting distances between β-strands (4.7 Å) and distances between β-sheets (9–11 Å) [19,20].
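The two cross-β spacings quoted above can be related to the positions of the corresponding reflections through Bragg's law, nλ = 2d sin θ. The following is a minimal sketch only: the X-ray wavelength (Cu Kα, 1.5418 Å) is an assumption that is not given in the text, and the function name is illustrative.

```python
import math

# Assumed X-ray wavelength (Cu K-alpha, in angstroms); the text does not specify one.
CU_K_ALPHA_ANGSTROM = 1.5418

def two_theta_degrees(d_spacing, wavelength=CU_K_ALPHA_ANGSTROM, order=1):
    """Scattering angle 2*theta (degrees) from Bragg's law: n*lambda = 2*d*sin(theta)."""
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no Bragg reflection for this spacing and wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))

# Spacings taken from the text: 4.7 A between strands, 9-11 A between sheets.
for label, d in [("inter-strand", 4.7), ("inter-sheet (lower)", 9.0), ("inter-sheet (upper)", 11.0)]:
    print(f"{label:>20s}: d = {d:4.1f} A, 2theta = {two_theta_degrees(d):5.2f} deg")
```

Because the angle varies inversely with the spacing, the inter-strand reflection appears at a wider angle than the inter-sheet reflection.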

High-resolution structural methods such as NMR and X-ray diffraction are of limited use for characterizing prefibrillar aggregates and amyloid fibrils, primarily as a result of their limitations in providing insight into the structure of heterogeneous species. However, they can be used to determine the structure of the precursor conformation, whereas, for the fibrils and oligomers, cryo-electron microscopy, transmission electron microscopy, small angle X-ray scattering and AFM are more suitable [21]. AFM and electron microscopy have revealed multiple morphological variants of amyloid fibrils differing in the number of filaments and the helicity of their intertwining [22–24].

The structure of the mature fibrils has been determined in a limited number of cases by either solid state NMR [25] or by H/D exchange quenched flow followed by heteronuclear NMR [26]. However, the structure of the prefibrillar oligomers, which is more relevant to biomedically oriented research, remains rather elusive. Both Yu et al. [27] and Glabe [28] proposed that two kinds of β-structure are possible: the β-sheet that is observed in the mature fibrils and the α-pleated sheet [29], which could be the structure in the prefibrillar species, termed either globular oligomers (or ‘globulomers’), ‘granules’, ‘critical oligomers’ or ‘spheres’. The α-pleated sheet structure would give the globular oligomers higher dipole moments, which would lead to a linear, colloid-like growth of amyloid protofibrils. Glabe [28] suggested that, instead of selecting oligomers by size, they could be selected by the structural epitopes that become exposed. Trials with conformationally selective antibodies have shown that most of the prefibrillar species are bound by the selective A11 antibody, and only a few by OC antibody, which also binds fibrils [28].

Comparison of amyloid aggregation and protein folding

Under physiological conditions, protein folding takes place in the crowded milieu of the cell with a whole range of helper proteins [30]. These helpers include a series of molecular chaperones whose functions, amongst others, are to prevent aggregation of incompletely folded polypeptide chains [31] and to disaggregate formed aggregates [32–34].

Protein folding involves a complex molecular recognition phenomenon that depends on the cooperative action of a large number of relatively weak, noncovalent interactions involving thousands of atoms. Hydrophobic [35,36], electrostatic [37–39] and van der Waals interactions [40,41]; peptide hydrogen bonds [42,43]; and peptide solvation [44,45] are major forces driving protein folding. The electrostatic interaction between polar C=O and NH groups in the peptide backbone depends strongly on the peptide backbone conformation [37,38]. In the extended β-strand conformation, C=O and NH dipoles of adjacent peptide units are aligned antiparallel, whereas, in the α-helical conformation, they are parallel. The stability of both types of structure can be explained by the electrostatic screening model [37,46]. This readily explains the distinct preferences of residues in native and denatured proteins [46] and in peptides [47,48]. In this model, it is assumed that the total free energy of an amino acid residue is determined predominantly by the local electrostatic energy of the backbone dipole moments (N-H, C=O) as a result of interaction with neighboring peptide groups, and by the solvation free energy of the backbone dipole moments [37,49,50]. The ϕ and ψ values of the ‘coil library’ of high-resolution protein structures, which represent residues outside the secondary structure, adopt β, αR, αL and polyproline II backbone conformations [51]. According to the electrostatic screening model [46,51], the β conformer is energetically more favorable than either of the two α conformers of a residue in the gas phase. The antiparallel orientation of the backbone dipole moments stabilizes the β conformer, whereas the parallel orientation of dipole moments destabilizes the αR conformer. However, the parallel arrangement of dipole moments has advantages in polar solvents as a result of favorable interactions with the solvent. Therefore, the solvation of backbone atoms is much stronger for α conformers than for β conformers. Interaction with solvent thus compensates for the destabilization of the α conformation as a result of peptide dipole moments. Alteration of the screening of backbone electrostatic interactions by side chains causes different conformational preferences of residues in aqueous solution. Moreover, the additional modulation of screening by changing the local environment and inter- and/or intramolecular interactions may have a significant influence on the preferential conformations of a single amino acid residue. Therefore, even small variations in pH, temperature and ionic strength may have sufficient potential to induce changes in the conformational propensities of amino acid residues to form secondary structure, as well as their ability to aggregate.

Computer simulations of protein aggregation indicate that the hydrophobic effect plays an important role in promoting the aggregation process [52]. Molecular dynamics simulations of small peptides show that β-sheet aggregates are stabilized by backbone hydrogen bonds, as well as by specific side-chain interactions, such as hydrophobic stacking of polar side chains and formation of salt bridges [53,54]. Coulombic interactions also play an important role in protein aggregation [54–57]. Synthetic amyloidogenic peptides polymerize into fibrils only when the net charge is ± 1 [54], whereas a neutral or higher effective charge prevents fibril formation. These results were explained on the assumption that nonspecific, amorphous aggregation and fibril formation represent competing events.
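The net-charge observation reported for synthetic peptides [54] can be illustrated with a rough charge count. This is a sketch under stated assumptions, not the procedure used in the cited work: side chains are assigned fixed integer charges near neutral pH, histidine and the chain termini are ignored, and the example sequences are hypothetical.

```python
# Simplified side-chain charges near neutral pH.
# Assumption: His and the N-/C-termini are ignored (e.g. capped peptides).
SIDE_CHAIN_CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}

def net_charge(sequence):
    """Sum of the fixed side-chain charges over a one-letter sequence."""
    return sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in sequence.upper())

def satisfies_charge_rule(sequence):
    """True when the net charge is +1 or -1, the window reported in [54]."""
    return abs(net_charge(sequence)) == 1

# Hypothetical hexapeptides, chosen only to give net charges of 0, -1 and +2.
for peptide in ["KVVIIE", "SVVIIE", "KVVIIK"]:
    print(peptide, net_charge(peptide), satisfies_charge_rule(peptide))
```

The rule was derived for a specific set of designed peptides, so a count of this kind is at most a first screen and says nothing about sequence-specific side-chain interactions.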

When the structure of the side chains permits, polypeptides in the β-pleated sheet conformation can self-assemble into 1D, crystal-like structures involving a very large number of β-sheets. The capacity of unlimited interchain hydrogen bonding in the absence of structural restraints is considered to drive the assembly of susceptible proteins into amyloid fibrils [19]. The structure of amyloid fibrils reflects the aggregation of strands of β-pleated sheet polypeptides into a long cross-β assembly, with the strands oriented perpendicular to the fibril axis. The dominant forces driving the association of β-sheet formations are dipole–dipole interactions and the dehydration propensity of preformed intrasheet hydrogen bonds [58].

Factors influencing the propensity to aggregate

The degree of conformational stability of the protein native state plays an important but not always decisive [59,60] role in the process of aggregation. A partially-unfolded conformation favors specific intermolecular interactions, including electrostatic attraction, hydrogen bonding and hydrophobic contacts, which result in oligomerization and fibrillation [14,61–64]. In general, amyloid formation in vitro can be achieved by destabilizing the native state of the protein under conditions in which noncovalent interactions still remain favorable [65–67]. However, a local conformational change before aggregation is not a necessary step in the fibril formation of every protein. For some proteins, it was shown that the native structure is preserved in the fibrils [68,69]. Even all-α [70] or mixed α/β proteins can transform into amyloid fibrils. It has also been observed that the ability of a protein to undergo an α to β conformational change is facilitated by amino acid regions that adopt an α-helical conformation within the native structure, at the same time as having a higher statistical propensity for the β-structure [71].

Mutations and changes in environmental conditions both affect the aggregation reaction [72–76]. A protein may assemble into amyloid fibrils with multiple distinct morphologies in response to a change in amino acid sequence [74] or upon a change in aggregation conditions [23,24,76], as well as under the same growth condition [22,77,78]. A study of β-lactoglobulin has shown that charge repulsion makes amyloid fibrils more regular, whereas a lower charge, caused by a pH change in the direction of the pI and/or screening electrostatic interactions by salt, results in shorter fibrillar rods that pack into spheres [56].

Analysis of naturally occurring β-sheet proteins and mixed α/β proteins has identified a number of structural motifs that interrupt self-assembly of the edge strands into the intermolecular β-pleated sheet. For example, charged side chains within the hydrophobic region of the edge strand and proline residues both limit interactions with other β-pleated sheet edge strands [79]. It has been suggested that the edge strands have evolved as guards against uncontrolled propagation of the β-pleated sheet conformation that would otherwise interfere with productive protein folding [79].

Partial proteolysis often results in amyloidogenic fragments. Algorithms have been developed to predict the location of amyloidogenic fragments in the polypeptide sequence [80–82]. In globular proteins, such amyloidogenic parts are usually surrounded by residues that have a low aggregation propensity, the so-called ‘amyloid-breakers’ [82], and inhibit amyloid propagation.

The software used to calculate the propensity of a protein to aggregate is based on either sequence or structural data, thus taking into consideration the known data, including intrinsic and external factors [83–85]. The universe of proteins capable of forming amyloid-like fibrils has been named the ‘amylome’ [86]. The major determinants qualifying a protein to belong to the amylome can be summarized as: (a) the formation of a ‘steric zipper’ consisting of two self-complementary β-sheets that form the spine of an amyloid fibril and (b) sufficient ‘conformational freedom’ of the self-complementary segment to interact with other molecules. Although self-complementary segments are found in almost all proteins, the size of the amylome is limited, suggesting that chaperoning effects have evolved to prevent self-complementary segments from interacting with each other [86].

Mechanisms of amyloid fibril formation

The models reported before the year 2000 have been described in older reviews [63,64,87] and some excellent reviews have been written subsequently [2,4,88–90]. On the basis of the main features of the models, we have classified them into three groups (Table 1): (a) templating and nucleation; (b) linear, colloid-like assembly of spherical oligomers; and (c) domain-swapping.

Table 1. Models for the mechanism of amyloid fibril formation.

Class | Model | Examples (a)

Templating (A) and nucleation (B)
A | TA model [91] (Fig. 1A) | Prion
A | MDC model [92] (Fig. 1B) | Prion, stefin B at pH 7 (from monomer)
B | NP model [97] | Amyloid-β peptide
B | NDP model [99] (Fig. 1C) | Amyloid-β peptide
B | NCC model [98] | Yeast prion protein Sup35
C | ‘Polar zipper’ model [93–96] | Huntingtin, ataxin-3

Linear colloid-like assembly of spherical oligomers
A | Model of CO [104] | Yeast phosphoglycerate kinase
B | DA model [107] (Fig. 1D) | Tau 40 protein
C | DCF mechanism [88,108] (Fig. 1E) | α-Synuclein
D | Isodesmic (linear) polymerization [104,185] | β2-Microglobulin, stefin B at pH 3 (from globular oligomer)

Domain swapping [150,160]
A | Propagated domain-swapping [120,186] | Cystatin C
B | Off-pathway model [137] with domain-swapped oligomers [123] and propagated domain-swapping (Fig. 1G) | Stefin B at pH 5 (from dimers)
B | Off-pathway model [137] with domain-swapped oligomers [121,122,163] and likely propagated domain-swapping | Stefin A

(a) All human proteins, with a representative case example.

For some of the case proteins relevant to the focus of this review on domain-swapping, descriptions of the mechanisms are provided, whereas, for most of the other cases, the original publications are cited. On the basis of our research on cystatins, which are capable of domain-swapping, and on a literature survey of a number of other amyloidogenic proteins that initially form dimers, we emphasize domain-swapping as a possible mechanism underlying amyloid fibril formation (see below). We also describe several factors that are decisive for folding, misfolding, domain-swapping and amyloid fibril formation.

Templating and nucleation models

Templating models comprise the templated assembly (TA) and the monomer-directed conversion (MDC) models. These models were originally proposed for the prion protein transformations [91,92]. The TA and MDC models are presented in Fig. 1A,B.

Figure 1.

 Schematic representations of the chosen mechanisms. (A) The TA model [98]. In the TA model, in a rapid pre-equilibrium step, the soluble state (S) molecules that are initially in a random coil conformation bind to a pre-assembled (A) state nucleus. This binding induces the rate-determining structural change from the random coil to the β-pleated sheet structure as the molecule is added to the growing end of the fibril [91]. (B) The MDC model [98]. In the MDC model, a pre-existing monomer in the A-state conformation, analogous to the conformation adopted in the fibrils, binds to the soluble S-state monomer and converts it to an A-state dimer [92] in a rate-determining step. The dimer then dissociates, and the constituent A-state monomers add to the growing end of the fibril. (C) The NDP model [88]. We consider that the final structure labeled as ‘amyloid’ represents protofibrils rather than fibrils. The NDP model also predicts a lag phase that arises from the fact that the dissociation rate is initially greater than the association rate. (D) The DA model [107]. In the first step, nucleation units (globular oligomers) form in a process driven by the surface chemical potential. In the second step, the nucleation units aggregate linearly as a result of their intrinsic dipole moment [107]. (E) The DCF model [88]. We consider that the final structure labeled as ‘amyloid’ represents protofibrils rather than fibrils. In this step, the interactive surfaces of the monomers shift from intra-oligomeric to interoligomeric. With the application of shear stress or organic solvents, oligomeric granules become distorted [108,187] and fibril growth takes place almost instantaneously. (F) The general OFF model [167]. In this model, denatured monomers Mu are refolded into either stable monomer M or dimer D (the latter could be domain-swapped) or a less stable dimeric intermediate I (which again could be a partially-unfolded domain-swapped dimer). The initial steps are practically irreversible, and are followed by cooperative assembly of the fibril-prone dimeric intermediates, I, into a nucleus, N, from which thin filaments, f, originate. Filaments grow linearly by repeated addition of I, and fibrils, F, form by lateral association of the filaments. F also elongate by end-to-end association [167]. (G) Off-pathway oligomers model, branching at domain-swapped dimer, as derived for stefin B [137]. The growth phase shows an anomalous dependence on protein concentration, which is explained by off-pathway oligomer formation with a rate-limiting escape rate [137]. Andrej Vilfan (Jožef Stefan Institute, Ljubljana) prepared the artwork.

The ‘Polar zipper’ model proposed by Perutz et al. [93] can also be classified as a templating model. This model applies to amyloid forming proteins whose β-sheets are stabilized by hydrogen bonds between polar side chains, such as those between glutamine and asparagine [94,95]. Molecular modeling has shown that such polar residues link β-strands together into β-sheets by a network of hydrogen bonds between the main-chain amides and the polar side chains. The glutamine- and asparagine-rich regions are commonly found in the N-termini of both mammalian and yeast prion proteins [96] and several other proteins with polyglutamine expansions such as huntingtin and ataxin-3.

The nucleation-based models [97–99] comprise the nucleated polymerization (NP) model [97], the nucleated conformational conversion (NCC) model [98] and the nucleation-dependent polymerization (NDP) model [99]; for a review, see Kelly [87].

An example of the NP model is that used by Lomakin et al. [100] to describe fibril formation by the amyloid-β peptide. The model predicts that the lag phase, which disappears upon seeding, decreases exponentially as the protein concentration increases; however, a recent, highly reproducible study of the kinetics of Aβ assembly found that this was not the case [101]. The NP model predicts micelle formation above a critical protein concentration, where fibrils nucleate on heterologous seeds. In this model, fibrils grow by irreversible binding of monomers to the fibril ends.

The NDP model (Fig. 1C) predicts that the lag phase arises from the fact that the dissociation rate is initially greater than the association rate. This is reversed after a critical nucleus size is reached. In this model, the lag phase is also predicted to show a high concentration dependence and to disappear on seeding [102].
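The two qualitative predictions just described, a concentration-dependent lag and its loss on seeding, can be reproduced with a very coarse rate scheme. The sketch below is not the NDP model of [99] itself: it assumes an Oosawa-type scheme in which new fibril ends form at a rate k_n·m^n_c and monomer adds to ends at a rate k_plus·m·[ends], it neglects the monomer consumed by nucleation, and all rate constants and units are arbitrary illustrative values.

```python
def half_time(m_tot, seed_ends=0.0, k_n=1e-4, k_plus=1.0, n_c=2,
              dt=0.01, t_max=5000.0):
    """Time at which half of the protein is fibrillar (explicit Euler steps).

    m_tot      total protein concentration (arbitrary units)
    seed_ends  concentration of pre-formed growing ends (seeding)
    k_n, n_c   nucleation rate constant and nucleus size (assumed values)
    k_plus     elongation rate constant (assumed value)
    Returns None if the half-point is not reached before t_max.
    """
    free = m_tot          # free monomer
    ends = seed_ends      # growing fibril ends
    fibril_mass = 0.0     # monomer incorporated into fibrils
    t = 0.0
    while t < t_max:
        ends += k_n * free ** n_c * dt             # primary nucleation
        fibril_mass += k_plus * free * ends * dt   # elongation at fibril ends
        free = max(m_tot - fibril_mass, 0.0)       # mass conservation
        t += dt
        if fibril_mass >= 0.5 * m_tot:
            return t
    return None

unseeded = half_time(1.0)
seeded = half_time(1.0, seed_ends=0.05)
concentrated = half_time(2.0)
print(f"unseeded: {unseeded:.1f}  seeded: {seeded:.1f}  2x concentration: {concentrated:.1f}")
```

In this scheme the apparent lag comes from slow nucleation rather than from an explicit dissociation step, so it captures only the trends (shorter half-time with seeds or with more protein), not the mechanism of any specific model in Table 1.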

The NCC model of Serio et al. [98] is applicable when little or no concentration dependence is observed for both the nucleation and assembly rates. In this model, a steady rate is ensured by an almost constant concentration of the assembly competent oligomers [98,103]. In the NCC model, the rate-determining step is a conformational change that occurs in the nucleus of preformed oligomers, rather than oligomer growth itself. The concentration of soluble oligomers does not increase with higher soluble protein concentration as a result of the formation of assembly-ineligible complexes. An example of NCC mechanism of amyloid assembly is provided by the yeast prion protein Sup35 [103].

Linear colloid-like assembly of spherical oligomers

Model of ‘critical oligomers’ (CO)

In the kinetics of yeast phosphoglycerate kinase fibrillation studied by Modler et al. [104], two steps were observed during the formation of amyloid. ‘CO’ were formed in the first step, whereas, in the second step, a linear growth of oligomers into protofibrils was observed. The kinetics of both steps were found to be irreversible. Phosphoglycerate kinase was converted into protofibrils, starting with a partially-unfolded intermediate [105,106]. According to this model [104], the acquisition of a β-sheet structure and fibril growth are coupled events subsequent to a generalized diffusion-collision process.

Dipole assembly (DA) model

Xu et al. [107] proposed a similar two-step model, which they termed the ‘DA’ model. In the first step, nucleation units (i.e. globular oligomers resembling ‘spheres’ or ‘granules’) form in a process driven by the surface chemical potential. The oligomeric and spherical nucleation units reach a uniform size as a result of the electrostatic repulsion between these species and the monomers. Xu et al. [107] proposed that nucleation units aggregate linearly as a result of their intrinsic dipole moment. Their growth is governed by charge-dipole and dipole–dipole interactions (Fig. 1D).

Double-concerted fibrillation (DCF) model

Bhak et al. [88,108] proposed the ‘DCF’ model as an alternative to the prevailing nucleation-dependent fibrillation models [97–99]. In this model (Fig. 1E), amyloid fibril formation also occurs in two steps: (a) association of the monomers into oligomeric units (globular oligomers; also termed ‘granules’ or ‘spheroids’) and (b) linear growth of the oligomeric units into protofibrils in the absence of a template [108]. According to this model, the major driving force for fibril formation is a structural rearrangement within the oligomeric granules achieved by shear stress.

Domain-swapping as a mechanism of amyloid-fibril formation

Here, we describe our main model proteins, the cystatins and stefins, in more detail. Using them as an example, we illustrate the principle of domain-swapping and how it can underlie the process of amyloid-fibril formation.

Cystatins and stefins: an example of domain-swapping proteins forming amyloids

Cystatins and stefins are a large family of cysteine proteinase inhibitors, examples of which have been linked to amyloid diseases and degenerative conditions. These small globular proteins (11–13 kDa), albeit evolutionarily distinct [109], are structurally and functionally analogous, and those studied so far show evidence of 3D domain-swapping both in vitro and in vivo.

Human cystatin C is a member of the cystatin II family of cysteine cathepsin inhibitors [110] but may have additional functions. It is a well known amyloidogenic protein whose mutations cause hereditary cystatin C amyloid angiopathy [111]. Recently, it was reported that cystatin C induces autophagy [112] in a cathepsin-independent manner and, in this way, contributes to neuroprotection. It is also known that the cystatin C A/A allele, which leads to impaired secretion of the protein and intracellular accumulation, negatively influences the outcome of late-onset Alzheimer’s disease and frontotemporal lobar degeneration [113,114].

Human stefins are representative of the cystatin I family of the cysteine protease inhibitors [110]. Human stefins A and B (sometimes referred to as cystatins A and B), together with some cathepsins, were identified in the core of amyloid plaques of various origins [115]. Mutations in human stefin B (i.e. the cystatin B gene) cause progressive myoclonus epilepsy of type 1 (EPM1) [116,117], with signs of cerebellar neurodegeneration [118] and oxidative stress [119].

The structures of cystatin domain-swapped dimers have been solved, both by X-ray crystallography (human cystatin C) [120] and by heteronuclear NMR (human stefin A and chicken cystatin) [121,122]. The domain-swapped dimer of stefin A (Fig. 2A) is made of strand 1, the α-helix and strand 2 from one monomer, and strands 3–5 from the other monomer [120,122]. Similar to other cystatins, stefin B is prone to form domain-swapped dimers (Fig. 2B). The 3D structure of its tetramer [123] is composed of two domain-swapped dimer units. The two domain-swapped dimers interact through loop-swapping, also termed ‘hand-shaking’ [123].

Figure 2.

 Involvement of domain-swapping in amyloid fibril formation of cystatins. (A) Stefin A monomer (Protein Data Bank code: 1dvc) and domain-swapped dimer (Protein Data Bank code: 1N9J); (B) stefin B monomer (Protein Data Bank code: 1stf) and domain-swapped dimer as found in the structure of the tetramer (Protein Data Bank code: 2oct); and (C) proposed mechanism of the building up of amyloid fibrils obtained on the basis of stefin B H/D exchange and heteronuclear NMR. Adapted from Morgan et al. [163].

Folding mechanisms and oligomer formation by domain-swapping

Folding studies are usually focused on unraveling the conformational changes occurring within the monomeric protein under conditions often referred to as ‘physiological’, generally comprising pH 7.0 and room temperature. It is clear that different folding conditions must be examined when the focus switches to what is occurring in the early steps of amyloid-fibril formation. For many systems, including the stefins [124–126], amyloid fibrils form at nonphysiological pH values and in the presence of further additives, such as metal ions or organic solvents that are proposed to mimic the effects of biological surfaces. The most extensively studied example of a cystatin amyloid is that of stefin B, whose formation is triggered by mildly acidic conditions and a low concentration of TFE [127]. It is notable that stefin B forms long unbranched amyloid fibrils from a native-like intermediate [124,125]. Such conditions often coincide with those that favor oligomeric states [123,124,126].

Proteins in which folding intermediates are populated, such as cystatin C and stefin B [128,129], are more likely to form oligomers of the domain-swapped type than those folding in a two-state (N-U) manner. A number of conformational changes to the cystatin molecule (as a representative of globular proteins) undergoing oligomerization and, by extension, amyloid formation will be considered below, including the role of 3D domain-swapping and proline isomerization.

The energetics of domain-swapping

Intramolecular and intermolecular forces do not differ. The only parameters favoring the monomeric state are thus entropic. However, the edge strands usually protect a monomer from direct interaction with another monomer [79], whereas the internal strands do not possess such built-in protection. Under denaturing conditions, the internal strands become exposed and they can shift from intra- to intermolecular arrangements. There also is considerable backbone strain in the loop between strands 2 and 3 in the monomer structure of stefin A [122] because this is required for its proteinase inhibitory activity. The driving force for dimerization may thus be the alleviation of this strain as loop 1 extends on formation of the dimer [122]. Whether kinetic or thermodynamic factors govern the oligomer formation remains to be clarified [130].

In certain proteins, metastable states can exist side by side because the kinetic barriers are too high to allow the energetic minimum to be reached in a reasonable time [131]. However, when barriers are crossed (e.g. by raising the temperature or pressure, by lowering the pH or adding denaturant), the thermodynamically most stable state [i.e. the lower oligomer (dimer), then higher oligomers and, finally, fibrils] can be attained.

Because the temperature dependences of fibrillation and domain-swapping are the same (i.e. activation energy of approximately 100 kcal·mol−1), it was concluded that domain-swapping may be the rate-determining step [132]. Domain-swapping demands almost complete unfolding before the two chains can rearrange and swap strands [132]. Domain-swapped dimers have been observed for both the mammalian prion protein [133] and the cystatins [120–122], and, for a number of amyloidogenic proteins, it is observed that the process of fibrillogenesis starts with dimerization [134]. The height of the first barrier to fibrillation observed for the stefins is distinct from that measured in the case of α-synuclein [13] and also HET prion [135], where a smaller activation energy of 22 kcal·mol−1 was observed. The value of 100 kcal·mol−1 is close to the energies needed for unfolding, whereas the smaller value is characteristic of Pro cis–trans isomerization. Because native α-synuclein is not folded, whereas stefin B is a globular protein, different intermediates may be rate-determining for fibrillation. Theoretical studies [136] point to a role for hydrophobicity in the nucleation barriers.
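The practical difference between the two barrier heights can be made concrete with the Arrhenius relation, k2/k1 = exp[(Ea/R)(1/T1 − 1/T2)]. The sketch below is an illustrative calculation only: it assumes simple Arrhenius behavior with a temperature-independent activation energy, and the temperature interval (25 to 35 °C) is chosen arbitrarily rather than taken from the cited experiments.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def rate_ratio(ea_kcal, t1_kelvin, t2_kelvin):
    """Arrhenius ratio k(T2)/k(T1) for an activation energy given in kcal/mol."""
    return math.exp(ea_kcal / R_KCAL * (1.0 / t1_kelvin - 1.0 / t2_kelvin))

T1, T2 = 298.15, 308.15  # assumed interval: 25 C to 35 C
for label, ea in [("unfolding-like barrier", 100.0), ("smaller barrier", 22.0)]:
    print(f"{label:>24s}: Ea = {ea:5.1f} kcal/mol, k(T2)/k(T1) = {rate_ratio(ea, T1, T2):.1f}")
```

Under these assumptions, a 10 °C rise accelerates the high-barrier process by roughly two orders of magnitude but the low-barrier process only a few-fold, which is why the temperature dependence of fibrillation is informative about the rate-determining step.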

Thus, we have shown that domain-swapping of stefins demands almost complete unfolding, with a high activation energy of approximately 100 kcal·mol−1 preceding stefin A domain-swapped dimerization [121]. It has been shown for RNAse A that dimerization is not always energy demanding, as indicated by the presence of a variety of different domain-swapped and nonswapped dimers [130]. However, for stefins, a high activation energy (as observed for domain-swapped dimerization) is also a prerequisite for the initiation of amyloid fibril growth [137] which, together with a prominent role of the dimers accumulating in the lag phase [126,127], supports the hypothesis that the domain-swapped dimers are directly or indirectly involved in the amyloid fibril formation of stefins. This is consistent with the case of the homologous cystatin C, where the prevention of domain-swapped dimerization also prevents amyloid fibril formation [138].

Role of proline cis–trans isomerization as a gate-keeper against oligomerization

Studies on stefin B and β2-microglobulin have shown a link between oligomerization and cis to trans proline isomerization. The critical prolines are usually positioned in the loops that have to extend in the domain-swapping process, as also was the case with αA crystallin [139].

RNAse A forms a C-terminal domain-swapped dimer in which the β-strand consisting of residues 114–124 (among them Pro114) is exchanged. Dimerization of RNAse A occurs under extreme conditions of acid, organic solvents or temperatures [140]. This is reminiscent of stefin A domain-swapping [121] and implies a high-energy barrier. The crystal structures of the RNAse A monomer and C-terminal dimer reveal that Pro114 is trans in the dimer and cis in the monomer [130].

Another example is provided by domain-swapping in p13suc1, which occurs in the unfolded state and is controlled by conserved proline residues [141]. The monomer–dimer equilibrium is controlled by two conserved prolines in the hinge loop that connects the exchanging domains. They exploit the backbone strain to specifically direct dimer formation, at the same time as preventing higher-order oligomerization. Furthermore, an excellent correlation between domain-swapping and aggregation has been observed, which again suggests a common mechanism.

In the structure of the monomeric stefin B in complex with papain [142], the Pro103I is found to be trans, whereas, in the tetrameric structure, the homologous residue Pro74 is cis. Hence, in the stefin B tetramer, the proline residue in the loops undergoing the exchange [123] has to isomerize from trans to cis.

Accordingly, in amyloid fibril formation of the wild-type stefin B, the Pro74 cis isomeric state was found to be critically important. Its mutation to Ser prolonged the lag phase by up to ten-fold at room temperature and almost stopped fibril growth [143]. Furthermore, it was shown that the prolyl peptidyl cis–trans isomerase, cyclophilin A, profoundly delayed the fibrillation rate of the wild-type protein [143]. The potentially important role of proline isomerization in stefin B oligomerization and fibril formation is also reflected in the activation energy of approximately 27 kcal·mol−1 for the fibril elongation phase [137], which is in the range of proline isomerization reactions.

Pro32 is cis in the native structure of β2-microglobulin. For this protein, cis to trans isomerism acts as the ‘gate-keeper’ for the transition to an intermediate conformation serving as a direct precursor of fibril formation [144–146]. The Pro32 cis to trans isomerization is facilitated by complexation with Cu2+, which is an important metal influencing amyloid formation in the brain [145,147,148]. Interestingly, stefin B also binds Cu2+ in an oligomer-dependent manner [149], indicating similar underlying processes.

Domain versus loop-swapping

In the process of 3D domain-swapping, as originally proposed by Bennett et al. [150] and Liu et al. [151], two protein chains of partially open monomers exchange whole parts of their chains from the hinge loop to the termini, and fold back to two monomeric domains. The extended surface of the ‘hinge loop’ is the only region of the protein that adopts a different conformation in the domain-swapped dimer from that in the monomer [120,122]. By contrast, in the process of loop-swapping, as seen in the tetramer of stefin B, which is a dimer of domain-swapped dimers [123], swapping of additional internal parts of the chain occurs from residues 72–80. It is therefore possible that an analogous mechanism of domain exchange is also present in the higher-order oligomers. In the ‘hand-shake’ of the loops observed in the stefin B tetramer [123], the loop position from residues Ser72 to Leu80 is enabled by Pro74 and Pro79. The adopted loop position differs in the tetramer from that in the monomer and domain-swapped dimer. The monomer and domain-swapped dimers of stefins A and B are illustrated in Fig. 2.

Pro74 is widely conserved in stefins and cystatins, and is found in the trans isomeric state in all of the reported structures [120,122,142,152,153]. Only in the high-resolution structure of the stefin B tetramer is it in the cis isomeric state [123]. The dimer to tetramer transition is associated with a rotation of domains, which appears mandatory for the 90° repositioning of the exchanged loops. From the superposition of stefin B monomers and stefin A and cystatin C domain-swapped dimers onto the tetramer structure, it is evident that the Ser72-Leu80 loops and the N-terminal trunks have to adopt a different conformation in the tetramer to prevent clashes [154]. The adopted conformation of the Ser72-Leu80 loop and the N-terminal trunk is made possible only by the proline in the cis conformation.

Indirectly, we have confirmed that proline isomerization is at the root of the slow conformational change coupled to tetramerization by measuring the temperature dependence of the kinetics [123]. The value for the activation energy of 28 ± 3 kcal·mol−1 observed for the P79S mutant tetramer formation is consistent with the contribution of one proline isomerization event, most likely the conversion of Pro74 from trans to cis. In the case of recombinant stefin B, in which both P74 and P79 are present, the activation energy is higher (i.e. 36 kcal·mol−1), suggesting that Pro79 also contributes to the loop rigidity, and its conformation would be strictly trans.
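The way in which an activation energy is extracted from the temperature dependence of the kinetics can be sketched as follows. This is a generic Arrhenius analysis (a least-squares line through ln k versus 1/T), not the specific fitting procedure used in ref. [123]. The rate constants below are synthetic values generated from an assumed activation energy of 28 kcal·mol−1 and an arbitrary pre-exponential factor; they are not measured data, and all function names are hypothetical.

```python
import math

# Gas constant in kcal·mol^-1·K^-1
R_KCAL = 1.987204e-3


def activation_energy(temps_celsius, rate_constants):
    """Estimate Ea (kcal/mol) from the slope of ln k versus 1/T.

    For Arrhenius behavior, ln k = ln A - Ea / (R T), so Ea = -slope * R.
    """
    xs = [1.0 / (t + 273.15) for t in temps_celsius]
    ys = [math.log(k) for k in rate_constants]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return -(sxy / sxx) * R_KCAL


def synthetic_rates(ea_kcal_per_mol, temps_celsius, prefactor=1.0e15):
    """Generate noise-free Arrhenius rate constants (synthetic, not measured).

    The prefactor is an arbitrary illustrative value.
    """
    return [
        prefactor * math.exp(-ea_kcal_per_mol / (R_KCAL * (t + 273.15)))
        for t in temps_celsius
    ]


if __name__ == "__main__":
    temps = [20.0, 25.0, 30.0, 37.0]       # illustrative temperatures in Celsius
    rates = synthetic_rates(28.0, temps)    # assumed Ea of 28 kcal/mol
    print(f"Fitted Ea = {activation_energy(temps, rates):.2f} kcal/mol")
```

With real data, scatter in the measured rate constants translates into an uncertainty in the slope, which is the origin of error estimates such as the ± 3 kcal·mol−1 quoted above.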

These findings are consistent with those of Sanders et al. [155]. On the basis of thermodynamic and kinetic data, they concluded that oligomerization of the chicken cystatin occurred in the pre-exponential phase of the fibril growth. They describe that cystatin first undergoes a bimolecular transition to a domain-swapped dimer via a predominantly unfolded transition state, followed by a unimolecular transition to a tetramer via a predominantly folded transition state [155].

Models for amyloid fibril formation based on domain-swapping

‘Run-away’ and ‘propagated domain-swapping’ models

The domain-swapped oligomer can act either as a seed for fibril elongation (propagated domain-swapping) or as an end product (off-pathway domain-swapped dimers, tetramers) [156]. The process of domain-swapping is rate-limiting for the initiation of amyloid fibril formation, as reflected by a high energetic barrier [121,150]. In principle, any protein is capable of oligomerization by 3D domain-swapping [157]. Ogihara et al. [158] designed a sequence of RNAse A that underwent a reciprocated swap and another that ended in a propagated swap (Table 1).

Under partially denaturing conditions, the protein molecule partially opens and, when stabilizing conditions are restored, the partially-unfolded monomers can swap domains. When the exchange of secondary structure elements is not reciprocated but propagated along multiple polypeptide chains, this can result in higher-order assemblies [159]. Guo and Eisenberg [160] proposed the term ‘run-away domain-swapping’ mechanism for such a process of continuous domain-swapping.

In their study of T7 endonuclease, Guo and Eisenberg [160] define ‘run-away domain-swapping’ as a mechanism in which each protein molecule swaps a domain into the neighboring molecule along the growing fibril. By designing disulfide bonds that form only at the domain-swapped dimer interface, they were able to show that the resulting covalently-linked fibrils contained domain-swapped dimers. If these were locked in a close-ended dimeric form by making internal disulfide bonds, they were unable to form fibrils. A study by Liu et al. [161] indicates that the β-sheet spine in amyloid fibrils of β2-microglobulin could be made from amyloidogenic peptide sequences of the hinge regions of domain-swapped dimers, which also build the prefibrillar, curvilinear oligomers. For the example of αA crystallin, Laganowsky and Eisenberg [139] have shown even more plasticity in the way that the N- or C-terminal parts can swap from one molecule to another.

Wahlbom et al. [162] used the term ‘propagated domain-swapping’ to describe a similar process of continuous domain-swapping in the formation of cystatin C prefibrillar oligomers and fibrils. They showed annular oligomers with an outer diameter of 13 nm at the beginning of fibril formation, which transformed to mature fibrils of 10 nm in width. From their study, it is not clear at which stage the disulfide bond stabilizes the domain-swap.

On the basis of the H/D exchange study of Morgan et al. [163], we suggest that, in the case of stefins, and in addition to initial domain-swapping to produce the domain-swapped dimer, there could be further exchange of loops. We propose that such additional loop-swapping could occur between the loop extending from the only α-helix to strand 2 of one domain-swapped dimer with another acting as one ‘click’, and between loops from strands 4–5 as another ‘click’, in a similar process to that taking place in the tetramer. Alternatively, whole α-helices and N-termini could swap. Clearly, a 3D structure of a higher oligomer in the range of 12–16 mers is needed to provide insight into such exchange events.

An example of the propagated domain-swapping is provided by human cystatin C. Amyloid fibril formation by human cystatin C has been studied [162,164] and connected to the domain-swapping of this molecule [120,165]. In an experiment where dimer formation was prevented by engineered disulfide bridges, fibril formation was also prevented [165]. From these studies, it is apparent that dimers play an important role in fibril formation, although this does not imply they should build the fibrils directly.

A further example of a domain-swapping mechanism underlying amyloid fibril formation is provided by the human stefins. In vitro studies have demonstrated that stefin A, albeit under rather harsh preconditioning, is able to form amyloid fibrils [64,125], as would be expected if this process is generic [66]. In stefin A monomer and dimer, there are more salt bridges interconnecting the α-helix and strand 2 to the rest of the structure than in its stefin B homolog [59,125]. Consequently, stefin A can only form amyloid fibrils in vitro under very stringent conditions compared to the almost physiological conditions needed for stefin B [64,166]. Fibrillation of stefin A can be initiated by heating the protein to predenaturing temperatures of approximately 90 °C, which promotes domain-swapped dimer formation, and by reducing the pH below 2.5, which partially unwinds the dimer [121]. Staniforth et al. [122] originally proposed the propagation of the domain-swapped dimer of stefin A into fibrils through the open ends on the N- and C-termini.

Similar to other cystatins, stefin B is prone to form domain-swapped dimers (Fig. 2B). H/D exchange and heteronuclear NMR studies on stefin B mature fibrils do not appear to confirm the initial prediction [122] based on the structure of stefin A dimers. Rather, they suggest that the fibrils are themselves highly structured, being made from a protected core of strands 2–5, whereas the α-helix and strand 1 are unprotected. This makes sense if the α-helix and strand 1 were flanking from the spine of the fibril [163] (Fig. 2C). These results imply that lower domain-swapped oligomers of cystatins (dimers and tetramers) do not directly build mature fibrils. It appears that the oligomers can become fibril building blocks only after partial unfolding of the α-helix from the body of the β-sheet (Fig. 2C).

Possible role of domain-swapping in the ‘off-pathway folding’ (OFF) model

The OFF model for amyloid formation was first described by Pallitto and Murphy [167]. The general model is described in more detail in Fig. 1F and is included in Table 1. The off-pathway oligomers are the dead end of an alternative folding pathway; they are incapable of converting directly to fibrils and substantially slow fibril formation. As the protein concentration increases, the off-pathway oligomers become even longer lived, which is doubly detrimental in that their lifetime increases as they become more abundant [137,168].

An example of the OFF model, involving domain-swapping, is provided by human stefin B (Fig. 1G). The fibrillation of stefin B at room temperature and at approximately pH 5 is characterized by an extensive lag phase, in which granular aggregates have been observed by TEM and AFM, appearing as micelle-like arrangements of oligomers [124–127,169]. After the lag phase, various morphologies have been detected during the fibril growth phase, from annular to spherical, rod-like and amorphous species [126,169]. Unlike at room temperature, at temperatures above 35 °C, thioflavin T fluorescence shows no visible lag phase [137]. The subsequent growth phase shows an anomalous dependence on protein concentration; at low concentrations, the final value is reached faster than at higher concentrations. This observation is explained in terms of an off-pathway state with a rate-limiting escape rate [137]. However, there may be two (or more) pathways by which this protein aggregates, depending on pH and ionic strength [124,126,127].

Discussion

Although it is essential to study different conformational states populated by the amyloid precursor proteins, it is a difficult task to draw links between states occurring during folding and the ‘misfolded’ states populated during amyloid formation. In some cases, extensive study allows us to determine the pathways to which different conformations belong. However, our understanding of the importance of different parts of the molecule and their flexibility in the process of amyloid fibril formation often results more from a structural analysis of the amyloid endpoint. As with many other proteins [80], there is no clear link between the fold, stability, unfolding or folding rates of stefins and their propensity to form amyloid [59]. Studying the effect of different factors (i.e. from pH, temperature to concentration of TFE) on amyloid fibril formation, and looking at how these combine to cause the specific changes favoring amyloid over native folds, or even alternative (off-pathway) oligomers, where the additivity of different effects may not be straightforward, is a formidable task. Nevertheless, such studies do provide us with a useful insight into the mechanism of amyloid formation. It often becomes apparent that changes in specific parts of the protein molecule (e.g. either protonation of a number of side chains, binding of metals to an unfolded state or mutations at specific sites) are key to amyloid formation. One of the major players appears to be prolines at critical loop positions, which may act as gate-keepers to amyloid-preceding conformation; this remains to be discussed. As the number of studies grows, insight will be provided as to how specific (or indeed how random, directed or evolved) the amyloid structure may be. The answer to the equation is therefore not simply to determine the conditions that are sufficiently destabilizing to favor large conformational changes but, instead, those that are sufficiently stabilizing to produce a new structure. It is tempting to imagine that these changes are controlled by nature, just as they can be by the scientist.

In the case of globular proteins, the formation of partially-unfolded intermediates populated from the native state or accessible during refolding is the first critical step of the pathway of amyloid fibril formation [61,170,171]. Taking the example of stefin B, partial unfolding is a prerequisite for both protofibril and fibril formation. We have observed that protofibrils tend to form from the structured molten globule obtained at pH 3.3 and the mature fibrils from the partially-unfolded monomer (native-like intermediate) populated at pH 4.8, which transforms into domain-swapped dimer [124,126,127]. We also have shown that the aggregates formed from different partially-folded intermediates differ in toxicity [172].

The intrinsically disordered proteins (IDPs) [173], constitute a large fraction of naturally occurring amyloidogenic proteins [174]. In the case of IDPs (i.e. natively unfolded proteins), the formation of partially structured conformers occurs by partial folding, and fibril formation is promoted by factors that induce partial folding [12,13,175]. For example, in the case of α-synuclein, either a decrease in pH or an increase in temperature appears to induce partial folding, as well as enhance the propensity of the protein to fibrillate [13]. Similar to intermediates formed from the partial folding of globular proteins, the aggregate-prone intermediates of IDPs can polymerize to form fibrillar or amorphous aggregates, or soluble oligomers.

The generalized picture that we describe below may hold for the folded globular proteins more than for IDPs. A common trait is the fact that the monomers have to undergo a conformational change (i.e. partial opening), which demands partially denaturing conditions. The conformational change can happen more easily on a template of another, already distorted, monomer (MDC) or a nucleus acting as a template (TA). In the case of stefin B, under mild solvent conditions (from pH 5.5 to neutral), the partially-folded monomers tend to form closed domain-swapped dimers, tetramers, etc., up to octamers and even dodecamers, which persist temporarily as off-pathway oligomers. On the main pathway to fibrils, slightly below pH 5, some of the oligomers (ring-like, annular oligomers of 4–20 nm in diameter) [169] unwind and rearrange to make short protofibrils. As discussed by Wahlbom et al. [162], this transition can take place in two ways: either the small oligomer rings gather one above the other (meaning the protofibril would be hollow) or they open and rearrange to become a growing filament, a number of which wind around each other. Under more stringent denaturing solvent conditions (of pH 3.3 or at a higher temperature at pH 7) that destabilize the lower oligomers, oligomers higher than dimers [126,127], and possibly tetramers [155], cannot form. Therefore, partially-unfolded monomers and dimers accumulate until they form the so-called ‘critical oligomers’, which comprise part of the insoluble granular aggregate. When such a critical mass is reached and the oligomeric spheres gain sufficiently large dipole moments, they form linear chains in the form of colloid particles to give protofibrils. These can interact laterally, building up fibrils. Under suitable solvent conditions, the protofibrils smooth out into filaments, which wind around each other and form mature fibrils, whereas, under some other solvent conditions, they remain protofibrillar. 
It is also possible that several fibril morphologies could exist side by side.

In the NP, NDP and NCC models (Table 1), the partially-unfolded intermediates, when present at a critical concentration, slowly assemble into a nucleus, within which the first conformational change takes place. These oligomeric nuclei then rapidly grow into globular oligomers, also termed ‘granules’ or ‘spheroids’ and, after reaching a ‘critical’ size, go on to form chain-like protofibrils (CO and DA models), which eventually form fibrils [97,176] or remain protofibrillar. Some other models, such as the DCF model, predict that a second concerted conformational change has to take place within the globular oligomers, after which they can chain up (i.e. as colloid particles) into protofibrils [102,104].

We have noted that the amyloid fibril formation of proteins that form domain-swapped oligomers (e.g. cystatins and stefins) initially follows nucleation kinetics with a prominent lag phase. However, when the high barrier to partial unfolding is crossed (under different solvent conditions or increased temperature), it follows a linear growth of globular oligomers similar to that of the DA or CO models.

Only a small change in reaction conditions, in which different structural intermediates form, can turn the mechanism from one path to another. Polyglutamine repeat-containing proteins are a good example, for which a domain-swapped model [177], polar zipper [94], cross-linking [178] model and a nucleation-dependent pathway [179] have all been proposed. Accordingly, our model protein stefin B follows a different pathway towards fibrils or protofibrils at different pHs and temperatures (Table 1). Interestingly, several parallel mechanisms can apply even to a single protein, leading to final fibril heterogeneity [102].

It is a great challenge to derive a common, generic model of amyloid fibril formation that would define a single preferred pathway for a given protein sequence, covering most of the possible conditions. Is it likely that there is only one generic mechanism of amyloid fibril formation? Is there a common mechanism of protein folding? Or are there extreme cases of two-state folding on one end and noncooperative transitions via multiple intermediates on the other, with all the rest in between [180]? In folding, the energy landscape representation [181] is used to show different scenarios, with steep funnels or ragged surfaces, slowly descending into a final funnel. In certain cases of metastable states, the funnels can end in two or three minima.

We propose that such metastable states preceding fibril formation could well be domain-swapped dimers and higher oligomers, preceding or gate-keeping the amyloid fibril formation. High-energy barriers of the order of 100 and 30 kcal·mol−1 occur in the domain-swapped dimerization of stefin A [121] and in the tetramerization of stefin B [123], respectively. These barriers are equivalent to those corresponding to almost complete unfolding and proline isomerization. Both barriers occur in amyloid-fibril formation by this protein [137]. We further propose that the term ‘propagated domain-swapping’ would encompass both domain-swapping and loop-swapping. Domain-swapping demands almost complete unfolding (> 90 kcal·mol−1), whereas loop-swapping usually demands an extensive conformational change involving proline cis–trans isomerization. One such reaction would cost 28 kcal·mol−1, and two, occurring within the nucleus, would cost approximately 56 kcal·mol−1 [137].

The cis to trans isomerization of prolines is rate-limiting for many processes from protein folding to switching on/off of neurotransmitter ion channels [182]. In many cases, proline isomerism also serves as a gate-keeper to amyloid-fibril formation [130,137,141,145,146].

In the context of neurodegenerative disorders, it was shown that a prolyl isomerase facilitates the formation of α-synuclein inclusions (i.e. the Lewy bodies of Parkinson’s disease) [183], and that the prolyl isomerase Pin1 regulates amyloid precursor protein processing and amyloid-β production in Alzheimer’s disease [184]. Taken together, there is a growing body of evidence indicating that proline residues have crucial roles in the processes of oligomerization and amyloid fibril formation, suggesting mechanisms that could destabilize the structure of toxic intermediates and thus prevent their undesired activity [123].

Acknowledgements

The work was supported by programs P1-0140 (proteolysis and its regulation, led by B. Turk) and P1-0048 (Structural Biology, led by D. Turk) and by the project J3-2258 (V. Stoka), all financed by the Slovenian Research Agency (ARRS). Work in Sheffield is funded by grants to RAS from the BBSRC (UK) (BB/C504035/1) and the Royal Society (516002.K5631). We thank Professor R. H. Pain for editing the English and making useful suggestions.
