James A. Huntington, Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 0XY, UK. Tel.: +44 1223 763230; fax: +44 1223 336827. E-mail: firstname.lastname@example.org
Summary. Serpins have been studied as a distinct protein superfamily since the early 80s. In spite of the poor sequence homology between family members, serpins share a highly conserved core structure that is critical for their functioning as serine protease inhibitors. Therefore, discoveries made about one serpin can be related to the others. In this short review, I introduce the serpin structure and general mechanism of protease inhibition, and illustrate, using recent crystallographic and biochemical data on antithrombin (AT), how serpin activity can be modulated by cofactors. The ability of the serpins to undergo conformational change is critical for their function, but it also renders them uniquely susceptible to mutations that perturb their folding, leading to deficiency and disease. A recent crystal structure of an AT dimer revealed that serpins can participate in large-scale domain-swaps to form stable polymers, and that such a mechanism may explain the accumulation of misfolded serpins within secretory cells. Serpins play important roles in haemostasis and fibrinolysis, and although each will have some elements specifically tailored for its individual function, the mechanisms described here provide a general conceptual framework.
In a paper published in 1980, Hunt and Dayhoff  noted the sequence similarity between two apparently unrelated proteins, chicken ovalbumin (OVA) and human antithrombin III (AT), and suggested a provisional name of the ‘ovalbumin-antithrombin superfamily’. Previous work had found sequence homology between AT and another human protease inhibitor, α1-antitrypsin (α1AT, also known as α1-proteinase inhibitor) . In a seminal review of this new family and its members in 1985, Carrell  coined the now-widely used descriptive acronym SERPIN, since most members of this family were serine protease inhibitors. There are now well over 1500 serpin sequences identified in the genomes of organisms representing all kingdoms of life, and 36 confirmed human serpins . Most family members are indeed serine protease inhibitors, but several have additional cross-class inhibition functions and inhibit cysteine protease family members such as the caspases and cathepsins, and others, such as OVA, are incapable of protease inhibition and serve other functions. In plants and other basic organisms small ‘canonical’ inhibitors (such as Kunitz and Kazal type inhibitors) predominate . In humans, serpins have been selected to control processes that require tight regulation, such as blood coagulation, inflammation and fibrinolysis. Much has been learnt since 1980 about the function of individual serpins in human health and disease, but a single question continues to define the field: Why has evolution settled on a serpin to fulfil a particular function? The answer lies in the unique serpin structure, the conserved inhibitory mechanism, and the regulatory mechanisms that govern serpin activity.
Serpin structure and metastability
Since serpins were known to be inhibitors of serine proteases, it was clear that their structure would present a loop to interact with the protease active site cleft. In other families of serine protease inhibitors, these ‘reactive centre loops’ (RCL) are short and are maintained in a rigid conformation ideal for protease docking. In this case, inhibition depends on the inability of the ends of the scissile bond (P1–P1′, as per the nomenclature of Schechter and Berger) to separate after proteolytic attack. The inhibitor thus binds tightly but reversibly to a protease, and neither protein undergoes conformational change. In contrast, serpin RCLs are long (typically 20–24 residues) and flexible, resembling a substrate loop. The first crystal structure of a native serpin was that of OVA, and its RCL was found to form a stable helix. Subsequent structures of inhibitory serpins showed a range of RCL conformations and a high degree of conformational flexibility (often reflected by high B-factors or lack of electron density). The native serpin is composed of approximately 400 amino acids that fold into an N-terminal helical domain and a C-terminal β-barrel domain (Fig. 1A). The two main features related to function are the RCL and the five-stranded central β-sheet (β-sheet A). In the classic orientation, the RCL is on top and sheet A faces the viewer (Fig. 1B).
The unusual, if not unique, aspect of the serpins is that their native fold is not their most stable. Serpins are able to incorporate the RCL into β-sheet A as the central fourth strand (s4A), and in so doing, become hyperstable [8,9]. This conformational/topological change can occur upon proteolytic cleavage within the RCL (Fig. 1C), or spontaneously to form the intact ‘latent’ conformation (Fig. 1D). It was clear from early studies into the serpin mechanism of protease inhibition that the ability to rapidly and stably incorporate the RCL into β-sheet A distinguished the inhibitory (such as α1AT) from the non-inhibitory serpins (such as OVA) [10,11]. However, the nature of the inhibitory complex and the purpose of the conformational change to a hyperstable state were not known until a crystal structure of the final serpin-protease complex was solved.
The serpin mechanism
The long, flexible RCL of serpins and their ability to stably incorporate the RCL into β-sheet A suggested a mechanism other than the well-characterised canonical lock-and-key mechanism used by other families of protease inhibitors, but just what that new mechanism might be was unclear. One proposal was that partial incorporation of an intact RCL would lock it into a canonical (key-like) conformation that would inhibit proteases via a non-covalent, reversible mechanism. However, the fact that serpin-protease complexes could be observed on SDS-PAGE suggested a covalent complex, somehow stabilised by the full incorporation of the cleaved RCL into sheet A. Fluorescence energy transfer studies appeared to support both mechanisms [14,15]. In 2000, we solved and published the first crystallographic structure of a final serpin-protease complex (Fig. 2A). Consistent with the fluorescence studies performed by the Gettins group, we found a fully incorporated RCL with the bond between the P1 and P1′ residues severed, exactly as in the first structure of an RCL-cleaved serpin (Fig. 1C). The protease was found at the bottom of the serpin, covalently linked via an ester bond between the catalytic Ser195 of the protease and the main chain carbonyl carbon of the P1 residue (Fig. 2B). This linkage represents the acyl-enzyme intermediate of the serine protease catalytic cycle, after the expulsion of the P′ residues from the active site, but before the ingress of water for deacylation. The catalytic loop of the protease was distended by the pulling force exerted on it by the limited length of the RCL and the clash between the body of the protease and the body of the hyperstable serpin. This resulted in the destruction of the oxyanion hole, required for stabilisation of the tetrahedral transition state, thus preventing deacylation (Fig. 2B).
In addition to the conformational change in the serpin (full-loop insertion) and changes in the catalytic site of the protease, about 40% of the protease also appeared to have become disordered, based on lack of electron density (Fig. 2A). The disordered region included residues 16–41, 62–84, 110–120, 139–156, 186–190 and 223–224. Of these six linear stretches, four (the N-terminal stretch containing Ile16 and the three C-terminal stretches) correspond to the serine protease zymogen activation domain. This domain transitions from a disordered to an ordered state upon conversion from a zymogen to a protease due to the insertion of the N-terminus (Ile16 forms a salt-bridge with Asp194) into the activation pocket (formed by the three C-terminal stretches listed above). The pulling force exerted on Ser195 by the serpin extends the loop 3.5 Å from its normal catalytic position, thus destroying the oxyanion hole (main chain amides of Gly193 and Ser195, Fig. 2B), and also breaking the salt-bridge between Ile16 and Asp194, effectively converting the protease back into a zymogen-like state. Intriguingly, in our structure, Asp194 makes an alternate salt-bridge with Lys328 of the serpin at the bottom of strand 5A (Fig. 2B).
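The ‘about 40%’ estimate can be checked roughly by counting the residues in the six stretches listed above. The sketch below treats each stretch as an inclusive range with no insertions or deletions in the numbering, and divides by an assumed chain length of 223 residues, typical of a trypsin-like catalytic domain; that length and the function names are illustrative assumptions and are not taken from the text.

```python
# Disordered stretches quoted in the text (inclusive residue numbers).
DISORDERED = [(16, 41), (62, 84), (110, 120), (139, 156), (186, 190), (223, 224)]

# Assumption for illustration only: total length of the mature protease chain.
ASSUMED_PROTEASE_LENGTH = 223


def count_residues(stretches):
    """Sum inclusive range lengths, ignoring any insertion/deletion codes."""
    return sum(end - start + 1 for start, end in stretches)


def disordered_fraction(stretches, total_length):
    """Fraction of the chain that falls inside the listed stretches."""
    return count_residues(stretches) / total_length


n_disordered = count_residues(DISORDERED)
fraction = disordered_fraction(DISORDERED, ASSUMED_PROTEASE_LENGTH)
print(f"{n_disordered} residues, {fraction:.0%} of the assumed chain")
# prints "85 residues, 38% of the assumed chain"
```

Under these assumptions the count comes to 85 residues, close to the 40% quoted; the exact figure depends on the true chain length and numbering of the protease in the structure.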
The extent of structural disorder engendered in the protease by its complexation with serpins is a matter of some debate. Early studies revealed that discrete regions of the protease became accessible to proteolytic attack when in complex with serpins, and these regions were later mapped to the disordered region in the serpin-protease complex structure. Similarly, trypsin was found to lose affinity for Ca2+ when in complex with α1AT, suggesting disordering of the binding site (the 70s loop) as seen in the crystal structure, and high Ca2+ concentrations were shown to promote the dissociation of the complex, presumably by populating the native (ordered) protease conformation. Thrombin was shown to lose the ability to bind to exosite I ligands (the 30s and 70s-loops) [20,21], and we recently showed that both exosites I and II become dysfunctional due to the disordering induced by serpin complexation. These results suggest a global disordering of the protease, and are consistent with NMR data implying a molten-globule-like conformation for proteases when complexed by serpins. In striking contrast, no disorder was observed in a recent crystal structure of porcine pancreatic elastase in complex with α1AT. However, on the weight of the currently available data, we can conclude that some protease unfolding is induced by serpins, but that the extent of unfolding may depend on the serpin-protease pair. Factors that could affect the extent of unfolding include the length of the RCL, the size and composition of loops on the ‘bottom’ of the serpin and of loops flanking the active site of the protease, and the presence of ligands that preferentially bind to the native protease conformation.
Heparin modulation of serpin activity
The activity of several serpins is modulated by glycosaminoglycans such as heparin. In general, heparin serves to ‘bridge’ the serpin to the protease, thus enhancing the rate of complex formation. This is true for protein C inhibitor with thrombin and activated protein C, for PAI-1 and thrombin, for protease nexin 1 and thrombin, for ZPI and fXa and fXIa, and importantly for AT and thrombin. This bridging effect also contributes to the heparin-acceleration of thrombin inhibition by HCII, and of fIXa and fXa inhibition by AT; however, the majority of the rate enhancement in these cases is provided by conformational change in the serpin (allostery). In order for allostery to contribute to heparin acceleration of serpin function, the native state of the serpin must be in a low activity conformation, and the new heparin-bound conformation must somehow enable the formation of productive recognition (Michaelis) complexes with target proteases. Over the years we and others have studied the activation of AT by heparin, and have successfully determined the molecular basis of the relative inactivity of the native form, how heparin binds, how its binding results in conformational change, and how the main targets of AT, thrombin, fXa and fIXa, are recognised in the presence of heparin. AT is thus a thoroughly described system that provides a paradigm for the heparin binding serpins, as well as other serpins where binding to cofactors and proteases is regulated by conformational change (e.g. HCII, and even non-inhibitory serpins such as thyroxine and corticosteroid binding globulins).
AT circulates in a native state that is a poor inhibitor of its target proteases, namely thrombin, fXa and fIXa. The work of Steve Olson and Ingemar Björk (and others) elegantly demonstrated that AT undergoes a conformational change upon heparin binding that results in a 500-fold acceleration of fXa and fIXa inhibition, but that thrombin is insensitive to the AT conformational change (many papers, but most completely described in Ref. ). The conformational basis for the low activity of the circulating form of AT was unknown until the structure of native AT was solved in 1994 [35,36]. It was found to have the normal serpin fold, but with two important differences: the RCL was partly incorporated into β-sheet A (P15-P14 residues Gly379 and Ser380), and the P1 residue was oriented toward the main body of AT and was thus unavailable for interaction with a protease (Fig. 3A, centre panel). Subsequent biochemical and structural work has confirmed that the native conformation does have the RCL partially incorporated, and that expulsion of the hinge is sufficient for full allosteric activation [37–40]. However, the orientation of the P1 residue and its role in maintaining circulating AT in a low activity state were unclear, due in part to crystal contacts involving the RCL. A structure of AT in another crystal form again showed the P1 oriented toward the main body of AT; in this case, however, it was making a different, more intimate contact (Fig. 3A, left panel). The RCL of native AT was thus capable of at least two conformations that sequestered the RCL away from an attacking protease. Nevertheless, the appreciable rate of thrombin inhibition in the absence of heparin (7000 M⁻¹ s⁻¹) suggested that the P1 residue must be accessible to some degree (as illustrated in Fig. 3A, right panel). Native AT can thus be thought of as an ensemble of conformations, such as those shown in Fig. 3A, in rapid equilibrium.
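To give a feel for what a second-order rate constant of 7000 M⁻¹ s⁻¹ means in practice, the following minimal sketch converts it into a half-life for thrombin under pseudo-first-order conditions (inhibitor in large excess over protease). The AT concentration of 2.3 µM is an assumed, illustrative value for plasma and is not given in the text; the function name is likewise hypothetical.

```python
import math

# From the text: uncatalysed rate of thrombin inhibition by AT.
K_UNCATALYSED = 7000.0        # M^-1 s^-1

# Assumption for illustration only: AT concentration, in mol/L.
ASSUMED_AT_CONC = 2.3e-6      # M


def pseudo_first_order_half_life(k2, inhibitor_conc):
    """Half-life of the protease when the inhibitor is in large excess.

    k_obs = k2 * [inhibitor];  t1/2 = ln(2) / k_obs
    """
    k_obs = k2 * inhibitor_conc
    return math.log(2) / k_obs


half_life = pseudo_first_order_half_life(K_UNCATALYSED, ASSUMED_AT_CONC)
print(f"t1/2 ~ {half_life:.0f} s")   # prints "t1/2 ~ 43 s"
```

With these inputs the half-life is under a minute, which illustrates why the uncatalysed rate is described as appreciable. The same function can be reused with any other rate constant and concentration.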
The effect of heparin binding on the conformation of AT can be summarised as the expulsion of the RCL from β-sheet A, and its subsequent closure so that the activated state of AT resembles the native conformation of most other serpins (such as α1AT, Fig. 1B, right panel). This conformational change improves affinity for the specific heparin pentasaccharide by 1000-fold through an induced-fit mechanism. Recent structural and biochemical data have revealed that conformational changes in the lower helical part of AT precede expulsion of the RCL and contribute about 40% of the binding energy provided by the conformational change [37,43]. Thus, the binding of AT to the specific heparin pentasaccharide occurs through three steps (Fig. 3B): the first is the non-specific association of the negatively charged heparin with the basic heparin binding site on helix D (cyan in figure); the second step is the conformational rearrangement of the helical domain of AT to accommodate the pentasaccharide and maximise ionic and hydrogen bonding interactions; in the third step, the hinge is released from β-sheet A, and the top portions of strands 2 and 3 slide over to make a continuous five-stranded β-sheet. Careful analysis of these structures reveals a large constant domain, primarily composed of the C-terminal β-barrel domain (blue in Fig. 3C), three regions that move as rigid bodies (green, yellow and red), and loops that move independently (grey). In a reversal of the heparin binding mechanism, full incorporation of the RCL that accompanies formation of the final complex between AT and a protease results in a 1000-fold decrease in heparin affinity [43,44]. This analysis also provides a detailed paradigm for the general mechanism of modulation of serpin activity through conformational change.
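The fold-changes in affinity quoted above can be translated into free energies using ΔΔG = RT ln(fold). The sketch below does this for the 1000-fold improvement and then takes 40% of it as the share attributed to the helical rearrangement that precedes RCL expulsion. The temperature of 25 °C is an assumption made here for illustration, as are the function and variable names.

```python
import math

R_KCAL = 1.987e-3          # gas constant, kcal mol^-1 K^-1
ASSUMED_TEMP_K = 298.15    # assumption: 25 degrees C; not stated in the text


def fold_change_to_ddg(fold, temperature=ASSUMED_TEMP_K):
    """Free energy (kcal/mol) equivalent to a fold-change in affinity."""
    return R_KCAL * temperature * math.log(fold)


# 1000-fold affinity gain from the induced-fit conformational change.
ddg_total = fold_change_to_ddg(1000.0)

# About 40% of that energy is attributed to the helical-domain changes.
ddg_helical = 0.40 * ddg_total

print(f"total: {ddg_total:.1f} kcal/mol, helical share: {ddg_helical:.1f} kcal/mol")
# prints "total: 4.1 kcal/mol, helical share: 1.6 kcal/mol"
```

On this estimate the conformational change is worth roughly 4 kcal/mol of binding energy, and the same amount is given back when full RCL incorporation lowers heparin affinity 1000-fold.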
Heparin activation of AT
Determining the mechanism of heparin activation of AT is crucial for understanding the mode of action of the numerous heparin fractions and synthetic heparins currently on the market as anticoagulants. In all cases, specificity and high affinity binding to AT is conferred by the presence of the pentasaccharide sequence, present in about a third of unfractionated heparin chains. All such chains will allosterically activate AT towards fXa and fIXa inhibition. However, chains must be at least 18 monosaccharide units in length (5.4 kDa) to accelerate thrombin inhibition by a bridging mechanism, and about twice as long to bridge fXa and fIXa to AT. Most low molecular weight heparin preparations contain a distribution of sizes close to that capable of bridging thrombin, but only unfractionated heparin is likely to bridge factors IXa and Xa. It has long been understood that thrombin inhibition is insensitive to the conformation of AT, and thus the recognition complex should not require the expulsion of the RCL from β-sheet A. The crystal structure of the ternary complex (solved with a synthetic hexadecasaccharide) showed why, with thrombin positioned forward and towards the heparin binding site of AT (Fig. 4A). The hinge region could not be fully resolved in electron density due to flexibility; however, clear evidence for the pre-insertion of P15 was seen, and the position of thrombin could easily be maintained with the RCL inserted to the level seen in native AT. When the hinge region was prevented from being expelled from β-sheet A by an engineered disulphide bridge, the rate of thrombin inhibition in the presence of heparin was actually slightly increased. Thus, the position of thrombin found in this structure is equally valid for heparin-activated and native AT. One important feature of thrombin is the depth of the active site cleft due to the 60 and 147-insertion loops, both of which make significant contacts with the body of AT (Fig. 4B).
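The chain-length rules described above can be summarised as a simple lookup. The sketch below is a deliberate simplification: it assumes a uniform mass of 0.3 kDa per monosaccharide unit (implied by 18 units being 5.4 kDa), takes ‘about twice as long’ as exactly 36 units, and treats chains without the pentasaccharide as inactive towards AT. The thresholds are sharp only for illustration, and all names are hypothetical.

```python
MIN_UNITS_THROMBIN_BRIDGE = 18                            # from the text
MIN_UNITS_FXA_FIXA_BRIDGE = 2 * MIN_UNITS_THROMBIN_BRIDGE  # "about twice as long" (approximation)
MASS_PER_UNIT_KDA = 5.4 / MIN_UNITS_THROMBIN_BRIDGE       # implied average, ~0.3 kDa


def chain_mass_kda(n_units):
    """Approximate chain mass, assuming a uniform mass per monosaccharide."""
    return n_units * MASS_PER_UNIT_KDA


def expected_mechanisms(n_units, has_pentasaccharide):
    """List the AT-dependent mechanisms a heparin chain could support."""
    if not has_pentasaccharide:
        # Simplification: no high-affinity AT binding without the pentasaccharide.
        return []
    mechanisms = ["allosteric activation (fXa, fIXa)"]
    if n_units >= MIN_UNITS_THROMBIN_BRIDGE:
        mechanisms.append("bridging (thrombin)")
    if n_units >= MIN_UNITS_FXA_FIXA_BRIDGE:
        mechanisms.append("bridging (fXa, fIXa)")
    return mechanisms


for units in (5, 18, 36):
    print(units, f"{chain_mass_kda(units):.1f} kDa", expected_mechanisms(units, True))
```

A pentasaccharide alone (5 units) gives only allosteric activation, an 18-unit chain adds thrombin bridging, and only the longest chains would be expected to bridge fXa and fIXa as well.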
AT has a unique three-residue insert on the P′ side of the RCL (shown in green in Fig. 4C) that normally forms a tightly H-bonded turn, and that is seen to unwind in the Michaelis complex to accommodate the bulk of the 60 and 147-insertion loops and to allow the forward orientation of thrombin (Fig. 4C). Consistent with these findings, deletion of the three-residue insert reduces the rate of thrombin inhibition . Thus, recognition of thrombin by AT in the absence and presence of heparin is independent of the extension of the N-terminal portion of the RCL (the hinge region), but does require the extension of the C-terminal portion (the P′ region).
In contrast, recognition of fXa and fIXa by AT is exquisitely sensitive to the conformation of the hinge region. Disulphide engineering studies have shown that fXa can form two separate recognition complexes depending on the state of the hinge region [37,50]. Presumably when AT is in the native state the recognition complex would resemble that of thrombin. However, fIXa can only react with AT when the hinge region is expelled and free to extend away from β-sheet A. It was also clear from studies by Alireza Rezaie and Steve Olson’s groups that the 147-loop (and Arg150 in particular) made a crucial exosite contact with the body of AT [51–54]. The structures of the pentasaccharide-activated recognition complexes between AT and fXa (Fig. 5A) and fIXa (Fig. 5B) revealed a rotation in the protease (relative to the position observed for thrombin) that resulted in an orientation requiring hinge region extension (Fig. 5C). The rotation allowed the 147-loop of the protease to engage in extensive contacts with AT, and of particular importance, the burying of Arg150 in an acidic (and also hydrophobic) pocket formed by Arg235, Glu237, Met251, Tyr253, and His319. Intriguingly, this pocket is occupied in the native monomeric state of AT by the P1 residue in a similar fashion (Fig. 5D), supporting the idea that heparin binding ‘exposes’ the exosite utilised by factors IXa and Xa. The unique AT P′ insert is partially unwound to allow fXa to rotate into the observed position, but the tight H-bonded turn is preserved in the complex with fIXa and makes several important interactions to improve the stability of the complex. Thus, the RCL of AT has unique features that are exploited to allow for heparin regulation of protease binding and inhibition.
The advantages of the serpin mechanism over the standard lock-and-key mechanism are manifold, and include: the ability to regulate activity by altering the conformation or accessibility of the RCL (AT activation by heparin is a good example); the covalent complex is irreversible; the change in serpin and protease structure releases complexes from cofactors and substrates (e.g. thrombin loses the ability to interact with exosite I and exosite II ligands, and AT loses the ability to bind with high affinity to heparin); and change in serpin and/or protease conformation can result in receptor binding and signalling. However, the inherent dependence of the serpin mechanism on conformational mobility and a metastable native fold renders serpins highly susceptible to mutations that perturb function [60,61]. There are many things that can go wrong with serpins that lead to loss or gain of function (for reviews see Refs [60–63] and others). Classic examples are mutations in the RCL that either change specificity (such as the Pittsburgh variant of α1AT that turns it into an inhibitor of thrombin) or knock out inhibitory function by slowing RCL incorporation into sheet A (such as the Cambridge II mutation in AT). Another common cause of serpin deficiency is the failure to secrete active protein due to mutations that affect the ability of serpins to fold into the metastable native state. These mutations can occur almost anywhere in the serpin, and are normally associated with the formation and accumulation of stable polymers within the endoplasmic reticulum of secretory cells (for reviews see Refs [66,67] and others). Polymerisation always leads to lower levels of secreted serpin, and in cases where the serpin concentration is critical, heterozygous deficiency leads to disease (such as thrombosis for AT). In other cases, only the homozygous carriers of mutations develop a loss-of-function disease (such as emphysema for α1AT).
In rare cases, the accumulation of serpin polymers is toxic to cells and leads to tissue damage through an unknown gain-of-function mechanism . Several homozygous mutations in α1AT have been associated with liver disease, including cirrhosis, and several heterozygous mutations in neuroserpin have been described that lead to dementia and death. Although the incidence of disease caused by serpin polymerisation (through both loss- and gain-of-function mechanisms) is quite rare, there has been a great deal of interest in the phenomenon, due in part to its classification as a ‘conformational disease’, along with Alzheimer’s, Huntington’s and Parkinson’s diseases, and the prion encephalopathies .
The mechanism of serpin polymerisation
Consistent with the polymerisation of a folding intermediate in cells, serpins are capable of polymerising in vitro when incubated under mildly denaturing conditions (heat, low pH or chaotropic agents, or in combination) . It has been demonstrated for α1AT and for neuroserpin that unfolding and folding proceed via an intermediate that is prone to polymerisation [71–75]. For wild-type protein this intermediate is short-lived and will thus form only a small amount of polymers upon renaturation, but mutations that slow the conversion from intermediate to native state result predominantly in polymeric serpin. Polymers obtained from cells or by partial denaturation have similar properties and are thought to share a common intermolecular linkage. Indeed, it has long been believed that the molecular basis of polymerisation is the same for all serpins, whether induced by mutations or by mild denaturation, and that the mechanism involved the incorporation of the RCL of one monomer into β-sheet A of another (the ‘loop-sheet’ mechanism). However, there is no direct evidence for the formation of loop-sheet polymers by partial denaturation or in vivo, and several experiments contradicting the loop-sheet hypothesis have been reported [76–78]. An alternative mechanism of polymerisation was suggested by a recent crystal structure of a stable AT dimer formed by incubating native AT at pH 5.7, 37 °C . AT was found to have undergone a large domain swap that included both the RCL (strand 4A) and strand 5A, along with other regions (Fig. 6A). The structure could readily be extended to form an open dimer with a flexible linker region (Fig. 6B), and suggested a polymerigenic folding intermediate (or unfolding intermediate) that had strand 5A as well as the RCL exposed (Fig. 6C). 
Several experiments have been conducted to determine if this mechanism is general for the serpins, and preliminary data suggest that it may account for AT, α1AT and neuroserpin polymerisation under certain conditions [72,79,80]. More work is required to determine if this or other domain swaps are responsible for polymerisation of serpins within cells, and if the regions that participate in domain swapping depend on the serpin and the mutations involved.
Serpins are found in all branches of life and appear time and again controlling proteolytic pathways related to human health and disease. It was perhaps serendipity that the most important anticoagulant serpin, AT, was a founding member of the superfamily. Nevertheless, AT provides a useful and well-characterised model for understanding the possible range of serpin structure and how it relates to serpin function and dysfunction.
JAH is a Senior Medical Research Council (MRC) Non-clinical Fellow.
Disclosure of Conflict of Interests
The author states that he has no conflict of interest.