Humoral Pattern Recognition and the Complement System


Correspondence to: S. E. Degn, Department of Biomedicine, Aarhus University, The Bartholin Building, Wilhelm Meyers Alle 4, 8000 Aarhus C, Denmark. E-mail:


In the context of immunity, pattern recognition is the art of discriminating friend from foe and innocuous from noxious. The basis of discrimination is the existence of evolutionarily conserved patterns on microorganisms, which are intrinsic to these microorganisms and necessary for their function and existence. Such immutable or slowly evolving patterns are ideal handles for recognition and have been targeted by early cellular immune defence mechanisms such as Toll-like receptors, NOD-like receptors, RIG-I-like receptors, C-type lectin receptors and by humoral defence mechanisms such as the complement system. Complement is a proteolytic cascade system comprising around 35 different soluble and membrane-bound proteins. It constitutes a central part of the innate immune system, mediating several major innate effector functions and modulating adaptive immune responses. The complement cascade proceeds via controlled, limited proteolysis and conformational changes of constituent proteins through three activation pathways: the classical pathway, the alternative pathway and the lectin pathway, which converge in common effector functions. Here, we review the nature of the pattern recognition molecules involved in complement activation, as well as their close relatives with no or unknown capacity for activating complement. We proceed to examine the composition of the pattern recognition complexes involved in complement activation, focusing on those of the lectin pathway, and arrive at a new model for their mechanism of operation, supported by recently emerging evidence.

Fundamentals of innate immunity and pattern recognition

The immune system is a double-edged sword: one edge holds the power to efficiently eliminate any unwelcome intruder; the other edge poses the risk of wreaking havoc on the bearer. The balance between the two rests on (1) the ability to precisely discriminate self from non-self; (2) the ability to determine when and where to mount a response; and (3) the ability to control the magnitude of the response and to dampen it once the threat is eliminated.

The fundamental immunological basis for discriminating self from non-self is the presence of recognizable differences between foreign entities and the host. The innate immune system relies on recognition of evolutionarily conserved pathogen-associated molecular patterns (PAMPs) by innate pattern recognition molecules and receptors (PRMs and PRRs) [1], whereas the adaptive immune system is principally trained to recognize ‘everything, except what is intrinsic to the host’. The decision to mount a response rests on whether not mounting a response is detrimental. According to the ‘danger theory’, the read-out for such detriment is danger- or damage-associated molecular patterns (DAMPs), signifying host cell and tissue damage [2]. Responses are localized by PAMPs and DAMPs (from endogenous sources, i.e. tissue-resident cells) and subsequently by localization of inflammatory mediators resulting from the response itself (from tissue-exogenous sources, i.e. immigrant immune cells). The magnitude of the response is similarly encoded in the amount of DAMPs and PAMPs present. The mechanisms governing dampening of the response after elimination of the initiator may be attributed to clearance of PAMPs, DAMPs and inflammatory mediators, depletion of effector mechanisms and desensitization of immune cells.

Thus, immune responses are initiated through the concomitant recognition of PAMPs and DAMPs. The simultaneous recognition of ‘danger patterns’ contextualizes the recognition of a pathogen, acting as a safeguard against untoward responses. A further dimension of pattern recognition in immunity is the recognition of so-called host-associated molecular patterns (HAMPs), serving again as a safeguard against untoward responses, by providing an ‘inverted’ means of recognition of non-self or altered self, through the recognition of the absence of self-patterns. As we shall describe below, the two different modes of operation are exemplified in complement, where the classical and lectin pathways are initiated upon recognition of DAMPs and PAMPs, while the basis of alternative pathway function is the recognition, or absence of recognition, of self, by potent regulatory molecules.

While classical biochemistry has tended to operate with simple one-to-one interactions, describing how one molecule binds to one ligand, the unifying feature of detection of pathogens, danger and self is that of pattern recognition. Pattern recognition is made possible through multivalent soluble protein complexes, multivalent receptors or monovalent receptors requiring clustering for signalling. Each recognition subunit possesses low affinity for the individual molecular subunits of the pattern, but high-avidity binding is achieved upon concomitant binding of all recognition subunits to their cognate target subunits. The ‘pattern’ is defined either by a certain spacing and conformational arrangement or by a certain density of the target subunits, affording a much higher degree of specificity than would otherwise be possible through high-affinity one-to-one interactions. The phenomenon of pattern recognition is not limited to the immune system, but is gaining particular interest in this area. In this context, it is important to remember the duality of the immune system as both a guardian of homoeostasis and an anti-microbial defence mechanism, a duality that is also demonstrated by the complement system.
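The gain from multivalency can be illustrated with a toy calculation. The sketch below is not taken from the literature reviewed here: it assumes that all heads are identical, that each binds with the same monovalent dissociation constant (`kd_mono`), and that once the first head is bound the remaining heads experience a fixed local ligand concentration (`c_eff`). Under these simplifying assumptions, the apparent dissociation constant scales as kd_mono^n / c_eff^(n-1). The function name and all numerical values are hypothetical.

```python
def apparent_kd(kd_mono: float, n_heads: int, c_eff: float) -> float:
    """Apparent dissociation constant (M) of an n-headed binder.

    Toy chelate-type model (an assumption of this sketch): the first head
    binds with kd_mono, and each further head binds intramolecularly at an
    effective local ligand concentration c_eff.
    """
    if n_heads < 1:
        raise ValueError("n_heads must be at least 1")
    return kd_mono ** n_heads / c_eff ** (n_heads - 1)


# Hypothetical numbers: a weak 1 mM single-site interaction and a
# 10 mM effective local concentration on a densely decorated surface.
KD_MONO = 1e-3
C_EFF = 1e-2

for heads in (1, 3, 6):
    print(heads, "head(s): apparent K_D =", apparent_kd(KD_MONO, heads, C_EFF), "M")
```

In this toy setting, each additional engaged head tightens binding by the factor c_eff / kd_mono, so the advantage disappears when the target subunits are too sparse for several heads to engage at once (small `c_eff`). This is one way of expressing why spacing and density define the pattern.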

To limit infection, the mammalian host uses a wide armamentarium of pattern recognition molecules [3]. Importantly, through redundant mechanisms, multiple PRMs are engaged simultaneously in the course of microbial infection and the final results of the stimuli are highly complex [4]. While much attention has been focused on the cellular side of pattern recognition, following the identification of Toll-like receptors, NOD-like receptors, RIG-I-like receptors and C-type lectin receptors, this is only part of the story. An intricate network of humoral PRMs is found in the blood, and the complement system is a central part of this. This review will focus on soluble PRMs that can collectively be designated complement-activating pattern recognition molecules. These can be seen as a subset of a larger and more diverse family of PRMs, sometimes referred to as collagen defence molecules, their unifying features being multimeric macromolecular complexes composed of monomers containing collagenous stalks and some sort of recognition domain. In this family, one may include a number of collectins (collagenous stalks with C-type lectin recognition domains), the ficolins (collagenous stalks with fibrinogen-like recognition domains) and C1q (collagenous stalk with a globular C1q (gC1q) recognition domain).

Complement – a humoral pattern recognition and proteolytic cascade system

The complement system is a humoral proteolytic cascade–based defence mechanism, comprising around 35 different soluble and membrane-bound proteins. It is part of the innate immune system and proceeds via controlled, limited proteolysis of constituent proteins through three activation pathways: the ‘classical pathway’, the ‘alternative pathway’ and the ‘lectin pathway’ (Fig. 1).

Figure 1.

Overview of the three complement activation pathways. The figure summarizes the events leading to the formation of their respective C3 convertases, while for simplicity, the formation of the terminal membrane attack complex and the regulatory mechanisms are omitted. Components and enzymatic activities of the classical, the lectin and the alternative pathways are represented in red, blue and green, respectively. Common components are shown in dark blue, and associations/dissociations of protein components are represented with grey arrows. Active proteases are underlined. The acronyms of participating proteins are used (see text for full names). The classical pathway is activated by C1q in complex with C1s and C1r upon binding of antibody–antigen complexes. The lectin pathway is activated when MBL or ficolins, in complex with MASPs, recognize foreign patterns of carbohydrate or acetyl groups, respectively. The alternative pathway is constitutively activated by the formation of C3(H2O)B complexes and inhibited on self-surfaces, but is allowed to propagate on foreign surfaces. The three pathways converge on C3 and C5 and the common terminal pathway leading to formation of the membrane attack complex.

The classical pathway – recognizing patterns of immunoglobulins and more

Activation of the classical pathway occurs through binding of immune complexes or aggregates containing IgG or IgM, as well as direct recognition of some microorganisms and C-reactive protein (CRP), by C1q [5].

C1q, the recognition subunit of C1, is the archetypal multimeric humoral pattern recognition molecule; yet, it is somewhat of an odd one out among the collagen defence molecules. C1q is a 460-kDa protein complex with the overall shape of a bouquet of tulips. It is composed of three different polypeptide chains (A, B and C) that form six heterotrimeric (each containing an A-, a B- and a C-chain) collagen-like triple helices that associate in their N-terminal halves to form a ‘stalk’, then diverge to form individual ‘stems’, each terminating in a C-terminal heterotrimeric globular domain [6] (Fig. 1). The insertion of a (Gly-X-Y-Z) element into the (Gly-X-Y) triplet pattern in the A-chain of C1q combined with the replacement of one glycine by an alanine in the C-chain determines a structure that forces the triple helix to form a kink at this position with a fixed angle of 60° [7]. The C-terminal globular region (gC1q domain) fold is also found in a variety of non-complement proteins [8]. Furthermore, there is a structural and evolutionary link between tumour necrosis factor (TNF) and gC1q-containing proteins, defining a C1q and TNF superfamily (Fig. 2).

Figure 2.

Evolutionary relationships and overview of the complement-related innate humoral pattern recognition molecules. The set of complement-related humoral pattern recognition molecules present in humans is indicated. They hail from three different families: the C1q/TNF superfamily (C1q), the fibrinogen-related proteins (H-, L- and M-ficolin) and the C-type lectin-domain superfamily (MBL and potentially CL-K1 are able to activate complement, whereas the others are not or have unknown capacity, see text). Some molecules not present in humans are also shown in grey, namely the tachylectins (found in horse-shoe crab and related organisms), MBL-A and finally conglutinin, CL-43 and CL-46, all three only found in the bovidae. Although far from exhaustive, some related branches have been included for reference (TNF family, NK-cell receptor family).

The ligands bound by the globular domains of C1q can be immune complexes containing IgG or IgM, but not those formed by IgA, IgD or IgE [9]. C1q has low affinity for the Fc region of monomeric IgG, but due to avidity effects, the binding strength is increased dramatically for multiple Fc sites. The binding of arrays of IgG to the surface of microorganisms can be said to generate ‘an acquired pattern’. With regard to IgM, which is already multivalent, it is the exposure of cryptic-binding sites in the Fc regions upon interaction with large antigens, which allows tight C1q binding to take place [10]. C1q is traditionally known for its ability to bind antibodies, but an amazing variety of other ligands have been suggested. These include certain bacteria, viruses, parasites and mycoplasma, indicating a role as an antibody-independent defence protein. C1q has also been reported to bind to CRP when this is complexed with exposed phosphocholine residues on bacteria, providing a further means of host defence [11].

C1q lacks enzymatic activity, but is associated with two serine protease proenzymes, C1r and C1s, in the heteropentameric C1 complex, C1qr2s2. Binding of C1q is believed to induce a conformational change leading to auto-activation of the associated C1r, which then cleaves and activates C1s (Fig. 1) [12]. The conformational change also facilitates interaction with C4, which is cleaved by C1s into C4a and C4b. C4b has the ability to bind covalently to hydroxyl or amino groups through a reactive thioester and may thus bind to the surface recognized by C1q [13]. Cleavage of C4 also exposes a cryptic C2-binding site on C4b. Upon binding, C2 is cleaved into C2a and C2b by C1s. C2a remains bound to C4b, and the complex thus generated acts as a C3 convertase, recruiting C3 and cleaving it into C3a and C3b. Release of the small C3a anaphylatoxin induces a conformational change in C3b, exposing a previously buried thioester [14]. As for C4b, this allows C3b to become covalently bound to the target that elicited complement activation. Association of the C4b2a complex with C3b forms a C5 convertase, which initiates the common terminal pathway.
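The order of events in this paragraph can be written down as a small dependency table. The following is a purely qualitative sketch: the species labels (for example `"proC1r"`, `"C1q bound"` and `"C4b2a3b"` for the C5 convertase) are shorthand chosen here, nothing is consumed, and no kinetics are implied. Each step simply fires once everything it requires is present.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple


class Step(NamedTuple):
    requires: FrozenSet[str]
    produces: FrozenSet[str]
    note: str


# Steps as described in the text; labels are illustrative shorthand.
STEPS: List[Step] = [
    Step(frozenset({"C1q bound", "proC1r"}), frozenset({"C1r"}), "C1r auto-activates"),
    Step(frozenset({"C1r", "proC1s"}), frozenset({"C1s"}), "C1r activates C1s"),
    Step(frozenset({"C1s", "C4"}), frozenset({"C4a", "C4b"}), "C1s cleaves C4"),
    Step(frozenset({"C1s", "C4b", "C2"}), frozenset({"C2a", "C2b"}), "C1s cleaves C4b-bound C2"),
    Step(frozenset({"C4b", "C2a"}), frozenset({"C4b2a"}), "C3 convertase forms"),
    Step(frozenset({"C4b2a", "C3"}), frozenset({"C3a", "C3b"}), "C3 convertase cleaves C3"),
    Step(frozenset({"C4b2a", "C3b"}), frozenset({"C4b2a3b"}), "C5 convertase forms"),
]


def run_cascade(initial: Set[str]) -> Tuple[Set[str], List[str]]:
    """Fire every step whose requirements are met until nothing changes."""
    pool = set(initial)
    fired: List[str] = []
    progress = True
    while progress:
        progress = False
        for step in STEPS:
            if step.note not in fired and step.requires <= pool:
                pool |= step.produces
                fired.append(step.note)
                progress = True
    return pool, fired


SERUM = {"proC1r", "proC1s", "C4", "C2", "C3"}
pool, fired = run_cascade(SERUM | {"C1q bound"})
print(fired)
```

Running the same function on `SERUM` alone fires no step, which mirrors the point that target binding by C1q is the trigger for everything downstream.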

The lectin pathway is highly parallel to the classical pathway (see below), and quite probably, at some stage in the evolution of the classical pathway, C1q recognized carbohydrate structures on immunoglobulins [15]. This puts into perspective both a reported recognition of glycosylations on immunoglobulins by mannan-binding lectin (MBL), one of the collagen defence molecules of the lectin pathway, and the reported activation of the classical pathway by direct binding of C1q to PAMPs. Thus, the primordial ‘C1q pathway’ may have been very lectin pathway like. This is further reinforced by the observation that lamprey C1q acts as a lectin [16] that binds serine proteases and activates the C3/C4-like thioester-containing protein of the lamprey complement system [15], in a manner similar to that illustrated in Fig. 1.

The alternative pathway – recognizing patterns of sialic acids

The course of activation of the alternative pathway, on the other hand, is distinct from that of the two other activation pathways until the generation of a C5 convertase and initiation of the common effector mechanisms. The principle behind the pathway also differs from that of the other two, in that the classical and lectin pathways are activated upon recognition of non-self, whereas the alternative pathway is constantly auto-activated and then inhibited on self-surfaces, but allowed to propagate on non-self-surfaces. Thus, the alternative pathway primarily makes use of what has been termed ‘reverse recognition’, based not on the recognition of PAMPs, but rather the recognition of host-associated molecular patterns (HAMPs). The key host-pattern recognition molecule appears to be factor H, predominantly recognizing HAMPs composed of polyanionic surface structures, for example sialic acids [17].

C3 is continuously activated at a low rate in the fluid phase through either cleavage by serum proteases to C3a and C3b, reaction of the thioester with small nucleophiles or water, or non-specific perturbations leading to conformational changes and the exposure and hydrolysis of the thioester. This is known as the ‘tick-over mechanism’. Intact C3 that has a hydrolysed thioester without being cleaved is referred to as C3i or C3(H2O) (when the nucleophile is water). It has a molecular conformation like C3b and is able to associate with factor B (see Fig. 1). The C3(H2O)B complex is cleaved by factor D to form the alternative pathway initiator, the C3-convertase C3(H2O)Bb. These processes operate at low levels and are inhibited in various ways by self. On host surfaces, deposited C3b forms a complex with the soluble molecule factor H, which is drawn to the surface by a pattern of high density of sialic acids and serves as cofactor for the regulatory serine protease factor I in the breakdown of C3b to iC3b (‘inactivated C3b’). The membrane proteins complement receptor 1 (CR1), membrane cofactor protein (MCP) and decay-accelerating factor (DAF) also interact with the C3 convertase and promote its degradation or dissociation. Factor H and factor I also act on the fluid-phase C3-convertase, C3(H2O)Bb. However, if C3b deposition occurs on an activating, that is, non-inhibitory, surface, it can serve as a seed for an explosive positive amplification loop (Fig. 1). Importantly, the alternative pathway serves not only as an independent pathway, but also as an amplifier of the two other pathways once these have been initiated and have generated the C3-convertases specific for these pathways (see above and below) [18, 19].
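The contrast between self and non-self surfaces can be illustrated with a deliberately crude discrete-time model. The rate constants below are hypothetical and carry no physiological meaning; the only assumption is that deposited C3b both seeds further deposition (amplification) and is removed by regulators such as factor H and factor I (inactivation), with regulation stronger on host surfaces.

```python
from typing import List


def simulate_c3b(initial: float, amplification: float,
                 inactivation: float, steps: int) -> List[float]:
    """Surface C3b over time in a toy linear feedback model.

    Per step, each unit of C3b seeds `amplification` new units and
    loses `inactivation` units to regulators. The model is unbounded
    and ignores substrate depletion.
    """
    level = initial
    history = [level]
    for _ in range(steps):
        level = level + amplification * level - inactivation * level
        history.append(level)
    return history


# Hypothetical parameters: same amplification, different regulation.
foreign = simulate_c3b(initial=1.0, amplification=0.5, inactivation=0.1, steps=10)
host = simulate_c3b(initial=1.0, amplification=0.5, inactivation=0.9, steps=10)

print("non-inhibitory surface:", round(foreign[-1], 2))
print("host surface:", round(host[-1], 4))
```

In this sketch the outcome depends only on whether amplification exceeds inactivation, which is the sense in which recognition of host patterns by regulators, rather than recognition of the pathogen, decides where the pathway propagates.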

The lectin pathway – recognizing patterns of carbohydrates and acetyl groups

The lectin pathway parallels the classical pathway, the difference being at the initial step of target recognition and subsequent activation. Evolutionarily, it is arguably the most ancient of the activation pathways [20, 21]. Activation of the lectin pathway occurs through direct recognition of carbohydrate or acetylated PAMPs by MBL and ficolins, respectively, in association with MBL-associated serine proteases (MASPs). Upon binding of MBL or ficolins to their target, the associated MASPs are activated, presumably in a manner analogous to that of the C1 complex. MASP-2 is able to cleave both C4 and C2, in principle rendering it sufficient for lectin pathway activation, but under physiological conditions, MASP-1 plays an important role by transactivating MASP-2 (see Fig. 1) [22, 23]. Furthermore, convertase formation appears to be accelerated by MASP-1 through auxiliary C2 cleavage [24, 25]. The importance of MASP-1 may partly lie in its relative abundance (10 μg/ml), as compared to MASP-2 (0.5 μg/ml) [26]. Recently, MASP-1 has also been implicated in the alternative pathway, with the demonstration of its requirement for activation of pro-fD to active fD in the MASP1 knock-out mouse [27]. However, in a human MASP1-deficient patient, the alternative pathway was found to function normally [22].

Collectins – recognizing patterns of carbohydrates

Lectins are carbohydrate-binding proteins. The animal lectins have been subdivided into a number of families. One of these is a family of proteins containing a C-type lectin-like domain (CTLD). The mammalian members of the CTLD family can be further divided into 17 subgroups based on phylogenetic relationships and domain structures [28]. Despite having a CTLD, not all members actually bind carbohydrate structures, but many are Ca2+-dependent carbohydrate-binding proteins and thus true C-type lectins. Within the CTLD, the highly conserved Glu-Pro-Asn (EPN) and Gln-Pro-Asp (QPD) motifs determine specificity for mannose- or galactose-containing ligands, respectively [29].
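As a rough illustration of the motif rule just described, the sketch below labels a CTLD sequence by the presence of an EPN or QPD tripeptide. It is a toy: a plain substring search ignores where in the domain the motif sits, and the function name and example sequences are invented for illustration.

```python
def predicted_sugar_preference(ctld_sequence: str) -> str:
    """Classify a CTLD by its EPN/QPD motif (one-letter amino acid code).

    Naive substring search; a real analysis would check the motif at its
    aligned position within the domain.
    """
    seq = ctld_sequence.upper()
    has_epn = "EPN" in seq
    has_qpd = "QPD" in seq
    if has_epn and has_qpd:
        return "ambiguous"
    if has_epn:
        return "mannose-type"
    if has_qpd:
        return "galactose-type"
    return "no canonical motif"


# Made-up fragments, not real protein sequences.
examples = {
    "fragment_1": "AAKWNDEPNNSGS",
    "fragment_2": "AAKWNDQPDNSGS",
    "fragment_3": "AAKWNDAAANSGS",
}
for name, seq in examples.items():
    print(name, "->", predicted_sugar_preference(seq))
```

The third case reflects the caveat in the text that possession of a CTLD does not by itself imply carbohydrate binding.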

One of the subgroups of the CTLD family is the collectins. Like C1q, these contain collagen-like regions and usually assemble into large oligomeric complexes, allowing high-avidity binding based on multiple low-affinity interactions of their CTLDs [30]. This enables collectins to discriminate not only specific carbohydrate moieties but also specific patterns of these, often characteristic of pathogens. MBL was the first collectin to be discovered and is the best known. It was originally named mannan-binding protein, but has later become known as mannose-binding lectin or mannan-binding lectin. Lung surfactant was also shown to contain two collectins, surfactant protein A (SP-A) (of which two similar forms are found in humans, SP-A1 and SP-A2) and SP-D. Furthermore, genome analyses revealed two related human collectins, collectin liver 1 (CL-L1) and CL placenta 1 (CL-P1), and a third, CL kidney 1 (CL-K1), was recently cloned [31].

Mannan-binding lectin is a plasma protein of hepatic origin. The polypeptide chain of MBL has the characteristics of the collectins: a short N-terminal region containing cysteines that mediate disulphide bridging between structural subunits or monomers of the same subunit [32]; a collagen-like region with variable repeats of Gly-X-Y triplets (where X and Y may be any amino acid) stabilized by the presence of hydroxyprolines and glycosylated hydroxylysines [33]; a small neck region that forms an α-helical coiled-coil; and a C-terminal globular CTLD, dictating the ligand specificity. Three such polypeptide chains assemble to form homotrimeric structural subunits (Fig. 3), which then associate into higher-oligomeric forms (Fig. 4), ranging from dimers to hexamers and even higher oligomers [34]. In human MBL, an interruption in the collagenous sequence between the first 7 and the ensuing 12 collagen repeats is assumed to give rise to a flexible joint in the collagenous region of MBL, also known as the ‘kink’ (a similar kink is found in SP-A, and these are analogous to the kink in C1q described above). The binding site for MASPs has been localized around lysine 55 [35] and suggested to be the motif, Hyp-Gly-Lys-X-Gly-Pro/Tyr (where Hyp is hydroxyproline), which is conserved between MBL and ficolins of various species, as well as C1q [36, 37] (Fig. 5). In MBL and ficolins (see below), this motif has been associated with binding of MASPs and hence complement activation via C4 and C2. Whereas the overall structure of MBL resembles that of C1q, MBL has a more open or extended conformation (Fig. 4). While the arms of C1q are joined in a sizeable stalk, restricting flexibility somewhat, MBL seems to be joined N-terminally in a so-called ‘hub’ [38, 39].
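The proposed MASP-binding motif, Hyp-Gly-Lys-X-Gly-Pro/Tyr, lends itself to a simple pattern search. The sketch below assumes that hydroxyproline is written as `O` in the one-letter sequence, a common convention in collagen work but an assumption here, since database sequences usually show the unmodified `P`. The `require_hyp` switch, the function name and the test sequences are hypothetical.

```python
import re
from typing import List


def find_masp_motifs(collagen_seq: str, require_hyp: bool = True) -> List[int]:
    """Return 0-based positions of the lysine in each motif occurrence.

    Motif: Hyp-Gly-Lys-X-Gly-(Pro|Tyr), with Hyp written as 'O'.
    With require_hyp=False the first position may be any residue.
    """
    first = "O" if require_hyp else "."
    pattern = re.compile(first + r"GK.G[PY]")
    # The lysine is the third residue of the motif, hence start + 2.
    return [m.start() + 2 for m in pattern.finditer(collagen_seq.upper())]


# Invented collagen-like fragments for illustration only.
with_hyp = "GPOGKLGPOGNO"
without_hyp = "GPPGKLGPOGNO"

print(find_masp_motifs(with_hyp))
print(find_masp_motifs(without_hyp))
print(find_masp_motifs(without_hyp, require_hyp=False))
```

The relaxed mode is included because the strict form would miss a sequence that carries the lysine in the right context but lacks the leading hydroxyproline.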

Figure 3.

Subunit structures of the humoral collectins and ficolins. Lengths of collagen-like regions are calculated according to a translation of 0.286 nm per residue in the triple helix [95], corresponding to a triplet pitch of a type I collagen helix of 0.85 nm [96]. The α-helical coiled-coil length of ~3 nm and the domain size of the C-type lectin-like domains (CTLDs) of approximately 3 × 4 nm were derived from X-ray diffraction [97], and the domain sizes for FBGs are about the same size, also based on X-ray diffraction [98, 99]. The FBGs are marginally larger than the CTLDs based on spherical volume calculations as described in the Supporting Information, and they are depicted as such in the figure. The figure is drawn to scale.
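The length estimate used in this caption is a simple product of residue count and rise per residue. The short sketch below reproduces the arithmetic for the repeat numbers quoted in the text (24 uninterrupted repeats for CL-L1 and CL-K1, and 7 + 12 repeats for human MBL). It ignores the interruption in MBL, so that value is an approximation. The function name is chosen here for illustration.

```python
RISE_PER_RESIDUE_NM = 0.286  # translation per residue in the triple helix
RESIDUES_PER_TRIPLET = 3     # one Gly-X-Y repeat


def collagen_length_nm(n_triplets: int) -> float:
    """Length of a collagen-like region with n Gly-X-Y repeats, in nm."""
    return n_triplets * RESIDUES_PER_TRIPLET * RISE_PER_RESIDUE_NM


triplet_pitch = collagen_length_nm(1)   # 3 x 0.286 nm
cl_k1_like = collagen_length_nm(24)     # 24 uninterrupted repeats
mbl_like = collagen_length_nm(7 + 12)   # interruption itself not modelled

print(round(triplet_pitch, 3), round(cl_k1_like, 1), round(mbl_like, 1))
```

The per-triplet value of about 0.86 nm is consistent with the quoted triplet pitch of roughly 0.85 nm, and 24 repeats come to about 20.6 nm.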

Figure 4.

Schematic drawings of the ultrastructures of the multimeric humoral pattern recognition molecules. Based on [75, 76] (C1q), [77] (SP-A), [100] (SP-D), [39] (MBL), [100, 101] (IgM), [78] (L-/M-ficolin), [51] (H-ficolin). The drawings are to scale. Some structure dimensions were corrected according to the calculations in the Supporting Information.

Figure 5.

Alignment of the collagen-like sequence around the putative MASP-binding motif of MBL, the ficolins, the chains of C1q, collectin-L1, collectin-K1 and SP-D. SP-A1 and SP-A2 lack this motif entirely. X = any amino acid. ‘:’ entirely conserved. ‘.’ highly conserved. Prolines in third position in the Gly-X-Y triplet are likely hydroxyprolines. The putative MASP-binding motif in the collagen sequence is indicated in blue. Residues involved in binding are given in bold, and the essential lysine (55 in MBL) is indicated with an asterisk.

The PAMPs recognized by MBL comprise various simple and complex carbohydrate motifs. MBL does not bind only mannose or its multimers, as the often used name mannose-binding lectin implies, but more generally recognizes sugars with 3- and 4-OH groups placed in the equatorial plane of the sugar ring structure, for example, mannose, N-acetylmannosamine, glucose, N-acetylglucosamine and L-fucose [40]. High-avidity binding of MBL requires appropriate spacing of ligand sugars, allowing concomitant binding of multiple CTLDs.

The level of MBL in plasma varies interindividually by more than three orders of magnitude (2–10,000 ng/ml) and is to a large degree genetically determined by polymorphisms in the promoter region and in exon 1 [41]. Three naturally occurring mutations in the N-terminal part of the collagenous region cause decreased MBL levels, low levels of oligomerization of the subunits and decreased binding of MASPs [34]. MBL deficiency, caused either by low MBL levels or defective oligomerization, is widely recognized as one of the most common human immunodeficiencies.

CL-L1 was cloned from liver and has been reported to be present mainly in liver as a cytosolic protein and at low levels in placenta [42]. CL-L1 has 24 Gly-X-Y repeats, not interrupted by a kink, and lacks the first hydroxyproline of the traditional binding motif for MASPs and C1r/C1s (Figs. 3 and 5). Evolutionarily, CL-L1 appears to be restricted to mammals and birds. Studies on CL-L1 are lacking, and no physiological role has been established.

CL-P1 was isolated from placenta [43]. It is a type II membrane protein, which has a coiled-coil region, a collagen-like domain and a CTLD. It appears to be a scavenger receptor found on vascular endothelial cells and able to bind both to bacteria and yeast, as well as oxidized low-density lipoprotein [43, 44].

CL-K1 was discovered in kidney [31], but is expressed also in the adrenal glands and the liver [45] and other tissues. The mean level in plasma was found to be 2.1 μg/ml in 10 individuals. CL-K1 possesses an uninterrupted collagen-like region (24 repeats) (Fig. 3), which lacks the first hydroxyproline in the proposed protease-binding motif: Hyp-Gly-Lys-X-Gly-Pro/Tyr (Fig. 5) found in MBL, ficolins and C1q [46]. However, MASP-1 was found to copurify with CL-K1 indicating that the motif is not absolutely required and that variants are allowed [45]. The other MASPs were not identified in the study. The protein seems to have direct antimicrobial activities, and another study suggests an ability to activate the lectin pathway [47].

Phylogenetic analyses indicate that the collectins can be divided into five distinct subgroups. MBLs group together, SP-A1 and SP-A2 group together, SP-D forms a group with bovine conglutinin and collectin-43 and collectin-46, CL-P1 groups alone and finally CL-L1 and CL-K1 group together [31] (Fig. 2).

Ficolins – recognizing patterns of acetyl groups

The lectin pathway can also be activated by the ficolins. They are structurally and functionally related to the collectins, the defining difference being that they have fibrinogen-like (FBG) domains in place of CTLDs [48] (Fig. 3). Similar to the collectins, the ficolins form structural subunits through a collagen helix (Fig. 3) and these associate into larger oligomeric structures (Fig. 4). Three ficolins are found in humans: H-ficolin, L-ficolin and M-ficolin, while two are found in rodents (Ficolin A and Ficolin B, homologues of L- and M-ficolin, respectively) [49]. The ligands for the ficolins were initially suggested to be acetylated monosaccharides such as GlcNAc and GalNAc [50, 51], in line with their evolutionary relationship to the tachylectins. However, it is now known that the motifs recognized are more generally acetylated compounds, including non-sugars such as N-acetyl-glycine, N-acetyl-cysteine and acetylcholine [52]. Based on this selectivity, the ficolins are not easily defined as lectins, rendering the term ‘lectin pathway’ somewhat of a misnomer, but we shall use this name throughout this review, as is also the case in the literature in general.

H-ficolin, also referred to as Ficolin-3, was initially noted as an auto-antigen in patients with systemic lupus erythematosus [53] and was subsequently characterized and recognized as a ficolin based on structural and functional characterization [51, 54]. Meanwhile, human L-ficolin (Ficolin-2), initially termed P35, had been discovered, isolated from plasma and characterized [55-57]. M-ficolin was identified by cloning of cDNA encoding a molecule resembling L-ficolin [58, 59] and was also referred to as P35-related protein or Ficolin-1.

H-ficolin is produced by bile duct epithelial cells and hepatocytes, as well as in the lung by ciliated bronchial epithelial cells and type II alveolar epithelial cells [60]. From these sites of production, H-ficolin is secreted into serum, bile and bronchoalveolar fluid. The median serum H-ficolin level has been determined at 18.4 μg/ml, ranging from 11.2 to 33.8 μg/ml [61], by far making it the most abundant of the activators of the lectin pathway. Upon gel permeation chromatography, serum H-ficolin runs at 650 kDa [54]. The ligand specificity of H-ficolin is reportedly N-acetyl-galactosamine (GalNAc) and N-acetyl-glucosamine (GlcNAc), and H-ficolin was found to agglutinate human erythrocytes coated with LPS from Salmonella and Escherichia [51]. H-ficolin was also found to bind a polysaccharide from Aerococcus viridans [62]. Perhaps more biologically relevant, this finding was supported by binding to whole bacteria [61] and higher serum concentrations of H-ficolin were reportedly linked with inhibition of A. viridans growth [63]. Moreover, H-ficolin was also reported to activate the lectin pathway of complement [61, 64].

L-ficolin is hepatically produced and the median concentration in serum has been established at 3.3 μg/ml, ranging from 1.8 to 9.0 μg/ml [61]. It is found in serum in two forms, at 650 and at 150 kDa, and it binds acetylated compounds, including GlcNAc, GalNAc, N-acetyl-mannosamine, N-acetyl-glycine, N-acetyl-cysteine and acetylcholine [52]. L-ficolin has also been found to bind some encapsulated bacteria [57, 61], to associate with MASPs and MAps in a calcium-dependent manner [65] and to activate complement [66].

M-ficolin is found on the surface of monocytes and granulocytes [67, 68], although it lacks a transmembrane domain, and also in secretory granules in the cytoplasm of neutrophils and monocytes [69]. The expression of M-ficolin was reported to cease when the monocytes differentiated into macrophages [68, 70]. M-ficolin was found to bind acetylated patterns, associate with MASPs, and activate complement [67, 69]. Although M-ficolin was initially not thought to be found in serum, a mean concentration of 1.07 μg/ml, ranging from 0.28 to 4.05 μg/ml, has been reported [71]. On GPC of serum, M-ficolin is found in a complex of ~900 kDa [71], much larger than what has previously been seen for H-ficolin [54] and L-ficolin [52]. The cell-surface association of M-ficolin in the absence of a transmembrane motif was recently reported to be caused by tethering through recognition of sialic acid by the fibrinogen-like domain [72]. The preference of M-ficolin for sialic acid was supported by studies based on glycan array screening [73]. Such recognition would appear to be a risky mechanism, potentially triggering complement activation on self-surfaces. It is curious that such binding to patterns of sialic acids may be counteracted by the affinity of factor H of the alternative pathway for sialic acids in combination with C3-convertases (see above for the alternative pathway).

Ultrastructures of the multimeric humoral PRMs

Although the defining feature of innate PRMs and PRRs is that their specificities are germ-line encoded and thus fixed, in contrast to the somatically recombined B- and T cell receptors, the total pool of innate PRMs and PRRs, even within a single species, exhibits an extremely broad repertoire of recognition capabilities. This repertoire is enhanced by natural antibodies of the IgM isotype, produced at tightly regulated levels and in the complete absence of external antigenic stimulation [74]. IgM and the bona fide innate PRMs also resemble each other somewhat in terms of their ultrastructures.

The humoral recognition molecules commonly attain high-avidity recognition capabilities by being polymeric, for example, IgM is pentameric or hexameric, and C1q is hexameric, while ficolins and collectins are found in a number of multimeric forms (Fig. 4). It is important to note, however, that the interactions may be of even higher order than the oligomeric nature suggests. In the case of MBL, each subunit has three CTLD heads, meaning that a hexamer of subunits in reality has 18 recognition sites. The large size of the multimeric PRMs also gives them an advantage in the agglutination of large and often mutually repulsive particles. Electron microscopy of rat SP-D preparations demonstrated a highly homogeneous population of molecules with four identical rod-like arms (46 nm in length) that emanated from a central ‘hub’ in two pairs that closely paralleled each other for their first 10 nm. Higher-order oligomers were also observed, in which four such dimers were arranged so that each rod-like arm constituted the spoke of a wheel-like structure (Fig. 4). The diametrically opposed rod-like arms and their recognition domains together span an impressive 100 nm, that is, the diameter of small viruses! SP-A was found to have a structure much like that reported for C1q [75, 76]. Collagen arms with a length of around 19.6 nm and a diameter of 1.5 nm were observed, terminated by globular heads with a diameter of around 5.3 nm [77] (Fig. 4).
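The count of recognition sites follows directly from the subunit architecture. The minimal sketch below tabulates it for the MBL oligomers mentioned earlier (dimers to hexamers of homotrimeric subunits); the names are chosen here for illustration, and oligomers above the hexamer are left out.

```python
HEADS_PER_SUBUNIT = 3  # each homotrimeric MBL subunit carries three CTLDs


def recognition_sites(n_subunits: int) -> int:
    """Number of CTLD heads in an oligomer of n structural subunits."""
    if n_subunits < 1:
        raise ValueError("an oligomer needs at least one subunit")
    return n_subunits * HEADS_PER_SUBUNIT


# Dimers (2 subunits) through hexamers (6 subunits).
sites_by_oligomer = {n: recognition_sites(n) for n in range(2, 7)}
print(sites_by_oligomer)
```

For the hexamer this gives the 18 sites quoted above, and even the smallest oligomer listed presents six heads.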

The structure of MBL has recently been characterized by atomic force microscopy [39], and the observations were in good agreement with previous studies using electron microscopy and with the estimated dimensions of the collagen-like region. While the ‘kink’ in C1q is generated by two different interruptions in two of three chains in the collagen-like helix, the kinks in MBL and SP-A are based on the same interruption in three identical chains. This was interpreted to mean that the kink in C1q is probably more rigid, whereas the kinks in MBL and SP-A are more flexible [77]. There is also a difference in the kink between MBL and SP-A: in the former, the interruption is the lack of a Y residue in a triplet, whereas in the latter, a 4 aa motif is inserted into the collagen-like sequence. L- and M-ficolin, but not H-ficolin, also possess a kink, at the very beginning of their collagen-like regions (Fig. 3), and in both cases, this is due to the insertion of a 6 aa sequence. The ultrastructures of H-ficolin [51] and L-/M-ficolin (equivalent to the porcine ficolin-α and/or -β in the figure) [78] have also been studied by electron microscopy, revealing overall structures resembling that of MBL (Fig. 4).

Nature and composition of the complement-activating complexes

The nature of the complexes between the collectins and their serine proteases (MASPs), and between C1q and its serine proteases (C1r and C1s), must be tied intimately to the mechanism of activation. In the case of C1, the complex consists of one C1q with a C1s-C1r-C1r-C1s tetramer, and the model of activation is that the two C1r molecules trans-autoactivate and subsequently transactivate the two C1s molecules. The scenario for the complexes of the lectin pathway is considerably more complex, with at least four PRMs (MBL, H-, L- and M-ficolin) interacting with five MASPs/MAps (MASP-1, -2, -3, MAp19 and MAp44) [79]. A fundamental prerequisite for understanding complement activation through the lectin pathway is to define the nature and composition of the complexes.

The three MASPs are multidomain modular proteases with the same overall architecture as C1r and C1s of the classical pathway, that is, they are composed of well-described domains in the order: CUB1-EGF-CUB2-CCP1-CCP2-SP, the latter being the serine protease domain. The CUB1 and EGF domains are responsible for their calcium-dependent binding to C1q (for C1r, C1s) and MBL and the ficolins (for MASPs) [80, 81]. Studies using truncated proteins have shown that MASP-1 and MASP-2 bind to MBL in a Ca2+-dependent manner through interactions involving the CUB and EGF-like domains [82, 83]. Fragments encompassing these regions are Ca2+-independent homodimers that are stabilized by interactions involving the two N-terminal domains, with the CUB1 domain being essential [84]. We have recently found that while MASPs and MAp44 are able to form heterodimers, only homodimers (e.g. MASP-3/MASP-3) are found in serum [85]. The three N-terminal domains of each MASP are necessary and sufficient to reproduce the MBL-binding properties of the full-length proteins [84]. The contiguous CCPs of C1r, C1s, and the MASPs, especially CCP2, have been implicated in the binding of macromolecular substrates [86, 87]. Finally, the SP domain is the catalytically active unit and defines them as S1A family, chymotrypsin-like, proteases. CCP2 and the SP domain are connected by a somewhat flexible linker or activation peptide, which lies just upstream of the R-I cleavage site and contains the recognition motif for activating proteases.

A study employing site-directed mutagenesis in rat MBL and assessing sequence conservation between mammalian MBLs suggested that the site of complex formation with the MASPs and MAps is likely located C-terminally to the break in the collagenous sequence [37], and lysine 55 in this position of human MBL was found to be essential for binding to MASPs [35]. The same binding motif is found in the ficolins [36] (Fig. 5). Surprisingly, CL-K1 was recently reported to associate with MASP-1, even though CL-K1 lacks the first hydroxyproline in the otherwise conserved binding motif, perhaps indicating that the requirements for this motif are not as stringent as initially believed [45].

In humans, MBL is found in oligomeric forms ranging from dimers to octamers [34]. The most prominent oligomer contains four subunits; that of three subunits is the second most abundant, closely followed by that of five subunits, with larger oligomers present in decreasing amounts [39]. Analyses of CUB-EGF-CUB fragments indicated that each MASP dimer contains binding sites for two MBL subunits and that both sites must be occupied by subunits from a single MBL oligomer to form a stable complex. Thus, the smallest functional unit for complement activation has been proposed to consist of MBL dimers bound to MASP-1 or MASP-2 homodimers [84]. This raises the possibility that two or more homodimers could associate in a single complex with a higher-oligomeric PRM. Conversely, others have suggested that MASP dimers have four separate binding sites for MBL stalks, one for each stalk of an MBL tetramer [88], and that trimeric and tetrameric MBL may bind only a single MASP homodimer [89]. Nonetheless, we recently demonstrated that single complexes with MBL could harbour more than one dimer of MASP [22], but naturally, this might be due to association with higher-order oligomers of MBL. However, we have observed that such heterocomplexes can also be formed through association with purified H-ficolin or L-ficolin, and while H-ficolin is present in a range of oligomeric forms, L-ficolin from serum is predominantly tetrameric, again suggesting that tetramers may bind two MASP dimers [85].
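The stoichiometric reasoning behind the two-site proposal [84] can be illustrated with a minimal sketch. It assumes that each MASP dimer must engage two subunits of the same oligomer and deliberately ignores steric constraints, so the result is at best an upper bound; the function name is hypothetical, and the alternative four-site proposal [88, 89] is not modelled.

```python
# Illustrative only: upper bound on the number of MASP dimers per PRM
# oligomer under the two-site model [84], where each MASP dimer occupies
# two subunits of a single oligomer. Steric hindrance is ignored, which
# is an assumption made purely for this sketch.
SUBUNITS_PER_MASP_DIMER = 2


def max_masp_dimers(n_subunits: int) -> int:
    """Return the maximal number of MASP dimers an oligomer could carry."""
    if n_subunits < 0:
        raise ValueError("number of subunits cannot be negative")
    return n_subunits // SUBUNITS_PER_MASP_DIMER


# Oligomeric forms of human MBL range from dimers to octamers [34].
for n in range(2, 9):
    print(f"{n} subunits: at most {max_masp_dimers(n)} MASP dimer(s)")
```

Under these assumptions, a dimer or trimer accommodates one MASP dimer and a tetramer two, in line with the suggestion that tetramers may bind two MASP dimers.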

Mechanism of activation

The nature of the PRM complexes is tied intimately to the mechanism of activation, and the main question remains how binding of PRM to a ligand surface triggers MASP activation. MBL binds microorganisms through the C-terminal CTLDs, which are far from the reported binding site of MASPs and MAps C-terminally to the kink in the collagenous sequence [37]. It has been suggested that binding to the surface of a microorganism could induce conformational changes within individual MBL subunits, leading to MASP activation, but crystallographic data show that no structural changes occur in the individual carbohydrate recognition domains upon sugar binding [40].

The presently favoured model for complement activation maintains that binding to the surface of a microorganism causes a global change in the structure of MBL, resulting in the displacement of the relative positions of subunits, translating from the CTLDs into the stalks. Indeed, recent studies have demonstrated a great flexibility of MBL and large conformational changes upon binding to artificial ligand surfaces [38, 39]. Furthermore, the recent structure of the CUB-EGF-CUB domains of MASP-1/-3/MAp44 in complex with the part of the collagen stalk of MBL harbouring the proposed MASP-binding motif was interpreted as evidence for such a model [90].

Yet another, simpler possibility remains, namely that MBL and ficolins serve merely as ‘carriers of MASPs’ and that their clustering upon binding to appropriately spaced ligand patterns on microorganisms causes the juxtaposition of MASPs, allowing intercomplex transactivation. We favour this mechanism for a number of reasons:

Firstly, it is supported by the observation of concentration-dependent auto-activation of MASP-1 and MASP-2 [91, 92]. This implies that there is no (absolute) requirement for steric strain for catalytic activity of the MASPs, as has otherwise been suggested for the initiating protease of the classical pathway, C1r [12].

Secondly, until recently, it was believed that while two activation steps (C1r and C1s) are required for C1 activation, strictly only one step (auto-activation of MASP-2) is required for MBL-MASP activation. However, as mentioned earlier, recent data suggest that under physiological conditions, MASP-1 is important in transactivating MASP-2 [22, 23]. Indeed, MASP-1 can be viewed as ‘the trigger’, in that zymogen MASP-1 has significant activity towards either zymogen MASP-1 or MASP-2, allowing it to initiate a ‘ping-pong transactivation’ (Fig. 6). Despite our recent demonstration that MASP-1 and MASP-2 may be found together in the same PRM complex [22, 85], this may only account for a minor proportion of PRM/MASP complexes. The function of lower-order oligomers of MBL or ficolins with single MASP homodimers would be unaccounted for, unless such oligomers are able to co-operate upon binding to the same target surface.

Figure 6.

Potential scenarios for concentration- or juxtaposition-dependent activation and transactivation of MASPs upon PRM binding. See text for discussion of this figure. The modelled scenarios are not mutually exclusive and may even co-operate on the same target surface.

Thirdly, recent structures of the CCP-CCP-SP fragments of MASPs (e.g. MASP-1 [93]) and of MAp44 (representing CUB-EGF-CUB-CCP of MASP-1) [94] allow the in silico assembly of intact MASPs. This gives rise to an estimated length of the dimer of more than 30 nm, which is similar to (or longer than) the longest dimension of MBL! This would allow the MASPs to protrude significantly from the PRM, a prerequisite for such co-operation.

Finally, it is hard to imagine enough structural flexibility in MASPs to allow for the two antiparallel MASPs in a MASP homodimer to ‘curl up’ on themselves, in order for the two serine protease domains to sequentially activate each other. Furthermore, this mechanism would only allow transactivation of MASP-1 on MASP-1 or MASP-2 on MASP-2, not MASP-1 on MASP-2, because as mentioned earlier, heterodimers are not found [85]. While we have demonstrated that two MASP dimers may be found on the same PRM, potentially addressing this problem, the function of lower-order oligomers of MBL or ficolins with single MASP homodimers would again be unaccounted for.

Importantly, the intercomplex transactivation model is not mutually exclusive with that of intracomplex transactivation in complexes harbouring two MASP dimers (or potentially more). Indeed, such higher-order oligomers with multiple MASPs may still be functionally important, both by virtue of the intrinsic colocalization of, for example, MASP-1 and MASP-2, and by virtue of the higher-oligomeric nature of the MBL, allowing high-avidity binding. The intercomplex transactivation model rather extends the scope, allowing for an intricate encoding of pattern recognition that directs diverse physiological responses to discriminate self versus non-self, as we have previously alluded to [22]. Of note, this would allow not only for co-operation of distinct MBL/MASP complexes, but also of distinct PRM/MASP complexes, such as MBL/MASP-2 with H-ficolin/MASP-1. We have presented an overview of the proposed modes of operation in Fig. 6. Future experiments should address the questions raised here, such as the possibility of PRM/MASP co-operation on ligand surfaces, as well as the structural aspects in terms of placement and orientation of the MASPs in the activating complexes when bound to their targets.