Role of Ca2+ in folding the tandem β-sandwich extender domains of a bacterial ice-binding adhesin


  • Shuaiqi Guo,

    1. The Protein Function Discovery Group, Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada
  • Christopher P. Garnham,

    1. The Protein Function Discovery Group, Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada
    Current affiliation:
    1. National Institute of Neurological Disorders and Stroke, Bethesda, MD, USA
  • Sarathy Karunan Partha,

    1. The Protein Function Discovery Group, Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada
  • Robert L. Campbell,

    1. The Protein Function Discovery Group, Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada
  • John S. Allingham,

    1. The Protein Function Discovery Group, Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada
  • Peter L. Davies

    Corresponding author
    1. The Protein Function Discovery Group, Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada
    • Correspondence

      P. L. Davies, Department of Biomedical and Molecular Sciences, Queen's University, Kingston, Ontario, Canada K7L 3N6

      Fax: +1 613 533 2497

      Tel: +1 613 533 2983




A Ca2+-dependent 1.5-MDa antifreeze protein present in an Antarctic Gram-negative bacterium, Marinomonas primoryensis (MpAFP), has recently been reassessed as an ice-binding adhesin. The non-ice-binding region II (RII), one of five distinct domains in MpAFP, constitutes ~ 90% of the protein. RII consists of ~ 120 tandem copies of an identical 104-residue sequence. We used the Protein Homology/analogy Recognition Engine server to define the boundaries of a single 104-residue RII construct (RII monomer). CD demonstrated that Ca2+ is required for RII monomer folding, and that the monomer is fully structured at a Ca2+/protein molar ratio of 10 : 1. The crystal structure of the RII monomer was solved to a resolution of 1.35 Å by single-wavelength anomalous dispersion and molecular replacement methods with Ca2+ as the heavy atom to obtain phase information. The RII monomer folds as a Ca2+-bound immunoglobulin-like β-sandwich. Ca2+ ions are coordinated at the interfaces between each RII monomer and its symmetry-related molecules, suggesting that these ions may be involved in the stabilization of the tandemly repeated RII. We hypothesize that > 600 Ca2+ ions help to rigidify the chain of 104-residue repeats in order to project the ice-binding domain of MpAFP away from the bacterial cell surface. The proposed role of RII is to help the strictly aerobic bacterium bind surface ice in an Antarctic lake for better access to oxygen and nutrients. This work may give insights into other bacterial proteins that resemble MpAFP, especially those of the large repeats-in-toxin family that have been characterized as adhesins exported via the type I secretion pathway.


Structural data are available in the Protein Data Bank under the accession numbers 4KDW (P1 structure) and 4KDV (P21 structure).


AFP, antifreeze protein
BIg, bacterial immunoglobulin
MpAFP, Marinomonas primoryensis antifreeze protein
PDB, Protein Data Bank
Phyre2, Protein Homology/analogy Recognition Engine Server
RII, region II
RIV, region IV
SAD, single-wavelength anomalous dispersion
T1SS, type I secretion system
Ve, elution volume
Vt, total column volume


Many organisms that inhabit ice-laden environments use antifreeze proteins (AFPs) to prevent freezing damage caused by internal ice crystal growth [1, 2]. For example, some freeze-resistant species of fish and insects produce AFPs that bind to the surface of ice crystals and stop them from growing [3, 4]. Other organisms, including many species of plants, are unable to avoid freezing. They produce AFPs not to prevent the growth of ice, but to minimize the damage done by freezing [5-7]. Here, AFPs inhibit the recrystallization of ice, where lethal large crystals form at the expense of small ones at high subzero temperatures or during freeze–thaw cycles [8, 9].

To date, AFPs have been characterized as small (3–30 kDa), single-domain proteins. However, an exceptionally large AFP of ~ 1.5 MDa was isolated from the Antarctic Gram-negative bacterium Marinomonas primoryensis (MpAFP) [10]. MpAFP is divided into five distinct regions by the highly repetitive region II (RII) and the moderately repetitive region IV (RIV) [11, 12]. The 34-kDa RIV completely accounts for the antifreeze activity of MpAFP, and its crystal structure reveals a β-solenoid fold that consists of 13 repeats-in-toxin (RTX) repeats [13]. It was originally speculated that MpAFP might be localized in the periplasmic space of M. primoryensis, where it could bind and inhibit the growth of embryonic ice crystals from the extracellular environment before they could damage the cell [10]. Subsequently, MpAFP was localized to the outside surface of the bacteria [12]. Moreover, as the 322-residue RIV constitutes only ~ 2% of MpAFP, freeze resistance, which requires high millimolar concentrations of AFP, is unlikely to be the main role of this giant protein. In contrast, the non-ice-binding RII consists of ~ 120 tandem copies of identical 104-residue repeats that account for ~ 90% of the full-length protein.

Many extremely large (> 2000 residues) RTX-repeat-containing proteins from Gram-negative bacteria function as loosely attached adhesins [14]. In addition to their remarkable size, these adhesion proteins have strikingly similar domain architectures to MpAFP. They typically consist of many (> 25) tandem repeats that are 80–100 residues in length arranged in a similar manner as in MpAFP_RII near the N-terminus, and several RTX repeats near the C-terminus. On the basis of the evidence that MpAFP is attached to the cell surface of M. primoryensis [12], we have reassessed MpAFP as an ice-binding adhesin that functions to bind the strictly aerobic bacterium to ice in the upper reaches of the Antarctic lake, where oxygen and nutrients are most abundant [15, 16].

As a key function of MpAFP probably resides in the multiple RII domains, we set out to determine the three-dimensional structure of one of these domains to gain insights into how RII helps MpAFP bind its host to ice. Previous bioinformatic analyses showed that RII belongs to a family of bacterial proteins that has been annotated as the putative flagellar system-associated repeats (SWM_repeats; Pfam PF13753) [12]. This family of proteins contains > 4900 sequences from > 500 species, but no structure has been reported to date [17]. The SWM_repeats form part of the E-set clan of the immunoglobulin (Ig)-like fold superfamily, and are closely related to the bacterial Ig-like (BIg) domains, which are divided into eight subfamilies. Although eukaryotic proteins that contain Ig-like modules are commonly found in the immune system and are involved in other important processes, such as cell recognition and adhesion [18, 19], less is known about the BIg domains, owing to a lack of detailed structural information [20, 21].

Here, we report the 1.35-Å resolution crystal structure of the RII monomer, which folds as a novel Ca2+-bound Ig-like domain. The Ca2+-dependent folding of the 104-residue repeat was demonstrated by CD, and suggested a noncanonical folding pathway for the Ig-like β-sandwich domain. The potential architecture of the incredibly large RII was revealed by Ca2+-mediated contact between the monomers in the crystals. To our knowledge, the RII monomer is the first solved structure of this type of domain that requires Ca2+ for folding. This work is relevant to other bacterial proteins that contain Ig-like domains, especially those of the large RTX adhesins, which are involved in biological processes such as biofilm formation and epithelial cell infection.


Structural homology analyses delineated a single 104-residue RII domain

We originally designed the 104-residue RII monomer to begin and end with the sequences TTGSSTHTVD and SSDAAGNTVD, respectively. However, when we used size-exclusion chromatography to determine the elution volume (Ve)/total column volume (Vt) ratio for the RII monomer in the presence of 5 mM Ca2+, the RII monomer appeared much larger than expected. It eluted from a calibrated Superdex-75 column with a Ve/Vt value of 0.54 (Table 1), which suggested a slightly higher molecular mass than that of conalbumin (75 kDa), which has a Ve/Vt value of 0.56. As the actual molecular mass (mact) of the RII monomer is 12.4 kDa, the protein appeared to have aggregated, which might explain why crystallization efforts with this construct proved futile.

Table 1. The mact, mapp and Ve/Vt values calculated for the protein standards and the two different constructs of the RII monomer. Note: the Ve for blue dextran indicates void volume (Vo), whereas the Ve for NaCl indicates the total volume of the Superdex-75 size-exclusion column. NA, not applicable.

Proteins/salt | mact (kDa) | mapp (kDa) | Ve/Vt
Blue dextran | 2000 | NA | 0.44
RII monomer (original) | 12.4 | 86.6 | 0.54
RII monomer (new) | 12.4 | 21.3 | 0.68

As the 104-residue sequence is tandemly repeated ~ 120 times, it was difficult to define the exact residues where the repeating sequence of the module begins and ends. A Protein Homology/analogy Recognition Engine (Phyre2) structure prediction on two tandem 104-residue repeats suggested better start/end sites for the domain (Fig. 1A) [22]. According to this prediction, the 10 N-terminal residues (TTGSSTHTVD) are taken from the C-terminus of the preceding domain, which leaves an exposed hydrophobic groove (Fig. 1B). This N-terminal extension could fold as a β-strand to invade and replace the missing C-terminal β-strand of another domain. Such strand invasion, leading to multimeric chains, could explain the protein's propensity to form larger structures in solution. On the basis of the Phyre2 prediction, we reassessed the sequence of RII by shifting each 104-residue repeat 10 residues towards the C-terminus of MpAFP, to start at TEATAGTVTV and end in TTGSSTHTVD (Fig. 1A). A second Phyre2 analysis of the newly designed RII dimer sequence illustrated that the junction between the two domains now resided in the linker region between the two intact β-sandwiches (Fig. 1C). In the presence of 5 mM Ca2+, the size-exclusion chromatography profile for the newly defined 104-residue domain showed a single symmetric peak (data not shown) with a Ve/Vt value of 0.68, which indicates a size slightly larger than that of the 17.7-kDa myoglobin standard (0.69; Table 1).
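The step from a Ve/Vt ratio to an apparent mass can be illustrated with a rough log-linear interpolation. The sketch below is an assumption-laden illustration, not the calibration used for Table 1: it uses only the two standards quoted in the text (conalbumin and myoglobin), and the function name `apparent_mass` is hypothetical. Because the full standard set is not listed here, the estimates are expected to differ somewhat from the tabulated mapp values of 86.6 and 21.3 kDa.

```python
import math

# Two calibration standards quoted in the text: (Ve/Vt, molecular mass in kDa).
# Assumption: log10(mass) varies linearly with Ve/Vt between these two points.
STANDARDS = [(0.56, 75.0), (0.69, 17.7)]  # conalbumin, myoglobin


def apparent_mass(ve_vt, standards=STANDARDS):
    """Estimate apparent mass (kDa) by log-linear interpolation between two standards."""
    (x1, m1), (x2, m2) = standards
    slope = (math.log10(m2) - math.log10(m1)) / (x2 - x1)
    log_mass = math.log10(m1) + slope * (ve_vt - x1)
    return 10 ** log_mass


original = apparent_mass(0.54)    # original construct, Ve/Vt = 0.54
redesigned = apparent_mass(0.68)  # redesigned construct, Ve/Vt = 0.68

print(f"original construct:   ~{original:.0f} kDa")
print(f"redesigned construct: ~{redesigned:.0f} kDa")
```

Even this two-point estimate places the original construct above the conalbumin standard and the redesigned construct just above myoglobin, which is the qualitative conclusion drawn from the elution data.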

Figure 1.

Phyre2 models of the RII dimer sequence. (A) Amino acid sequence and the secondary structure representations of MpAFP_RII. Residues 1–104 (not counting residues from the His-tag) define the new RII monomer construct, which is flanked by partial sequences from the preceding and subsequent 104-residue RII repeats (purple). The black arrows mark the boundary residues of the RII monomer, and the purple arrows point to those of the original construct. Secondary structures for individual residues correspond to the P1 structure (gray; Fig. 3A). Side-chain oxygen atoms from the boxed residues help to coordinate intramolecular Ca2+ ions, and each of them is identified by a +. Xs identify the two N-terminal residues that were not observed in the electron density maps. (B) The Phyre2 model of the originally designed RII dimer sequence shows two Ig-like β-sandwich folds arranged in tandem. The N-terminal 104 amino acids of the protein are in gray, and the C-terminal 104 amino acids are in purple. The N-terminal Ig-like fold contains seven antiparallel β-strands, one of which is contributed by the C-terminal domain. (C) The Phyre2 model of the RII dimer with the newly defined boundary residues. The color scheme is the same as in (B).

CD indicates that Ca2+ is indispensable for the folding of MpAFP_RII

We used CD to monitor the protein's secondary structure in the presence of excess EDTA or Ca2+, and in the absence of both (Fig. 2). In the presence of 0.1 mM EDTA, the RII monomer appeared to be unstructured, as its far-UV CD spectrum contained a single negative peak at 198 nm. Deconvolution of the CD spectrum (Table 3) indicated that 88% of the RII monomer was in a random coil form, whereas only 3% was α-helical and 9% was β-stranded. Similar spectra were obtained in the absence of both EDTA and Ca2+. To determine the effect of Ca2+ on the folding of the RII monomer, CaCl2 was added to individual aliquots to achieve 1 : 1, 2 : 1, 3 : 1, 4 : 1, 5 : 1, 10 : 1 and 20 : 1 molar ratios of CaCl2/RII monomer, and far-UV CD spectra were collected for each sample. As increasing amounts of CaCl2 were added, the CD spectra showed a decrease in negative ellipticity that was appreciable at two molar equivalents of CaCl2/RII monomer (cyan trace in Fig. 2A). At a 3 : 1 molar ratio of CaCl2/RII monomer, the spectrum showed a strong positive peak at 194 nm and a broad negative peak at ~ 218 nm, which resembled those observed with proteins containing primarily β-sheets. An isodichroic point appeared at ~ 210 nm, the presence of which is suggestive of a change in the protein's conformation. The positive ellipticity increased and similar CD profiles were obtained when the protein was measured in up to 10 molar equivalents of CaCl2. The spectra recorded for the RII monomer in 10 and 20 molar equivalents of CaCl2 were nearly identical, indicating that the RII monomer was fully folded as a structure enriched with β-sheet in a 10-fold ratio of Ca2+ (12% α-helix, 38% β-strand, and 50% coil and turn). The fractional change in CD intensity at 194 nm was plotted as a function of total Ca2+ concentration (Fig. 2B). The resulting plot showed a sigmoidal profile, illustrating cooperative induction of a β-sheet-rich conformation of the RII monomer upon binding to Ca2+. The data were fitted with the nonlinear fitting function of gnuplot [23] to the equation relating the concentration of bound sites to total binding sites.

\[
\nu = \frac{[\mathrm{Ca}^{2+}]^{h}}{K_{50}^{h} + [\mathrm{Ca}^{2+}]^{h}}
\]

where ν is fractional saturation, h is Hill's constant (2.9), and K50 is the [Ca2+] required for half-maximal folding of the RII monomer. K50 was calculated to be 62 μM, which is approximately two times the protein concentration (30 μM).
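To make the fitted parameters more concrete, the following minimal sketch evaluates the fractional saturation at each titration point. It assumes that the fitted equation has the standard Hill form, ν = [Ca2+]^h / (K50^h + [Ca2+]^h), and that the total Ca2+ concentration at each molar ratio is simply the ratio multiplied by the 30 μM protein concentration. The function name `hill_saturation` is illustrative only; this is not the gnuplot fit itself.

```python
def hill_saturation(ca_um, k50_um=62.0, h=2.9):
    """Fractional saturation from the standard Hill equation (assumed form).

    ca_um  -- total Ca2+ concentration in micromolar
    k50_um -- Ca2+ concentration giving half-maximal folding (62 uM in the text)
    h      -- Hill's constant (2.9 in the text)
    """
    return ca_um ** h / (k50_um ** h + ca_um ** h)


PROTEIN_UM = 30.0  # protein concentration stated in the text

# Molar ratios of CaCl2/RII monomer used in the titration.
for ratio in (1, 2, 3, 4, 5, 10, 20):
    ca = ratio * PROTEIN_UM
    print(f"{ratio:>2} : 1   [Ca2+] = {ca:5.0f} uM   nu = {hill_saturation(ca):.2f}")
```

By construction, the function returns 0.5 when [Ca2+] equals K50. With the reported parameters, the curve is close to saturation at a 10 : 1 ratio, in line with the nearly identical spectra at 10 and 20 molar equivalents.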

The effect of increasing temperature (from 1 to 53 °C) on the folding of the RII monomer was also determined in the presence of 10 molar equivalents of CaCl2 (Fig. 2C). Although a gradual decrease in the CD intensity was recorded, the RII monomer remained largely in its native conformation up to a temperature of 38 °C. When the temperature reached 40 °C, an isodichroic point appeared at ~ 210 nm, suggesting a change in the protein's conformation. The CD intensity continued to decline, and at 48 °C the spectrum began to show a broad negative peak at 198 nm, which indicated that the RII monomer was predominantly unstructured. The CD spectrum measured at 50 °C was nearly identical to that at 53 °C, which led us to conclude that the RII monomer was fully denatured at 50 °C.

The RII monomer crystallized at high protein and Ca2+ concentrations

The RII monomer is exceptionally soluble in the presence of Ca2+, and was concentrated to ~ 80 mg·mL⁻¹ for crystallization. Crystals of the RII monomer grew at room temperature in two different precipitant solutions that included either 0.2 M calcium acetate or 0.2 M calcium chloride. No crystal growth was observed in precipitant solutions that contained little or no Ca2+. The RII monomer crystals obtained from the two different growth conditions have space groups of P21 and P1 (Table 2), and their structures were solved to resolutions of 2.4 Å and 1.35 Å, respectively. The P21 structure was determined by the single-wavelength anomalous dispersion (SAD) method with Ca2+ as the heavy atom to obtain phase information, and was then used as the search model to solve the high-resolution structure (P1) by molecular replacement. The electron density maps were well defined, and > 90% of the structure was automatically built by PHENIX AutoBuild and ARP/wARP from CCP4 [24, 25]. Apart from the N-terminal His-tag, only the first two residues (TE) of the RII monomer lacked sufficient electron density for unambiguous modeling.

Table 2. Data collection and refinement statistics for MpAFP_RII.

Data collection | SAD dataset | Native dataset
Space group | P21 | P1
Cell dimensions | |
(a, b, c) (Å) | 28.69, 43.02, 32.26 | 25.64, 28.62, 32.25
(α, β, γ) (°) | 90, 96.91, 90 | 97.02, 112.93, 96.88
Resolution (Å) | 32.03–2.12 (2.26–2.12) | 29.19–1.35 (1.42–1.35)
No. of molecules/asymmetric unit | 1 | 1
I/σI | 17.0 (3.2) | 19.6 (10.3)
Rmerge | 0.079 | 0.049
Completeness | 89.5 (69.6) | 95.1 (91.6)
Refinement | |
Resolution (Å) | 25.7–2.42 (2.48–2.42) | 29.19–1.35 (1.42–1.35)
No. of reflections | 2605 (150) | 16 337 (1107)
Rwork/Rfree (%) | 14.4/23.4 | 13.0/16.5
No. of atoms: protein/ligand/water | 691/5/66 | 700/13/140
B-factors (Å²): protein/ligand/water | 21.6/30.1/23.1 | 8.7/11.3/21.4
rmsd values | |
Bond lengths (Å) | 0.012 | 0.025
Bond angles (°) | 1.54 | 2.27

Table 3. Deconvolution of the CD spectra for the RII monomer measured in the presence of EDTA and increasing amounts of CaCl2.

Secondary structures | 0.1 mM EDTA | 1 MEq CaCl2 | 2 MEq CaCl2 | 3 MEq CaCl2 | 10 MEq CaCl2
α-Helix (%) | 3 | 5 | 7 | 11 | 12
Turn + coil | 88 | 86 | 77 | 65 | 50

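
Table 3 as reproduced here lists only the α-helix and turn + coil fractions. As a rough cross-check, the β-strand fraction can be estimated by difference, under the assumption that the three classes sum to 100% at every condition. This assumption holds for the two conditions whose full breakdown is given in the text (3/9/88% in EDTA and 12/38/50% at 10 MEq). The intermediate values produced below are inferred, not reported, and the function name is illustrative.

```python
# Values transcribed from Table 3 (percent), in column order.
CONDITIONS = ["0.1 mM EDTA", "1 MEq", "2 MEq", "3 MEq", "10 MEq"]
HELIX = [3, 5, 7, 11, 12]
TURN_COIL = [88, 86, 77, 65, 50]


def strand_by_difference(helix, turn_coil):
    """Estimate beta-strand content, assuming helix + strand + turn/coil = 100%."""
    return [100 - a - c for a, c in zip(helix, turn_coil)]


strand = strand_by_difference(HELIX, TURN_COIL)

for name, value in zip(CONDITIONS, strand):
    print(f"{name:>12}: estimated beta-strand = {value}%")
```

The estimates for the EDTA and 10 MEq conditions reproduce the 9% and 38% β-strand values quoted in the text, which supports the sum-to-100 assumption for this dataset.
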
Figure 2.

CD spectra of the RII monomer measured in Ca2+ titration and thermal denaturation experiments. (A) The far-UV CD spectra of the RII monomer were plotted as molar ellipticity versus wavelength. The CD spectrum in the presence of 100 μM EDTA is indicated by a black hatched line, and the CD spectrum in the absence of both EDTA and Ca2+ is indicated by a blue continuous line. The CD spectra in the presence of 1, 2, 3, 4, 5, 10 and 20 MEq of CaCl2 with respect to the RII monomer are indicated by brown, cyan, green, pink, orange and purple continuous lines and a red hatched line, respectively. Arrows point to the blue trace and black-hatched trace at the bottom, the cyan trace in the middle, and the purple trace and red-hatched trace at the top. (B) Fractional change in CD intensity at 194 nm plotted as a function of total Ca2+ concentration. (C) The far-UV CD spectra recorded for the RII monomer in the presence of 10 MEq of Ca2+ measured at various temperatures. From 1 to 53 °C, the spectra are represented by a black continuous line (1 °C), a blue hatched line (4 °C), brown (25 °C), cyan (35 °C), green (38 °C), pink (40 °C), orange (42 °C), purple (45 °C), red (48 °C) and yellow (50 °C) continuous lines, and a black hatched line (53 °C). Arrows point to the spectra recorded at 1 °C at the top, 40 °C in the middle, and 50 and 53 °C at the bottom.

The RII monomer is a Ca2+-bound Ig-like β-sandwich

The RII monomer folds as a Ca2+-bound Ig-like β-sandwich with a length of 50 Å, a width of 23 Å, and a height of 28 Å (Fig. 3A). The fold contains seven antiparallel β-strands arranged in a Greek key topology, and two short α-helical elements located close to the C-terminal end of the structure. On the basis of crystallography, the secondary structure content of the RII monomer (11% α-helix, 50% β-strand, and 39% coil + turn) deviates only slightly from the CD prediction (Table 3). Three β-strands (β1, β2, and β5) form one β-sheet that packs against a β-hairpin (β3 and β4) through hydrophobic interactions. These five β-strands (β1–β5) and the two α-helices (α1 and α2) form the compact core region of the β-sandwich. A distinct feature of the RII monomer that deviates from many Ig-like β-sandwiches resides in the C-terminal β-hairpin (β6 and β7), which protrudes from the core region and is therefore more solvent-accessible. There are seven Ca2+ ions bound to the P1 structure (Fig. 3A); these include the ones coordinated by residues from entirely within the monomer (intramolecular Ca2+) and those that reside in the interface between one 104-residue repeat and the symmetry-related molecules (intermolecular Ca2+). In contrast, only five Ca2+ ions were bound to the P21 structure. An alignment of the P1 and P21 structures with PyMOL produced an rmsd of 0.19 Å, indicating only minor conformational differences between the two structures (not shown). Close inspection of the structural alignment revealed that β6* and β7* in the P21 structure are three residues longer than β6 and β7 in the P1 structure, where these residues were represented as portions of the flexible coils on the sides of the β-strands.

Figure 3.

Structures of the RII monomer. (A) The structures of the P1 (gray) and P21 (orange) crystal forms are illustrated in cartoon representation, and the Ca2+ atoms are shown as spheres. The N-terminus (left) and the C-terminus (right) of the P1 structure are indicated. Secondary structures and Ca2+ ions are indicated by regular numbers in the P1 structure, and by numbers with asterisks (*) in the P21 structure. Intramolecular Ca2+ ions are colored light and dark green, and intermolecular Ca2+ ions are colored blue. (B) Structural alignment of the Phyre2-predicted RII monomer fold (purple) and the P1 structure of the RII monomer (gray). The N-terminus and C-terminus are indicated as above.

The RII monomer structure is stabilized by the intramolecular Ca2+ ions

The 104-residue repeat sequence of MpAFP_RII is characterized by a high percentage of short-chain amino acids, such as Thr (20.2%), Ala (16.3%), Val (12.5%), Gly (10.6%), and Ser (8.7%). It is also rich in acidic residues (Asp, 11.5%; Glu, 5.8%) and has no Lys or Arg residues, giving the protein a low pI of 3.17 [26]. In the P1 structure, four acidic residues (Asp16, Asp17, Asp76, and Glu23; Fig. 1A) contribute side-chain oxygen atoms to the heptacoordination of Ca2+ 1, 2 and 3 (Figs 3A and 4A). Ca2+ 1 appears to be the most intimately bound Ca2+, as it is locked into place by five oxygen atoms from Glu23, Thr14, Asp16 and Val18, as well as two water molecules (Fig. 4B). The only main-chain coordinating atom is the carbonyl oxygen of Val18. The binding of Ca2+ 1 mediates the interaction between α1 and the coil on the C-terminal flank of β1 (Fig. 4A), which helps to keep the α and β elements in close proximity to each other.
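The percentages quoted above can be converted back to approximate residue counts per 104-residue repeat. The sketch below assumes that each percentage is a whole-number count divided by 104 and rounded to one decimal place; the counts are therefore back-calculated estimates rather than values taken from the sequence itself, and the function name is illustrative.

```python
REPEAT_LENGTH = 104  # residues per RII repeat

# Composition percentages quoted in the text.
PERCENT = {
    "Thr": 20.2, "Ala": 16.3, "Val": 12.5, "Gly": 10.6,
    "Ser": 8.7, "Asp": 11.5, "Glu": 5.8,
}


def residue_counts(percent, length=REPEAT_LENGTH):
    """Back-calculate whole-residue counts from percentage composition."""
    return {aa: round(p * length / 100) for aa, p in percent.items()}


counts = residue_counts(PERCENT)
acidic = counts["Asp"] + counts["Glu"]

for aa, n in counts.items():
    print(f"{aa}: ~{n} residues")
print(f"acidic residues (Asp + Glu): ~{acidic} of {REPEAT_LENGTH}")
```

Under this assumption, roughly 18 of the 104 residues are acidic and none are Lys or Arg, which is consistent with the very low pI reported.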

Figure 4.

Stabilization of the RII monomer by intramolecular Ca2+ 1, 2, and 3. (A) An overview of Ca2+ 1, 2 and 3 bound to the P1 structure of the RII monomer. The arrow points to the C-terminal loop that follows β7. Secondary structures near the Ca2+-binding sites are identified. The color scheme is the same as in Fig. 3A. (B) Enlarged view of the binding site of Ca2+ 1. The side chains and main chains of the Ca2+ 1-coordinating residues are shown in stick representation. Hatched lines indicate hydrogen bonds. Oxygens are in red, nitrogens are in blue, and water molecules are shown as small aqua spheres. (C, D) The binding sites of Ca2+ 2 and 3 are shown in the same manner as for Ca2+ 1 in (B).

Ca2+ 2 and 3 are more solvent-accessible than Ca2+ 1, as each is coordinated by four water molecules and only three oxygen atoms from the protein (Fig. 4C,D; Fig. S2). Ca2+ 2 binds to Asp17 and Thr100, where the two distantly linked residues are found near to the N-terminus and C-terminus of the sequence, respectively. Thus, Ca2+ 2 mediates the long-range interaction by acting as a bridge that links the two faces of the β-sandwich fold. In contrast, Ca2+ 3 binds to Asp76 and Ala78, which are separated by only one residue (Thr77). The Ca2+ binding helps to make an ~ 90° turn of the coil linking α2 and β6, which guides the β-strand to maintain contact with the core region of the fold.

Two other intramolecular Ca2+ ions (6 and 7) are also each coordinated by four waters (Fig. S1). However, Ca2+ 6 is coordinated by only one oxygen atom from Gly51, whereas Ca2+ 7 is coordinated by two oxygen atoms from the side chain and main chain of Ser86. Thus, these weakly bound intramolecular Ca2+ ions do not appear to have a significant impact on the fold of the RII monomer. No such Ca2+ ions (6 and 7) were observed at the corresponding sites in the P21 structure.

Crystal packing of the RII monomer was facilitated by intermolecular Ca2+ ions

In addition to the intramolecularly bound Ca2+ ions that appear to stabilize the fold of the RII monomer, Ca2+ ions are also shared between each RII monomer and the symmetry-related molecules (Fig. 5). Two such Ca2+ ions (Ca2+ 4 and 5) were found in the P1 structure, whereas three (Ca2+ 3*, 4*, and 5*) were found in the P21 structure.

Figure 5.

Intermolecular Ca2+ coordination between the RII monomer and its symmetry-related molecules. (A) The coordination of Ca2+ 4 and 5 by the monomer (P1, gray) and two symmetry-related molecules (Sym A and Sym B, purple). The color scheme for the Ca2+ ions is the same as described in Fig. 3A. The arrows point to Ca2+ 4 and 5. (B) Enlarged view of the binding sites of Ca2+ 4 and 5. The side chains and main chains of the Ca2+-coordinating residues are shown in stick representation. The color scheme is the same as in (A) and Fig. 4B. (C) The coordination of Ca2+ 3* (indicated by an arrow) by the monomer (P21, orange) and a symmetry-related molecule (Sym A*, purple). (D) Enlarged view of the binding sites of Ca2+ 3*. The side chains and main chains of the Ca2+-coordinating residues are shown in stick representation. The color scheme is the same as in (B) and (C).

Ca2+ 4 and 5 appeared in the same sites as those of Ca2+ 4* and 5*, which bind to residues from the symmetry-related molecules (Sym A and Sym B) as well as the monomer (Fig. 5A). Sym B is located towards the C-terminal flank of the monomer, where the two molecules interact in a head-to-tail fashion via Ca2+ 5; Sym A is located above the interdomain region between the monomer and Sym B, and binds to both Ca2+ 4 and Ca2+ 5. As shown in Fig. 5B, Ca2+ 4 is more solvent-accessible than Ca2+ 5, as it binds to four water molecules in addition to Asp104 (monomer) and Glu52 (Sym A). Furthermore, Ca2+ 4 is not coordinated by Sym B. In contrast, only two water molecules bind to Ca2+ 5, and its other ligands are six oxygen atoms from Asp104 (P1 monomer), Glu52 (Sym A), and Asp38 (Sym B).

Ca2+ 3* (P21) is bound at the same position as Ca2+ 3 (P1) with respect to the monomer, in which it binds to Asp76* and Ala78* to form a Ca2+-binding turn that appears to be important for the integrity of the β-sandwich fold. However, Ca2+ 3* is also shared between the P21 monomer and one of its symmetry-related molecules (Sym A*; Fig. 5C). In addition to the monomer and two water molecules, the heptacoordination of Ca2+ 3* is completed by two oxygen atoms from residues (Glu63 and Asp59) of Sym A* (Fig. 5D). This intermolecular Ca2+ helped to pack Sym A* below the P21 monomer in an antiparallel manner within the crystal.

Bioinformatic analyses revealed no other solved structures with homology to MpAFP_RII

A blastp search of the 104-residue repeat sequence against the nonredundant protein database (expect threshold = 10⁻⁴) at NCBI identified a minimum of 538 homologous proteins that contain at least one similar ~ 100-residue repeat. Obtaining a precise estimate of the total copy number of similar ~ 100-residue repeats in the database is difficult, owing to incomplete sequencing (at either the N-terminus or C-terminus, or sometimes both) of many of the proteins encountered. The repeat is found predominantly in extremely large (typically thousands of amino acids) putative virulence factors, adhesins, and hypothetical proteins produced by Gram-negative γ-proteobacteria such as Vibrio, Shewanella, and Acinetobacter. A second blastp search of a single 104-residue repeat sequence against the Protein Data Bank (PDB) database identified no sequences homologous (expect threshold = 1) with proteins of known structures.

We therefore used the Dali server to search for folds with a similar topology to the RII monomer structure, which identified many Ig-like β-sandwich domains from both eukaryotic and prokaryotic proteins [27]. These folds (90–100 amino acids) typically appear as tandem segments within larger proteins (> 450 amino acids) that have a broad range of functions, including the bacterial collagenases, putative bacterial membrane-anchored proteins, and the human complement C3-5 proteins. All of these proteins are localized in the extracellular space or on the cell surface, and the Ig-like folds usually function as the spacer or the adhesion domains. For instance, the top alignment was to an Ig-like structure of a β-1,4-mannanase from the soil bacterium Cellulomonas fimi, which produced a Z-score of 8.9 and an rmsd of 2.3 Å over 79 aligned residues. The exact function of this Ig-like domain is unknown; however, it has been proposed to act as a spacer between the catalytic domain and a carbohydrate-binding module [28]. Among all of the high-scoring structures retrieved from the Dali server, none aligned to the RII monomer with a Z-score of > 9 or an rmsd of < 1.9 Å. The highest sequence identity was 23%, which is close to the cutoff value for homology, and may reflect the high frequency of certain amino acids (Thr, Val, Asp, and Gly) in the Ig-like folds rather than a recognizable relationship by descent.

Although predictive structural homology studies gave key insights into better boundary residues for the 104-residue repeat sequence, some notable differences were identified when a Phyre2 model of the RII monomer was compared with the crystal structures. The Phyre2 server predicted neither α-helical elements nor bound Ca2+ ions, and the model folds as a relatively compact Ig-like β-sandwich without the outward projecting C-terminal β-hairpin observed in the RII monomer structures. A structural alignment (residues 3–104) of the model and the P1 structure produced an rmsd of 4.5 Å, which supported the observation of the differences discerned between the folds (Fig. 3B). With hindsight, these conformational disparities might explain why it was not possible to solve the RII monomer structure by molecular replacement with the Phyre2 model or its template structures (e.g. PDB: 3PE9, 3PDG and 2A9D).
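For readers unfamiliar with the rmsd figures quoted for structural alignments, the sketch below shows the final calculation for paired atoms. It assumes that the two structures have already been superimposed and that atoms are matched one-to-one; alignment tools perform a superposition step first, which is omitted here. The coordinates are hypothetical toy values, not taken from the RII structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superimposed atoms.

    Each argument is a list of (x, y, z) tuples in the same units (e.g. angstroms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical toy coordinates: three atoms, and a copy displaced by 1 unit along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]

print(rmsd(reference, reference))
print(rmsd(reference, shifted))
```

A uniform displacement of every atom by one unit gives an rmsd of one unit, so the 4.5 Å value for the Phyre2 model corresponds to a large average displacement relative to the 0.19 Å difference between the two crystal forms.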


The exceptionally repetitive nature of the tandem 104-residue sequence made attempts to define the exact boundary residues difficult, especially as part of the RII sequence extends into the adjoining portions of RI and RIII [12]. The original construct of the RII monomer had a propensity to aggregate, and subsequent crystallization trials were unsuccessful. After redesign of the 104-residue sequence boundaries with the Phyre2 server, the new construct of the RII monomer behaved as a monomeric species in solution with little or no tendency to aggregate. We were then able to crystallize the construct, and solved the structure of the RII monomer with X-ray methods, which revealed that the start and end residues of the originally designed 104-residue repeat reside in β7 and β6, respectively (Fig. 1A). As a result, the C-terminal hydrophobic groove of one monomer could associate with the N-terminal β-strand extension from another monomer, possibly in a manner similar to the donor-strand complementation mechanism by which bacterial pili assemble their Ig-like subunits [29]. This association between the individual monomers led to the formation of aggregates, which eluted from a size-exclusion column with a mapp that is approximately seven-fold larger than the mact.

The Ca2+ dependency of MpAFP_RII was first demonstrated by CD, which showed that the RII monomer requires a 10-fold molar excess of Ca2+ to be fully folded. The results corresponded well with the observation that crystallization of the RII monomer occurred only in the presence of 0.2 M Ca2+. Bioinformatics analyses demonstrated that no solved structures have sequence homology to the 104-residue repeat, indicating that the RII monomer might have a novel fold. Indeed, no search models for molecular replacement were identified in the PDB. Similarly, we were not able to obtain phase information from the Phyre2 models by molecular replacement, even though the structural homology tool was insightful in defining the boundary residues for the RII repeats. The high Ca2+ requirement of the RII monomer turned out to be helpful for the structure determination. As our in-house rotating anode X-ray generator is equipped with a Cr target, it is ideal for Ca2+-phasing. We were therefore able to solve the structure of the RII monomer by the SAD method with Ca2+ as the heavy atom to obtain phase information.

The importance of Ca2+ for proper folding of the RII monomer became more apparent after the crystal structure was elucidated. The 104-residue repeat folds as an Ig-like β-sandwich that binds to five or seven Ca2+ ions, three of which appear to help stabilize the Ig-like module. This Ca2+ requirement for folding is a distinct feature of MpAFP_RII as compared with all other reported BIg domain structures. Despite recent reports demonstrating that some BIg domains bind to Ca2+ [30-32], these modules fold in the absence of Ca2+. A number of Ig-like modules are known to fold via the nucleation–condensation mechanism, in which the proteins initiate folding by packing the nonconserved hydrophobic residues in the B-strand, C-strand, E-strand and F-strand of the β-sandwich [33]. Although hydrophobic residues are present in the corresponding β-strands of the 104-residue repeat (Val28 and Val29 in β2; Val45, Leu47 and Ile49 in β3; Trp64, Val66 and Val68 in β5; Val83 in β6), the RII monomer is predominantly unstructured in the absence of Ca2+ (Fig. 2; Table 3), suggesting that it may employ a novel folding pathway that is induced by Ca2+ binding. It was observed during CD that, upon addition of one molar ratio of Ca2+ to the unstructured RII monomer, there was an increase in the signal for α-helices, but no change for β-strands (Table 3). As Glu23 from α1 contributes two side-chain oxygen atoms to coordinate Ca2+ 1 (Fig. 4B), it is possible that this ionic interaction helps to fold α1 first, which then serves as a nucleus to initiate folding of the RII β-sandwich.

Two other intramolecular Ca2+ ions (Ca2+ 2 and 3) also appeared to be required to help maintain the proper fold of the RII monomer (Fig. 4C,D). The coordination of these two Ca2+ ions is atypical, as each of them binds to only three protein oxygen atoms in addition to four water molecules. This is not unprecedented, however, as crystal structures of some proteins have shown that one Ca2+ can be coordinated by up to six water molecules [34]. Furthermore, CD showed that the Ca2+-stabilized RII monomer can maintain its β-rich fold up to a temperature of 40 °C, which is unusual for a protein isolated from a psychrophilic organism that inhabits water at temperatures ranging from −1 to 1 °C [10].

Interestingly, Ca2+ also assists with the packing of the 104-residue repeats in both crystal forms, as the ions are coordinated at the interfaces between one monomer and its symmetry-related counterparts. This is especially important, as the intermolecular Ca2+ 5 (5*) helps one monomer to interact with the symmetry-mate on its C-terminal flank in a head-to-tail fashion. Because of the crystal symmetry, this Ca2+-mediated interaction is repeated throughout the crystal, which aligns the monomers into an extended, rod-like structure. This is reminiscent of some other Ig module-containing proteins found in both eukaryotes and bacteria, which use Ca2+ to stabilize their quaternary structures. For example, cadherins are a family of eukaryotic proteins that function to mediate cell–cell adhesion [35]. Although the individual Ig modules of the cadherin ectodomain fold into their native structures in the absence of Ca2+, connections between the successive ectodomains are rigidified by Ca2+. This Ca2+-mediated interaction is absolutely required for maintaining the proper distance between cells for adhesion. Studies on the bacterial S-layer proteins also showed that interdomain Ca2+ ions help to arrange individual BIg domains into their correct quaternary structures, which are necessary for the assembly of S-layers [36]. Again, these BIg domains do not require Ca2+ for folding. Therefore, it is likely that the tandem 104-residue repeats of RII mimic these Ig-like folds by using Ca2+ to stabilize their junction regions. The two N-terminal residues (Thr1 and Glu2) were not observed in the electron density maps, as they closely follow the disordered His-tag. However, these two residues and the C-terminal Asp104 comprise the intervening loop regions that link the 104-residue repeats, where they are likely to coordinate an intermolecular Ca2+ [Ca2+ 5 (5*)] (Figs 1A and 6A).

Figure 6.

Hypothetical architecture of the RII domain via Ca2+ coordination. (A) The monomer (gray) in the asymmetric unit and its flanking symmetry-related molecules (purple) interact via intermolecular Ca2+ ions in the crystals. (B) Ice that covers the surface of Ace Lake is shown as a thick black line with irregular pits underneath. The lake water is colored cyan. The oxygen content decreases with depth in the upper reaches of the lake (0–12 m), as shown by the gray triangle on the left. The corresponding depths for the oxygenated and anoxic zones are marked on the right. The oblong bacterium M. primoryensis has flagella, which are represented as two squiggles on the left side of the bacterium. MpAFP (RI–RV) and other molecules (short black lines) are located on the surface of the bacterium. RI, RIII and RV are represented by black spheres, and RII and RIV are represented by purple and red oval spheres, respectively. Intermolecular Ca2+ ions that rigidify the junction regions linking each 104-residue repeat are indicated by small blue spheres. Cross-hatched lines are used to represent the RII repeats not shown in the diagram. The objects in the diagram are not drawn to scale. RI, region I; RIII, region III; RV, region V.

If the quaternary structure of MpAFP_RII were stabilized by Ca2+ in a similar manner to that observed in the crystals, RII alone could be ~ 0.6 μm long, which is one-third of the length of the M. primoryensis cell. The Ca2+-rigidified chain of RII repeats would help to keep RIV a sufficient distance away from the cell surface layer laden with other proteins and polysaccharides that might otherwise hinder the interaction of RIV with the ice surface. Preliminary structural homology studies on RI, RIII and RV of MpAFP also identified potential Ca2+-binding sites in these domains. Taken together with the Ca2+ ions observed in the crystal structures of the RII monomer and RIV, this gives an estimate of > 600 Ca2+ ions that help to form the rigid rod structure of the full-length MpAFP. The remarkably high Ca2+ requirement of MpAFP is satisfied by high levels of divalent metal ions [Ca2+ (3–3.5 mm) and Mg2+ (35–40 mm)] from the shallow (0–12 m) areas of the brackish Ace Lake, which is located in the Vestfold Hills of Eastern Antarctica [15]. As the ice-covered lake surface allows only limited sunlight penetration, photosynthetic microorganisms swarm just underneath the ice, priming the shallower regions with a high concentration of oxygen and other nutrients, while the deeper areas of the lake (12–25 m) are anoxic. Therefore, the upper reaches of Ace Lake provide strictly aerobic microorganisms such as M. primoryensis with an ideal habitat.
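The size and Ca2+ estimates in this paragraph can be checked with simple arithmetic. The sketch below assumes ~ 120 repeats in RII and a minimum of five Ca2+ ions per repeat (both figures are taken from the text). The per-repeat length and the cell length are not stated here; they are back-calculated from the ~ 0.6 μm estimate and the one-third ratio, so they should be read as implied values rather than measurements. All names are illustrative.

```python
# Back-of-the-envelope check of the RII estimates quoted in the text.
N_REPEATS = 120            # ~120 tandem repeats in RII (approximate)
RESIDUES_PER_REPEAT = 104  # residues in one repeat
CA_PER_REPEAT_MIN = 5      # each repeat binds five or seven Ca2+; lower value used
RII_LENGTH_UM = 0.6        # estimated length of Ca2+-rigidified RII (micrometres)
RII_FRACTION_OF_CELL = 1 / 3  # RII length relative to the cell length


def min_ca_ions(n_repeats=N_REPEATS, ca_per_repeat=CA_PER_REPEAT_MIN):
    """Lower bound on Ca2+ ions bound by RII alone."""
    return n_repeats * ca_per_repeat


def implied_repeat_length_nm(rii_length_um=RII_LENGTH_UM, n_repeats=N_REPEATS):
    """Rise per repeat implied by the total length (assumption: uniform spacing)."""
    return rii_length_um * 1000.0 / n_repeats


def implied_cell_length_um(rii_length_um=RII_LENGTH_UM,
                           fraction=RII_FRACTION_OF_CELL):
    """Cell length implied by 'RII is one-third of the cell length'."""
    return rii_length_um / fraction


if __name__ == "__main__":
    print("RII residues:", N_REPEATS * RESIDUES_PER_REPEAT)
    print("Minimum Ca2+ ions in RII:", min_ca_ions())
    print("Implied length per repeat (nm):", round(implied_repeat_length_nm(), 2))
    print("Implied cell length (um):", round(implied_cell_length_um(), 2))
```

With these inputs, RII alone accounts for 600 Ca2+ ions, so the additional sites in the other regions bring the total above the > 600 estimate.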

As previously described, MpAFP resembles many other giant RTX adhesins from Gram-negative bacteria [12, 14]. Although low sequence identity (< 20%) is shared between the repetitive domains of these adhesins and MpAFP_RII, bioinformatics analyses have indicated that they are also tandem BIg modules. Indeed, the recently solved structure of such segments from the epithelial adhesin SiiE (Salmonella enterica) showed three tandem BIg domains with their intervening loop regions rigidified by Ca2+ coordination [37]. The authors proposed that the Ca2+-induced region of BIg repeats can help SiiE reach out beyond the Salmonella lipopolysaccharide layer, which is key to promoting the adhesion to host cells. This study provides supporting evidence for the hypothesized role of RII in helping M. primoryensis to bind to ice [12], which might be a general mechanism for large RTX adhesins.

All large RTX proteins, including MpAFP, need to be exported out of the cells through the type I secretion system (TISS). The TISS allows these extremely large proteins to be channeled across the inner and outer membranes as unfolded chains without forming any periplasmic intermediates. Indeed, these proteins are only likely to fold upon binding Ca2+ in extracellular environments (i.e. seawater and blood). Therefore, we hypothesize that BIg modules from other large adhesins may be dependent on Ca2+ for folding in the same manner as the RII monomer.

In conclusion, we have reported the crystal structure of a novel Ca2+-dependent BIg domain found in the Antarctic Gram-negative bacterium M. primoryensis, and its potential role in helping the organism bind to ice. This work may serve to give insights into many other large RTX proteins, which have been characterized as large adhesins secreted via the TISS.

Experimental procedures

Construct design and cloning of 312-bp repeat RII

A protein sequence of two tandem repeats of MpAFP_RII (208 amino acids), starting at TTGSSTHTVD (Fig. 1A) and ending at SSDAAGNTVD, was submitted to the Phyre2 server for homology modeling [22]. The resulting model helped to define the domain boundaries of the 104-residue RII repeat. The reassessed 104-residue domain begins at the N-terminal sequence TEATAG and ends in STHTVD. On the basis of this finding, a PCR product encoding a single 104-residue repeat was amplified from M. primoryensis genomic DNA with the forward primer 5′-GATCCATATGACAGAAGCGACGGCTGGTAC-3′ and the reverse primer 5′-CTAGCTCGAGCTAATCAACCGTATGAGTTGAAGAAC-3′ (the NdeI site, CATATG, and the XhoI site, CTCGAG, follow the first four bases of the respective primers; the stop codon, CTA in the reverse primer, immediately follows the XhoI site). Following restriction digestion, the PCR product was ligated into the corresponding sites of the pET-28a vector. Positive clones were identified by analytical restriction digestion followed by DNA sequencing (Robarts Research Institute, London, Ontario, Canada).
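As a consistency check on the primer design, the two primers can be translated in silico and compared with the stated domain boundaries (TEATAG at the N-terminus, STHTVD at the C-terminus). The following is a minimal sketch using only the primer sequences given above and the standard genetic code; the function names are illustrative and are not part of the original workflow.

```python
# In-silico check that the primers encode the stated domain boundaries.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons ordered T, C, A, G at each position.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

FORWARD = "GATCCATATGACAGAAGCGACGGCTGGTAC"
REVERSE = "CTAGCTCGAGCTAATCAACCGTATGAGTTGAAGAAC"
NDEI, XHOI = "CATATG", "CTCGAG"
INSERT_BP = 104 * 3  # coding length of one 104-residue repeat (312 bp)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0, stopping at the first stop."""
    peptide = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


def forward_peptide():
    """Residues encoded by the forward primer, starting at the ATG in NdeI."""
    start = FORWARD.index(NDEI) + 3  # CATATG contains the ATG start codon
    return translate(FORWARD[start:])


def reverse_peptide():
    """C-terminal residues encoded by the reverse primer, read on the coding strand."""
    coding = reverse_complement(REVERSE)
    stop_start = coding.index(XHOI) - 3  # the stop codon precedes the XhoI site
    assert CODON_TABLE[coding[stop_start:stop_start + 3]] == "*"
    return translate(coding[stop_start % 3:])


if __name__ == "__main__":
    print("Forward primer encodes:", forward_peptide())
    print("Reverse primer encodes:", reverse_peptide())
    print("Coding insert length (bp):", INSERT_BP)
```

The forward primer reads Met followed by TEATAG, where the Met codon is supplied by the NdeI site itself, and the coding-strand reading of the reverse primer ends in STHTVD followed by a TAG stop, in agreement with the boundaries described in the text.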

Expression and purification of the RII monomer

Positive clones were electroporated into the Escherichia coli BL21DE3 (star) expression cell line. A 1-L culture was grown in the presence of 100 μg·mL−1 kanamycin at 37 °C with shaking up to D600 nm = 0.5. The culture was then switched to 23 °C until D600 nm reached 1, whereupon protein production was induced by the addition of 1 mm isopropyl thio-β-d-galactoside, and growth was continued overnight at 23 °C with shaking. The cell pellet was recovered by centrifugation (4540 g for 30 min), and lysed by sonication in buffer containing 50 mm Tris/HCl (pH 9), 500 mm NaCl, and 2 mm CaCl2. Cellular debris and insoluble matter were removed by centrifugation for 0.5 h at 30 938 g in a JA25.5 rotor. The N-terminally His6-tagged protein was purified by Ni2+–nitrilotriacetic acid affinity chromatography. The lysate supernatant containing the RII monomer was mixed with 5 mL of Ni2+–nitrilotriacetic acid resin and brought to 200 mL in buffer N (50 mm Tris/HCl, pH 9, 500 mm NaCl, 5 mm imidazole, 2 mm CaCl2). Following stirring for 30 min, the resin was loaded into a column and washed with three column volumes of buffer N. The RII monomer was eluted with buffer N + 400 mm imidazole. The RII monomer was then buffer-exchanged into a solution of 50 mm Tris/HCl (pH 9), 200 mm NaCl and 5 mm CaCl2 with a centrifugal filter (Millipore, Billerica, MA, USA), and this was loaded onto a HiLoad 16/60 Superdex-200 size-exclusion column (GE Healthcare, Little Chalfont, UK) for further purification. Fractions containing the RII monomer were pooled, and an aliquot (500 μL) was loaded onto a 10/300 GL Superdex-75 size-exclusion column to evaluate the protein's folding. The RII monomer's Ve/Vt ratio was calculated and compared with those of the protein standards. The rest of the pooled fraction was buffer-exchanged into a solution of 20 mm Tris/HCl (pH 9) and 5 mm CaCl2, and concentrated to approximately 80 mg·mL−1 with a centrifugal filter. 
The protein concentration was measured with a Nanodrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA), and the purity was assessed by 10% SDS/PAGE.

CD of the RII monomer

The RII monomer was dialyzed against 10 mm Tris/HCl (pH 9) and 0.1 mm EDTA. Aliquots (200 μL) of the protein were diluted to a concentration of 30 μm in dialysis buffer. To determine the effect of Ca2+ on the folding of the RII monomer, CaCl2 was added to individual aliquots in order to achieve 0 : 1, 1 : 1, 2 : 1, 3 : 1, 4 : 1, 5 : 1, 10 : 1 and 20 : 1 molar ratios of CaCl2/RII monomer. Seven scans were taken for each sample at 23 °C with a Chirascan CD Spectrometer (Applied Photophysics, Leatherhead, Surrey, UK). All scans for each sample were averaged and buffer reference-subtracted, and three-point smoothing was applied to the data with proviewer software. Deconvolution of the spectra was performed with OLIS SpectralWorks (On-Line Instruments, Bogart, GA, USA). An aliquot of the 10 : 1 ratio of CaCl2/RII monomer was used for the thermal denaturation experiment, during which a range of temperatures from 1 to 65 °C was used to test the effect of increasing temperature on the folding of the RII monomer.
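For orientation, the molar ratios used in the titration can be converted into total CaCl2 concentrations and amounts per aliquot. The sketch below uses the protein concentration (30 μm) and aliquot volume (200 μL) given above. It is a simplification: it reports total added CaCl2 only, ignores any volume change on addition, and does not account for chelation by the 0.1 mm EDTA in the dialysis buffer. Function names are illustrative.

```python
# Total CaCl2 needed for each Ca2+/protein molar ratio in the CD titration.
PROTEIN_UM = 30.0    # RII monomer concentration (micromolar)
ALIQUOT_UL = 200.0   # aliquot volume (microlitres)
RATIOS = (0, 1, 2, 3, 4, 5, 10, 20)  # CaCl2 : RII monomer molar ratios


def total_cacl2_um(ratio, protein_um=PROTEIN_UM):
    """Total CaCl2 concentration (micromolar) for a given molar ratio."""
    return ratio * protein_um


def cacl2_nmol(ratio, protein_um=PROTEIN_UM, aliquot_ul=ALIQUOT_UL):
    """Amount of CaCl2 (nmol) in one aliquot; uM x uL gives pmol, so divide by 1000."""
    return total_cacl2_um(ratio, protein_um) * aliquot_ul / 1000.0


if __name__ == "__main__":
    for ratio in RATIOS:
        print(f"{ratio:>2} : 1  ->  {total_cacl2_um(ratio):6.1f} uM total CaCl2, "
              f"{cacl2_nmol(ratio):6.1f} nmol per aliquot")
```

On this basis, the 10 : 1 sample used for thermal denaturation corresponds to 300 μm total CaCl2.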

Crystallization, data collection, and structure determination

Two crystal forms were obtained with microbatch methods by mixing equal volumes (1 μL) of 80 mg·mL−1 RII monomer with precipitant solutions. The first crystal form was obtained in a precipitant solution of 0.2 m calcium acetate, 0.1 m sodium acetate (pH 4.5), and 30% (v/v) poly(ethylene glycol) 400. Single rod-shaped crystals formed at room temperature over a period of ~ 5 weeks. Prior to data collection, the crystal was flash-frozen in a cryo solution of 20% (v/v) glycerol and 80% (v/v) of the precipitant solution. Diffraction data from these crystals were collected in-house with a Rigaku MicroMax-007 HF rotating anode X-ray generator equipped with a Cr target and an R-AXIS IV++ image-plate detector fitted with a helium-flushed beam path cone to reduce air absorption. A total of 360 images were collected from a single crystal with an oscillation range of 1° and an exposure time of 5 min per image, with the crystal-to-detector distance set to 105 mm. Diffraction images were indexed and integrated with imosflm [38], and were scaled with scala [39]. The structure was solved with SAD methods in phenix autosol with Ca2+ as the heavy atom to obtain phase information [40, 41]. The initial model of the RII monomer was built with phenix autobuild [42]. The model was corrected and rebuilt with ccp4 buccaneer [24, 43], arp/warp 7.3.0, and coot 0.7 [25, 44]. The structure was refined with ccp4 refmac5 [45].
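The in-house data collection parameters imply the total rotation range and the total exposure time, which can be worked out directly. The sketch below uses only the numbers quoted above (360 images, 1° oscillation, 5 min per image) and ignores detector readout and other overheads, which are not reported.

```python
# Totals implied by the in-house data collection parameters.
N_IMAGES = 360          # images collected from a single crystal
OSCILLATION_DEG = 1.0   # oscillation range per image (degrees)
EXPOSURE_MIN = 5.0      # exposure time per image (minutes)


def total_rotation_deg(n_images=N_IMAGES, oscillation_deg=OSCILLATION_DEG):
    """Total crystal rotation covered by the data set (degrees)."""
    return n_images * oscillation_deg


def total_exposure_hours(n_images=N_IMAGES, exposure_min=EXPOSURE_MIN):
    """Total X-ray exposure time (hours), excluding readout overhead."""
    return n_images * exposure_min / 60.0


if __name__ == "__main__":
    print("Total rotation (deg):", total_rotation_deg())
    print("Total exposure (h):", total_exposure_hours())
```

The data set therefore spans a full 360° rotation and required about 30 h of exposure.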

Owing to difficulties in reproducing crystals in the above condition, a second precipitant solution containing 0.2 m CaCl2, 0.1 m Tris/HCl (pH 8.5) and 20% (w/v) poly(ethylene glycol) 4000 was used to grow crystals for high-resolution data collection. Crystallization occurred at room temperature, with long plate-like crystal clusters appearing after 4 weeks. Single, long plate-like crystals were released from the clusters with the use of a microneedle (Hampton Research, Aliso Viejo, CA, USA). Prior to data collection, the crystal was flash-frozen in a cryo solution of 20% (v/v) glycerol and 80% (v/v) of the precipitant solution. Data were collected at the 23-ID-B beamline of the Advanced Photon Source (Argonne National Laboratory, Lemont, IL, USA) via remote access. Diffraction images were indexed, integrated and scaled with the same methods as described above. The high-resolution structure was solved by molecular replacement with phenix phaser-mr [40, 46], with the low-resolution RII monomer structure as the search model. Manual building of the structure was performed in coot 0.7, and the structure was refined with ccp4 refmac5 [44, 45]. Molecular graphics images were prepared with pymol (The PyMOL Molecular Graphics System, Version; Schrödinger, LLC, Portland, OR, USA).
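Because each cryo solution was made from 80% (v/v) precipitant and 20% (v/v) glycerol, the precipitant components are diluted to 0.8 of their original concentrations. The sketch below works this out for both precipitant solutions described above. It assumes ideal (additive) mixing, and the dictionary keys are illustrative labels rather than names used by the authors.

```python
# Component concentrations in the cryo solutions (80% precipitant, 20% glycerol).
PRECIPITANT_FRACTION = 0.80

PRECIPITANTS = {
    "form_1": {"calcium acetate (M)": 0.2,
               "sodium acetate pH 4.5 (M)": 0.1,
               "PEG 400 (% v/v)": 30.0},
    "form_2": {"CaCl2 (M)": 0.2,
               "Tris/HCl pH 8.5 (M)": 0.1,
               "PEG 4000 (% w/v)": 20.0},
}


def cryo_composition(precipitant, fraction=PRECIPITANT_FRACTION):
    """Scale each precipitant component by the precipitant volume fraction."""
    return {name: value * fraction for name, value in precipitant.items()}


if __name__ == "__main__":
    for form, recipe in PRECIPITANTS.items():
        print(form)
        for name, value in cryo_composition(recipe).items():
            print(f"  {name}: {value:.2f}")
        print("  glycerol (% v/v): 20.00")
```

Under this assumption, both cryo solutions retain 0.16 m Ca2+, which may be relevant given the Ca2+ dependence of the fold.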


We thank K. Munro from the Protein Function Discovery at Queen's University for his help with acquiring and interpreting CD data, Z. Jia for sharing remote access to the synchrotron facilities in the Advanced Photon Source (Argonne National Laboratory), and S. Gauthier for other technical assistance. This work was funded by a grant from the CIHR to P. L. Davies. C. P. Garnham was the recipient of an NSERC-PGSD3 scholarship and an R. Samuel McLaughlin fellowship. P. L. Davies holds the Canada Research Chair in Protein Engineering.