The fimbrial adhesin F17-G of enterotoxigenic Escherichia coli has an immunoglobulin-like lectin domain that binds N-acetylglucosamine


  • Lieven Buts,

    1. Department of Ultrastructure, Institute for Molecular Biology, Vrije Universiteit Brussel, Vlaams Interuniversitair Instituut voor Biotechnologie (VIB), Brussels, Belgium.
    • LB and JB are joint first authors.

  • Julie Bouckaert,

    1. Department of Ultrastructure, Institute for Molecular Biology, Vrije Universiteit Brussel, Vlaams Interuniversitair Instituut voor Biotechnologie (VIB), Brussels, Belgium.

  • Erwin De Genst,

    1. Department of Ultrastructure, Institute for Molecular Biology, Vrije Universiteit Brussel, Vlaams Interuniversitair Instituut voor Biotechnologie (VIB), Brussels, Belgium.
  • Remy Loris,

    1. Department of Ultrastructure, Institute for Molecular Biology, Vrije Universiteit Brussel, Vlaams Interuniversitair Instituut voor Biotechnologie (VIB), Brussels, Belgium.
  • Stefan Oscarson,

    1. Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, Stockholm, Sweden.
  • Martina Lahmann,

    1. Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, Stockholm, Sweden.
  • Joris Messens,

    1. Department of Ultrastructure, Institute for Molecular Biology, Vrije Universiteit Brussel, Vlaams Interuniversitair Instituut voor Biotechnologie (VIB), Brussels, Belgium.
  • Elke Brosens,

    1. Department of Ultrastructure, Institute for Molecular Biology, Vrije Universiteit Brussel, Vlaams Interuniversitair Instituut voor Biotechnologie (VIB), Brussels, Belgium.
  • Lode Wyns,

    1. Department of Ultrastructure, Institute for Molecular Biology, Vrije Universiteit Brussel, Vlaams Interuniversitair Instituut voor Biotechnologie (VIB), Brussels, Belgium.
  • Henri De Greve

    Corresponding author
    1. Department of Ultrastructure, Institute for Molecular Biology, Vrije Universiteit Brussel, Vlaams Interuniversitair Instituut voor Biotechnologie (VIB), Brussels, Belgium.

E-mail; Tel. (+32) 2 629 19 11; Fax (+32) 2 629 19 63.


The F17-G adhesin at the tip of flexible F17 fimbriae of enterotoxigenic Escherichia coli mediates binding to N-acetyl-β-d-glucosamine-presenting receptors on the microvilli of the intestinal epithelium of ruminants. We report the 1.7 Å resolution crystal structure of the lectin domain of F17-G, both free and in complex with N-acetylglucosamine. The monosaccharide is bound on the side of the ellipsoid-shaped protein in a conserved site around which all natural variations of F17-G are clustered. A model is proposed for the interaction between F17-fimbriated E. coli and microvilli with enhanced affinity compared with the binding constant we determined for F17-G binding to N-acetylglucosamine (0.85 mM−1). Unexpectedly, the F17-G structure reveals that the lectin domains of the F17-G, PapGII and FimH fimbrial adhesins all share the immunoglobulin-like fold of the structural components (pilins) of their fimbriae, despite lack of any sequence identity. Fold comparisons with pilin and chaperone structures of the chaperone/usher pathway highlight the central role of the C-terminal β-strand G of the immunoglobulin-like fold and provide new insights into pilus assembly, function and adhesion.


Bacteria produce diverse surface organelles known as fimbriae or pili, which are involved in adhesion to eukaryotic cells (Low et al., 1996). In host–pathogen interactions, adhesion is the key event in the early stages of infection (Hultgren et al., 1996) and precedes colonization and/or invasion of host cells (Martinez et al., 2000). The specific recognition of host receptors determines host range and tissue tropism and is mediated by adhesins found at the tips or all along the length of the fimbriae.

F17 pili are 3-nm-wide, flexible, wire-like organelles of enterotoxigenic Escherichia coli built up of the major pilin F17-A and exposing the F17-G adhesin at their tips (Lintermans et al., 1988). The F17-G adhesin is the essential fimbrial adhesion factor that mediates attachment to intestinal microvilli, leading to diarrhoea or septicaemia in ruminants. The binding of F17-enterotoxigenic bacteria to microvilli can be inhibited by incubation with N-acetyl-d-glucosamine (GlcNAc; Girardeau, 1980) or GlcNAc oligomers (Bertin et al., 1996). The only fimbrial adhesins for which the structural basis of carbohydrate specificity has been characterized are the PapG adhesin (Dodson et al., 2001) of P pili, found on pyelonephritic E. coli (Svanborg Edén and Hansson, 1978; Normark et al., 1986), and the FimH adhesin (Hung et al., 2002) of type 1 pili, found on uropathogenic E. coli (Brinton, 1965; Martinez et al., 2000). The PapG- and FimH-dependent adhesion of E. coli uropathogens is inhibited by α-d-galactopyranosyl-(1-4)-β-d-galactopyranose (Lund et al., 1987) and α-d-mannose (Gaastra and de Graaf, 1982) respectively.

The appearance of the F17 fimbriae is similar to that of the tip fibrillum of type 1 and P pili. Type 1 and P pili are composed of a flexible tip fibrillum with an open helical structure connected to the end of a tightly wound helical rod made up of their major pilin subunit (Kuehn et al., 1992). F17 pili, like the better characterized type 1 and P pili, are assembled on the cell surface via the chaperone/usher pathway (Thanassi et al., 1998). Pilus subunits all have an incomplete immunoglobulin-like fold, lacking the canonical C-terminal β-strand G (Choudhury et al., 1999; Sauer et al., 1999; Barnhart et al., 2000). The absence of this strand creates a long groove on the pilin surface, exposing the hydrophobic core, and leads to pilin aggregation as well as activation of periplasmic proteases such as DegP (Krojer et al., 2002). However, when the unfolded pilins are exported to the periplasm, a specific chaperone complements for the missing strand by providing a donor strand, the C-terminal G1 strand of its N-terminal immunoglobulin-like domain, in a parallel β-strand pairing. The chaperone primes the pilin subunit for accepting the N-terminal extension of the next pilin subunit to be assembled into the pilus, as a donor strand in an antiparallel β-strand pairing. This mechanism takes place at the outer membrane usher and is called donor strand exchange. The donor strand exchange is associated with a topological transition in the pilin involving the closing of two loops to lock the extension strand in (Sauer et al., 2002).

Analogous to the FimH and PapG fimbrial adhesins, F17-G is a two-domain protein linking a C-terminal pilin domain with an N-terminal carbohydrate-specific lectin domain (Hultgren et al., 1989; Choudhury et al., 1999), which has no sequence identity to any structure in the Protein Data Bank (PDB) (Bernstein et al., 1977). The pilin domain connects the lectin domain of the adhesin to the pilus body and has the highly conserved and incomplete immunoglobulin-like fold of the other pilus subunits (Choudhury et al., 1999; Sauer et al., 1999). Consequently, the adhesin is not a stable entity. To circumvent this difficulty, crystal structures were determined by truncating the PapG protein to contain only the lectin domain (Dodson et al., 2001) and by purifying the chaperone/adhesin complex FimC/FimH (Choudhury et al., 1999) respectively.

To gain a detailed insight into the interaction between carbohydrate receptor and adhesin, we determined the crystal structure of the lectin domain of the F17-G adhesin in the presence and absence of GlcNAc and measured the sugar specificities of two natural variants of F17-G. The structure of the F17-G/GlcNAc complex shows that the carbohydrate-binding site is distinct from that observed in PapG and FimH. Furthermore, we observed that the F17-G lectin domain fits into the immunoglobulin-like fold family of the pilins and chaperones involved in the chaperone/usher pathway. We also for the first time identify this fold in the lectin domains of the PapG and FimH adhesins, which leads to interesting new insights into pilus assembly, function and adhesion.


Structure of the F17a-G lectin domain

The lectin domains of five natural F17-G adhesin variants (Fig. 1) were overexpressed and purified by affinity chromatography. Well-diffracting crystals were obtained for the F17a-G variant. The F17a-G lectin domain structure was solved by multiple wavelength anomalous diffraction (MAD) after soaking a crystal with methyl 2-acetamido-2-deoxy-1-seleno-β-d-glucopyranoside (β-Me-SeGlcNAc) and refined to 1.75 Å resolution. Using this model, the structures of the F17a-G/GlcNAc complex and of the ligand-free protein have been refined at 1.65 and 1.75 Å resolution respectively (Table 1).

The lectin domain has a compact, elongated shape based on a β-sandwich with two major sheets (Fig. 2A): a back sheet, consisting of five long strands (A2, G, F, C and D2) in mixed orientations, and a front sheet with four antiparallel strands (A1, B, E and D1). There are additional minor β-strands that extend other strands (A′, B′, F′ and G′) or form insertions (C′ and C′′). Strand nomenclature is based on the conventions established for antibody domains (Bork et al., 1994) and for PapD (Sauer et al., 1999). The surface loop comprising residues Thr22–Asp26, which connects strands A′ and B′ (Fig. 2A), is not visible in the electron density. This region corresponds to the 3–4 loop in FimH, which has been implicated in a shear stress detection mechanism (Thomas et al., 2002). The amino acid differences between the five natural F17-G variants (Fig. 1) were mapped onto the surface of the protein and found to be on the same face of the lectin domain as the binding site (Fig. 2B and C).

Figure 1.

Sequence alignment of the lectin domains of F17 adhesin variants F17a-G (Accession No. AF022140), F17b-G (Accession No. L14319), F17d-G (Accession No. L77091), F17e-G (210F17G, Accession No. AF055311) and F17f-G (377F17G, Accession No. AF055312). Amino acid differences are highlighted in light grey. Residues involved in interactions with the carbohydrate in F17a-G are on a black background. β-Strands are indicated with lines.

Table 1.  Crystal parameters and data collection statistics.

Crystal         Unit cell a = b (Å)   Unit cell c (Å)   Beamline, wavelength (Å)   Resolution range (Å)   Total/unique reflections   [inline image]   Completeness   Rmerge (%)
β-Me-SeGlcNAc   42.757                273.758           BW7A, 0.9611               50.0 (1.81)−1.75       959107/16346               10.3             97.1 (93.1)    6.6 (32.1)
GlcNAc          42.329                268.714           X13, 0.8019                25.0 (1.72)−1.65       944586/18689               8.5              >99.9 (99.9)   7.2 (55.2)
Ligand-free     42.862                285.702           BW7A, 0.9611               50.0 (1.81)−1.75       684511/16982               6.9              91.2 (86.1)    8.1 (54.6)

Values between parentheses are for the highest resolution shell. The space group for all crystals is P6122. Crystal mosaicity was approximately 0.6 degrees in all cases.

Figure 2.

A. Overall structure of the F17-G lectin domain. The strands of the back sheet are shown in yellow. The front sheet is in green and additional strands are in light blue. The binding site is indicated by the GlcNAc molecule in dark blue. The disulphide bridge connecting strands C and D2 is indicated in black.
B. Front face of the domain, with the carbohydrate-binding site indicated in yellow.
C. Back face, opposite the binding site. The natural amino acid differences between the five variants are indicated in red and the variation is found to be clustered on one side of the domain. The N- and C-termini of the domain are indicated. The three conserved lysine residues on the back face are in dark blue. Lysine amino groups are shown in light blue.

The N-acetyl-d-glucosamine-binding site

The F17a-G site binds GlcNAc and β-Me-SeGlcNAc in the same orientation (Fig. 3A). The site is formed by the carbonyl group of Ala43, the side chains of residues Asp88, Thr89, Trp109, Ser117, Thr118, Gln119 and the nitrogen of Gly120. These residues are conserved in the five cloned variants (Fig. 1), as well as in two additional alleles (CL114 and CL394; Cid et al., 1999). Interactions between the carbohydrate and the protein include 11 possible hydrogen bonds, of which four are mediated by water molecules, and the hydrophobic stacking of the Trp109 side-chain against the C5 and C6 atoms of the sugar. The N-acetyl group of GlcNAc contributes significantly to affinity compared with β-methyl-d-glucose (Table 2) due to a good complementarity of van der Waals surfaces between this group and the side-chains of Thr118 and Asn44, as well as the carbonyl group of Ala43 (Fig. 3B). GlcNAc binding shields approximately 140 Å2 of the protein surface from the solvent.

Figure 3.

A. Stereo view of GlcNAc in the binding site of F17a-G. Hydrogen bonds are indicated in green with heavy-atom distances in Å. Water molecules are represented as small spheres.
B. Illustration of the surface complementarity between GlcNAc and binding site. Trp109, which stacks against C5 and C6 of the sugar, is shown in dark green. Thr118(O) and Ser117(Oγ), which would clash with a substituent in the α-anomeric configuration, are in red. Ala43 and Asn44, which interact with the N-acetyl group, are shown in yellow.

Table 2.  Sugar specificities for two F17-G alleles determined by surface plasmon resonance.

Carbohydrate                              Ka (mM−1) F17a-G   Ka (mM−1) F17d-G
N-acetyl-d-glucosamine (b)                0.85 ± 0.09        1.00 ± 0.21
Chitobiose                                1.09 ± 0.04        0.97 ± 0.08
Chitotriose                               1.02 ± 0.08        1.33 ± 0.11
Chitotetraose                             1.62 ± 0.08        1.92 ± 0.11
β-Me-SeGlcNAc                             4.58 ± 0.17        4.59 ± 0.56
N-acetyl-d-glucosamine-(β1,2)-mannoside   2.78 ± 0.10        2.99 ± 0.14
Trisaccharide (a)                         0.64 ± 0.05        0.55 ± 0.08
Pentasaccharide (a)                       1.19 ± 0.04        1.24 ± 0.05
d-glucose (b)                             No binding         No binding
α-Methyl-d-glucose                        No binding         No binding
d-mannose (b)                             No binding         No binding
d-galactose (b)                           No binding         No binding
N-acetyl-d-galactosamine (b)              No binding         No binding

  • a. Trisaccharide, GlcNAc-(β1,2)-Man-(α1,6)-Man; Pentasaccharide, the biantennary N-linked core pentasaccharide GlcNAc(β1,2)Man(α1,3)[GlcNAc(β1,2)Man(α1,6)]Man.
  • b. Anomer mixture.
  • c. For these sugars, concentration-dependent binding was detected, but the association constant was below the threshold for accurate fitting (approximately 0.1 mM−1).

In the ligand-free protein structure, water molecules are found to mimic the O3 and O4 hydroxyl groups. Several other water molecules form a network that stacks against the indole ring of Trp109. Stacking of a water network with aromatic residues has been observed previously, usually in cases where a hydrogen bond interaction is also possible with the stacking residue (Lemieux, 1996). Indeed, in the structure of ligand-free F17a-G, one of the five water molecules makes a weak hydrogen bond interaction with the nitrogen atom of the Trp109 side-chain. The root mean square deviation (r.m.s.d.) for the Cα atoms between the ligand-free protein and the GlcNAc complex is 0.36 Å and there are no major rearrangements of side-chains involved in interactions with the sugar upon binding. The disulphide bridge between cysteines 53 and 110 connects one of the loops forming the sugar-binding site to an adjacent loop (Fig. 2A). However, disulphide bridge formation is not required for GlcNAc binding, as the variant F17b-G can still bind this sugar even though Cys110 is replaced by a serine. In fact, the GlcNAc concentrations required to elute the F17b-G from the affinity column are measurably higher than those needed for F17a-G and F17d-G. The F17e-G and F17f-G variants also require higher GlcNAc concentrations for elution.
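The r.m.s.d. quoted above can be illustrated with a minimal Python sketch. It assumes that the two structures have already been superimposed and that the Cα atoms are paired by index; the function name and the toy coordinates are hypothetical and are not taken from the F17a-G structures or from the software used in this work.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two lists of (x, y, z) points.

    Assumes both structures are already superimposed and that atoms are
    paired by index (for example, matched C-alpha atoms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in Å, for illustration only.
free = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bound = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0)]
print(f"r.m.s.d. = {rmsd(free, bound):.2f} Å")
```

A full comparison would first compute the optimal superposition; that step is left out here to keep the sketch short.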

Carbohydrate specificity of the F17-G adhesin

Surface plasmon resonance was successfully used to determine the equilibrium affinity constants of the F17a-G and F17d-G variant proteins for carbohydrates, even monosaccharides (Table 2). Figure 4 exemplifies this by comparing the binding response for two monosaccharides. The monosaccharide α-d-methyl-glucopyranoside shows no binding above the background, while the β-anomer does bind in a concentration-dependent mode. The ratios of the affinities of the sugars (Table 2) are similar to the corresponding relative concentrations needed for inhibition of haemagglutination or inhibition of GlcNAc-agarose binding on the closely related GafD lectin (Tanskanen et al., 2001). Prior to measurements, the F17-G variants were covalently coupled via the free amino group of their lysines. Differences in the number (Fig. 1) and surface distribution (Fig. 2B and C) of the lysines, however, affected the immobilization of the different variants. The amino groups of the conserved lysines 59, 131 and 142 appeared masked (Fig. 2C) and unsuitable for coupling, as the variants b, e and f of F17-G could not be immobilized in an active form. The additional Lys50 amino group of variant d appeared accessible and F17d-G could be captured with 100% activity. F17a-G has two more lysines, Lys37 and Lys90, the latter of which is very close to the binding site (Fig. 2B). Fortunately, Lys90 is not involved in the immobilization as the stoichiometry of sugar binding is 1:1 for all binding curves. To enable measurement of the carbohydrate specificity of the other variants, a polylysine tail was added C-terminally to the lectin domain. F17b-G with a polylysine tail could be immobilized and preliminary results indicate that it binds β-Me-SeGlcNAc with an affinity that is about three times higher than that of the variants F17a-G and F17d-G, consistent with the affinity chromatography observations.
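The relation between an equilibrium association constant and the expected response can be sketched in Python as follows. The sketch assumes a simple 1:1 Langmuir binding isotherm, which is consistent with the 1:1 stoichiometry reported above, although the fitting model itself is not spelled out in the text. The function names, the concentration series and the idea of a maximal response `r_max` are illustrative assumptions; only the Ka value for GlcNAc comes from Table 2.

```python
def fraction_bound(ka_per_mM, conc_mM):
    """Fraction of immobilized lectin occupied at equilibrium (1:1 model).

    ka_per_mM: association constant in mM^-1
    conc_mM:   free sugar concentration in mM
    """
    return ka_per_mM * conc_mM / (1.0 + ka_per_mM * conc_mM)


def equilibrium_response(r_max, ka_per_mM, conc_mM):
    """Equilibrium response in resonance units, given a saturation response r_max."""
    return r_max * fraction_bound(ka_per_mM, conc_mM)


KA_GLCNAC_F17A = 0.85  # mM^-1, F17a-G with GlcNAc (Table 2)

# Hypothetical concentration series in mM, for illustration only.
for conc in (0.1, 0.5, 1.0, 5.0, 10.0):
    occupancy = fraction_bound(KA_GLCNAC_F17A, conc)
    print(f"{conc:5.1f} mM -> {occupancy:.2f} of sites occupied")
```

Under this model, half of the sites are occupied when the sugar concentration equals 1/Ka, which is why millimolar sugar concentrations are needed to approach saturation for affinities of this size.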

Figure 4.

Surface plasmon resonance sensorgrams for α- and β-d-MeGlc. The former anomer shows no binding above the background noise. The latter exhibits weak concentration-dependent binding, expressed in resonance units (RU).

The F17-G lectin domain has an immunoglobulin-like fold

The Dali server (Holm and Sander, 1998) was used to perform an objective structural alignment of the F17a-G lectin domain against a non-redundant subset of the PDB, yielding approximately 45 structurally similar proteins. The most significant similarities (Fig. 5) are the N-terminal domain of the PapD chaperone, with an r.m.s.d. for all Cα atoms in the matched strands of 2.8 Å, the lectin domain of FimH (r.m.s.d. = 3.1 Å), the variable domain of the murine antibody Fab88 (r.m.s.d. = 3.2 Å), the adaptor pilin PapK (r.m.s.d. = 3.3 Å), the minor pilin PapE (r.m.s.d. = 3.6 Å) and the lectin domain of PapGII (r.m.s.d. = 3.9 Å). The N-terminal domain of the murine sialoadhesin cell surface receptor was tested as a representative of the siglec family (Crocker, 2002) and could be superimposed with an r.m.s.d. of 2.9 Å.

Figure 5.

A–E. Structure-based superposition of the N-terminal domain of the periplasmic chaperone PapD (A), the F17a-G lectin domain (B), the PapE pilin (C), an antibody light-chain variable domain (D) and the FimH lectin domain (E). Secondary structure elements are highlighted: the back sheet is in yellow, the front sheet is in green and additional strands are in shades of blue. Structurally similar parts are in dark green and orange. In (C) a peptide corresponding to the N-terminal extension of PapK (K, shown in blue) takes the place of the missing C-terminal strand. (F) Surface representation of the adaptor pilin PapK with the F17a-G lectin domain superimposed. Hydrophobic residues of PapK are shown in yellow, highlighting the groove caused by the absence of the canonical C-terminal strand in this pilin domain. Also shown are the complementing strands of the PapD chaperone in the crystal structure of the PapD/PapK complex (green) and the C-terminal strand of the F17-G lectin domain after optimal superposition of the complete domains (red).

Halaby et al. (1999) identified 52 protein domains as belonging to the immunoglobulin fold family based on similarities in the connectivity pattern of the β-strands and the existence of four core strands (B, C, E and F) which can be superimposed with an r.m.s.d. of less than 3.9 Å. According to these criteria, the F17a-G lectin domain clearly joins this family (Fig. 5). Moreover, its residues Val123 (strand E), Ile145 and Ile148 (strand F) correspond to highly conserved residues in the hydrophobic core of the immunoglobulin fold (Halaby et al., 1999).

Fimbrial lectin domains and pilins share a common fold

The lectin domains of the F17a-G, FimH and PapG adhesins contain the same basic immunoglobulin-like fold as the structural building blocks of the pilus, the pilins (Figs 5 and 6). The three lectin domains are, however, highly variable both in length and nature of the secondary structure elements connecting the four basic core strands B, C, E and F of the immunoglobulin fold. This large degree of variation is an established feature of immunoglobulin-like proteins (Halaby et al., 1999). For example, the FimH lectin domain has additional strands A′ and D′ (Fig. 5) compared with the PapD fold, inserted between A1/A2 and D1/D2, respectively, and strand B is split by a short helical segment. The PapGII lectin domain (Dodson et al., 2001) has much larger insertions between the core strands of the basic fold and is 20% longer than the FimH and F17-G domains. It has been described to consist of two regions (Dodson et al., 2001), the first region with a FimH-like topology and the second region containing the receptor-binding site (Fig. 6C). The Dali analysis confirms that the first region has the immunoglobulin-like fold.

Figure 6.

Localization of the sugar-binding sites of F17a-G (A), FimH (B) and PapGII (C). The proteins (represented in grey) were superimposed based on the structural core of the immunoglobulin fold which was identified in the three domains. Representative carbohydrate ligands are shown in black. The C-termini of the lectin domains, which precede the linker to the pilin domain, coincide approximately. Structurally equivalent β-strands are labelled with their names as defined for F17-G. The two parts of the PapGII domain (Dodson et al., 2001) are indicated: part one has the immunoglobulin-like core, whereas part two holds the sugar-binding site.

To confirm our findings, the lectin and pilin domains of FimH were superimposed using Dali, resulting in an r.m.s.d. for all matched Cα-atoms of 2.9 Å. We chose FimH for this purpose because it is the only adhesin for which both domains have been characterized structurally. Interestingly, comparisons show that the C-terminal G strand of the lectin domains (Fig. 5B and E; red ribbon in Fig. 5F is the F17-G G strand) corresponds to the donor strand G1 of the N-terminal domain of the chaperone (Fig. 5A and F; green ribbon), and to the N-terminal extension strand (K in Fig. 5C) of the pilin subunit that complements for the missing G strand of the pilin after donor strand exchange with the chaperone at the usher. Superposition of F17a-G with the adaptor pilin PapK (Fig. 5F) shows that the C-terminal strand of the F17a-G lectin domain fits into the groove caused by the absence of the C-terminal strand of the pilin domain.


The crystal structure of the F17a-G lectin domain in complex with the ligand N-acetyl-d-glucosamine and the carbohydrate binding studies demonstrate the specificity of the F17a-G adhesin for GlcNAc. Despite the low affinity measured for the F17a-G/GlcNAc interaction (0.85 mM−1; Table 2), GlcNAc inhibits binding and colonization by F17-positive enterotoxigenic E. coli of intestinal microvilli (Girardeau, 1980). Inhibition is slightly better with the GlcNAc oligomers chitobiose, chitotriose and chitotetraose (Bertin et al., 1996) which can be explained by the slightly higher affinities of F17a-G for these sugars (Table 2). GlcNH2 and Glc binding are at least one order of magnitude weaker, indicating the importance of the N-acetyl group.
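The size of the affinity gain for the GlcNAc oligomers can be made explicit with a short Python sketch that expresses the F17a-G association constants of Table 2 relative to GlcNAc. The dictionary and function names are illustrative; the numerical values are the reported means, and the reported uncertainties are ignored here.

```python
# Association constants for F17a-G in mM^-1, taken from Table 2 (mean values only).
KA_F17A_G = {
    "GlcNAc": 0.85,
    "chitobiose": 1.09,
    "chitotriose": 1.02,
    "chitotetraose": 1.62,
}


def fold_over_reference(ka_values, reference="GlcNAc"):
    """Return each association constant divided by that of the reference sugar."""
    ref = ka_values[reference]
    return {name: ka / ref for name, ka in ka_values.items()}


fold = fold_over_reference(KA_F17A_G)
for name, value in fold.items():
    print(f"{name}: {value:.2f}-fold relative to GlcNAc")
```

All three oligomers come out at less than a twofold gain over the monosaccharide, in line with the modest improvement in inhibition described above.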

The bound GlcNAc in the structure of the F17a-G/GlcNAc complex adopts the β-anomeric configuration despite the presence of both anomers in solution. The structure reveals that an α-glycosidic linkage would be hindered by the main chain carbonyl of Thr118 and the side-chain of Ser117. The enhanced binding of β-Me-SeGlcNAc by its β-methyl-seleno group and the difference in binding between α- and β-MeGlc confirm the site's selection for the physiologically relevant β-glycosidic linkages of GlcNAc.

The interactions between F17a-G and GlcNAc include a cooperative hydrogen bond between the carboxyl group of Asp88 and the O4/O6 pair of the GlcNAc ligand (Fig. 3B), a pattern also observed in mannose/glucose-specific legume lectins (Bouckaert et al., 1999). In the galactose-specific legume lectins, an equivalent aspartate interacts with the O3 and O4 hydroxyl groups of galactose, which results in a different orientation of the saccharide ring (Loris et al., 1998). Modelling of the galactose in the F17a-G binding site based on these interactions reveals clashes with the side-chain of Trp109 and the main chain of Thr118, consistent with the lack of affinity for Gal and GalNAc (Table 2).

The carbohydrate-binding site of F17-G is a rather shallow and extended depression that has an interaction surface with GlcNAc of 140 Å2 and an association constant on the order of 1 mM−1 (Table 2). This is strikingly different from the high affinity of the FimH fimbrial adhesin for mannose of about 100 nM (Nagahori et al., 2002) and its significantly larger interaction surface (368 Å2). A first possible explanation for the low affinities of F17-G may be the shape complementarity of the binding site with its ligand (Fig. 3B). Whereas the FimH mannose-binding site is a deep pocket almost completely enveloping the mannose ligand (Hung et al., 2002), the GlcNAc ligand does not fill the whole F17-G-binding site. It appears that larger ligands, such as oligosaccharides as for PapG (Dodson et al., 2001), or modified sugars could be accommodated, conferring greater specificity and/or affinity. A second possible reason is related to the way the carbohydrate-binding studies were performed. The purified F17-G lectin domain was immobilized and bound to simple monosaccharides in solution. By contrast, the FimH affinities were determined by measuring the binding of type 1-fimbriated bacteria in suspension to highly mannosylated ligands. The intricate network of interactions that can be spun between the multiple pili of a single bacterium and its multivalent ligands has been shown to enhance the affinity by several factors of ten (Nagahori et al., 2002).
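Because the F17-G value is given as an association constant and the FimH value as a concentration, the comparison is easier to follow once both are expressed as dissociation constants. The Python sketch below does this conversion; it assumes that the FimH figure of about 100 nM is a dissociation constant, which the text does not state explicitly, and the function and variable names are illustrative.

```python
def kd_from_ka(ka_per_mM):
    """Dissociation constant in mM for an association constant in mM^-1."""
    if ka_per_mM <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_mM


# F17a-G with GlcNAc: Ka = 0.85 mM^-1 (Table 2).
kd_f17g_mM = kd_from_ka(0.85)

# FimH with mannose: about 100 nM, assumed here to be a Kd; 100 nM = 1e-4 mM.
kd_fimh_mM = 100e-6

ratio = kd_f17g_mM / kd_fimh_mM
print(f"F17a-G/GlcNAc Kd is about {kd_f17g_mM:.2f} mM")
print(f"This is roughly {ratio:.0f} times weaker than the FimH value")
```

Under this assumption the two values differ by about four orders of magnitude, although, as noted above, they were obtained with very different experimental set-ups.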

A structure-based similarity search reveals that the F17-G lectin domain fits into the immunoglobulin fold family (Halaby et al., 1999). The structural variations of the F17-G lectin domain fall in the range observed for other immunoglobulin-like folds and the domain can in particular be compared with domains of the structural pilus subunits and periplasmic chaperones of bacterial pilus systems (Fig. 5). The disulphide bridge in the F17-G lectin domain is exposed at the surface of the protein (Fig. 2A) and is not essential for GlcNAc binding, as is shown for the F17b-G variant. This is congruent with the finding that disulphide bridges occur frequently in immunoglobulin-like folds, although their location is variable and they are not essential for the stability of the fold (Halaby et al., 1999).

Pilin domains have an incomplete immunoglobulin-like fold lacking the canonical seventh and C-terminal G strand (Choudhury et al., 1999; Sauer et al., 1999). Therefore the folded domain has a long groove on its surface, exposing its hydrophobic core (Fig. 5C and F). The missing strand is donated by the chaperone or the next pilin (Sauer et al., 2002). Unlike the pilin domains, the lectin domains of adhesins have a complete immunoglobulin fold. The G strand of the lectin domains of the adhesins links them to their pilin domains, which allows the adhesin to be attached to the tip of the pilus. We now also observe that the C-terminal β-strand G of the superimposed F17-G lectin domain fits in the hydrophobic groove of the structurally known FimH and PapK (Fig. 5F) pilin subunits and that it corresponds closely to the N-terminal extension of PapK bound to PapE (Fig. 5C). The concurrence of the β-strand G of the lectin domain and the missing strand of the pilin domain can also be exemplified by the FimH structure (Choudhury et al., 1999) and agrees with the fold similarity between the lectin and pilin domains of the fimbrial adhesins. This suggests that the two domains may have evolved by gene duplication and fusion, after which they diverged beyond recognizable sequence similarity.

Interestingly, the β-strand G of several proteins with an immunoglobulin-like fold, among which fibronectin III (Craig et al., 2001) and titin, has been shown to be the first strand to undergo unfolding under the application of stretch forces (Isralewitz et al., 2001). Recently, the G strand of the FimH lectin domain has also been proposed to extend upon attachment to the adhesin receptor under conditions of shear stress and enhance adhesion (Thomas et al., 2002), possibly by causing a firmer grip on the receptor (Isberg and Barnes, 2002). The structure of the F17-G lectin domain helped us to identify the immunoglobulin-like fold also in the FimH lectin domain. The β-strand G is thereby the structural equivalent of the donor strand of the pilins. Consequently, the donor strand of the pilins perhaps not only provides a mechanism to build up large polymeric fibres by making non-covalent interactions, but could also explain their spring-like behaviour that occurs upon attachment to the host cell (Hahn et al., 2002).

The variable position and nature of the binding sites of the fimbrial adhesins F17-G, PapG and FimH (Fig. 6) reflect the diverse hosts and tissues that are targeted by these pathogenic bacteria. FimH and PapG are adhesins expressed by uropathogenic E. coli, but display very different tropisms. The mannose-binding site of FimH is located at the tip of the lectin domain so that it can reach the FimH receptor buried deep within the ring-like uroplakin plaques of the bladder (Zhou et al., 2001). In contrast, the binding site of PapGII is located laterally on the lectin domain and can interact sideways with its globotetraoside receptor at the lipid bilayer of kidney epithelial cells, congruent with the conformation of the oligosaccharide (Dodson et al., 2001).

F17-G, the first fimbrial adhesin for enterotoxigenic E. coli to be structurally characterized, recognizes receptors exposing N-acetylglucosamine residues on the microvilli of intestinal epithelium. Its binding site is located laterally like PapGII (Fig. 6). We envision the following model for F17 fimbrial attachment: the long and flexible F17 fimbriae could intrude between the microvilli of the epithelium, with the binding site of the lectin domain interacting laterally with GlcNAc-containing receptors. This model is not only consistent with the orientation of the binding site compared with the pilus direction, but also with the natural variation in amino acids concentrated around the conserved N-acetylglucosamine binding site. The variable amino acids may be involved in supplementary and fine-tuning interactions with the microvilli. Geometrically, the length of F17 fimbriae (1–2 µm) is similar to the length of the microvilli of intestinal epithelial cells (Hirokawa and Heuser, 1981) and the spacing between microvilli is ample for the 3-nm-thick F17 wires. The concentration of F17 pili between the microvilli upon adhesion of the enterotoxigenic bacteria and the extra interactions around the binding site could be enhancing factors for affinity in vivo.

In conclusion, the lectin domains of the F17-G, FimH and PapG fimbrial adhesins share the immunoglobulin-like fold of their periplasmic chaperones and their pilin partners. Topological comparisons highlight the β-strand G of the immunoglobulin-like fold as the donor strand in the chaperone-pilin complexes in the periplasm and in pilin–pilin interactions in the mature pilus and as the C-terminal linker strand of the lectin domains. This same strand is involved in stretching forces and shear-stress enhanced adhesion. Taken together, our work points to common and diverse themes in carbohydrate-adhesin recognition and expands our understanding of these initial and critical host–pathogen interactions in Gram-negative bacteria.

Experimental procedures


GlcNAc was from CalBioChem, GlcNH2, Glc, α-Me-Glc, β-Me-Glc, GlcNAc2, GlcNAc3, GlcNAc4, Gal, GalNAc from Sigma, and GlcNAc(β1-2)Man and GlcNAc(β1,2)Man(α1,3)[GlcNAc(β1,2)Man(α1,6)]Man from Dextra Laboratories. β-Me-SeGlcNAc was synthesized according to Ogra et al. (2002). GlcNAc(β1,2)Man(α1,6)Man was a gift from Dr N. Amiot and Dr G.-J. Boons (University of Georgia, Athens, GA, USA).

Protein expression and purification

The lectin domains of the full-length F17-G genes were isolated by truncating after Thr177 (see PredictProtein). Using the primers F17-35 (5′-ACAAATTTTTATAAGGTCTTTCTGGCTGTATTC-3′) and F17-36 (5′-CCCGAATTCTATGTGTCATTCAGCGTAAATGGATT-3′), the lectin domain was amplified and cloned under the control of the T7 promoter. The resulting constructs were transformed into the E. coli strain BL21-AI (Invitrogen). Figure 1 shows the amino acid differences for the lectin domain of the variants F17a-G (Lintermans et al., 1991), F17b-G (El Mazouari et al., 1994), F17d-G, F17e-G (Cid et al., 1999) and F17f-G (Cid et al., 1999). The BL21-AI cells were grown in Luria–Bertani (LB) medium until OD600 = 0.5 and induced with 0.2% L-arabinose. Cells were subjected to osmotic shock and the periplasmic extract was loaded onto a GlcNAc-agarose affinity column (Sigma) in buffer A (1 M NaCl, 20 mM Tris buffer, pH 7). Bound protein was eluted with buffer A containing 50–200 mM GlcNAc. The protein was concentrated, further purified by gel filtration (Superdex 75 HR, Amersham Biosciences), and dissolved in 20 mM Tris, pH 8 with 150 mM NaCl for crystallization or in 10 mM sodium acetate buffer, pH 5 with 150 mM NaCl for surface plasmon resonance measurements.


The F17a-G lectin domain was crystallized in 30% PEG4000, 0.1 M sodium acetate (pH 4.6) and 0.2 M ammonium acetate. The crystals were soaked with a cryoprotectant solution (31% PEG8000, 10% isopropanol, 0.1 M sodium/HEPES at pH 7.5) containing 20 mM β-Me-SeGlcNAc, 100 mM GlcNAc or no additional sugar and flash-frozen. All data were processed using Denzo, XDisplayF and Scalepack from the HKL package (Otwinowski and Minor, 1997) and Truncate from the CCP4 suite (Collaborative Computational Project, 1994). Crystal parameters and data processing statistics for the three crystals are summarized in Table 1.

Structure solution and refinement

In β-Me-SeGlcNAc, the anomeric oxygen (O1) of GlcNAc is replaced by a selenomethyl group. The structure of the β-Me-SeGlcNAc complex was solved by a three-wavelength MAD approach using the anomalous diffraction signal of the selenium atom (Buts et al., 2003; Table 1). Rigid body fitting using the protein from the β-Me-SeGlcNAc complex was sufficient to build and refine the GlcNAc complex and the ligand-free protein models.

Crystallographic refinement was performed with CNS version 1.0 (Brünger et al., 1998) and CCP4 programs. Cross-validation (Brünger, 1992), bulk solvent correction and anisotropic B-factor scaling were used throughout with the maximum likelihood target for structure factors. Slow cooling simulated annealing and conjugate gradient minimization including all available data were alternated with model building. Quality checks and topology analysis were performed using ProCheck (Laskowski et al., 1993), DSSP (Kabsch and Sander, 1983) and ProMotif (Hutchinson and Thornton, 1996). The models have good geometry and crystallographic R and Rfree values (R = 0.205, Rfree = 0.235 for the β-Me-SeGlcNAc complex; R = 0.211, Rfree = 0.242 for the GlcNAc complex; R = 0.211, Rfree = 0.245 for the ligand-free protein). The atomic coordinates and structure factor amplitudes for F17a-G have been deposited in the PDB under Accession Numbers 1O9V (β-Me-SeGlcNAc complex), 1O9W (GlcNAc complex) and 1O9Z (ligand-free).

Model analysis

Structural alignments and analysis were done using the Dali server (Holm and Sander, 1998). The following entries from the PDB were used for structural comparisons: 1PDK (chain A) for PapD (Sauer et al., 1999); 8FAB (chain A) for the Fab88 antibody (Strong et al., 1991); 1KLF (chain B with mannose ligand) for FimH (Hung et al., 2002); 1PDK (chain B) for PapK (Sauer et al., 1999); 1N12 for PapE with the N-terminal peptide from PapK (Sauer et al., 2002); 1J8R (chain A with globotetraose ligand) for PapGII (Dodson et al., 2001); and 1QFP for the N-terminal domain of sialoadhesin (May et al., 1998). The figures were designed with MolScript 2.1.1 (Kraulis, 1991) and PyMol 0.84 (DeLano Scientific).

Affinity measurements

The affinity of the F17a-G and F17d-G variants for a variety of sugars was determined by surface plasmon resonance. The proteins were covalently immobilized on flow cells (Fc2: F17a-G, Fc3: F17d-G) of a CM5 sensor chip (Biacore AB) via lysines to a density of 1.5 µg cm−2. The reference flow cell Fc1 was coated with a camel single domain antibody at a density of 1.6 µg cm−2.

Binding of carbohydrates to the immobilized proteins was measured with a Biacore 3000 instrument. Carbohydrate concentrations from 10 mM to 10 µM in running buffer (50 mM Tris-HCl, 150 mM NaCl, 3 mM EDTA and 0.005% surfactant P20) were incubated for one minute on all flow cells simultaneously, at a flow rate of 30 µl min−1 and at 25°C. Complete dissociation of the carbohydrates was achieved with running buffer before starting a new binding cycle. All binding cycles were performed in duplicate, including a zero concentration cycle (injection of running buffer).

All analysis was performed with the BIAeval software. The subtracted (Fc2 or Fc3 minus Fc1) equilibrium signals were plotted versus carbohydrate concentrations, and a Langmuir binding isotherm with a 1:1 stoichiometry was fitted to the data, from which the association constants Ka were obtained.
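The equilibrium analysis can be illustrated with a minimal sketch. It assumes the standard 1:1 Langmuir form R = Rmax·Ka·C / (1 + Ka·C), where R is the reference-subtracted equilibrium signal, C the carbohydrate concentration and Rmax the saturation response. It is not the BIAeval algorithm, whose internals are not described here. The function names, the search bounds on Ka and the synthetic data (including the dilution steps between 10 µM and 10 mM) are all hypothetical and chosen only for illustration.

```python
import math

def langmuir(conc, ka, rmax):
    """Equilibrium response for 1:1 binding: R = Rmax*Ka*C / (1 + Ka*C)."""
    return rmax * ka * conc / (1.0 + ka * conc)

def best_rmax(concs, signals, ka):
    """For a fixed Ka the model is linear in Rmax, so the least-squares
    Rmax has a closed form."""
    f = [ka * c / (1.0 + ka * c) for c in concs]
    return sum(y * fi for y, fi in zip(signals, f)) / sum(fi * fi for fi in f)

def sse(concs, signals, ka):
    """Sum of squared residuals at a given Ka, with Rmax optimized out."""
    rmax = best_rmax(concs, signals, ka)
    return sum((y - langmuir(c, ka, rmax)) ** 2 for c, y in zip(concs, signals))

def fit_langmuir(concs, signals, log_ka_min=0.0, log_ka_max=8.0, n_grid=161):
    """Least-squares estimate of Ka (M^-1) and Rmax.

    A coarse grid on log10(Ka) brackets the minimum, then a golden-section
    search refines it. The bounds are assumptions, not values from the text.
    """
    step = (log_ka_max - log_ka_min) / (n_grid - 1)
    grid = [log_ka_min + i * step for i in range(n_grid)]
    centre = min(grid, key=lambda x: sse(concs, signals, 10.0 ** x))
    lo, hi = centre - step, centre + step
    g = (math.sqrt(5.0) - 1.0) / 2.0  # inverse golden ratio
    for _ in range(80):
        x1 = hi - g * (hi - lo)
        x2 = lo + g * (hi - lo)
        if sse(concs, signals, 10.0 ** x1) < sse(concs, signals, 10.0 ** x2):
            hi = x2
        else:
            lo = x1
    ka = 10.0 ** ((lo + hi) / 2.0)
    return ka, best_rmax(concs, signals, ka)

# Synthetic, noise-free example (NOT measured data): concentrations in M
# spanning 10 µM to 10 mM, generated with Ka = 1e3 M^-1 and Rmax = 50 RU.
concs = [1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 3e-3, 1e-2]
signals = [langmuir(c, 1.0e3, 50.0) for c in concs]
ka_fit, rmax_fit = fit_langmuir(concs, signals)
print(f"Ka = {ka_fit:.3g} M^-1, Rmax = {rmax_fit:.3g} RU")
```

In practice the input signals would be the Fc2 − Fc1 or Fc3 − Fc1 equilibrium responses, averaged over the duplicate cycles. Searching on log10(Ka) keeps the fit stable across the several orders of magnitude that carbohydrate affinities can span.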


We thank Maria Vanderveken, Maia Dekerpel and Christianne Bouton for excellent technical assistance and Savvas Savvides for critical reading of the manuscript. The authors acknowledge the use of EMBL beamlines BW7A and X13 at the DESY synchrotron, Hamburg, Germany (project PX-02–370). L.B. is a research assistant and J.B. and R.L. are postdoctoral fellows of the Fonds voor Wetenschappelijk Onderzoek – Vlaanderen, which also supported the BiaCore instrument (grant FWONL35) and DNA sequencing equipment (grant FWOAL215).