Structural basis for adaptation of lactobacilli to gastrointestinal mucus

The mucus layer covering the gastrointestinal (GI) epithelium is critical in selecting and maintaining homeostatic interactions with our gut bacteria. However, the underpinning mechanisms of these interactions are not understood. Here, we provide structural and functional insights into the canonical mucus-binding protein (MUB), a multi-repeat cell-surface adhesin found in Lactobacillus inhabitants of the GI tract. X-ray crystallography together with small-angle X-ray scattering demonstrated a ‘beads on a string’ arrangement of repeats, generating 174 nm long protein fibrils, as shown by atomic force microscopy. Each repeat consists of tandemly arranged Ig- and mucin-binding protein (MucBP) modules. The binding of full-length MUB was confined to mucus via multiple interactions involving terminal sialylated mucin glycans. While individual MUB domains showed structural similarity to fimbrial proteins from Gram-positive pathogens, the particular organization of MUB provides a structural explanation for the mechanisms by which lactobacilli have adapted to their host niche by maximizing interactions with the mucus receptors, potentiating the retention of bacteria within the mucus layer. Together, this study reveals functional and structural features which may affect tropism of microbes across mucus and along the GI tract, providing unique insights into the mechanisms adopted by commensals and probiotics to adapt to the mucosal environment.


Introduction
The human gastrointestinal (GI) tract is colonized by a dense and diverse microbial community which is recognized as an important player in normal human physiology, metabolism, nutrition and immune function. Disruption of the gut microbiota (dysbiosis) has been linked with GI conditions such as inflammatory bowel disease (IBD) and obesity (Sommer and Backhed, 2013). In vivo, the intestinal epithelial cells are covered by a mucus layer, which is a biochemically complex medium, rich in glycoproteins (mucins), antimicrobial peptides, immunoglobulins and many other intestinal proteins as well as lipids and electrolytes (Johansson et al., 2011). In the colon, the mucus is organized in an outer loose layer which is home to gut bacteria and an inner layer which is firmly attached to the epithelium and protects the underlying mucosa from the bacterial load (Atuma et al., 2001; Johansson et al., 2008). Mucins are divided into gel-forming mucins (MUC2, MUC5AC, MUC5B and MUC6) and transmembrane mucins (MUC1, MUC3, MUC4, MUC12, MUC13, MUC16 and MUC17) (Johansson et al., 2011). The predominant mucin and major component of the mucus layer in the small intestine and colon is MUC2 (Johansson et al., 2011). In Muc2-deficient mice, bacteria come in direct contact with the epithelial cells, penetrate down into the normally sterile crypts and into the epithelial cells, resulting in severe inflammation and eventually colon cancer (Van der Sluis et al., 2006; Johansson et al., 2008), a condition that is comparable to the disease ulcerative colitis in humans. Mucins are rich in Ser/Thr residues that are targets for O-glycosylation primarily with N-acetylgalactosamine (GalNAc), creating the foundation upon which long and more complex oligosaccharide chains are built.
The O-glycan chains are extended by the sequential addition of the monosaccharides N-acetylglucosamine (GlcNAc), galactose, GalNAc, fucose (Fuc) and sialic acid; the neutral sugars can be further modified by sulfation (Bergstrom and Xia, 2013). The diverse glycan structures along the GI tract are believed to provide binding sites for the gut bacteria which have adapted to the mucosal environment by expressing the right complement of adhesins (Johansson et al., 2008; Juge, 2012; Ouwerkerk et al., 2013). Alterations in mucus structure, mucin expression and/or mucin glycosylation have been reported in intestinal inflammation and infection in humans and mouse models, and may be a cause or a consequence of an associated change in microbiota composition during disease development (Sheng et al., 2012). This could suggest that the mucin glycans are part of a mechanism for selecting our microbiota. Despite the importance of mucus in the maintenance of a homeostatic relationship with our gut microbiota, very little is known about the nature of the host components and bacterial effectors mediating this interaction.
Gram-positive lactobacilli are normal components of the intestinal microbiota and one of the first groups of bacteria to inhabit the human GI tract (Fan et al., 2013). Some strains (autochthonous) colonize the intestine stably throughout the lifetime of the host, whereas allochthonous strains persist only briefly in the intestine, many of these being probiotics (Reuter, 2001). Adhesion to the intestinal mucosa may prolong their persistence in the GI tract and their beneficial effects to the host, and is thus believed to be a requirement for certain probiotic effects, such as immunomodulation and pathogen exclusion (Sanchez et al., 2010; van Baarlen et al., 2013). A growing number of reports indicate that Lactobacillus adherence to the intestinal mucosal layer is mediated by surface proteins with mucus-binding capacity (Juge, 2012; Turpin et al., 2012; Sengupta et al., 2013). Moreover, homology-driven genome mining in several Lactobacillus spp. identified the presence of various-sized putative mucus adhesins consisting of one or more copies of mucus-binding repeats, which is considered a unique functional feature for promoting host-microbe interplay in the GI tract (Boekhorst et al., 2006; Kleerebezem et al., 2010). The best characterized example is the canonical mucus-binding protein, MUB, produced by the Gram-positive bacterium Lactobacillus reuteri (Roos and Jonsson, 2002; MacKenzie et al., 2010). L. reuteri inhabits the GI tract of mammals as diverse as humans, pigs, mice and rats as well as different species of birds and has been used as a model organism to study the evolutionary strategy of a vertebrate gut symbiont (Oh et al., 2010; Frese et al., 2011). MUBs are multi-domain proteins for which the overall architecture is unknown.
MUBs exhibit characteristics typical of Gram-positive cell surface proteins: a C-terminal sortase recognition motif (LPXTG) for covalently anchoring the protein to peptidoglycan, repeated sequence elements of 183-206 amino acids in length (Mub repeats), and an N-terminal Gram-positive consensus secretion signal upstream of a domain of yet unknown function (Boekhorst et al., 2006). Mub repeats are individually classed as either type 1 or 2 depending on sequence conservation (Roos and Jonsson, 2002; Boekhorst et al., 2006). We recently reported the first  (Graille et al., 2001), and the B2 domain corresponds to MucBP (mucin-binding protein) in the Pfam database (PF06458). Mub repeats and MucBP domains have been shown to bind to mucins and glycans (Roos and Jonsson, 2002; Pretzer et al., 2005; Bumbaca et al., 2007; von Ossowski et al., 2011; Coic et al., 2012; Watanabe et al., 2012). Numerous MUB homologues and MucBP domain-containing proteins have been found almost exclusively in lactic acid bacteria and predominantly in lactobacilli naturally located in intestinal niches (Boekhorst et al., 2006; Kleerebezem et al., 2010). This suggests that MucBP domain-containing proteins play an important role in establishing host-microbial interactions in the gut and promoting the evolution of the species as primarily GI organisms.
Here, we describe the structure of a type 1 Mub repeat solved by X-ray crystallography and the structural organization of Mub type 1 and type 2 repeats constituting the MUB molecule by small-angle X-ray scattering (SAXS). Full-length MUB was also purified to homogeneity and analysed by atomic force microscopy (AFM) and in cell and tissue-adhesion studies. The results reveal structural and functional commonality between MUB and fimbrial proteins from Gram-positive pathogens with specificity to sialylated mucin glycans. Furthermore, bioinformatics analyses indicated that the MUB family of adhesins is characterized by the type and number of discrete modules, with large numbers of Mub repeats occurring in proteins from probiotic, commensal or otherwise nonpathogenic species, enabling multiple interactions with mucins, consistent with the natural habitat of commensals within the outer mucus layer.

Crystal structure of a Mub type 1 repeat
Lactobacillus reuteri MUB has 14 Mub repeats: six copies (RI through RVI) of a type 1 repeat (Mub1) and eight copies (R1 through R8) of a type 2 repeat (Mub2) (Supporting Information Table S1). These are tandemly arranged, with the Mub2 repeats inserted between Mub1 repeats RIV and RV (Fig. 1A). Mub type 2 repeats are highly similar (sequence identities of 84-100%), with Mub-R1 showing lowest similarity. In contrast, the Mub1 repeats are more diverse with sequence identities ranging from 29% for RIV and RV to 88% for RI and RII (MacKenzie et al., 2009). Here, the Mub-RV crystal structure (PDB entry 4MT5) was solved by molecular replacement and refined at 2.6 Å with an overall crystallographic R-factor (Rcryst) of 21.1% (Rfree 26.9%) (Supporting Information Table S2). Two Mub-RV molecules are present in the crystallographic asymmetric unit and are highly similar, with a root-mean-square deviation (rmsd) of 0.9 Å over their Cα-atoms. Mub-RV folds to form an elongated structure 110 Å in length and 24 Å in diameter, comprising an N-terminal B1 domain and a C-terminal B2 domain with a small β-sheet inter-domain region (IR) (Fig. 1B). The structure is similar to that reported for the Mub2 repeat, Mub-R5 (MacKenzie et al., 2009) with a structural alignment Z-score of 15.1 over 176 aligned residues (rmsd 4.1 Å) (Fig. 1C) while sharing 42% sequence identity. The Mub-RV B1 domain has a ubiquitin-like β-grasp fold containing two pairs of antiparallel β-strands in a four-stranded sheet connected by an α-helix, which is similar to that found in members of the immunoglobulin-binding (Ig) superfamily (MacKenzie et al., 2009). However, in contrast to Mub-R5, there was no evidence for a Ca2+ ion coordinated by residues of the loop connecting strands β3 and β4. The Mub-RV repeat lacks the three amino acids (Asp60, Asp62 and Asn65) present in Mub-R5 which mediate ion binding.
The Mub-RV B2 domain is classed as MucBP in the Pfam database, with a modified ubiquitin-like β-grasp fold, in which the outer strands of the four-stranded β-sheet are connected by a β-strand (β3′) instead of an α-helix as in the B1 domain. This connecting strand, together with an additional β-strand (β5′) located between β4′ and β6′, forms a third antiparallel β-sheet (Fig. 1B). The B2 domain of Mub-RV shows high structural homology to the MucBP domain of the Gram-positive cell surface protein Spr1345 of Streptococcus pneumoniae (PDB entry 3NZ3) (Du et al., 2011) with a structural alignment Z-score of 11.6 at 37% sequence identity (rmsd 1.4 Å). The MucBP domain of Spr1345 has been shown to be essential for mucin-binding (Du et al., 2011).

Structural organization of Mub type 1 and type 2 repeats
SAXS analysis of single Mub repeat proteins (Mub-R5, Mub-RV and Mub-RI), multiple Mub repeats of one Mub type or of mixed repeats (Mub-R8-V, Mub-RV-VI and Mub-RI-II-III), and the MUB N-terminal domain (Nterm) was carried out to gain insights into MUB architecture, tandem-repeat organization and behaviour in solution. The composition of secondary structural elements (1% α-helix, 64-71% β-sheet and β-turn) as determined by far UV circular dichroism was shared by all recombinant Mub repeats (Supporting Information Fig. S1), suggesting a similar fold of individual Mub domains present in single or multiple repeat proteins. In contrast, the MUB Nterm domain showed a distinct secondary structure composition (6% α-helix, 56% β-sheet and β-turn), likely indicating an alternative protein fold (Supporting Information Fig. S1). SAXS profiles of single Mub repeats (Supporting Information Fig. S2A), double and triple repeats (Fig. 2A and C) as well as the Nterm domain (Fig. 2C) revealed these proteins to be monomeric in solution (Supporting Information Table S2). The pair distribution P(r) curves of all Mub repeats (Supporting Information Fig. S2B, Fig. 2B and D) showed a single peak with an elongated tail, which is characteristic of extended rod-shaped proteins (Mertens and Svergun, 2010). The maximal particle diameters (Dmax) for the single repeats Mub-RV, -RI and -R5, were found to be 110 Å, 106 Å and 105 Å (Supporting Information Table S3), respectively, and in good agreement with Mub-RV and -R5 X-ray crystal structures. The Dmax values for the double repeats Mub-R8-V and Mub-RV-VI were calculated to be 205 Å and 206 Å, respectively, whereas the triple Mub-RI-II-III protein showed a maximal particle diameter of 292 Å (Supporting Information Table S3). Solution envelopes were reconstructed from P(r) functions (Supporting Information Fig. S2C-E; Fig. 2E-G), revealing an extended boomerang-like solution structure of Mub repeats. A good fit was observed with Mub-RV and -R5 crystal structures manually docked and refined into these envelopes (Supporting Information Table S3). Taken together, these results indicate an elongated conformation of single and multiple Mub-repeat proteins in solution and suggest an arrangement of Mub repeats akin to 'beads on a string' within the full-length MUB surface protein. In contrast, the Dmax for the Nterm protein was found to be 159 Å, significantly lower than that of Mub-RI-II-III (a protein of similar molecular weight), suggesting a different solution structure for the Nterm domain compared to the Mub repeats. The solution envelope of Nterm demonstrated a more globular protein shape with an elongated tail (Fig. 2H), which may be indicative of a different function.
Fig. 1. (A) Domain architecture of MUB. MUB comprises six Mub type 1 repeats (RI to RVI) (blue), eight Mub type 2 repeats (R1 to R8) (green) and an N-terminal (Nterm) domain (grey). The C-terminal LPXTG-motif (black) anchors MUB to peptidoglycan of the bacterial cell wall. (B) X-ray crystal structure of the Mub type 1 repeat protein Mub-RV. Protein fold of Mub-RV with α-helix and β-sheets coloured red and yellow respectively. The N- and C-termini of the protein and the main structural elements are labelled. Mub-RV has two distinct domains B1 and B2, the latter including an inter-domain region (IR) with a three-stranded antiparallel β-sheet. (C) Structural superposition of Mub-RV with type 2 Mub-R5. Mub-RV is represented in yellow and red, and Mub-R5 in grey (PDB entry 35I7). The Ca2+ ion present in the Mub-R5 structure is shown as a grey sphere.
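The 'beads on a string' interpretation rests on Dmax growing roughly linearly with the number of repeats. As an illustrative check only (not part of the original analysis), the following sketch fits a straight line to the Dmax values quoted in this section; the function and variable names are assumptions introduced for the example.

```python
# Dmax values (in Å) quoted in the text, grouped by number of Mub repeats.
dmax_by_repeats = {
    1: [110.0, 106.0, 105.0],  # Mub-RV, Mub-RI, Mub-R5
    2: [205.0, 206.0],         # Mub-R8-V, Mub-RV-VI
    3: [292.0],                # Mub-RI-II-III
}


def least_squares_line(points):
    """Return (slope, intercept) of the ordinary least-squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Flatten the table into (repeat count, Dmax) pairs and fit.
points = [(n, d) for n, values in dmax_by_repeats.items() for d in values]
slope, intercept = least_squares_line(points)
print(f"Dmax increment per additional repeat: {slope:.1f} Å")
```

With these six values the fitted increment falls between 90 and 100 Å per added repeat, close to the length of a single repeat, which is what an end-to-end arrangement with little folding back would predict.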

Native full-length MUB forms elongated fibre-like structures in solution
MUB is normally covalently attached to peptidoglycan in the bacterial cell wall by a sortase-anchoring motif but can be shed into the culture medium in vitro (MacKenzie et al., 2010). In order to gain further insights into the overall structure of the full-length MUB protein (353 kDa), native MUB was purified to homogeneity from the spent culture medium of L. reuteri ATCC 53608 by size-exclusion chromatography (Supporting Information Fig. S3). Analytical ultracentrifugation sedimentation velocity experiments (data not shown) showed a sedimentation coefficient of 5.7S for native MUB; such low sedimentation values are frequently observed for elongated proteins (Erickson, 2009; Kollman et al., 2009). Images of native MUB were acquired by AFM, revealing the presence of monomeric extended fibres (Fig. 2I). A total of 80 MUB molecules were observed, and the measured contour lengths ranged from 120 to 192 nm (Fig. 2J). Histogram analysis of the length data yields a modal value of 174 nm and a mean value of 163 ± 15.5 nm. These values are in good agreement with the summed length of the 14 Mub repeats (each of about 11 nm in length as observed in X-ray crystal structures and SAXS studies) plus the N-terminal domain of MUB, with a Dmax of approximately 16 nm, preceding the first Mub repeat.
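The agreement between the AFM lengths and the repeat dimensions is a matter of simple arithmetic. The sketch below makes it explicit using only the figures quoted in this section; the constant names are illustrative, and the assumption that the modules add end to end with no overlap is a simplification.

```python
# Figures quoted in the text (all lengths in nm).
N_REPEATS = 14            # Mub repeats in full-length MUB
REPEAT_LENGTH_NM = 11.0   # approximate length of one repeat (crystal structures, SAXS)
NTERM_DMAX_NM = 16.0      # approximate Dmax of the N-terminal domain
AFM_MODE_NM = 174.0       # modal AFM contour length
AFM_MEAN_NM = 163.0       # mean AFM contour length
AFM_SD_NM = 15.5          # spread reported with the mean

# Assumption: modules are arranged end to end without overlap.
expected_length_nm = N_REPEATS * REPEAT_LENGTH_NM + NTERM_DMAX_NM

print(f"Expected fibre length: {expected_length_nm:.0f} nm")
print(f"Difference from AFM mode: {abs(expected_length_nm - AFM_MODE_NM):.0f} nm")
print(f"Difference from AFM mean: {abs(expected_length_nm - AFM_MEAN_NM):.0f} nm")
```

The estimate of 170 nm lies within 4 nm of the modal AFM value and within the reported spread around the mean.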

Mub repeats share structural similarity with Gram-positive pathogenic adhesins
As with the corresponding domain of the type 2 repeat, Mub-R5, the B1 domain of the type 1 repeat, Mub-RV, resembles the immunoglobulin (Ig)-binding Protein L (PpL) (Z-score 5.4, rmsd 2.9 Å, 15% sequence identity). The B2 domain is structurally similar to the B-repeat of the Listeria invasion protein Internalin B (InlB) (PDB entry 2Y5P) (Z-score 5.9, rmsd 2.6 Å, 13% sequence identity), a protein with as yet unknown receptor specificity (Ebbes et al., 2011). More remarkable is the structural similarity of B2 to a number of pilin proteins such as GBS52 of Streptococcus agalactiae (PDB entry 3PHS) (Z-score 5.4), BcpA of Bacillus cereus (PDB entry 3KPT) (Z-score 4.7), RrgB of S. pneumoniae (PDB entry 3RPK) (Z-score 4.6), Spy0128 of Streptococcus pyogenes (PDB entry 3B2M) (Z-score 4.2) and SpaA of Corynebacterium diphtheriae (PDB entry 3HR6) (Z-score 4.0) as well as the microbial surface components recognizing adhesive matrix molecules (MSCRAMM) CnaB of Staphylococcus aureus (PDB entry 1D2P) (Z-score 2.5) (Deivanayagam et al., 2000;Kang et al., 2007;2009;Krishnan et al., 2007;Budzik et al., 2009;Paterson and Baker, 2011). These proteins contain a similar domain organization of two to four IgG-like domains of at least seven β-strands with a maximum dimension between 85 Å and 134 Å. The pilin GBS52 is the closest structural homologue of Mub-RV and folds into two IgG-like domains, N1 and N2 (Krishnan et al., 2007). The GBS52 N2 domain shows a structural alignment Z-score of 5.4 and an rmsd of 2.4 Å over 64 aligned Cα-atoms (Fig. 3A). The IgG-like domains of GBS52, BcpA and SpaA share an IgG-rev fold with two sheets of three and four β-strands in a DAG-CBEF topology, while RrgB and Spy0128 display the closely related canonical DAGF-CBE topology first observed in the structure of the CnaB D1 domain (Bork et al., 1994;Deivanayagam et al., 2000). 
The core of the β-sandwich of the Mub-RV B2 domain comprises two sheets, three-stranded and two-stranded, with a modified IgG-rev fold generated from that observed in GBS52 by a simple deletion of the C and B strands from the edge of the larger four-stranded sheet of the latter (Fig. 3B and C). Hence, the conserved strands from the IgG-rev fold observed in GBS52, BcpA and SpaA pilins are A, D, E, F and G. The question of whether this structural homology has arisen by a process of convergent or divergent evolution is an intriguing one. It is interesting to note in this respect that the GY sequence motif noted as being conserved in the InlB B-repeat, Mub repeats and in other mucin-binding MucBP domains (Ebbes et al., 2011) is also conserved in the GBS52 (Krishnan et al., 2007) and BcpA pilin proteins (Budzik et al., 2009). This motif may play an important structural role as the side chain of the tyrosine (Y) residue of the motif forms a cap for the hydrophobic core of the domain, stabilizing its fold.

L. reuteri native MUB binds to mucus via terminal sialylated structures
Mub repeats and MucBP domains have been shown to bind to mucins and Igs in vitro (Roos and Jonsson, 2002; Pretzer et al., 2005; Bumbaca et al., 2007; MacKenzie et al., 2009; von Ossowski et al., 2011; Coic et al., 2012; Watanabe et al., 2012). To gain insights into the binding of full-length MUB to mucus, adhesion assays were carried out using a human mucus-secreting intestinal cell line and mammalian tissue sections. Purified full-length MUB bound to mouse gastric tissue with staining localized above the gut epithelium (Fig. 4A). A reduction of MUB adhesion was observed after tissue desialylation with sodium periodate at pH 5.5 [as confirmed by reduction of sialic acid-binding lectin Sambucus nigra (SNA-I) staining] (Fig. 4A). Upon a more drastic treatment at pH 4.5 leading to a reduction in general glycosylation [as confirmed by staining with Fuc-binding lectin, peanut agglutinin (PNA) and GlcNAc-binding lectin, wheat germ agglutinin (WGA)], MUB binding appeared to be redistributed within the intestinal crypts (Fig. 4A). In order to further explore the binding specificity of full-length MUB to mucus glycan epitopes, we tested binding of native MUB to HT29-MTX cell monolayers, which secrete both MUC5AC and MUC2 (Lesuffleur et al., 1993). MUB adhered to mucus droplets overlying the HT29-MTX cell monolayer, as confirmed by MUC5AC co-staining (Supporting Information Fig. S4). Addition of benzyl-N-acetyl-α-galactosaminide (benzyl-α-GalNAc), an inhibitor of sialylation of glycoproteins and mucin secretion (Delannoy et al., 1996; Prescher and Bertozzi, 2006), caused a reduction of MUC5AC immunostaining and mucus sialylation (SNA-I staining) but not of general glycosylation (WGA and PNA staining) (Fig. 4B). There was a marked reduction in MUB adhesion to benzyl-α-GalNAc treated HT29-MTX cells after 24 h, which was reversible after a further 24 h (Fig. 4B). These functional  ( Fig. 
5), and often occur together in alternating repeats; both types of B2 also occur in tandem self-repeats (Supporting Information Table S4). The number of Mub domains per protein ranges widely from 1 to 28, with the most common frequencies being 1-4. Mub-containing proteins among the non-opportunistic pathogens are generally limited to only one or two domains (usually type 1 B2). A sole exception is a Streptococcus suis D12 protein    positive group contains more Mub domains per MUB protein (mean 6.3) than the signal/anchor-negative group (3.3), consistent with the notion that Mub domains functioning in an extracellular binding role are present in multiple copies, while single or oligo-Mub repeats may be associated with other functions.

Discussion
Adhesion to host tissues is a necessary first step of bacterial colonization and is mediated by cell surface adhesion proteins. Typical adhesins such as pili and fimbriae or other cell surface proteins have been extensively studied in enteropathogens (Kline et al., 2009). In the past decade, structural analysis using X-ray crystallography has enhanced our understanding of the interactions between MSCRAMM and the host extracellular matrix proteins, i.e. collagen, fibrinogen and fibronectin (Joh et al., 1999) by revealing several novel structural features that dictate surface protein assembly and the mode of their adhesion to host tissue (Vengadesan and Narayana, 2011). Although the adhesive abilities of enteropathogenic bacteria have been extensively studied, the systems responsible for intestinal adhesion of gut commensal and probiotic bacteria to mucus are poorly understood. High-throughput sequencing of Lactobacillus genomes provides a platform for functional analysis of genes that may contribute to adhesion and probiotic function (Pridmore et al., 2004; Pfeiler and Klaenhammer, 2007; Ventura et al., 2009; Turpin et al., 2012; Meyrand et al., 2013). A growing number of reports indicate that LPXTG-anchored cell wall proteins with modular domain organization, such as L. reuteri MUB, function in the adherence of probiotic strains to the host intestinal mucosa (Kleerebezem et al., 2010; Walter et al., 2011; Call and Klaenhammer, 2013). In addition, mucus-binding pili have recently been reported in Lactobacillus rhamnosus strains (Douillard et al., 2013). The mucus-binding ability of L. rhamnosus GG is conferred by the pilus-associated SpaC, present along the length of the pilus shaft, potentially resulting in the prolonged residency of L. rhamnosus GG in the GI tract (Kankainen et al., 2009; Reunanen et al., 2012).
More recently pili have also been implicated in GI colonization and persistence of bifidobacteria in the murine gut (O'Connell Motherway et al., 2011; Turroni et al., 2013) and in bacterial autoaggregation and biofilm formation of Lactococcus lactis in vitro (Oxaran et al., 2012; Meyrand et al., 2013). These appendages that decorate the bacterial surface are increasingly considered as key molecules in mediating bacterial adherence to the host epithelium and in influencing mucosal immune responses. Unfortunately, no structural information is available on commensal pilins. Filamentous pili structures are more common and better studied in pathogenic bacteria, and can reach lengths of 70-200 or 300-3000 nm depending on pilus type (Telford et al., 2006).
Our study provides the first insights into the architecture of an exemplar of Gram-positive commensal adhesins. Following an N-terminal capping domain, MUB from L. reuteri has 14 tandemly arranged Mub repeats and an LPXTG cell-wall anchor motif. We showed that L. reuteri MUB forms 174 nm extended protein fibrils, potentially enhancing ligand-accessibility to mucus. A similar protein shape was observed for the Gram-positive serine-rich pilus of S. parasanguinis with an estimated length of several hundred nanometres by electron microscopy (Ramboarina et al., 2010). Our AFM analysis of MUB architecture was supported by SAXS experiments, providing structural information on fragments consisting of single, tandem and triple Mub repeats in solution. Single-Mub repeats display 'boomerang-shaped' solution envelopes about 105 to 110 Å (∼10 nm) in length, fitted well by the elongated Mub type 1 and the type 2 X-ray crystal structures. In addition, the low resolution X-ray scattering envelope reconstruction of double and triple Mub repeats gave elongated protein shapes of around 20 and 30 nm in length respectively. These findings suggest a 'beads on a string' arrangement of Mub repeats each consisting of two (B1 and B2) domains capped by the more globular N-terminal domain. The close structural similarity between the B1 and B2 domains and their co-occurrence in multiple proteins in many species, as indicated by genome-wide sequence analysis, suggest that these domains may have evolved from a common ancestor by duplication. Our crystal structure data suggest only limited flexibility between the B1 and B2 domains about the inter-domain region, and the SAXS data are consistent with only limited flexibility about the B2-B1 domain junctions.
Taken together, the picture that evolves of MUB modular organization and dynamics is then of a relatively rigid chain of tubular Mub repeats protruding from the bacterial cell wall terminated by a more globular N-terminal capping domain with B1 (Ig binding) and B2 (MucBP; mucus binding) domains having evolved to fulfill different functions. The function of the MUB N-terminal domain, located at the tip of the protein distal from the bacterial membrane, has yet to be established.
Recent structural studies of the component pilins from pathogenic bacteria have revealed a common pattern of tandem immunoglobulin (Ig)-like domains, joined end-on-end (Kang and Baker, 2012). Here, we described extensive structural similarity between the newly determined crystal structure of the Mub-RV B2 domain and a number of pilins and MSCRAMMs from pathogens displaying the IgG-rev fold. Those structures include the CnaB structures of S. aureus (Deivanayagam et al., 2000), the N2 domain from GBS52 of S. agalactiae (Krishnan et al., 2007), the C-terminal D4-domain of RrgB of S. pneumoniae (Paterson and Baker, 2011), the major Spy0128 pilin of S. pyogenes (Kang et al., 2007) and the SpaA shaft pilin of C. diphtheriae (Kang et al., 2009). Hence, Mub-RV shows structural similarity to members of the four major groups of invasive Gram-positive pathogens. This close structural similarity suggests an evolutionary relatedness. However, whereas in pathogens the primary site of adhesion is usually located at the tip of the shaft, with other domains providing a scaffold for optimum presentation of the glycan recognition site (Imberty and Varrot, 2008; Juge, 2012), the picture emerging from the structural characterization of MUB is one where multiple binding domains are presented along the length of the protein (Fig. 6). A similar role has been suggested for the L. rhamnosus GG SpaC pilin (Reunanen et al., 2012) and, although no structural information is available for this protein, sequence analysis suggests the presence of multiple IgG-like domains.
Bioinformatics analyses indicated that the MUB family of adhesins is characterized by the type, number and permutation of discrete modules, with extracellular B2 domain-containing multi-Mub repeats associated with commensals, whereas proteins from pathogens were more frequently limited to single or double copies only. This structural arrangement is consistent with the niche adaptation of bacteria to the gut. Commensal bacteria are believed to inhabit and be retained within the mucus layer through interactions with mucins and, in normal physiological conditions, do not reach the epithelium (Johansson et al., 2011; Moran et al., 2011; Ouwerkerk et al., 2013). The presence of multiple Mub repeats and their distribution along the length of the fibre may be necessary to ensure binding to these complex and highly glycosylated proteins by enhancing multivalency to mucin glycans (Fig. 6). Possible functions of Mub repeats other than glycan binding may include their involvement in cis or trans cross-linking with other MUB proteins protruding from the bacterial cell surface, as suggested by autoaggregation phenotypes of MUB-expressing strains (MacKenzie et al., 2010). Additionally, interaction with glycan-independent ligands, in particular proteins (such as IgA) present in large amounts in mucus, is suggested by adhesion assays (MacKenzie et al., 2009). These multiple interactions will potentiate the retention of commensals within the outer mucus layer (Fig. 6). Here, adhesion assays of native MUB to mammalian tissue sections or epithelial cell cultures showed evidence for binding to terminal sialylated mucin structures within mucus, whereas alteration of glycosylation resulted in MUB penetrating further down towards the epithelium and into the intestinal crypts. Sialic acid is highly abundant in mucus (Hamer et al., 2009) and displays regio-specific distribution along the mammalian GI tract (Robbe et al., 2003; Holmen-Larsson et al., 2013).
In addition to playing a role in the tropism of bacteria across mucus, these interactions may also explain the regio-specific location of bacteria along the human GI tract. A different situation occurs with enteric pathogens, which have evolved strategies to circumvent the mucus barrier, through flagella-mediated motility or through enzymatic degradation of the mucus, and can reach the epithelium surface where they bind to specific oligosaccharide structures via individual lectins or lectin N-terminal domains (Imberty and Varrot, 2008; McGuckin et al., 2011; Juge, 2012).
Fig. 6. Schematic representation of MUB binding to mucus. MUB forms fibre-like structures with multiple binding repeats distributed along the fibre potentiating multiple interactions with terminal sialic acid structures and mucus receptors including mucin glycans and proteins. These multiple and intimate associations will enhance retention of the bacteria within the outer mucus layer. (Not drawn to scale.)
The structural and functional analysis reported herein provides new insights into how lactobacilli adapt to the mucosal environment in the human GI tract and thereby exert beneficial effects. The adhesion mechanisms adopted by Gram-positive bacteria revealed intriguing structural similarities and singularities between commensal and pathogenic bacteria. These findings are important for refining targets and probiotic strategies for inhibiting interactions of pathogenic bacteria with the host.

Expression and purification of Mub repeats
The Mub-repeat genes coding for the Mub-R5, -RI, -RV, -R8-V, -RV-VI and -RI-II-III proteins or the N-terminal domain protein (Nterm) of the mucus-binding protein of Lactobacillus reuteri ATCC 53608 (American Type Culture Collection) were cloned into the pETBlue-1 AccepTor vector or the pOPINF vector and expressed in Escherichia coli. Recombinant Mub-repeat proteins were purified to homogeneity via ion-exchange chromatography followed by size exclusion chromatography (SEC). The Nterm protein was purified via immobilized metal ion affinity chromatography followed by SEC. Proteins were dialysed in 10 mM Tris-HCl (pH 7.5) or 10 mM sodium phosphate buffer (Mub-RV for crystallization purposes, Mub-RI-II-III) and concentrated to about 10 mg ml−1.

Protein crystallization, data collection and structure determination
Purified Mub-RV was crystallized at a protein concentration of 10 mg ml⁻¹ by sitting drop vapour diffusion at 16°C with a precipitant solution of 0.2 M ammonium acetate and 24% (w/v) PEG 3350. Crystals were cryoprotected by adding 25% (v/v) DMSO to the reservoir solution and a diffraction data set was collected on the i04 beamline at Diamond Light Source, Didcot, UK.
The integration and reduction of X-ray diffraction data were performed by the Xia2 automated data reduction system. The PHENIX program suite was used for phasing, model building and refinement. An initial model was obtained via automated Molecular Replacement (autoMR) and autobuilding using the N-terminal domain structure of Mub-R5 and the mucin-binding protein (MucBP) domain structure of the adhesion protein PEPE_0118 (PDB entry ID 3LYY). The final model was generated after several alternating rounds of manual model building in Coot (Emsley and Cowtan, 2004) and refinement.

Small-angle X-ray scattering
Scattering curves of Mub repeat proteins (Mub-R5, -RI, -RV, -R8-V, -RV-VI) (in Tris-HCl pH 7.5) were recorded in a concentration range of ∼0.6 to 9 mg ml⁻¹ as 10 × 10 s frames at a wavelength of 0.93 Å and at a sample-detector distance of 2.4 m covering the momentum transfer range of 0.04 < s < 0.61 Å⁻¹ (s = 4π sin(θ)/λ, where 2θ is the scattering angle and λ the wavelength) on the ID14-3 beamline (ESRF, Grenoble, France). Scattering profiles for Mub-RI-II-III (sodium phosphate pH 7.5, 2 mM DTT) and Nterm (Tris-HCl pH 7.5) were recorded in a concentration range of ∼0.5 to 5 mg ml⁻¹ and 0.6 to 9 mg ml⁻¹, respectively, as 10 × 10 s frames at a wavelength of 0.99 Å and at a sample-detector distance of 2.9 m covering the momentum transfer range of 0.03 < s < 0.45 Å⁻¹ on the BM29 beamline (ESRF, Grenoble, France). The ATSAS (version 2.4) software was used for SAXS data analysis. Briefly, data were normalized, buffer-subtracted and scaled for concentration, and data points across different concentrations were merged using PRIMUS (Konarev et al., 2003). The radius of gyration (Rg) and scattering at zero angle [I(0)] were calculated by Guinier approximation with s·Rg < 0.8 for elongated proteins, demonstrating the scattering profiles to be free from aggregation. The distance distribution function [P(r)] was generated by GNOM (Semenyuk and Svergun, 1991), computing the maximum particle diameter (Dmax) and an Rg value calculated for the whole scattering range. Ten ab initio shapes were reconstructed for Mub-R5, -RI, -RV, -R8-V, -RV-VI and Nterm by GASBOR (Svergun et al., 2001) or for Mub-RI-II-III by DAMMIF and averaged by the DAMAVER program package, generating a χ (chi) value, a measure of the fit between the experimental data and the shape reconstruction, and a normalized spatial discrepancy, a measure of the agreement between the computed shape models.
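The momentum transfer definition and the Guinier step can be illustrated with a minimal sketch. It is not the PRIMUS implementation; it assumes the standard Guinier relation ln I(s) = ln I(0) − (Rg²/3)s², shrinks the fitting window until s·Rg < 0.8 holds for every point used, and all function names and the synthetic data are hypothetical.

```python
import math

def momentum_transfer(two_theta_deg, wavelength_A):
    """s = 4*pi*sin(theta)/lambda, where 2*theta is the scattering angle (degrees)."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_A

def _linear_fit(x, y):
    """Ordinary least-squares line y = slope*x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def guinier_fit(s, intensity, max_s_rg=0.8, min_points=5):
    """Return (Rg, I0, n_used) from a fit of ln I against s^2.

    The low-angle window is reduced until every point used satisfies
    s*Rg < max_s_rg (0.8 is the limit quoted for elongated proteins).
    """
    pairs = sorted(zip(s, intensity))
    n_used = len(pairs)
    while True:
        x = [si ** 2 for si, _ in pairs[:n_used]]
        y = [math.log(ii) for _, ii in pairs[:n_used]]
        slope, intercept = _linear_fit(x, y)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; data may be unsuitable")
        rg = math.sqrt(-3.0 * slope)
        n_ok = sum(1 for si, _ in pairs if si * rg < max_s_rg)
        if n_ok >= n_used or n_used <= min_points:
            return rg, math.exp(intercept), n_used
        n_used = max(n_ok, min_points)

# Synthetic, noise-free example (assumed values: Rg = 30 Å, I(0) = 100)
s_demo = [0.005 + 0.0025 * i for i in range(19)]
i_demo = [100.0 * math.exp(-(30.0 ** 2) * si ** 2 / 3.0) for si in s_demo]
rg_demo, i0_demo, n_demo = guinier_fit(s_demo, i_demo)
```

On real data the fitted Rg changes as the window shrinks, which is why the limit is re-checked after every refit.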
Manual docking of the high-resolution X-ray structures of Mub-R5 (PDB entry ID 3I57) and Mub-RV (PDB entry ID 4MT5), representatives for Mub2 and Mub1 repeats, into the low-resolution reconstructions was performed using SCULPTOR and SITUS (http://situs.biomachina.org/). The refinement of the docking solution by SITUS calculated a cross correlation coefficient R, allowing quantitative evaluation of the agreement between the volumetric map and the docked structure. The solution scattering of Mub-R5 and Mub-RV was computed from their atomic structures and fitted to the collected experimental scattering curves using CRYSOL (Svergun et al., 1995). The molecular weight of the particles was calculated from the scattering intensity value at zero angle I(0) after scaling against protein concentration with bovine serum albumin (BSA) as standard, providing information on the oligomeric state of the proteins in solution.
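The molecular weight estimate from I(0) can be sketched as a simple ratio against the standard. This is an illustration under stated assumptions rather than the procedure used in ATSAS: it assumes I(0)/c scales linearly with molecular weight for proteins measured under identical conditions, takes the BSA monomer mass as roughly 66.4 kDa, and uses hypothetical function names and input values.

```python
BSA_MW_KDA = 66.4  # assumed monomer mass of the BSA standard, in kDa

def mw_from_i0(i0_sample, conc_sample, i0_std, conc_std, mw_std_kda=BSA_MW_KDA):
    """Estimate molecular weight (kDa) from concentration-scaled I(0).

    MW_sample = MW_std * (I0_sample / c_sample) / (I0_std / c_std),
    with both concentrations in the same units (e.g. mg/ml).
    """
    return mw_std_kda * (i0_sample / conc_sample) / (i0_std / conc_std)

def oligomeric_state(mw_estimated_kda, mw_monomer_kda):
    """Ratio of the SAXS-derived mass to the sequence-derived monomer mass."""
    return mw_estimated_kda / mw_monomer_kda

# Hypothetical numbers, chosen only to show the arithmetic
mw_demo = mw_from_i0(i0_sample=50.0, conc_sample=2.0, i0_std=66.4, conc_std=4.0)
state_demo = oligomeric_state(mw_demo, mw_monomer_kda=50.0)
```

A ratio close to 1 would indicate a monomer in solution and a ratio close to 2 a dimer; the estimate is only as accurate as the concentration measurements of sample and standard.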

Native MUB purification
The full-length native MUB was purified from spent media of an L. reuteri ATCC 53608 culture grown in LDMII [Lactobacillus defined media type II (Roos and Jonsson, 2002)] with 2% (w/v) sucrose at 37°C for 24 h. Bacterial cells were discarded after centrifugation at 7500 × g (15 min, 4°C) and spent media further clarified by vacuum-filtration using 0.45 μm and 0.2 μm filters. The clarified media extract was concentrated by tangential flow filtration using a Vivaflow 200 cassette (100 000 MWCO) (Vivascience AG, Hannover, Germany). The sample solution was then dialysed in 10 mM phosphate buffered saline (PBS) (pH 7.4), filtered using 0.45 μm Ultrafree-Cl spin columns (Millipore, Merck KGaA, Darmstadt, Germany) and concentrated using spin concentrators (100 000 MWCO). MUB protein was purified to homogeneity by size exclusion chromatography using a Superose 6 prep grade column (GE Healthcare, Uppsala, Sweden) equilibrated with 10 mM PBS (pH 7.5) at a flow rate of 0.4 ml min⁻¹ on an ÄKTA Fast Protein Liquid Chromatography system (GE Healthcare, New Jersey, USA).

Atomic force microscopy
The atomic force microscope used in this study was an MFP-3D BIO (Asylum Research, Goleta, CA, USA), and it was operated in air using AC mode for imaging. The cantilevers used were Olympus AC160TS (Olympus, Japan) with a nominal spring constant of ∼42 N m⁻¹ oscillated at a frequency 10% below resonance (typically around 320 kHz). The damping set point for imaging was kept to the minimum value that allowed stable tracking of the sample surface in order to minimize any sample deformation. Images were acquired at a scan rate of 1 Hz. Immediately prior to imaging, MUB solutions were diluted into ultrapure water to a concentration of 10 μg ml⁻¹. Deposition was carried out by spotting 4 μl of the diluted sample onto freshly cleaved mica, incubating for 1 min to allow adsorption to take place and then blowing the excess liquid off using argon.

Benzyl-α-GalNAc treatment of HT29-MTX cells
HT29-MTX monolayers were cultured for 14 days in 24-well plates on glass coverslips. The culture medium (DMEM, 10% FCS, 1% L-Glutamine) was replaced with DMEM (without FCS) containing 5 mM benzyl 2-acetamido-2-deoxy-α-D-galactopyranoside (benzyl-α-GalNAc, Sigma-Aldrich, UK). Control wells contained DMEM only. HT29-MTX monolayers were cultured for 24 h in the presence of benzyl-α-GalNAc. The culture medium was removed and wells washed once with PBS. To test the specificity and reversibility of the interaction, HT29-MTX monolayers cultured for 24 h with 5 mM benzyl-α-GalNAc were washed once and cultured for a further 24 h in culture medium without benzyl-α-GalNAc. Slides were washed three times in PBS and mounted in Hydromount mounting medium (National Diagnostics, UK).

Sodium periodate treatment
Formalin-fixed paraffin-embedded C57BL/6 mouse gastric tissue sections (5 μm) were washed in 0.1 M NaAc buffer (0.35% acetic acid, 0.32% (w/v) sodium acetate; pH 4.5 or pH 5.5) twice for 5 min, followed by an incubation in periodate buffer (10 mM periodate in 0.1 M NaAc buffer) at pH 4.5 (2 h) or pH 5.5 (20 min) in the dark. Tissue was washed in 0.1 M NaAc buffer once for 5 min, and twice in PBS. Tissue was reduced by immersion in borate buffer (50 mM NaBH4 in PBS, pH 7.6) for 30 min. Slides were washed twice in PBS and blocked with Tris-NaCl-Block buffer (PerkinElmer, Cambridgeshire, UK) for 1 h. Slides were rinsed with PBS and incubated with lectins or MUB (4 μg ml⁻¹ in PBS) overnight at 4°C. Following three washes in PBS-T (PBS, 0.05% (w/v) Tween-20) for 10 min, slides were incubated with neat antiserum of rabbit anti-Mub-R5 diluted in PBS for 3 h. Slides were washed three times in PBS-T for 10 min, and incubated with goat anti-rabbit Alexa Fluor 488 for 1 h in the dark. Following two washes in PBS-T for 10 min, nuclei were stained with DAPI for 10 min in the dark, washed three times and mounted in Hydromount.

MUB adhesion assay
HT29-MTX monolayers or tissue sections were incubated with 4 μg ml⁻¹ MUB diluted in PBS for 2 h at room temperature or overnight at 4°C. Monolayers or tissues were incubated with antiserum of rabbit anti-Mub-R5 (1:100) overnight at 4°C or for 3 h at room temperature, respectively. Secondary antibody (1:200 goat anti-rabbit Alexa Fluor 488, Invitrogen) was applied for 1 h at room temperature. Coverslips were washed three times in PBS and mounted in Hydromount mounting medium (National Diagnostics, UK).

Bioinformatics analyses
Four seed alignments, one for each of the four domains (type 1 B1, type 1 B2, type 2 B1, type 2 B2), were used as queries of the UniProt database (UniProt, 2013) using PSI-BLAST (Altschul et al., 1997). Each of the resulting four sets of hit domain sequences was then used to build a profile HMM using HMMER3 (http://hmmer.org). Each profile HMM, along with a fifth [MucBP from Pfam (Punta et al., 2012)], was used to search all the complete NCBI bacterial genomes (ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/) supplemented by five L. reuteri genomes in the NCBI drafts database (http://www.ncbi.nlm.nih.gov/genome/genomes/438). Processing with SignalP (Petersen et al., 2011) and presence of the Pfam domains YSIRK_signal, Gram_pos_anchor and C-term_anchor were used to assess the presence or absence of secretory signals and anchor domains. Refer to the Supplementary Material for full details.
(MacKenzie et al., 2009)]. (B) The proportion of secondary structural elements was calculated using the CONTIN analysis program, revealing a 64-71% content of β-sheet and β-turns and a low percentage of α-helices for all tested Mub proteins. Nterm showed an α-helix content of 6.2% and a 56.2% content of β-sheet and β-turns.
Fig. S2. SAXS analysis of single Mub-repeat proteins and solution shape reconstruction. SAXS data for the type 1 Mub-RV (blue) and Mub-RI (red) repeats and the type 2 Mub-R5 repeat (green) are presented. (A) The experimental scattering curves are shown as the logarithm of the scattering intensity I (black dots) as a function of the reverse momentum transfer s and presented offset for better visualisation. Overlaid on the scattering profiles are fits of the reconstructed averaged models for Mub-RV, Mub-RI and Mub-R5 generated by GASBOR. (B) Pair distribution functions P(r) were generated from the experimental scattering using GNOM. Low resolution shape reconstructions of Mub-RV (C), Mub-RI (D) and Mub-R5 (E) were achieved by GASBOR and high resolution structures of Mub-RV (blue) and Mub-R5 (green) were manually docked and refined using Sculptor and SITUS.
Fig. S3. Purification of native full-length MUB protein. Native MUB protein was purified from Lactobacillus reuteri ATCC 53608 spent media in a multi-step process. (A) In a final size exclusion chromatography (SEC) step, MUB elutes with the void volume of the column in elution fractions of high protein homogeneity, verified by SDS-PAGE gel (B) and specific protein detection via anti-Mub-R5 and -Mub-RI antibodies after Western blotting (C). 1: cell pellet, 2: extracted protein before SEC, 3-11: MUB-containing elution fractions (red box).
Fig. S4. MUB protein binding to HT29-MTX cell monolayers. HT29-MTX cell monolayers (n = 12) were incubated with MUB for 2 h at 37°C, followed by staining with rabbit anti-Mub-R5 and goat anti-rabbit Alexa Fluor 488 secondary antibody (A). HT29-MTX cell monolayers were stained with polyclonal anti-MUC5AC followed by goat anti-rabbit Alexa Fluor 488 secondary antibody (B). Bright field images are shown adjacent to fluorescent images to demonstrate the correlation of fluorescence staining with mucus droplet structures. Magnification ×400; scale bars, 50 μm.
Table S1. Size of Mub repeats and the N-terminal domain.
Table S2. Mub-RV data collection and refinement parameters.
Table S3. SAXS data statistics.