Bacterial cellulosomes are generally believed to assemble at random, like those produced by Clostridium cellulolyticum. They are composed of one scaffolding protein bearing eight homologous type I cohesins that bind to any of the type I dockerins borne by the 62 cellulosomal subunits, thus generating highly heterogeneous complexes. In the present study, the heterogeneity and random assembly of the cellulosomes were evaluated with a simpler model: a miniscaffoldin containing three C. cellulolyticum cohesins and three cellulases of the same bacterium bearing the cognate dockerin (Cel5A, Cel48F, and Cel9G). Surprisingly, rather than the expected randomized integration of enzymes, the assembly of the minicellulosome generated only three distinct types of complex out of the 10 possible combinations, thus indicating preferential integration of enzymes upon binding to the scaffoldin. A hybrid scaffoldin that displays one cohesin from C. cellulolyticum and one from C. thermocellum, thus allowing sequential integration of enzymes, was exploited to further characterize this phenomenon. The initial binding of a given enzyme to the C. thermocellum cohesin was found to influence the type of enzyme that subsequently bound to the C. cellulolyticum cohesin. The preferential integration appears to be related to the length of the inter-cohesin linker. The data indicate that the binding of a cellulosomal enzyme to a cohesin has a direct influence on the dockerin-bearing proteins that will subsequently interact with adjacent cohesins. Thus, despite the general lack of specificity of the cohesin–dockerin interaction within a given species and type, bacterial cellulosomes are not necessarily assembled at random.
Cel48Fc, Cel48F from C. cellulolyticum with its native C. cellulolyticum dockerin module appended
Cel48Ft, Cel48F from C. cellulolyticum with a C. thermocellum dockerin module appended
Cel5Ac48, Cel5A from C. cellulolyticum with the dockerin of Cel48F from C. cellulolyticum appended
Cel5Ac, Cel5A from C. cellulolyticum with its native C. cellulolyticum dockerin module appended
Cel5Af, Cel5A from C. cellulolyticum with a Ruminococcus flavefaciens dockerin module appended
Cel5At, Cel5A from C. cellulolyticum with a C. thermocellum dockerin module appended
Cel9Gc48, Cel9G from C. cellulolyticum with the dockerin of Cel48F from C. cellulolyticum appended
Cel9Gc, Cel9G from C. cellulolyticum with its native C. cellulolyticum dockerin module appended
Cel9Gf, Cel9G from C. cellulolyticum with an R. flavefaciens dockerin module appended
Cel9Gt, Cel9G from C. cellulolyticum with a C. thermocellum dockerin module appended
Cellulosomes are extracellular multienzyme complexes produced by cellulolytic anaerobic bacteria that efficiently depolymerize plant cell wall polysaccharides into fermentable sugars [1, 2]. The ‘simplest’ cellulosomes are those produced by mesophilic cellulolytic clostridia, such as Clostridium cellulolyticum [3, 4], Clostridium josui [5, 6], Clostridium papyrosolvens, and Clostridium cellulovorans [8-10], as the complexes synthesized by these bacteria are mainly composed of a single large scaffolding protein that shows no enzymatic activity but integrates all of the cellulosomal enzymes. This ‘scaffoldin’ contains a family 3a carbohydrate-binding module (CBM) that interacts with crystalline cellulose, one or more ‘X2’ modules whose functions are not clearly established, and a series of six to nine receptor modules called cohesins [5, 11, 12]. The cohesins bind to a complementary module called the dockerin, which is borne by the catalytic subunits of the complex [13, 14]. Other anaerobic bacteria, such as Clostridium thermocellum, Ruminococcus flavefaciens, and Acetivibrio cellulolyticus, produce more sophisticated cellulolytic complexes, comprising several interacting scaffoldins and cell surface proteins with cohesins appended. They are also characterized by the presence of several types of specific cohesin–dockerin docking devices that coexist within the same species, whereas mesophilic clostridia usually have only a single type of cohesin–dockerin docking system.
Despite their diversity in terms of structure, organization, and number of protein components, bacterial cellulosomes share common features. The cohesin–dockerin interaction is calcium-dependent and of high affinity, with reported KA values in the range of 10⁹ M⁻¹ or above [17-19]. Within a given species and type, each dockerin-bearing component can usually interact in vitro with any of the cognate cohesins with the same affinity. This is not surprising, considering the very high sequence identity among cohesins, and among dockerins, of the same type and species, and it suggests random incorporation of the catalytic subunits during cellulosome assembly. A few exceptions to strict sequence conservation and binding fidelity have been reported, but, in these specific cases, the dockerin and/or cohesin sequence differs from the most common sequence found for the given species and type. Thus, it was hypothesized that, in general, the proportion of each cellulosomal catalytic subunit in the cellulolytic complexes simply reflects the level of expression of the corresponding gene, and that cellulolytic bacteria produce a population of cellulosomes that are heterogeneous in terms of enzymatic composition, stoichiometry, and arrangement.
Because natural cellulosomes are complex and highly heterogeneous, cellulosome chimeras were designed to facilitate the biochemical and biophysical exploration of these cellulolytic complexes and to gain new insights into the functioning of bacterial cellulosomes [17, 23-26]. This technology exploits the species specificity of the cohesin–dockerin interaction to assemble complexes mimicking bacterial cellulosomes but composed of a limited number of enzymes with dockerins from different species appended. Furthermore, each enzyme is bound at a specific location onto the hybrid scaffoldin that bears the complementary cohesin modules. Thus, cellulosome chimeras are smaller and completely homogeneous, whereas natural cellulosomes are larger and highly heterogeneous. Former studies using artificial complexes showed that the cellulosome efficiency mainly stems from: (a) the enzyme proximity within the complex, which triggers additional synergy; and (b) the substrate-targeting effect of the CBM hosted by the scaffoldin, which anchors the whole complex at the surface of crystalline cellulose and enhances the overall activity [23, 24]. Nevertheless, these artificial and homogeneous complexes, even those containing a rather large number of enzymes, were systematically found to be less active than natural heterogeneous cellulosomes on crystalline cellulose or raw substrates such as hatched straw [17, 27].
This observation prompted us to evaluate the impact of the heterogeneity induced by random assembly of bacterial cellulosomes on the degradation of crystalline cellulose. For this purpose, a new scaffoldin bearing a CBM and three copies of cohesin 1 from C. cellulolyticum was constructed and mixed with one family 5 (Cel5A), one family 9 (Cel9G) and one family 48 (Cel48F) dockerin-bearing glycoside hydrolase from the same microorganism, to create a heterogeneous mixture of randomly assembled minicellulosomes. Surprisingly, the protein partners assembled ‘at random’ generated only three distinct complexes in constant proportions, rather than the 10 types of complex that would be expected, thereby indicating preferences among cellulosomal enzymes during complex formation. Furthermore, at high substrate concentrations, the mixture of the three complexes was found to be slightly more active on crystalline cellulose than the corresponding fully homogeneous cellulosome chimera containing the same three enzymes. It was also shown that the preferences among cellulosomal enzymes are not caused by minor variations in cohesin–dockerin affinity, and are directly affected by the length of the inter-cohesin linker segments of the scaffoldin.
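The figure of 10 expected complexes follows from counting enzyme compositions without regard to cohesin position, that is, choosing three enzymes with repetition from three candidates. The short Python sketch below enumerates them. The enzyme labels are shorthand introduced here for illustration, and treating the three identical cohesins as positionally equivalent is an assumption of the count (distinguishing positions would instead give 27 ordered arrangements).

```python
from itertools import combinations_with_replacement
from math import comb

# Shorthand labels (introduced for this illustration) for the three cellulases.
ENZYMES = ("Cel5A", "Cel48F", "Cel9G")
N_COHESINS = 3  # the trivalent scaffoldin carries three identical cohesins

# Each composition is a multiset of three enzymes; cohesin position is ignored.
compositions = list(combinations_with_replacement(ENZYMES, N_COHESINS))

# Closed form for multisets of size k drawn from n types: C(n + k - 1, k).
expected_count = comb(len(ENZYMES) + N_COHESINS - 1, N_COHESINS)

for composition in compositions:
    print("/".join(composition))
print(len(compositions), "distinct compositions")
```

Only three of these compositions were detected experimentally, which is the observation the rest of the study sets out to explain.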
Assembly of trivalent homogeneous and heterogeneous minicellulosomes
Two analogous CBM-bearing scaffoldins that include three cohesins were designed for this work (Fig. 1). The hybrid scaffoldin Scaf6 contains a family 3a CBM framed by one cohesin module from C. cellulolyticum at the N-terminus, and two cohesin modules from C. thermocellum and R. flavefaciens, respectively, at the C-terminus (Fig. 1). To compare the activity of the highly homogeneous trivalent cellulosome chimera based on Scaf6 with a complex of similar size but randomly assembled, a new scaffoldin, termed Scaf7, was designed that resembles Scaf6 but contains three identical copies of the first cohesin of the scaffoldin CipC from C. cellulolyticum. When mixed with equimolar amounts of three C. cellulolyticum cellulases bearing their native dockerins [Cel5A from C. cellulolyticum with its native C. cellulolyticum dockerin module appended (Cel5Ac); Cel48F from C. cellulolyticum with its native C. cellulolyticum dockerin module appended (Cel48Fc); and Cel9G from C. cellulolyticum with its native C. cellulolyticum dockerin module appended (Cel9Gc)], this scaffoldin would be expected to generate a mixture of 10 complexes with different enzyme compositions, whereas interaction of Scaf6 with Cel5Ac, Cel48F from C. cellulolyticum with a C. thermocellum dockerin module appended (Cel48Ft) and Cel9G from C. cellulolyticum with an R. flavefaciens dockerin module appended (Cel9Gf) was formerly shown to induce the formation of a single homogeneous complex. These three cellulases were selected as components for both types of complex, because they were previously shown to form one of the most active Scaf6-based complexes on crystalline cellulose or raw substrate, such as hatched straw. In both cases, complex formation was verified by nondenaturing PAGE. As expected, a single major band was obtained for the homogeneous Scaf6-based chimera (Fig. S1), but, surprisingly, only two major bands and one minor band with altered mobility were observed in the lane corresponding to Scaf7-based complexes (Fig. 2A). A similar result was obtained when the mixture of Scaf7-based complexes was analyzed with IEF, as two major bands and one faint band were detected, corresponding to three distinct pIs (Fig. 2B). Protein extraction from the two major IEF bands for subsequent analysis by SDS/PAGE indicated that the major band at ‘acidic’ pI (~ 5.4) essentially contained Scaf7 and Cel9Gc, whereas the ‘basic’ band (~ 5.8) contained Scaf7, Cel48Fc, and Cel5Ac (Fig. 2C), with Cel48Fc being more abundant than Cel5Ac. MS analyses confirmed that the major band at pI 5.8 was essentially composed of Scaf7, Cel48Fc, and Cel5A (Table 1), whereas the second major band at a more acidic pI (estimated to be 5.4) mainly contained Scaf7 and Cel9Gc. Unfortunately, the exact protein composition of the minor IEF band could not be determined with either MS or SDS/PAGE, because it contained less protein, and its location (between the major bands) made it difficult to excise without contamination from the other bands. Nevertheless, a complex made of Scaf7 and three Cel5Ac modules generated a band in the IEF gel at the same position (Fig. S2). Considering that all proteins were mixed in equimolar amounts, these data suggest that three distinct complexes were formed, and that the complex containing only Cel9Gc [Scaf7–(Cel9Gc/Cel9Gc/Cel9Gc)] accounted for 33% of all complexes. The complex corresponding to the ‘basic’ IEF band in which Cel48Fc seemed to be preponderant over Cel5Ac should correspond to Scaf7–(Cel48Fc/Cel48Fc/Cel5Ac), and its proportion should be 50% of all assembled complexes. Finally, the faint IEF band should correspond to Scaf7–(Cel5Ac/Cel5Ac/Cel5Ac), with a proportion of 17%.
To confirm this hypothesis, these three complexes were assembled separately and mixed in the same proportions prior to analysis by nondenaturing PAGE, together with the initial mixture of Scaf7-based complexes. Quantitative analysis of the resulting gel with ImageQuant TL software showed that both electrophoretic profiles could be superimposed (Fig. 3A,B). In contrast, even the omission of the minor complex (17%) containing three Cel5Ac modules induced significant variations in the electrophoretic pattern (Fig. 3A, lane 2; Fig. 3B, red line). Taken together, these results support the enzyme compositions and complex proportions hypothesized above.
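The 33%, 50% and 17% proportions can be recovered by a simple mass balance. The sketch below assumes, as the text infers from the gels, that every enzyme copy ends up bound, that all Cel9Gc sits in homotrimeric complexes, that all Cel48Fc sits in the 2:1 complex with Cel5Ac, and that the remaining Cel5Ac forms homotrimers. The function and key names are illustrative only.

```python
from fractions import Fraction


def scaf7_complex_fractions():
    """Mass balance for an equimolar mix of three enzymes on a trivalent scaffoldin.

    Assumes each enzyme is supplied at n moles and that only the three
    compositions inferred in the text are formed.
    """
    n = Fraction(1)  # moles of each enzyme, arbitrary unit

    cel9g_trimer = n / 3           # three Cel9Gc per complex uses up all Cel9Gc
    mixed = n / 2                  # two Cel48Fc per complex uses up all Cel48Fc
    cel5a_left = n - mixed         # one Cel5Ac is consumed per mixed complex
    cel5a_trimer = cel5a_left / 3  # remaining Cel5Ac forms homotrimers

    total = cel9g_trimer + mixed + cel5a_trimer
    return {
        "Cel9G/Cel9G/Cel9G": cel9g_trimer / total,
        "Cel48F/Cel48F/Cel5A": mixed / total,
        "Cel5A/Cel5A/Cel5A": cel5a_trimer / total,
    }


fractions_by_complex = scaf7_complex_fractions()
for name, share in fractions_by_complex.items():
    print(f"{name}: {float(share) * 100:.0f}%")
```

Under these assumptions the total number of complexes equals n, so the balance is consistent with the scaffoldin being supplied at the same molar amount as each enzyme.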
Table 1. Protein composition of the major IEF bands determined by MS. Columns: peptides prior to IEF separation; peptides in the pI 5.8 IEF band; peptides in the pI 5.4 IEF band. Values are the numbers of peptides found by LC-MS/MS analysis that match theoretical peptides.
Replacement of the native dockerins of Cel5Ac and Cel9Gc with that of Cel48Fc
The dockerin modules borne by Cel5Ac, Cel48Fc and Cel9Gc are highly homologous but not identical. In particular, the dockerin of Cel5Ac shows the typical C. cellulolyticum dyad AL at positions 11 and 12 of the first conserved segment of the dockerin, whereas the sequence AF is found at the same position in the second conserved segment [19, 28]. Although surface plasmon resonance analyses formerly indicated that Cel5Ac , Cel48Fc  and Cel9Gc have similar affinity constants for the first cohesin of CipC (H.-P. Fierobe, unpublished data), minor differences in affinity for this cohesin module that were not formerly detected could possibly account for the limited heterogeneity observed for the Scaf7-based complexes. To rule out this hypothesis, the dockerin module was ‘standardized’ for all three enzymes. Thus, the native dockerin modules found in Cel5Ac and Cel9Gc were replaced by that of Cel48Fc, which shows the most canonical C. cellulolyticum dockerin sequence, thereby generating the engineered cellulases Cel5A from C. cellulolyticum with the dockerin of Cel48F from C. cellulolyticum appended (Cel5Ac48) and Cel9G from C. cellulolyticum with the dockerin of Cel48F from C. cellulolyticum appended (Cel9Gc48), respectively (Fig. 1). Incubation of the scaffoldin Scaf7 with equimolar amounts of Cel5Ac48, Cel48Fc and Cel9Gc48 generated a mixture of complexes that seemed similar to that obtained with the wild-type enzymes Cel5Ac, Cel48Fc, and Cel9Gc, as two major bands and one minor band were observed on native PAGE (Fig. 4A), and three bands were obtained on the IEF gel (Fig. 4C). Nevertheless, the replacement of the dockerins slightly changed the pI of the engineered cellulases, and the positions of the IEF bands corresponding to the various complexes were more distinct and thus facilitated their subsequent excision for SDS/PAGE analysis. 
The same protein compositions were observed: the ‘lower’ and ‘intermediate’ IEF bands contained Scaf7 and Cel9Gc48, and Scaf7, Cel48Fc, and Cel5Ac48 (Cel48Fc seeming to be preponderant over Cel5Ac48), respectively, whereas the ‘upper’ IEF band was composed of Scaf7 and Cel5Ac48 (Fig. 4D). Once more, considering that all components were mixed in stoichiometric amounts, the data suggest that 50% of the complexes contain Scaf7–(Cel48Fc/Cel48Fc/Cel5Ac48), 33% contain Scaf7–(Cel9Gc48/Cel9Gc48/Cel9Gc48), and 17% contain Scaf7–(Cel5Ac48/Cel5Ac48/Cel5Ac48). These proportions were verified by assembling the three complexes separately and combining them in the same proportions prior to nondenaturing PAGE analysis (Fig. 4A, lane 2). Quantitative analysis of the obtained gel indicated a similar electrophoretic pattern to that of the initial mixture of Scaf7-based complexes (Fig. 4B). Thus ‘standardization’ of the dockerin module for all three cellulases had no impact on the compositions and proportions of the obtained Scaf7-based complexes.
Activities on crystalline cellulose of homogeneous versus heterogeneous complexes
The initial objective of the present study was to compare the activity of a highly homogeneous cellulosome chimera with that of a mixture of randomly assembled complexes containing the same enzymes, at both quantitative and qualitative levels. Thus, the activity on crystalline cellulose Avicel of the homogeneous Scaf6-based complex containing one Cel5Ac, one Cel48Ft and one Cel9Gf was compared with that of the mixture of Scaf7-based complexes. We also assayed the activity of a controlled combination of homogeneous Scaf6-based chimeras designed to mimic the composition of the heterogeneous mixture obtained with Scaf7-based complexes. For this purpose, Cel5A from C. cellulolyticum with an R. flavefaciens dockerin module appended (Cel5Af) and Cel9G from C. cellulolyticum with a C. thermocellum dockerin module appended (Cel9Gt) were constructed (Fig. 1). The resultant complexes Scaf6–[Cel5Ac/Cel5A from C. cellulolyticum with a C. thermocellum dockerin module appended (Cel5At)/Cel5Af], Scaf6–(Cel48Fc/Cel48Ft/Cel5Af) and Scaf6–(Cel9Gc/Cel9Gt/Cel9Gf) were assembled separately, and mixed at 17%, 50%, and 33%, respectively, i.e. in the same proportions established for the Scaf7-based complexes. The released soluble sugars were identified and quantified by anion exchange chromatography coupled with pulsed amperometry. At a previously employed substrate concentration of 3.5 g·L−1 , the heterogeneous Scaf7-based complexes and the analogous mixture of Scaf6-based complexes showed the same activity, which was found to be in the same range as that of the homogeneous cellulosome chimera composed of the same cellulases (Fig. 5A). In all cases, glucose, cellobiose, cellotriose and cellotetraose were released in the same proportions with this concentration of Avicel (Fig. S3). Longer cellodextrins were not detected, except in the case of cellopentaose, for which trace amounts (< 1 μm) were observed at the shortest incubation time examined (1 h). 
Interestingly, with a significantly higher substrate concentration of 30 g·L−1, the heterogeneous Scaf7-based or Scaf6-based complexes proved to be 20% more active than the homogeneous Scaf6-based cellulosome chimera (Fig. 5B). Furthermore, the proportions of the various soluble sugars released by the mixture of Scaf6-based and Scaf7-based heterogeneous complexes at all incubation times matched completely, but significantly diverged from the pattern of oligosaccharides released by the homogeneous Scaf6-based cellulosome chimera, especially in the case of glucose (Fig. 5C–F). The identical patterns of released cellodextrins found for both the heterogeneous mixture of Scaf7-based complexes and the combination of Scaf6-based chimeras provides further support for the proportions of the various chimeras being correct.
Quantification of the enzymatic preferences during cellulosome assembly
As described above, the incubation of Scaf7 with the three cellulases bearing similar or identical C. cellulolyticum dockerins generated only three distinct complexes, thus indicating preferential integration of enzymes upon binding to the scaffoldin. The influence of the initial incorporation of a specific enzyme on the occupancy of the adjacent cohesin was also investigated. For this purpose, a new strategy was developed, based on another hybrid scaffoldin that has formerly been produced and studied, Scaf4c . This miniscaffoldin lacks a CBM but contains one cohesin from C. thermocellum and one cohesin from C. cellulolyticum, separated by the typical 11-residue inter-cohesin linkers found in the scaffoldin CipC from C. cellulolyticum (Fig. 1). One of the available C. cellulolyticum cellulases with a C. thermocellum dockerin appended (Cel5At, Cel48Ft, or Cel9Gt) was first mixed with an equimolar amount of Scaf4c. The obtained binary complex was verified by nondenaturing PAGE, and subsequently incubated with a mixture of equimolar amounts of Cel5Ac48, Cel48Fc, and Cel9Gc48, which all contain the identical C. cellulolyticum dockerin module (from Cel48Fc) and compete for the C. cellulolyticum cohesin. In all cases, a mixture of ternary complexes resulted, which contained the preloaded enzyme bearing the C. thermocellum dockerin bound to its matching cohesin of Scaf4c and a distribution of the enzymes containing the C. cellulolyticum dockerin.
We initially planned to label the three cellulases bearing the same C. cellulolyticum dockerin with three different fluorescent probes to directly quantify the amount of each enzyme in the mixture of complexes. Unfortunately, it turned out that the modified cellulases had reduced affinity for the cognate cohesin as compared with the corresponding unlabeled enzymes, thus introducing a bias, even when fluorescent probes targeting cysteines, which are found in the catalytic domains but not in the dockerins, were used. For this reason, it was finally decided to employ unlabeled cellulases bearing the same C. cellulolyticum dockerin, and to analyze the resulting ternary complexes by nondenaturing PAGE.
To identify and quantify each minicellulosome, standard ternary complexes containing either Cel5Ac48, Cel48Fc or Cel9Gc48 were loaded onto the same gel (an example of such a gel obtained with the ternary complexes containing Cel48Ft is shown in Fig. 6A). As described above, quantitative analysis of the resulting gel with ImageQuant TL software (an example of the ternary complexes containing Cel48Ft is shown in Fig. 6B) and overlapping peak resolution with the Solver function of Excel software, based on the analysis of the standard complexes, provided an estimate of the proportion of each ternary complex in the mixture. The results for all ternary complexes in Fig. 6C show that the initial binding of Cel9Gt induced a strong preference for occupancy of the C. cellulolyticum cohesin by Cel9Gc48, as the ternary complex Scaf4c–(Cel9Gt/Cel9Gc48) accounted for ~ 62% of the mixture. As a second choice, Cel48Fc (35%) was clearly preferred to the family 5 cellulase (2%). Similarly, the initial attachment of Cel48Ft to the C. thermocellum cohesin of Scaf4c induced a strong preference (61%) for the binding of Cel48Fc to the adjacent cohesin. Scaf4c-based ternary complexes containing Cel48Ft and Cel9Gc48 were found to be slightly more abundant (25%) than those containing Cel48Ft and Cel5Ac48 (14%). In contrast, initial binding of Cel5At to the cognate cohesin of Scaf4c induced a comparatively moderate preference (49%) for Cel5Ac48 as a neighboring enzyme, whereas Cel48Fc and Cel9Gc48 were incorporated in similar proportions (23–28%). This second approach thus confirmed the occurrence of preferences among enzymes during cellulosome assembly, but also indicated that some cellulases, such as Cel9G, are more selective than others with respect to the dockerin-bearing partners bound to nearby cohesins.
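The overlapping peak resolution step can be pictured as fitting the densitometry profile of the mixture as a weighted sum of the profiles of the three standard complexes. The sketch below is not the authors' Excel Solver procedure; it swaps in an ordinary least-squares fit with NumPy on synthetic Gaussian band profiles. The band positions, widths and the "true" fractions are hypothetical values chosen for illustration, and the fit assumes each complex stains with the same response.

```python
import numpy as np


def gaussian_band(x, center, width=0.03):
    """Hypothetical densitometry profile of a single band along the lane."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def estimate_fractions(mixture, standards):
    """Least-squares weights of the standard profiles, normalized to sum to 1."""
    design = np.column_stack(standards)
    weights, *_ = np.linalg.lstsq(design, mixture, rcond=None)
    weights = np.clip(weights, 0.0, None)  # band intensities cannot be negative
    return weights / weights.sum()


# Synthetic lane: three partially overlapping bands (positions are made up).
x = np.linspace(0.0, 1.0, 400)
standards = [gaussian_band(x, center) for center in (0.40, 0.46, 0.52)]

# Hypothetical composition used to build the test mixture.
true_fractions = np.array([0.6, 0.3, 0.1])
mixture = sum(f * s for f, s in zip(true_fractions, standards))

estimated = estimate_fractions(mixture, standards)
print(np.round(estimated, 3))
```

With real gel scans the profiles carry noise and baseline drift, so a constrained solver and replicate lanes would be needed before reading the weights as proportions.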
Influence of the inter-cohesin linker length on enzyme preference during cellulosome assembly
The impact of the inter-cohesin distance within the scaffoldin on enzyme preference was also evaluated. For this purpose, additional hybrid scaffoldins derived from Scaf4c, but with different linkers connecting the cohesins from C. thermocellum and C. cellulolyticum, were used (Fig. 1). These hybrid scaffoldins, which were previously constructed and purified , have linkers that are shorter (four residues, Scaf4o) or longer (39 and 50 residues for Scaf4t and Scaf4tc, respectively) than the typical C. cellulolyticum inter-cohesin linker segment of 11 residues found in Scaf4c. In the case of Scaf4t, the cohesins are separated by the serine-rich and threonine-rich inter-cohesin linkers found in scaffoldin CipA from C. thermocellum, whereas Scaf4tc (formerly named Scaf4 ) contains, in addition, the characteristic 11-residue linker from C. cellulolyticum. As described above for the case of Scaf4c, a cellulase bearing a C. thermocellum dockerin was first bound to the cognate cohesin, and the binary complex was subsequently mixed with Cel5Ac48, Cel48Fc, and Cel9Gc48.
The obtained ternary complexes were analyzed by nondenaturing electrophoresis as described above. The results (Fig. 6C) obtained with Scaf4o were very similar to those obtained with Scaf4c. The initial binding of Cel9Gt and Cel48Ft to the C. thermocellum cohesin of scaffoldin Scaf4o induced strong preferences for Cel9Gc48 (59%) and Cel48Fc (65%), respectively, whereas the initial binding of Cel5At led to a weaker preference (46%) for binding of the same enzyme bearing a C. cellulolyticum dockerin on the adjacent cohesin. Furthermore, as observed with Scaf4c, when Cel9Gt was preliminarily incorporated into Scaf4o, there was a marked preference for Cel9Gc48 to be bound to the C. cellulolyticum cohesin, but, as a ‘second choice’, Cel48Fc was favored (33%) over Cel5Ac48 (8%).
In contrast, for the scaffoldin with the 39-residue linker (Scaf4t), the preferences observed with Scaf4o and Scaf4c were significantly diminished. The initial incorporation of Cel48Ft induced only a moderate preference (43%) for Cel48Fc, whereas the binary complex Scaf4t–Cel9Gt integrated a near-equivalent amount of any of the three competing cellulases bearing the same C. cellulolyticum dockerin (27–37%). Nevertheless, a weak preference for Cel5Ac48 was maintained (45%) when Cel5At was preliminarily bound to the C. thermocellum cohesin of Scaf4t. When the cohesins were separated by a 50-residue linker (Scaf4tc), however, the preferences induced by the initial binding of either Cel5At, Cel48Ft or Cel9Gt seemed to be completely abolished (Fig. 6C), as all binary complexes could incorporate the three C. cellulolyticum dockerin-bearing enzymes with similar yields (33% ± 7%).
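One way to read these percentages is against the 33% share expected if the three competing enzymes bound at random. The sketch below tabulates the same-enzyme shares quoted above and expresses them as fold enrichment over random. The dictionary layout and function names are illustrative. The Cel9G value for Scaf4t is left out because only a 27–37% range is given, and Scaf4tc is left out because only an overall 33% ± 7% is reported.

```python
# Share (%) of ternary complexes in which the C. cellulolyticum cohesin is
# occupied by the same enzyme as the one preloaded on the C. thermocellum
# cohesin. Values are those quoted in the text; entries not reported as a
# single number are omitted rather than guessed.
SAME_ENZYME_SHARE = {
    "Scaf4o": (4, {"Cel5A": 46, "Cel48F": 65, "Cel9G": 59}),
    "Scaf4c": (11, {"Cel5A": 49, "Cel48F": 61, "Cel9G": 62}),
    "Scaf4t": (39, {"Cel5A": 45, "Cel48F": 43}),
}

RANDOM_SHARE = 100 / 3  # expected share with three equally likely competitors


def enrichment(share_percent):
    """Fold enrichment of an observed share over random incorporation."""
    return share_percent / RANDOM_SHARE


def mean_enrichment(scaffoldin):
    """Average same-enzyme enrichment over the enzymes reported for a scaffoldin."""
    _, shares = SAME_ENZYME_SHARE[scaffoldin]
    return enrichment(sum(shares.values()) / len(shares))


for name, (linker_length, _) in SAME_ENZYME_SHARE.items():
    print(f"{name} ({linker_length}-residue linker): "
          f"{mean_enrichment(name):.2f}-fold over random")
```

The averages are a descriptive summary only; they are computed over different enzyme sets for Scaf4t and carry no error estimate.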
In previous studies on cellulosome-producing bacteria, it was suggested that cellulosic complexes are essentially randomly assembled, as the cohesin–dockerin interactions were reported to be rather unspecific within the same species and type [11, 33], although some exceptions have been reported for cellulosomal proteins that bear slightly divergent dockerins as compared with the typical dockerin sequence and show a net preference for specific cohesins of auxiliary scaffoldins. Not all cellulosomal components contribute equally to the cellulosomes, and some cellulosomal enzymes have been systematically found to be more abundant than others, e.g. the pivotal family 48 cellulase [4, 34, 35]. These variations in terms of relative proportions within the cellulosic complexes were assumed to simply reflect differences in the expression levels of their respective genes [21, 34]. Also, proteomic analyses of purified cellulosomes from C. cellulolyticum, for example, identified 30–48 different dockerin-bearing proteins (depending upon the cellulosic growth substrate) [3, 36], whereas the scaffoldin CipC has only eight cohesin modules, thus demonstrating that the bacterium secretes heterogeneous populations of cellulosomes.
In contrast, cellulosome chimeras are much more homogeneous than natural cellulosomes, because, in these artificial complexes, the enzymes have dockerins from various species appended, and they are bound at specified locations of a hybrid scaffoldin bearing the complementary cohesin modules. The use of cellulosome chimeras showed that some enzyme combinations are much more effective than others on pure cellulose or on raw cellulosic substrates [17, 23]. This observation implies that, if natural cellulosomes are completely assembled at random with available dockerin-bearing catalytic subunits, the bacterium may produce a panel of cellulolytic complexes ranging from highly active to much less productive.
In the present study, we aimed to determine whether homogeneity or heterogeneity of the cellulolytic complexes is most beneficial in terms of enzymatic efficacy on recalcitrant cellulosic substrates. For this purpose, we tested one of the most efficient trifunctional cellulosome chimeras known, composed of Cel5Ac, Cel48Ft and Cel9Gf bound to the cognate cohesins hosted by the hybrid scaffoldin Scaf6, on crystalline cellulose (Fig. 1) . A mixture was then prepared of similar minicellulosomes composed of the same three cellulases but bearing identical C. cellulolyticum dockerins bound to Scaf7, a scaffoldin with the same overall organization as Scaf6 but harboring three copies of the same C. cellulolyticum cohesin. Surprisingly, instead of obtaining a panel of 10 complexes with different enzyme compositions, the stoichiometric mixture of the four interacting proteins systematically generated only three different complexes, in constant proportions. Two complexes were found to contain only one type of enzyme, either Cel5Ac48 (17%) or Cel9Gc48 (33%), whereas the most abundant Scaf7-based minicellulosome (50%) contained two Cel48Fc modules and one Cel5Ac48. This result clearly indicates that preferences occur between cellulosomal enzymes occupying neighboring cohesins during cellulosome assembly that are not caused by minor variations in cohesin–dockerin affinities, as identical C. cellulolyticum docking modules were used.
To further characterize and quantify this phenomenon, another strategy based on a hybrid scaffoldin bearing one cohesin from C. thermocellum and one from C. cellulolyticum (Scaf4c; Fig. 1) but lacking a CBM was used. The initial binding of a specific enzyme (Cel5At, Cel48Ft, or Cel9Gt) to the C. thermocellum module of the scaffoldin had a direct impact on the type of cellulase that was preferentially bound to the adjacent C. cellulolyticum cohesin. Some cellulases (Cel48Ft and Cel9Gc) were found to be more restrictive than others (Cel5Ac) with respect to the occupancy of the neighboring cohesin. Interestingly, these preferences were observed when the two cohesins were separated by the typical 11-residue C. cellulolyticum linker or a shorter linker, but they were attenuated when the characteristic C. thermocellum 39-residue linker was used. In the case of the hybrid scaffoldin with both the C. thermocellum and the C. cellulolyticum linkers (50 residues in total), no significant enzymatic preferences could be detected. A former study showed that modification of the length (or the amino acid composition) of the inter-cohesin linkers in the hybrid scaffoldin Scaf4 did indeed change the inter-cohesin distance and the overall conformational flexibility. The same study, however, demonstrated that the binding of a specific enzyme pair (Cel48Ft and Cel9Gc) to these various Scaf4 derivatives generated minicellulosomes showing very similar activity on crystalline cellulose. Consequently, modifications of the inter-cohesin linker segments had no measurable impact on the functioning of the resulting complexes. The present study reveals another role of the linkers, and shows that the enzymatic preferences during cellulosome assembly depend directly on the lengths of inter-cohesin linkers.
To our knowledge, this is the first time that it has been experimentally shown that, in solution, bacterial cellulosomes do not necessarily self-assemble in a random fashion, and that mechanisms other than minor variations in cohesin–dockerin affinity can induce a bias in the enzymatic composition of the cellulolytic complexes. Computational simulations of the self-assembly of the C. thermocellum cellulosome have previously suggested that, depending on their physical characteristics and flexibility, which govern their residence time around the scaffoldin, some enzymes will, statistically, bind more often than others . This model could explain, in part, why only three complexes with specific compositions were obtained when Scaf7 was mixed with equimolar amounts of Cel5Ac, Cel48Fc, and Cel9Gc. The experiments performed with the series of Scaf4 derivatives, however, do not fully comply with this hypothesis, as the initial binding of Cel5At, Cel48Ft or Cel9Gt induced different enzyme preferences with respect to the occupancy of the neighboring C. cellulolyticum cohesin. According to the simulations formerly published , the same enzyme showing the longest residence time would be preferentially incorporated, whatever cellulase is already bound to the C. thermocellum cohesin module. It should, however, be noted that the computational simulations were performed with the C. thermocellum enzymes Cel5B and CelS (also called Cel48A), which are homologous to Cel5Ac and Cel48Fc from C. cellulolyticum, respectively, whereas the C. thermocellum GH9 enzyme CbhA, which is expected to bind more frequently, is different in its modular content and more voluminous than the cellulase Cel9Gc used in the present study.
Clearly, the enzymatic preferences during complex formation are not dictated by steric hindrance, as the initial binding of Cel48Ft or Cel9Gt, whose masses are in the 80-kDa range, to Scaf4c (or Scaf4o), did not favor subsequent binding of the smallest competing cellulase (Cel5Ac48, 51 kDa) to the C. cellulolyticum cohesin, but rather induced the incorporation of one of the voluminous cellulases. Interestingly, the initial binding of Cel48Ft and Cel9Gt to the C. thermocellum cohesin of Scaf4c or Scaf4o strongly favored the binding of the same enzyme (Cel48Fc and Cel9Gc48, respectively) to the adjacent cohesin. The same phenomenon was also observed, but to a lesser extent, in the case of the family 5 endoglucanase. Similarly, two of the three Scaf7-based complexes obtained are homogeneous in terms of enzyme components, as they contain either three Cel5Ac modules or three Cel9Gc modules. The latter observation could indicate that some of these enzymes form dimers or oligomers in the free state, but gel filtration experiments performed on these proteins (data not shown), as well as previously reported small-angle X-ray scattering analyses performed on free Cel48Fc/Cel48Ft and Cel5Ac/Cel5At, clearly show that these cellulases do not polymerize in solution [38, 39]. Thus, the present study indicates that these enzymatic preferences during cellulosome assembly are not caused by slightly different affinities among cohesin–dockerin interactions or enzyme dimerization in the free state, but depend on the inter-cohesin distance. It is also worth noting that the enzyme discrimination was observed despite the cohesin–dockerin dual binding mode that was demonstrated for the (type I) docking modules of C. cellulolyticum and C. thermocellum used in the present study [19, 40].
The molecular mechanism(s) responsible for the preferential integration of cellulosomal components occupying adjacent cohesins on the scaffoldin remain(s) unknown. Nevertheless, it is hypothesized that interactions (or repulsions) between catalytic domains and/or ancillary modules other than the dockerin may be involved. It is worth noting that the three C. cellulolyticum cellulases selected in the present study belong to three different glycoside hydrolase families, and show no sequence homology, apart from their C-terminal dockerin module. Annotation of the C. cellulolyticum genome, however, revealed the presence of a number of genes encoding similar cellulosomal enzymes. For instance, 12 genes coding for GH9 enzymes with a dockerin appended were discovered, and proteomic analyses of purified cellulosomes showed that all of the corresponding proteins are found in the cellulolytic complexes, whatever cellulosic growth substrate is used. Among these GH9 cellulosomal components, five putative enzymes (Cel9H, Cel9J, Cel9P, and the proteins encoded by the genes at loci Ccel_231 and Ccel_1249) show the same overall organization as Cel9Gc (leader peptide–GH9–CBM3c–dockerin), and share ~45% sequence identity with the entire Cel9Gc sequence. Would discrimination occur between these very similar enzymes during cellulosome assembly, or would their binding be favored (or disfavored) just like that of the homologous cellulase Cel9Gc? Similarly, would the major cellulosomal cellulase Cel9E and the product of the gene at locus Ccel_2392, which is 78% identical to Cel9E, be discriminated upon cellulosome assembly? Answering these questions would certainly provide some clues to the mechanisms that induce enzyme preferences during cellulosome assembly.
Finally, another result concerns the activity on crystalline cellulose of the heterogeneous Scaf7-based complexes, which was similar to, and, at high substrate concentrations, even greater than that of the Scaf6-based homogeneous cellulosome chimera composed of the same enzymes. Considering the enzymatic composition of the Scaf7-based complexes, this high activity suggests that some synergy may occur between complexes with different enzyme compositions. This observation was formerly reported in the case of different fractions of wild-type cellulosomes from C. cellulolyticum separated by chromatography and showing different enzyme contents. Thus, thanks to the enzymatic heterogeneity of the bacterial cellulosomes, different complexes can act synergistically and achieve better depolymerization of the plant cell wall polysaccharides. The present study indicates that, even in the case of pure cellulose and minicellulosomes composed of only three different cellulases, minicomplexes showing limited heterogeneity proved to be as efficient as (and, at high substrate concentrations, even slightly better than) the corresponding fully homogeneous cellulosome chimera. The data thus support a significant physiological role for the observed heterogeneity of natural cellulosomes, over and above the simple diversity of enzyme content required to match the diversity and recalcitrance of the polysaccharides that characterize the plant cell wall substrate.
Strains and plasmids
The proteins encoded by the various plasmids constructed in this study are summarized in Fig. 1. The construction of pET-Ac, pET-Fc, pET-Gc, pET-At, pET-Ft, pET-Gf, pET-Scaf4, pET-Scaf4o, pET-Scaf4c, pET-Scaf4t, and pET-Scaf6, encoding Cel5Ac, Cel48Fc, Cel9Gc, Cel5At, Cel48Ft, Cel9Gf, Scaf4tc, Scaf4o, Scaf4c, Scaf4t, and Scaf6, respectively (Fig. 1), has been described previously [17, 23, 24, 31].
Vector encoding Scaf7 bearing CBM3a from C. thermocellum and three copies of cohesin 1 from C. cellulolyticum
pET-Scaf6 was first modified by introducing, at the unique NcoI site located at the 5′-extremity of the gene, an adaptor composed of the complementary forward Hf (5′-CATGCACCACCACCACCACCA-3′) and reverse Hr (5′-CCATGTGGTGGTGGTGGTGGTG-3′) primers encoding a His-tag, thereby generating pET-HScaf6, in which the NcoI site was erased. With pET-Scaf6 as the template, the region encoding CBM3a was amplified with the forward CBMf (5′-CACCGGTATCAGGCAATTTGAAGG-3′) and reverse CBMr (5′-TTTTAGGATCCTTTTCCCATGGGCGGTATTGTTGTTGCAGGTGG-3′) primers, introducing an NcoI site and a BamHI site (underlined) at the 3′-extremity of the DNA. The amplicon was cloned in EcoRI–BamHI-linearized pET-HScaf6, to generate pET-Coh1c-CBMt. The DNA encoding cohesin 1 of CipC from C. cellulolyticum was amplified from pSOSCipC with the forward CoF (5′-TTTTTTCCATGGGCGATTCTCTTAAAGTTACAGTAGGAACA-3′) and reverse CoRb (5′-ATTAGGATCCTTATACTGCTACTTTAAGTTCCTTTGTAGG-3′) primers, introducing an NcoI site and a BamHI site (underlined) at the 5′-extremity and the 3′-extremity, respectively. The amplicon was subsequently cloned in NcoI–BamHI-linearized pET-Coh1c-CBMt, thus generating pET-Coh1c-CBMt-Coh1c. The vector pET-Scaf7 was obtained by amplification of the DNA encoding cohesin 1 from CipC with pSOSCipC as the template and the forward CoF and reverse CoRn (5′-TTTTTTTCCATGGCTACTTTAAGTTCCTTTGTAGGTTG-3′) primers, introducing an NcoI site (underlined) at both extremities, and cloning of the amplicon in NcoI-linearized pET-Coh1c-CBMt-Coh1c. Clones containing pET-Scaf7 with correct orientation of the insert were selected by PCR on colonies with the primer pair CBMf/CoRn, and verified by sequencing.
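Because the restriction sites are no longer visually marked in the primer sequences, a quick check can locate them. The sketch below is a minimal illustration that searches four of the primers listed above for the NcoI (CCATGG) and BamHI (GGATCC) recognition sequences; both sites are palindromic, so scanning the written strand is sufficient. The helper name `find_sites` is hypothetical.

```python
# Recognition sequences of the two enzymes used for Scaf7 construction.
SITES = {"NcoI": "CCATGG", "BamHI": "GGATCC"}

# Primer sequences copied from the text (5' to 3').
PRIMERS = {
    "CBMr": "TTTTAGGATCCTTTTCCCATGGGCGGTATTGTTGTTGCAGGTGG",
    "CoF": "TTTTTTCCATGGGCGATTCTCTTAAAGTTACAGTAGGAACA",
    "CoRb": "ATTAGGATCCTTATACTGCTACTTTAAGTTCCTTTGTAGG",
    "CoRn": "TTTTTTTCCATGGCTACTTTAAGTTCCTTTGTAGGTTG",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: 0-based position of first match} for sites present in seq."""
    seq = seq.upper()
    return {name: seq.find(motif) for name, motif in sites.items() if motif in seq}


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

With these sequences, CBMr carries both sites, CoF and CoRn carry only NcoI, and CoRb carries only BamHI, in agreement with the cloning steps described.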
Vectors encoding Cel5A and Cel9G with modified dockerins
The vector pET-Ac48 encoding Cel5A with the native C. cellulolyticum dockerin of Cel48F appended was obtained by overlap extension PCR. The DNA encoding the catalytic module of Cel5Ac was amplified with the template pET-Ac and the forward Af (5′-TTTTTTCTCATATGTATGATGCTTCACTTATTCCG-3′; NdeI site underlined) and reverse A48r (5′-CTTTCGAAGCCAAGACAGATCCAGATCAAGGTCCAGAAAAATTATTGGG-3′) primers. The DNA encoding the dockerin module of Cel48F was amplified from pET-Fc with the forward A48f (5′-CCCAATAATTTTTCTGGACCTTGATCTGGATCTGTCTTGGCTTCGAAAG-3′) and reverse Fr (5′-TAAAACTCGAGTTGGATAGAAAGAAGTGC-3′; XhoI site underlined) primers. The two overlapping fragments (overlapping region in italics) were mixed, and a combined fragment was synthesized with the external primers Af and Fr. The fragment was cloned into NdeI–XhoI-linearized pET22b(+) (Novagen, Madison, WI, USA), thereby generating pET-Ac48. The vector encoding Cel5Af (pET-Af) was constructed in a similar manner. An amplicon was obtained with pET-Ac and the forward Af and reverse Afr (5′-CATGTAGGAACGAGCTTTGTGCCGGGGTCAGGATCTGTCTTGGCTTCGAAAG-3′) primers. The DNA encoding the R. flavefaciens dockerin was amplified from pET-Gf with the forward Aff (5′-CTTTCGAAGCCAAGACAGATCCTGACCCCGGCACAAAGCTCGTTCCTACATG-3′) and reverse Gfr (5′-TTTATTCTCGAGTTGAGGAAGTGTGATGAG-3′; XhoI site underlined) primers. Both amplicons were mixed and combined by use of the external primers Af and Gfr, and the final fragment was cloned in the NdeI–XhoI-linearized pET22b(+) vector, thereby generating pET-Af.
The vector pET-Gc48 was obtained similarly. PCR with pET-Gc as the template and the forward G1550f (5′-GATCCTTTAAGCCTTGTAACAAG-3′; annealing 50 bases upstream of the unique NcoI site) and reverse G48r (5′-AACGAACCCGCAGGTGGATCAGAAAACCCAGATCAAGGTCCAGAAAAATTATTGGG-3′) primers was performed, whereas the DNA encoding the dockerin module of Cel48F was amplified from pET-Fc with the forward G48f (5′-CCCAATAATTTTTCTGGACCTTGATCTGGGTTTTCTGATCCACCTGCGGGTTCGTT-3′) and reverse Fr primers. The overlapping amplicons (overlapping region in italics) were mixed and combined by use of the external primers G1550f and Fr. The final amplicon was cloned in NcoI–XhoI-linearized pET-Gc, thus generating pET-Gc48. The vector encoding Cel9Gt was also obtained by overlap extension PCR. A first amplicon was generated by PCR from pET-Gc with the forward G1550f and reverse Gtr (5′-GCCGTATAATTTAGTAGAAGGAGTACCATCAGGGTTTTCCGATCCACCTGC-3′) primers. The DNA encoding the C. thermocellum dockerin was amplified from pET-Ft with the forward Gtf (5′-GCAGGTGGATCGGAAAACCCTGATGGTACTCCTTCTACTAAATTATACGGC-3′) and reverse Ftr (5′-TTATTCTCGAGGTTCTTGTACGGCAATGTAT-3′; XhoI site underlined) primers. The two overlapping fragments (overlapping region in italics) were mixed, and a combined fragment was synthesized with the external primers G1550f and Ftr. The fragment was cloned into NcoI–XhoI-linearized pET-Gc, thereby generating pET-Gt.
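In overlap extension PCR, the two internal primers of each construct must anneal to one another so that the two amplicons can prime each other in the fusion step. A minimal sketch of that check, using the internal primer pairs listed above, tests whether each reverse primer is the exact reverse complement of its forward partner. The helper name `reverse_complement` is hypothetical; the sequences are copied from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Internal (overlapping) primer pairs: construct -> (forward, reverse).
OVERLAP_PAIRS = {
    "pET-Ac48": (
        "CCCAATAATTTTTCTGGACCTTGATCTGGATCTGTCTTGGCTTCGAAAG",  # A48f
        "CTTTCGAAGCCAAGACAGATCCAGATCAAGGTCCAGAAAAATTATTGGG",  # A48r
    ),
    "pET-Af": (
        "CTTTCGAAGCCAAGACAGATCCTGACCCCGGCACAAAGCTCGTTCCTACATG",  # Aff
        "CATGTAGGAACGAGCTTTGTGCCGGGGTCAGGATCTGTCTTGGCTTCGAAAG",  # Afr
    ),
    "pET-Gc48": (
        "CCCAATAATTTTTCTGGACCTTGATCTGGGTTTTCTGATCCACCTGCGGGTTCGTT",  # G48f
        "AACGAACCCGCAGGTGGATCAGAAAACCCAGATCAAGGTCCAGAAAAATTATTGGG",  # G48r
    ),
    "pET-Gt": (
        "GCAGGTGGATCGGAAAACCCTGATGGTACTCCTTCTACTAAATTATACGGC",  # Gtf
        "GCCGTATAATTTAGTAGAAGGAGTACCATCAGGGTTTTCCGATCCACCTGC",  # Gtr
    ),
}

fully_overlapping = {
    construct: reverse_complement(fwd) == rev
    for construct, (fwd, rev) in OVERLAP_PAIRS.items()
}
print(fully_overlapping)
```

For all four constructs the internal primers are fully complementary, so the overlap shared by the two amplicons spans the whole primer length (49 to 56 nucleotides).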
Positive clones were verified by DNA sequencing. The BL21(DE3) Escherichia coli strain (Novagen) was used as the production strain.
Production and purification of cellulases and scaffoldins
The production and purification of Cel5Ac, Cel48Fc, Cel9Gc, Cel48Ft, Cel9Gf, Scaf4tc, Scaf4o, Scaf4c, Scaf4t and Scaf6 have been described previously [17, 23, 24]. Cel5Ac48 and Cel5Af were produced and purified with the same procedure as used for Cel5Ac. Cel9Gc48 and Cel9Gt were produced and purified with the same procedure as used for Cel9Gc.
The BL21(DE3) strain overproducing Scaf7 was grown in toxin flasks at 37 °C in LB medium supplemented with glycerol (12 g·L−1) and kanamycin (50 mg·L−1) up to D600 nm = 1.5. The culture was cooled, and induction of expression was performed overnight at 20 °C with 200 μM isopropyl thio-β-D-galactoside. After 16 h, the cells were harvested by centrifugation (3000 g, 15 min), resuspended in 100 mL of 50 mM potassium phosphate buffer (pH 7.0), supplemented with a few milligrams of DNase I (Roche, Mannheim, Germany), and broken in a French press. The crude extract was mixed with 20 g of crystalline cellulose Avicel PH101 (Fluka, Buchs, Switzerland), and the suspension was vacuum filtered (Whatman glass fiber filter with a pore size of 2.7 μm; GE Healthcare, Uppsala, Sweden). Cellulose was washed three times with 150 mL of 50 mM potassium phosphate buffer (pH 7.0), and once with 150 mL of 10 mM potassium phosphate buffer (pH 7.0). The proteins specifically adsorbed on cellulose were eluted with 100 mL of 1% triethylamine (v/v). The pH was immediately reduced to 8 by adding 4 mL of 2 M Tris/HCl (pH 8). The purification of Scaf7 containing an N-terminal His-tag was performed on Ni2+–nitrilotriacetic acid resin (Qiagen, Venlo, The Netherlands).
The purified proteins were dialyzed by ultrafiltration against 10 mM Tris/HCl (pH 8.0) and 1 mM CaCl2, and stored at −80 °C. The concentration of the proteins was estimated by absorbance at 280 nm in 25 mM sodium phosphate (pH 6.5), with ProtParam (www.expasy.org/tools/protparam.html).
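The concentration estimate can be sketched as follows. This is an illustration, not the exact procedure used here: it assumes the commonly documented ProtParam relation for the molar extinction coefficient at 280 nm (5500 per Trp, 1490 per Tyr and 125 per cystine, in M−1·cm−1) and the Beer–Lambert law. The function names and the toy peptide are hypothetical.

```python
def extinction_coefficient_280(seq, n_cystines=0):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumes the ProtParam-style relation: 5500 per Trp, 1490 per Tyr,
    125 per disulfide bond (cystine).
    """
    seq = seq.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_cystines


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive (no Trp/Tyr/cystine?)")
    return a280 / (epsilon * path_cm)


# Toy peptide (hypothetical, not one of the proteins studied here).
toy_sequence = "MWYKWAYC"
epsilon = extinction_coefficient_280(toy_sequence)
concentration = molar_concentration(0.699, epsilon)
print(epsilon, concentration)
```

For a real protein, the full mature sequence (without leader peptide, with any tag) would be supplied in place of the toy peptide.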
Complex formation and analyses of the heterogeneous complexes by electrophoresis
Complex formation was routinely verified as follows: samples (10 μM final concentration) were mixed at room temperature in 20 mM Tris/maleate (pH 6.0) and 1 mM CaCl2, and 4 μL was subjected to PAGE (4–15% gradient) with a PhastSystem apparatus (GE Healthcare).
Complex proportions were estimated by loading the sample of interest on a 4–15% gradient gel (1 μM final concentration, same buffer as above), with standard complexes of known enzyme composition. After Coomassie blue staining and destaining, the gel was digitized, and quantitative analysis was performed with ImageQuant TL (GE Healthcare). The obtained peaks were resolved and quantified with Microsoft Excel Solver, as formerly described by Dasgupka, providing an estimate of the contribution of each type of complex to the heterogeneous mixture of minicellulosomes.
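The principle of resolving overlapping densitometry peaks can be illustrated with a simplified linear fit. The sketch below is not the Solver-based procedure cited above: it assumes Gaussian band profiles whose positions and common width are already known (for example, from the standard complexes), so that only the peak heights need to be fitted by least squares. All names and numerical values are hypothetical.

```python
import math


def gaussian(x, center, width):
    """Unit-height Gaussian band profile."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_peak_fractions(xs, ys, centers, width):
    """Least-squares peak heights, returned as fractions of the total.

    With a common width, peak area is proportional to peak height.
    """
    basis = [[gaussian(x, c, width) for x in xs] for c in centers]
    ata = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis] for bi in basis]
    aty = [sum(p * y for p, y in zip(bi, ys)) for bi in basis]
    heights = solve_linear(ata, aty)
    total = sum(heights)
    return [h / total for h in heights]


# Synthetic lane profile: three bands with heights 5, 3 and 2 (made-up data).
xs = [0.5 * i for i in range(201)]
centers, width = [30.0, 45.0, 60.0], 4.0
ys = [sum(h * gaussian(x, c, width) for h, c in zip([5, 3, 2], centers)) for x in xs]
fractions = fit_peak_fractions(xs, ys, centers, width)
print([round(f, 3) for f in fractions])
```

A nonlinear solver, as used in the study, is needed when peak positions and widths must be fitted as well.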
IEF was also performed on IEF 3-9 PhastGel (GE Healthcare). Four microliters of sample was loaded at 10 μM. In some experiments, to determine the protein composition of each band, the bands were excised after Coomassie blue staining, and incubated for 2 min in 0.112 M Tris/HCl (pH 8.0), 0.112 M acetic acid, 1% (w/v) dithiothreitol, and 2.5% (w/v) SDS. A second 2-min incubation in 0.112 M Tris/HCl (pH 8.0), 0.112 M acetic acid, 1% (w/v) dithiothreitol, 2.5% (w/v) SDS and 0.1% (w/v) bromophenol blue was then performed. The plugs were subsequently loaded onto a 15% SDS/PAGE gel (MiniProtean vertical gel system; Bio-Rad, Hercules, CA, USA) with pure cellulases and scaffoldin (at 0.2 μM or 1 μM) as standards, and stained with the Plus One Silver Staining Kit Protein (GE Healthcare). Alternatively, the IEF bands were cut and analyzed by MS (see below).
Cellulase labeling with fluorescent probes
Cel5Ac48, Cel48Fc and Cel9Gc48 (1.8 mg of protein) were mixed with 0.1 mg of DyLight 488, 550 or 633 maleimide dye (Thermo Scientific, Waltham, MA, USA), respectively, already resuspended in dimethylformamide (10 g·L−1 final concentration). Alternatively, 3.5 mg of Cel5Ac48, Cel48Fc and Cel9Gc48 in 20 mM potassium phosphate buffer (pH 7.0) were mixed with Alexa Fluor 430, 546 or 633 succinimidyl ester dye (Protein Labeling Kit, Invitrogen, Carlsbad, CA, USA), respectively, according to the manufacturer's instructions. Free fluorescent probe and labeled protein were separated with a spin fluorescent dye removal column (Thermo Scientific). Complexes containing fluorescent cellulases were analyzed by nondenaturing electrophoresis as described above, and the resulting gel was scanned with an FLA-5100 fluorescence scanner (FujiFilm, Tokyo, Japan) with appropriate filters (DBR1, DGR1 and LPR; FujiFilm), before analysis with ImageQuant TL software.
In-gel trypsin digestion of proteins and ion trap LC-MS/MS analyses
For MS, proteins were excised from IEF gels by slicing each band. A robotic workstation (Freedom EVO 100; Tecan, Männedorf, Switzerland) was used to perform automated sample preparation, including steps of washing, reduction and alkylation, digestion by trypsin (Sigma-Aldrich, St Louis, MO, USA), extraction and drying of mixed peptides, as previously described. The corresponding liquid sample (at 10 μM) prior to IEF separation was also directly subjected to reduction, alkylation, and digestion by trypsin.
The peptides were analyzed by two-dimensional LC, with a dynamic nanospray ionization source, on the ion trap LCQ-DECAXP mass spectrometer (Thermo Finnigan, Ringoes, NJ, USA), with the big three method from the Finnigan Protomex 2.0 data acquisition Xcalibur software, as previously described. A peak list was generated with BioWorks Browser version 3.3 (Thermo Finnigan). Protein identification was performed with the SEQUEST (v28rev12) algorithm, with the nonredundant NCBI database restricted to E. coli and supplemented with the sequences of Scaf7, Cel5Ac, Cel48Fc, and Cel9Gc.
Kinetics were determined by incubating, at 37 °C under mild shaking, aliquots (40 μL) of protein samples [10 μM in 20 mM Tris/maleate (pH 6.0) and 1 mM CaCl2] with 4 mL of Avicel PH101 at 3.5 g·L−1 or 30 g·L−1 in 20 mM Tris/maleate (pH 6.0), 1 mM CaCl2, and 0.01% (w/v) NaN3. The final concentration of homogeneous or heterogeneous minicellulosomes in the reaction tube was thus 0.1 μM. Aliquots (900 μL) were extracted at 0, 1, 6 and 24 h, and centrifuged (15 000 g) at 4 °C, and the released soluble sugars present in the supernatant were analyzed in a Dionex (Sunnyvale, CA, USA) ICS 3000 equipped with a pulsed amperometric detector. Two hundred microliters of supernatant was mixed with 50 μL of 0.5 M NaOH, and 25 μL was applied to a Dionex CarboPac PA1 column (4 × 250 mm) and the corresponding guard column (4 × 50 mm). Sugars were eluted with the buffers 0.1 M NaOH and 0.5 M sodium acetate + 0.1 M NaOH as eluants A and B, respectively, with the following multistep procedure: isocratic separation (5 min, 95% A + 5% B), gradient separation (8 min, 10–37% B), column wash (2 min, 99% B), and subsequent column equilibration (2.5 min, 95% A + 5% B). The flow rate was kept at 1 mL·min−1. Injections of samples containing glucose, cellobiose, cellotriose, cellotetraose and cellopentaose (Sigma-Aldrich) at known concentrations (ranging from 5 to 100 μM) were used to identify and quantify the released sugars.
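The arithmetic behind the stated final enzyme concentration, and the principle of quantifying sugars against standards, can be sketched as follows. The calibration part is an assumption for illustration: it supposes a linear detector response passing through the origin over the 5–100 μM range, and the peak areas used are made-up numbers, not measured data. Function names are hypothetical.

```python
def final_concentration(stock_um, aliquot_ul, substrate_ml):
    """Concentration (uM) after adding an aliquot to the substrate suspension."""
    total_ul = aliquot_ul + substrate_ml * 1000.0
    return stock_um * aliquot_ul / total_ul


def calibration_slope(concentrations_um, peak_areas):
    """Least-squares slope of area versus concentration, forced through zero.

    Assumption: linear detector response over the calibrated range.
    """
    num = sum(c * a for c, a in zip(concentrations_um, peak_areas))
    den = sum(c * c for c in concentrations_um)
    return num / den


def quantify(peak_area, slope):
    """Convert a sample peak area to a concentration (uM)."""
    return peak_area / slope


# 40 uL of a 10 uM complex added to 4 mL of Avicel suspension.
enzyme_um = final_concentration(10.0, 40.0, 4.0)
print(round(enzyme_um, 3))

# Hypothetical cellobiose standards (uM) and peak areas (arbitrary units).
slope = calibration_slope([5, 25, 50, 100], [10, 50, 100, 200])
print(quantify(50, slope))
```

The exact dilution gives approximately 0.099 μM, which rounds to the 0.1 μM quoted in the text.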
We are very grateful to R. Lebrun and S. Lignon (Plateforme protéomique de l'IMM, Marseille-Protéomique, Marseille, France) for performing the MS analyses. P. de Philip, C. Tardif, Y. Denis and H. Celik are thanked for helpful discussions.