Characterization of a dockerin-based affinity tag: application for purification of a broad variety of target proteins


  • This article is published in Journal of Molecular Recognition as a special issue on Affinity 2009, edited by Gideon Fleminger, Tel-Aviv University, Tel-Aviv, Israel and George Ehrlich, Hoffmann-La Roche, Nutley, NJ, USA.


Cellulose, a major component of plant matter, is degraded by a cell surface multiprotein complex called the cellulosome produced by several anaerobic bacteria. This complex coordinates the assembly of different glycoside hydrolases, via a high-affinity Ca2+-dependent interaction between the enzyme-borne dockerin and the scaffoldin-borne cohesin modules. In this study, we characterized a new protein affinity tag, ΔDoc, a truncated version (48 residues) of the Clostridium thermocellum Cel48S dockerin. The truncated dockerin tag has a binding affinity (KA) of 7.7 × 10⁸ M⁻¹, calculated by a competitive enzyme-linked assay system. In order to examine whether the tag can be used for general application in affinity chromatography, it was fused to a range of target proteins, including Aequorea victoria green fluorescent protein (GFP), C. thermocellum β-glucosidase, Escherichia coli thioesterase/protease I (TEP1), and the antibody-binding ZZ-domain from Staphylococcus aureus protein A. The results of this study significantly extend initial studies performed using the Geobacillus stearothermophilus xylanase T-6 as a model system. In addition, the enzymatic activity of a C. thermocellum β-glucosidase, purified using this approach, was tested and found to be similar to that of a β-glucosidase preparation (without the ΔDoc tag) purified using the standard His-tag. The truncated dockerin derivative functioned as an effective affinity tag through specific interaction with a cognate cohesin, and highly purified target proteins were obtained in a single step directly from crude cell extracts. The relatively inexpensive beaded cellulose-based affinity column was reusable and maintained high capacity after each cycle. This study demonstrates that deletion into the first Ca2+-binding loop of the dockerin module results in an efficient and robust affinity tag that can be generally applied for protein purification. Copyright © 2010 John Wiley & Sons, Ltd.


Cellulose is a major component of plant matter comprising up to 50% of wood biomass (Bayer and Lamed, 1992). The structural complexity of the plant cell wall makes the degradation of cellulose a challenging task; nevertheless, many different types of cellulose-degrading microorganisms have evolved in a variety of biological niches. Bacterial cellulase systems may include free enzymes (catalytic module only or with an appended carbohydrate-binding module, CBM), cell-bound enzymes (Zhao et al., 2006), and supramolecular complexes termed cellulosomes.

The cellulosome, first discovered in the anaerobic thermophilic bacterium Clostridium thermocellum (Bayer et al., 1983; Lamed et al., 1983a,b), is an extracellular macromolecular complex (Mr > 2 MDa) that mediates bacterial cell adhesion to and the efficient degradation of plant cell wall cellulose and its associated polysaccharides. The cellulosome is composed of a core non-catalytic subunit, the scaffoldin, which usually contains a single CBM (Fierobe et al., 2002) and a number of cohesin modules. The cohesins are responsible for the integration of the different dockerin-containing proteins through very powerful calcium-dependent cohesin–dockerin interactions (KA ∼ 10¹¹ M⁻¹) (Bayer et al., 1994; Mechaly et al., 2001). The cohesins interact in a species-specific manner (Pagès et al., 1997); they bind strongly to any of the enzyme-borne dockerin modules of the same species, yet generally fail to bind to those of other species.

In an early review, we proposed the potential application of the cohesin–dockerin pair for use in a variety of affinity-based processes, including affinity chromatography (Bayer et al., 1994). The small size of the dockerin (∼8–10 kDa) makes it suitable to serve as a fused affinity tag, with minimal modification to the protein of interest. Its high-affinity interaction with the cohesin module (Fierobe et al., 1999) renders it highly specific, a property of great importance in any affinity process. Nevertheless, because of the very high affinity of the cohesin–dockerin interaction, the cellulosome proved extremely difficult to dissociate (Lamed et al., 1983a; Lamed and Bayer, 1988); dissociation could be achieved only by relatively harsh treatments combining detergent, heat, and chelating agents (Bhat and Wood, 1992; Mori, 1992), conditions inappropriate for most purification procedures. Later, dissociation of the C. thermocellum cellulosome under milder conditions (60°C and EDTA) was demonstrated (Morag et al., 1996), indicating the pivotal role of Ca2+ in maintaining the structural integrity of the complex.

In a recent work, Karpol et al. (2008) developed an efficient affinity-purification system based on the cohesin–dockerin interaction combined with the binding of a CBM to cellulose matrices. The work employed a dockerin module that was truncated in its first calcium-binding loop and most of the first α-helix. The utility of the truncated dockerin (ΔDoc) was demonstrated in the purification of a model target protein: the enzyme xylanase T-6 from Geobacillus stearothermophilus. The affinity purification system consisted of a recombinant C. thermocellum scaffoldin fragment that included the CBM and the adjacent cohesin, such that the cohesin bound to the ΔDoc and its host protein, and the CBM bound the interacting proteins to the cellulose column. The truncated dockerin retained high levels of affinity for its complementary cohesin, similar to that of the wild-type dockerin, yet enabled almost complete dissociation of the dockerin from the CBM-Coh affinity column under mild elution conditions.

However, xylanase T-6 has a thermophilic origin and its relative stability could have contributed to the observed high performance of the tag by stabilizing the interaction between the dockerin and cohesin. Furthermore, the activity of the enzyme following the purification process was not examined in that study. Therefore in the present work we were interested to further characterize and examine the more general potential of the ΔDoc to act as an affinity tag for purification of proteins from different origins and with different activities. For this purpose, we have designed, purified and investigated diverse candidate proteins fused to the truncated dockerin. As our main model we chose the well-studied jellyfish Aequorea victoria green fluorescent protein (GFP) (Zimmer, 2002) and produced various configurations of the ΔDoc-tagged GFP. We also examined the use of the affinity tag for purification of enzymatically active proteins, i.e., C. thermocellum β-glucosidase (Kadam and Demain, 1989; Grabnitz et al., 1991) and E. coli TEP1 (Lee et al., 1997), in order to examine possible effects of the purification on enzyme activity. In addition we chose to purify two copies of the Staphylococcus aureus Fc-binding B-domain of protein A (ZZ-domain) (Nilsson et al., 1987), in order to examine the possibility of producing a reusable column for antibody purification.



All cloning was based on previously described vectors (Karpol et al., 2008, 2009). These vectors contained a G. stearothermophilus xylanase T-6 (Xyn) fused to a C. thermocellum Cel48S dockerin in its wild-type (wtDoc) or truncated form (ΔDoc) (at the C- or N-terminus). Our strategy was to replace the xylanase with the desired target genes (i.e., GFP, β-glucosidase, TEP1, and ZZ-domain), thus creating fusion proteins with wtDoc or ΔDoc. A His-tag was located at the end opposite the dockerin and served as an additional purification tool. Because of the variety of the cloned genes, we used several different cloning techniques. Using the common technique of vector digestion with EcoRI and XhoI restriction enzymes and ligation of the respectively digested PCR products, we cloned TEP1 and the ZZ-domain. When a target gene contained these restriction sites in its sequence, we resorted to alternative, more robust cloning strategies. For this purpose, we employed either the Clontech In-Fusion™ PCR cloning system (following the manufacturer's instructions) for the cloning of the β-glucosidase or the restriction-free cloning system (van den Ent and Lowe, 2006) for the cloning of the GFP constructs. Briefly, the gene of interest was amplified using a standard PCR procedure with a 5′ overhang that corresponds to the vector sequence, which then served as a megaprimer in a linear amplification reaction around the circular plasmid vector. Primers used for cloning of the different constructs are listed in Table 1. Templates for the GFP and ZZ-domain genes were kindly provided by Dr. Ely Morag (Designer Energy, Rehovot, Israel). The template for the TEP1 gene was a kind gift of Prof. Dan Tawfik (Weizmann Institute, Rehovot, Israel), and the β-glucosidase gene was amplified from the genomic DNA of C. thermocellum.
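The choice between restriction/ligation cloning and the alternative strategies depends on whether the insert itself carries an EcoRI or XhoI site. The sketch below illustrates that check; the toy sequences and the function names are hypothetical and are not the genes or tools used in this work. Both recognition sequences (GAATTC and CTCGAG) are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG"}


def internal_sites(insert_seq):
    """Return {enzyme: [0-based positions]} for sites found inside the insert."""
    seq = insert_seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


def choose_strategy(insert_seq):
    """Restriction/ligation is only safe when the insert has no internal site."""
    if internal_sites(insert_seq):
        return "alternative (In-Fusion or restriction-free)"
    return "restriction-ligation"


# Hypothetical toy inserts, for illustration only.
clean_insert = "ATGGCTAGCAAAGGAGAAGAACTTTTCACT"
cut_insert = "ATGGCTGAATTCAAAGGACTCGAGTAA"

print(choose_strategy(clean_insert))
print(internal_sites(cut_insert))
```

An insert that returns an empty dictionary can be digested and ligated directly; any hit means the digest would cut inside the gene.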

Table 1. List of primers used in this work
Construct    Primer name    Sequence

The gene encoding the protein construct CBM-Coh, consisting of a cellulose-binding module (CBM) and a cohesin (Coh) from the C. thermocellum CipA, was cloned as described previously (Yaron et al., 1996).

Protein expression

Protein expression was accomplished as described previously (Karpol et al., 2008) with minor modifications. Briefly, the host cells were grown until the culture reached OD600 > 0.6. IPTG (isopropyl-β-D-thiogalactopyranoside) was then added to a final concentration of 0.1–1 mM to induce protein expression. Culture growth was continued for another 3 h at either 37 or 30°C, or overnight at 16°C, according to predetermined optimization experiments. The different constructs used in this study are described in Table 2.

Table 2. Fusion proteins used in the affinity purification studies
Fusion protein    Molecular weight (Da)    Parent protein    Affinity tag    Location
wtDoc-GFP         35 550                   GFP               wtDoc           N-terminus
ΔDoc-GFP          33 880                   GFP               ΔDoc            N-terminus
GFP-wtDoc*        35 270                   GFP               wtDoc           C-terminus
GFP-ΔDoc*         33 600                   GFP               ΔDoc            C-terminus
ΔDoc-BglA         58 472                   β-glucosidase     ΔDoc            N-terminus
ΔDoc-ZZ-domain    22 500                   ZZ-domain         ΔDoc            N-terminus
ΔDoc-TEP1         27 670                   TEP1              ΔDoc            N-terminus
CBM-Coh           36 645                   CBM-Coh           CBM             N-terminus
wtDoc-Xyn         53 135                   Xyn               wtDoc           N-terminus
ΔDoc-Xyn          50 785                   Xyn               ΔDoc            N-terminus

wtDoc, C. thermocellum Cel48S dockerin; ΔDoc, truncated C. thermocellum Cel48S dockerin.
* Indicated proteins included a His tag located at the N-terminus of GFP. All other constructs had a C-terminal His tag.

Purification of His-tagged proteins

The Ni-NTA affinity chromatography was performed essentially as described by Karpol et al. (2008) with minor modifications. The column was washed with five column volumes of wash buffer (Tris-buffered saline – 25 mM Tris–HCl, 137 mM NaCl, 2.7 mM KCl, pH 7.4), supplemented with 1 mM CaCl2 (TBS–CaCl2) and 5 mM imidazole. Eluted fractions (1–2 ml) were collected and analyzed on SDS–PAGE (10–15%) and visualized by Coomassie brilliant blue (CBB) staining. Fractions containing relatively pure protein were pooled and dialyzed overnight at 4°C against TBS–CaCl2.

Purification of CBM-Coh

The CBM-Coh fusion protein was purified as described previously by Karpol et al. (2008).

Cohesin–dockerin-based affinity chromatography

Cohesin–dockerin-based affinity chromatography was performed essentially as described by Karpol et al. (2008) with minor modifications. The column was loaded with 54 µM of purified CBM-Coh, and then flushed with 30 ml of TBS–CaCl2. A 5- to 20-ml cell extract of E. coli BL21 (λDE3) expressing the dockerin-tagged protein was applied to the column and washed with 30 ml of TBS–CaCl2.

ELISA binding/affinity assay

The ELISA-based Coh-Doc binding assay was performed essentially according to Barak et al. (2005) with minor modifications. The ELISA plates were coated with 5–30 nM of CBM-Coh. After discarding the blocking buffer, the plates were washed three times with washing buffer (blocking buffer without BSA), >300 µl/well per wash. Subsequently, Doc-Xyn proteins were diluted to concentrations between 1 pM and 0.2 µM in TBS buffer, supplemented with either 1 mM CaCl2 or 10 mM EDTA (according to the experiment), and dispensed in duplicate into the wells. Plates were again incubated for 0.5–1 h, and washed three times before commencing the detection reaction.

The effective concentration at 50% (EC50) was determined from the binding curves of the ΔDoc–Xyn fusion proteins, compared with that of the wtDoc–Xyn (Reichmann et al., 2007a,b) by calculating a nonlinear fit for the ELISA curves using the general equation for a dose–response curve (Equation 1) of the GraphPad Prism 4 program (GraphPad Software, Inc., La Jolla, CA):

$$Y = \mathrm{Bottom} + \frac{\mathrm{Top} - \mathrm{Bottom}}{1 + 10^{(\log \mathrm{EC}_{50} - X)}} \qquad (1)$$

“Bottom” is defined as the minimal Y value at the bottom plateau of the curve (minimal binding); “Top” is the maximal Y value at the top plateau (binding saturation), and log EC50 is the logarithm of the effective concentration at 50% (EC50), where the X value is halfway between the Top and Bottom (i.e., 50% of the Doc molecules are bound).
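As a minimal sketch of how these parameters shape the curve, the function below evaluates a sigmoidal dose–response model of the standard form Y = Bottom + (Top − Bottom)/(1 + 10^(log EC50 − X)), with X taken as the base-10 logarithm of the dockerin concentration. That this is the exact form used in the Prism fit is an assumption, and the parameter values are hypothetical, chosen only to span the 1 pM to 0.2 µM dilution range described above.

```python
def dose_response(log_conc, bottom, top, log_ec50):
    """Signal Y at X = log10(concentration) for a sigmoidal dose-response curve."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** (log_ec50 - log_conc))


# Hypothetical parameters (OD units and log10 molar), for illustration only.
bottom, top, log_ec50 = 0.1, 2.1, -9.0

# log10 of concentrations from 1 pM (1e-12 M) up to about 0.2 µM.
log_concs = [-12.0, -11.0, -10.0, -9.0, -8.0, -7.0, -6.7]
curve = [dose_response(x, bottom, top, log_ec50) for x in log_concs]

# At X = log EC50 the signal lies halfway between Bottom and Top.
midpoint = dose_response(log_ec50, bottom, top, log_ec50)
print(midpoint)
```

A nonlinear fit adjusts Bottom, Top, and log EC50 until this function matches the measured duplicates; only the evaluation step is shown here.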

The changes in the Gibbs free energy (ΔΔG) between the wild-type and the truncated dockerin and their respective interactions with test cohesin were calculated using Equation (2)

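$$\Delta\Delta G = \Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{WT}} = RT \ln\frac{\mathrm{EC}_{50}^{\mathrm{mut}}}{\mathrm{EC}_{50}^{\mathrm{WT}}} \qquad (2)$$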

where R is the gas constant and T is the absolute temperature (K).

In order to calculate the different KD values (e.g., for ΔDoc), we first calculated ΔGWT by solving Equation (3) for KD = 1.7 × 10⁻¹⁰ M (Handelsman et al., 2004) and T = 298 K. Next, we calculated ΔGmut by solving Equation (2) with the previously calculated values of ΔΔG and ΔGWT. Finally, we calculated the different KD values by solving Equation (3) once again with the calculated ΔGmut value (and T = 298 K). The binding affinity constant (KA) is defined as the reciprocal of KD.

$$\Delta G = RT \ln K_D \qquad (3)$$
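The three-step calculation described above can be sketched as follows, assuming the standard relation ΔG = RT ln KD and ΔΔG = ΔGmut − ΔGWT. The wild-type KD and the temperature are the values given in the text; the ΔΔG of 1.2 kcal mol⁻¹ is the value reported for ΔDoc-GFP in the competitive assay. The variable and function names are illustrative.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal mol^-1 K^-1
T_KELVIN = 298.0    # absolute temperature used in the text


def delta_g_from_kd(kd, temperature=T_KELVIN):
    """Binding free energy (kcal/mol) from a dissociation constant in M."""
    return R_KCAL * temperature * math.log(kd)


def kd_from_delta_g(delta_g, temperature=T_KELVIN):
    """Dissociation constant (M) from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * temperature))


kd_wt = 1.7e-10                      # M, wild-type value cited in the text
ddg = 1.2                            # kcal/mol, reported for ΔDoc-GFP

dg_wt = delta_g_from_kd(kd_wt)       # step 1: ΔG of the wild-type pair
dg_mut = dg_wt + ddg                 # step 2: ΔG of the truncated dockerin
kd_mut = kd_from_delta_g(dg_mut)     # step 3: back to a KD
ka_mut = 1.0 / kd_mut                # KA is the reciprocal of KD

print(f"dG_WT = {dg_wt:.2f} kcal/mol, KA(mut) = {ka_mut:.2e} M^-1")
```

With these inputs the result lands close to the KA of 7.7 × 10⁸ M⁻¹ quoted in the abstract; small differences come from rounding of ΔΔG.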

Competitive ELISA

MaxiSorp 96-well ELISA plates were initially coated with CBM-Coh (50 nM), blocked and washed as described (Handelsman et al., 2004; Barak et al., 2005). Next, different concentrations (1 pM to 0.2 µM) of the dockerin-fused proteins (ΔDoc-GFP, wtDoc-GFP) were mixed with a constant concentration of the wtDoc-Xyn (100 pM). This mixture was then added in duplicate into the wells for interaction. Subsequently, washing and detection steps were conducted as mentioned above. The wtDoc-Xyn interaction with the cohesin was challenged by increasing concentrations of the dockerin-fused test protein, which resulted in the reduction of recognition by the primary antibody and consequently in the signal produced by the secondary antibody.

To determine the inhibition concentration (IC50) of the ΔDoc/wtDoc, a nonlinear fit for the ELISA curves was calculated using the one-site competitive binding equation (Equation 4) of the GraphPad Prism 4 program (GraphPad Software, Inc.). Log IC50 is the logarithm of the IC50 (the concentration at which 50% of the binding sites are occupied by the competitor).

$$Y = \mathrm{Bottom} + \frac{\mathrm{Top} - \mathrm{Bottom}}{1 + 10^{(X - \log \mathrm{IC}_{50})}} \qquad (4)$$

The changes in the Gibbs free energy (ΔΔG°) between the wild-type and the truncated dockerin and their respective interactions with test cohesin were calculated using Equation (5)

$$\Delta\Delta G^{\circ} = RT \ln\frac{\mathrm{IC}_{50}^{\mathrm{mut}}}{\mathrm{IC}_{50}^{\mathrm{WT}}} \qquad (5)$$
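The sketch below illustrates the competitive analysis under two assumptions that are not spelled out in the text: the competition curve has the standard one-site form Y = Bottom + (Top − Bottom)/(1 + 10^(X − log IC50)), and ΔΔG° is obtained from the IC50 ratio as RT ln(IC50 mut / IC50 WT). The IC50 values are hypothetical and were chosen only so that the ratio gives a ΔΔG° near the reported 1.2 kcal mol⁻¹.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def competition(log_conc, bottom, top, log_ic50):
    """Signal from bound wtDoc-Xyn at X = log10(competitor concentration)."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** (log_conc - log_ic50))


def ddg_from_ic50(ic50_mut, ic50_wt, temperature=298.0):
    """Free-energy difference (kcal/mol) implied by two IC50 values."""
    return R_KCAL * temperature * math.log(ic50_mut / ic50_wt)


# Hypothetical IC50 values in M, for illustration only.
ic50_wt = 1.0e-9
ic50_mut = 7.6e-9

ddg = ddg_from_ic50(ic50_mut, ic50_wt)
half_signal = competition(math.log10(ic50_wt), 0.1, 2.1, math.log10(ic50_wt))
print(f"ddG = {ddg:.2f} kcal/mol, signal at IC50 = {half_signal:.2f}")
```

In contrast to the direct binding curve, the signal here falls as the competitor concentration rises, which is why the exponent carries the opposite sign.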

Enzymatic assays

For the thioesterase/protease I (TEP1) activity assay, esterase activity was tested using 4-nitrophenyl acetate (pNPA, Fluka™). pNPA was dissolved in dimethyl sulfoxide (DMSO) to a final concentration of 0.5 M, which served as the main stock solution. The assay mixture (200 µl) contained 0.5 µM enzyme and between 0.0625 and 2.25 mM pNPA in TBS pH 7. The reaction was carried out at 25°C and initiated by the addition of the enzyme. Initial rates were monitored by measuring the formation of p-nitrophenol at 405 nm using a Power HT microtiter scanning spectrophotometer (Bio-Tek Instruments, Inc., Winooski, VT) (ε at 25°C = 2100 M⁻¹ cm⁻¹).

β-Glucosidase activity was measured using 4-nitrophenyl-β-D-glucopyranoside (pNPG, Sigma Chem. Co., St. Louis, MO) as substrate. The assay reaction (200 µl) contained 0.1 µM enzyme and 5–25 mM pNPG in 50 mM citrate buffer (pH 6). The reaction was carried out at 50°C, the approximate temperature of the thermophilic enzyme's environment, and was initiated by the addition of the enzyme after a short incubation period at this temperature. Initial rates were monitored by measuring the formation of p-nitrophenol at 405 nm using a Power HT microtiter scanning spectrophotometer (Bio-Tek Instruments, Inc.) (ε at 50°C = 5300 M⁻¹ cm⁻¹).
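Converting an absorbance slope into an initial rate follows the Beer–Lambert law, rate = (ΔA/Δt)/(ε·l). The sketch below uses the extinction coefficient given for the β-glucosidase assay; the optical path length is not stated in the text, so the 1 cm value is an assumption that would need to be replaced by the effective path length of a 200 µl well, and the absorbance slope is a hypothetical reading.

```python
def initial_rate_uM_per_min(delta_a_per_min, epsilon_m_cm, path_cm):
    """Product formation rate in µM/min from an absorbance slope at 405 nm."""
    rate_molar_per_min = delta_a_per_min / (epsilon_m_cm * path_cm)
    return rate_molar_per_min * 1e6


EPSILON_50C = 5300.0   # M^-1 cm^-1, p-nitrophenol at 50°C (from the text)
PATH_CM = 1.0          # assumed path length; not given in the text

# Hypothetical slope of 0.053 absorbance units per minute.
rate = initial_rate_uM_per_min(0.053, EPSILON_50C, PATH_CM)
print(f"{rate:.1f} µM p-nitrophenol per min")
```

The same function applies to the TEP1 assay with ε = 2100 M⁻¹ cm⁻¹ at 25°C.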

Calculations of Vmax and Km were performed using the Michaelis–Menten equation (Equation 6) of the GraphPad Prism 4 program

$$Y = \frac{V_{\max}\, X}{K_m + X} \qquad (6)$$

where X is the substrate concentration, Y is the initial reaction rate, and Vmax is the maximum reaction rate. Km is the Michaelis–Menten constant; the substrate concentration needed to achieve half-maximum reaction rate (in the same units as X).
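To illustrate how Vmax and Km follow from rate data, the sketch below swaps the nonlinear Prism fit for a Hanes–Woolf linearization (X/Y = X/Vmax + Km/Vmax), which needs only a least-squares line. This is a different technique from the one used in the study and is chosen here because it runs with the standard library alone. The substrate concentrations match the 5–25 mM pNPG range of the assay; the kinetic constants used to generate the data are hypothetical.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate Y at substrate concentration X = s."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, Km) from the straight line s/v = s/Vmax + Km/Vmax."""
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrate, ys))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Substrate range from the assay description; constants are hypothetical.
substrate_mM = [5.0, 10.0, 15.0, 20.0, 25.0]
true_vmax, true_km = 12.0, 8.0
rates = [michaelis_menten(s, true_vmax, true_km) for s in substrate_mM]

vmax_est, km_est = hanes_woolf_fit(substrate_mM, rates)
print(f"Vmax = {vmax_est:.2f}, Km = {km_est:.2f} mM")
```

With noisy measurements a direct nonlinear fit of Equation (6) is generally preferred, because linearizations distort the error structure.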

Sequence analysis and structure prediction

Computer analyses of DNA, primer design, multiple sequence analysis, virtual cloning, and protein parameters were performed and calculated using the Vector NTI program suite (Invitrogen, Carlsbad, CA). Sequence similarity searches against external databases were conducted using the BLAST program of NCBI. Multiple sequence alignment was done using the ClustalW program.

Protein structural predictions were performed using the SWISS-MODEL Workspace (Arnold et al., 2006), structural alignments using the DaliLite server (Holm and Park, 2000), and protein imaging using PyMOL (DeLano Scientific LLC, San Carlos, CA, USA) (DeLano, 2002).


Characterization of binding affinity

The binding affinity of both the wtDoc- and ΔDoc-tagged Xyn toward type-I cohesin was evaluated in the presence of Ca2+ and in the presence of EDTA, using an ELISA-based technique (Figure 1). The wtDoc-Xyn presented the strongest binding in the presence of Ca2+, while in the presence of EDTA its binding was compromised (ΔΔG = 2.01 kcal mol⁻¹), but still strong enough (KA ∼ 10⁹ M⁻¹) that the protein would be retained on an affinity column. In the presence of Ca2+, ΔDoc-Xyn interacted similarly to the wtDoc-Xyn supplemented with EDTA (ΔΔG = 2.4 kcal mol⁻¹, between wild-type and truncated form supplemented with Ca2+), while in the presence of EDTA the ΔDoc-Xyn failed to present any significant binding. These results demonstrate that ΔDoc, although lacking its first Ca2+-binding loop and most of the first α-helix (Karpol et al., 2008), retained relatively high binding capacity, which is fully reversed in the presence of EDTA. These properties are highly suitable for use in affinity purification schemes.

Figure 1.

Calcium-dependent cohesin-binding properties of the truncated versus wild-type dockerin. Cohesin–dockerin interactions were measured using the affinity-based ELISA assay. Wells of microtiter plates were coated with CBM-Coh and interacted with either wtDoc-Xyn or ΔDoc-Xyn in the presence of 1 mM CaCl2 or 10 mM EDTA.

We next evaluated the binding affinity of the ΔDoc-tagged GFP (ΔDoc-GFP), relative to the wild-type analog (wtDoc-GFP), by a competitive ELISA (Handelsman et al., 2004). For this purpose we coated the ELISA plate with CBM-Coh and mixed increasing amounts of either wtDoc- or ΔDoc-GFP fusion proteins together with a constant amount of wtDoc-Xyn. After an appropriate incubation time, the plate was washed and examined with anti-Xyn antibodies (Figure 2). The ΔDoc had an affinity similar to that of the wild-type module (ΔΔG = 1.2 kcal mol⁻¹).

Figure 2.

Determination of relative binding affinity by competitive enzyme-linked immunoassay (cELIA). Microtiter plates were coated with CBM-Coh and interacted with either wtDoc-GFP or ΔDoc-GFP, in the presence of competitor wtDoc-Xyn. The reaction was performed in the presence of 1 mM CaCl2. Measured OD values reflect the relative amount of wtDoc-Xyn bound to the cohesin immobilized to the plates.

Prediction of a structural model for ΔDoc

In order to assess the molecular elements involved in the binding of the dockerin to cohesin and account for the observed species specificity, three-dimensional crystal structures of the native cohesin–dockerin complex were determined earlier. Initially, the structure of a complex between the C. thermocellum cohesin interacting with its cognate dockerin module through its second duplicated repeat was reported (Carvalho et al., 2003). In this latter complex, a dockerin from a different cellulosomal enzyme component (the type-I dockerin, DocZ, from xylanase 10B) was used. Alignment of DocS and DocZ revealed substantial similarity between the two modules (46% identity), rendering them appropriate for homology modeling of the truncated dockerin. Models of the predicted DocS and ΔDocS structures were thus constructed using the SWISS-MODEL Workspace structure homology-modeling server (Arnold et al., 2006) by providing the alignment and the PDB accession number of the original cohesin–dockerin complex (1OHZ) (Carvalho et al., 2003). Despite the high sequence similarity, some differences exist between the sequences (Figure 3). Notably, the sequence of DocZ includes two additional amino acid residues (Ala-33 and Arg-34) in the interior linker sequence that separates the two repeated dockerin segments. These extra residues most likely contribute to the formation of the third helix in the DocZ structure, which is absent from the DocS model. Furthermore, DocS has Gly and Lys residues instead of the Leu-49 and Ser-52, respectively, of the DocZ sequence.

Figure 3.

Multiple sequence alignment of C. thermocellum DocZ, DocS, and ΔDocS. Structural sequence alignment of the C. thermocellum DocZ and DocS was performed using the Dali server; ΔDocS was added manually. The residues involved in Ca2+ coordination are highlighted in gray. Black-highlighted residues represent those involved in direct hydrogen bonding to cohesin when the complexed dockerin is bound mainly through its second duplicated repeat. Hydrophobic residues involved in the interaction with cohesin are shown in open boxes. Identical residues (ident) are indicated by vertical lines; secondary structural elements, helix (H) and loop (L), were assigned by the DSSP algorithm as implemented by the Dali server. h1, h2, and h3 indicate the positions of the three helices in the DocZ sequence.

In the proposed DocS and ΔDocS models, most of the amino acids interacting through hydrogen bonds with the cohesin in the wild-type are found in the truncated form (Figure 4). However, in the DocZ molecule from the complex structure a Ser-52 interacts through two water molecules with Asp-87 in the cohesin. Instead, DocS has a Lys in that position (Lys-50), which is predicted to form direct hydrogen bonds with Asp-87 of the cohesin (Figure 4B). Interestingly, of the 75 type-I dockerin sequences in the C. thermocellum genome, 62 (∼82%) show a lysine at position 52; i.e., the serine at that position in DocZ is an outlier.

Figure 4.

Three-dimensional structural models of DocS and ΔDocS and their predicted interactions with the cohesin module. (A) Cartoon representation of the wild-type DocS and predicted ΔDocS models. (B) The predicted 3D structures of the type-I cohesin–dockerin complex interface. ΔDocS, the truncated dockerin module is depicted as a ribbon model; DocS, the wild-type dockerin model is depicted as a cartoon model; cohesin, the cohesin module. Residues forming hydrogen bonds are shown as sticks. Direct hydrogen bonds are marked as dashed lines. Calcium ions are depicted as spheres.

According to the model, the hydrophobic interactions of the cohesin–dockerin complex would not appear to be compromised in the truncated dockerin, with the exception of Leu-49 in DocZ which is substituted with a Gly residue in DocS. Nevertheless, no significant loss of the interacting residues is observed, thus supporting the ELISA results, which show similar binding affinities of both the wild-type and the truncated dockerin.

Cohesin–dockerin affinity chromatography

A schematic representation of the general approach based on the high-affinity cohesin–dockerin interaction can be found in a previous report (see Figure 1, Karpol et al., 2009). CBM-Coh is first bound to the beaded cellulose resin, followed by a sample containing the desired ΔDoc-bearing target protein in the presence of Ca2+. The latter protein is eluted subsequently using EDTA.

Purification of ΔDoc-tagged GFP fusion proteins

Fusion proteins comprising either wild-type or truncated dockerin at either the N- or the C-terminus of the GFP were designed (Table 1). In one of these proteins, a ΔDoc tag was fused to the N-terminus of the GFP, and overexpressed in E. coli. The bacterial cell lysate was applied onto beaded cellulose resin previously incubated with CBM-Coh. The cellulose column was then exposed to extensive washing with buffer, followed by an elution step using an EDTA gradient (Figure 5). The different stages (protein loading, column washing and protein elution) were monitored throughout the procedure by following protein absorbance at 280 nm (Figure 5A). The eluted fractions were analyzed subsequently by SDS–PAGE (Figure 5B). For repeated applications, the column was extensively washed with TBS–CaCl2 buffer in order to remove residual EDTA. This step allowed reuse of the column by repeatedly applying unbound protein fractions (Figure 5A).

Figure 5.

Affinity purification of ΔDoc-GFP on a CBM-Coh affinity column. (A) Repeated application and elution of the target protein. Samples of the E. coli crude lysate (10 ml), containing the expressed ΔDoc-GFP, were applied (Ap) onto the column (2 ml of beaded cellulose, 2 mg CBM-Coh) and eluted (El) using an EDTA gradient. Following the first elution (El 1), two consecutive applications of the unbound protein (green fractions, ∼5–10 ml) were further applied and eluted (El 2 and El 3). (B) Consecutive elution fractions were analyzed by SDS–PAGE in order to evaluate protein purity (∼40–50 nmol purified protein per elution). (C) Boiled column beads after final wash. Gel visualization was performed using Coomassie brilliant blue (CBB) staining.

It can be seen already in the first elution that the protein band corresponding to the calculated size of the ΔDoc-GFP appears to be homogeneous and enriched after only one purification step (Figure 5B). Since the column is based on the direct binding between the cohesin and the ΔDoc, the maximum amount of protein we could obtain corresponds to the amount of pre-incubated CBM-Coh. With each consecutive elution, the capacity of the column was maintained. The single band in the third elution attests to the robustness of the system (Figure 5B). Eluted fractions were collected, and protein concentration was measured spectroscopically, revealing ∼90% recovery of the theoretical maximum capacity in each case.
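The capacity argument can be checked with simple arithmetic. The sketch below takes the 2 mg of CBM-Coh loaded on the column (Figure 5 legend), the molecular weights listed in Table 2, and the ∼90% recovery quoted above, and assumes that each immobilized cohesin binds one ΔDoc-GFP molecule. The variable names are illustrative.

```python
cbm_coh_mass_mg = 2.0      # CBM-Coh loaded on the column (Figure 5 legend)
cbm_coh_mw = 36645.0       # Da, Table 2
ddoc_gfp_mw = 33880.0      # Da, Table 2
recovery = 0.90            # approximate recovery reported in the text

# Moles of immobilized cohesin set the theoretical maximum (1:1 binding assumed).
capacity_nmol = cbm_coh_mass_mg * 1e-3 / cbm_coh_mw * 1e9
recovered_nmol = capacity_nmol * recovery
recovered_mg = recovered_nmol * 1e-9 * ddoc_gfp_mw * 1e3

print(f"capacity  = {capacity_nmol:.1f} nmol")
print(f"recovered = {recovered_nmol:.1f} nmol ({recovered_mg:.2f} mg ΔDoc-GFP)")
```

The theoretical maximum is roughly 55 nmol, so a 90% recovery of about 49 nmol agrees with the ∼40–50 nmol per elution noted in the Figure 5 legend.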

In order to examine the effectiveness of the elution step, we removed the cellulose beads after the final application and elution of the test protein, and subjected them to sample buffer (100°C, 5 min) prior to SDS–PAGE. A single band was observed, corresponding to the molecular weight of the CBM-Coh, thus indicating that the ΔDoc-GFP was completely eluted following application of EDTA (Figure 5C).

In order to compare the new affinity tag procedure to the commonly used immobilized metal-ion affinity chromatography (IMAC), we purified the ΔDoc-GFP using its His tag rather than the ΔDoc affinity tag (Figure 6). Although high amounts of protein were obtained using IMAC, an additional band was observed adjacent to the eluted protein. This band was absent from all elutions obtained with the ΔDoc system, demonstrating the advantage and high specificity of the latter in protein purification. In order to examine whether this contaminating band may represent cleavage between the ΔDoc and the GFP, we examined its N-terminal sequence. While the N-terminal sequence of the upper band matched the beginning of the ΔDoc, that of the lower band could not be identified (data not shown), suggesting that it is an unrelated contaminating host protein, presumably with exposed histidine residues that were captured on the Ni beads.

Figure 6.

SDS–PAGE analysis of ΔDoc-GFP fusion protein purified on Ni-NTA. Gel visualization was performed using CBB staining. Note the minor band (apparent contaminating protein), which migrates slightly faster than the position of the major band, representing ΔDoc-GFP.

In contrast to the ΔDoc tag, the use of a wild-type dockerin at the N-terminus of GFP (wtDoc-GFP) resulted in binding to the CBM-Coh that was so tight that it was impossible to elute the target protein even with concentrations of EDTA as high as 500 mM (Figure 7A). It can be seen that the wtDoc-GFP remained bound to the column, and was released together with the CBM-Coh only after boiling the column beads in the presence of SDS. Only negligible amounts of wtDoc-GFP could be seen in the EDTA-eluted fraction. CBM-Coh has a higher calculated molecular weight (∼36.6 kDa) than that of wtDoc-GFP (∼35.6 kDa), both of which are evident in the fraction released from the column.

Figure 7.

SDS–PAGE analysis of wtDoc-GFP fusion protein purified on a CBM-Coh (A) or Ni-NTA (B) column. (A) Elution fractions – eluted proteins after applying increasing amounts (up to 500 mM) of EDTA. Beads – beaded cellulose. After completing the elution step, the column beads were boiled for 5 min in SDS sample buffer, in order to release attached proteins. (B) Eluted protein from the Ni-NTA column. The lanes of the SDS–PAGE gel show the protein content of successive fractions after applying increasing amounts of imidazole. Visualization was done using CBB staining.

In order to rule out the possibility that the wtDoc-GFP did not express well in the bacteria, we employed a C-terminal His tag that allowed us to utilize an alternative affinity system (Ni-NTA) for purification of the protein (Figure 7B). Using this purification approach we obtained large amounts of protein from the bacterial cell lysate, indicating that the strong binding of the intact dockerin, rather than difficulties of expression, accounted for the poor protein elution observed using the CBM-Coh column.

Positioning of the ΔDoc tag at the C-terminus

A good affinity tag should exhibit similar qualities (specific attachment and efficient elution) when fused either to the N- or the C-terminus of the protein of interest in order to extend the purification options. Therefore, two additional versions of the GFP-dockerin fusion proteins were cloned, expressed and tested. In both proteins, the ΔDoc or the wtDoc was positioned at the C-terminus of GFP, and a His tag was positioned at the N-terminus. Under similar conditions as previously described, a relatively pure band corresponding in size to GFP-ΔDoc was purified in two consecutive applications of cell lysate (Figure 8A), thus demonstrating the productivity of the tag at the C-terminus of the protein. In this case, an additional minor band co-purified with the major protein band.

Figure 8.

SDS–PAGE analysis of GFP with C-terminal ΔDoc (A) or wtDoc (B) affinity tag, purified on CBM-Coh directly from bacterial cell lysate. (A) Two consecutive elutions of GFP-ΔDoc. (B) GFP-wtDoc elution. Elution – eluted proteins after applying increasing amounts (up to 500 mM) of EDTA. Beads – beaded cellulose. After completing the elution step the column beads were boiled for 5 min in SDS sample buffer, in order to release attached proteins. Gel visualization was done using CBB staining.

Similar to the wtDoc-GFP, wherein the tag was positioned at the N-terminus of the target protein, the GFP-wtDoc did not elute from the column and could be seen together with CBM-Coh after boiling the column beads in the presence of SDS (Figure 8B).

Purification of ΔDoc-tagged ZZ-domain fusion protein

The ZZ-domain is widely used in both research and industry, due to its ability to bind the heavy chain Fc region of immunoglobulins. Attaching a detachable affinity tag to the ZZ-domain could render it reusable and more applicable, not only for antibody purification but also for other nanotechnological applications. Thus, ΔDoc, fused to the N-terminus of the ZZ-domain, was cloned and expressed (Figure 9A). Due to the repetitive nature of the ZZ-domain, we cloned it with an additional 20 amino acid residues (7 upstream and 13 downstream), which enabled us to clone the two identical domains together. The resulting protein specifically bound antibodies from mouse total blood serum (Figure 9B), as can be deduced from the molecular weights of the eluted proteins (50 and 25 kDa), which correspond to the heavy and light chains, respectively, of the 150 kDa IgG. However, during the elution of the antibodies using low pH (0.1 M glycine·HCl buffer, pH 2.8), part of the ΔDoc-ZZ also detached from the column. In view of these results, further optimization of antibody elution will be required in future studies.

Figure 9.

Purification and antibody-binding activity of the ΔDoc-ZZ-domain. (A) SDS–PAGE analysis of the CBM-Coh-purified ΔDoc-ZZ-domain. (B) ΔDoc-ZZ was immobilized onto beaded cellulose through interaction with CBM-Coh. Subsequently, diluted mouse serum was applied onto the column, and, after extensive washes with TBS–CaCl2, IgG's, indicated by H-Chain and L-Chain (heavy and light chain, respectively), were eluted using glycine·HCl buffer, pH 2.8. In addition to the desired IgG's, ΔDoc-ZZ (indicated by an arrow) was also eluted under these conditions. Gel visualization was performed using CBB staining.

Purification of ΔDoc-tagged β-glucosidase

Many affinity tags have some influence on their host proteins and therefore have to be removed (Arnau et al., 2006). On the other hand, the His tag is considered to have a relatively minor effect on the host protein, and is therefore a preferred affinity tag in this respect. In order to examine the effect of the ΔDoc tag on the function of the protein to which it is fused, we compared the activity of a purified non-cellulosomal C. thermocellum β-glucosidase (BglA) fused either to the ΔDoc tag or to a His tag.

The C. thermocellum β-glucosidase has a crucial role in the degradation of cellulose. It hydrolyzes the major product of cellulase digestion, cellobiose, which is a strong inhibitor of certain enzymes of the cellulase system (Johnson et al., 1982; Morag et al., 1991). The non-inhibitory glucose thus allows the degradation of cellulose to proceed continuously. We chose this enzyme as an interesting model for enzymatic studies.

ΔDoc was fused to the N-terminus of β-glucosidase (Table 1, ΔDoc-BglA), expressed and purified on CBM-Coh bound to beaded cellulose. A single band of ∼58 kDa, corresponding to the calculated size of the fusion protein, was observed in SDS–PAGE gels (data not shown). Two consecutive application/elution steps were performed; each produced a relatively pure single major band (data not shown). The addition of the ΔDoc tag had no significant effect on the activity of the enzyme compared to the activity of the wild-type non-cellulosomal enzyme (lacking the dockerin module, containing a His tag and purified using a Ni-NTA affinity column). ΔDoc-BglA had kinetic parameters similar to those of the His-tagged wild-type BglA, with slightly improved values (Table 3).

Table 3. Kinetic parameters of the wild-type BglA and ΔDoc-BglA
Vmax (M s−1) | Km (mM) | kcat (s−1) | kcat/Km (mM−1 s−1)

Purification of ΔDoc-tagged TEP1

E. coli TEP1 has been documented to execute diverse enzymatic activities, including thioesterase, esterase, arylesterase, protease, and lysophospholipase (Upton and Buckley, 1995; Lee et al., 1997). The physiological role of TEP1 is, however, unclear. The enzyme has been suggested to be potentially useful for the kinetic resolution of racemic mixtures of industrial chemicals (Lee et al., 1997).

From the wide variety of catalyzed reactions, we concentrated on the esterase activity of TEP1, which was examined using 4-nitrophenyl acetate (pNPA). ΔDoc was fused to the N-terminus of TEP1 (ΔDoc-TEP1), expressed and purified on CBM-Coh bound to beaded cellulose. A single band of ∼27 kDa, corresponding to the calculated size of the fusion protein, was observed in SDS–PAGE gels (data not shown). Two consecutive application/elution steps were performed, each showing a similar single protein band (data not shown). The enzyme was active with a Vmax of 6.42 × 10−6 M s−1 on pNPA and a Km of 1.098 mM.
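The kinetic parameters quoted for ΔDoc-TEP1 can be related to one another through the standard Michaelis–Menten equation. The following is a minimal sketch, not the analysis used in this study: it takes the reported Vmax and Km as inputs, and the function names and the enzyme concentration used to illustrate kcat are hypothetical, since the text does not state the enzyme concentration in the assay.

```python
# Michaelis-Menten sketch using the Vmax and Km reported in the text for
# delta-Doc-TEP1 acting on pNPA. Function names are illustrative only.

VMAX_M_PER_S = 6.42e-6  # Vmax from the text, in M s^-1
KM_MM = 1.098           # Km from the text, in mM


def initial_rate(substrate_mM, vmax=VMAX_M_PER_S, km_mM=KM_MM):
    """Initial rate v = Vmax * [S] / (Km + [S]), in the units of vmax."""
    return vmax * substrate_mM / (km_mM + substrate_mM)


def kcat(vmax, enzyme_M):
    """Turnover number kcat = Vmax / [E]total, in s^-1."""
    return vmax / enzyme_M


if __name__ == "__main__":
    # Rate below, at, and above Km; at [S] = Km the rate is Vmax / 2.
    for s_mM in (0.1, KM_MM, 10.0):
        print(f"[S] = {s_mM:6.3f} mM -> v = {initial_rate(s_mM):.3e} M/s")

    # ASSUMPTION: hypothetical total enzyme concentration (not from the text).
    enzyme_M = 1.0e-7
    k = kcat(VMAX_M_PER_S, enzyme_M)
    print(f"kcat = {k:.1f} s^-1, kcat/Km = {k / KM_MM:.1f} mM^-1 s^-1")
```

Because kcat depends on the enzyme concentration, the kcat and kcat/Km values printed here are placeholders that illustrate the calculation and should not be read as measured results.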


Affinity chromatography can be a robust and rapid purification approach that can lead to highly purified protein after only one purification step. The potential of employing the cellulosome-based cohesin–dockerin interaction for use in affinity chromatography was first proposed long ago (Bayer et al., 1994), and this approach has more recently attracted the attention of others (Nordon et al., 2009). Nevertheless, the ultra-high affinity between the dockerin and the cohesin seemed to thwart previous attempts to develop this system for affinity-based purification. Such affinities (KA ∼ 10^10 M−1) rendered it almost impossible to separate the two molecules from each other once bound to the column.
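To put these association constants in more familiar terms, an association constant KA can be converted to a dissociation constant (KD = 1/KA) and to a standard binding free energy (ΔG° = −RT ln KA). The sketch below is illustrative only: it uses the wild-type order of magnitude quoted above and the ΔDoc value of 7.7 × 10^8 M−1 given in the abstract, and it assumes a temperature of 298.15 K and a 1 M standard state, neither of which is specified in the text. The function names are hypothetical.

```python
import math

R_J_PER_MOL_K = 8.314462618  # molar gas constant, J mol^-1 K^-1


def dissociation_constant(ka_per_M):
    """KD (M) as the reciprocal of the association constant KA (M^-1)."""
    return 1.0 / ka_per_M


def binding_free_energy_kJ(ka_per_M, temp_K=298.15):
    """Standard free energy dG = -R*T*ln(KA), in kJ/mol (1 M standard state).

    ASSUMPTION: the default temperature of 298.15 K is illustrative and is
    not taken from the text.
    """
    return -R_J_PER_MOL_K * temp_K * math.log(ka_per_M) / 1000.0


if __name__ == "__main__":
    affinities = {
        "wild-type dockerin (order of magnitude)": 1.0e10,
        "truncated dockerin, from the abstract": 7.7e8,
    }
    for label, ka in affinities.items():
        kd = dissociation_constant(ka)
        dg = binding_free_energy_kJ(ka)
        print(f"{label}: KD = {kd:.2e} M, dG = {dg:.1f} kJ/mol")
```

Under these assumptions both values correspond to dissociation constants in the nanomolar range or below, which is consistent with the description of the truncated tag as retaining very high affinity for its cohesin.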

Nevertheless, in recent work (Craig et al., 2006), the calcium-dependent interaction of the wild-type dockerin from C. thermocellum with its complementary cohesin was demonstrated in a purification process. In that work, a CBM-Coh fusion protein was covalently immobilized to a Sepharose matrix, and used to trap an antibody-binding domain, protein LG, fused to a wild-type dockerin module. The dockerin-bearing antibody-binding protein was then eluted from the column using the Ca2+-chelating agent EDTA. However, in light of the strong cohesin–dockerin interaction, the elution step appeared to be inefficient.

In a recent work Kamezaki et al. (2010) have explored the cohesin–dockerin interaction of a different cellulosome system from the mesophilic bacterium, C. josui, in order to construct an affinity purification process. Similar to the work of Craig et al. (2006) a cohesin was covalently coupled to a Sepharose matrix, and a wild-type dockerin was employed as an affinity tag. Use of the latter dockerin, however, resulted in a dissociation that lasted more than 6 h (Yamaguchi et al., 2004). The authors therefore mutated specific residues of the dockerin, in order to reduce the binding affinity for the cohesin partner.

The major difference in the present work is the application of the very high affinity of the truncated dockerin for its cohesin counterpart from the thermophilic C. thermocellum, which is then subject to complete reversal of the interaction by the chelating agent. In developing an efficient, cost-effective affinity system, we sought to take advantage of the special characteristics inherent in the cellulosome-based system, i.e., potential reversibility of the cohesin–dockerin interaction and the selective binding properties of the CBM to an inexpensive cellulosic resin. The idea was to improve the reversibility of the cohesin–dockerin interaction and to use an inexpensive cellulose polymer as an affinity matrix, rather than the costly Sepharose resin. Theoretically, all components of the column would be subject to individual recovery. The ΔDoc target protein should be removable from the CBM-Coh, and the latter should also be subject to detachment from the cellulose matrix.

In previous work (Karpol et al., 2008), we attempted to better understand which dockerin components are responsible for its interaction with cohesin, and the potential reversibility of that interaction. Many different types of recombinant dockerins were thus examined for their interaction with cohesins. Among the many dockerins examined, a truncated dockerin derivative (ΔDoc) exhibited unique binding properties, which made it suitable for use as an affinity tag. It retained very high affinity toward its cognate cohesin, yet its binding was impaired in the presence of the calcium-chelating agent EDTA. These properties suggested that the ΔDoc would be particularly appropriate for use as an affinity tag in affinity chromatographic applications. Indeed, the utility of this tag was demonstrated in the purification of G. stearothermophilus XynT-6 as a model target protein (Karpol et al., 2009).

In the present work, we have further characterized the binding affinities of ΔDoc and explored the general potential of using the tag for affinity purification. For this purpose we designed a variety of recombinant proteins using the GFP as a general model protein. Utilizing the ΔDoc as an affinity tag led to highly purified GFP after only one step, with about 90% protein recovery from the theoretical maximal capacity of the affinity column. Furthermore, the affinity column could be reused while retaining full binding capacity. The ΔDoc tag was active at both the N- and the C-terminal positions of the target protein, an important feature that extends the purification options when using this affinity tag.

As shown for the Xyn derivatives (Karpol et al., 2009), wtDoc when fused to either the N- or C-terminus of GFP could not be readily detached from the CBM-Coh column. Only when boiled in the presence of SDS could the tagged protein be released from the column together with the CBM-Coh intermediary, thus underscoring the tenacious binding of the wild-type dockerin. These results further demonstrate the necessity for truncation of the dockerin to allow its dissociation from the affinity column.

The ΔDoc tag possesses higher specificity than the commonly used His tag (Belew et al., 1987). The relative homogeneity of the isolated ΔDoc-GFP released from the CBM-Coh affinity column, compared to that isolated by IMAC, indicated the advantage of the ΔDoc tag for purification of target proteins. Moreover, it is smaller than many other protein tags (e.g., maltose-binding protein and cellulose- or chitin-binding modules), and its bacterial origin guards against errant recognition in higher systems, such as that observed using the calmodulin-binding peptide (Terpe, 2003). One of the defining characteristics of the system is the very high affinity of the ΔDoc tag for its cohesin counterpart (similar to that of the wild-type tag) and its near-complete disruption by mild elution using EDTA (as opposed to the wild-type tag).

In the present study, we characterized a truncated version of the C. thermocellum dockerin module derived from the cellulosomal enzyme Cel48S, and demonstrated its ability to serve as an effective general and versatile affinity tag for protein purification. The range of proteins purified using this system in this work confirms and broadens the initial studies using a xylanase as a model target protein (Karpol et al., 2009). The results of this study also demonstrated that the ΔDoc affinity tag had little effect on the properties of the proteins used in this study, including the enzymes. The ΔDoc affinity tag was employed together with the CBM-Coh module as a molecular counterpart that interacted with a relatively inexpensive and readily available beaded cellulose matrix. No chemical activation was required, and the native ability of the CBM to bind cellulose was exploited. The system is readily regenerated, and its components can be stored for prolonged periods of time and used repeatedly. Due to the high affinity and specificity between the cohesin and the truncated dockerin tag, even minute amounts of expressed proteins can be purified under mild elution conditions, making this a unique and highly useful general technique for affinity purification.


The authors are grateful to Dr. Ely Morag, Dr. Bareket Dassa, Dr. Ilit Noach, and Michal Slutzki for their assistance and insight provided during the preparation of this manuscript, and to Dr. Olga Khersonsky and Dr. Itamar Yadid for their comments regarding the kinetics analysis. This research was supported by grants from the United States-Israel Binational Science Foundation (BSF), Jerusalem, Israel and by the Israel Science Foundation (grant nos. 966/09 and 159/07), and the Yeda CEO Fund (Weizmann Institute). E.A.B. holds The Maynard I. and Elaine Wishner Chair of Bio-organic Chemistry.

Abbreviations used:

Clostridium thermocellum β-glucosidase
Coomassie brilliant blue
carbohydrate (cellulose)-binding module
recombinant CBM-cohesin fusion protein
a dockerin module containing cellulosomal exoglucanase
C. thermocellum scaffoldin subunit
C. thermocellum CipA cohesin2
C. thermocellum Cel48S dockerin module
C. thermocellum type-I dockerin from xylanase 10B
truncated form of the C. thermocellum Cel48S dockerin module
wild-type form of the C. thermocellum Cel48S dockerin module
immobilized metal-ion affinity chromatography
nickel nitrilotriacetic acid
Escherichia coli thioesterase/protease I
Geobacillus stearothermophilus xylanase T6
xylanase fusion protein bearing a full-length N-terminal dockerin
xylanase fusion protein bearing an N-terminal 16-residue truncated dockerin
a synthetic analog of the heavy chain Fc-binding B-domain of protein A from the bacterium Staphylococcus aureus.