High-throughput automated refolding screening of inclusion bodies


  • Renaud Vincentelli,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
    • These authors contributed equally to this work.

  • Stéphane Canaan,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
    2. Architecture et Fonction des Macromolécules Biologiques, UMR 6098, CNRS et Universités d'Aix-Marseille I et II, 31 chemin Joseph Aiguier, 13402 Marseille Cedex 20, France; fax: +00-334-91-16-45-36.
    • These authors contributed equally to this work.

  • Valérie Campanacci,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
  • Christel Valencia,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
    • Present address: Institut Gilbert Laustriat, IFR85, 74 route du Rhin, BP 60024, F-67401 Illkirch Cedex, France.

  • Damien Maurin,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
  • Frédéric Frassinetti,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
  • Loréna Scappucini-Calvo,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
  • Yves Bourne,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
  • Christian Cambillau,

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
  • Christophe Bignon

    1. Architecture et Fonction des Macromolécules Biologiques, Unité Mixte de Recherche (UMR) 6098, Centre National de la Recherche Scientifique (CNRS) et Universités d'Aix-Marseille I et II, 13402 Marseille Cedex 20, France
    2. Architecture et Fonction des Macromolécules Biologiques, UMR 6098, CNRS et Universités d'Aix-Marseille I et II, 31 chemin Joseph Aiguier, 13402 Marseille Cedex 20, France; fax: +00-334-91-16-45-36.


One of the main stumbling blocks encountered when attempting to express foreign proteins in Escherichia coli is the occurrence of amorphous aggregates of misfolded proteins, called inclusion bodies (IB). Developing efficient procedures for recovering the native protein structure by IB refolding is therefore an important challenge. Unfortunately, there is no “universal” refolding buffer: Experience shows that refolding buffer composition varies from one protein to another. In addition, the methods developed so far for finding a suitable refolding buffer suffer from a number of weaknesses. These include the small number of refolding formulations, which often leads to negative results, solubility assays incompatible with high throughput, and experiment formatting not suitable for automation. The present study set out to address some of these limitations. This resulted in the first completely automated IB refolding screening procedure to be developed using a 96-well format. The 96 refolding buffers were obtained using a fractional factorial approach. The screening procedure is potentially applicable to any nonmembrane protein, and was validated with 24 proteins in the framework of two Structural Genomics projects. The tests used for this purpose included quality control methods such as circular dichroism, dynamic light scattering, and crystallogenesis. Out of the 24 proteins, 17 remained soluble in at least one of the 96 refolding buffers, 15 passed large-scale purification tests, and five gave crystals.

In the context of Structural Genomics (SG) projects involving targets from Escherichia coli (ASG), Mycobacterium tuberculosis (MT), and viruses (SPINE), we have performed expression assays on ∼600 genes (Sulzenbacher et al. 2002; Vincentelli et al. 2003). One of the main obstacles we and other authors have encountered when expressing recombinant proteins in E. coli is the relatively low soluble protein yield obtained with many of the source organisms used. In the case of eukaryotes, viruses, and Mycobacterium tuberculosis, most of the genes were expressed in the form of insoluble aggregates called “inclusion bodies” (IB). This obstacle to obtaining suitable targets for performing structural studies was particularly severe in the case of MT, with which 93% of our 182 targets yielded IB when proteins were expressed fused to an N-terminal His tag.

IBs are assumed to result from illegitimate interactions between hydrophobic residues located in the core of different molecules. This process is auto-catalyzed and therefore rapidly results in the precipitation of all the recombinant proteins produced in the cell (Mukhopadhyay 1997). Methods have been designed to recover correctly folded proteins from these amorphous aggregates. These include the “dilution,” “dialysis,” and “solid phase” methods (De Bernardez-Clark 1998), all of which involve an initial IB solubilization step using highly concentrated solutions of chaotropic agents such as guanidinium chloride and urea. The subsequent step in all these methods consists of removing the denaturing agent and restoring the protein to its native shape from the unfolded soluble state. The pathway used to remove the chaotropic agent differs between the three methods, however, although the same result is reached in each case. With the dilution method, refolding is assumed to occur immediately upon diluting the protein in a large volume of nondenaturing buffer (“refolding buffer”), which has to be sufficiently large to both cancel out the solubilizing effect of the chaotropic agent and reduce the probability that protein interactions will occur. The dialysis method involves the use of the same initial and final buffer compositions as the dilution method, but in this case, there is no dilution to decrease the protein–protein contacts (Rudolph and Lilie 1996; Mukhopadhyay 1997). Finally, it was established that physically separating molecules from each other during the renaturation process (solid phase refolding) greatly improved the refolding yield (Stempfer et al. 1996).

Whatever the method used to replace the denaturing buffer with a nondenaturing one (dilution, dialysis, or solid phase), it would be easier to use a single refolding buffer. Unfortunately, experience has shown that the composition of the refolding buffer is strongly protein dependent and that simply maintaining a difference between the pH of the refolding buffer and the isoelectric point (IP) of the protein does not usually suffice to keep the protein soluble.

Hence the idea of testing several refolding buffers simultaneously. For instance, Perbio has addressed this issue with Pro-Matrix, a refolding kit consisting of nine basic buffers, which can be supplemented with additives (Qoronfleh 2004). Using a fractional factorial approach, Armstrong et al. (1999), Chen and Gouaux (1997), and Hampton Research (FoldIt) have each developed separate procedures using 16 refolding conditions.

Despite these improvements, some difficulties were still encountered in the protein solubility assays performed to monitor the refolding process. Because no solubility assay was provided with the Pro-Matrix kit, this assay had to be set up by the customer, and the methods suggested for a solubility assay in the case of the FoldIt kit (size exclusion chromatography [SEC]), as well as those used by Armstrong et al. (1999) and Chen and Gouaux (1997) (dialysis and centrifugation), were not compatible with a high-throughput or with automation, which are two of the most crucial features in SG studies.

To solve the problems associated with the above limitations, a protein solubility test based on light scattering has been devised (Trésaugues et al. 2004). In practice, the turbidity of the solution is assessed by measuring the optical density (OD) at 390 nm, before and after adding the protein. If the protein remains soluble, the absorbance remains unchanged. In the opposite case, the OD increases proportionally to the amount of precipitate produced. This procedure is much faster than SEC and can be easily automated, but the number of conditions was still limited to 12, and the proteins often precipitated in all of them. This clearly suggested that the number of conditions needed to be further increased. Such a quantitative jump has been attempted in microtiter plate format, using 203 refolding conditions (Sijwali et al. 2001). However, the latter study was only designed for screening different GSH:GSSG ratios.

It is worth noting that although increasing the number of refolding conditions increases the probability that a protein will meet a buffer composition favoring its solubility, it also increases the number of samples to be handled. One possible solution to this problem consists of automating the screening process. In addition, automation is required to obtain sufficiently large SG throughputs. A partially automated refolding screening procedure was recently described (Scheich et al. 2004). With this procedure, however, the automation did not include any test for assessing the solubility and only 30 refolding conditions were used.

We therefore designed a refolding strategy involving the use of 96 different buffers in microtiter plate format, based on the above mentioned idea that the probability of a protein encountering a buffer composition favoring correct folding was likely to increase with the number of buffers tested. The solubility assay used in our screening procedure is basically the same as that described by Trésaugues et al. (2004), which accounts for protein solubility, and not for protein folding. After the preparatory refolding stage, circular dichroism (CD), dynamic light scattering (DLS), and crystallogenesis quality control procedures were added to respectively assess the folding, aggregation state, and homogeneity of the protein solution. These methods were chosen because they can be applied in theory to any protein, which is a prerequisite in the field of post-Genomics, which deals mainly with proteins having an unknown function. Finally, the availability of a pipetting robot made it possible to automate the whole process in a 96-well plate format.

To the best of our knowledge, this is the first completely automated “wide spectrum” 96-well IB refolding screening procedure to be developed based on a factorial approach. The present article describes the setup involved and confirms the validity of the method, based on tests carried out with proteins originating from two SG projects.


Optimization of the solubility assay

The recently described solubility test, in which the turbidity of the solution is measured in terms of the light absorbance at 390 nm, involves light scattering by a protein precipitate (Trésaugues et al. 2004). As no proof was available that this wavelength was the most suitable one, we first addressed this point.

For this purpose, the absorbance of a bovine serum albumin (BSA) precipitate was scanned between 230 and 600 nm. As shown in Figure 1 (curve A), the absorbance decreased continuously from 230 to 600 nm. In addition to this regular decay, a small shoulder was present in the 280 nm region. To determine whether this feature was due to any remaining soluble proteins, the precipitate was spun down and the scanning performed again on the supernatant. Surprisingly, in this case, OD230–600 was indistinguishable from the baseline, which means that the protein content had been entirely converted into insoluble species. These results indicate that the absorbance pattern of the protein precipitate, which is shown in Figure 1 (curve A), was entirely accounted for in terms of light scattering and not even partially in terms of the absorbance of soluble proteins.

Because the solubility assay was expected to distinguish between the absorbance due to precipitated and soluble proteins, the same experiment was performed under conditions where the proteins remained 100% soluble. In this case (Fig. 1, curve B), the absorbance profile was that of a typical protein solution, peaking at 280 nm (aromatic side chains) and at 200 nm (peptide bonds). Note that only the beginning of the peptide bonds' absorbance peak (λ max 200 nm; Stoscheck 1990) was visible between 230 and 240 nm.

In conclusion, the wavelength to be used in the solubility test should satisfy the following contradictory criteria: (1) It should be high enough above 280 nm to prevent any risk of obtaining false negative results due to the absorbance of (partially or totally) soluble proteins at 280 nm and below, but (2) it should be as low as possible to provide the highest signal-to-noise ratio, according to curve A, and hence the most sensitive assay. In practice, wavelengths of 340 nm (manual procedure) and 350 nm (automated procedure) were selected because they fulfilled these two criteria and provided better results than 390 nm.

Selection of 96 refolding conditions

The chemicals listed in Table 1, which were used to prepare the refolding mixes presented in Figure 2, were selected on the basis of the following criteria:

  1. A pH range of 4 to 9 was chosen because the proteins to be screened had various IPs and were likely to denature below or above these values.

  2. Various ionic strengths (none; 100 mM NaCl or KCl; and 200 mM NaCl) were used because the solubility can increase (salting in) or decrease (salting out) with the salt concentration from one protein to another.

  3. With the dilution method used, refolding was allowed to proceed for a very short time. Amphiphilic components (glycerol, PEG) were introduced to prevent the hydrophobic residues of different molecules still accessible at intermediate refolding stages from interacting with each other. In addition, glycerol and PEG were already provided in other refolding kits (Trésaugues et al. 2004) and were compatible with crystallogenesis. Glucose and arginine were used for the same reason, although Arg had to be removed before the crystallogenesis trials (see below).

  4. Solubilizing reagents in the NDSB series were selected because they have been successfully used in protein crystallogenesis (Karaveg et al. 2003) and refolding experiments (Vuillard et al. 1998; Expert-Bezancon et al. 2003).

  5. Proteins bearing an odd number of cysteines can form unnatural intermolecular disulfide bonds, which is a possible cause of precipitation during the refolding process. β-Mercaptoethanol (β-MSH; 10 mM) was introduced to prevent this mispairing.

  6. The “cocktail” contained potential cofactors that might be required during the refolding process in the case of some proteins, whereas some other proteins tend to precipitate in the presence of divalent cations, hence the presence of EDTA.

  7. The chaotropes (urea and guanidinium chloride) present in the commercial kits were discarded because they were liable to damage the robot's pipetting valves.

It was necessary to use a fractional factorial approach on the first 80 wells, because the combination of 20 chemicals would have resulted in too many experimental points (the full factorial design would have been 2560 combinations).
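The idea of screening only a subset of a full factorial grid can be sketched as follows. This is a minimal illustration, not the SAmBA incomplete factorial algorithm used in the study: it draws a plain random subset, and the factor names and levels below are hypothetical placeholders rather than the actual contents of Table 1.

```python
import itertools
import math
import random

# Hypothetical factors and levels, for illustration only; the real
# chemicals and concentrations are those listed in Table 1 and Figure 2.
FACTORS = {
    "pH": [4, 5, 6, 7, 8, 9],
    "salt": ["none", "100 mM NaCl", "100 mM KCl", "200 mM NaCl"],
    "additive": ["none", "glycerol", "PEG", "glucose", "arginine"],
    "NDSB": ["absent", "present"],
    "reducing_agent": ["absent", "present"],
    "cocktail_or_EDTA": ["cocktail", "EDTA"],
}


def full_factorial_size(factors):
    """Number of mixes in the full factorial grid (product of level counts)."""
    return math.prod(len(levels) for levels in factors.values())


def sample_design(factors, n_wells, seed=0):
    """Draw n_wells distinct combinations at random from the full grid.

    A random draw is a crude stand-in for a balanced incomplete
    factorial design; it only shows the size reduction.
    """
    names = list(factors)
    grid = list(itertools.product(*(factors[name] for name in names)))
    rng = random.Random(seed)  # fixed seed so the plate layout is reproducible
    chosen = rng.sample(grid, n_wells)
    return [dict(zip(names, combo)) for combo in chosen]


design = sample_design(FACTORS, n_wells=80)
print(full_factorial_size(FACTORS), "possible mixes;", len(design), "selected")
```

A dedicated design tool would additionally balance how often each level appears across the selected wells, which a random draw does not guarantee.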

In the 16 remaining microplate wells, mini chaperones (a soluble form of GroEL; Altamirano et al. 1997) and redox components (GSH, GSSG, DsbA) were combined, because the disulfide bond formation/reduction during the folding process itself has been found to be crucial (Wei et al. 1999). Details of each of the refolding conditions are given in Figure 2.

Testing of 96 refolding conditions

The 96-well screening procedure was tested on a panel of 24 proteins from two SG projects: MT (18 targets) and SPINE (6 targets). The results obtained are given in Table 2. Eleven out of the 18 MT targets (61%) and all the SPINE targets subjected to screening remained soluble under at least one of the 96 refolding conditions. In addition, except for MT target Rv1373 (buffer 57), all the responsive targets remained soluble in many buffers, which made it possible to choose the most suitable one(s) for the downstream steps such as crystallogenesis. In addition, the pH was not found to be a decisive parameter, because most of the targets remained soluble in a wide pH range, except Rv1525, Rv1515c, Rv0323c, and Rv2045, which remained soluble only at pH 4. Generally speaking, no particular buffer composition (pH, ionic strength, etc.) peaked more than the others, which suggests that the solution was always protein specific. The solubility yield at the production stage also appeared to be very high: 10 out of the 11 responsive MT targets (91%), and five out of the six responsive SPINE targets (83%) succeeded in passing the large-scale refolding and the first concentration steps. Only one SPINE (63) and two MT (Rv0323c and Rv1515c) targets were lost during the second concentration step following the gel filtration. In these particular cases, CD was nonetheless performed, but on protein solutions with concentrations too low for crystallogenesis.

Validity of the refolding screening procedure

Protein solubility and folding superimpose satisfactorily, but the overlap is not always 100%. We therefore tried to assess the overlap in the case of proteins that were quantitatively refolded. In post-Genomics, one is often dealing with genes encoding proteins with an unknown function, and functional tests for each of the targets are frequently lacking. Therefore, depending on the targets, generic and/or specific methods can be used to assess the folding.

Generic methods

Circular dichroism (protein folding), dynamic light scattering (protein aggregation), and crystallogenesis (protein folding and dispersion homogeneity) were used for this purpose. Note that out of the 17 targets that reached the large-scale refolding stage, five could not be subjected to CD analysis either because of the presence of NDSB in the refolding buffer or because the amount of protein available was not sufficient. Crystallogenesis was also taken to be a valid folding criterion, because only properly folded proteins with an even aggregation state yield well-ordered crystals.

The results obtained with these three methods, which are summarized in Table 2, indicated upon CD analysis that all the targets that produced crystals also displayed folding features. This was so in the case of both MT (Rv2392, Rv1399c, Rv1208) and SPINE (targets 5 and 23). However, the opposite was not true: CD-positive MT targets Rv1564c, Rv1523, Rv1515c, Rv0323c, and Rv2045 and SPINE target 10 did not produce crystals. Therefore, although the sole presence of secondary structures (β-sheet and/or α-helix) did not necessarily lead to successful crystallogenesis, its absence could be said to suggest a poor prognosis in terms of crystallogenesis, at least with this particular protein sample. By contrast, protein aggregation detected by DLS analysis seems to have a lower predictive value, because MT target Rv1208 produced crystals despite its aggregated state. Finally, the crystallization yield obtained with this procedure (five targets [36%]) was outstandingly high.

Specific method

Although the presence of secondary structures (CD), the lack of aggregates (DLS), and crystal growth argue in favor of correct folding, it is necessary to carry out more specific tests whenever possible. This was the case with Rv1399c. Because this target had been annotated as a putative lipase, a specific enzymatic assay was set up (Canaan et al. 2004). As illustrated in Table 3, the enzymatic activity could be measured after the refolding step, which provides evidence that our refolding screening procedure yields functional proteins, and not only soluble proteins. Two additional points are worth noting in Table 3: First, the refolding yield could be assessed, and turned out to be particularly high (50%). Second, 24 h after the refolding process, the total enzymatic activity was six times higher, which reflects the occurrence of a slow refolding process.

Scale up: Criteria for the choice of refolding buffer

Isoelectric point

As can be seen from Table 2, whenever possible, we chose conditions giving the largest difference in pH with the isoelectric point (IP) of the protein. Although we do not know how many proteins would remain soluble if a mixture with a pH near the IP was used, our choice actually resulted in 100% of the targets being successfully purified.
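This selection rule can be written down in a few lines. The sketch below assumes the screening result is available as a mapping from buffer identifier to buffer pH for the wells that stayed clear; the function name, the buffer labels, and the numeric values are hypothetical.

```python
def pick_buffer(protein_ip, soluble_buffers):
    """Return the (buffer_id, pH) pair whose pH is farthest from the protein IP.

    soluble_buffers maps a buffer identifier to its pH and should contain
    only the buffers in which the protein remained soluble.
    """
    if not soluble_buffers:
        raise ValueError("no soluble buffer to choose from")
    return max(soluble_buffers.items(), key=lambda item: abs(item[1] - protein_ip))


# Hypothetical protein (IP 6.2) that stayed soluble in three buffers.
soluble = {"buffer_a": 4.0, "buffer_b": 7.0, "buffer_c": 8.5}
best_id, best_ph = pick_buffer(6.2, soluble)
print(best_id, best_ph)
```

In practice this criterion is only one input: compatibility with downstream steps, discussed next, can override it.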

Compatibility with downstream steps

High concentrations of arginine sometimes artificially maintained proteins in the soluble state. Consequently, the removal of arginine often resulted in protein precipitation (not illustrated). In addition, due to its “anti-aggregation” effects (Umetsu et al. 2003), 800 mM Arg would have hampered crystallogenesis. We therefore tested the solubility of Rv2391 and Rv1373 in buffers with decreasing concentrations of Arg. Because these proteins remained soluble without any Arg, we decided to purify them in Arg-free buffer 57.

Pipetting a solution containing both a high protein concentration and 20% glycerol would lead to poor performances of the Cartesian crystallization robot in the ∼100 nL range. The same dilution technique was therefore used with glycerol as that described above in the case of Arg, with similar results and effects on large-scale purification.

It can therefore be said that although Arg and glycerol were helpful during the refolding step, they were no longer required subsequently to maintain the solubility of the protein, at least with these particular targets.

Choosing between manual and automated procedures

If a small number of proteins have to be screened, the manual procedure is preferable, whereas a large number of targets (tens to hundreds) requires the use of an automated procedure. In this case, screening one plate takes only 5 min, and in its present form, the robot can process 27 plates in 2 h 30 min without any human intervention. Thanks to the color code, the automated procedure, in addition to saving time, made it possible to display the results in a form that was easier to analyze than with the manual procedure (Fig. 3B).


IB refolding versus soluble expression in SG

To manage our SG programs, we have developed a general strategy based on several “screening rounds” of increasing complexity (Vincentelli et al. 2003). In the first round, targets are expressed using a single vector encoding an N-terminal His-tag fusion and a single E. coli strain. In the second round, eight E. coli strains are transformed by the same vector as in round 1, and used to express the recombinant proteins at different temperatures. In the third round, the coding sequences are fused with maltose-binding protein, thioredoxin, glutathione S-transferase, and NusA. In the fourth round, the same experimental conditions are used as in round 1, except that the proteins are refolded from IB.

Comparisons between rounds 3 and 4

In the MT program, screening round 3 seems to be the most fruitful procedure so far, as it yielded 56 soluble proteins after proteolytic cleavage of the fusion (S. Canaan, R. Vincentelli, D. Maurin, F. Frassinetti, L. Scappucini-Calvo, Y. Bourne, C. Cambillau, and C. Bignon, unpubl.). However, its cost (in terms of the time required to prepare fusion constructs and to process the fusion vectors, the price of the endopeptidase, etc.) could easily be prohibitive. Conversely, IB refolding at preparative scale yielded 10 MT soluble proteins at a much lower cost, starting with only a fraction (27%) of the insoluble MT targets. Although no SPINE target was processed in round 3, it is worth noting that five out of six targets (83%) yielded soluble proteins in the preparatory stages of IB refolding, starting with only 3% of SPINE insoluble proteins.

Comparisons between rounds 1 and 4

In addition, the success rate (defined as the percentage of the proteins that succeeded in passing the scale-up step) obtained in round 4 with 18 MT and six SPINE targets (61% and 83%, respectively) was much more satisfactory than that obtained in round 1: Out of 182 MT and 244 SPINE target genes, only 14 (7.7%) MT targets and 80 (33%) SPINE targets were directly recovered in the form of soluble proteins after E. coli cell lysis. This means that at least in some cases, the IB chemical refolding procedure produces soluble species more efficiently than living bacteria. Therefore, we propose to adopt IB refolding in the initial stages of SG projects dealing with highly insoluble proteins, such as the MT project. The validity of this approach has been established in the case of small (<18 kDa) proteins intended for NMR structural analysis (Maxwell et al. 2003). Because 58% of the proteins were found to be properly refolded when a single renaturation buffer was used, one can expect to obtain a much higher refolding yield if an upstream refolding screening procedure is carried out in addition (Maxwell et al. 2003).

Limitations of the screening procedure

The 96-well plate refolding screening procedure is not suitable for use with either high pressure (St. John et al. 1999) or reverse micelle (Vinogradov et al. 2003) approaches, for physical reasons. Nor can this method be used to study refolding processes using time-dependent techniques such as stepwise dialysis with additives (Umetsu et al. 2003) or air oxidation techniques (Menzella et al. 2002). Other limitations of our method are due to the OD340 detection method used:

  1. IB redissolved in chaotrope must be free of contaminants, otherwise these might promote precipitation, yielding false negative results. In this respect, the nickel affinity purification step is of particular importance.

  2. If the protein concentration is too low in the chaotropic agent, there may be no detectable precipitate after diluting the protein in refolding buffer, even if the buffer is not favorable to maintaining the solubility.

  3. We have observed that the first OD340/350 reading was sometimes misleading: Some positive spots became negative due to the slow protein precipitation with time. The opposite also occurred, presumably due to the presence of proteins with slow refolding kinetics, such as Rv1399c (see Table 3). This prompted us to systematically perform a second OD340/350 reading after a 1-d interval. Because refolded proteins must remain soluble throughout the long crystallization process, long-lasting solubility is more desirable than instant but transient solubility.
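The comparison of the two readings can be sketched as a small decision rule. The function below is an illustration only: the 0.05 cutoff is the one given in Materials and methods, whereas the function name and the category labels are hypothetical, and an OD of exactly 0.05 is treated here as insoluble because the source does not specify that case.

```python
THRESHOLD = 0.05  # blank-subtracted OD cutoff given in Materials and methods


def compare_readings(od_day0, od_day1, threshold=THRESHOLD):
    """Classify one well from its blank-subtracted OD at day 0 and day 1."""
    soluble_day0 = od_day0 < threshold
    soluble_day1 = od_day1 < threshold
    if soluble_day0 and soluble_day1:
        return "soluble at both readings"
    if soluble_day0:
        return "slow precipitation"   # clear at first, turbid after 24 h
    if soluble_day1:
        return "cleared after 24 h"   # turbid at first, clear after 24 h
    return "insoluble at both readings"


# Hypothetical OD values for four wells.
for day0, day1 in [(0.01, 0.02), (0.01, 0.20), (0.20, 0.01), (0.20, 0.30)]:
    print(day0, day1, "->", compare_readings(day0, day1))
```

Only wells in the first category meet the requirement for long-lasting solubility stated above.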

Possible improvements and perspectives

The high success rate obtained with the crystallogenesis procedure could be further increased by separating folded from misfolded species quantitatively prior to the crystallogenesis trials. Ion exchange chromatography might be a suitable method for this purpose, because folded and unfolded soluble forms of the same protein do not have the same overall charge. Reversed phase and hydrophobic interaction chromatography (Scheich et al. 2004) might also be suitable methods, as misfolded proteins are assumed to display a larger number of hydrophobic residues on their surface. Another possibility is to add a freezing/thawing step. In the case of Rv1399c, this step was found to differentiate between folded and unfolded populations: The inactive (misfolded) enzymes precipitated after thawing (12 mg), whereas the active (correctly folded) enzymes remained soluble, resulting in a greater specific activity (Table 3).

Low refolding yields and poorly diffracting crystals could also be improved by appropriately tuning one of the basic conditions provided by our refolding kit, as done routinely in crystallogenesis.

Another possible way of improving our screening method is to extend the pH range below 4 and above 9, because a dramatic increase in protein solubility has been reported to occur when the pH increased from 11 to 12.5 during the refolding of human growth hormone (Patra et al. 2000).

To improve the throughput, the Tecan Genios+ micro-plate reader can be used with 384-well plates. This would increase the number of mixtures to be screened on the same surface fourfold. If an automated plate sealer, a robot-driven centrifuge, and carousels in a 4°C atmosphere were added in the immediate vicinity of the robot, the screening process could be run nonstop, and the process would be completely automated.

Refolding screening can be performed at any stage in protein production procedures involving protein solubility problems, as illustrated here in the case of SPINE target 23, which was not refolded from IB, but was resolubilized from a precipitate that formed after the protein had been eluted from the Ni affinity column (Table 2).

Lastly, the role of lysis buffers in protein precipitation processes has been investigated (Lindwall et al. 2000). Because many of the refolding buffers in Figure 2 proved to be highly effective for maintaining protein solubility, they could also be tested for use as lysis buffers along with a 96-well sonicator.

Materials and methods

Cell growth and lysis

All coding sequences were subcloned by recombination (Gateway, Invitrogen) into the pDEST17O/I expression vector, a modified pDEST17 (Invitrogen) into which LacO and LacI were inserted to allow better control of protein expression (Canaan et al. 2004). BL21(DE3)pLysS cells (Novagen) were transformed with 150 ng of the resulting constructs, and plated on ampicillin (100 μg/mL) and chloramphenicol (34 μg/mL). After one night at 37°C, all the colonies were scraped off the plate using a toothpick, and used to inoculate 1 L of LB. When the cell culture had reached an OD600 of 0.5, IPTG was added to a final concentration of 2 mM. After 4 h of shaking at 37°C, cells were recovered by centrifugation, resuspended in 50 mL of lysis buffer (50 mM Tris at pH 8, 150 mM NaCl, 1 mM EDTA, 0.1% Triton X100, 1 mM PMSF, 0.25 mg/mL lysozyme) and frozen overnight at −80°C.

Processing of inclusion bodies

After thawing the cell suspension, DNAse and MgSO4 were added at final concentrations of 10 μg/mL and 20 mM, respectively. The lysate was incubated for 30 min at 37°C (or until it was no longer viscous), and then spun for 30 min at 17,000g. After discarding the supernatant, the pellet was thoroughly resuspended in 50 mL of Tris buffer (50 mM Tris at pH 8, 150 mM NaCl), disrupted by sonicating it four times with a 15-sec pulse, and spun again. This washing procedure was repeated three times. The final pellet was solubilized in 20 mL of 50 mM Tris, 150 mM NaCl, 10 mM imidazole, 8 M guanidinium chloride. After a 30-min run at 17,000g, the supernatant was loaded onto a 5-mL Chelating Sepharose Fast Flow column (Amersham Biosciences) preequilibrated with 50 mM Tris, 150 mM NaCl, 10 mM imidazole, and 8 M urea (buffer A). The column was washed with buffer A supplemented with 50 mM imidazole, and the recombinant protein was eluted with buffer A containing 250 mM imidazole. Protein purity and integrity were checked by SDS-PAGE. Elution fractions containing the protein of interest were pooled, and the imidazole was removed using a desalting column. Proteins were concentrated to at least 5 mg/mL and, if necessary, cysteines were reduced by incubating for 1 h in the presence of 10 mM β-mercaptoethanol.

Screening plates

The 96 refolding mixes (Table 1; Fig. 2) were prepared by hand and stored in 50-mL tubes. A 95-μL aliquot of each refolding solution was dispensed by the Tecan Genesis Freedom 200 robot (Fig. 3A) into the corresponding well of flat-bottom clear 96-well microplates (Greiner). Approximately 500 plates could be prepared in advance from each 50-mL tube using this procedure. After filling, the plates were sealed manually and stored at −20°C until use (we never observed any buffer precipitation on thawing).
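As a rough check on the plate count quoted above, the number of 95-μL aliquots available from one 50-mL tube can be worked out directly. The sketch below is illustrative only: the function name and the optional dead-volume parameter are assumptions and are not part of the original protocol.

```python
def plates_per_tube(tube_volume_ml: float = 50.0,
                    aliquot_ul: float = 95.0,
                    dead_volume_ml: float = 0.0) -> int:
    """Return how many plates one tube of a single refolding mix can fill.

    Each plate receives one aliquot of a given mix, so the number of
    aliquots per tube equals the number of plates that tube can supply.
    dead_volume_ml is a hypothetical allowance for liquid the robot
    cannot aspirate (it is not specified in the text).
    """
    usable_ul = (tube_volume_ml - dead_volume_ml) * 1000.0
    return int(usable_ul // aliquot_ul)


if __name__ == "__main__":
    print(plates_per_tube())                    # theoretical maximum
    print(plates_per_tube(dead_volume_ml=2.0))  # with an assumed 2 mL dead volume
```

With no losses the theoretical maximum is 526 plates per tube, which is consistent with the figure of approximately 500 once some pipetting loss is allowed for.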

Buffer compositions of wells A1 to H10 (see Fig. 2) were determined using SAmBA software (Audic et al. 1997). Wells A11 to H12 were filled with a single buffer (50 mM Tris at pH 8, 150 mM NaCl, 1 mM EDTA) containing different combinations of chaperones and/or redox components (DsbA, dbGroEL, GSH, GSSG).

Refolding screening

Both a manual and a robot-assisted procedure were set up.

Manual procedure

After thawing, the content of each of the 96 wells was mixed individually with 5 μL of an ∼5 mg/mL urea-denatured and β-MSH-reduced protein solution (see “Processing of inclusion bodies,” above), using a multichannel pipette. Immediately after mixing, the turbidity was assessed by measuring the OD at 340 nm using a μQuant microplate reader (BioTek Instruments Inc.) and the KC4 software program. The blank, that is, the absorbance of each well recorded before adding the protein, was subtracted automatically by the software. The protein was taken to be soluble at OD < 0.05. The plate was then sealed and stored at 4°C. Twenty-four hours later, the seal was removed and a second reading was performed.
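The blank subtraction and the OD < 0.05 solubility criterion described above can be sketched as follows. This is a minimal illustration, not the KC4 software's implementation: the function names and the three status labels are hypothetical, the example readings are invented, the same blank is assumed to apply to both readings, and a corrected OD of exactly 0.05 is treated as not soluble (the text does not specify this case).

```python
SOLUBLE_OD_THRESHOLD = 0.05  # cutoff stated in the text


def corrected_od(raw_od: float, blank_od: float) -> float:
    """Subtract the blank (buffer-only reading) from a raw reading."""
    return raw_od - blank_od


def classify_well(blank_od: float, od_t0: float, od_t24: float) -> str:
    """Classify one well from its blank and its two raw readings."""
    if corrected_od(od_t0, blank_od) >= SOLUBLE_OD_THRESHOLD:
        return "precipitated immediately"
    if corrected_od(od_t24, blank_od) >= SOLUBLE_OD_THRESHOLD:
        return "precipitated within 24 h"
    return "soluble"


# Hypothetical readings (blank, t = 0, t = 24 h) for three wells.
readings = {
    "A1": (0.040, 0.060, 0.065),
    "B1": (0.040, 0.055, 0.300),
    "C1": (0.040, 0.500, 0.650),
}

for well, (blank, t0, t24) in readings.items():
    print(well, classify_well(blank, t0, t24))
```

Keeping the two readings separate makes it possible to distinguish buffers in which the protein precipitates on dilution from those in which it aggregates slowly during storage at 4°C.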

Automated procedure

A Tecan Genesis Freedom 200 robot with an eight-needle pipetting arm, a microplate handling arm, and a Tecan Genios Plus microplate reader was used (Fig. 3A). The robot transferred 5 μL of an ∼5 mg/mL urea-denatured and β-MSH-reduced protein solution from a microtube into each of the 96 wells. After dispensing the solution, the robot moved the microplate into the reader. The latter mixed the contents of each well by shaking the plate for 30 sec, and then measured the optical density at a wavelength of 350 nm (and not 340 nm as in the manual procedure, because this filter was not available for the automated setup). At the end of the assay, the software driving the reader (Magellan) subtracted the previously recorded blank value, and the resulting OD value could be displayed on the computer screen in the form of a microplate layout with a color code. At OD < 0.05 the well was green, and at OD > 0.05 the well background was red (see Fig. 3B). Because the robot could not seal or store the plates at 4°C, this was done manually. On the next day, the plates were returned to the robot and, after stripping the cover, another reading at 350 nm was performed. As expected, the wavelength shift from 340 to 350 nm resulted in only negligible differences (see Fig. 1).
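The color-coded microplate layout described above can be approximated in a few lines. This sketch does not reproduce the Magellan software: the function names and the text rendering are hypothetical, the example OD values are invented, and an OD of exactly 0.05 (a case the text leaves open) is treated as red.

```python
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)
THRESHOLD = 0.05  # OD cutoff stated in the text


def well_color(od: float) -> str:
    """Green below the cutoff, red otherwise (OD == 0.05 is treated as red here)."""
    return "green" if od < THRESHOLD else "red"


def render_plate(ods: dict) -> str:
    """Return a text grid with G (green) or R (red) for each of the 96 wells."""
    lines = ["   " + " ".join(f"{c:>2}" for c in COLUMNS)]
    for row in ROWS:
        cells = []
        for c in COLUMNS:
            letter = well_color(ods[f"{row}{c}"])[0].upper()
            cells.append(f"{letter:>2}")
        lines.append(f"{row}  " + " ".join(cells))
    return "\n".join(lines)


# Hypothetical blank-corrected ODs: column 1 turbid, all other wells clear.
example = {f"{r}{c}": (0.20 if c == 1 else 0.01) for r in ROWS for c in COLUMNS}
print(render_plate(example))
```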

Large-scale protein refolding

The total volume of denatured protein obtained at the end of the “Processing of inclusion bodies” step was diluted 20-fold in the refolding buffer selected at the end of the screening procedure. When more than one buffer could be used, the default choice was the one most compatible with the downstream steps (CD, DLS, crystallogenesis). If the first buffer selected proved to be unsatisfactory at any of the subsequent steps, a second one was tested, and so on until a buffer that was usable throughout was found. The renatured protein was then concentrated in a stirred Amicon cell and further purified by SEC using the same refolding buffer. This step removed the 0.4 M urea remaining after dilution of the protein in the refolding buffer, along with any unwanted compounds (see Results section). The protein was concentrated to 5 mg/mL and tested in parallel by CD, DLS, and automated crystallogenesis trials.
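The 0.4 M residual urea quoted above follows from the 20-fold dilution, which is also the dilution applied in the screening plates (5 μL of protein into 95 μL of buffer). The short calculation below makes this explicit. The function names are hypothetical, and the starting urea concentration of 8 M is taken from buffer A on the assumption that the desalted protein was kept in the same urea concentration.

```python
def dilution_factor(sample_volume: float, buffer_volume: float) -> float:
    """Fold dilution when a sample is added to a volume of buffer."""
    return (sample_volume + buffer_volume) / sample_volume


def after_dilution(concentration: float, factor: float) -> float:
    """Concentration of a solute after diluting by the given factor."""
    return concentration / factor


screen_factor = dilution_factor(5.0, 95.0)  # 5 uL protein into 95 uL buffer

print(screen_factor)                       # fold dilution in the screening plate
print(after_dilution(8.0, screen_factor))  # residual urea (M), assuming 8 M initially
print(after_dilution(5.0, screen_factor))  # protein (mg/mL), from about 5 mg/mL
```

Under these assumptions the protein concentration during refolding is about 0.25 mg/mL, both in the screen and at large scale.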

Circular dichroism, dynamic light scattering, crystallogenesis

When the refolding buffer composition was compatible (i.e., when the buffer did not absorb between 190 and 260 nm), the presence of secondary structures was assessed by performing CD analysis (Jasco PTC-423S). Data deconvolution by the CDNN program provided the percentages of the strands, helices, turns, and random coils, which varied with both the molecular mass and the concentration used. Spectra of purified protein (final concentration 0.2 mg/mL) were recorded at 20°C at wavelengths ranging between 190 and 260 nm, with a 30-min averaging step. The final CD spectrum obtained was the mean of three measurements. The protein was taken to have significant secondary structure features when the α-helix and β-sheets amounted to more than 30%.

In addition to promoting secondary structures, efficient refolding buffers were expected to favor mono- (or pauci-) meric states of the refolded protein. To assess the state of aggregation of the proteins after the refolding process, these were subjected to DLS analysis in line with the manufacturer's instructions. Experiments were performed with a Dynapro MSTC-200 (Protein Solutions) at 20°C. Samples were filtered prior to the measurements (using Millex syringe filters, pore size 0.22 μm; Millipore Corp.). The fractions of monomer, dimer, and so forth were calculated using the software program provided by the manufacturer. Proteins were taken to be aggregated when the hydrodynamic radius was ≥10 nm (in general, the Rh of proteins is in the 2–4-nm range). The result of a DLS experiment was taken to be acceptable when (1) the polydispersity was moderate (<30%), (2) the major component did not consist of aggregates, and (3) the major component comprised at least 95% of the detected particles.
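The CD and DLS acceptance criteria given above can be written as two simple checks. This is only a sketch under stated assumptions: the class and function names are hypothetical, the 30% CD criterion is read as applying to the sum of α-helix and β-sheet content, and the "major component" is taken to be the population with the largest share of detected particles. The example values are invented.

```python
from dataclasses import dataclass
from typing import List

MIN_STRUCTURED_PCT = 30.0      # alpha-helix + beta-sheet must exceed this
MAX_POLYDISPERSITY_PCT = 30.0  # polydispersity must stay below this
AGGREGATE_RADIUS_NM = 10.0     # Rh at or above this counts as aggregated
MIN_MAJOR_FRACTION_PCT = 95.0  # major component must reach this share


@dataclass
class DlsPopulation:
    radius_nm: float      # hydrodynamic radius of this population
    fraction_pct: float   # share of detected particles


def cd_ok(helix_pct: float, sheet_pct: float) -> bool:
    """CD criterion: significant secondary structure content."""
    return helix_pct + sheet_pct > MIN_STRUCTURED_PCT


def dls_ok(populations: List[DlsPopulation], polydispersity_pct: float) -> bool:
    """DLS criteria (1) to (3) from the text, applied to the major population."""
    major = max(populations, key=lambda p: p.fraction_pct)
    return (polydispersity_pct < MAX_POLYDISPERSITY_PCT
            and major.radius_nm < AGGREGATE_RADIUS_NM
            and major.fraction_pct >= MIN_MAJOR_FRACTION_PCT)


# Hypothetical measurements for illustration only.
good = [DlsPopulation(radius_nm=3.1, fraction_pct=98.0),
        DlsPopulation(radius_nm=45.0, fraction_pct=2.0)]
aggregated = [DlsPopulation(radius_nm=60.0, fraction_pct=97.0),
              DlsPopulation(radius_nm=3.0, fraction_pct=3.0)]

print(cd_ok(helix_pct=22.0, sheet_pct=15.0))
print(dls_ok(good, polydispersity_pct=18.0))
print(dls_ok(aggregated, polydispersity_pct=18.0))
```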

Crystallogenesis was performed using nano-drop robotics, as previously described (Sulzenbacher et al. 2002).

Table 1. Chemicals used to make the first 80 refolding buffers

Buffer (50 mM) | Ionic strength | Amphiphilic | Detergent (100 mM) | Reducing agent (10 mM) | Additive
NaAc, pH 4 | NaCl 100 mM | Glycerol 20% (v/v) | NDSB 195 | β-MSH | Arginine 800 mM
MES, pH 5 | NaCl 200 mM | PEG 4000 0.05% (w/v) | NDSB 201 | | Glucose 500 mM
MES, pH 6 | KCl 100 mM | PEG 400 0.05% (w/v) | NDSB 256 | | Cocktail [a]
TRIS, pH 7 | | | | | EDTA 1 mM
TRIS, pH 8 | | | | |
CHES, pH 9 | | | | |

The concentrations indicated are those used before adding the protein.

[a] Consisted of 50 μM of each of the following: NADH, thiamine HCl, biotin, CaCl2, MgCl2, CuSO4, ZnCl2, CoSO4, ADP, and NiCl2.

Table 2. (A) MT and SPINE targets remaining soluble in at least one refolding buffer and (B) summary of positive targets at each step

(A)
Target | MW | Organism | Soluble in buffer [a] | Purification | IP | pH | CD | DLS | Crystal
Rv2391 | 66 | MT | 39, 54, 57 | 57 (−Arg) | | | | |
Rv2392 | 30 | MT | 39, 49, 55, 56, 59, 61, 63, 64, 66 | 59 | 5.87 | 8 | Ok | nd | Yes
Rv1399c | 36 | MT | 41, 44, 48, 49, 56, 59, 65, 66 | 41 | 4.38 | 7 | Ok | M | Yes
Rv1208 | 37 | MT | 41, 43, 48, 54, 56, 59, 63, 65, 66, 68, 69, 70, 74, 80 | 74 | 4.75 | 9 | Ok | A | Yes
Rv1373 | 40 | MT | 57 | 57 (−Arg) | 6.36 | 8 | nd | A | No
Rv1564c | 84 | MT | 41, 43, 44, 49, 56, 57, 59, 63, 66 | 41 | 4.95 | 7 | Ok | D | No
Rv1523 | 40 | MT | 4, 7, 10, 11, 12 | 4 (−glyc) | 8.06 | 4 | Ok | nd | No
Rv1515c | 36 | MT | 4, 5, 7, 10, 11, 12 | 4 (−glyc) | 6.79 | 4 | Ok | nd | No [b]
Rv0323c | 27 | MT | 2, 3, 4, 5, 9, 10, 11, 12 | 4 (−glyc) | 5.81 | 4 | Ok | nd | No [b]
Rv2045c | 59 | MT | 3, 4, 5, 6, 7, 10, 11, 12 | 4 | 7.67 | 4 | Ok | nd | No
Rv3487c | 29 | MT | 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 18, 19, 20, 21, 22, 23, 24, 29, 45, 47, 49, 54, 57, 75 | nd | 8.85 | | nd | nd | nd
SPINE 5 | 23 | Sendai | 10, 58, 59, 67, 73, 76 | 69 | 5.06 | 9 | Ok | T | Yes
SPINE 10 | 23 | Measles | 1, 2, 3, 4, 5, 6, 9, 10, 11, 12, 16, 17, 22, 24, 26, 32, 49, 54, 75, 78, 79 | 6 | 8.99 | 4 | Ok | A | No
SPINE 21 | 52 | SFV | 1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 21, 29, 31, 45, 49, 75, 78 | 4, 6 | 8.80 | 4 | nd | A | No
SPINE 22 | 53 | SFV | 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 31, 45, 49, 57, 78, 79 | nd | 9.03 | | nd | nd | nd
SPINE 23 [c] | 23 | Human | All except 5, 6, 16, 17, 26, 42, 53, 61, 65, 76 | 33 | 8.68 | 6 | Ok | D | Yes
SPINE 63 | 23 | HIV | 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 26, 27, 28, 29, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 47, 48, 49, 51, 52, 54, 55, 57, 58, 66, 78 | 19 | 9.9 | 5 | Ok | H | nd [b]

(B)
Target number | Responsive targets | Large-scale purification | CD OK | DLS OK | Crystal

(Target) The Rv nomenclature used was that of the MT genome (Cole et al. 1998; Camus et al. 2002). (MW) Theoretical molecular weight (kDa). (IP) Isoelectric point (taking into account the His tag when present). (pH) pH of the mix used for large-scale purification. (CD) Ok, the protein fulfilled the criteria defined in Materials and Methods. (DLS) Only the main (>95%) population (M, D, etc.) was included in the table. (M) Monomeric; (D) dimeric; (T) tetrameric; (H) hexameric; (A) aggregates (see Materials and Methods for details).

[a] The numbers refer to the buffers listed in Fig. 2 (1 = 1A, 2 = 1B … 9 = 2A, etc.). (−Arg), (−glyc) Protein purification was performed using the buffer indicated devoid of arginine or glycerol, respectively.

[b] Lost during gel filtration or after the last concentration step.

[c] This target was not refolded from IB, but from a Ni eluate that precipitated just after elution.

(Target number) Number of targets subjected to refolding screening. (Responsive targets) Number of targets subjected to refolding screening that remained soluble in at least one refolding buffer. (DLS OK) DLS was taken to be satisfactory when the criteria defined in Materials and Methods were fulfilled.
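
The buffer numbering described in the Table 2 footnotes (1 = 1A, 2 = 1B … 9 = 2A) runs down each plate column in turn, so a buffer number can be converted to a plate position and back. The helper names below are hypothetical, and the sketch assumes that the pattern shown in the footnote continues regularly across all 12 columns.

```python
ROWS = "ABCDEFGH"  # eight rows per plate column


def buffer_to_well(number: int) -> str:
    """Convert a buffer number (1 to 96) to a position such as '2A'."""
    if not 1 <= number <= 96:
        raise ValueError("buffer number must be between 1 and 96")
    column_index, row_index = divmod(number - 1, len(ROWS))
    return f"{column_index + 1}{ROWS[row_index]}"


def well_to_buffer(well: str) -> int:
    """Convert a position such as '2A' back to its buffer number."""
    column, row = int(well[:-1]), well[-1]
    return (column - 1) * len(ROWS) + ROWS.index(row) + 1


for n in (1, 2, 9, 57, 80):
    print(n, buffer_to_well(n))
```

With this convention buffers 1 to 80 fill columns 1 to 10, which matches the 80 SAmBA-designed mixes in wells A1 to H10.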

Table 3. Rv1399c refolding in the preparatory stage

Step | Protein (mg) | Total activity (U) | Active protein (mg) | Specific activity (U/mg) | Refolding yield (%)
Ni2+ affinity column and concentration | 160 | 0 | 0 | 0 | 0
Dilution in refolding buffer | 160 | 12,880 | 9.5 | 80.5 | 5.9
Dilution in refolding buffer (24 h later) | 160 | 77,760 | 57.6 | 486 | 36
Freezing/thawing (before centrifugation) | 92 | 96,600 | 71.5 | 1050 | 44.7
Freezing/thawing (after centrifugation) | 80 | 108,000 | 80 | 1350 | 50

The enzymatic activity was measured as described (Canaan et al. 2004), one unit (U) of activity being defined as the hydrolysis of one micromole of substrate per minute. The amount of active protein was calculated by dividing the total activity recorded at each step by the maximum specific activity (1350 U/mg). The refolding yield was calculated by dividing the amount of active protein obtained in each step by the amount of starting material (160 mg eluted from the Ni affinity column).
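
The derived columns of Table 3 can be recomputed from the protein amounts and total activities using the definitions given in the table note. The function names below are hypothetical; the constants (1350 U/mg and 160 mg) and the input values are those reported in the table.

```python
MAX_SPECIFIC_ACTIVITY = 1350.0  # U/mg, maximum specific activity (table note)
STARTING_MATERIAL_MG = 160.0    # mg eluted from the Ni affinity column


def active_protein_mg(total_activity_u: float) -> float:
    """Active protein = total activity / maximum specific activity."""
    return total_activity_u / MAX_SPECIFIC_ACTIVITY


def specific_activity(total_activity_u: float, protein_mg: float) -> float:
    """Specific activity = total activity / total protein at that step."""
    return total_activity_u / protein_mg


def refolding_yield_pct(total_activity_u: float) -> float:
    """Yield = active protein / starting material, as a percentage."""
    return 100.0 * active_protein_mg(total_activity_u) / STARTING_MATERIAL_MG


# (step, protein in mg, total activity in U), as reported in Table 3.
steps = [
    ("Dilution in refolding buffer", 160.0, 12880.0),
    ("Dilution in refolding buffer (24 h later)", 160.0, 77760.0),
    ("Freezing/thawing (before centrifugation)", 92.0, 96600.0),
    ("Freezing/thawing (after centrifugation)", 80.0, 108000.0),
]

for name, protein_mg, total_u in steps:
    print(f"{name}: "
          f"active {active_protein_mg(total_u):.1f} mg, "
          f"specific activity {specific_activity(total_u, protein_mg):.1f} U/mg, "
          f"yield {refolding_yield_pct(total_u):.1f}%")
```

Small differences in the last digit relative to the table (for example in the first dilution row) reflect rounding of the intermediate values.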
Figure 1.

Absorbance spectra of precipitated and soluble forms of a protein. Twenty microliters of a 20 mg/mL BSA solution were diluted in 500 μL of either 100% isopropanol or 8 M guanidinium chloride. A chaotropic solution was used to ensure that the entire protein content was soluble. The absorbance of the resulting protein suspension (in isopropanol) or solution (in guanidinium chloride) was recorded from 230 to 600 nm, using a Varian Cary Scan 50 spectrophotometer. After subtracting the baseline (the absorbance of each solvent in the absence of protein), the absorbance intensities were plotted vs. the wavelengths. (Curve A) Precipitated protein in isopropanol. (Curve B) Soluble protein in guanidinium chloride. From left to right, three vertical arrows indicate the position of 280, 340/350, and 390 nm wavelengths, respectively.

Figure 2.

Detailed composition of each well in the refolding plate. (*) Tris (pH 8), NaCl 150 mM, EDTA. For details, see Table 1.

Figure 3.

(A) Robot used in the automated procedure. The tools required for the refolding screening procedure are indicated by arrows. (B) Results of Rv2392 refolding screening. At the end of the experiment, the results (in Excel format) were displayed using a color code: Green and red indicate the wells containing soluble (OD < 0.05) and precipitated (OD > 0.05) proteins, respectively.


We thank Dr. Stewart Cole and Dr. Nadine Honoré for providing us with the M. tuberculosis cosmid and BAC libraries. We are indebted to Avidis SA for providing us with soluble dbGroEL and DsbA, to Dr. Mariella Tegoni and Dr. Véronique Receveur-Bréchot for valuable advice on the light scattering experiments, and to Dr. Jessica Blanc for revising the English manuscript. This work was supported by grants from the 5th PCRDT program of the European Union (X-TB and SPINE) and by the French national Genopole network.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.