In the context of Structural Genomics (SG) projects involving targets from Escherichia coli (ASG), Mycobacterium tuberculosis (MT), and viruses (SPINE), we have performed expression assays on ∼600 genes (Sulzenbacher et al. 2002; Vincentelli et al. 2003). One of the main obstacles we and other authors have encountered when expressing recombinant proteins in E. coli is the relatively low yield of soluble protein obtained with many of the source organisms used. In the case of eukaryotes, viruses, and MT, most of the genes were expressed in the form of insoluble aggregates called “inclusion bodies” (IB). This obstacle to obtaining suitable targets for structural studies was particularly severe in the case of MT, for which 93% of our 182 targets yielded IB when the proteins were expressed fused to an N-terminal His tag.
IBs are assumed to result from illegitimate interactions between hydrophobic residues located in the core of different molecules. This process is auto-catalyzed and therefore rapidly results in the precipitation of all the recombinant proteins produced in the cell (Mukhopadhyay 1997). Methods have been designed to recover correctly folded proteins from these amorphous aggregates. These include the “dilution,” “dialysis,” and “solid phase” methods (De Bernardez-Clark 1998), all of which involve an initial IB solubilization step using highly concentrated solutions of chaotropic agents such as guanidinium chloride and urea. The subsequent step in all these methods consists of removing the denaturing agent and restoring the protein to its native shape from the unfolded soluble state. The three methods differ, however, in how the chaotropic agent is removed, although the end result is the same in each case. With the dilution method, refolding is assumed to occur immediately upon diluting the protein in a large volume of nondenaturing buffer (“refolding buffer”); this volume has to be sufficiently large both to cancel out the solubilizing effect of the chaotropic agent and to reduce the probability that protein–protein interactions will occur. The dialysis method involves the use of the same initial and final buffer compositions as the dilution method, but in this case, there is no dilution to decrease the protein–protein contacts (Rudolph and Lilie 1996; Mukhopadhyay 1997). Finally, it was established that physically separating molecules from each other during the renaturation process (solid phase refolding) greatly improved the refolding yield (Stempfer et al. 1996).
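As a rough illustration of the dilution step described above, the following sketch computes the residual chaotrope and protein concentrations after dilution into refolding buffer. The function name, the 6 M starting chaotrope concentration, the 2 mg/mL protein concentration, and the 20-fold dilution are illustrative assumptions and are not values taken from the cited methods.

```python
def dilute(chaotrope_molar, protein_mg_per_ml, dilution_factor):
    """Return (chaotrope, protein) concentrations after a single dilution.

    Hypothetical helper: both solutes are assumed to be diluted by the same
    factor, and the refolding buffer is assumed to contain neither of them.
    """
    if dilution_factor < 1:
        raise ValueError("dilution_factor must be >= 1")
    return chaotrope_molar / dilution_factor, protein_mg_per_ml / dilution_factor


# Assumed example values: 6 M chaotrope, 2 mg/mL protein, 20-fold dilution.
chaotrope, protein = dilute(6.0, 2.0, 20)
print(f"{chaotrope:.2f} M chaotrope, {protein:.2f} mg/mL protein")
```

The same factor lowers both the denaturant and the protein concentration, which is why a large dilution addresses both requirements at once.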
Whatever the method used to replace the denaturing buffer with a nondenaturing buffer (dilution, dialysis, or solid phase), it would be easier to use a single refolding buffer. Unfortunately, experience has shown that the composition of the refolding buffer is strongly protein dependent and that simply maintaining a difference between the pH of the refolding buffer and the isoelectric point (IP) of the protein does not usually suffice to keep the protein soluble.
Hence the idea of testing several refolding buffers simultaneously. For instance, Perbio has addressed this issue with Pro-Matrix, a refolding kit consisting of nine basic buffers, which can be supplemented with additives (Qoronfleh 2004). Using a fractional factorial approach, Armstrong et al. (1999), Chen and Gouaux (1997), and Hampton Research (FoldIt) have each developed separate procedures using 16 refolding conditions.
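To make the notion of a fractional factorial screen concrete, the sketch below builds a generic two-level half-fraction design in which five factors are covered in 16 runs instead of 32. The factor names and the choice of generator (the last factor set to the product of the other four) are assumptions made for illustration only; they do not reproduce the actual designs of Armstrong et al. (1999), Chen and Gouaux (1997), or the FoldIt kit.

```python
from itertools import product


def half_fraction(factors):
    """Build a two-level half-fraction design.

    Every factor except the last takes all combinations of -1 (low) and
    +1 (high); the level of the last factor is the product of the others.
    """
    *base, last = factors
    runs = []
    for signs in product((-1, 1), repeat=len(base)):
        level = 1
        for s in signs:
            level *= s
        run = dict(zip(base, signs))
        run[last] = level
        runs.append(run)
    return runs


# Hypothetical factor names, chosen only for the example.
design = half_fraction(["pH", "salt", "redox", "additive", "detergent"])
print(len(design), "conditions")
```

In such a design each factor is tested at its low and high level in equal numbers of runs, so main effects can still be estimated from half the full set of conditions.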
Despite these improvements, some difficulties were still encountered in the protein solubility assays performed to monitor the refolding process. Because no solubility assay was provided with the Pro-Matrix kit, this assay had to be set up by the customer. The methods suggested for a solubility assay in the case of the FoldIt kit (size exclusion chromatography [SEC]), as well as those used by Armstrong et al. (1999) and Chen and Gouaux (1997) (dialysis and centrifugation), were not compatible with high throughput or with automation, which are two of the most crucial features in SG studies.
To overcome these limitations, a protein solubility test based on light scattering has been devised (Trésaugues et al. 2004). In practice, the turbidity of the solution is assessed by measuring the optical density (OD) at 390 nm, before and after adding the protein. If the protein remains soluble, the absorbance remains unchanged. Otherwise, the OD increases in proportion to the amount of precipitate produced. This procedure is much faster than SEC and can be easily automated, but the number of conditions was still limited to 12, and the proteins often precipitated in all of them. This clearly suggested that the number of conditions needed to be further increased. Such a scale-up has been tested in microtiter plate format, using 203 refolding conditions (Sijwali et al. 2001). However, the latter study was only designed for screening different GSH:GSSG ratios.
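The logic of the turbidity readout described above can be sketched as follows. This is a minimal illustration and not the published protocol: the function name, the well labels, the OD values, and in particular the 0.05 cutoff on the OD increase are assumptions that would have to be calibrated for a given plate reader and buffer set.

```python
def classify_wells(od_before, od_after, threshold=0.05):
    """Label each well from the change in OD at 390 nm.

    od_before / od_after map well IDs to readings taken before and after
    adding the protein. The threshold is an assumed cutoff, not a
    published value.
    """
    calls = {}
    for well, before in od_before.items():
        delta = od_after[well] - before
        calls[well] = "precipitated" if delta > threshold else "soluble"
    return calls


# Made-up readings for two wells, for illustration only.
before = {"A1": 0.04, "A2": 0.05}
after = {"A1": 0.05, "A2": 0.45}
print(classify_wells(before, after))
```

Because each well is compared with its own reading before protein addition, differences in background absorbance between buffers do not affect the call.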
It is worth noting that although increasing the number of refolding conditions increases the probability that a protein will meet a buffer composition favoring its solubility, it also increases the number of samples to be handled. One possible solution to this problem consists of automating the screening process. In addition, automation is required to obtain sufficiently large SG throughputs. A partially automated refolding screening procedure was recently described (Scheich et al. 2004). With this procedure, however, the automation did not include any test for assessing the solubility and only 30 refolding conditions were used.
We therefore designed a refolding strategy involving the use of 96 different buffers in microtiter plate format, based on the above-mentioned idea that the probability of a protein encountering a buffer composition favoring correct folding was likely to increase with the number of buffers tested. The solubility assay used in our screening procedure is basically the same as that described by Trésaugues et al. (2004); it reports on protein solubility, not on protein folding. After the preparatory refolding stage, circular dichroism (CD), dynamic light scattering (DLS), and crystallogenesis quality control procedures were therefore added to assess the folding, aggregation state, and homogeneity of the protein solution, respectively. These methods were chosen because they can be applied in theory to any protein, which is a prerequisite in the field of post-Genomics, where most of the proteins studied have an unknown function. Finally, the availability of a pipetting robot made it possible to automate the whole process in a 96-well plate format.
To the best of our knowledge, this is the first completely automated “wide spectrum” 96-well IB refolding screening procedure to be developed based on a factorial approach. The present article describes the setup involved and confirms the validity of the method, based on tests carried out with proteins originating from two SG projects.