A recurring obstacle for structural genomics is the expression of insoluble, aggregated proteins. In these cases, the use of alternative salvage strategies, like in vitro refolding, is hindered by the lack of a universal refolding method. To overcome this obstacle, fractional factorial screens have been introduced as a systematic and rapid method to identify refolding conditions. However, methodical analyses of the effectiveness of refolding reagents on large sets of proteins remain limited. In this study, we address this void by designing a fractional factorial screen to rapidly explore the effect of 14 different reagents on the refolding of 33 structurally and functionally diverse proteins. The refolding data were analyzed using statistical methods to determine the effect of each refolding additive. The screen has been miniaturized for automation, resulting in reduced protein requirements and increased throughput. Our results show that the choice of pH and reducing agent had the largest impact on protein refolding. Bis-mercaptoacetamide cyclohexane (BMC) and tris(2-carboxyethyl)phosphine (TCEP) were superior reductants when compared to others in the screen. BMC was particularly effective in refolding disulfide-containing proteins, while TCEP was better for nondisulfide-containing proteins. From the screen, we successfully identified a positive synergistic interaction between nondetergent sulfobetaine 201 (NDSB 201) and BMC on Cdc25A refolding. The soluble protein resulting from this interaction crystallized and yielded a 2.2 Å structure. Our method, which combines a fractional factorial screen with statistical analysis of the data, provides a powerful approach for the identification of optimal refolding reagents in a general refolding screen.
The identification of 20,000–25,000 genes from the human genome project has resulted in a wealth of potential targets for structural biology investigation and pharmaceutical design (International Human Genome Sequencing Consortium 2004). Since the completion of the project, expectations have been high that the number of protein crystal structures would dramatically increase but, in reality, there has only been a moderate rise in the number of crystal structures, due largely to a lack of sufficient quantities of protein suitable for structural studies (Service 2002). Although the technology responsible for expressing recombinant proteins is highly developed (Chambers et al. 2004), it is still difficult to produce enough soluble protein for these structural studies. The ultimate goal of determining crystal structures on a genome-wide scale requires methods designed to improve the yield of functional protein.
Historically, optimization of soluble protein expression has been the first strategy when trying to obtain protein for structural studies. In contrast, refolding insoluble protein has often been a strategy of last resort due to the unpredictable and time-consuming nature of the refolding process. However, the literature shows that numerous proteins can be refolded into their active forms, and that certain additives can assist in the refolding process. The combination of these additives dictates the efficiency of refolding as well as the utility of this method to gain soluble protein. Some of the more effective additives include reducing agents, thiol shuffling enzymes, polar and nonpolar reagents, various detergents, and chaperonins; numerous excellent reviews have previously discussed these and other refolding additives in more detail (Rudolph and Lilie 1996; De Bernardez Clark 1998; Lilie et al. 1998; Voziyan et al. 2000; Clark 2001; Middelberg 2002). Due to the unpredictable nature of the refolding process, the development of a systematic method for identifying useful refolding conditions is needed. Fractional factorial refolding screens have emerged as a way to compensate for this unpredictability. Fractional factorial screens contain a representative subset of reagent combinations contained in full factorial screens and are designed to maximize the number of refolding variables explored while minimizing the amount of data collection (Hofmann et al. 1995; Chen and Gouaux 1997; Armstrong et al. 1999; Tobbell et al. 2002). These screens have been used successfully to refold proteins, but the choice of refolding additives included in these screens is based on historical precedent and does not take into account novel reagents shown to improve protein renaturation. More recently, Vincentelli et al. (2004) designed an automated, 96-well refolding strategy that incorporated a fractional factorial buffer design utilizing both the traditional refolding additives used in previous refolding screens as well as a newer class of refolding agents known as NDSBs.
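The idea of a fractional factorial screen, a balanced subset of the full set of reagent combinations, can be illustrated with a minimal sketch. The example below is not the screening matrix used in this study: the function name `fractional_factorial`, the choice of three base factors, and the two generator columns are all assumptions made for illustration. It builds a two-level design in which five factors are covered in 8 runs instead of the 32 runs of the corresponding full factorial.

```python
from itertools import product


def fractional_factorial(n_base, generators):
    """Build a two-level fractional factorial design (hypothetical helper).

    The first n_base columns form a full factorial at levels -1/+1.
    Each generator is a tuple of base-column indices; the extra column
    it defines is the product of those base columns in every run.
    """
    runs = []
    for base in product((-1, 1), repeat=n_base):
        row = list(base)
        for gen in generators:
            level = 1
            for idx in gen:
                level *= base[idx]
            row.append(level)
        runs.append(tuple(row))
    return runs


# Illustrative only: factors A, B, C are varied fully; D = A*B and E = A*C.
design = fractional_factorial(3, [(0, 1), (0, 2)])

for run in design:
    print(run)
```

Each column of such a design contains equal numbers of low and high settings, and every pair of columns is orthogonal, which is what allows the effect of each factor to be estimated separately from a reduced number of runs. The price is that some interactions are confounded with one another, and how many depends on the resolution of the design.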
Although prior refolding screens identify useful conditions for protein refolding, they stop short of using statistical methods to determine the utility of each reagent when used in a general screen on a diverse protein data set. In this study, we investigate the effects of additives on the refolding of 33 proteins using a fractional factorial refolding screen. We include reagents such as the reductants BMC and TCEP, and the detergent-mimic NDSB 201 in our matrix as a way of assessing their utility in refolding a variety of proteins. These reagents have been shown to be beneficial to protein refolding, extraction, and stability (Vuillard et al. 1995a,b; Woycechowsky et al. 1999; Chong and Chen 2000; English et al. 2002). The screen has been miniaturized for automation, resulting in reduced protein requirements, increased throughput, and enhanced reproducibility. To assess the applicability of the screen to a wide spectrum of proteins, we refolded multiple members from five gene families, as well as single members from additional families. The data gathered from refolding 33 proteins were analyzed using statistical methods to identify individual reagents and reagent interactions having a significant effect on protein refolding. Every buffer condition successfully refolded at least one protein, and of the 14 reagents tested, 12 reagents significantly improved protein refolding. Finally, this screen was used successfully to identify a positive synergistic interaction between reagents that resulted in the production of soluble, functional protein leading to diffraction-quality crystals and the solution of a protein structure. The results obtained support the use of a fractional factorial screen in combination with statistical analysis to identify suitable reagents to be included in a general refolding screen and provide a systematic method for optimizing the refolding process.
A significant barrier facing structural genomic projects is the generation of soluble, functional eukaryotic protein for structural studies. Meeting this demand has proven to be a challenge, given the low success rate for expressing soluble eukaryotic proteins compared to prokaryotic proteins (Yee et al. 2002; Chambers et al. 2004). An alternative approach for generating sufficient quantities of soluble protein is refolding the insoluble protein expressed in the inclusion bodies of Escherichia coli. In theory, refolding these proteins should be a straightforward process given that the refolding literature is replete with the effects of individual reagents on the refolding of single proteins. In practice, however, there is no universal method or buffer for reliably refolding a given protein of interest and identification of initial refolding conditions remains a major hurdle.
One way to overcome this obstacle is by the introduction of refolding screens to rapidly identify initial conditions that result in folded protein (Hofmann et al. 1995; Chen and Gouaux 1997; Armstrong et al. 1999; Tobbell et al. 2002; Maxwell et al. 2003; Scheich et al. 2004; Tresaugues et al. 2004; Vincentelli et al. 2004). These screens were designed to test a variety of refolding additives in a minimal number of experiments. Although these screens have been successful in refolding multiple proteins, comprehensive statistical analysis of the importance of the reagents for generalized protein refolding has been minimal. Our method uses a fractional factorial design combined with statistical analysis to directly compare the effects of both well-known and lesser-known refolding reagents on a large and diverse set of proteins. The data gathered from this study were used to determine the general utility of each reagent for the better design of future refolding screens.
Based on our analysis, pH and reductants had the largest impact on refolding our set of 33 proteins. The effect of pH on protein refolding has been well documented on a protein-specific basis, but previous analysis regarding the optimal pH for protein refolding has been limited. Our data provide a direct comparison of four pH levels and include examples where pH extremes are crucial for protein refolding. Likewise, the data from a refolding screen designed by Vincentelli et al. (2004) showed that a broad pH range was important for protein solubility, underscoring the importance of exploring pH when designing a generalized refolding screen. Reducing agents also play an important role in refolding proteins; however, the use of compounds for protein refolding beyond the more traditional reductants (DTT, GSH:GSSG, and βME) remains protein-specific. BMC is a dithiol that improves protein refolding both in vitro and in vivo, and is thought to mimic protein disulfide isomerase (PDI) by catalyzing native disulfide bond formation (Woycechowsky and Raines 2000). TCEP is a nonthiol-containing molecule and is a stronger reductant than DTT at pH values below 8 (Getz et al. 1999). The results from this protein data set strongly support the inclusion of BMC and TCEP in a refolding screen. Proteins containing disulfide bonds were more effectively refolded using BMC than its well-studied counterpart, GSH:GSSG. In contrast, proteins lacking disulfide bonds were more effectively refolded using TCEP than DTT. The utility of alternative reductants, such as 4-mercaptobenzeneacetate (4-MPA), shown in the literature to aid protein folding (Gough et al. 2002), suggests that other compounds may also be useful and could be explored in future refolding screens.
Although important, pH and reductants are not the only variables to consider when designing a refolding screen. Studies have shown that a single protein can refold under markedly different conditions (Hofmann et al. 1995; Armstrong et al. 1999). Our data set contained two phosphatases with 65% sequence identity and nearly identical structural folds. Even with such a high level of identity, one of the proteins refolded productively in twice as many buffer conditions as the other. One way to overcome the unpredictable nature of protein refolding is to include an array of reagents known to improve refolding as a way to maximize the opportunity to recover functional protein. As such, our screen also includes all the reagents originally described in a fractional factorial screen by Chen and Gouaux (1997) as well as the detergent Tween 80 and the detergent-mimic NDSB 201. The latter two were added because they inhibit aggregation during the refolding process, resulting in increased yields of soluble protein (Goldberg et al. 1996; Arakawa and Kita 2000; Chong and Chen 2000). NDSBs lack the hydrophobic tail of detergents, thereby preventing micelle formation, and have been shown to be especially helpful in refolding at higher protein concentrations (Expert-Bezancon et al. 2003). Vincentelli et al. (2004) included NDSBs 195, 201, and 256 in their refolding screen and found them to be useful refolding additives. The remaining reagents in our screen improved the refolding of at least one protein with the exception of GdnHCl and divalent metal ions. The results from our analysis suggest that inclusion of all the reagents discussed, aside from GdnHCl and divalent metal ions, will increase the chance of successfully applying a broad refolding screen. The inclusion of alternative refolding agents like cyclodextrins, which have been used successfully in prior refolding studies (Machida et al. 2000; Scheich et al. 2004), could be explored in future fractional factorial screens.
While the effects of reagent interactions on refolding have been touched upon previously (Tobbell et al. 2002), the optimization of a positive reagent interaction for generating crystallization-quality protein is unique. Which reagent interactions can be identified depends on the resolution of the fractional factorial screen. The importance of using appropriate experimental designs and statistical methods to analyze the refolding data is particularly relevant when looking beyond the main effects for these interactions. SAmBA, a software program used previously to design a refolding matrix (Vincentelli et al. 2004), is good for setting up the experimental design but lacks the complementary statistical methods needed to analyze the data. The reagent interactions in our screen were not immediately discernible and could only be identified using statistical analysis. Using this method, we were able to identify potential interactions, and interestingly, a third of these interactions were between pH and the various reductants. The interaction between NDSB 201 and BMC on the refolding of Cdc25A was selected for follow-up due to the novelty of the reagents. In addition, the low refolding efficiency of the protein made it a more challenging example to pursue. The resultant crystal structure of Cdc25A supports the literature in promoting the utility of refolding for generating soluble protein for structural genomics programs (Maxwell et al. 2003).
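The distinction between main effects and interactions can be made concrete with a minimal sketch of how contrasts are estimated from a two-level design. This is not the statistical software or model used in this study. The helper `effect`, the three generic factors, and the response values are assumptions, and the responses are synthetic numbers generated from a known formula so that the estimates can be checked against it.

```python
from itertools import product


def effect(design, response, columns):
    """Estimate a contrast from a two-level design (hypothetical helper).

    Returns the mean response over runs where the product of the chosen
    columns is +1 minus the mean over runs where it is -1. One column
    gives a main effect; two columns give a two-factor interaction.
    """
    high, low = [], []
    for row, y in zip(design, response):
        sign = 1
        for c in columns:
            sign *= row[c]
        (high if sign == 1 else low).append(y)
    return sum(high) / len(high) - sum(low) / len(low)


# Full 2^3 design in three generic factors A, B, C at levels -1/+1.
design = list(product((-1, 1), repeat=3))

# Synthetic response, NOT measured data: A helps on its own, B does
# nothing, and A and C act together (an A x C interaction).
response = [10 + 3 * a + 2 * a * c for a, b, c in design]

main_a = effect(design, response, [0])       # main effect of A
main_b = effect(design, response, [1])       # main effect of B
inter_ac = effect(design, response, [0, 2])  # A x C interaction

print(main_a, main_b, inter_ac)
```

In this toy case the interaction contrast is nonzero even though C has no main effect of its own, which mirrors the point made above: an interaction between two reagents may not be evident from inspecting individual conditions and has to be extracted by analyzing the design as a whole. A real analysis would also need replicate measurements and significance testing, which this sketch omits.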
The matrix described here allowed the rapid exploration of 14 different reagents on the refolding of 33 proteins representing significant diversity in structure and function. Moreover, this screen incorporated recently described reagents shown to improve the refolding process while decreasing the total number of conditions from >8000 data points in a full factorial to a mere 32 data points. While other refolding screens have used light scattering as a measurement of refolding (Tresaugues et al. 2004; Vincentelli et al. 2004), protein activity provides a useful alternative method to measure refolding, and has low protein requirements of <500 μg of unfolded protein per triplicate primary screen. In addition, the small reaction volumes allow future screening designs to include more difficult to obtain refolding reagents such as chaperonins.
The identification of important new reagent effects and interactions that enhance refolding highlights the need to identify optimal buffer conditions for refolding proteins in a methodical, fast, and economical way. In this regard, the combination of automation, fractional factorial screens, and a thorough analysis of the data using statistical software provides a powerful tool to expand on existing refolding methodology. The data presented here demonstrate the strength of this strategy as a way to overcome the bottleneck of obtaining soluble, functional protein for structural genomics programs.