Membrane proteins play a fundamental role in human disease and therapy, but suffer from a lack of structural and functional information compared to their soluble counterparts. The paucity of membrane protein structures is primarily due to the unparalleled difficulties in obtaining detergent-solubilized membrane proteins at sufficient levels and quality. We have developed an in vitro evolution strategy for optimizing the levels of detergent-solubilized membrane protein that can be overexpressed and purified from recombinant Escherichia coli. Libraries of random mutants for nine membrane proteins were screened for expression using a novel implementation of the colony filtration blot. In only one cycle of directed evolution were significant improvements of membrane protein yield obtained for five out of nine proteins. In one case, the yield of detergent-solubilized membrane protein was increased 40-fold.
Abbreviations: CoFi, colony filtration; IMP, integral membrane protein.
During the last two decades there has been a dramatic increase in structural information on soluble globular proteins. This is largely due to the introduction of novel and improved technologies for structure determination and, more recently, to the establishment of structural genomics pipelines. Meanwhile, the progress in membrane protein structural biology has been limited, and currently fewer than 30 novel membrane protein structures are determined each year. Our understanding of integral membrane protein (IMP) structure and function, therefore, remains rudimentary. This lack of structural information is particularly acute, since IMPs are involved in many pathological conditions and constitute a major fraction of current drug targets (Drew et al. 2003). The limited progress in membrane protein structural biology can be attributed to three outstanding problems:
difficulties in establishing high levels of recombinant overexpressed IMPs;
problems in defining efficient schemes for detergent solubilization compatible with homogeneous protein purification; and
the low success rate of growing well-ordered crystals that lead to high-resolution X-ray diffraction data.
Improvements in the earlier steps of the process, expression and solubilization, usually lead to subsequent improvements in the purification and crystallization steps. Therefore, technological improvements in the overexpression and detergent solubilization steps are critical for further advances in membrane protein structural biology.
The difficulties in recombinant overexpression of membrane proteins arise partly from the physical space limitations of the cell membrane, but overexpression can also be toxic to the host cell (Laible et al. 2004). In addition, if the capacity of protein folding or insertion into the membrane is exceeded, the result will be aggregation and accumulation of inactive protein in inclusion bodies (Loll 2003). The hydrophobic nature of IMPs complicates the purification step since it requires the identification of suitable detergents for extraction of the protein from the lipid bilayer. After extraction, different detergents have different stabilizing properties and yield protein samples with varying homogeneity. However, it is hard to predict which detergent will be optimal for the extraction, stabilization, and crystallization of a particular IMP. In practice, purification trials with different stabilizing detergents significantly improve the probability of obtaining useful crystals for structural studies (Iwata 2003).
E. coli has, for several reasons, been the preferred host for overexpression of proteins for structural studies; it allows for large amounts of biomass to be generated rapidly at a low cost; the molecular biology of the host is well studied, and homogeneous protein samples can be obtained. The authors and others have previously shown that E. coli IMPs can often be produced homologously at levels suitable for biochemical and structural studies (Daley et al. 2005; Eshaghi et al. 2005). However, eukaryotic membrane proteins are much more challenging to produce in E. coli than their bacterial counterparts. This has partly been attributed to differences in the membrane insertion machinery and lipid bilayer composition (Loll 2003). Still, a number of eukaryotic IMPs have been produced in a functional form in E. coli (Tucker and Grisshammer 1996; Opekarova et al. 1999; Quick and Wright 2002; Moberg et al. 2003).
To date, only a few eukaryotic membrane protein structures have been determined, and a majority of these have been purified directly from their native source, a strategy that is not applicable for most membrane proteins. The major strategy for improving recombinant protein expression is to tailor expression parameters such as expression strains, vectors with different fusion tags or promoters, coexpressed chaperones, rare codon tRNA-expressing plasmids, induction systems, and growth media (Sorensen and Mortensen 2005). An alternative approach is to tailor the target gene, e.g., by the introduction of point mutations. However, rational engineering of proteins has turned out to be very difficult, since it requires some previous knowledge of the structure. Directed evolution strategies, on the other hand, do not require structural information, but instead, an efficient screen for monitoring expression levels is needed, preferably at the colony level (Waldo 2003). To date, the most widely used method to monitor expression levels is the fusion of green fluorescent protein (GFP) to the C terminus of the target protein. Directed evolution using GFP screening has been shown to have the potential to identify variants of soluble proteins that are more suitable for overexpression and crystallization (Pedelacq et al. 2002; Keenan et al. 2005).
However, these studies have reported only a few successful targets, and there is still a lack of understanding of the broader applicability of these methods. The potential usefulness of the GFP method for monitoring levels of membrane protein expression has been shown using a set of wild-type (WT) proteins in liquid cultures (Drew et al. 2002, 2005). However, the method has neither been used to screen mutant libraries nor been applied at the colony level.
Previously, only one directed evolution study on membrane proteins has been reported (Zhou and Bowie 2000). In that study, the authors showed the feasibility of improving the thermostability of one detergent-solubilized integral membrane protein using directed evolution. A functional assay in a 96-well format was used to identify mutant proteins with altered stability. This approach, however, demands a functional assay for detection; thus, it is not generic, and it does not allow screening of libraries larger than what can be conveniently assayed in a multi-well format.
In the present study we have subjected eight different E. coli inner membrane IMPs and one human membrane protein, all from different functional families, to random mutagenesis in an effort to enhance expression levels. The levels of detergent-solubilized proteins were screened for using the detergent-adapted version of the colony filtration (CoFi) blot method, recently developed in our laboratory (Cornvik et al. 2005). Using this method, thousands of clones can be analyzed in a single round of screening. The results not only show that a random mutagenesis approach is a feasible strategy for improving protein yield, but also that the screening method is robust and will be useful for many other applications to improve properties of IMPs. In addition, no prior structural or functional information of the target protein is required, and the method is applicable to any membrane protein.
To enable screening of detergent-solubilized IMPs, the CoFi blot was modified and optimized for use with detergents (data not shown).
The E. coli proteins in the present study were chosen and classified according to their levels of expression obtained in earlier benchmarking experiments (see Materials and Methods). From this set we selected eight E. coli targets as well as one human membrane protein with null, low, or medium expression (Table 1). All IMPs were subjected to random mutagenesis, CoFi blot screening, and characterization.
Table 1. Proteins used in this study
Creation of random mutagenesis libraries
The random mutation libraries were created using a commercially available polymerase with an easily controllable mutation rate, which was restricted to 3–7 mutations/kb. The mutated open reading frame (ORF) was cloned into the expression plasmid containing an N-terminal FLAG tag and a C-terminal His6 tag using the Gateway system and transformed into a cloning strain, resulting in at least 15,000 colonies. This amplified library was then harvested and transformed into the expression strain C41 and screened with the CoFi blot. Typically 6000–8000 colonies were screened per library.
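To make the stated mutation rate concrete, the following minimal Python sketch estimates how many nucleotide mutations a single clone would carry at 3–7 mutations/kb. The ORF length used here is a hypothetical value chosen for illustration, not one of the targets in this study, and the function name is likewise an assumption. Note that this counts nucleotide changes, only some of which lead to amino acid substitutions.

```python
def expected_mutations(orf_length_bp: int, rate_per_kb: float) -> float:
    """Expected number of nucleotide mutations in an ORF of the given length."""
    return orf_length_bp * rate_per_kb / 1000.0


# Hypothetical ORF of 400 codons (1200 bp); not a value taken from the study.
orf_length_bp = 1200

# Bounds of the mutation rate stated in the text (3-7 mutations/kb).
low = expected_mutations(orf_length_bp, 3.0)
high = expected_mutations(orf_length_bp, 7.0)

print(f"Expected nucleotide mutations per clone: {low:.1f} to {high:.1f}")
```

For longer or shorter ORFs the expected count scales linearly with length, which is one reason a fixed rate per kilobase gives different mutational loads for different targets.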
CoFi blot screening and liquid culture validation
In libraries corresponding to the four proteins MP01, 02, 06, and 08, we were unable to detect clones expressing at levels above background in the CoFi blot (Fig. 1A). All of these four proteins had previously been designated as low or non-expressing (Eshaghi et al. 2005) (Table 1; Supplemental Fig. 1). Libraries of the five proteins originally classified as low or medium expressers (MP03, 04, 05, 07, and 09) showed considerable intensity variations between colonies (Fig. 1B–D). From these CoFi blots we selected between 20 and 50 colonies with the highest intensities. For MP07, only a few strong colonies were present and most colonies selected were of medium intensity. The colonies selected from the different proteins were grown and induced in liquid cultures along with duplicate WT constructs. Using a 96-well-based filtration separation strategy (Knaust and Nordlund 2001) followed by a dot blot probed with a Ni-chelate reagent, detergent-solubilized protein constructs were identified. For targets MP03, 04, 05, 07, and 09, a number of the selected clones were shown to express better than WT (Table 1).
To get an estimate of the performance of the detergent-adapted CoFi blot, we compared 24 colonies rated as low and 24 rated as high from each of the libraries of MP04 and 05. For MP07 we also included colonies of medium intensities, since very few colonies were judged to express at a high level. The constructs were grown in liquid cultures and dot blots of purified material were made. For MP04 and MP05 there was a good correlation between the CoFi blot rating and the intensities seen in dot blots of solubilized material (Fig. 2A,B,D,E), and for MP07 we were able to correctly identify the few significantly stronger clones present in the library along with several constructs expressing at a medium level (Figs. 1B, 2C,F).
Medium-scale expression and purification
In order to confirm the differences in expression levels, we performed medium-scale IMAC purifications on at least triplicate samples for the best expressing clone and corresponding WT for targets MP03, 04, 05, 07, and 09. The purified proteins were analyzed either by Western blots (MP03, 05, and 09) or by dot blots (MP04 and 07).
In all cases the enhancement in yield was significant and the differences within the test sets were small (Fig. 3A–E). The smallest change in yield in these medium-scale experiments was an approximate 1.5-fold increase on average for MP03. The largest change observed was for MP07, where we saw an approximate 12-fold increase on average.
Large-scale expression and purification
Targets MP03, 07, and 09 were expressed in large-scale liquid culture (4–12 liters) and subjected to affinity purification and gel filtration. The purified samples were analyzed by Coomassie-stained SDS-PAGE (Fig. 3F). The gel filtration chromatograms showed that the proteins could be purified as non-aggregates, and that the chromatographic profile of each clone resembled that of WT (Fig. 3G–I).
For these proteins, a more accurate determination of the improved clone expression yield was made by calculating the area under the chromatographic curve. For MP03, the yield was improved by 50%, similar to the estimate from the medium-scale experiments. Clones one and two from MP09 showed an increase in expression of 60% and 80%, respectively, and MP07 had a 40-fold improvement in expression (4000% of the WT yield).
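Fold changes and percentage increases are easy to conflate when comparing these yields. The short Python sketch below is an illustration and not part of the original analysis; it converts between the two conventions using values quoted in the text, and the function names are hypothetical.

```python
def percent_increase_to_fold(percent_increase: float) -> float:
    """Convert an increase over WT (in percent) to a fold yield relative to WT."""
    return 1.0 + percent_increase / 100.0


def fold_to_percent_increase(fold: float) -> float:
    """Convert a fold yield relative to WT to an increase over WT (in percent)."""
    return (fold - 1.0) * 100.0


# MP09 clones: increases of 60% and 80% over WT, as quoted in the text.
for pct in (60.0, 80.0):
    print(f"{pct:.0f}% increase = {percent_increase_to_fold(pct):.1f}-fold WT yield")

# MP07: a yield 40-fold that of WT, as quoted in the text.
print(f"40-fold WT yield = {fold_to_percent_increase(40.0):.0f}% increase over WT")
```

Under this convention, a 60% or 80% increase corresponds to a 1.6- or 1.8-fold yield, and a 40-fold yield equals 4000% of the WT yield, i.e., a 3900% increase over WT.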
The positive clones selected from the CoFi blot that yielded increased expression levels in liquid cultures were sequenced in order to determine the extent and location of the mutations. On average, 0.60 amino acid substitutions were observed per 100 amino acids (Table 1). Clones showing only a moderate increase, i.e., close to wild-type expression levels, were also sequenced and had the same mutation rates. Since there is no available experimentally determined structure of our target proteins, we predicted the secondary structure of the proteins using TMHMM (Krogh et al. 2001). These results were then used to create a secondary structure model onto which the mutations were plotted (Fig. 4A–E). In general, the mutations were evenly distributed over the entire protein. All substitutions in positive clones are summarized in Supplemental Table 1.
The use of colony-based expression screening methods in directed evolution experiments is attractive, as it allows large numbers of clones to be screened at a very low cost. Methods utilizing C-terminal reporters have been described and used to select for improved solubility at the colony level (Pedelacq et al. 2002). However, these methods have certain disadvantages: the reporter may affect the solubility of the target, and the reporter fusion needs to be removed. Recently, we showed that the colony filtration blot could be used for selecting soluble proteins without the requirement of a fusion reporter protein (Cornvik et al. 2005). Using this method we demonstrated a relatively high correlation between expression at the colony level and in liquid culture, where ∼80% of the positive colonies also express well in liquid culture. As an extension of our method, we have developed a method for screening for detergent-extractable membrane proteins. The increase in yield for the five successful proteins, compared to WT, ranged from 1.5- to 40-fold. To confirm that the smallest changes were not due to random variations, we repeated these experiments three to seven times and showed that they were highly reproducible. The increased yield was also retained when going from small- to large-scale liquid cultures for all of the best clones (Fig. 3A–E).
For several of the medium-expressing constructs the low mutation frequency gave surprisingly large improvements in yield. The best clone of MP07 gave a yield roughly 40-fold that of WT, as a result of three mutations. For MP09, all of the identified clones contain only a single mutation, each situated in a different region of the protein (Fig. 4E). Even though the increase in expression yield is not as dramatic for MP09 (about twofold) as for MP07, the fact that only one mutation can give such an increase is encouraging. In the present study we only performed a one-step random mutagenesis cycle to demonstrate that the detergent-adapted CoFi blot can select for improved levels of detergent-extracted IMPs. It is hoped that these mutations, when combined, are additive, as demonstrated by Zhou and Bowie (2000), and that a multistep directed evolution strategy using DNA shuffling could further improve the yield.
As shown for MP07, where only a few strong colonies were detected in the CoFi blot, the number of colonies screened seems to be just enough to identify this small fraction of positive clones. Larger libraries or higher mutation rates may therefore be successful at improving the expression yields further, not only for those targets where modest improvements in expression levels were already observed, but potentially also for the four targets that yielded no positive clones. It is also noteworthy that most random mutations seem to decrease expression (Figs. 1C,D, 2A,B,D,E). We can, with the present strategy, identify some of the relatively few mutations with retained or improved expression levels. Therefore, another application of the CoFi blot is to select for several well-expressing variants of a target protein. The availability of multiple variants of a protein is known to improve the success rates for obtaining crystals (Keenan et al. 2005). This would be an alternative approach to using naturally occurring homologs from different species for crystallization (Campbell et al. 1971).
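The relationship between library size and the chance of recovering a rare positive clone can be sketched with a simple binomial argument. The Python example below is a rough illustration only: it assumes that colonies are independent draws and that improved variants occur at a fixed frequency, and the frequency used is a hypothetical number, not one measured in this study.

```python
import math


def prob_at_least_one(n_screened: int, freq: float) -> float:
    """Probability of seeing at least one positive among n_screened colonies."""
    return 1.0 - (1.0 - freq) ** n_screened


def colonies_needed(freq: float, confidence: float = 0.95) -> int:
    """Smallest number of colonies giving the requested chance of one positive."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - freq))


# Hypothetical frequency of improved variants: 1 in 2000 colonies.
freq = 1.0 / 2000.0

# Range of colonies screened per library, as stated in the text.
for n in (6000, 8000):
    print(f"{n} colonies: P(at least one positive) = {prob_at_least_one(n, freq):.3f}")

print(f"Colonies needed for 95% confidence: {colonies_needed(freq)}")
```

If positives are rarer than assumed here, the required number of colonies grows roughly in inverse proportion to their frequency, which is consistent with the suggestion that larger libraries may be needed for the targets that yielded no positive clones.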
One major advantage of the CoFi blot method, compared to protein reporter methods, is that the selection is done after cell lysis and detergent solubilization. The protein variants selected are therefore not just expressing well, but are also stable during purification at the specified solvent conditions, which could, for example, be the conditions for an activity assay. To improve the probability of obtaining crystals for structural studies, the CoFi blot can be used to search for constructs that are optimally solubilized by detergents suitable for crystallization.
There are several potential ways in which mutations might influence the final protein yield in expression experiments. They might affect transcriptional, translational, or folding events. They might also affect the stability of the protein in the membrane, or alternatively, the preference for detergents. From the sequenced clones, no significant preferences were observed for certain amino acid substitutions (Supplemental Table 1), or for locations of the substitutions in relation to predicted secondary structure or the transmembrane region (Fig. 4). This is in contrast to Zhou and Bowie (2000), who identified a slight preference for residues close to the membrane solvent interface. Both data sets are, however, relatively small, and further studies are required to establish an understanding of which mutations influence expression and stability of IMPs.
A potential disadvantage with selection from random mutagenesis libraries is that there is a theoretical risk for preferential selection of inactive protein variants, which might be less toxic to the host cell and therefore yield higher expression. To get an indication whether this was the case with our strategy, we measured the enzymatic activity of MP09 and showed that the mutated protein had the same activity as the WT protein (Supplemental Fig. 2). Therefore, at least in this case, the function was not selected against.
In conclusion, we have shown that the CoFi blot works as an efficient screening method for identifying clones that produce higher levels of detergent-extractable membrane proteins. The method allows for rapid screening of tens of thousands of clones in a single experiment. We have also, for the first time, demonstrated that the yield of detergent-solubilized membrane protein, expressed in E. coli, could be increased by using random mutagenesis. In this study, our strategy worked for more than half of the proteins tried. Even though the increases in yield for some proteins were moderate, we have clearly demonstrated the feasibility of improving the production levels for these proteins. Extending the method using multiple cycles of evolution and larger libraries will likely further increase its usefulness for solving difficult expression problems, including the production of eukaryotic IMPs in E. coli. Therefore, we believe that the detergent-adapted CoFi blot will provide an important extension of the currently limited toolbox of expression technologies for structural biology of membrane proteins. This will aid current efforts to accelerate the pace at which we gather insights into the structure and function of membrane proteins.
Materials and Methods
Cloning of initial benchmarking set
Cloning and expression screening were performed as described earlier (Eshaghi et al. 2005), with the exception that only one temperature (37°C), one strain, C41 (DE3) (Avidis), and the detergent n-dodecyl-β-D-maltopyranoside (DDM) (Anatrace) were used in all experiments.
Generation of random mutant libraries
Template DNA was isolated using the QIAprep Spin Miniprep Kit (Qiagen). The mutant libraries were made using the GeneMorph PCR Mutagenesis Kit from Stratagene. For medium-range mutation frequency (3–7 mutations/kb) we added 2.5 ng of target DNA and performed reactions according to the manufacturer's protocol. Primers were designed to amplify the attL sites to allow site-specific recombination, catalyzed by the LR Clonase enzyme mix (Invitrogen), into the expression vector pT73.3-FLAG GW (Tobbell et al. 2002). The resulting expression constructs were transformed into E. coli DH5α cells and grown on LB-Agar (LA) plates containing 30 μg/mL tetracycline. The amplified libraries were harvested, and DNA was isolated using the Quantum Prep Plasmid Midiprep kit (BioRad). The library was then transformed into the E. coli C41 (DE3) expression strain for analysis by CoFi blot.
The plated colonies were picked up with a 0.45-μm Durapore membrane (Millipore) and placed with the colonies facing up on an LA plate (d = 14 cm) containing 0.1 mM isopropyl-β-D-1-thiogalactoside (IPTG). Protein expression was induced for 3 h at 37°C, after which the membrane was placed on top of a nitrocellulose membrane and a Whatman 3MM paper soaked in lysis buffer (50 mM Tris pH 8, 100 mM NaCl, 0.5 mg/mL lysozyme, 0.40 mg/mL DNAse I, complete EDTA-free protease inhibitor cocktail tablet/50 mL [Roche], and 1.0% DDM). The filter sandwich was incubated at RT for 1 h. The nitrocellulose membrane was removed, washed 3 × 10 min in TBS-T, and incubated for 1 h with INDIA-His-probe (Pierce) diluted 1:5,000 in TBS-T. The membrane was washed 3 × 10 min with TBS-T and developed according to the manufacturer's instructions with the SuperSignal West Dura chemiluminescence kit (Pierce). Detection was done with the Fluor-S Multi Imager (BioRad). Colonies of interest were picked and transferred to LB media containing 30 μg/mL tetracycline for small-scale confirmation with dot blots.
Small-scale purifications and dot blot
Isolated colonies were grown in either deep-well plates or cell cultivation tubes at 37°C in 1 or 3 mL LB medium, respectively, until an OD600 of ∼0.7. The cultures were then induced with 0.1 mM IPTG. After induction the final OD600 was measured and the samples were normalized for cell number by harvesting a volume corresponding to an OD600 of 1. The cells were resuspended in lysis buffer, placed on ice for 45 min, and freeze-thawed three times in liquid nitrogen. The lysate was filtered using 0.45-μm filter plates (Millipore). Two microliters of filtered sample were applied to nitrocellulose membrane and allowed to dry. Whenever small-scale IMAC purifications were used, the filtrate was loaded onto Ni-NTA agarose resin (Qiagen) pre-equilibrated with purification buffer containing 20 mM Tris-HCl pH 8.0, 300 mM NaCl, 5 mM β-mercaptoethanol, and 0.05% DDM. After 15 min of agitation at 4°C, the unbound material was removed by 30-sec centrifugation at 100g. The resin was then washed with 30 mM imidazole in the purification buffer at 100g for 30 sec. The bound proteins were finally recovered in purification buffer containing 500 mM imidazole by centrifugation at 100g for 1 min.
The eluted samples were dotted onto nitrocellulose membrane and analyzed as above.
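The normalization for cell number described above can be illustrated with a small Python sketch. It assumes that "a volume corresponding to an OD600 of 1" means a volume whose product with the measured OD600 equals one OD unit (1 mL at OD600 = 1); this interpretation, the function name, and the OD600 readings in the example are assumptions made for illustration.

```python
def harvest_volume_ml(final_od600: float, target_od_units: float = 1.0) -> float:
    """Culture volume (mL) containing the target number of OD600 units.

    One OD unit is taken here to be 1 mL of culture at OD600 = 1 (assumption).
    """
    if final_od600 <= 0:
        raise ValueError("OD600 must be positive")
    return target_od_units / final_od600


# Hypothetical final OD600 readings for three cultures after induction.
readings = {"WT": 2.0, "clone_A": 1.25, "clone_B": 0.8}

for name, od in readings.items():
    print(f"{name}: harvest {harvest_volume_ml(od):.2f} mL")
```

Harvesting in this way keeps the amount of cell material roughly constant across samples, so that differences in dot blot intensity reflect expression per cell and not culture density.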
Medium- and large-scale purifications and gel filtration
Medium-scale (250 mL LB) and large-scale (4–12 L LB) cultures were inoculated with overnight cultures and grown until the OD600 reached 0.6. Induction was performed for 3 h using 0.1 mM IPTG.
The cell pellets were resuspended in lysis buffer and left to stand for 1 h at 4°C. Freeze thawing was followed by centrifugation at 15,000g. The membrane fraction was harvested by 1 h centrifugation at 150,000g and resuspended and solubilized with 1% DDM in purification buffer supplemented with 20 mM imidazole using a glass homogenizer, followed by centrifugation for 45 min at 200,000g. The supernatants containing solubilized membrane proteins were loaded on Ni-NTA agarose resin pre-equilibrated with purification buffer supplemented with 20 mM imidazole. The resin was washed with purification buffer containing 40 mM imidazole. The recombinant proteins were eluted with purification buffer supplemented with 500 mM imidazole.
The eluted samples were loaded onto a HiLoad 16/60 Superdex 200 column (Amersham Biosciences). The yields from pure fractions were calculated by measuring the absorbance at 280 nm, and the ratio WT:Clone was estimated by calculating the curve areas in the chromatogram. The purified protein was either dot blotted (as above) or loaded on Nu-PAGE 4%–12% Bis-Tris gels (Invitrogen) and stained with Coomassie or transferred to nitrocellulose membrane by Western blotting. The blots were developed as described above and quantified using the Quantity One software (BioRad).
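The estimation of relative yield from chromatogram curve areas can be sketched as a trapezoidal integration of the A280 trace over the elution volume. The Python example below is a minimal illustration under stated assumptions: the traces are hypothetical, baseline-subtracted values, the integration method is our choice for the sketch, and the function name is not taken from any software used in the study.

```python
def peak_area(volumes_ml, a280):
    """Trapezoidal area under an A280 trace sampled at the given elution volumes."""
    if len(volumes_ml) != len(a280):
        raise ValueError("volume and absorbance lists must have equal length")
    area = 0.0
    for i in range(1, len(volumes_ml)):
        width = volumes_ml[i] - volumes_ml[i - 1]
        area += 0.5 * (a280[i] + a280[i - 1]) * width
    return area


# Hypothetical baseline-subtracted traces (mAU) over the same elution window.
volumes_ml = [60.0, 62.0, 64.0, 66.0, 68.0]
wt_trace = [0.0, 10.0, 20.0, 10.0, 0.0]
clone_trace = [0.0, 15.0, 30.0, 15.0, 0.0]

wt_area = peak_area(volumes_ml, wt_trace)
clone_area = peak_area(volumes_ml, clone_trace)

print(f"Clone:WT yield ratio = {clone_area / wt_area:.2f}")
```

Comparing areas over an identical elution window is what makes the ratio meaningful; integrating different windows for WT and clone, or including aggregate peaks, would bias the estimate.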
All constructs were sequenced by Sanger sequencing at MWG Biotech AG (Martinsried, Germany) and translated into protein sequences for secondary structure predictions using TMHMM (Krogh et al. 2001).
Electronic supplemental material
Supplemental Figure 1 shows expression levels of 42 different E. coli membrane proteins from the study by Eshaghi et al. (2005). The E. coli constructs used in this study are highlighted.
Supplemental Figure 2 shows a comparison of enzymatic activity for MP09.wt and mutant clones 1 and 2.
Supplemental Table 1 summarizes mutations identified for each protein variant.
Pär Nordlund and Tobias Cornvik are founders of the biotechnology company Evitra AB.
We are grateful to Marie Hedrén for help with the initial cloning work and Albert Beuscher for helpful comments on the manuscript. We thank the Swedish Research Council, the Wallenberg Consortium North, and the EU projects SPINE, E-MEP, and EICOSANOX for financial support.