Phasing statistics for alpha helical membrane protein structures

Authors


Abstract

In this report we highlight the latest trends in phasing methods used to solve alpha helical membrane protein structures and analyze the use of heavy atoms for experimental phasing. Our results reveal that molecular replacement is emerging as the most successful method for phasing alpha helical membrane proteins, with the notable exception of the transporter family, where experimentally derived phase information remains the most effective method. To facilitate the selection of heavy atom salts for experimental phasing, we analyzed their reported use; the analysis indicates that organic mercury salts are still the most successful heavy atom reagents. Interestingly, the use of seleno-l-methionine incorporated protein has increased since earlier studies of membrane protein phasing, as has the use of SAD and MAD for phase determination. Taken together, this study provides a brief snapshot of phasing methods for alpha helical membrane proteins and suggests possible routes for heavy atom selection and phasing based on currently available data.

Abbreviations
GPCRs: G-Protein Coupled Receptors
KAu(CN)2: potassium gold cyanide
K2PtCl6: potassium hexachloroplatinate
K2Pt(NO2)4: potassium platinum nitrate
K2PtCl4: potassium tetrachloroplatinate
MFS: Major Facilitator Superfamily
MIR: multiple isomorphous replacement
MIRAS: multiple isomorphous replacement with anomalous scattering
MMC: methyl mercury chloride
MRC: Medical Research Council
MPs: membrane proteins
SIR: single isomorphous replacement
SIRAS: single isomorphous replacement with anomalous scattering
Ta6Br12: tantalum bromide

Introduction

Membrane protein structural biology is currently undergoing rapid expansion, with many developments in new technologies and beamline instrumentation making a significant impact on the number of new structures reported.[1] The bottlenecks in membrane protein structure determination are well known[2] and, for obvious reasons, substantial resources have gone into understanding recombinant expression,[3, 4] protein stabilization, crystallization,[5, 6] and data collection. Given the often poorly diffracting, anisotropic nature of crystals grown from detergent solubilized samples, further developments in phasing practices and methodologies are perhaps the next area where more insight could make a difference. Using our previous database compiled to analyze crystallization conditions for alpha helical membrane proteins (MPs),[7] we present here a complementary analysis of the current trends in phasing practice and heavy atom use for these proteins. The information updates previous studies conducted before the recent increase in the number of alpha helical MP structures and provides indications as to which heavy atom salts and phasing methods might be tried following successful crystallization and native data collection.

Results

Current trends in phasing methods

Our crystallization database contains 360 entries for alpha helical membrane protein structures. Of these, 229 had initial phase information determined using molecular replacement (MR) (63%) whilst the remaining 131 (37%) required the use of experimental phase determination methods (Fig. 1 inset). This analysis indicates that almost all types of alpha helical MP are amenable to molecular replacement, with 60% or more of structures for the Respiratory Complexes (61%), Channels (69%), Photosynthetic and Light Harvesting Complexes (60%), G-Protein Coupled Receptors (GPCRs) (93%), ATPases (63%), and Bacterial Rhodopsins (94%) having their initial structures phased using this method. The remarkable success of protein fusions and lipid cubic phase crystallization methodology in the GPCR family in particular has resulted in all of the recent structures being phased using MR rather than through experimental methods. Of note, however, is that for the transporters the application of MR has proved far less successful. Although transporters account for a quarter of all structures in the database, the same as the Channel proteins, only 40% of these structures were determined using MR. Perhaps this is due to the irregular helical structures observed in this family, which make it difficult to construct an accurate search model from distantly related proteins. It is also becoming apparent that within the same structural family of transporters, such as the Major Facilitator Superfamily (MFS), different sub-families have different localized regions of helical flexibility, making their use as MR search models dubious. Our results indicate that where a representative structure from the same sub-family is available as an MR search model, this is likely to be far more effective than using a distantly related structure, even if the distantly related structure is likely to share the same overall fold.

Figure 1.

Histogram showing the uses of different elements in heavy atom derivatives of alpha helical membrane proteins. Inset, pie of pie showing the proportion of alpha helical membrane protein structures phased using molecular replacement versus experimental methods. Experimental methods have been further subdivided by technique.

However, there is some redundancy within our database, with multiple entries for the same protein in some cases, which may artificially bias the data with regard to MR use. On closer examination of the phasing method for novel structures, where novel is defined as the first structure for a given protein, in a given conformation, from a particular organism, the percentage of structures determined using MR drops to under 40%. Looking more closely at these structures, 66 of them used phase information from a similar protein (either a prokaryotic counterpart or the same protein from a different organism) or from a portion of the protein (for example a previously solved soluble domain). This leaves only 15 novel entries that were solved by molecular replacement; these are all structures that either used a Fab fragment or employed an engineered protein in which T4 lysozyme had been inserted. Our data therefore suggest that MR is gaining in popularity for determining initial phase information for membrane protein structures. However, where new structures are being sought, experimental phasing is very likely to be the main route to phase determination.

With regard to experimental phasing, then, which methods are the most successful? Of the 131 structures that were determined using experimental phase information, by far the most successful methods were single wavelength anomalous dispersion (SAD) (17%) and multi-wavelength anomalous dispersion (MAD) (13%). These were followed by multiple isomorphous replacement (MIR) (8%), single isomorphous replacement with anomalous scattering (SIRAS) (7%), multiple isomorphous replacement with anomalous scattering (MIRAS) (8%) and, some way behind, single isomorphous replacement (SIR) (2%). From the database entries there was no obvious correlation between the choice of phasing method and resolution. The average resolution for structures solved using MAD and SAD was 3.0 Å, whereas that for MIRAS and SIRAS was 2.8 Å. MIR and SIR had the lowest average resolution at 3.3 Å.

Heavy atom choice

In the past few years the number of membrane protein structures has increased substantially. Our analysis shows that 37% of these required experimental phasing. A previous study into the preparation of heavy atom derivatives for membrane proteins was reported in 2006 for 38 alpha helical MP structures.[8] Here we report an updated analysis from 131 structures and focus solely on the alpha helical class of membrane proteins. Our major findings are summarized in Figure 1. Table 1 shows the breakdown of the successfully reported heavy atom derivatives. Consistent with the findings of the 2006 study, we also observe mercury compounds clearly dominating the database, having been successfully used in 29% of the 191 cases where heavy atoms were used to obtain phase information for structure determination. This number is higher than the total number of structures (131) because a large proportion of the structures used two or more heavy atoms, or a heavy atom combined with seleno-l-methionine, for example. Following mercury, the next most successful heavy atom was platinum at 14% (26 cases), with lead at 6% (11 cases) and gold at 5% (10 cases) accounting for a small but significant subset. The most successful mercury compound is methyl mercury chloride (MMC), accounting for 23% (13 cases) of all 55 cases where mercury salts were used. This is closely followed by the ethylmercurials [acetate (4 cases), chloride (2 cases), thiosalicylate (11 cases) and phosphate salts (7 cases), which collectively account for 43% (24 cases)]. The success of the organomercurial salts is very likely due to the ability of these compounds to partition efficiently into both the lipophilic and hydrophilic environments found in alpha helical MPs, where they react with the cysteine side chains. More reactive mercury salts are also present in significant numbers, namely mercury chloride (9%), potassium mercury iodide, K2HgI4 (5%), and chloro-(4-sulfophenyl) mercury (PCMBS) (9%), indicating that these compounds are worth pursuing for derivatization purposes.
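The element-level shares quoted above follow directly from the case counts. A minimal sketch of that arithmetic is given below; the counts are those stated in the text, while the dictionary and function names are illustrative assumptions and not part of the original analysis.

```python
# Case counts quoted in the text; 191 is the total number of heavy atom uses.
TOTAL_HEAVY_ATOM_USES = 191
ELEMENT_CASES = {"mercury": 55, "platinum": 26, "lead": 11, "gold": 10}


def percent_share(cases, total):
    """Return cases as a percentage of total, rounded to the nearest integer."""
    return round(100.0 * cases / total)


# Hypothetical helper output: element -> percentage of all heavy atom uses.
shares = {
    element: percent_share(cases, TOTAL_HEAVY_ATOM_USES)
    for element, cases in ELEMENT_CASES.items()
}

for element, pct in shares.items():
    print(f"{element}: {pct}% ({ELEMENT_CASES[element]} cases)")
```

Because a single structure can contribute more than one heavy atom use, the denominator here is the 191 reported uses rather than the 131 experimentally phased structures.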

Table 1. The most successful heavy atom compounds used to phase alpha helical membrane protein structures, grouped by element type

| Heavy atom | Compound | No. of uses in database |
| --- | --- | --- |
| Mercury | MMC | 13 |
| | EMP | 7 |
| | EMTS | 6 |
| | HgCl2 | 5 |
| | MMA | 4 |
| Platinum | K2PtCl4 | 10 |
| | K2Pt(NO2)4 | 4 |
| | K2PtCl6 | 3 |
| Lead | TMLA | 7 |
| Gold | KAu(CN)2 | 6 |
| | KAuCl4 | 2 |
| HA cluster | Ta6Br12 | 10 |

By far the most common platinum salt was potassium tetrachloroplatinate (K2PtCl4), accounting for 38% (10 cases) of all platinum salts reported. This was followed some way behind by potassium platinum nitrate [K2Pt(NO2)4] at 15% (4 cases) and potassium hexachloroplatinate (K2PtCl6) at 10% (3 cases). Among lead compounds, trimethyl lead acetate was favored, accounting for 63% of all cases where lead was used. Potassium gold cyanide [KAu(CN)2] was used in six cases, accounting for 60% of all gold cases in the database. Interestingly, the number of reported uses of gold compounds was equivalent to that of the tantalum bromide (Ta6Br12) cluster, which has proved successful for the phasing of large macromolecules at low resolution, including the respiratory Complex I and Na, K-ATPase.[9, 10]

A noticeable difference from the previous analysis of heavy atom use in 2006[8] has been the substantial increase in the use of seleno-l-methionine incorporated protein in phase determination. Se incorporation now accounts for 25% of heavy atom uses reported to date (48 cases). However, this figure rises to 38% if all the structures that have used Se in combination with another HA are taken into account. This is a substantial increase from the 16% of structures reported in 2006. A possible reason for this increase may be a better understanding of recombinant membrane protein expression in E. coli, which now allows for high levels of seleno-l-methionine incorporation.[3, 11] Advancements in beamline technology, such as highly focused X-ray beams[12] and the installation of fast readout, pixel based detectors,[13] have enabled crystallographers to accurately measure the weak anomalous signal from Se, whilst simultaneously reducing radiation damage.[14]

Discussion

MR has been steadily rising as the principal means of phasing membrane protein structures over the last few years.[2] For the GPCR family in particular, molecular replacement phasing has proved remarkably successful. Advances in developing and deploying lipid cubic phase crystallization for the GPCR family have also enabled high quality data to be collected from these crystals.[6, 15] The combination of high quality diffraction data with the remarkable success of fusing the T4 lysozyme protein to either wild type or thermostabilized receptor molecules has resulted,[16] from a phasing perspective, in an ideal system for molecular replacement phasing, thus negating the need for experimental phasing procedures in this class of membrane protein.[17] The use of antibodies as crystallization scaffolds has also been successful for calculating initial phases for the GPCR family[18] and has proved useful for other families, as demonstrated originally for the cytochrome oxidase structure from Paracoccus denitrificans[19] and more recently for the AdiC transporter from Escherichia coli.[20] MR has been far less successful in the transporter group of membrane proteins, most likely due to the highly dynamic nature of these proteins, although no extensive study has been carried out to date. However, it is becoming more apparent, as the number of different transporter structures increases, that the individual sub-families within the larger structural superfamilies, such as the MFS or “LeuT-like fold”, display significantly different structural dynamics. This may be one reason why distantly related transporters, which ultimately have the same overall structure, prove ineffectual as generic MR search models. Recent examples include the eukaryotic proton coupled phosphate transporter, PipT, and the nitrate-nitrite exchanger NarU, which, although sharing the same overall fold as other members of the MFS, were not solved using MR but instead required experimental phasing methods.[21, 22]

As we show in this report, however, experimental phasing still accounts for over half the structures of transporter proteins, suggesting that this approach is likely to remain important for the foreseeable future. In addition, the incorporation of seleno-l-methionine into recombinant protein using eukaryotic expression systems is far from trivial, indicating an increased likelihood of needing to pursue heavy atom derivatization. In making informed decisions as to which heavy atoms are appropriate to test, Table 1 should provide a useful guide.
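One way to use Table 1 as a guide is to rank the listed compounds by their number of reported uses and test the most frequently used ones first. The sketch below transcribes the Table 1 rows and sorts them; the function name and the cut-off of five compounds are illustrative assumptions, and frequency of use in deposited structures should not be read as a measured success rate.

```python
# (element, compound, number of uses) rows transcribed from Table 1.
TABLE_1 = [
    ("Mercury", "MMC", 13),
    ("Mercury", "EMP", 7),
    ("Mercury", "EMTS", 6),
    ("Mercury", "HgCl2", 5),
    ("Mercury", "MMA", 4),
    ("Platinum", "K2PtCl4", 10),
    ("Platinum", "K2Pt(NO2)4", 4),
    ("Platinum", "K2PtCl6", 3),
    ("Lead", "TMLA", 7),
    ("Gold", "KAu(CN)2", 6),
    ("Gold", "KAuCl4", 2),
    ("HA cluster", "Ta6Br12", 10),
]


def screening_order(rows, top_n=5):
    """Return the top_n rows sorted by reported uses, most frequent first.

    sorted() is stable, so compounds with equal counts keep their Table 1 order.
    """
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top_n]


# Hypothetical shortlist of compounds to try first.
shortlist = [compound for _, compound, _ in screening_order(TABLE_1)]
print(shortlist)
```

A shortlist of this kind is only a starting point; the number of accessible cysteine residues and the size of any extramembrane domains would still need to be considered for an individual protein.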

Although individual proteins will have unique phasing solutions, our analysis identifies some general trends and routes to try when designing experimental set-ups for phasing. Derivatization of native cysteine residues using an organic mercury salt has proved very successful, as has the introduction of cysteine residues for this purpose, for example as a substitution for serine. However, our own experience has indicated that the presence of more than five cysteine residues tends to result in heavy protein precipitation following this procedure. Our data also suggest that for membrane proteins containing large extramembrane domains, such as the ATPase family, ABC transporters and respiratory complexes, more “exotic” phasing compounds, such as the tantalum bromide cluster, have proved successful, although often as part of a wide range of different phasing data sets.[9, 23, 24] Indeed, the need for multiple sources of phase information is a recurrent theme in the database (Supporting Information database: MP_phasing.xls), suggesting that a combination of phasing methods should always be considered and pursued.[25]

As more data become available we will continue to update the current analyses on heavy atom selection and hope this will lead to a more rational approach to one of the most challenging hurdles in membrane protein structural biology.

Methods

An analysis of the deposited structures for alpha helical MPs was carried out as described previously[7] to include phasing method and heavy atom information. A copy of the database incorporating the phasing information used in the present analysis is available as Supporting Information MP_phasing.xls.
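The kind of tally used in this analysis can be sketched as follows. This is a minimal illustration only: the record layout (family, phasing method) and the sample rows are hypothetical placeholders and do not reflect the actual columns or contents of MP_phasing.xls.

```python
from collections import Counter

# Hypothetical records: (protein family, phasing method). Placeholder rows
# for illustration only; they are not taken from MP_phasing.xls.
records = [
    ("GPCR", "MR"),
    ("GPCR", "MR"),
    ("Transporter", "SAD"),
    ("Transporter", "MR"),
    ("Transporter", "MIRAS"),
    ("Channel", "MR"),
]


def mr_percentage_by_family(rows):
    """Return {family: percentage of that family's entries phased by MR}."""
    totals = Counter(family for family, _ in rows)
    mr_counts = Counter(family for family, method in rows if method == "MR")
    # Counter returns 0 for families with no MR entries.
    return {
        family: 100.0 * mr_counts[family] / totals[family] for family in totals
    }


percentages = mr_percentage_by_family(records)
for family, pct in sorted(percentages.items()):
    print(f"{family}: {pct:.0f}% phased by MR")
```

The same pattern extends to counting heavy atom compounds per element, provided each structure's derivatives are expanded into one row per heavy atom use.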

Acknowledgments

This research was funded through the Medical Research Council (MRC) Career Development Award grant G0900399 to SN. The authors declare they have no conflict of interest with regard to the information and analyses presented herein.
