A new computational approach has been developed to determine the spatial arrangement of proteins in membranes by minimizing their transfer energies from water to the lipid bilayer. The membrane hydrocarbon core was approximated as a planar slab of adjustable thickness with decadiene-like interior and interfacial polarity profiles derived from published EPR studies. Applicability and accuracy of the method were verified for a set of 24 transmembrane proteins whose orientations in membranes have been studied by spin-labeling, chemical modification, fluorescence, ATR FTIR, NMR, cryo-microscopy, and neutron diffraction. Subsequently, the optimal rotational and translational positions were calculated for 109 transmembrane, five integral monotopic and 27 peripheral protein complexes with known 3D structures. This method can reliably distinguish transmembrane and integral monotopic proteins from water-soluble proteins based on their transfer energies and membrane penetration depths. The accuracies of calculated hydrophobic thicknesses and tilt angles were ∼1 Å and 2°, respectively, judging from their deviations in different crystal forms of the same proteins. The hydrophobic thicknesses of transmembrane proteins ranged from 21.1 to 43.8 Å depending on the type of biological membrane, while their tilt angles with respect to the bilayer normal varied from zero in symmetric complexes to 26° in asymmetric structures. Calculated hydrophobic boundaries of proteins are located ∼5 Å lower than lipid phosphates and correspond to the zero membrane depth parameter of spin-labeled residues. Coordinates of all studied proteins with their membrane boundaries can be found in the Orientations of Proteins in Membranes (OPM) database: http://opm.phar.umich.edu/.
Abbreviations: TM, transmembrane; OPM, Orientations of Proteins in Membranes (database); PPM, Positioning of Proteins in Membranes (software); PDB, Protein Data Bank; ATR FTIR spectroscopy, attenuated total reflection Fourier transform infrared spectroscopy; MD, molecular dynamics; EM, electron cryo-microscopy; NEM, N-ethylmaleimide; DM, n-dodecyl-β-D-maltoside; DHPC, 1,2-dihexanoyl-sn-glycero-3-phosphatidylcholine; RMSD, root-mean-square deviation; ΔGtransf, transfer free energy; D, hydrophobic thickness; τ, tilt angle; σ, atomic solvation parameter.
Thousands of membrane-associated proteins have been deposited in the Protein Data Bank (PDB; Berman et al. 2000), and their number is rapidly growing. However, the precise spatial positions of these proteins in membranes are unknown. Membrane proteins are unique because they function in the highly anisotropic environment of a lipid bilayer, which is characterized by complex polarity gradients and a heterogeneous molecular composition in different regions and leaflets. The positioning of proteins in the lipid matrix may affect their biological activity, folding, thermodynamic stability, and binding with surrounding macromolecules and substrates (White and Wimley 1999; Booth et al. 2001; Bowie 2001; DeGrado et al. 2003; Engelman et al. 2003; Hong and Tamm 2004; Lee 2004). Hence, the orientations of many peptides and proteins in membranes have been studied by a variety of experimental techniques including chemical modification, spin-labeling, paramagnetic or fluorescence quenching, X-ray scattering, neutron diffraction, electron cryomicroscopy, NMR, and polarized infrared spectroscopy (Frillingos et al. 1998; Hristova et al. 1999; London and Ladokhin 2002; de Planque and Killian 2003; Hubbell et al. 2003; Opella and Marassi 2004; Tatulian et al. 2005). However, since the amount of such experimental data is limited, this problem should also be addressed computationally to keep up with the expanding flow of structures in the PDB.
The arrangement of a protein with respect to the membrane can be defined by its shift along the bilayer normal (d), rotational and tilt angles (φ and τ), and thickness of its membrane-spanning region (D = 2z0, Fig. 1). Although the orientation of a transmembrane (TM) protein in a lipid bilayer can be assessed manually (Lee 2003), development of automated methods is necessary to provide more objective, reproducible, and accurate results. The existing computational approaches range from elaborate molecular dynamics (MD) simulations of proteins with explicit water and lipids (Ash et al. 2004; Roux et al. 2004; Gumbart et al. 2005) to simplified approaches that minimize protein transfer energy from water to a hydrophobic slab, which serves as a crude approximation of the membrane hydrocarbon core (Yeates et al. 1987; Rees et al. 1989). In the latter case, the transfer energy can be estimated by using various hydrophobicity scales of whole residues (Zucic and Juretic 2004), a normalized nonpolar accessible surface area (Tusnady et al. 2004), or atomic solvation parameters derived from partition coefficients of model organic compounds between water and nonpolar solvents (Basyn et al. 2003). One such computational method has recently been applied to create the PDB_TM database that provides an up-to-date list of all TM peptides and proteins from the PDB (Tusnady et al. 2005). However, the hydrophobic boundaries of proteins in PDB_TM were not compared with relevant experimental studies and their accuracy is uncertain.
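To make the four variables concrete, the following minimal Python sketch applies a rotation φ about the z-axis, a tilt τ about the x-axis, and a shift d along the bilayer normal to a set of coordinates, then counts the atoms that fall inside a slab of half-thickness z0. The axis conventions, the order of the operations, the function names, and the toy "helix" are assumptions made for illustration only; the text above does not specify them.

```python
import math

def place_in_membrane(coords, phi_deg, tau_deg, d):
    """Rotate by phi about z, tilt by tau about x, then shift by d along z.

    The bilayer normal is taken to be the z-axis (an assumed convention).
    """
    phi, tau = math.radians(phi_deg), math.radians(tau_deg)
    cp, sp = math.cos(phi), math.sin(phi)
    ct, st = math.cos(tau), math.sin(tau)
    placed = []
    for x, y, z in coords:
        # Rotation about the z-axis (angle phi)
        x1, y1, z1 = cp * x - sp * y, sp * x + cp * y, z
        # Tilt about the x-axis (angle tau)
        x2, y2, z2 = x1, ct * y1 - st * z1, st * y1 + ct * z1
        # Translation along the bilayer normal
        placed.append((x2, y2, z2 + d))
    return placed

def atoms_in_slab(coords, z0):
    """Count atoms with |z| <= z0, i.e. inside a slab of thickness D = 2*z0."""
    return sum(1 for _, _, z in coords if abs(z) <= z0)

# Toy "helix": 29 points spaced 1.5 Å apart along z, from -21 Å to +21 Å
helix = [(0.0, 0.0, 1.5 * i) for i in range(-14, 15)]

upright = place_in_membrane(helix, phi_deg=0, tau_deg=0, d=0.0)
tilted = place_in_membrane(helix, phi_deg=0, tau_deg=60, d=0.0)
shifted = place_in_membrane(helix, phi_deg=0, tau_deg=0, d=10.0)

for label, coords in [("upright", upright), ("tilted", tilted), ("shifted", shifted)]:
    print(label, atoms_in_slab(coords, z0=15.0))
```

Tilting the toy helix brings more of it inside the slab, while shifting it pushes one end out, which is why the shift, the angles, and the slab thickness have to be optimized together rather than one at a time.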
In this paper, we present a new computational approach for positioning proteins in membranes that agrees better with experimental data, which are currently available for 24 TM proteins of known 3D structure. The optimal spatial arrangement of a protein is determined by minimizing its transfer energy from water to a hydrophobic slab with decadiene-like polarity. This method was developed, verified, and applied to all TM proteins from the PDB. The results are deposited in our Orientations of Proteins in Membranes (OPM) database to allow their further examination, use and testing by the scientific community (Lomize et al. 2006).
Development of the method
Choice of atomic solvation parameters
The results for TM proteins were strongly dependent on the choice of atomic solvation parameters that have been applied for calculations of protein transfer energy (Table 1). These parameters can be derived from partition coefficients of uncharged solutes between water and octanol, cyclohexane, or other nonpolar solvents (Eisenberg and McLachlan 1986; Ducarme et al. 1998; Efremov et al. 1999; Lomize et al. 2004). To choose the appropriate parameter set, we calculated the orientations of TM proteins by our program using three alternative scales and compared the results with experimental studies (Table 2). The comparison shows that 1,9-decadiene and cyclohexane-based parameters produce nearly identical results that are consistent with the experimental data, while the octanol scale performs poorly. In all subsequent calculations, we used parameters for decadiene, because this solvent was shown to be the best model of the bilayer interior in studies of membrane permeability barriers (Xiang and Anderson 1994a, b; Mayer and Anderson 2002). We also found that two slightly different parameter sets should be applied for proteins in detergents and bilayers (Table 1). The results of the calculations with decadiene-based parameters (“lipid bilayer scale”) were more consistent with experimental studies for mechanosensitive MscL channel, F-type Na+-ATP synthase, and rhodopsin in bilayers, while hexadecene-based parameters (“detergent scale”) were more suitable for reproducing EPR data for rhodopsin in dodecyl maltoside (see below).
Table 1. Comparison of atomic solvation parameters applied for simulations of peptides and proteins in micelles and lipid bilayers
Table 2. Comparison of experimental (Dexper) and theoretical (Dcalc) hydrophobic thicknesses of TM proteins. The latter values were calculated with solvation parameters for water-cyclohexane (chx), water-octanol (oct) and water-decadiene (dcd) systems
Description of membrane interfacial region
The results were also dependent on the model of the membrane interfacial area. In the final version, all solvation parameters were normalized by the effective concentration of water, which changes gradually along the bilayer normal in a relatively narrow region between the lipid head groups and the hydrocarbon core, as follows from EPR studies of spin-labeled lipid analogs (Marsh 2001, 2002; Kurad et al. 2003; Erilov et al. 2005). We applied the sigmoidal polarity profiles with the characteristic distance λ ∼0.9 Å, which is more justified than linear (Pellegrini-Calace et al. 2003) or polynomial (Lazaridis 2003) functions, or sigmoidal profiles with larger values of λ ∼2 Å (Jahnig and Edholm 1992; Basyn et al. 2003). Hydrophobic thicknesses of some TM proteins are increased by ∼1 Å if calculated with a smaller value of λ ∼0.4 Å.
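The effect of λ on the width of the interfacial region can be illustrated with a short Python sketch. The logistic form used here is an assumption: the text states only that the profile is sigmoidal with characteristic distance λ, so the exact expression and the function names below are illustrative rather than the form used in the program.

```python
import math

def water_fraction(z, z0, lam=0.9):
    """Assumed logistic profile of effective water concentration.

    Close to 0 deep in the hydrocarbon core (|z| << z0), close to 1 in
    bulk water (|z| >> z0), and exactly 0.5 at the boundary |z| = z0.
    """
    return 1.0 / (1.0 + math.exp((z0 - abs(z)) / lam))

def transition_width(lam):
    """Distance over which the logistic profile rises from 10% to 90%."""
    return 2.0 * lam * math.log(9.0)

z0 = 15.0  # half of a 30 Å hydrophobic thickness, chosen only as an example
for lam in (0.4, 0.9, 2.0):
    profile = [round(water_fraction(z, z0, lam), 3) for z in (11, 13, 15, 17, 19)]
    print(f"lambda={lam}: 10-90% width = {transition_width(lam):.2f} Å, profile = {profile}")
```

Under this assumed form, the 10% to 90% transition spans roughly 4 Å for λ = 0.9 Å and roughly 9 Å for λ = 2 Å, which shows how strongly the choice of λ controls the sharpness of the interface.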
Treatment of internal cavities in TM proteins
It was also important that the transfer energy does not include any contributions of atoms that face internal polar cavities of TM proteins and do not directly interact with surrounding bulk lipid. Without this, the orientations of β-barrels and many α-helical transporters would be calculated incorrectly, because they have large interior channels or funnels filled by polar and charged residues. This problem was recognized and addressed previously in calculations with TMDET and Garlic (Tusnady et al. 2004; Zucic and Juretic 2004) and in Monte Carlo simulations of β-barrels (Basyn et al. 2003). Our algorithm for excluding atoms in the funnels or channels was different from that in the previously published methods (see Materials and Methods).
Main features of the method
The following approximations were found to be necessary and sufficient for reproducing the experimental data: (1) a lipid bilayer is represented as a planar hydrophobic slab with adjustable thickness and a narrow interfacial area with a sigmoidal polarity profile; (2) a protein is considered as a rigid body with flexible side chains whose transfer energy is minimized with respect to four variables; (3) transfer energy is calculated at an all-atom level using atomic solvation parameters determined for the water-decadiene system; (4) explicit electrostatic interactions are neglected, while ionization penalties are included for charged residues, which are considered neutral in the nonpolar environment; (5) contributions of pore-facing atoms in TM proteins are automatically eliminated. The computational model obtained is exceptionally simple, because it depends only on five atomic solvation parameters (for N, O, S, and sp2 and sp3 carbons), a constant λ defining the size of the interfacial region, and ionization energies of charged groups. None of these parameters is adjustable; rather, all are independently derived from various experimental sources. This method was implemented in the program PPM 1.0 (Positioning of Proteins in Membranes).
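A heavily simplified Python sketch of approximations (1) and (3) is given below: the transfer energy is taken as a sum over atoms of a solvation parameter times the accessible surface area, weighted by an assumed logistic profile, and is minimized over the shift d only. The σ values are placeholders and are not the values of Table 1; the functional form, the names, and the toy atom list are assumptions. Ionization penalties, side-chain flexibility, exclusion of pore-facing atoms, and optimization over φ, τ, and z0 are all omitted.

```python
import math

# Placeholder solvation parameters in kcal/(mol Å^2). These are NOT the
# values from Table 1; negative means transfer to the slab is favorable.
SIGMA = {"C_sp3": -0.020, "C_sp2": -0.015, "N": 0.050, "O": 0.050, "S": -0.010}

def lipid_fraction(z, z0, lam=0.9):
    """Assumed logistic weight: ~1 in the hydrocarbon core, ~0 in water."""
    return 1.0 / (1.0 + math.exp((abs(z) - z0) / lam))

def transfer_energy(atoms, z0, d=0.0, lam=0.9):
    """Sum sigma * ASA * weight over atoms given as (type, ASA, z)."""
    return sum(SIGMA[t] * asa * lipid_fraction(z + d, z0, lam) for t, asa, z in atoms)

def best_shift(atoms, z0, shifts):
    """Coarse one-dimensional search over the shift d along the normal."""
    return min(shifts, key=lambda d: transfer_energy(atoms, z0, d))

# Toy "protein": nonpolar carbons spanning the slab, polar oxygens at both ends
toy = [("C_sp3", 10.0, float(z)) for z in range(-12, 13, 3)]
toy += [("O", 10.0, 20.0), ("O", 10.0, -20.0)]

z0 = 15.0
d_opt = best_shift(toy, z0, range(-10, 11))
print("optimal shift:", d_opt, "energy:", round(transfer_energy(toy, z0, d_opt), 3))
```

For this symmetric toy system the search settles on zero shift, since any displacement moves nonpolar atoms toward the interface and polar atoms toward the core.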
Precision of the method
Parameters from different crystal forms of the same protein were calculated to estimate the precision of the method (Supplemental Material, Table 1). Deviations of parameters D and τ were within 1 Å and 2°, respectively, in complexes with identical numbers of TM subunits. Deviations were primarily caused by different conformations of flexible side chains and nonregular loops in crystal structures. Orientations of proteins can fluctuate significantly if their energy surfaces are shallow. The corresponding maximal variations (±) of D and τ parameters were calculated within 1 kcal/mol around the global minimum of transfer energy, as has been suggested previously (Yeates et al. 1987). Obtained ranges are indicated in all Tables. They are larger than the precisions of the corresponding parameters estimated from comparison of different crystal forms.
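The 1 kcal/mol criterion can be sketched as follows: given sampled points of the energy surface, collect every point whose energy lies within the window above the global minimum and report the spread of D and τ among them. The sample values and the function name are invented for illustration and do not come from the paper.

```python
def parameter_ranges(samples, window=1.0):
    """Return the optimum and the spread of D and tau near the minimum.

    samples: list of (D, tau, energy) tuples, energies in kcal/mol.
    window:  energy window above the global minimum (1 kcal/mol in the text).
    """
    best = min(samples, key=lambda s: s[2])
    near = [s for s in samples if s[2] <= best[2] + window]
    d_vals = [s[0] for s in near]
    t_vals = [s[1] for s in near]
    return {
        "D": best[0],
        "D_range": (min(d_vals), max(d_vals)),
        "tau": best[1],
        "tau_range": (min(t_vals), max(t_vals)),
    }

# Invented samples of an energy surface: (D in Å, tau in degrees, energy)
samples = [
    (30.0, 5.0, -50.0),
    (31.0, 6.0, -49.4),
    (29.0, 3.0, -49.2),
    (33.0, 10.0, -47.0),  # outside the 1 kcal/mol window
]
result = parameter_ranges(samples)
print(result)
```

A shallow energy surface places many points inside the window and therefore yields wide ranges, which is the sense in which these ranges measure orientational fluctuation rather than numerical precision.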
Verification of the method
The results were verified against all available experimental data for 24 TM proteins of known 3D structure whose spatial positions in bilayers have been experimentally studied: rhodopsin (1gzm), bacteriorhodopsin (1py6), sensory rhodopsin II (1h2s), photosynthetic reaction centers from two species (1rzh and 1dxr), cytochrome c oxidase (1v55), Na+-ATPase (1yce), Ca2+-ATPase (1iwo, 2agv, 1wpe, 1t5s, 1su4, 1wpg), phospholamban (1zll), lactose permease LacY (1pv6), protein translocase SecY (1rh5), Na+/H+ antiporter (1zcd), K+-channel KcsA (1r3j), MscL mechanosensitive channel (1msl), acetylcholine receptor (2bg9), outer membrane proteins OmpA (1qjp), OmpX (1qj8), OmpLA (1qd6), OmpF (1hxx), ferric enterobactin receptor FepA (1fep), ferric hydroxamate uptake receptor FhuA (1qfg), cobalamine transporter BtuB (1nqe), α-hemolysin (7ahl), and gramicidin A (1grm). A number of methods, such as chemical modification, spin-labeling, fluorescence spectroscopy, ATR FTIR, NMR, X-ray scattering, neutron diffraction, electron cryo-microscopy, and hydrophobic matching studies were used to determine hydrophobic thicknesses or tilts of these proteins, to locate their membrane-embedded segments, and to evaluate penetration depths and environments of their residues in lipid bilayers or detergents (Supplemental Material).
Comparison with hydrophobic thicknesses of matching lipid bilayers
The calculated hydrophobic thicknesses of 12 TM proteins agree with the corresponding experimental values obtained from site-directed spin-labeling studies (BtuB transporter and FepA receptor), EM data (Na+ ATPase), X-ray scattering of photoreceptor membranes (rhodopsin) or hydrophobic matching experiments (other proteins), as can be seen from comparison of Dcalc(dcd) and Dexper values in Table 2. The hydrophobic matching studies determine the bilayer thickness that provides optimal functional activity of a protein (Lee 2004) or maximum protein–lipid binding affinity (Lee 2003), or identify lipids whose temperature of phase transition is not affected by the presence of the protein (Dumas et al. 1999). The hydrocarbon thicknesses of lipid bilayers are obtained by subtracting 10 Å from their phosphate-to-phosphate distances determined by X-ray scattering (Lewis and Engelman 1983a; Nagle and Tristram-Nagle 2000). Thus, the calculated hydrophobic boundaries are located ∼5 Å from the phosphate groups toward the membrane center, i.e., at the level of the carbonyl groups of the lipid molecules.
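The arithmetic behind this comparison is simple enough to state as a short Python sketch. The 5 Å offset per leaflet is taken from the text; the phosphate-to-phosphate distance in the example is invented, and the function names are illustrative.

```python
HEADGROUP_OFFSET = 5.0  # Å per leaflet, i.e. 10 Å in total, as stated in the text

def hydrocarbon_thickness(p_to_p_distance):
    """Hydrocarbon core thickness from the phosphate-to-phosphate distance."""
    return p_to_p_distance - 2 * HEADGROUP_OFFSET

def hydrophobic_mismatch(d_calc, p_to_p_distance):
    """Positive when the protein's hydrophobic span exceeds the bilayer core."""
    return d_calc - hydrocarbon_thickness(p_to_p_distance)

# Invented example: a bilayer with a 40 Å phosphate-to-phosphate distance
# compared with a protein whose calculated hydrophobic thickness is 32.4 Å
core = hydrocarbon_thickness(40.0)
delta = hydrophobic_mismatch(32.4, 40.0)
print(f"core = {core:.1f} Å, mismatch = {delta:+.1f} Å")
```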
Comparison with experimental tilt angles
NMR studies of the bacteriorhodopsin trimer show that helix A and the extracellular section of helix B are tilted with respect to the bilayer normal by 18°–22° and by less than 5°, respectively (Kamihira et al. 2005). This is consistent with the calculated tilt angles of 23° and 5°, respectively, for these helices in a trimer (1qm8). The calculated tilt of gramicidin A channel (2° ± 10°) is also in excellent agreement with solid-state NMR and infrared dichroism studies that show a nearly perpendicular arrangement of the dimer in the membrane (Nabedryk et al. 1982; Andronesi et al. 2004; Andersen et al. 2005). The calculated average tilts of α-helices or β-strands in TM proteins correlate well with ATR FTIR spectroscopy data (Table 3). However, the experimental values are systematically larger, which could be due to some orientational disorder under the experimental conditions. It has been noted that values of τ obtained by ATR FTIR spectroscopy may represent upper limits of the actual tilt angles for α-helical peptides, due to such disorder (Bechinger et al. 1999; de Planque and Killian 2003).
Table 3. Average tilt angles (°) of TM α-helices or β-strands relative to the membrane normal calculated by PPM 1.0 (βcalc) and determined by ATR FTIR spectroscopy (βexper)
The overall tilt calculated for rhodopsin (τ ∼ 8°) is consistent with the orientation of the protein in 2D crystals (Krebs et al. 2003). The tilts of seven individual helices were 32° (I), 27° (II), 27° (III), 4° (IV), 32° (V), 9° (VI), and 16° (VII) in the 2D crystals and 33° (I), 25° (II), 27° (III), 9° (IV), 15° (V), 13° (VI), and 20° (VII) in the calculated orientation of rhodopsin. Thus, a significant discrepancy was found only for TM helix V of rhodopsin, which is the least reliably defined in EM maps. The calculated orientation of hetero-trimeric SecY complex (1rh5) is similar but not identical to that in the 2D crystal where this protein forms a dimer of trimers (Breyton et al. 2002).
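The helix-by-helix comparison for rhodopsin can be checked directly from the tilt values quoted above. The following Python sketch computes the absolute deviation for each helix and identifies the outlier; only the variable names are introduced here.

```python
# Tilts of the seven rhodopsin helices (degrees), as quoted in the text
crystal_2d = {"I": 32, "II": 27, "III": 27, "IV": 4, "V": 32, "VI": 9, "VII": 16}
calculated = {"I": 33, "II": 25, "III": 27, "IV": 9, "V": 15, "VI": 13, "VII": 20}

# Absolute deviation between calculated and 2D-crystal tilts for each helix
deviation = {h: abs(calculated[h] - crystal_2d[h]) for h in crystal_2d}

worst = max(deviation, key=deviation.get)
others = [v for h, v in deviation.items() if h != worst]

print("deviations:", deviation)
print("largest deviation: helix", worst, "=", deviation[worst], "degrees")
print("largest deviation among the other helices:", max(others), "degrees")
```

Helix V deviates by 17°, whereas none of the other six helices deviates by more than 5°.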
Comparison with membrane penetration depths of individual residues
The calculated membrane-embedded portions of the regular secondary structures agree with studies of BtuB transporter, bacteriorhodopsin, FepA receptor, and MscL and KcsA channels by spin-labeling, MscL by fluorescence, and Na+ ATPase by EM and X-ray crystallography (Table 4). The experimental and calculated penetration depths of individual spin-labeled residues are generally consistent for MscL and KcsA channels, bacteriorhodopsin, and FepA receptor (Fig. 2). However, experimental and calculated depths may deviate up to 5 Å, as observed for residue 69 in MscL channel (Dcalc = 1.1 Å and Dexper = 6.0 Å). Such deviations probably appear because the depths were taken for the Cβ-atom of the spin-labeled cysteine instead of the nitroxyl group, which actually interacts with the paramagnetic quenchers. The penetration depth of the nitroxyl radical in Cys69 of MscL can be in the range of −1.0 Å to 6.5 Å, depending on four χ angles of the spin-labeled cysteine. The latter value is more consistent with the experiment.
Table 4. Comparison of calculated and experimentally determined membrane-embedded portions of α-helices or β-strands
The calculated position of the retinal β-ionone ring in bovine rhodopsin along the bilayer normal (−4.1 Å) corresponds well to its location in 2D crystals (between sections z = 0 and z = −6 Å; Krebs et al. 2003). Four interfacial Trp residues of OmpA are located at distances of 8 Å to 13 Å from the calculated bilayer center, close to the value of 9 Å to 10 Å obtained by the parallax method (Kleinschmidt and Tamm 1999). Several Trp residues of α-hemolysin (7ahl) are situated in the lipid head group area according to our results, which is consistent with their accessibility to water-soluble iodide and doxyl probes (Raja et al. 1999) and with locations of lipid head groups determined crystallographically (Galdiero and Gouaux 2004). The shallow location of Trp452 from the γ subunit of the nicotinic acetylcholine receptor (1.4 Å below the calculated hydrophobic boundary) is also consistent with fluorescence studies (Chattopadhyay and McNamee 1991).
Comparison with environments of residues
The water- or lipid-facing environments of individual residues in TM proteins can be mapped by different chemical probes. Lactose permease LacY (1pv6) has been more extensively studied by chemical modification than any other TM protein. A total of 393 single-Cys mutants of LacY were modified by bifunctional reagents to identify residues that could be involved in intermolecular cross-linking (Guan et al. 2002; Ermolova et al. 2003). Only residues located in regions of sufficiently high polarity could produce the reactive thiolate anion required for the formation of intermolecular disulfide bonds. Indeed, our calculations show that most of the residues susceptible to cross-linking are situated in water-exposed periplasmic and cytoplasmic loops or in a relatively narrow (∼5 Å) layer of the hydrophobic core where the dielectric permittivity may be intermediate between that in lipid and water (Fig. 3A). These layers are parallel to the calculated interfacial planes, thus confirming that these planes were identified correctly. However, some of the reactive residues (Trp78, Phe398) are situated 7–10 Å from the surface, which is most probably because they occupy polar sites at the C-termini of TM helices close to Lys74, Lys188, and Lys289 where local dielectric constant may be higher.
The calculated membrane boundaries of LacY are also consistent with site-directed modification of its 159 residues from TM helices II, VII, IX and X by N-ethylmaleimide (NEM) (Voss et al. 1997; Frillingos et al. 1998; Venkatesan et al. 2000a, b, c; Kwaw et al. 2001; Zhang et al. 2003). All residues inaccessible to NEM either face toward the lipid within the calculated hydrophobic slab, or are buried in the protein interior (blue in Fig. 3B). All residues modified by NEM are either accessible to water (outside the membrane boundaries, or in the large interior channel of the permease), or are situated at the water–lipid interface (for example, F308; red in Fig. 3B). Furthermore, site-directed spin-labeling studies of TM helices IV, V, and XII identified a number of residues that presumably face the lipid phase judging from their low accessibility to chromium and high accessibility to oxygen (Voss et al. 1996; Zhao et al. 1999). All these residues are located within the calculated boundaries and are exposed to lipid (green in Fig. 3B).
Vertebrate rhodopsins have also been studied in great detail. Environments of many rhodopsin residues were characterized in intact photoreceptor membranes (Davison and Findlay 1986a, b) or in n-dodecyl-β-D-maltoside (DM) (Hubbell et al. 2003). The results are consistent with the data for the native membranes, primarily with the results of chemical modification of ovine rhodopsin by hydrophobic and hydrophilic probes, which interact with residues exposed to nonpolar or polar environments, respectively (Barclay and Findlay 1984; Davison and Findlay 1986a, b). The hydrophobic probe was shown to modify all 14 Cys, Trp, Tyr, and His residues within the calculated hydrophobic boundaries (blue in Fig. 3C) and a few residues situated just outside the boundary in the lipid head group area (His65, Lys66, Tyr74, Lys231, Cys316) (Davison and Findlay 1986a). All polar residues that were modified by nonpermeable hydrophilic probe applied from the intracellular side (Barclay and Findlay 1984) are located outside the calculated hydrophobic slab (red in Fig. 3C).
The accessibilities to paramagnetic quenchers of two residues in Na+/H+ antiporter (1zcd) and eight residues in sensory rhodopsin II (1h2s) (Wegener et al. 2000; Hilger et al. 2005) are consistent with locations of the membrane boundaries calculated for the corresponding proteins.
Comparison with studies of TM proteins in detergents
The hydrophobic dimensions of TM proteins have also been studied in detergents using neutron diffraction with contrast variation in crystals (two photosynthetic reaction centers, trimeric porin OmpF and monomeric phospholipase OmpLA), solution NMR (OmpX) and spin-labeling (rhodopsin). Detergent molecules form monolayers around nonpolar surfaces of TM proteins and most of them are oriented perpendicular to the protein surface, unlike lipids in bilayers (Fig. 3F; le Maire et al. 2000). In crystals of TM proteins, these monolayers look like regular rings with dimensions of 15–20 Å in the direction perpendicular to TM domains and 15–30 Å parallel to them (Roth et al. 1989, 1991). The latter values are in agreement with hydrophobic thicknesses of the corresponding proteins calculated by PPM 1.0 (Table 5). Moreover, the calculated membrane boundary planes of the proteins closely correspond to the borders of the detergent monolayer. For example, these planes pass through the aromatic rings of Trp78, Trp98, Phe109, and Phe122 residues in monomeric phospholipase A (OmpLA) in agreement with neutron diffraction (Snijder et al. 2003).
Table 5. Comparison of calculated protein hydrophobic lengths (Dcalc) and experimental thicknesses of detergent monolayers around the proteins, as determined by neutron diffraction with contrast variation (Dexper)
A higher resolution picture of detergent–protein interactions has been obtained by solution NMR studies of OmpX β-barrel (1qj8) in the presence of a small amphiphile, 1,2-dihexanoyl-sn-glycero-3-phosphatidylcholine (DHPC) (Fernandez et al. 2002). Importantly, this NMR study identified environments of individual atoms in solution rather than in the crystal. A large set of aliphatic and NH hydrogens involved in NOEs with hydrophobic tails and head groups of DHPC has been determined (Fernandez et al. 2002), which allowed mapping of the detergent embedded area of the protein with high precision. The membrane boundaries calculated with PPM 1.0 are in agreement with NMR data (Fig. 3E). Only two NH backbone groups that interact with detergent occupy an “aromatic spot” outside the calculated slab.
Results of the calculations with “detergent” and “bilayer” scales (Table 1) were nearly identical for all proteins studied in detergents, except rhodopsin. To reproduce the experimental conditions, we used the crystal structure of rhodopsin with extended helix V (1gzm), removed its C-terminal palmitates, and applied the “detergent” solvation parameters. This led to an increased hydrophobic thickness (from 32.4 Å to 36.9 Å) and tilt angle (from 8° to 16°) (Fig. 3D). The expanded membrane boundaries are in much better agreement with spin-labeling data in DM (Hubbell et al. 2003) than with chemical modification data in native membranes (Davison and Findlay 1986a, b). The EPR studies identified a number of interfacial residues that were buried from water when substituted by a spin-labeled cysteine (V63, P71, V137, H152, K231, T251, and N310). The Cβ-atoms of these residues are indeed situated within the expanded boundaries calculated with “detergent” parameters, but well outside the boundaries obtained with “membrane” parameters (Supplemental Material, Table 3). Most importantly, several residues in the last turn of helix V (227–231) were shown to be coated with the detergent (Hubbell et al. 2003), but are accessible to water in the native membrane.
Application of the method to TM proteins from the PDB
After successful testing of the method for 24 well-studied TM proteins, it was applied to all other TM proteins deposited in the PDB. The calculations were conducted for 109 TM protein complexes (80 α-helical, 28 β-barrels, and gramicidin A dimer), 32 representative integral monotopic and peripheral proteins selected from the literature, and a control set of 20 water-soluble proteins with the highest hydrophobicity score in PDB_TM. Any protein that did not traverse the membrane after the optimization was interpreted as peripheral or monotopic, and the maximal membrane penetration depth of its atoms was calculated instead of its hydrophobic thickness.
Figure 4 shows that TM, integral monotopic and peripheral proteins occupy separate areas of the plot of hydrophobic thickness (D = 2z0) versus transfer energy (ΔGtransf), and therefore, they can be easily distinguished based on these two parameters. All peripheral and monotopic proteins have penetration depths of <15 Å. Sixteen hydrophobic water-soluble proteins from PDB_TM have transfer energies in the range from 0 to −0.5 kcal/mol. However, there are four exceptions that can be interpreted as probable membrane-associated proteins. Among them are VH antibody domain resistant to aggregation (1ohq, ΔGtransf = −7.2 kcal/mol), extracellular domain of bone morphogenetic protein receptor (1es7, ΔGtransf = −5.3 kcal/mol), gephyrin domain that links glycine receptor and tubulin (1t3e, ΔGtransf = −3.4 kcal/mol), and casein kinase (1rqf, ΔGtransf = −1.6 kcal/mol). Indeed, interactions with membranes were shown to be functionally important for gephyrin and bone morphogenetic receptor (Sebald et al. 2004; Sola et al. 2004). Thus, our program can be applied for automatic identification and discrimination of TM, integral monotopic and water-soluble proteins, although the threshold between peripheral and nonmembrane proteins is sometimes blurred.
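One possible reading of this discrimination scheme is sketched below in Python. The −0.5 kcal/mol cutoff is taken from the range reported above for the sixteen water-soluble controls and is used here only as an illustrative threshold, not as a criterion stated by the method; as noted, the real boundary between peripheral and nonmembrane proteins is blurred. The function name and the category labels are assumptions.

```python
def classify(spans_membrane, dg_transfer, soluble_cutoff=-0.5):
    """Rough three-way classification from the optimized position.

    spans_membrane: True if the protein traverses the slab after optimization.
    dg_transfer:    transfer energy in kcal/mol (more negative = more favorable).
    soluble_cutoff: illustrative threshold, an assumption of this sketch.
    """
    if spans_membrane:
        return "transmembrane"
    if dg_transfer < soluble_cutoff:
        return "membrane-associated (peripheral or monotopic)"
    return "water-soluble"

# Transfer energies of the four exceptions quoted in the text (kcal/mol)
exceptions = {"1ohq": -7.2, "1es7": -5.3, "1t3e": -3.4, "1rqf": -1.6}
labels = {pdb: classify(False, dg) for pdb, dg in exceptions.items()}

for pdb, label in labels.items():
    print(pdb, label)
print("control at -0.3 kcal/mol:", classify(False, -0.3))
```

With this cutoff, all four exceptions fall into the membrane-associated class, consistent with their interpretation in the text as probable membrane-associated proteins.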
Calculated hydrophobic thicknesses of TM proteins range from 21.1 Å to 43.8 Å, depending on the type of biological membrane (Table 6). Their average values are 29–30 Å for proteins from the inner bacterial, archaebacterial, endoplasmic reticulum and thylakoid membranes, but slightly higher (∼31 Å) for proteins from eukaryotic plasma membranes and slightly lower (∼27 Å) for proteins in inner mitochondrial membranes. However, thicknesses of proteins from outer membranes of Gram-negative bacteria (∼24 Å) and especially cell wall membranes of Gram-positive bacteria (43.8 Å) significantly deviate from 30 Å. This trend was noted previously and explained by the specific lipid compositions of the corresponding membranes (Faller et al. 2004; Tamm et al. 2004).
Table 6. Calculated hydrophobic thicknesses of proteins from different biological membranes (Dmin, Dmax, Daver)
Calculated tilt angles (τ) of TM proteins vary from zero in all symmetric complexes to 1°–6° in the majority of monomeric and hetero-oligomeric structures. A few proteins have larger tilt angles, 7°–11°: rhodopsin (1gzm), mitochondrial succinate dehydrogenase (1zoy), SecY translocase (1rh5), and small monomeric β-barrels OmpA (1qjp), OmpX (1qj8), and NspA (1p4t). More extreme tilts (20°–26°) were obtained only for PagP enzyme from outer bacterial membrane (1thq), sulfohydrolase (1p49), different functional states of Ca2+-ATPase (1wpe, 1wpg, 1t5s, 1su4, 1iwo), the sensory domain of KvAP channel (1ors), and subunit c of F-type ATPase (1a91). A significant tilt of PagP was previously suggested based on the arrangement of its aromatic residues (Bishop 2005). Thus, TM proteins tend to be nearly perpendicular to the membrane, although the individual helices are tilted with respect to the bilayer normal by an average of 21° (Bowie 1997). All strongly tilted structures are either parts of incompletely assembled complexes (1gzm, 1zoy, 1rh5, 1ors, and 1a91), or have small TM domains and are therefore orientationally unstable (1qjp, 1qj8, 1p4t, and 1p49) or undergo large-scale conformational transitions (PagP and Ca2+-ATPase). Significant tilts are usually stabilized by peripheral helices that float in the membrane parallel to its surface, e.g., PagP enzyme, Ca2+-ATPase and rhodopsin.
D and τ parameters of TM proteins were prone to fluctuations within 1 kcal/mol around the global minimum of transfer energy. These fluctuations were usually smaller than 2 Å and 4°, respectively. However, the fluctuations were larger for proteins with a smaller TM perimeter (Fig. 5).
Three proteins in our data set had unexpectedly small calculated hydrophobic thicknesses: EmrE transporter (1s7b), the first published structure of KvAP potassium channel (1orq), and heptameric mechanosensitive channel MscS (1mxm). This may reflect distorted, incomplete, or conformationally labile structures of these proteins. For example, according to our calculations, the EmrE dimer has a very small hydrophobic thickness (16 Å) and an unusually large tilt of its TM domain (τ = 81°). On the other hand, the 3D structure of EmrE was described as inconsistent with cross-linking and biochemical data (Soskine et al. 2002, 2004; Butler et al. 2004) and with its EM image in 2D crystals (Ubarretxena-Belandia and Tate 2004). A small hydrophobic thickness (23 Å) of the KvAP tetramer (1orq) is probably related to the non-native arrangement of its three N-terminal helices in the crystal (Cuello et al. 2004; MacKinnon 2004; Long et al. 2005). Removing these three helices from the calculations resulted in a larger thickness (26 Å) and zero tilt angle, indicating that the rest of the structure is native. The relatively small (23 Å) calculated thickness of MscS mechanosensitive channel may be attributed to the open or another expanded state of the sensor, which is formed when the membrane becomes thinner under the influence of osmotic pressure (Bass et al. 2002; Akitake et al. 2005). Alternatively, this might be due to the disordered ends of TM helices, which do not include 25 N-terminal residues in each of the seven symmetric subunits.
In contrast, the hydrophobic thicknesses of F- and V-type ATPases and lipid flippases were unusually large, ∼36 Å (1a91, 1c17, 1yce, 2bl2, 1pf4, and 1z2r). This is expected to produce a significant hydrophobic mismatch between these proteins and their host bilayers. The mismatch may facilitate large-scale movements of these proteins in the lipid bilayers, because it reduces protein–lipid binding affinities (Lee 2003).
Some potential problems can also be detected based on the calculated tilt angles. These angles were close to zero for all TM complexes with noncrystallographic symmetry except cytochrome b6f from M. laminosus (1vf5) and fumarate reductase dimer from E. coli (1kf6), which both have τ of ∼3°. The non-zero overall tilt of the cytochrome b6f complex appears to be due to significant differences between the symmetry-related subunits (RMSD of Cα-atoms ∼1 Å). The dimer of fumarate reductase is loosely packed and therefore was suggested to be non-native (Iverson et al. 1999).
It is noteworthy that some TM proteins can form non-native dimers or trimers in crystals, for example, rhodopsin (1u19 and 1gzm), bacteriorhodopsin (1py6), lactose permease (1pv6), OmpA (1qjp), OmpX (1qj8), and fatty acid transporter FadL (1t16 and 1t1l). Hydrophobic thicknesses of such non-native oligomers are usually reduced (see Supplemental Material). Therefore, it is important to know the complete and correct quaternary structure of a multimeric complex to calculate its position in the membrane.
The positioning of proteins in membranes is an important problem that has been previously addressed by various methods. Our computational approach to this problem is exceptionally simple, because it neglects all fine details of the protein–lipid interactions in the membrane interfacial areas and accounts only for the hydrophobic burial of the proteins in the membrane core. We found that it is important to implement an appropriate atomic solvation parameter set, include ionization penalties for charged residues, and exclude contributions of pore residues that are not involved in interactions with surrounding bulk lipids. Since the results of the calculations are generally consistent with the experimental studies of 24 TM proteins, the main underlying assumptions of the method seem to be reasonable, as discussed below.
Description of the membrane
The bilayer core was approximated by a hydrophobic slab with narrow interfacial areas, based on EPR studies of spin-labeled lipid analogs (Marsh 2002). This crude approximation is appropriate, because the positioning of TM proteins in membranes is dominated by hydrophobic forces (Booth et al. 2001; Engelman et al. 2003), although electrostatic and other interactions at the lipid interface can also play an important role, especially for weakly bound peripheral peptides and proteins (Murray et al. 1997; White and Wimley 1999; Cho and Stahelin 2005). The calculated boundary planes of the proteins are located ∼5 Å deeper than lipid phosphates judging from the comparison of hydrophobic thicknesses of the proteins and lipid bilayers (Table 2). The effective concentrations of polar and nonpolar paramagnetic probes are approximately equal at these boundaries (Φ = 0) judging from the spin-labeling studies of TM proteins (Fig. 2). The borders between nonpolar and polar regions at the surfaces of protein complexes are usually well approximated by planes. Only a few nonpolar residues remained outside the calculated planar boundaries and few charged groups penetrated inside the calculated membrane-spanning regions. Some tendency for local hydrophobic thinning was observed in the structures of chloride channels and mitochondrial cytochrome c oxidase.
The concept of hydrophobic matching
The hydrophobic thickness was treated as a free variable, because the lipid bilayers are known to adjust their local thickness to match the nonpolar areas of large TM proteins (Dumas et al. 1999). Such adjustment costs relatively little energy, ∼0.025 kcal/mol per fatty acyl chain carbon atom (Williamson et al. 2002; Lee 2003), compared to 0.7 kcal/mol required for transfer of a CH2 group from nonpolar solvent to water. The corresponding contractions or expansions of bilayers upon addition of TM proteins have been observed in biological membranes (Mitra et al. 2004). Our results demonstrate that TM proteins from inner bacterial membranes have more diverse hydrophobic thicknesses (from 22.5 Å to 37 Å) than TM proteins from other membrane types (Table 6). This may be due to a higher elasticity of the inner bacterial membranes facilitated by their heterogeneous lipid composition and lack of cholesterol (Denich et al. 2003). Indeed, the depletion of proteins from Escherichia coli membranes results in a substantial decrease of the thickness (from 27.5 Å to 23.5 Å; Table 6), while the parameters of the apical plasma membranes remained constant (Mitra et al. 2004). Thus, certain types of biological membranes may be relatively rigid due to the presence of cholesterol or dolichol/sphingomyelin (apical plasma membranes), lipids with phytanyl chains (archaebacteria), or lipopolysaccharides (outer membranes of Gram-negative bacteria). In all such cases, one could expect more uniform hydrophobic thicknesses in all proteins from the same type of membranes, as is actually observed in our results.
Implicit solvation model and hydrophobicity scale
An implicit solvation approach was appropriate because the interior of lipid bilayers is fluid. Transfer energy of TM proteins from water to the membrane interior was calculated using atomic solvation parameters developed for the water-decadiene system. Decadiene can serve as a good approximation of the bilayer interior, as follows from the experimental studies of membrane permeability barriers (Xiang and Anderson 1994a, b; Mayer and Anderson 2002), our previous work (Lomize et al. 2004), and results described here. In contrast, the experimental hydrophobic thicknesses of TM proteins were reproduced poorly with the octanol scale (Table 2). This is not surprising, since the octanol solution contains a significant amount of water (∼2.3 M; Leo et al. 1971) and has a high dielectric constant (∼10–20), unlike the hydrocarbon core of lipid bilayers (Benz et al. 1975; Radzicka and Wolfenden 1988). It was noted that water–octanol transfer energies of neutral solutes correlate poorly with membrane permeability barriers (Walter and Gutknecht 1986).
Comparison with other computational methods
Our method, PPM 1.0, represents a significant improvement upon other simple methods, such as optimization of normalized nonpolar accessible surface areas (TMDET; Tusnady et al. 2004), Monte Carlo simulations with atomic solvation parameters (IMPALA; Basyn et al. 2003), or manual assessment (Lee 2003), because it provides results that are in better agreement with experimental data (Table 7). Surprisingly, the manually assessed hydrophobic thicknesses are close to those calculated by PPM 1.0. There are just a few cases of moderate discrepancies between the manually assessed and PPM's values, such as bacteriorhodopsin, bc1 complex, MscL channel, and Ca2+-ATPase. However, the parameters calculated by IMPALA and especially by TMDET (PDB_TM) differ significantly from ours, sometimes by more than 10 Å. It is noteworthy that the thicknesses of all β-barrels from Gram-negative bacteria are very uniform when calculated with PPM 1.0 (22–25 Å), as was previously suggested (Tamm et al. 2004). However, the β-barrel thicknesses calculated with IMPALA vary from 14 Å to 29 Å, while those in the PDB_TM database vary from 17.5 Å to 34 Å. Comparison of our method with the results of MD simulations with explicit lipids is less straightforward, because MD does not produce the hydrophobic thickness as an intrinsic property of a protein. The dynamically averaged tilt angles of TM proteins in MD studies are more or less consistent with our results. For example, the tilt of OmpA β-barrel was 8° ± 6° and 5°–10° when generated by PPM 1.0 and MD (Bond et al. 2002), respectively. However, the agreement was less than perfect for the β-barrel of OmpT and α-helical dimer of glycophorin A. Both proteins are oriented nearly perpendicularly to the membrane plane in our results (τ = 2° ± 5° and τ = 2° ± 10°, respectively), but are more tilted in MD studies (τ = 20° and τ = 8°–10°, respectively) (Petrache et al. 2000; Baaden and Sansom 2004).
Table 7. Comparison of hydrophobic thicknesses (Å) evaluated manually from crystal structures of the corresponding proteins (Man) and calculated using PPM 1.0 or other computational methods (IMPALA and PDB_TM)
Our method has the following advantages: it is computationally fast, sufficiently accurate, and has been extensively verified through comparison with the experimental data that define positions of 24 TM proteins in lipid bilayers. This method can reliably distinguish TM and water-soluble proteins and it gives consistent results for different crystal forms of the same protein. Due to its simplicity, it may be applied to large sets of proteins, unlike the more computationally expensive molecular dynamics simulations. The method was designed to reproduce the spatial positions of proteins in membranes, rather than their binding affinities. This “minimalist” approach can serve as a basis for design of more advanced continuum models, which would incorporate additional energetic terms to describe electrostatic and other interactions in the membrane interfacial area, hydrophobic mismatch, changes of bilayer curvature, and surface and lateral pressure.
Materials and methods
Calculation of transfer energy
A protein was considered as a rigid body that freely floats in the planar hydrocarbon core of a lipid bilayer. The transfer energy of a protein, ΔGtransfer, was calculated as a function of variables d, z0, τ, and φ in a coordinate system whose Z-axis coincides with the membrane normal (Fig. 1):
$$\Delta G_{\mathrm{transfer}}(d, z_0, \tau, \varphi) = \sum_i \mathrm{ASA}_i \, \sigma_i^{W-M} \, f(z_i)$$
where ASAi is the accessible surface area of atom i, σiW−M is the solvation parameter of atom i (transfer energy of the atom from water to membrane interior expressed in kcal/mol per Å²), and f(zi) is the interfacial water concentration profile. ASA were determined using the subroutine SOLVA from NACCESS (obtained from S.J. Hubbard and J.M. Thornton, University College London), with radii of Chothia (1975) and without hydrogens.
Solvation parameters have been derived specifically for lipid bilayers (Table 1) and normalized by the effective concentration of water, which changes gradually along the bilayer normal in a relatively narrow region between the lipid head group region and the hydrocarbon core. We used a sigmoidal water concentration profile as determined in EPR studies on spin-labeled phospholipids (Marsh 2002):
The characteristic distance λ of this profile was chosen as 0.9 Å (see Results).
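The weighted sum over atoms can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the text: the sigmoidal profile is taken to be a logistic function, z0 is read as the half-thickness of the slab and d as a shift of its center along the normal, and the weight is oriented so that it approaches 1 in the hydrocarbon core and 0 in bulk water (atoms in water then contribute nothing). The function names and the example ASA and σ values are hypothetical; only λ = 0.9 Å comes from the text.

```python
import math

LAMBDA = 0.9  # characteristic distance of the profile, in Å (value quoted in the text)

def membrane_fraction(z, z0, d=0.0, lam=LAMBDA):
    """Assumed logistic weight: ~1 inside the slab |z - d| < z0, ~0 in bulk water."""
    x = (abs(z - d) - z0) / lam
    if x > 50.0:  # avoid overflow in exp() for atoms far outside the slab
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

def transfer_energy(atoms, z0, d=0.0, lam=LAMBDA):
    """Sum ASA_i * sigma_i * f(z_i) over atoms given as (z, asa, sigma) tuples.

    z is in Å, asa in Å^2, sigma in kcal/mol per Å^2.
    """
    return sum(asa * sigma * membrane_fraction(z, z0, d, lam)
               for z, asa, sigma in atoms)

# Hypothetical atoms: one nonpolar atom in the core, two polar atoms farther out.
example_atoms = [(0.0, 10.0, -0.02), (14.0, 5.0, 0.05), (25.0, 8.0, 0.05)]
example_energy = transfer_energy(example_atoms, z0=15.0)
```

In a full calculation the z coordinates would depend on τ and φ through a rigid-body rotation of the protein, so the same sum would be re-evaluated for every trial orientation.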
Transfer energy was calculated only for atoms that are exposed to water or lipid at the outer perimeter of TM protein complexes. Contributions from exposed pore-facing atoms were excluded. The “lipid-facing” atoms were automatically defined as those that are not staggered by other protein atoms when “looked at” from the inner longitudinal axis of the TM α-bundle or β-barrel. This axis was calculated as the sum of individual TM secondary structure vectors whose signs were chosen to provide an identical general orientation of all vectors relative to the membrane. The protein was first rotated to superimpose the found longitudinal axis with the Z-axis. Then, each atom i was considered as staggered by atom j if the following conditions were satisfied: (1) |zi – zj| < 2 Å; (2) rj > ri + 2 Å, where ri and rj are the distances from atoms i and j to the Z-axis; and (3) the distance from atom j to a line that is perpendicular to the Z-axis and passes through atom i is <3.6 Å (sum of two van der Waals radii). For large TM complexes, one central axis “to look from” was insufficient; it was translated into different positions around the central point (x = 0, y = 0), and each atom that was not staggered by other atoms while looking from at least one position of the axis was assigned as lipid-facing.
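The three geometric conditions can be sketched in Python for a single viewing axis placed on the Z-axis. This is an illustrative reading of the criteria as literally stated, not the original implementation: the function names are hypothetical, the line in condition (3) is taken to be the radial line through the Z-axis and atom i, and an atom lying exactly on the axis is handled by an ad hoc rule. The translation of the axis used for large complexes is not covered.

```python
import math

def is_staggered(atom_i, atom_j):
    """True if atom i is staggered by atom j (coordinates in Å, axis = Z)."""
    xi, yi, zi = atom_i
    xj, yj, zj = atom_j
    ri = math.hypot(xi, yi)
    rj = math.hypot(xj, yj)
    if abs(zi - zj) >= 2.0:   # condition (1): similar height along Z
        return False
    if rj <= ri + 2.0:        # condition (2): atom j lies farther from the axis
        return False
    # Condition (3): distance from atom j to the radial line through atom i.
    wx, wy, wz = xj, yj, zj - zi
    if ri == 0.0:
        # Assumption: for an atom on the axis, any radial line qualifies,
        # so only the height difference remains.
        dist = abs(wz)
    else:
        along = (wx * xi + wy * yi) / ri
        dist = math.sqrt(max(wx * wx + wy * wy + wz * wz - along * along, 0.0))
    return dist < 3.6

def lipid_facing(atoms):
    """Indices of atoms that are not staggered by any other atom."""
    return [i for i, a in enumerate(atoms)
            if not any(is_staggered(a, b) for j, b in enumerate(atoms) if j != i)]
```

For example, with atoms at (5, 0, 0), (9, 0, 0.5), and (0, 9, 0), the first atom is covered by the second, which lies farther out along nearly the same radial line, while the other two remain lipid-facing.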
Ionization/protonation energies of charged residues were described by the Henderson–Hasselbalch equation (Fersht 1999).
These energies are small at pH = 7: 6.9, 4.7, 4.9, 4.0, and 0.6 kcal/mol for Arg, Lys, Asp, Glu, and His, respectively, as follows from average pKa values of these residues in proteins, i.e., 12.0, 10.4, 3.4, 4.1, and 6.6, respectively (Fersht 1999; Edgcomb and Murphy 2002; Forsyth et al. 2002). The ionization penalties were included as a part of the solvation parameters of N or O atoms in the charged groups, except those that are buried in the protein interior, because interactions and ionization states of the buried charged groups are not expected to change during immersion into the lipid phase. An ionizable group was considered as buried if it had ASA of polar atoms less than 1 Å², or if it formed at least two hydrogen bonds in the protein structure.
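The quoted penalties can be approximated with a short Python sketch. It assumes that the energy is taken as 2.303·RT·|pKa − pH| and that the temperature is about 300 K; neither the exact expression nor the temperature is given in the text, and the function name is hypothetical. Under these assumptions the listed pKa values reproduce the quoted energies to within about 0.1 kcal/mol.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

# Average pKa values in proteins, as quoted in the text.
PKA = {"Arg": 12.0, "Lys": 10.4, "Asp": 3.4, "Glu": 4.1, "His": 6.6}

def neutralization_energy(pka, ph=7.0, temperature=300.0):
    """Assumed form: ln(10) * R * T * |pKa - pH|, in kcal/mol."""
    return math.log(10.0) * R_KCAL * temperature * abs(pka - ph)

energies = {res: neutralization_energy(pka) for res, pka in PKA.items()}
```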
Global energy optimization
The transfer energy surfaces are typically shallow and have numerous local minima. Therefore, we used a reliable deterministic search strategy that included two steps: (1) grid scan to determine a set of low-energy combinations of variables z0, d, φ, and τ (those with relative energies <4 kcal/mol); and (2) local energy minimization by the Davidon-Fletcher-Powell method starting from each low-energy point. Partial derivatives of the energy with respect to z0, d, φ, and τ variables were calculated analytically. The appropriate grid scan steps were chosen empirically as 0.5 Å for z0 and d, and 5° and 2° for φ and τ, respectively. Performance of the global search procedure was verified by starting from different spatial orientations of the complexes to make sure that they converge to essentially the same solution.
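The two-step strategy can be sketched in Python on a toy problem. Everything below is illustrative: the two-variable energy function (in z0 and τ only), its minimum at (15.2, 8.7), the grid ranges, and all function names are hypothetical stand-ins for the real four-variable transfer energy. The local step is a plain Davidon-Fletcher-Powell update of the inverse Hessian combined with a backtracking line search, which is one common way to implement the method and not necessarily the variant used in PPM 1.0.

```python
import math

Z0_MIN, TAU_MIN = 15.2, 8.7  # hypothetical location of the toy global minimum

def energy(x):
    """Toy energy with a rippled z0 dependence (several local minima)."""
    u, v = x[0] - Z0_MIN, x[1] - TAU_MIN
    return 0.5 * u * u + 0.125 * v * v + (1.0 - math.cos(3.0 * u))

def gradient(x):
    """Analytical partial derivatives of the toy energy."""
    u, v = x[0] - Z0_MIN, x[1] - TAU_MIN
    return [u + 3.0 * math.sin(3.0 * u), 0.25 * v]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def dfp_minimize(f, grad, x0, max_iter=200, tol=1e-8):
    n = len(x0)
    x, g = list(x0), grad(x0)
    H = [[float(i == j) for j in range(n)] for i in range(n)]  # inverse Hessian estimate
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            break
        p = [-dot(row, g) for row in H]      # search direction
        slope, fx, t = dot(p, g), f(x), 1.0
        # Backtracking (Armijo) line search along p.
        while t > 1e-12 and f([a + t * b for a, b in zip(x, p)]) > fx + 1e-4 * t * slope:
            t *= 0.5
        x_new = [a + t * b for a, b in zip(x, p)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = dot(s, y)
        if sy > 1e-12:  # update only when the curvature condition holds
            Hy = [dot(row, y) for row in H]
            yHy = dot(y, Hy)
            for i in range(n):
                for j in range(n):
                    H[i][j] += s[i] * s[j] / sy - Hy[i] * Hy[j] / yHy
        x, g = x_new, g_new
    return x, f(x)

def frange(start, stop, step):
    n = int(round((stop - start) / step))
    return [start + k * step for k in range(n + 1)]

def global_search(f, grad, z0_grid, tau_grid, window=4.0):
    """Step 1: grid scan; step 2: local minimization from every low-energy point."""
    scanned = [((z0, tau), f([z0, tau])) for z0 in z0_grid for tau in tau_grid]
    e_min = min(e for _, e in scanned)
    starts = [pt for pt, e in scanned if e - e_min < window]
    refined = [dfp_minimize(f, grad, list(pt)) for pt in starts]
    return min(refined, key=lambda r: r[1]), len(starts)

best, n_starts = global_search(energy, gradient,
                               frange(10.0, 20.0, 0.5), frange(0.0, 30.0, 2.0))
```

Starting the local minimization from every grid point within the 4 kcal/mol window, instead of from the single best grid point, is what guards against the shallow local minima mentioned above.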
After global energy minimization, we determined minimal and maximal values of each variable that could be achieved in an interval of relative energies from 0 to 1 kcal/mol (xmin and xmax). This was done using values of energy calculated and stored during a grid scan and subsequent local energy minimizations. The xmin and xmax parameters provided an amplitude of possible fluctuations of the variable around the position of global minimum (x0). The amplitudes were calculated as ½(xmax − xmin) and then used as ± values. Thus, they represent maximal rather than mean deviations.
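The amplitude calculation amounts to taking half of the range spanned by a variable over all stored points whose relative energy does not exceed 1 kcal/mol. A minimal Python sketch follows; the function name and the sample values are hypothetical.

```python
def fluctuation_amplitude(samples, window=1.0):
    """samples: (value, energy) pairs stored during the scan and minimizations.

    Returns (x0, amplitude): the value at the lowest energy and half of
    (xmax - xmin) over points within `window` kcal/mol of the minimum.
    """
    e_min = min(e for _, e in samples)
    x0 = min(samples, key=lambda s: s[1])[0]
    allowed = [x for x, e in samples if e - e_min <= window]
    return x0, 0.5 * (max(allowed) - min(allowed))

# Hypothetical stored values of a hydrophobic thickness (Å) and relative energies.
stored = [(28.0, 0.0), (27.0, 0.6), (30.0, 0.9), (33.0, 2.5)]
x0, amplitude = fluctuation_amplitude(stored)
```

Because the range is not necessarily symmetric about x0, the reported ± value describes the width of the low-energy interval and not the deviation from x0 in either direction.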
We found that the results of calculations can be significantly affected by the orientation of interfacial side chains that are charged, accessible to the solvent, and situated close to the calculated boundaries. Many such side chains are flexible and poorly resolved (have high B-values), or even omitted in PDB entries. These side chains can easily change orientation upon association with the membrane. At present, there are no proven computational methods for automated side chain packing at the membrane interface, which must be done simultaneously with optimizing the orientation of the entire protein in the bilayer. Therefore, rearrangement of side chain rotamers was done semiautomatically (using PPM 1.0 and molecular modeling modules of QUANTA) as follows. First, the protein orientation was calculated for the PDB structure with original side chain rotamers. Second, we identified all solvent-exposed charged side chains (usually Lys and Arg) that were situated close to the calculated boundaries and could be rotated away from the hydrophobic core without creating interatomic overlaps or reducing the number of hydrogen bonds (χ angles were kept close to ±60° or 180°). Remarkably, the majority of charged side chains were already turned away from the hydrophobic slab in the crystal structures of TM proteins, so only a few of them needed to be adjusted (see Supplemental Material). Third, conformations of the identified side chains were readjusted, all charged residues with missing atoms near the boundaries were reconstructed, and membrane boundaries were recalculated. Fourth, the resulting structure was accepted if it had lower transfer energy than the original structure.
Protein data set
Structures of TM proteins were taken from the current release of PDB (Supplemental Material, Table 5). We temporarily omitted theoretical models, entries with backbone coordinates only, some EM-based structures, NMR models derived from orientational constraints, and nonfunctional structures (such as peptide fragments, monomeric units of protegrin, melittin, alamethicin, and zervamicin, double helices of gramicidin A). A representative structure (usually the one with the highest resolution) was selected for every protein, i.e., we did not consider all mutants, complexes with different ligands, or independent crystal determinations. It was important to conduct all calculations for biologically relevant quaternary structures (biological units) rather than for individual polypeptide chains or non-native oligomers. The quaternary structures were taken from the Protein Quaternary Structure (PQS) database (Henrick and Thornton 1998), except several known non-native complexes such as the antiparallel dimer of rhodopsin. All cofactors, lipids, detergents, ions, and water molecules were excluded from the calculations, except those in photosynthetic complexes where they form a significant part of the protein core (photosynthetic reaction centers, photosystems I and II, and light-harvesting and cytochrome b6f complexes), and in the complex of lipid flippase with lipopolysaccharide (1z2r). In all other complexes, the incorporation of cofactors or crystallized lipids only led to minor changes of the calculated parameters. Some disordered N- or C-terminal segments (5–7 in 1pw4, 62–69 in 1afo, and 1030–1036 in 1oye) and three misplaced N-terminal helices of KvAP channel (1orq) were omitted in final calculations.
All regular secondary structures were first determined by the DSSP algorithm incorporated in QUANTA. Then, all gaps in the middle of TM α-helices and β-strands were eliminated. All α-aneurisms, short fragments of 3₁₀ helices, and β-bulges were interpreted as parts of continuous secondary structures. Each regular secondary structure was represented by a vector passing through its starting and ending points. These points were calculated as average positions of several Cα atoms at the beginning or end of the corresponding α-helix or β-strand (the number of Cα atoms for the averaging was chosen as 7, 4, or 1 depending on the length of the secondary structure).
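The vector representation of a secondary structure, and the sign-aligned sum used to define the longitudinal axis, can be sketched in Python as follows. The function names are hypothetical. Two details are assumptions: the number of averaged Cα atoms is passed in by the caller, because the length thresholds that select 7, 4, or 1 are not given in the text, and the first vector serves as the reference for choosing signs.

```python
import math

def mean_point(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def segment_vector(ca_coords, n_avg):
    """Vector from the mean of the first n_avg Cα atoms to the mean of the last n_avg."""
    start = mean_point(ca_coords[:n_avg])
    end = mean_point(ca_coords[-n_avg:])
    return tuple(e - s for s, e in zip(start, end))

def longitudinal_axis(vectors):
    """Sum of segment vectors, each flipped to point the same way as the first one."""
    ref = vectors[0]
    total = [0.0, 0.0, 0.0]
    for v in vectors:
        sign = 1.0 if sum(a * b for a, b in zip(v, ref)) >= 0.0 else -1.0
        for k in range(3):
            total[k] += sign * v[k]
    return tuple(total)

def tilt_to_z(vec):
    """Angle in degrees between a vector and the Z-axis, ignoring direction."""
    norm = math.sqrt(sum(c * c for c in vec))
    return math.degrees(math.acos(min(1.0, abs(vec[2]) / norm)))

# Idealized straight helix along Z with a 1.5 Å rise per residue (illustrative).
helix = [(0.0, 0.0, 1.5 * k) for k in range(20)]
helix_vec = segment_vector(helix, n_avg=7)
```

Averaging several Cα positions at each end places the end points close to the helix axis, so the resulting vector is less sensitive to the position of any single terminal residue.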
Electronic supplemental material
Supplemental materials include PDB codes of all proteins used in this study, hydrophobic thicknesses and tilt angles calculated for different crystal forms of TM proteins, a list of 24 TM proteins used for testing the method, calculated and ESR-determined water-inaccessible segments of rhodopsin, and a list of flexible side chains whose conformers were adjusted to fit the membrane boundaries.
We thank Drs. Simon Hubbard, Vladimir Maiorov, and Simon Sherman for providing software, Dr. Kim Henrick for discussion of protein quaternary structure, and Drs. Eugene Krissinel and Gabor Tusnady for explanations regarding the SSM server and the PDB_TM database. This work was supported by NIH grant DA003910 (HIM) and an Upjohn Research Award from the College of Pharmacy, University of Michigan (ALL).