Refined structures of mouse P-glycoprotein


  • Jingzhi Li,

    1. Department of Pharmacology and Toxicology, Center for Structural Biology, University of Alabama at Birmingham, Birmingham, Alabama
  • Kimberly F. Jaimes,

    1. Department of Pharmacology and Toxicology, Center for Structural Biology, University of Alabama at Birmingham, Birmingham, Alabama
  • Stephen G. Aller

    Corresponding author
    1. Department of Pharmacology and Toxicology, Center for Structural Biology, University of Alabama at Birmingham, Birmingham, Alabama
    • Correspondence to: Stephen G. Aller, Department of Pharmacology and Toxicology, Center for Structural Biology, University of Alabama at Birmingham, 1025 18th Street South, Birmingham, AL 35205. E-mail:



The recently determined C. elegans P-glycoprotein (Pgp) structure revealed significant deviations compared to the original mouse Pgp structure, which suggested possible misinterpretations in the latter model. To address this concern, we generated an experimental electron density map from single-wavelength anomalous dispersion (SAD) phasing of an original mouse Pgp dataset to 3.8 Å resolution. The map exhibited significantly more detail than the original multiwavelength anomalous dispersion (MAD) map and revealed several regions of the structure that required de novo model building. The improved drug-free structure was refined to 3.8 Å resolution with decreases of 9.4 and 8.1 percentage points in Rwork and Rfree, respectively (Rwork = 21.2%, Rfree = 26.6%), and a significant improvement in protein geometry. The improved mouse Pgp model contains ∼95% of residues in the favorable Ramachandran region compared to only 57% for the original model. The registry of six transmembrane (TM) helices was corrected, revealing amino acid residues involved in drug binding that were previously unrecognized. Registry shifts (rotations and translations) for TM4 and TM5 and the addition of three N-terminal residues were necessary, and were validated with new mercury labeling and anomalous Fourier density. The corrected position of TM4, which forms the frame of a portal for drug entry, had backbone atoms shifted >6 Å from their original positions. The drug translocation pathway of mouse Pgp is 96% identical to human Pgp and is enriched in aromatic residues that likely play a collective role in allowing a high degree of polyspecific substrate recognition.


P-glycoprotein (Pgp; ABCB1) may be the most promiscuous drug pump of any transporter in mammals, recognizing and effluxing possibly thousands of different small molecules from the cell. Understanding the molecular bases for polyspecific drug recognition by Pgp is critical for characterizing a major mechanism of multidrug resistance and for circumventing chemotherapeutic drug efflux. Achieving structural “snapshots” of drug-free and drug-bound conformations of Pgp would help shed light on the mechanisms of polyspecific drug recognition and folding of proteins in the ATP-Binding Cassette superfamily. Aller et al.[1] reported the first structure of Pgp that was captured in a conformation that is open to the cytoplasm including a distinct separation of nucleotide binding domains. This “inward-facing” conformation revealed portals open to the position of the lipid bilayer and a mechanism for extracting small molecules specifically from the inner (cytoplasmic) leaflet.[2-5] Additional inward-facing structures with drugs bound to the internal cavity were presented, revealing amino acid residues involved in drug recognition including overlap with several residues previously identified by biochemical studies. The structure has been proposed to represent a nonphysiological conformation due to the absence of a lipid bilayer and nucleotide.[6, 7] Recent work by Wen et al. demonstrated that, in intact lipid bilayers in the presence or absence of nucleotide, Pgp is capable of opening wider—and closing down further—from the conformation in the original mouse Pgp crystal structure.[8] Recent all-atom molecular dynamics (MD) simulations have also confirmed that inward-facing conformations of Pgp are highly plausible in the presence of an intact bilayer.[8, 9]

The original mouse Pgp structure was determined using multiwavelength anomalous dispersion (MAD) phasing with mercury labeled protein crystals. MAD phasing requires at least two datasets (λ1 and λ2) collected at different x-ray wavelengths for the same rotation sweep through the crystal. A single crystal was used (designated “Crystal2”), and the X-ray beam was positioned on the same location of the crystal for both datasets. In recent years, phasing by single-wavelength anomalous dispersion (SAD) has achieved greater prominence than MAD[10] and now accounts for approximately half of the structures determined by experimental phasing methods.[11] The increasing popularity of SAD phasing is largely due to the requirement of needing to collect only a single dataset from a single crystal, which minimizes radiation damage and diminishes nonisomorphism issues. The consequence of radiation damage was particularly noticeable for the Pgp λ2 dataset, which exhibited reduced intensities and higher error (R-factor) for all resolution shells and very high error for the last two resolution shells (Supporting Information Table S1). The initial experimental electron density map for Crystal2 was produced incorporating data up to 4.5 Å (the maximum resolution for which a solution could be found). Modeling could not be performed de novo due to the limiting resolution of the experimental map, but relied on fitting regions of electron density with sections of the Sav1866 structure[12] mutated to mouse Pgp residues. The original structure was then refined to the full 3.8 Å resolution to a final Rwork/Rfree of 0.306/0.347. The final experimental electron density map was achieved by noncrystallographic symmetry (NCS) averaging of electron density contained within a mask of the two molecules in the asymmetric unit. 
Subsequently, two structures of mouse Pgp bound to stereoisomeric analogs of dendroamide-A,[13] QZ59-RRR and QZ59-SSS, were solved by molecular replacement followed by rigid body and simulated annealing refinement to 4.4 and 4.35 Å, respectively. The positions of the cyclic peptides were verified by visualization of the anomalous Fourier difference electron density arising from the selenium atoms in the cyclic peptides.

Jin et al.[14] recently reported a higher resolution structure (3.4 Å) of PGP-1 from the nematode Caenorhabditis elegans. As with mouse Pgp, C. elegans PGP-1 was also captured in a drug-free inward-facing conformation but is even more open to the cytoplasm. Their detailed comparison of C. elegans PGP-1 to mouse Pgp revealed possible registry shifts of three transmembrane (TM) helices in the structure of the mouse ortholog. Furthermore, arginine-scanning mutagenesis by Loo et al. on human Pgp revealed discrepancies with the mouse Pgp structure at TM3 and TM5.[15] To determine if the original mouse Pgp structural model could be improved, we produced a considerably higher quality experimental electron density map through SAD phasing of a single original mouse Pgp dataset (λ1) to the full 3.8 Å resolution and model-independent density modification (DM). The quality of the map was sufficient to inspect all regions of the structure and to completely rebuild several TM regions de novo. Refinement of the improved structure was achieved with excellent statistics (9.4 and 8.1% decrease in Rwork and Rfree, respectively, from previous values) revealing many significant differences compared to the original structure. The improved structure contains significant corrections to transmembrane helices (TM) 3–5, 8, 9, and 12 as well as corrections to three of four intracellular helices (also called coupling helices; IH1, IH3, and IH4), both “elbow helices,” both TMD-NBD connectors and minor corrections to all extracellular loops, each NBD and IH2. The improved drug-free structure was refined against the original QZ59-RRR and QZ59-SSS datasets to achieve a more accurate snapshot of drug binding. The improved structures reveal important implications for polyspecific drug recognition in the form of a high concentration of aromatic residues in the drug translocation pathway that is most conserved among mammalian orthologs.


Scope of changes to Pgp model

To examine possible structural divergence between mouse Pgp and C. elegans PGP-1, and to ascertain whether any modeling problems occurred in the original mouse Pgp structure, we calculated an experimental electron density map from single-wavelength anomalous dispersion (SAD) phasing of an original dataset (λ1) that suffered far less from radiation damage during data collection than the second dataset (λ2). The experimental map was straightforward to produce using the original mercury sites and NCS restraints (see Materials and methods section). The new SAD-phased map was superior in quality to the original MAD-phased map, revealing details not previously visualized (Fig. 1). The additional detail in the new map prompted a careful inspection of the entire structure. In all cases described below, corrections could be made using de novo modeling. The corrections to the mouse Pgp structure are global: 669 residues in the asymmetric unit (28%) have backbone Cα atoms displaced more than 2 Å from their original positions (Fig. 2 and Supporting Information Fig. S1). Refinement of the structure resulted in a marked improvement in Rwork/Rfree, Ramachandran favorability, bond angles, and clash score, which were overlooked during quality control in the original model building (Table 1).
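
The displacement count quoted above (residues whose Cα moved more than 2 Å) can be illustrated with a minimal sketch. It assumes the two models are already in the same coordinate frame and that Cα coordinates have been extracted into dictionaries keyed by chain and residue number. The function names and the toy coordinates are hypothetical and are not taken from 3G5U or 4M1M.

```python
import math

def ca_displacements(original, improved):
    """Per-residue C-alpha displacement (in Å) for residues present in both models.

    Each argument maps a residue key, e.g. (chain, residue number), to an
    (x, y, z) tuple in Å. Both models are assumed to share one frame.
    """
    shared = sorted(set(original) & set(improved))
    return {key: math.dist(original[key], improved[key]) for key in shared}

def fraction_displaced(displacements, cutoff=2.0):
    """Return (count, fraction) of residues displaced by more than `cutoff` Å."""
    moved = [key for key, d in displacements.items() if d > cutoff]
    return len(moved), len(moved) / len(displacements)

# Toy coordinates invented for illustration only.
original = {("A", 1): (0.0, 0.0, 0.0), ("A", 2): (3.8, 0.0, 0.0), ("A", 3): (7.6, 0.0, 0.0)}
improved = {("A", 1): (0.5, 0.0, 0.0), ("A", 2): (3.8, 3.0, 0.0), ("A", 3): (7.6, 0.0, 6.5)}

disp = ca_displacements(original, improved)
n_moved, frac = fraction_displaced(disp)
print(n_moved, round(frac, 2))
```

In this toy case two of three residues exceed the cutoff. Applied to real coordinates, the same comparison would give a count and percentage of the kind reported in the text.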

Figure 1.

Comparison of experimental electron density maps. Left panels represent the highest resolution map published in Aller et al.[1] produced from multiwavelength anomalous dispersion (MAD) phasing using “Crystal 2.” Right panels represent the new map produced from single-wavelength anomalous dispersion (SAD) phasing. The new map was produced by phasing 12 original mercury sites and the λ1 dataset using phenix.phaser. The final model-free phases were obtained using Wang density modification and noncrystallographic symmetry operators as described in Materials and methods. Bottom panels include the original model (left, 3G5U) and the improved structure (right, 4M1M) in gray sticks. The four panels are shown in wall-eyed (divergent) stereo view. Each map is contoured at 1σ.

Figure 2.

Improved structure of mouse P-glycoprotein. Wall-eyed stereo view of one molecule of the asymmetric unit. The improved structure is displayed and is colored by the distance each Cα has moved from the original position according to the color bar at the top of the figure. Red, orange, and yellow colors represent Cα positions that have changed the most upon correction (up to 8 Å). The changes to the improved mouse Pgp structure are global in scope, with the greatest change to the position of transmembrane helix 4 (TM4). All twelve transmembrane helices (TM1–TM12) and the two nucleotide binding domains (NBD1, NBD2) are labeled.

Table 1. Refinement and Model Statistics for Drug-Free Structures of P-Glycoprotein

| Statistic | Original mouse Pgp (PDB code 3G5U) | Improved mouse Pgp (PDB code 4M1M) | C. elegans PGP-1 (PDB code 4F4C) |
| --- | --- | --- | --- |
| Rwork (CNS) | 0.306 (0.448) [a] | 0.259 [b] (0.353) [a] | 0.250 (0.342) [c] |
| Rfree (CNS) | 0.347 (0.486) [a] | 0.303 [b] (0.402) [a] | 0.283 (0.360) [c] |
| Rwork (Phenix) | | 0.212 (0.282) [a] | |
| Rfree (Phenix) | | 0.266 (0.329) [a] | |
| Clashscore, all atoms [d, e] | 124.1 (5th) | 10.22 (97th) | 7.55 (97th) |
| Poor rotamers [d] | 383 (20%) | 43 (2.21%) | 33 (3.35%) |
| Ramachandran outliers [d] | 408 (17%) | 14 (0.59%) | 6 (0.48%) |
| Ramachandran favored [d] | 1354 (57%) | 2235 (95%) | 1175 (94%) |
| MolProbity score [d–f] | 4.47 (5th) | 2.15 (100th) | 2.19 (100th) |
| Cβ deviations [d] | 6 | 2 | 0 |
| Bad backbone bonds [d] | 1 | 0 | 0 |
| Bad backbone atoms [d] | 92 | 3 | 1 |
| Rmsd bond lengths (Å) | 0.010 | 0.005 | 0.009 |
| Rmsd bond angles (°) | 1.900 | 0.970 | 1.200 |

  a. Highest resolution bin (4.04–3.80 Å) is shown in parentheses.
  b. Model was refined with phenix.refine, but R values were calculated with CNSv1.3, which ignores the anisotropy tensor terms.
  c. Highest resolution bin (3.61–3.40 Å) is shown in parentheses.
  d. (using electron-cloud x-H bond-lengths and no N/Q/H flips).
  e. Percentile score at the time of this publication is shown in parentheses.
  f. MolProbity score combines the clashscore, rotamer, and Ramachandran evaluations into a single value.

Corrections to the mouse Pgp model, fully supported by new experimental electron density, were required as detailed in the accompanying supporting material (Supporting Information Figs. S2–S26). The most notable include: (1) a ∼90° rotation of the N-terminal elbow helix, (2) significant remodeling of intracellular helix 1, (3) a ∼90° rotation (one-residue registry shift) of TM3, (4) a four-residue registry shift (∼360° rotation and one-turn translation) for TM4, (5) remodeling of extracellular loop 2, (6) a ∼90° rotation of TM5, (7) remodeling of the TM6-NBD connector (residues 368–385), (8) remodeling of the second elbow helix, (9) a one-residue registry correction for a portion of TM8 (residues 742–759), (10) a ∼45° rotation of a portion of TM9 (residues 828–849), (11) remodeling of IH4 as a proper helix, (12) remodeling of extracellular loop 6, (13) a rebuild of TM12 residues 968–987 to correct for a gradual and increasing registry error, and (14) a rebuild of TM12-NBD2 connector residues 1009–1028. Other corrections to TM1, TM2, TM6, TM7, TM10, and TM11 from refinement improved the positioning of dozens of side chains into new electron density that had poor or no substantial electron density representation in the original map. The totality of all changes made to the improved mouse Pgp structure resulted in a final Rwork/Rfree of 0.212/0.266 when refined against an original dataset. These Rwork/Rfree values represent a significant reduction (9.4%/8.1%) from the values describing the original published structure (16.6%/15.7% reductions for the last resolution shell).
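
The quoted reductions are absolute differences, in percentage points, between the R values of the original and improved structures in Table 1. The short check below reproduces that arithmetic; the dictionary layout and function name are illustrative only.

```python
# (original 3G5U value, improved 4M1M Phenix value), copied from Table 1.
r_values = {
    "Rwork, overall": (0.306, 0.212),
    "Rfree, overall": (0.347, 0.266),
    "Rwork, last shell": (0.448, 0.282),
    "Rfree, last shell": (0.486, 0.329),
}

def reduction_in_points(before, after):
    """Absolute drop in an R value, expressed in percentage points."""
    return round((before - after) * 100, 1)

reductions = {label: reduction_in_points(b, a) for label, (b, a) in r_values.items()}
for label, points in reductions.items():
    print(f"{label}: {points} percentage points")
```

Note that the original values were calculated with CNS and the improved values with Phenix, as stated in the table footnotes, so the comparison follows the convention used in the text.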

Validation of TM registry shifts

The greatest changes to the mouse Pgp structure include registry shifts for the entire span of TM4 and TM5 (Fig. 2 and Supporting Information Fig. S1), each of which is relatively deficient in large side chains that would otherwise be straightforward to visualize in mid-resolution maps, if ordered. Registry corrections in α-helices have significant impact on side chains because, even if the Cα distance shift for a residue is modest, the changes to the side chain atoms are amplified since they point in a very different direction and shift greater distances. Three additional residues were also added to the N-terminus that were not previously modeled. To validate the changes to TM4, TM5, and the N-terminus, we have replaced selected residues in these regions with cysteine and labeled with mercury to pinpoint their locations. Anomalous difference electron density revealed mercurated cysteines at locations that strongly support the improved structure (Fig. 3). Our results confirm a full registry shift was required for TM4 and a partial shift for TM5, each involved in a crossover motif that communicates information between TM domains and nucleotide binding. TM4 may also play an important role in selecting drugs from the lipid bilayer, since it forms one of the portals open to the inner leaflet.
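
To see why a registry shift moves side chains so much, consider an idealized α-helix with roughly 100° of rotation and 1.5 Å of rise per residue. These are textbook values assumed for this sketch, not parameters measured from the Pgp structures. Under that assumption, the rotation and axial translation implied by an n-residue shift can be estimated as follows.

```python
# Idealized alpha-helix geometry (textbook values, assumed for illustration).
DEGREES_PER_RESIDUE = 100.0   # about 3.6 residues per turn
RISE_PER_RESIDUE = 1.5        # Å along the helix axis

def registry_shift(n_residues):
    """Rotation and axial translation implied by shifting a helix by n residues."""
    total_rotation = n_residues * DEGREES_PER_RESIDUE
    return {
        "total_rotation_deg": total_rotation,
        "net_rotation_deg": total_rotation % 360.0,
        "axial_translation_A": n_residues * RISE_PER_RESIDUE,
    }

for n in (1, 4):
    print(n, registry_shift(n))
```

In this idealization a one-residue shift turns each side chain by about 100°, in line with the ∼90° rotations described for TM3 and TM5. A four-residue shift amounts to slightly more than one full turn and about 6 Å of translation, comparable to the shift reported for TM4.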

Figure 3.

Validation of N-terminus and TM Registry Shifts. Using a Cysless mouse Pgp as template, three additional mutants were constructed, crystallized, and labeled with mercury. (A) A32C mutation near the N-terminus; (B–D) A216C/A244C double mutation; (E–G) A280C/I302C double mutation. The original Pgp structure (PDB 3G5U) is shown in brown ribbons and the improved Pgp structure (PDB 4M1M) is shown in green. Anomalous difference density for all mutants is shown in magenta mesh. (A) View of the N-terminus looking toward the “elbow” helix. Anomalous difference density is contoured at 6 σ. (B) View of entire TM4 for the A216C/A244C double mutant. (C) Zoom view of Ala-216 shown in spheres and labeled for original and improved structures. (D) Zoom view of Ala-244. Anomalous density in panels B–D is contoured at 4 σ. (E) View of entire TM5 for the A280C/I302C double mutant. (F) Zoom view of Ala-280. (G) Zoom view of Ile-302. Anomalous density in panels E–G is contoured at 6 σ.

The drug translocation pathway of P-glycoprotein

The improved mouse Pgp structure agrees well with several reports that examine the location of drug-binding residues[16-20] as well as crosslinking between TM2 and TM11,[21] TM5 and TM8,[22] NBD1 and IH4,[23] and NBD2 and IH2[24] (Fig. 4). The improved drug-free structure allowed straightforward structure refinement against original datasets of mouse Pgp crystals grown in the presence of the cyclic peptides (QZ59-RRR and QZ59-SSS). Refinement resulted in Rwork/Rfree values that were 9.1/7.6 and 8.2/7.7%, respectively, lower than the previously published values (Supporting Information Table S2). We summarize the drug-binding cavity for QZ59-RRR and QZ59-SSS, including residues that extend to the ceiling of the cavity and more residues previously shown to interact with verapamil (Fig. 5). The pathway contains nine aromatic residues that are identical in human and mouse Pgp but are not conserved in C. elegans PGP-1 (Supporting Information Fig. S27).

Figure 4.

Agreement of Improved Mouse Pgp Structure with Biochemical Data. (A) overall structure of mouse Pgp. (B and C) pairs of residues in TMDs that formed disulfide bonds (green line) when mutated to cysteines.[21, 22] (D and E) pairs of residues at NBD-IH interfaces that were crosslinked.[23, 24] (F) Wall-eyed stereo view of the drug transport pathway. Mouse Pgp residues corresponding to drug interacting residues from human Pgp biochemical studies[16-20] are labeled and shown as magenta balls. The non-protected residues Tyr 114, Val 121, Val 129, Cys 133, Gln 191, Ile 293, Gly 296, Ala 297, Leu 300, Ala 304, Ala 307, Phe 310, Ser 725, Phe 755, Ser 762, Gly 770, Leu 829, Phe 833, Ile 836, Ala 837, Gly 840, Thr 841, Ile 843, Ile 844, Ile 845, Ala 867, Ser 939 and Phe 953, are shown in gray.[19, 20]

Figure 5.

Amino acid residues involved in the drug translocation pathway for mouse P-glycoprotein. Residues were selected for the Venn diagram if they are 5 Å or less from the cyclic peptides, QZ59-RRR and QZ59-SSS, in the improved mouse Pgp structures or are residues involved in drug interactions as determined by previous biochemical studies. Only two amino acid residues in the entire drug translocation pathway are nonidentical between mouse and human Pgp (human Pgp residues and numbering shown in parentheses). Footnotes: (a) Cys mutant had reduced ATPase when exposed to MTS-verapamil;[51] (b) interaction also observed with vinblastine and colchicine when mutated to cysteine;[52] (c) interaction also observed with rhodamine when mutated to cysteine;[53] (d) Cys mutant had permanent ATPase in MTS-verapamil.[17]
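
The 5 Å selection criterion in the Figure 5 legend can be sketched as a simple distance filter followed by set operations for the Venn diagram. This is an illustrative sketch only: the function name, residue labels, and coordinates below are invented and do not correspond to any PDB entry or to the authors' actual procedure.

```python
import math

def residues_within(protein_atoms, ligand_atoms, cutoff=5.0):
    """Residue labels with at least one atom within `cutoff` Å of any ligand atom.

    protein_atoms: iterable of (residue_label, (x, y, z)) pairs.
    ligand_atoms: iterable of (x, y, z) tuples.
    """
    ligand_atoms = list(ligand_atoms)
    selected = set()
    for residue, xyz in protein_atoms:
        if residue not in selected and any(
            math.dist(xyz, lig) <= cutoff for lig in ligand_atoms
        ):
            selected.add(residue)
    return selected

# Invented toy data (hypothetical labels and coordinates).
protein_atoms = [
    ("res_A", (0.0, 0.0, 0.0)),
    ("res_A", (1.5, 0.0, 0.0)),
    ("res_B", (4.0, 3.0, 0.0)),
    ("res_C", (12.0, 0.0, 0.0)),
]
ligand_atoms = [(0.0, 4.0, 0.0), (6.0, 0.0, 0.0)]
biochemical = {"res_B", "res_C"}  # stand-in for residues from biochemical studies

structural = residues_within(protein_atoms, ligand_atoms)
print("structure only:", structural - biochemical)
print("both:", structural & biochemical)
print("biochemistry only:", biochemical - structural)
```

The three printed sets correspond to the three regions of a two-circle Venn diagram.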

Overall, the conservation of 46 residues in the drug translocation pathway between mouse and C. elegans PGP-1 is rather low at 13% sequence identity, whereas human- and mouse-Pgp are 96% identical (Supporting Information Table S3). Mammalian orthologs of Pgp also contain no charged residues penetrating into the translocation pathway, which is in contrast to the bacterial lipid flippases (such as MsbA) as well as C. elegans PGP-1 (Supporting Information Fig. S28). MsbA, which has 16 formally charged residues pointed directly toward the substrate translocation pathway, is a well-characterized lipid-A flippase.[25-29] The presentation of charged residues by lipid flippases to the hydrophobic section of the bilayer is reminiscent of α-helical proteins and amphipathic α-helical peptides that exhibit reversible lipid binding in a membrane.[30, 31] Taken together, the composition of the 46 residues of the translocation pathway in Pgp provides structural evidence that a significant amount of drug–protein interaction is likely to be electrostatic,[32] including a strong component of cation–π, CH–π, or π–π recognition.[33, 34] The aromatic residues in the mouse Pgp drug translocation pathway could provide strong affinity for substrates with lower geometric constraints compared to hydrogen bonding, and a consequentially lower entropic penalty of binding drug substrates.

Conserved folding motifs in ABC proteins

In the maltose importer MalK, transmembrane extensions in the form of intracellular helices (IHs) make critical contacts with the nucleotide binding domains (NBDs) that are maintained by salt bridges during conformational cycling.[35, 36] Salt bridging between NBDs and IHs is a widely conserved feature in both bacterial and mammalian ATP-Binding Cassette (ABC) proteins, and in both importers and exporters.[36] A salt bridge (Asp-Arg), engaged in polar contact with a tyrosine forming an interacting triad, appears twice (related by twofold pseudosymmetry) in the high-resolution crystal structure of C. elegans PGP-1. One triad forms a basis of the IH1-IH4-NBD1 interface and the second triad forms the IH2-IH3-NBD2 interface.[14] Our new experimental electron density map showed good contiguous electron density for the corresponding interfaces of mouse Pgp (Supporting Information Fig. S29, A,G), allowing correct modeling of the salt bridging triad in the improved structure [Fig. 6(A–D)]. The interfaces are incorrectly modeled in all recent structures of mouse Pgp,[37] including an attempt to correct the original structure without an accompanying manuscript or supporting data (PDB code 4LSG; Supporting Information Fig. S29, C–F, I–L).

Figure 6.

Interdomain polar contacts in the improved mouse P-glycoprotein structure. (A and B) View of IH1-IH4-NBD1 polar contacts in molecules “A” and “B” of the asymmetric unit, respectively. (C and D) View of IH2-IH3-NBD2 polar contacts in molecule “A” and “B,” respectively. (E) Incorrect modeling of the TM4-TM5 region connecting IH2 in the original mouse Pgp structure. The Glu 252–Arg 272 pair, just above IH1, was modeled more than 11 Å apart in the original structure. (F) A salt bridge between Glu 252 and Arg 272, required for proper folding of Pgp,[38] is shown for one molecule of the asymmetric unit in the improved structure.

An additional salt bridge just above IH2 (E256-R276) connecting intracellular extensions of TM4 and TM5 was shown to be critical for bioprocessing and folding of human Pgp in recent biochemical experiments.[38] Charge reversal at either site (individually) blocked mature glycosylation, but the double charge reversal mutant (E256R/R276E) allowed protein maturation and restored verapamil-stimulated ATPase activity to wildtype levels. This region had substantial modeling errors in the original structure [Fig. 6(E)]. Close inspection of the new experimental electron density map allowed remodeling of the region as mostly helical, placing the corresponding pair in mouse Pgp (E252-R272) in near proximity to form a salt bridge [Fig. 6(F)].

Interfacial residues between NBD1 and ICL4 (IH4) share some conservation among all mammalian ABC proteins, including the only channel protein in the superfamily, CFTR/ABCC7. The NBD1-ICL4 interface contains an aromatic residue (either Tyr or Phe) that is invariant in all five mammalian ABC subfamilies (A, B, C, D, and G). A deletion of the interfacial aromatic residue in CFTR (Phe 508) results in misfolded, mislocalized, and degraded protein. Loo et al. demonstrated that removing the corresponding interfacial aromatic residue in human Pgp (Tyr 490) likewise resulted in protein misfolding, cellular mislocalization, and degradation.[39] The interface is formed by helical extensions of TM10 and TM11 as well as IH4 (Fig. 7). Residues involved in forming this important interface were incorrectly modeled in the previous structure of mouse Pgp in the form of a registry error of IH4. The corrected interface has good experimental electron density (Fig. 7), and sheds light on a potential folding mechanism conserved in mammalian ABC proteins. The interfacial aromatic residue in mouse Pgp, Tyr 486, is in close proximity to IH4 residues Arg 908, Lys 911, Phe 912, and Met 915. Mouse Pgp Arg 908 and Phe 912 are identical to human CFTR Arg 1070 and Phe 1074, and mouse Pgp Met 915 is similar to human CFTR Leu 1077. Biochemical studies of CFTR suggest that Pgp and CFTR have similar contacts in this region. When Loo and Clarke mutated Val 510 to aspartate, CFTR-F508del misfolding could be partially corrected.[40] They showed that the double charge reversal mutation (V510R/R1070D) could also partly correct the F508del defect, probably by forming a salt bridge. Thus, it would appear that deficient interaction energy at the interface as a result of the F508del mutation of CFTR can be partly compensated for by augmenting interaction energy nearby. Our improved structure shows that mouse Pgp Arg 908 is in very close proximity to position 488 (Fig. 7), which is the comparable position to CFTR Val 510, and confirms that these two positions are sufficiently close for engineering a salt bridge. Close inspection of the interface in our improved structure suggests that a prefolded state of Pgp, and by inference CFTR, would seem to include cation–π interactions between Arg 908 or Lys 911 and Tyr 486 as NBD1 and ICL4 approach each other during folding. When contact between NBD1 and ICL4 is achieved as visualized in the folded, correct structure, Arg 908 and Lys 911 turn to the side and interact with other residues as they avoid steric clash with NBD1. If this folding dynamic is confirmed, then small molecules that mimic similar bridging between NBD1 and ICL4 may be effective at assisting the folding of the major CF mutant, CFTR-F508del.

Figure 7.

Mouse Pgp NBD1-IH4 Interface. Experimental electron density from SAD phasing is shown in blue mesh. The interface is formed by key interactions with Tyr 486 on NBD1 and surrounding residues from extensions of TM10, TM11, and IH4. The interaction energy of the folded interface is largely derived from van der Waals contact, π–π interactions, and polar contacts. The interface is reasonably well conserved between mouse Pgp (mPgp) and human CFTR (mPgp Phe 476 corresponds to hCFTR Met 498, mPgp Tyr 486 corresponds to hCFTR Phe 508, mPgp Thr 902 corresponds to hCFTR Thr 1064, mPgp Val 903 corresponds to hCFTR Leu 1065, mPgp Arg 908 corresponds to hCFTR Arg 1070, mPgp Phe 912 corresponds to hCFTR Phe 1074, and mPgp Met 915 corresponds to hCFTR Leu 1077). The figure is shown in wall-eyed stereo.


The recent 3.4 Å structure of C. elegans PGP-1 and the careful comparisons by Jin et al.[14] to the original mouse Pgp structure revealed problems in the mouse Pgp model in TMs 3, 4, and 5. The discovery prompted a thorough inspection and complete rebuild of the mouse Pgp structure that required global corrections to the model. The corrections to the mouse Pgp structure presented here were only possible after producing an improved experimental electron density map. What criteria allow us to conclude that our new map is improved? The “phasing power” statistic is useful for older algorithms that employ least squares minimization such as PHASES (Furey, 1997), which was used in the original MAD phasing solution of mouse Pgp. The phasing power reported for the 3.8 Å mouse Pgp MAD solution was rather low at 1.32 compared to the authors' original 4.5 Å map from “Crystal 1” that had a phasing power of 3.1 (Aller et al., Science 2009, Supporting Information Table S1). Phasing power, however, is not useful for the more recent maximum likelihood based routines such as PHASER[41] used in this work. Another statistic, figure of merit (FOM), must be used with care because values are not directly comparable for MAD- versus SAD-phasing solutions. In general, for a MAD dataset, a FOM of 0.5 is acceptable, 0.6 is good, and anything above 0.7 is very good. For a SAD dataset, a FOM of 0.3 is acceptable, 0.4 is good, and anything above 0.5 is very good. The mean FOM for the original 3.8 Å MAD solution was reported to be 0.434 (Aller et al., Science 2009, Supporting Information Table S1), which falls short of what is generally considered to be acceptable. This likely explains the lack of detail in the original experimental map. The mean FOM of our SAD phasing solution is 0.336 prior to any DM, and is in the acceptable range.
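
The FOM rules of thumb above can be written as a small lookup, shown here only to make the MAD versus SAD comparison explicit. The thresholds are those stated in the text; the function name and the handling of values that fall between the stated cutoffs are assumptions of this sketch.

```python
# Rule-of-thumb thresholds from the text: (acceptable, good, very good).
FOM_THRESHOLDS = {
    "MAD": (0.5, 0.6, 0.7),
    "SAD": (0.3, 0.4, 0.5),
}

def rate_fom(fom, method):
    """Classify a mean figure of merit for a MAD or SAD phasing solution."""
    acceptable, good, very_good = FOM_THRESHOLDS[method]
    if fom > very_good:
        return "very good"
    if fom >= good:
        return "good"
    if fom >= acceptable:
        return "acceptable"
    return "below acceptable"

print(rate_fom(0.434, "MAD"))  # original 3.8 Å MAD solution
print(rate_fom(0.336, "SAD"))  # new SAD solution before density modification
```

The same numerical FOM therefore means different things for the two methods, which is why the lower SAD value of 0.336 is acceptable while the higher MAD value of 0.434 is not.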

More appropriate phasing statistics have recently been described. Terwilliger et al. performed a comprehensive analysis for assessing the quality of experimental maps using 246 structures recently determined by the Joint Center for Structural Genomics (JCSG). These authors showed that the single best measure for evaluating the quality of experimental maps prior to density modification (DM) is the Bayesian estimator based on skewness (skew) of the density values. The correlation between estimated and actual map quality for this single statistic was an impressive 90%. The combination of two measures, skew and the correlation of local rms density (r2RMS), gave a further improvement in estimating map quality with an overall correlation coefficient of 92%. The new SAD experimental map had a skew value of 0.11, which is expected to be positive for good experimental maps that should lack negative electron density. Furthermore, our SAD map yielded a r2RMS = 0.85, which scores in the top 72nd percentile (986/1359) among the 1359 JCSG maps (includes all solutions for all structures). The correlation of local rms density (r2RMS) for our phasing solution may be even more significant considering that all of the JCSG datasets were of much higher resolution (a 2.5 Å cutoff was used to produce all JCSG maps).
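
As a rough illustration of why positive skew indicates a good map, the sketch below computes the ordinary sample skewness (third central moment divided by the cube of the standard deviation) of a list of density values. This is the generic statistical definition, used here as an assumption; the exact estimator used by Terwilliger et al. may differ, and the density values are invented.

```python
def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Invented density samples: flat solvent plus a few strong positive peaks,
# versus symmetric noise with as much negative as positive density.
peaked_map = [0.0] * 8 + [4.0, 6.0]
noise_map = [-2.0, -1.0, 0.0, 1.0, 2.0]

print(round(skewness(peaked_map), 3))
print(skewness(noise_map))
```

A map dominated by flat solvent and sharp positive peaks gives a positive skew, whereas noise that is symmetric about its mean gives a skew of zero.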

To produce an improved unbiased experimental electron density map, DM is typically performed using one of several model-independent methods to determine protein-solvent boundaries. The Wang method of DM[42] was clearly preferred in the construction of the new SAD map resulting in a final overall Rfactor = 0.238 compared to either the histograms method (Rfactor = 0.307) or classic method (Rfactor = 0.603). This is consistent with the observed advantage of the Wang method in reducing the influence on the overall DM process of strong anomalous scatterers such as the mercury labeling we employed for phasing. In assessing quality of experimental maps however, there is no substitute for close visual inspection of electron density. Our unbiased experimental SAD map following DM using the Wang method and implementing NCS found in the original published structure revealed side chain details of the protein not previously visualized (Fig. 1). Real space cross-correlation analysis (Supporting Information Table S4) confirmed that our improved Pgp model had the highest overall correlation to the new SAD experimental map compared to the original MAD map. Furthermore, the improved structure had a significantly higher number of residues covered by acceptable electron density.

Considering the extent of validation offered in the Supporting Information accompanying the original mouse Pgp publication,[1] how did errors carry over into the original structure? The authors presented multiple composite Fo-Fc maps as well as a Sigma-A weighted 2Fo-Fc composite simulated annealing omit map to validate their structure, because details were absent in the original experimental map. Caution should be taken with such a strategy because all difference maps rely on model phases, which can introduce model bias. Specifically, errors likely carried over into the difference maps because once the original atomic model was refined, the positions and other parameters describing the majority of correctly placed atoms were subjected to minor adjustments during refinement to compensate for the minority of incorrectly placed atoms. Consequently, even when the incorrectly placed atoms were removed from the model before the calculation of phases (when using omit maps), a memory of their positions remained and the resulting difference maps retained incorrect features. To avoid such errors in the future, we consider it critical to achieve a high-quality experimental electron density map yielding good phasing statistics and exhibiting features that are consistent with the claims of the resolution of the refined structure. Relatedly, Ward et al.[37] recently published new structures of mouse Pgp in the absence or presence of a camel antibody. A low-resolution (4.5 Å) experimental map was produced revealing a conformation even more open to the cytoplasm. The structures do appear to contain registry corrections to TMs 3–5, likely prompted by the more recent C. elegans PGP-1 structure, but other discrepancies still persist in regions not validated by their cysteine labeling, most notably: (1) intracellular helix 1 (IH1) is ∼90° out of register; (2) polar contacts at NBD-IH interfaces are incorrectly modeled; (3) a segment of TM8 (residues 742–759) remains ∼90° out of register; (4) a portion of TM9 (residues 828–849) remains ∼45° out of register; (5) a gradual and increasing registry error remains for TM12 residues 968–987 as shown by our new anomalous data (see Supporting Information Fig. S30); and (6) the registry error of the N-terminal elbow helix (EH1) by ∼90° prevented correct modeling of the salt bridge between Arg40 and Asp366 (EH1 and TM6). This pair clearly forms a bridge in mouse Pgp as revealed by new experimental density (Supporting Information Fig. S2, C). Curiously, the residues of the corresponding conserved pair in C. elegans PGP-1, Arg67 and Asp394, are too far apart to form a salt bridge, whereas the pseudosymmetry-related pairs connecting EH2 and TM12 in both PGP-1 (Glu742-Arg1056) and mouse Pgp (Arg695-Glu1009) are sufficiently close in the structures to form salt bridges. We note that all of the above modeling errors are also present in a recent attempt to correct the original mouse Pgp structure with a released PDB entry (4LSG), which has Rwork/Rfree values that are actually worse than the original values. It would appear, therefore, that modeling errors have propagated in all mouse Pgp structures to date, originating from errors in the original structure due to insufficient detail in the experimental map. This problem is reminiscent of the retractions surrounding MsbA structures,[43, 44] in which swapping columns of anomalous data does not explain the root cause of the continuing propagation of errors in the various structures.[45]

The 3.4 Å structure of C. elegans PGP-1 revealed an inward-facing conformation that is even more open to the cytoplasm (the nucleotide binding domains are separated by an additional 16 Å) compared to mouse Pgp. Thus, a high degree of flexibility is evolutionarily conserved in Pgp, allowing the transporter to open to the inner leaflet for extracting small molecules from the membrane. It is noteworthy that rather hydrophilic substrates might be capable of gaining access to the drug translocation pathway directly from the cytoplasm. The 16 Å spacing difference between the original mouse Pgp and C. elegans PGP-1 structures makes particular regions difficult to compare. In addition, the C. elegans genome contains at least 20 orthologs and isoforms of ABCB-like proteins, compared to two ABCB1 orthologs in mouse (ABCB1A and ABCB1B), and only one in human. C. elegans PGP-1 has 46% identity to human Pgp and 43% identity to mouse Pgp (ABCB1A), whereas human ABCB1 and mouse ABCB1A are 87% identical. The lack of charged residues in the drug translocation pathway of mammalian Pgp compared to nematode Pgp or bacterial lipid flippases suggests evolution of differing functions. The charged residues of a lipid flippase could serve to lower the relatively large energy barrier of pulling phospholipid headgroups, which bear formal charge, into the translocation pathway. Mammalian Pgp seems to prefer aromatic residues in the drug translocation pathway, possibly to maximize polyspecific recognition and export of neutral/cationic small molecules that are rich in sp2 hybridization. Relatedly, Jin et al.[14] demonstrated that, of the many drug compounds that are known to stimulate human Pgp, only a small subset could stimulate the ATPase activity of C. elegans PGP-1, which contains fewer aromatic residues in the drug translocation pathway compared to human Pgp. Thus, it could be that C. elegans achieves polyspecificity by employing a greater number of genomic orthologs of Pgp that are each relatively more specific for certain substrates compared to the single ortholog in humans.
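For readers unfamiliar with how pairwise identity percentages such as those quoted above are obtained, the following is a minimal sketch that counts identical positions in a pre-computed pairwise alignment. The function name `percent_identity`, the choice to divide by the number of columns in which neither sequence has a gap, and the toy sequences are all assumptions for illustration; other denominators are in common use, and the source does not state which convention was applied.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences, not taken from any Pgp ortholog).
toy_a = "MKT-AYL"
toy_b = "MRTGA-L"
print(f"{percent_identity(toy_a, toy_b):.1f}% identical")
```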

Phylogenetic analysis reveals a large jump in divergence of nematode Pgp compared to their vertebrate counterparts (Supporting Information Fig. S31). We therefore produced a homology model of human Pgp based on the improved mouse structure as the more appropriate template in order to minimize errors arising from evolutionary divergence. Our human Pgp model exhibits good protein geometry and clash scores (Supporting Information Table S5), and should also provide a useful template for MD simulations that probe conformational change, drug binding, and the “trigger” mechanism toward NBD-sandwich formation.

In conclusion, an experimental electron density map using SAD phasing revealed considerably more detail compared to the original MAD-phased map. The new experimental map allowed extensive improvements to the mouse Pgp structural model, revealing new details in the overall folding of mammalian ATP-Binding Cassette proteins as well as polyspecific drug recognition for a multidrug resistance transporter. The higher degree of polyspecific drug recognition characteristic of mammalian Pgp orthologs may be a direct result of evolving a high concentration of aromatic amino acid residues in the drug translocation pathway. Importantly, recent attempts to solve structures of mouse Pgp in the absence of a high-quality experimental map have led to the propagation of modeling errors originating from errors present in the original mouse Pgp structure. Heavy metal labeling to produce site-specific anomalous Fourier density is useful for validating individual cysteine positions, but the process is not a substitute for a high-quality experimental map that displays sufficient detail to guide critical initial steps of model building.

Materials and Methods

SAD phasing, structure rebuilding, and refinement

The new experimental electron density map was produced using an original dataset (“Dataset #1”, Supporting Information Table S1) and 12 original mercury sites with phenix.phaser.[41, 46] The minimum distance between sites was fixed at 3 Å, and the Z-score cutoff for accepting peaks as new atoms was set to 8.0 (no new sites were found). SAD phasing was conducted to the full 3.8 Å resolution of the dataset. Model-free phases were subjected to Wang-based DM[42] and twofold NCS restraints (Supporting Information Table S6) using RESOLVE[47] within PHENIX.[46] NCS operators were determined from the original mouse Pgp structure (3G5U) using a phenix.python script. A temperature factor of −38 Å² was applied to sharpen the electron density map (see Supporting Information Table S4). Hendrickson–Lattman map coefficients were explicitly written out to facilitate conversion to CCP4-style maps for figure rendering within PyMol.[48]
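To give a sense of the magnitude of the sharpening described above, the following sketch evaluates the standard isotropic B-factor scale, exp(−B/(4d²)), applied to structure factor amplitudes at resolution d. It assumes the usual convention in which a negative B boosts high-resolution terms; the function name `sharpening_scale` is hypothetical, and the sketch is not a reproduction of how RESOLVE or PHENIX applies the factor internally.

```python
import math


def sharpening_scale(b_sharpen, d_spacing):
    """Amplitude multiplier exp(-B / (4 d^2)) for B in Å^2 and d in Å.

    Uses sin(theta)/lambda = 1 / (2 d), so B * (sin(theta)/lambda)^2 = B / (4 d^2).
    """
    if d_spacing <= 0.0:
        raise ValueError("d-spacing must be positive")
    return math.exp(-b_sharpen / (4.0 * d_spacing ** 2))


# B = -38 Å^2 is the value quoted in the text; the d-spacings are examples.
for d in (10.0, 6.0, 3.8):
    print(f"d = {d:4.1f} Å: amplitude scale = {sharpening_scale(-38.0, d):.2f}")
```

Under this convention, amplitudes at the 3.8 Å resolution limit are scaled up by a factor of roughly 1.9, whereas low-resolution terms are left nearly unchanged, which is what accentuates side chain features in the map.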

An anomalous difference Fourier map produced from a near-final refined improved model and “Dataset #1” revealed a 13th mercury site, which was added prior to a final round of refinement. We manually inspected the goodness of fit in experimental electron density for all residues of each molecule in the asymmetric unit. Most of the corrections could be made by manually dragging the peptide chain into density or using the “Real Space Refine Zone” or “Auto Fit Rotamer” options within Coot.[49] For each of TMs 3, 4, and 5, the entire helix was deleted and rebuilt as regularized helical sections using the “Place Helix Here” function within Coot, followed by manual editing of the positions of main chain and side chain atoms to fit within density. Since “Dataset #1” (a = 97.72, b = 115.68, and c = 378.98 Å) is virtually isomorphous with the original published unit cell dimensions (a = 97.54, b = 115.43, and c = 378.86 Å), we chose to refine against the original published structure factors, which preserved all original reflections flagged for the test set. Refinement was performed with phenix.refine against the maximum likelihood target using NCS restraints and secondary structure restraints, group and individual B-factor refinement, and restraints on NCS-related B-factors. Refinement was always performed using TLS,[50] as defined by the following: chain “A” (eight TLS groups): residues 30–99, 100–366, 367–592, 593–736, 737–884, 885–960, 961–1079, and 1080–1271; and chain “B” (seven TLS groups): residues 31–208, 209–317, 318–592, 593–736, 737–967, 968–1044, and 1045–1271.
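The statement that the two unit cells are virtually isomorphous can be checked with simple arithmetic. The sketch below computes the fractional difference along each cell edge from the dimensions quoted above; the function name `relative_cell_differences` is hypothetical, and no particular isomorphism threshold is implied by the source.

```python
def relative_cell_differences(cell_a, cell_b):
    """Fractional difference |x - y| / y for each pair of cell edges."""
    if len(cell_a) != len(cell_b):
        raise ValueError("cells must have the same number of edges")
    return tuple(abs(x - y) / y for x, y in zip(cell_a, cell_b))


# Cell edges in Å, as quoted in the text.
dataset_1 = (97.72, 115.68, 378.98)
published = (97.54, 115.43, 378.86)

diffs = relative_cell_differences(dataset_1, published)
for axis, fraction in zip("abc", diffs):
    print(f"{axis}: {100.0 * fraction:.2f}% difference")
```

All three edges differ by less than 0.25%, consistent with the decision to refine against the original published structure factors.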

Refinement of cyclic peptide bound structures

To achieve improved structures of cyclic peptides bound to mouse Pgp, the final drug-free mouse Pgp structure was first refined into the 4.4 Å QZ59-RRR and 4.35 Å QZ59-SSS datasets without drug (rigid body, TLS, NCS, and group and individual B-factors). Anomalous difference Fourier maps were then calculated (CNSv1.3) from model phases of the refined structures to pinpoint the location of the selenium atoms in the datasets. Regularized molecules of the RRR- and SSS-cyclic peptides were then manually docked into the structures (in both possible orientations), placing the selenium atoms in anomalous difference Fourier density. The docked structures were then subjected to another round of refinement using cyclic peptide parameter files, with the exception that rigid body refinement was omitted. To examine the orientation of the drugs, refined structures were used to calculate Fo-Fc difference maps, and the vicinity of the drug was carefully inspected. As was originally discovered, the “upper” QZ59-SSS molecule was significantly disordered, since Fo-Fc density was absent for two vertices of the triangular drug molecule. These atoms were subsequently removed from the final version of the structure.

Cysteine labeling and anomalous Fourier

Cysteine mutants were introduced on a cysless mouse Pgp construct and subjected to refinement against their respective datasets (Supporting Information Table S7) as described.[1]


Figures were generated using PyMol[48] and Adobe Photoshop 7.0. Figure 2 was created using a Python script downloaded from the PyMol script repository hosted by the Molecular Modeling and Crystallographic Computing Facility at Queen's University. This script calculated the pairwise rmsd for each backbone atom in the original Pgp structure (3G5U) and the improved structure (4M1M), and replaced the B-factor array with the rmsd values. The figure was colored by rmsd using the standard B-factor coloring mode within PyMol.
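The idea behind this coloring scheme can be sketched without PyMol as follows. The example is not the downloaded script; it assumes that the two structures are already superposed and that atoms are supplied as plain dictionaries keyed by (chain, residue number, atom name), and the names `backbone_displacements` and `with_displacement_as_bfactor` are hypothetical.

```python
import math

BACKBONE_ATOMS = ("N", "CA", "C", "O")


def backbone_displacements(original, improved):
    """Distance (Å) between matching backbone atoms of two superposed models.

    Both arguments map (chain, residue_number, atom_name) -> (x, y, z).
    Atoms absent from either model are skipped.
    """
    displacements = {}
    for key, xyz in original.items():
        atom_name = key[2]
        if atom_name not in BACKBONE_ATOMS or key not in improved:
            continue
        displacements[key] = math.dist(xyz, improved[key])
    return displacements


def with_displacement_as_bfactor(bfactors, displacements, default=0.0):
    """Return a new B-factor mapping in which each value is replaced by the
    per-atom displacement; atoms without a match receive `default`."""
    return {key: displacements.get(key, default) for key in bfactors}


# Toy coordinates (hypothetical, not taken from 3G5U or 4M1M).
model_old = {("A", 40, "CA"): (0.0, 0.0, 0.0), ("A", 40, "CB"): (1.0, 0.0, 0.0)}
model_new = {("A", 40, "CA"): (3.0, 4.0, 0.0), ("A", 40, "CB"): (1.0, 1.0, 0.0)}
shifts = backbone_displacements(model_old, model_new)
recolored = with_displacement_as_bfactor({key: 50.0 for key in model_old}, shifts)
```

Once the displacement values occupy the B-factor column, any viewer that colors by B-factor will display the magnitude of the backbone shifts directly on the structure.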


The improved drug-free and cyclic-peptide bound structures of mouse Pgp have been deposited to the Protein Data Bank with accession codes 4M1M, 4M2S, and 4M2T.


The authors declare no competing financial interests or conflicts of interest. The authors thank Dr Thomas Terwilliger for suggesting the cross-correlation analysis.