In contrast to implicit solvent models, explicit-solvent, all-atom molecular dynamics (MD) simulations do not rely on extreme simplifying assumptions and so, in principle, provide the level of detail necessary to describe membrane anisotropy (Figure 1). The development of MD-based models of small molecule membrane permeability through explicit biological lipid bilayers was initiated with the pioneering work of Bassolino-Klimas et al. (33), who simulated benzene diffusion through a dimyristoylphosphatidylcholine (DMPC) lipid bilayer. Complementary efforts by Bemporad and Essex showed that these methods were able to correctly rank-order the permeability rates for a set of eight organic solute molecules, but that the absolute permeabilities were an order of magnitude larger than experiment, which the authors attributed to insufficient sampling, ligand parameters poorly suited to the bilayer, and subtle issues surrounding ensemble choice (34). Since then, a number of studies have been carried out using similar approaches, most recently (34–39), and we explore these methods in more methodological detail below.
To determine whether the more detailed models are sufficiently accurate to capture the origins underlying permeability differences among a diverse small molecule set, exploratory molecular dynamics umbrella sampling simulations (40) were used to estimate the passive permeability of the 11 small molecules shown in Figure 2. To carry out the MD simulations, DMPC bilayers were modeled with 36 lipids per leaflet and a 20-Å pad of water was positioned on either side of the membrane. TIP3P water (41) and CHARMM 36 parameters (42) were used. Following the minimization, 10 ns of NPT equilibration, with an anisotropic barostat at 1 atm and a Langevin thermostat at 310.15 K, were performed in the NAMD molecular dynamics program (43). A more detailed description of the simulation protocol can be found in the Supporting information. The ‘health’ of the lipid bilayer models was determined by comparison with available experimental parameters, such as area per lipid headgroup, tail order parameters, and surface tension (data not shown).
Thermodynamic influences: constructing potentials of mean force
Formally, the potential of mean force at z, or W(z), is the free energy at z relative to some arbitrary reference state, and it is directly connected to the local probability density, or ρ(z), through eqn 4. The PMF gives rise to a mean force that causes the basins of the PMF to be more populated than other regions. It is this mean force, and the underlying PMF, which result in the equilibrium tendency of a molecule to partition to various depths within the lipid bilayer, facilitating the transport process. Additionally, coupling the PMF with an atomic-level description of the entire system entails the spatial entropic dependence that is neglected in the implicit solvent models, resulting in a more physically accurate model of the passive transport process.
$$W(z) = -RT \ln \rho(z) + C \qquad (4)$$
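As an illustration of the relationship between the PMF and the local probability density, the following minimal sketch estimates W(z) from a histogram of solute positions. It assumes that eqn 4 takes the standard Boltzmann-inversion form W(z) = −RT ln ρ(z) + C; the function name `pmf_from_positions`, the choice of kcal/mol units, and the use of the last bin as the reference state are assumptions made for this example and are not taken from the protocol described here.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def pmf_from_positions(z_samples, bin_edges, temperature=310.15, ref_bin=-1):
    """Estimate W(z) = -RT ln rho(z), shifted to zero in a reference bin.

    z_samples : solute positions along the membrane normal (unbiased sampling)
    bin_edges : histogram bin edges along z
    ref_bin   : index of the bin used as the arbitrary reference state
                (assumed here to be the last bin, e.g. bulk water)
    """
    counts, _ = np.histogram(z_samples, bins=bin_edges)
    widths = np.diff(bin_edges)
    rho = counts / (counts.sum() * widths)  # normalized probability density
    with np.errstate(divide="ignore"):
        w = -R_KCAL * temperature * np.log(rho)  # empty bins give +inf
    return w - w[ref_bin]
```

Direct inversion of this kind is only practical when every region of interest is visited in an unbiased simulation, which is rarely the case for a solute crossing a bilayer.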
Umbrella sampling simulations are the most established technique to determine PMF values along a reaction coordinate in a biased fashion. Umbrella sampling is a straightforward approach that confines the solute of interest to discretized sections along the reaction coordinate, a.k.a. ‘windows,’ using a harmonic biasing potential. Done properly, this stratification procedure enables effective and sufficient sampling over every region along the reaction coordinate and results in a set of overlapping, biased probability distributions. Using either the weighted histogram analysis method (WHAM) (44) or the multistate Bennett acceptance ratio (MBAR) method (45), these distributions can be unbiased and combined to minimize statistical error, resulting in an unbiased probability distribution, which gives the PMF according to eqn 4.
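The way in which overlapping, biased histograms are combined can be illustrated with a minimal one-dimensional WHAM iteration. This is a sketch for illustration only and is not the implementation used in this work: it assumes a single force constant shared by all windows, a bias of the form ½k(z − z₀)² evaluated at bin centers, and it omits error analysis. The function name `wham_1d` and its arguments are hypothetical.

```python
import numpy as np

def wham_1d(samples, centers, k_spring, bin_edges, rt, n_iter=5000, tol=1e-10):
    """Combine harmonically biased windows into one unbiased PMF.

    samples  : list of 1D arrays, positions sampled in each window
    centers  : minima of the harmonic biases, one per window
    k_spring : force constant shared by all windows (assumption)
    rt       : thermal energy RT in the same units as k_spring * z**2
    """
    z = 0.5 * (bin_edges[:-1] + bin_edges[1:])  # bin centers
    hist = np.array([np.histogram(s, bins=bin_edges)[0] for s in samples],
                    dtype=float)
    n_tot = hist.sum(axis=1)  # samples per window inside the binned range
    offsets = z[None, :] - np.asarray(centers, dtype=float)[:, None]
    bias = np.exp(-0.5 * k_spring * offsets**2 / rt)  # Boltzmann factor of bias
    numer = hist.sum(axis=0)
    f = np.ones(len(samples))  # exp(F_i / RT), window free energy factors
    for _ in range(n_iter):
        denom = (n_tot[:, None] * f[:, None] * bias).sum(axis=0)
        p = numer / denom
        p /= p.sum()  # fix the arbitrary normalization
        f_new = 1.0 / (bias * p[None, :]).sum(axis=1)
        converged = np.max(np.abs(np.log(f_new / f))) < tol
        f = f_new
        if converged:
            break
    with np.errstate(divide="ignore"):
        w = -rt * np.log(p)  # eqn 4 applied to the unbiased distribution
    return z, w - w.min()
```

The self-consistent loop alternates between the unbiased distribution and the per-window normalization factors; poor overlap between neighboring windows typically shows up as slow convergence of this loop.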
Following membrane equilibration, umbrella simulations for each of the molecules shown in Figure 2 were prepared. Small molecule parameters were assigned from the CHARMM general force field (CGenFF) (46) using the MATCH program (47). Small molecules were oriented along each of their three principal axes and displaced in 32 increments of 1 Å from the bilayer center to the bulk solvent, along one leaflet, resulting in a total of 96 umbrella windows. Umbrella simulations were carried out for 3 ns in each window; simulations of propranolol, terbutaline, and verapamil were extended to 10 ns in each window to test convergence properties. The PMF along the membrane normal was estimated independently using the WHAM equations (44) for each of the three small molecule orientations. Averages were calculated across the three simulations, and standard errors were estimated using the standard deviations determined across the three simulations. A more detailed description of the umbrella simulation protocol and standard error calculation can be found in the Supporting information.
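The window bookkeeping and the averaging across orientations can be sketched as follows. The exact window positions and the standard error formula are given in the Supporting information, which is not reproduced here, so two assumptions are made in this example: the 32 windows are placed at 0 to 31 Å from the bilayer center, and the standard error is taken as the sample standard deviation divided by the square root of the number of orientations. The names `windows` and `mean_and_sem` are hypothetical.

```python
import numpy as np

N_ORIENTATIONS = 3   # one starting orientation per principal axis
N_DEPTHS = 32        # 1 Å increments from bilayer center to bulk solvent

# (orientation index, window center in Å); positions 0..31 Å are an assumption
windows = [(axis, float(depth))
           for axis in range(N_ORIENTATIONS)
           for depth in range(N_DEPTHS)]

def mean_and_sem(profiles):
    """Average profiles (e.g. PMFs) over orientations and estimate the
    standard error as sample standard deviation / sqrt(n) (assumed form)."""
    profiles = np.asarray(profiles, dtype=float)
    n = profiles.shape[0]
    mean = profiles.mean(axis=0)
    sem = profiles.std(axis=0, ddof=1) / np.sqrt(n)
    return mean, sem
```

With only three independent estimates per depth, a standard error of this kind is itself quite uncertain and is best read as a rough indicator of orientation dependence.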
Results for propranolol, terbutaline, and verapamil are shown for 3 and 10 ns of sampling per window (Figure 3). While the general shapes of the PMFs are reasonably converged after 3 ns of sampling per window, additional simulation time makes basins more pronounced and changes barrier heights, indicating that more than 10 ns of sampling per window may be necessary to properly average over the long-time conformational relaxation of the lipid tails and other slow system reorganization. To gain a better understanding of the accuracy of the PMF curves, we used them to estimate each small molecule’s water–DMPC membrane partition coefficient (38) and compared them with the experimental water–octanol partition coefficients. This comparison gave a correlation coefficient of 0.59, and both the experimental and predicted values spanned roughly the same orders of magnitude. Octanol has long been taken as an experimental proxy for lipid bilayers; however, as it is not an exact surrogate for a DMPC bilayer, perfect correlation cannot be expected due to fundamental structural and physicochemical differences between the environments (48). A correlation of 0.59 suggests that the estimated PMF curves reasonably describe the fundamental thermodynamics of small molecule membrane partitioning. Nevertheless, lack of stronger correlation may hint at deeper force field deficiencies.
Traditionally, biophysical simulations are carried out in aqueous solutions. Consistent with this, current state-of-the-art fixed-charge force fields are parameterized to implicitly capture the polarization induced by a high-dielectric solvent (49,50). Such a parameterization strategy may lead to inappropriate interactions with apolar solvents such as the greasy region of the membrane bilayer. A strategy to optimize atomic partial charges to better reproduce experimental free energy differences between octanol and water may improve the accuracy of permeability estimates. Alternatively, polarizable force fields (51) may better capture multibody interactions whose description might be fundamentally important to model the thermodynamics and kinetics governing the passive transport process.
While umbrella simulations are the most widely used technique to estimate PMF curves, other, more contemporary, methods have also been advanced and may prove more efficient. For example, in the adaptive biasing force (ABF) method (52–55), an instantaneous biasing force that opposes the force acting on the solute along the reaction coordinate is estimated and applied. The estimate of this force improves with simulation length, and its application allows the system to diffuse along the reaction coordinate. Averaging the applied forces gives an estimate of the mean force, which can then be integrated along the reaction coordinate to yield the PMF. Other promising methods have also been developed (56) and may be applied. However, the most efficient methods for permeability estimates will yield not only a PMF but also a local diffusion coefficient, which is discussed in the next section.
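The final step of a mean-force approach such as ABF, integration of the mean force along the reaction coordinate, can be sketched in a few lines. The example assumes that the input is the average force acting on the solute along z in each bin, so that W(z) = −∫⟨F⟩dz; if the averaged applied biasing force is supplied instead, the sign flips. The function name `pmf_from_mean_force` and the choice of the last grid point as the zero of the PMF are assumptions made for this illustration.

```python
import numpy as np

def pmf_from_mean_force(z, mean_force):
    """Integrate the mean force on the solute to obtain the PMF.

    z          : grid points along the reaction coordinate
    mean_force : average force on the solute along z at each grid point
    Returns W(z) = -integral of <F> dz, zeroed at the last grid point
    (assumed here to lie in bulk water).
    """
    z = np.asarray(z, dtype=float)
    f = np.asarray(mean_force, dtype=float)
    steps = 0.5 * (f[1:] + f[:-1]) * np.diff(z)  # trapezoid rule per interval
    w = -np.concatenate(([0.0], np.cumsum(steps)))
    return w - w[-1]
```

Because integration accumulates errors along z, noise in the mean force at one depth propagates to the PMF at all subsequent depths.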
Enter kinetics: estimating the local diffusion coefficient
While the PMF describes the propensity for a compound to partition from water into the membrane at a given depth, it is purely a thermodynamic quantity and says nothing about the rate of the process. Permeability is a transport process, and equilibrium thermodynamics must be integrated with a kinetic rate description. For example, if the diffusion constant is known along the reaction coordinate, then the mean first passage time can be estimated as a diffusive barrier crossing process (57). While monitoring the mean-square displacement of a compound readily yields diffusion coefficients in the absence of substantial free energy barriers, it is impractical to estimate diffusion profiles in this way along a rough free energy landscape. Instead, a variety of contemporary biased simulations can be paired with different diffusion estimators to arrive at a position-dependent diffusion profile when substantial free energy barriers exist along a reaction coordinate, as for a compound moving through a lipid bilayer. For example, Marrink and Berendsen (23) carried out equilibrium constrained simulations and used a diffusion estimator that relates the autocorrelation of the constraint force to the local diffusion coefficient through the fluctuation–dissipation theorem. Extending work by Straub et al. (58), Woolf and Roux (59) have applied an estimator that calculates a local diffusion coefficient from the velocity autocorrelation function extracted from a harmonically biased simulation, although it has not been applied to permeability predictions. More recently, Hummer extended the work of Woolf and Roux to derive an estimator based on positional autocorrelation (60), whose use is reported here and described in greater detail below. Within non-equilibrium regimes, such as in steered molecular dynamics (SMD), Kosztin et al. (61) have posited that one can use the slope of the mean dissipative work curve to estimate the local diffusion constant. Yet, this last method has been largely untested.
Despite the variety of diffusion estimators, to the best of the authors’ knowledge, a detailed comparison of their performance has not been carried out, and selecting one estimator over another is still, unfortunately, driven by convenience. For example, when using a constrained simulation to calculate the PMF, the most convenient way to estimate the diffusion profile is through the autocorrelation of the constraint force, following the work of Marrink and Berendsen.
During harmonically restrained simulations, as for umbrella sampling PMF reconstruction, position-dependent diffusion coefficients can be estimated within each window using theory formally derived by Hummer (60). If Di(Z) is the local diffusion constant at the position of the harmonic potential minimum used to construct the ith window, its value can be estimated using eqn 5:
$$D_i(Z) = \frac{\langle \delta Z^2 \rangle_i}{\tau_i} \qquad (5)$$
where $\langle \delta Z^2 \rangle_i = \langle Z^2 \rangle_i - \langle Z \rangle_i^2$ is the mean-square fluctuation of the center of mass of the small molecule along the membrane normal from the ith umbrella window. Similarly, the autocorrelation time of the center-of-mass fluctuations in the ith umbrella window is given by $\tau_i$. The autocorrelation time is calculated by integrating the autocorrelation function of the fluctuations of the center of mass of the molecule along the membrane normal according to eqn 6:
$$\tau_i = \frac{\int_0^\infty \langle \delta Z(t)\, \delta Z(0) \rangle_i \, dt}{\langle \delta Z^2 \rangle_i} \qquad (6)$$
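The positional autocorrelation estimator can be sketched for a single umbrella window as follows. The example assumes the commonly quoted form of the estimator, in which the local diffusion coefficient is the variance of the restrained coordinate divided by its autocorrelation time. Truncating the integral at the first zero crossing of the autocorrelation function is a heuristic adopted here to limit noise and is not taken from the protocol of this work; the function names are hypothetical.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Autocovariance <dx(t) dx(0)> for lags 0 .. max_lag - 1."""
    dx = np.asarray(x, dtype=float) - np.mean(x)
    n = len(dx)
    return np.array([np.dot(dx[:n - k], dx[k:]) / (n - k)
                     for k in range(max_lag)])

def hummer_diffusion(z_com, dt, max_lag):
    """Local diffusion coefficient from one harmonically restrained window.

    z_com   : time series of the solute center of mass along z
    dt      : time between frames
    max_lag : number of lags over which the autocovariance is evaluated
    """
    acf = autocorrelation(z_com, max_lag)
    var = acf[0]  # mean-square fluctuation of z
    # Heuristic (assumption): integrate only up to the first zero crossing.
    # If the crossing occurs at lag 1, tau is zero and the estimate is
    # undefined; a finer frame spacing is then needed.
    crossing = np.where(acf <= 0.0)[0]
    cut = crossing[0] if crossing.size else max_lag
    kept = acf[:cut]
    integral = dt * (kept.sum() - 0.5 * (kept[0] + kept[-1]))  # trapezoid rule
    tau = integral / var  # autocorrelation time
    return var / tau
```

The estimate is sensitive to how the tail of the autocorrelation function is handled, which is one practical source of the noise seen in position-dependent diffusion profiles.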
Unlike the PMF estimates, the diffusion profiles determined are noisy (Figure 4). Extending the simulation time modestly improves the results, as indicated by the position-dependent diffusion profiles for 3 and 10 ns of sampling per window for propranolol, terbutaline, and verapamil (Figure 4). All of the diffusion profiles are similarly shaped, with slightly higher diffusivity at the bilayer center and lower diffusivity in the hydrophobic tail region of the leaflets. While the estimated order of magnitude for the local diffusion profile agrees with values others have estimated for similarly sized molecules (37), the similarity and noisiness of the profiles may be insufficient to discriminate among structurally related small molecules as would be important in a lead optimization setting.
Looking forward, the validity of the estimators, as well as the effects of the biased simulation, needs to be considered. For example, eqn 5 assumes over-damped Langevin solute dynamics (23,34,59), which implies that the solute undergoes Brownian motion (62). The extent to which this assumption holds at various points throughout the transport process has not been probed in detail. Additionally, simple unbiased positive controls have yet to be compared with estimates from biased simulations in regions where these comparisons are practical, for example, at the membrane interface and in bulk water. While there are likely other outstanding issues that have not been mentioned here, careful study of the two issues raised may lead to long-term improvements in the accuracy of position-dependent diffusion estimation. Such improvement will positively impact the reliability of transport process rate estimates.
Uniting thermodynamics and kinetics to estimate passive membrane permeability
The PMF and the position-dependent diffusion coefficient are combined into a local resistivity, R(z), according to eqn 7:
$$R(z) = \frac{\exp[W(z)/RT]}{D(z)} \qquad (7)$$
Equation 7 is then integrated across the bilayer and inverted to yield a permeability estimate. Regions where the PMF is much larger than RT, or the average thermal energy available to move the solute over the barrier, contribute exponentially to transport resistivity, while resistivity falls inversely with increasing diffusivity. Consequently, a solute is more likely to get ‘stuck’ in a region bounded by large PMF barriers than a region bounded by low diffusivity. Figure 5 shows the predicted resistivity profiles for propranolol, terbutaline, and verapamil after 3 and 10 ns of sampling per window. Standard errors are reported and were calculated across the three simulations corresponding to each solute starting orientation. Similar to the PMF estimates, the general shape of the resistivity curves is roughly converged after only 3 ns/window. While additional sampling decreases resistivity barrier heights and standard error, it has a modest effect on the estimated permeability (Table 1).
Table 1. The effect of simulation length on permeability estimates

| Compound | Log(P): 3 ns/window | Log(P): 10 ns/window | Log(P): experiment |
|---|---|---|---|
| Propranolol | −6.72 | −5.98 | 0.43 |
| Terbutaline | −7.94 | −6.96 | −7.25 |
| Verapamil | −7.57 | −7.01 | 0.26 |
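The combination of the PMF and the diffusion profile into a permeability can be sketched numerically. The example assumes that the resistivity has the inhomogeneous solubility-diffusion form R(z) = exp[W(z)/RT]/D(z) and that the permeability is the inverse of its integral over the supplied z range; the function name `log_permeability` and the unit choices (z in cm, D in cm²/s, W in kcal/mol) are assumptions made for this illustration.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def log_permeability(z, pmf, diff, temperature=310.15):
    """Return the resistivity profile and log10 of the permeability.

    z    : positions along the membrane normal (cm)
    pmf  : W(z) in kcal/mol, zero in bulk water
    diff : local diffusion coefficient D(z) in cm^2/s
    """
    z = np.asarray(z, dtype=float)
    rt = R_KCAL * temperature
    resistivity = np.exp(np.asarray(pmf, dtype=float) / rt) / np.asarray(diff, dtype=float)
    # Trapezoid rule for the total resistance, 1/P = integral of R(z) dz
    resistance = np.sum(0.5 * (resistivity[1:] + resistivity[:-1]) * np.diff(z))
    return resistivity, -np.log10(resistance)
```

Because the PMF enters through an exponential while the diffusion coefficient enters linearly, a barrier of a few kcal/mol changes the result far more than a comparable relative change in diffusivity.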
The transport of propranolol and verapamil is most hindered at the membrane–water interface, whereas terbutaline passage is significantly slowed at the center of the bilayer. This may be partly explained by considering the fraction of hydrophilic surface area for each compound. At roughly 31%, the hydrophilic component of the surface area of terbutaline is much larger than that of propranolol (8%) and verapamil (5%). The larger hydrophilic surface area reduces the equilibrium tendency to partition into the hydrophobic center of the bilayer, as indicated by the roughly 5 kcal/mol PMF barrier for terbutaline (Figure 3B). The diffusion profile of terbutaline, on the other hand, is reasonably flat (Figure 4B). Consequently, the terbutaline resistivity barrier is largely a thermodynamic effect. This brief analysis speaks to the potential power of all-atom permeability estimates. Macroscopically observable transport properties can be explained using a microscopic statistical mechanical description, which can be tied to physical molecular descriptors, like solvent-accessible surface area. As physical molecular descriptors are systematically varied during lead optimization, QSPR can be mechanistically rationalized on a microscopic basis using accurate all-atom permeability estimates, information that may prove invaluable during rational drug design efforts. A similar strategy was recently proposed to integrate information from implicit solvent models in reference (32).
How accurate are all-atom permeability estimates? To answer this, in Figure 6, the computationally predicted permeability, determined with 3 ns of sampling per window, is plotted as a function of the experimental permeability for the set of small molecules shown in Figure 2. Using error propagation techniques, standard errors of the logged permeability estimates were calculated across the three simulations (see Supporting information), corresponding to each solute starting orientation, and are shown as vertical error bars. Standard errors of the experimental permeabilities are shown as horizontal error bars and were calculated from (63). The experimental and computational values used to create the plot are provided in Table S1. While the trend is promising, the coefficient of determination is 0.45, and the predicted values span only 1.5 orders of magnitude, while the experimental values range over eight orders of magnitude. Although this is a modest improvement over a simple ‘null’ model, which plots experimental permeability estimates versus molecular weight and gives a coefficient of determination of 0.33, stronger correlation is desirable. Table 1 shows that while increasing simulation time more than threefold increases the estimated permeability rate by nearly an order of magnitude, estimates are still confined to a narrow range, relative to experiment. It may be that still more sampling is required. For example, when umbrella sampling with membrane-embedded leucine and arginine side-chain analogs was carried out for over 200 ns/window, the bilayer surface depressed or protruded to accommodate the side-chain analog, leading to improved accuracy of the estimated water–membrane partition coefficient (38). However, because bilayer reorganization properties were different in harmonically restrained and unrestrained simulations, the relevance of the membrane structural reorganization observed during the restrained simulations is questionable (38).
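The two summary statistics used in this comparison, the coefficient of determination and the span of the values in orders of magnitude, can be computed as in the sketch below. It assumes that the coefficient of determination refers to a simple linear fit, for which it equals the squared Pearson correlation; the function names are hypothetical, and no data from this work are reproduced.

```python
import numpy as np

def coefficient_of_determination(x, y):
    """R^2 of a simple linear fit of y on x (squared Pearson correlation)."""
    r = np.corrcoef(np.asarray(x, dtype=float), np.asarray(y, dtype=float))[0, 1]
    return r ** 2

def span_in_orders_of_magnitude(log_values):
    """Range covered by a set of log10 permeabilities."""
    log_values = np.asarray(log_values, dtype=float)
    return log_values.max() - log_values.min()
```

Reporting both quantities is useful because a respectable correlation can coexist with predictions that are compressed into a much narrower range than experiment.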
It may prove that non-equilibrium methods, such as SMD, carried out with a stiff spring and slow pulling speed (64–66), provide a better model of the naturally occurring transport process. Alternatively, promising emerging methods, such as ABF, discussed in section Thermodynamic influences: constructing potentials of mean force, which would allow the solute to freely diffuse along the membrane normal, may prove more suitable.
Neglect of alternative ligand protonation and tautomerization states is an additional simplifying assumption that may significantly impact the accuracy of passive membrane permeability estimates. For example, different tautomerization states may be thermodynamically favored and can have higher or lower diffusivity at different depths in the membrane, appreciably contributing to the experimental permeability measurement. Alternative protonation states, on the other hand, while seemingly more impactful because of larger coulombic effects, may be less problematic, in some cases, than alternative tautomers. In part, this is because experimental permeability is often reported as an ‘intrinsic’ permeability, or the permeability of the uncharged species alone (63), as was the case for the 11 compounds considered in this work. Additionally, charged species are less likely to dissolve to a significant extent in the greasy bilayer, so considering only electrically neutral compounds may be sufficient. However, charged compounds can be partially hydrated in the bilayer (67) and simply discounting their contribution might result in inaccurate permeability estimates.