A quantitative structure-activity relationship analysis of the inhibitory activity of structurally diverse compounds on recombinant human CYP3A4 is presented using a bilinear approach based on our previously developed LinBiExp model. Using only two main descriptors, molecular size and an indicator variable for the presence of triazole/imidazole moieties, this approach can account for close to 65 % of the variability in the inhibitory activity of more than 70 compounds and provides clear evidence that molecular size plays an important, but nonlinear role. The strongest inhibitory activity is likely to occur for compounds close to an optimal size, which is roughly that of the well-known CYP3A4-inhibitor ketoconazole. The activity-limiting role of size was also confirmed on a large dataset of 3438 compounds (PubChem Bioassay AID 884). This model provides a simple, intuitive interpretation and can serve as the starting point for more complex descriptions of the CYP3A4 inhibitory activity.
Metabolism is a major factor determining the amount (fraction) of active substance reaching its intended target following administration, and it plays a crucial role in determining the ultimate activity and safety of drugs – a fact that is being increasingly recognized by those involved in the design, development, and/or regulation of drug products.1 Most critical metabolic pathways are mediated by oxygenases, and the cytochrome P450 (CYP) superfamily is a large and diverse group of heme-containing enzymes that is involved in the oxidation of many xenobiotics, including most drugs. CYP3A4 is the most abundant hepatic CYP isoform in humans and is responsible for the metabolism of almost half of known drugs. Inhibition of CYPs, including CYP3A4, by co-administered drugs has been shown to be an important cause of undesirable clinical drug-drug interactions; therefore, structure-activity relationship studies that can identify factors resulting in CYP3A4 inhibitory activity are of considerable interest. Here, we present a quantitative analysis of the inhibitory activity of structurally diverse compounds on the activity of recombinant human CYP3A4 using a bilinear approach, and provide evidence that molecular size plays an important albeit nonlinear role, and maximum inhibitory activity is likely to occur for compounds close to an ideal size, which is roughly that of the well-known CYP3A4-inhibitor ketoconazole.
Metabolism is in fact a result of the complex detoxification mechanisms evolved in general for xenobiotics. Most critical metabolic pathways are mediated by oxygenases – a likely consequence of the fact that “an organism’s normal reaction to a foreign substance is to burn it up as food”.2 The cytochrome P450 (CYP) superfamily is a large and diverse group of enzymes that is involved in the oxidation of most drugs. For drugs cleared via metabolism, about 75 % are metabolized by CYP enzymes, primarily CYP3A (46 % of all CYP-mediated metabolism), CYP2C9 (16 %), CYP2C19 (12 %), CYP2D6 (12 %), and CYP1A (9 %).3 The cytochrome P450 name for these enzymes comes from their cellular location (cyto) and spectrophotometric characteristics (chrome 450, as they tend to absorb light at wavelengths of λ≈450 nm when the reduced iron from their heme cofactor forms an adduct with carbon monoxide). Because metabolism usually generates multiple metabolites that are all simultaneously present at various concentrations together with the originally administered drug (D), the overall activity/toxicity of a drug is not just that of D, but a combination of the intrinsic activity/toxicity of the original drug D [i.e., A(D) and T(D)] and those of all the other various metabolic products (Mj, j=1…n) that are formed [i.e., Σj A(Mj) and Σj T(Mj)]. Hence, metabolism is a major determinant of the activity and toxicity of a drug and their time-courses. Furthermore, drug-drug interactions are an important safety concern in those taking multiple medications, and metabolic interactions, which can significantly alter the circulating levels of active drugs, are a major source of these.
Cytochrome P450 3A4 (CYP3A4), the subject of the present work, is quantitatively the most important human hepatic CYP isoform as well as catalytically the most promiscuous.4 Because of their crucial role in drug metabolism, CYPs were already the subject of many quantitative structure-activity relationship (QSAR)4,5 as well as inhibitor/non-inhibitor classification studies6 especially since the late 1990s with the increasing recognition that drug metabolism has to be integrated into general drug design considerations.1b,7 Size and/or lipophilicity (log Po/w) have been consistently found to play an important role in determining the effect of small-molecule compounds on the activity of CYP3A4. Here, we explored the possibility that dependence is not linear over the entire range, but rather has a turning point corresponding to an ideal size (or lipophilicity). The approach is based on the LinBiExp model that allows bilinear dependence on two sides of a smooth transition region positioned at a turning point.8 Such an approach proved useful in modeling several pharmaceutically relevant activities, such as toxicities or antimicrobial efficiencies8a as well as the activity of anticholinergics9 and glucocorticoids.10
LinBiExp has been introduced as a general function that can fit any bilinear-type data, i.e., data that show a maximum, a minimum, or just a rate-change around a given x value, but tend to show good linearity away from this.8 It uses a novel functional form consisting of the logarithm of the sum of two exponentials to obtain a bilinear functionality:
$$ y = \eta \ln\left[ e^{\alpha_1 (x - \xi_c)/\eta} + e^{\alpha_2 (x - \xi_c)/\eta} \right] + \chi \tag{1} $$
To have a completely general bilinear model,8b a total of five unrestricted parameters are needed (denoted with Greek symbols): (i–ii) two slopes (α1 and α2) for the rising and descending phases, respectively, (iii) a constant (χ) for shifting along the y (vertical) axis, (iv) a constant (ξc) for shifting along the x (horizontal) axis (i.e., to position the rate-change point), and (v) a parameter (η) for adjusting the smoothness/abruptness of the transition between the two linear portions. For QSAR-type analyses, we found LinBiExp to be most useful in fitting decimal log-scaled concentration data (e.g., log 1/IC50, log 1/ED50, log 1/Kd) as a function of molecular size. As before, we again use molecular volume (V) as a representative measure of the three-dimensional molecular size,8a,11 and we use volume as the independent variable in Equation 1, x=V. As a further generalization, to also allow for the possible effects of various substituents (φi, i=1…n) via the incorporation of indicator variables, Iφi, the following form will be used here, just as it has been done before (with vc denoting the turning-point position ξc on the volume axis and δφi the coefficient of each indicator variable):8a,9,10
$$ \log 1/C = \eta \ln\left[ e^{\alpha_1 (V - v_c)/\eta} + e^{\alpha_2 (V - v_c)/\eta} \right] + \chi + \sum_{i=1}^{n} \delta_{\varphi_i} I_{\varphi_i} \tag{2} $$
Because there can be strong nonlinear dependence on η and the value of this parameter, which defines the width of the transition phase, cannot always be well defined if there are not enough data points in this segment, we will use a fixed value of η=−1/ln10=−0.434 (see Buchwald8a for further details and additional QSAR applications). Experimental data used for the present exploratory model are the inhibitory effects of structurally diverse drugs on CYP3A4 quantified as median inhibitory concentration (IC50) in two different studies (Table S1, Supporting Information). One is for the inhibition of the recombinant human CYP3A4-mediated metabolism of 7-benzyloxy-4-trifluoromethyl coumarin (BFC) (n=44).5b The other one is for the inhibition of the recombinant human CYP3A4-mediated metabolism (demethylation) of erythromycin (EryMyc) (n=28; but with 8 structures that are also part of the previous dataset of 44 compounds).4
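The bilinear function with fixed η described above can be illustrated with a minimal Python sketch. It assumes the log-of-two-exponentials form y = η ln[exp(α1(x−ξc)/η) + exp(α2(x−ξc)/η)] + χ, plus an additive shift for an indicator variable. The function name `linbiexp` is hypothetical. The slopes and turning point are the values quoted in the text for the BFC fit, while χ = 0 is only a placeholder because the fitted constant is not given here.

```python
import math

ETA = -1.0 / math.log(10.0)  # fixed smoothness parameter, about -0.434


def linbiexp(x, alpha1, alpha2, chi, xi_c, eta=ETA, shift=0.0):
    """Assumed LinBiExp form: eta*ln(exp(a1*u/eta) + exp(a2*u/eta)) + chi + shift,
    with u = x - xi_c; `shift` stands for the indicator-variable contribution."""
    u = x - xi_c
    t1 = alpha1 * u / eta
    t2 = alpha2 * u / eta
    m = max(t1, t2)  # log-sum-exp trick to avoid overflow for large |u|
    return eta * (m + math.log(math.exp(t1 - m) + math.exp(t2 - m))) + chi + shift


# Slopes and turning point as quoted in the text; chi = 0.0 is a placeholder.
a1, a2, chi, xi_c = 0.011, -0.0013, 0.0, 394.0

far_left = linbiexp(100.0, a1, a2, chi, xi_c)    # close to a1 * (100 - xi_c)
far_right = linbiexp(1000.0, a1, a2, chi, xi_c)  # close to a2 * (1000 - xi_c)
print(far_left, far_right)
```

Far from the turning point the function approaches the two straight lines with slopes α1 and α2, which is the bilinear behavior the model is meant to capture. With a negative η the curve is a smooth lower envelope of the two lines and therefore shows a maximum.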
Data on the inhibition of CYP3A4-mediated metabolism of BFC5b do indeed indicate the existence of a possible ‘best-fit’ size as compounds with a molecular volume larger than approximately 400 Å3 (i.e., approximately the size of ketoconazole) seem to be positioned along a decreasing trend (Figure 1). There are only relatively few sufficiently large compounds with available data; hence, the decreasing trend is not well defined. Nevertheless, the maximum inhibitory effect seems to be observed with a size around that of ketoconazole (or just somewhat larger). Consequently, a bilinear function already gives a good fit with size alone accounting for 44 % of the variance (r2=0.44) vs. only 19 % for a simple linear model (r2=0.19 for linear regression). No other simple structure-derived descriptor gave significant improvement (including octanol-water log P as a lipophilicity measure); however, an analysis of the structure of the highly active outliers indicated that an imidazole moiety is present in them and only in them (clotrimazole, ketoconazole, miconazole, and midazolam; Figure 1, Table S1). With inclusion of an indicator variable for the presence of imidazole rings (Iimidazole), the LinBiExp model can account for almost 70 % of the variability in the CYP3A4 inhibitory data with only two descriptors, V and Iimidazole (Equation 3).
Notably, the quality of the fit obtained here with only two descriptors is already comparable to that obtained previously with a total of 20 Molconn-Z descriptors in a final model obtained via a genetic algorithm-combined partial least squares method: r2=0.77 for the n=35 training set.5b While the rising slope is relatively well defined with a value of 0.011(±0.002), the descending slope is much less so, as there are far fewer data points on this portion and the change in the inhibitory activity might be much more dependent on other factors such as, for example, molecular shape. The present data seem to indicate an optimum size around 400 to 450 Å3, just slightly larger than that of the well-known potent inhibitor ketoconazole (Figure 1). The location of the maximum is defined by the parameters of the model (Vmax=vc−η[ln(α1)−ln(−α2)]/[α1−α2]);8 values from Equation 3 give 469 Å3. However, the exact position is not very well defined (e.g., vc=394(±73) Å3; Equation 3) mainly due to the uncertainty in the descending slope, α2.
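The location of the maximum can be checked numerically from the expression quoted above. The short sketch below plugs in the parameter values given in the text for Equation 3 (α1=0.011, α2=−0.0013, vc=394 Å3) together with the fixed η=−1/ln10. The function name `v_max` is chosen here for illustration only.

```python
import math


def v_max(alpha1, alpha2, v_c, eta=-1.0 / math.log(10.0)):
    """Maximum position: v_c - eta*[ln(alpha1) - ln(-alpha2)]/(alpha1 - alpha2)."""
    if not (alpha1 > 0 and alpha2 < 0):
        raise ValueError("a maximum requires alpha1 > 0 and alpha2 < 0")
    return v_c - eta * (math.log(alpha1) - math.log(-alpha2)) / (alpha1 - alpha2)


vmax_bfc = v_max(0.011, -0.0013, 394.0)
print(round(vmax_bfc))  # 469
```

Because α2 is small and enters through its logarithm, modest changes in the descending slope shift the computed maximum noticeably, which fits the remark that its exact position is not well defined.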
Figure 1. Inhibitory activity on CYP3A4-mediated metabolism of BFC (7-benzyloxy-4-trifluoromethyl coumarin) as a function of molecular size (volume, V) for a set of structurally diverse compounds (n=44) fitted with the bilinear LinBiExp model (blue line). Blue diamonds indicate experimentally determined median inhibitory concentrations (IC50)5b with an overlaid purple circle indicating those compounds that contain an imidazole moiety and seem to have higher activity (clotrimazole, ketoconazole, miconazole, and midazolam); the corresponding shifted LinBiExp fit is shown as a dotted purple line. Structures for a few representative compounds are indicated at top (with the imidazole fragments shown in dark purple where present) to give a sense of the corresponding molecular sizes.
As the value of δimidazole indicates (+2.11(±0.38) on log-scale), compounds with an imidazole ring in their structure seem to have on average an approximately hundred-fold enhanced inhibitory activity compared to compounds of corresponding size, but lacking this structural element. Previous studies already indicated that the presence of sterically unhindered nitrogen-containing heterocycles (including pyridine, imidazole, or triazole functions) could cause a significantly increased potency of CYP3A4-inhibition (approximately ten-fold for a given lipophilicity, log D7.4);4 this might be linked to the observation that compounds containing a pyridine ring can potentially bind to the catalytic heme moiety of the enzyme.5d With the present approach, only the presence of an imidazole ring seems to cause increased inhibitory potency. Piperazine (buspirone) as well as fused imidazole (astemizole) and triazole (alprazolam, triazolam) rings do not seem to cause a significant change in activity. None of the structures with data included in this set contained a pyridine ring; there might be a slight increase in inhibitory activity with dihydropyridine rings (e.g., nicardipine, nifedipine, and nimodipine).
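Because the indicator coefficient is on a decimal log scale, its translation into a fold-change is a one-line calculation. The sketch below uses the reported δimidazole=2.11 and treats the quoted ±0.38 as a simple plus/minus band on the log scale, which is an assumption about how that uncertainty should be read.

```python
delta = 2.11        # reported indicator coefficient (log10 units)
uncertainty = 0.38  # reported +/- value, treated here as a simple band

fold = 10 ** delta
fold_low = 10 ** (delta - uncertainty)
fold_high = 10 ** (delta + uncertainty)

print(f"about {fold:.0f}-fold (band: {fold_low:.0f} to {fold_high:.0f})")
```

The central value comes out at roughly 130-fold, consistent with the "approximately hundred-fold" enhancement stated in the text. The wide band is a reminder that the estimate rests on only four imidazole-containing compounds.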
For the present analysis, we have used volume (V) as a measure of 3D size as it proved a useful descriptor in our previous similar QSAR studies and there are reasonable physico-chemical considerations supporting its use.8a,11 However, other size-related descriptors closely correlate with V (e.g., molecular weight MW=1.299 V, r2=0.97 for the present data) and could serve equally well as descriptors for fitting purposes. Lipophilicity (e.g., the log octanol-water partition coefficient, log Po/w; Table S1) is another descriptor that could play an important role in determining activity (and can often result in bilinear-type dependencies); therefore, we also considered it as a potential main descriptor, especially as it was not closely correlated to size here (r2=0.05, n=52). However, for the present data it gave a considerably worse fit than volume: r2=0.53, σ=0.87, AIC=159.0, SBIC=167.9 vs. r2=0.69, σ=0.71, AIC=140.9, SBIC=149.9 for V (Equation 3) with bilinear models. Contrary to size, there is no evidence supporting a bilinear or biphasic dependence on log P: a linear model gives a fit of similar quality (r2=0.51, σ=0.87, AIC=157.0, SBIC=162.4) and quantitative model selection criteria, such as the Akaike information criterion (AIC) or the Schwarz Bayesian information criterion (SBIC), favor the simpler linear model (lower values).8a,11
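The model comparison above can be laid out explicitly. The sketch below only tabulates the fit statistics reported in the text and picks the model with the lowest value of each criterion. It does not recompute AIC or SBIC, since the residual sums of squares are not given here, and the dictionary keys are labels chosen for illustration.

```python
# Fit statistics as reported in the text (BFC dataset).
models = {
    "V, bilinear":     {"r2": 0.69, "sigma": 0.71, "AIC": 140.9, "SBIC": 149.9},
    "log P, bilinear": {"r2": 0.53, "sigma": 0.87, "AIC": 159.0, "SBIC": 167.9},
    "log P, linear":   {"r2": 0.51, "sigma": 0.87, "AIC": 157.0, "SBIC": 162.4},
}

# Lower AIC/SBIC indicates the preferred model.
best_by_aic = min(models, key=lambda name: models[name]["AIC"])
best_by_sbic = min(models, key=lambda name: models[name]["SBIC"])

# Restrict to the two log P models to see whether the bilinear form is justified.
logp_models = {k: v for k, v in models.items() if k.startswith("log P")}
best_logp = min(logp_models, key=lambda name: logp_models[name]["AIC"])
delta_aic = models["log P, bilinear"]["AIC"] - models["log P, linear"]["AIC"]

print(best_by_aic, best_by_sbic, best_logp, round(delta_aic, 1))
```

Both criteria select the size-based bilinear model overall. Among the log P models, the extra bilinear parameters raise r2 only from 0.51 to 0.53 and do not lower AIC or SBIC, so the simpler linear form is preferred.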
As a further analysis, we also looked at another, different dataset with the same model: the inhibitory effect on the recombinant human CYP3A4-mediated metabolism (demethylation) of erythromycin (EryMyc) assessed for n=28 compounds,4 eight of which, however, are also part of the previous dataset of 44 compounds (Table S1, Supporting Information). It has to be noted that while there are some general similarities, CYP3A4 inhibitory activities are known to be different for different probe substrates clustering around at least three different groups.12 There are indeed considerable differences between the inhibitory activities seen with the two different probes (BFC vs. EryMyc) for the compounds that are present in both datasets, with the erythromycin data typically showing lower activity resulting in sometimes more than ten-fold higher IC50 values (e.g., erythromycin 9.8 vs. 132 µM, verapamil 2.8 vs. 76 µM, and so on; Table S1). Nevertheless, the overall trends are quite similar, and the bilinear model gives a slightly worse fit for the EryMyc data, but with parameter values quite similar to those obtained for the BFC data (Figure 2, Equation 4). In addition to imidazole rings, sterically unhindered (e.g., non-fused) triazole rings also seemed to have a significant effect on inhibitory activity in this dataset (fluconazole, propiconazole, and triadimefon). They were incorporated into the same indicator (Iimid/triazole) as they are structurally very similar (five-membered hetero-aromatic rings with two or three nitrogen atoms, respectively). The resulting LinBiExp fit is Equation 4.
Figure 2. Inhibitory activity on CYP3A4-mediated metabolism (demethylation) of erythromycin (EryMyc) as a function of molecular size (volume) for a set of structurally diverse compounds (n=28) fitted with the bilinear LinBiExp model (light blue line). Light blue triangles indicate experimentally determined median inhibitory concentrations (IC50)4 with an overlaid purple circle indicating compounds that contain unfused imidazole or triazole moieties as in Figure 1.
The ascending slope is, again, well defined and has a value very similar to that from the previous dataset, 0.012(±0.003) vs. 0.011(±0.002), a nice indication of consistency. The descending slope is much less well defined as there are only very few large compounds with available data in this size-range (Figure 2). The actual value can hardly be considered more than a very rough estimate, but it is still reassuring that it is close to that obtained in Equation 3 (−0.0019 vs. −0.0013), and the value of the trend-change point parameter is also in a similar range, 338(±81) vs. 394(±73) Å3.
The unified model gives a good overall fit of the CYP3A4-inhibitory activity data from two different assays, accounting for close to 65 % of variability by using a bilinear size-based approach. In a consistent manner, trends (slopes) indicate that inhibitory activity tends to increase with size up to a molecular volume of about 400 to 450 Å3 (Vmax=449 Å3) (roughly corresponding to a molecular weight MW of about 500 to 600 Da), and to slowly decrease for larger molecules. The heme pocket of CYP3A4 is known to be flexible. It can undergo conformational changes to interact in a quite promiscuous manner with various types of substrates, including quite large ones, being able to accommodate, for example, erythromycin or two ketoconazole moieties.13 Nevertheless, the corresponding conformational changes can still carry an energetic price and might overall favor the fit of a relatively smaller structure in the active site. The very slowly decreasing trend seen for larger structures, as indicated by the barely negative slope of −0.0014(±0.0016), agrees well with this known flexibility of the binding site. The slope of the increasing portion obtained here, 0.011(±0.002) (Å3)−1 (Equation 5), means that addition of about two methylene-sized groups (ΔVe≈2×15 Å3) is expected to approximately double the inhibitory affinity (i.e., to cause a change of 0.011×30≈0.3 in log 1/IC50) as long as they still fit within the binding site.
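The arithmetic behind the doubling estimate and the volume-to-weight conversion can be spelled out in a few lines. The sketch below uses the reported rising slope, the approximate 15 Å3 per methylene-sized unit from the text, and the MW=1.299 V relation quoted earlier for the present data. The variable names are illustrative only.

```python
import math

slope = 0.011        # rising slope, log10 units per cubic angstrom (reported)
dv_methylene = 15.0  # approximate volume of one methylene-sized unit (from the text)

delta_log = slope * 2 * dv_methylene  # change in log 1/IC50 for two such units
fold_two_units = 10 ** delta_log      # corresponding fold-change in affinity

# Volume increase that would give exactly a two-fold change
dv_for_doubling = math.log10(2.0) / slope

# Optimum volume range converted to molecular weight via MW = 1.299 V
mw_per_volume = 1.299
mw_range = (mw_per_volume * 400.0, mw_per_volume * 450.0)

print(f"{delta_log:.2f} log units -> {fold_two_units:.2f}-fold")
print(f"doubling needs about {dv_for_doubling:.1f} cubic angstroms")
print(f"MW range: {mw_range[0]:.0f} to {mw_range[1]:.0f} Da")
```

Two methylene-sized units give a change of 0.33 log units, slightly more than a doubling, and the 400 to 450 Å3 optimum maps to roughly 520 to 585 Da, in line with the "about 500 to 600 Da" stated above.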
The presence of a non-fused imidazole or triazole ring in the structure seems to cause, on average, an approximately 40-fold increase in inhibitory activity compared to structures of the same size, but without such rings – the only exceptions to this being the three smallest structures corresponding to 1-methylimidazole, 2-methylimidazole, and cimetidine, respectively (Figure 3, lower left corner). Fused imidazole (astemizole) or triazole rings (alprazolam, triazolam) do not seem to cause such an increase in activity. Other N-containing aromatic heterocycles also did not cause significant alteration of activity; pyridines seem to have inconsistent effects in the second dataset (e.g., metyrapone vs. timoprazole).
Figure 3. Combined inhibitory activity on CYP3A4-mediated metabolism shown as log 1/IC50 values as a function of molecular size (volume) and fitted with the bilinear LinBiExp model (dark blue line corresponding to the base model and dotted lines corresponding to different shifts caused by indicator variables). Experimental data are for the inhibition of the metabolism of BFC (blue diamonds, n=44)5b and erythromycin (EryMyc; light blue triangles, n=28, but with 8 structures that are also part of the BFC dataset),4 respectively. Overlaid purple circles indicate compounds that contain unfused imidazole or triazole moieties and seem to have increased inhibitory activity.
As a final analysis, we also looked at the effect of molecular size on the inhibition of the human CYP3A4-mediated metabolism in a much larger dataset: the effect on the dealkylation of luciferin-6′ phenylpiperazinylyl (Luciferin-PPXE) to luciferin for a total of n=3438 structures classified as active in PubChem Bioassay AID 884. For data from such a large and highly diverse compound library, size can only have limited predictive power. Nevertheless, size (as quantified here by molecular weight as a first estimate) clearly has an activity-limiting role that fits a bilinear-like pattern well, as the data shown in Figure 4 illustrate. It has to be emphasized again that activities determined with different probe substrates could be different and characterization of CYP3A4 inhibitory activity in general would require data determined with probe substrates representing different major clusters (i.e., testosterone/erythromycin, midazolam/dextromethorphan, and nifedipine).12 Considering the heterogeneity of the inhibitory mechanisms and the diversity of such a large database, our size-based model can only act as a descriptive model. The slopes of the limiting function here (Figure 4) are somewhat steeper than those in Equations 3–5, but the overall trend and the existence of an optimum size-range are nicely confirmed.
Figure 4. Inhibitory activity on CYP3A4-mediated metabolism (dealkylation) of luciferin-6′ phenylpiperazinylyl to luciferin (n=3438; PubChem Bioassay AID 884) as a function of molecular size (MW) and with a bilinear model used as an approximate maximum-limiting function.
In summary, we developed an intuitive, molecular size-based model that can explain to a good extent the CYP3A4-inhibitory activity of structurally diverse small-molecule compounds. Undoubtedly, the model is a highly simplified one, and molecular volume as a size descriptor cannot account for many specific effects, such as those related to shape, charge distribution, substituent positioning, hydrogen bonding ability, and others, that can significantly alter the binding affinity and the inhibitory efficacy of these molecules at CYP3A4. Accordingly, there is considerable unexplained variance remaining in the log 1/IC50 data, as indicated by the magnitude of the residual standard deviation (e.g., σ=0.78; Equation 5), so that in many cases there is likely to be a ten-fold error in the IC50 prediction (due to the log scale). Nevertheless, the present model is still very informative as it (i) gives a simple, intuitive, and visual interpretation of the CYP3A4 inhibitory data in contrast to most previous ‘black box’ type approaches; (ii) still achieves a predictive quality (as characterized by, e.g., r2 and σ) comparable to most previous models despite using far fewer descriptors (e.g., r2=0.69 with 2 descriptors used here vs. r2=0.77 with 20 descriptors in the original model for data used in Equation 3); (iii) highlights the possibility of a nonlinear relationship and, hence, an optimal size for inhibitors of this important enzyme; and (iv) can serve as a new starting point for other, more complex approaches.
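The remark about ten-fold errors can be made concrete by converting the residual standard deviation from the log scale into an error factor. The sketch below uses the reported σ=0.78. The estimate of how often a prediction would miss by more than ten-fold additionally assumes normally distributed residuals, which is not stated in the text.

```python
import math

sigma = 0.78  # residual standard deviation in log10 units (reported for Equation 5)

typical_factor = 10 ** sigma  # error factor corresponding to a one-sigma residual
z_tenfold = 1.0 / sigma       # a ten-fold error equals 1 log unit

# ASSUMPTION: normally distributed residuals (not stated in the text).
# Two-sided tail probability P(|z| > z_tenfold) for a standard normal.
frac_beyond_tenfold = math.erfc(z_tenfold / math.sqrt(2.0))

print(f"one-sigma error factor: {typical_factor:.1f}")
print(f"share of predictions off by more than ten-fold: {frac_beyond_tenfold:.0%}")
```

Under this assumption a one-sigma residual corresponds to about a six-fold error in IC50, and roughly one prediction in five would be off by more than ten-fold.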