To construct the HQSAR models, several parameters must be varied. Molecular fragments were generated for each compound using the following fragment distinctions: atoms (A), bonds (B), connections (C), hydrogen atoms (H), chirality (Ch), and donor and acceptor (DA). To assess hologram generation, several combinations of these distinctions were tested with the default fragment size (4–7): AB, ABC, ABCH, ABCHCh, ABCHChDA, ABH, ABCCh, ABDA, ABCDA, ABHDA, ABCChDA, ABCHDA, and ABHChDA. The HQSAR analyses also screened the 12 default hologram lengths, which range from 53 to 401 bins. From the fragment count patterns of the training set compounds and the measured biological activity (IC50), several HQSAR models were generated by PLS and evaluated by full leave-one-out (LOO) cross-validation, which yields the cross-validated r² (q²). The predictive capability of each model was assessed from its q² value. The statistical results of the PLS analyses for the 40 training set compounds, using the different fragment distinction combinations, are presented in Table 2.
Table 2. HQSAR results using several fragment distinctions (default fragment size = 4–7)
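| Model | Fragment distinction | q² | SEP | r² | SEE | HL | N |
SEP, cross-validated standard error; SEE, non-cross-validated standard error; HL, hologram length; N, optimum number of PLS components.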
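The hologram generation step described above can be illustrated with a minimal sketch. It assumes that each fragment is represented as a text string and that a CRC32 checksum is an acceptable stand-in for the hashing step; the function name `build_hologram` and the fragment strings are hypothetical and are not taken from the data set or from the HQSAR software.

```python
import zlib


def build_hologram(fragments, length):
    """Fold a list of fragment strings into a fixed-length vector of counts."""
    bins = [0] * length
    for frag in fragments:
        # CRC32 is used only as a deterministic stand-in hash (assumption).
        index = zlib.crc32(frag.encode("utf-8")) % length
        bins[index] += 1
    return bins


# Hypothetical fragment strings for one molecule (illustration only).
fragments = ["C-C-C-N", "C-C-N-C", "C=C-C-O", "C-C-C-N", "c:c:c:c:n"]

# 53 and 401 are the smallest and largest hologram lengths named in the text.
for length in (53, 401):
    hologram = build_hologram(fragments, length)
    print(length, sum(hologram), max(hologram))
```

Shorter holograms force more distinct fragments into the same bin, which is one reason to screen several lengths rather than fixing a single one.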
Table 2 shows that, among the 13 generated models, the best statistical results were obtained for models 2 and 7 (q² = 0.776 and r² = 0.911), which were derived using A/B/C and A/B/C/Ch, respectively, with four as the optimum number of PLS components. The only difference between these two models is the absence of one distinction (chirality, Ch) in model 2. To investigate the importance of this fragment distinction, we studied the influence of different fragment sizes on the main statistical parameters of models 2 and 7. The fragment size parameters control the minimum and maximum length of the fragments included in the hologram fingerprint. The results obtained after varying the fragment size are displayed in Table 3. None of the alternative fragment sizes improved the statistics of the HQSAR models (q² = 0.776 and r² = 0.911 for both models). These results also indicate that including the chirality distinction (model 7) did not improve the statistical quality of the model. We therefore selected model 2 (A/B/C), with the default fragment size (4–7), for the subsequent analyses.
Table 3. Influence of various fragment sizes for the two models selected (models 2 and 7)
| Fragment size | q² | SEP | r² | SEE | HL | N |
|Model 2 (A/B/C)|
|Model 7 (A/B/C/Ch)|
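To make the role of the fragment size window concrete, the following sketch enumerates the linear fragments of a toy molecule for two windows. It is a simplification under stated assumptions: only unbranched paths are enumerated (real holograms may also include branched and cyclic fragments), the molecule is a hypothetical eight-atom chain, and the 2–5 window is chosen only for contrast with the default 4–7 window.

```python
def linear_fragments(adjacency, min_atoms=4, max_atoms=7):
    """Return unique simple paths with min_atoms..max_atoms atoms."""
    found = set()

    def extend(path):
        if len(path) >= min_atoms:
            # A path and its reverse are the same fragment; keep one form.
            found.add(min(tuple(path), tuple(reversed(path))))
        if len(path) == max_atoms:
            return
        for neighbor in adjacency[path[-1]]:
            if neighbor not in path:
                extend(path + [neighbor])

    for atom in adjacency:
        extend([atom])
    return sorted(found)


# Hypothetical unbranched chain of eight atoms, indexed 0..7.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)}

default_window = linear_fragments(chain, 4, 7)
narrow_window = linear_fragments(chain, 2, 5)
print(len(default_window), len(narrow_window))  # prints: 14 22
```

Changing the window changes which fragments feed the hologram, so each window effectively defines a different set of descriptors for the PLS analysis.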
After the best HQSAR model has been selected, the next stage is to validate it. Two strategies can be employed for this purpose: internal and external validation. In this work, internal validation was performed by cross-validation using the LOO technique. External validation uses the HQSAR model obtained with the training set to predict the biological activity of new molecules (test compounds). This is possible because the structure encoded within a 2D fingerprint is directly related to the biological activity of the molecules in the training set, so the HQSAR model can predict the activity of new, related molecules from their fingerprints. External validation can be considered the most valuable validation method, as the test compounds are completely excluded during the training of the model. The results of the internal validation demonstrate a good correlation between experimental and predicted values, as can be seen in Figure 4, which displays the experimental versus predicted pIC50 values for the training set molecules.
Figure 4. Predicted versus experimental values of pIC50 for the 52 CB1 inverse agonists (training and test sets) obtained with the hologram QSAR method.
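The leave-one-out procedure used for internal validation can be sketched as follows. This is an illustration under stated assumptions, not the procedure as implemented in the modeling software: a one-descriptor ordinary least-squares fit is swapped in for the PLS regression to keep the example self-contained, q² is taken as 1 − PRESS/SS about the overall mean activity, and all numerical values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x (stand-in for the PLS step)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model with compound i held out, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor and pIC50 values (illustration only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pic50 = [5.1, 5.9, 7.2, 7.8, 9.1, 9.9]
print(round(loo_q2(descriptor, pic50), 3))
```

Because every prediction is made by a model that never saw the held-out compound, q² is normally lower than the fitted r², which is the pattern seen in the statistics reported here.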
The predictive power of the best HQSAR model derived using the 40 training set molecules was assessed by predicting pIC50 values for 12 test set molecules (compounds 41–52, Table 1), which were not included in the training set. The results of the external validation are presented in Table 4 and can also be seen in Figure 4.
Table 4. Experimental and predicted activities (pIC50), along with the residual values, for a test series of CB1 ligands using HQSAR and CoMFA methods
| Experimental | Predicted HQSAR | Residual HQSAR | Predicted CoMFA | Residual CoMFA |
Figure 4 and Table 4 show good agreement between the experimental and predicted values for the test set compounds, which indicates the high reliability of the constructed HQSAR model. The low residual values suggest that the HQSAR model can be used to predict the biological activity of novel compounds within this structural class. The predicted values fall close to the experimental pIC50 values, deviating by 0.33 log units on average.
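The residual analysis described above amounts to a few lines of arithmetic. The sketch below assumes that a residual is defined as experimental minus predicted pIC50 and that pIC50 is the negative decimal logarithm of IC50 expressed in mol/L; the function names and all values are hypothetical and are not the entries of Table 4.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


def residuals(experimental, predicted):
    """Residual = experimental - predicted (sign convention is an assumption)."""
    return [exp - pred for exp, pred in zip(experimental, predicted)]


def mean_absolute_error(experimental, predicted):
    """Average magnitude of the residuals, in log units."""
    res = residuals(experimental, predicted)
    return sum(abs(r) for r in res) / len(res)


# Hypothetical pIC50 values (illustration only).
experimental = [8.00, 7.50, 6.90, 7.20]
predicted = [7.70, 7.80, 7.10, 6.80]

print(round(pic50_from_ic50_nm(10.0), 2))
print(round(mean_absolute_error(experimental, predicted), 2))
```

A mean absolute residual in log units translates directly into a fold error in IC50, which makes it a convenient single-number summary of a test set.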
Finally, another important application of a QSAR model is to indicate which molecular fragments are directly related to biological activity. This information can guide the synthesis of new molecules with improved properties. The HQSAR analyses can generate atomic contribution maps, in which a color code discriminates between positive and negative contributions to the biological activity (colors at the red end of the spectrum indicate poor contributions; colors at the green end correspond to favorable contributions; atoms with intermediate contributions are colored white). To analyze the most relevant atomic contributions to CB1 affinity, we selected three compounds from the data set: two high-affinity compounds (7 and 19, Table 1) and one compound with moderate affinity (41, Table 1). Figure 5 displays the individual atomic contributions for these three compounds.
According to the HQSAR color code, the molecular fragments making the strongest contribution to CB1 affinity are the pyridine ring, the phenyl ring attached to C-6, and the phenyl ring in the O-benzyl moiety. This indicates the importance of the benzyl-pyridine scaffold in this series of compounds. The favorable contribution of the O-benzyl fragment is consistent with the IC50 data for compounds lacking this group (compounds 28–40, see Table 1), which show only moderate affinity for CB1. In contrast, the most active compounds have a phenyl ring with different substitution patterns at this location. Furthermore, small groups at C-3 are more suitable for high CB1 affinity, while the bulkier piperidinyl fragment in compound 41 makes an unfavorable contribution to the biological property (IC50). These findings suggest that C-3 in the pyridine ring is an important target for molecular modification and additional SAR studies.
Using the pIC50 values and the steric and electrostatic properties calculated by the CoMFA method, and based on the molecular alignment of all optimized structures (see Figure 3), several PLS analyses were carried out to correlate the CoMFA fields with the biological property (pIC50). The main statistical results are displayed in Table 5.
Table 5. CoMFA statistical results for the training set
| | No region focusing | Region focusing, w = 0.3 | Region focusing, w = 0.5 | Region focusing, w = 0.8 | Region focusing, w = 1.2 |
Table 5 shows that an initial analysis without region focusing (an advanced method of noise reduction) produced a low cross-validated correlation coefficient (q² = 0.569, with four components). We therefore applied the region-focusing procedure, weighted by standard deviation coefficient values ranging from 0.3 to 1.2, with a grid spacing varying from 0.5 to 1.5. The best statistical results were obtained when region focusing was weighted by a standard deviation coefficient of 0.8, with a grid spacing of 1.0 (r² = 0.980, SEE = 0.142, q² = 0.769, SEP = 0.458, and four components). Within this CoMFA model, the steric and electrostatic fields account for 54.2% and 45.8% of the total variance, respectively. Internal validation was again performed using the LOO methodology. Figure 6 displays the experimental versus predicted values of pIC50 and shows the good correlation between them for the training set compounds.
Figure 6. Predicted versus experimental values of pIC50 for the 52 CB1 inverse agonists (training and test sets) obtained with the comparative molecular field analysis model.
After model construction and internal validation, the predictive capability of the most significant CoMFA model, obtained with the 40 training set molecules, was assessed by predicting pIC50 values for the 12 test set molecules (compounds 41–52, Table 1). These test set compounds were submitted to the same alignment and descriptor generation procedures as the training set compounds, as previously described. The results of the external validation are listed in Table 4 and plotted in Figure 6. They show good agreement between experimental and predicted values for the test set compounds, which indicates the reliability of our best CoMFA model. The predicted pIC50 values for the test set compounds fall close to the experimental values, with a few exceptions.
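The text reports the external validation through individual residuals. One commonly used single-number summary of such a test, shown here only as an illustrative sketch, is a predictive r² computed from the test set; the text does not report this statistic. The definition assumed below is 1 − PRESS/SD, with SD taken about the mean activity of the training set, and all values are hypothetical.

```python
def predictive_r2(train_activity, test_experimental, test_predicted):
    """r2_pred = 1 - PRESS / SD, with SD taken about the training-set mean."""
    train_mean = sum(train_activity) / len(train_activity)
    # PRESS: squared prediction errors summed over the test compounds.
    press = sum((e - p) ** 2 for e, p in zip(test_experimental, test_predicted))
    # SD: squared deviations of test activities from the training-set mean.
    sd = sum((e - train_mean) ** 2 for e in test_experimental)
    return 1.0 - press / sd


# Hypothetical pIC50 values (illustration only).
train_activity = [6.0, 7.0, 8.0]
test_experimental = [6.5, 8.0]
test_predicted = [6.7, 7.8]

print(round(predictive_r2(train_activity, test_experimental, test_predicted), 3))
```

A value close to 1 indicates that the model predicts the test compounds much better than simply assigning them the mean training set activity.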
The visualization of the steric and electrostatic interaction fields from the CoMFA analysis is an important tool to guide further molecular modifications in the search for improved CB1 ligands. In the graphical CoMFA results, regions where bulky substituents are favorable and unfavorable are represented in green and yellow, respectively. For the electrostatic field, red contours represent regions in which electronegative substituents may increase the biological activity, whereas blue contours indicate regions in which electropositive groups would enhance activity. Figure 7 displays the CoMFA contour map for the steric and electrostatic fields, with one of the most potent CB1 ligands (compound 7) shown as reference.
According to the CoMFA/PLS analysis, the steric and electrostatic fields contribute in a 54.2/45.8 ratio to the total variance, meaning that both kinds of interactions should be considered of comparable importance to CB1 affinity within this series of compounds. Figure 7 shows a large region surrounding the pyridine nitrogen where substitution with electropositive groups can favor the biological activity. This finding emphasizes that the C-2 side chain can contribute not only as a hydrophobic functionality but also as a region where polar interactions can take place (23). Regarding steric properties, our model shows clearly that sizable groups as X2 and X3 substituents (see general structure in Table 1) can improve biological activity, as represented by the green contours. On the other hand, the yellow contours suggest that some steric restrictions can be expected in the portion of the CB1 receptor active site that interacts with the X1-phenyl ring. Thus, our CoMFA model suggests that the SAR of these compounds can be further explored by modifying the X1, X2, and X3 substituents, keeping in mind that X1 must be a small group, while X2 and X3 can be replaced by bulkier moieties.
Molecular modeling studies performed on SR141716 and some analogs interacting with homology models of the CB1 receptor (29,44) predicted that the antagonists bind to this receptor in a hydrophobic region located within the transmembrane (TM) region formed by helices TM3, TM5, TM6, and TM7 (45). Specifically, hydrogen bond interactions with Lys192 and Ser383 are considered to be crucial for antagonism (46). Additionally, it is well accepted that hydrophobic contacts are important for stabilizing the interaction, along with π–π stacking interactions between the aromatic system formed by the phenyl groups attached to the pyrazole ring and the Tyr275, Trp279, Trp356, and Phe379 side chains (45,46). All of these observations are in good agreement with our findings, suggesting that antagonist affinity can be improved as long as a balance between electrostatic and hydrophobic functions is maintained in the substituents neighboring the pyridine nitrogen.