Theoretical Studies of QSAR and Molecular Design on a Novel Series of Ethynyl-3-Quinolinecarbonitriles as Src Inhibitors


Corresponding authors: Wen Juan Wu,


A theoretical study of the two-dimensional and three-dimensional quantitative structure–activity relationships and a docking analysis of a novel series of ethynyl-3-quinolinecarbonitriles acting as Src inhibitors has been carried out. To correlate the c-Src kinase-inhibition activity of these compounds with their two-dimensional and three-dimensional structural properties, quantitative structure–activity relationship models with satisfactory internal and external predictive abilities were established for 39 known compounds. A combined method of density functional theory, molecular mechanics and statistics, together with comparative molecular field analysis, was applied to develop the two-dimensional and three-dimensional quantitative structure–activity relationship models. The leave-one-out cross-validation q2 values of the two-dimensional quantitative structure–activity relationship and comparative molecular field analysis models are 0.834 and 0.812, respectively. The predictive abilities of these models were further validated with a test set of 10 compounds, and the predicted IC50 values were in good agreement with the experimental ones. The appropriate binding orientations and conformations of these compounds interacting with c-Src kinase were also revealed by the docking study. Based on the two-dimensional and three-dimensional quantitative structure–activity relationship results along with the docking analysis, some important factors responsible for the inhibitory activity of this series of compounds are discussed in detail. These factors can be summarized as follows: selecting a suitably large substituent R2, increasing the negative charge of the first atom of substituent R1, and increasing the net charge of the C15 atom on ring-C will enhance the activity. The interactions between protein and ligand were also revealed in detail. These results help in understanding the action mechanism and in designing novel potential Src inhibitors.
Based on the established models and some design considerations, three new compounds with rather high predicted Src-inhibitory activity have been theoretically designed and are presented to experimenters for reference.

c-Src, the oldest and most studied non-receptor protein tyrosine kinase, is the prototypic member of a family of kinases (Src family kinases, SFKs) that plays a pivotal role in signal transduction pathways controlling cell growth, proliferation, invasion, and apoptosis. In most normal cells, c-Src is tightly regulated at low levels and maintained in an inactive conformation. However, in many human tumor types, c-Src kinase is unregulated (1,2). Recently, many reports have shown that the over-expression or over-activation of c-Src is associated with a variety of human solid tumors, including breast (3), colon (4), and pancreatic (5,6) cancers, suggesting that controlling c-Src kinase activity is important for treating cancer. Src has been recognized as an attractive therapeutic target for the discovery of antitumor drugs (7,8), and thus the development of novel c-Src-targeting agents is of considerable interest. So far, Src kinase inhibitors spanning a variety of structural classes have been synthesized (9–12); the three most advanced compounds, i.e., the anilinoquinazoline AZD0530 (13), the thiazolecarboxamide BMS-354825 (14), and the quinolinecarbonitrile SKI-606 (15,16) (Figure 1), are undergoing clinical evaluation.

Figure 1.

 Molecular structures of AZD0530, BMS-354825, and SKI-606.

SKI-606 (bosutinib) is a highly selective, dual-specific c-Src/Abl kinase inhibitor based on a 3-quinolinecarbonitrile template, and it is currently in clinical development for the treatment of a wide range of tumor types (17). Recently, a novel series of ethynyl-3-quinolinecarbonitriles possessing potent biological activity against Src, in which the 7-alkoxy group of SKI-606 is replaced by an ethynyl group, has been synthesized by Broschell et al. (18–20). These compounds, acting as ATP-competitive inhibitors of Src kinase activity, can block the growth of cancer cells by interfering with the phosphorylation of Src. The preliminary structure–activity relationships indicate that particular chemical features influence their activity. Moreover, a previous 3D-QSAR study on inhibitors of c-Src kinase has revealed some information on structural modification at several substitution positions, such as the 3-position, the aniline NH position, and the 2′- and 5′-positions of the aniline ring (21). In addition, Sun et al. (22) reported 2D-QSAR studies using several statistical approaches on a series of 4-anilino-quinolinecarbonitriles to explore the structural requirements for activity against Src kinase. However, the details of how the structural features of these ethynyl-3-quinolinecarbonitriles influence their antitumor activities, as well as the inhibitory mechanism of these compounds, remain largely unknown. It is therefore worthwhile to investigate the quantitative structure–activity relationship (QSAR) of this kind of compound, to gain insight into the interaction mechanism and to offer useful clues for designing new antitumor drugs.

For many years, quantitative structure–activity relationship (QSAR) analysis has been used efficiently to study the biological mechanisms of various reactive chemicals (23,24). It is a powerful technique that quantitatively relates variations in biological activity to molecular structures or properties. In particular, a wide variety of descriptors (variables) relating to electronic structure, geometric structure and molecular properties can now be calculated with density functional theory (DFT) and molecular mechanics (MM2) methods, which is very advantageous for establishing a robust and predictive 2D-QSAR model and for identifying the variables most relevant to biological activity. Meanwhile, 3D-QSAR methods, such as comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA), can give a better understanding of the particular chemical features and structural requirements relating to activity, and they are routinely used in modern drug design. Because their results are visually intuitive and robust, 3D-QSAR models help to understand the non-bonding interaction characteristics between the drug molecule and the target (25,26). In addition, docking simulations were also conducted on these inhibitors, as they can provide a detailed picture of the binding interaction between a ligand and a receptor and support further study of the action mechanism (27). Therefore, a combined 2D-QSAR, 3D-QSAR, and docking study can offer more insight into the details of protein–inhibitor interactions and the factors affecting bioactivity, and thus it can more effectively direct the design of new potential inhibitors.

In this work, a novel series of ethynyl-3-quinolinecarbonitriles with antitumor activity against c-Src protein tyrosine kinase, reported recently in the literature (18–20), was chosen for the 2D-QSAR, 3D-QSAR, and docking studies. A combined method of DFT, MM2, and statistics was adopted to perform the 2D-QSAR study, and the CoMFA method was used to perform the 3D-QSAR study. The aim of this work is to establish reliable 2D/3D-QSAR models and to determine the probable binding conformations of these compounds, and thereby to provide theoretical guidance for the molecular design of this kind of novel anticancer drug. Based on these models, some new compounds with higher predicted inhibitory activity against c-Src were theoretically designed; they await experimental verification.

Calculation and Research Method

Data sets

A novel series of ethynyl-3-quinolinecarbonitriles with well-defined inhibitory activities against the human Src enzyme, all measured with the same biological assay, was taken for this study. The IC50 values of these inhibitors span a 750-fold range between the least and the most active compounds; thus, the data set is diverse enough to construct stable QSAR models. The general structural formulae of the studied compounds are shown in Figure 2. The total set of these inhibitors (39 compounds) was divided into a training set (29 compounds) to generate the 2D- and 3D-QSAR models and a test set (10 compounds) to evaluate the predictive power of the resulting models. The test compounds were selected manually so as to cover the structural diversity and the wide range of activities in the data set. In vitro c-Src kinase inhibitory activities were converted into the corresponding pIC50 (−logIC50) values and used as dependent variables in the 2D- and 3D-QSAR analyses.
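The conversion from IC50 to pIC50 can be sketched as follows. This is a minimal illustration, assuming IC50 values are given in nM (as in Table 1) and that pIC50 is taken as −log10 of the molar IC50; the function names are hypothetical. It also shows that a 750-fold IC50 range corresponds to about 2.875 pIC50 units, which agrees with the difference between the highest (8.444) and lowest (5.569) pIC50 values quoted in the text.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nm)


def fold_to_pic50_span(fold: float) -> float:
    """A k-fold ratio between two IC50 values equals log10(k) pIC50 units."""
    if fold <= 0:
        raise ValueError("fold difference must be positive")
    return math.log10(fold)


if __name__ == "__main__":
    print(pic50_from_nm(1000.0))                # 1 uM expressed as pIC50
    print(round(fold_to_pic50_span(750.0), 3))  # pIC50 range for a 750-fold span
```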

Figure 2.

 General structural formula and numbering of ethynyl-3-quinolinecarbonitriles (left) and template molecule (right, compound 21).

2D-QSAR analysis

Calculations of 2D-QSAR descriptors.

As none of the compounds has symmetry (see Table 1), and in order to save computational time, the 3D geometries of the compounds were first optimized by the MM2 method in the Chem3D software (footnote a) to find a low-energy conformation for each compound. Then, DFT calculations were carried out using Becke’s three-parameter hybrid functional (B3LYP) (28) and the 6-31G basis set. The stability of the optimized configurations was confirmed by frequency analysis, in which no imaginary frequency was found for any configuration at its energy minimum. Natural population analysis (NPA) was also carried out to obtain the NPA charges. The above-mentioned calculations were all performed with the Gaussian 03 quantum chemistry program package (footnote b). In addition, the surface area (S) and van der Waals volume (V) of the whole molecule, and the hydrophobic coefficient (logPR) and molar refractivity (MRR) of the substituent (R), etc., were calculated by the molecular mechanics method (MM2) using the HyperChem software (Ver. 7.0).

Table 1.   Structures and biological activities of a series of ethynyl-3-quinolinecarbonitriles
No.   R1   L   R2   R   Ar   IC50 (nM)
a Compounds in the test set.


From the results of the DFT calculations, we selected several electronic-structure parameters, such as the energy (εHOMO) of the highest occupied molecular orbital (HOMO), the energy (εLUMO) of the lowest unoccupied molecular orbital (LUMO), the energy difference (ΔεL − H) between the LUMO and the HOMO of the compound, the total dipole moment (μ) of the molecule, the total net charge (ΣQA) of the ring-A skeleton, and net atomic or group charges, such as the net charge (QN1) of the N atom at site 1 on ring-B and the net charge (QR) of the substituent R. Meanwhile, to take other relevant factors into account, descriptors of molecular properties were also selected, such as the surface area (SR1), volume (VR1), hydrophobic coefficient (logPR1), and molar refractivity (MRR1) of the substituent R1. Overall, more than 30 descriptors were adopted as candidates for the statistical analysis. These descriptors cover electronic, geometric and molecular properties; they have definite physical meanings and have proven useful in the analysis of interaction mechanisms and in molecular design.

Statistical analysis

To select the predominant descriptors affecting the Src inhibitory activity of the compounds, a correlation analysis was performed with the statistical software SPSS (footnote d), taking every candidate descriptor as an independent variable and pIC50 as the dependent variable. To eliminate inter-correlated parameters and minimize the information overlap in the model, only descriptors with low inter-correlation (inline image) were considered (29). The descriptors with high correlation with pIC50 and low inter-correlation were then used in a stepwise multiple linear regression analysis to establish the optimal 2D-QSAR model. Next, the well-known leave-one-out (LOO) cross-validation scheme was adopted to evaluate the predictive ability of the established equation. This test is necessary because a high correlation coefficient R2 only indicates how well the equation fits the data, not how well it can predict new data not included in the fit. The square of the cross-validation coefficient, q2, which is used as a criterion of both the robustness and the predictive ability of the model, should be >0.5 for a reliable model (30).
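The leave-one-out procedure described above can be sketched in a few lines. This is an illustrative sketch only: it assumes ordinary least-squares regression with an intercept and the common definition q2 = 1 − PRESS/SS, where SS is the sum of squared deviations of the observed pIC50 values from their mean. The descriptor matrix and activities below are synthetic stand-ins, not the data of this study, and all function names are hypothetical.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares fit of y = b0 + X @ b; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict activities for the rows of a 2D descriptor matrix X."""
    return coef[0] + X @ coef[1:]


def r2_and_loo_q2(X, y):
    """Return (R2, q2): fitted R2 and leave-one-out cross-validated q2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    ss_tot = np.sum((y - y.mean()) ** 2)
    fitted = predict_mlr(fit_mlr(X, y), X)
    r2 = 1.0 - np.sum((y - fitted) ** 2) / ss_tot
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # drop compound i
        coef = fit_mlr(X[keep], y[keep])       # refit without it
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return r2, 1.0 - press / ss_tot


# Synthetic example: 29 "compounds", 3 descriptors (assumed sizes for illustration)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(29, 3))
y_demo = 7.0 + X_demo @ np.array([0.8, -0.6, 0.3]) + rng.normal(scale=0.2, size=29)
r2_demo, q2_demo = r2_and_loo_q2(X_demo, y_demo)
print(f"R2 = {r2_demo:.3f}, q2 = {q2_demo:.3f}")
```

For least-squares models q2 can never exceed R2, which is why a q2 close to R2 is read as a sign of a stable model.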

Usually, a high q2 for the training set only shows good internal validation; it does not automatically imply high predictive ability for an external test set (31), because q2 tends to over-estimate the predictive ability of the model. To obtain a QSAR model with more reliable predictive ability, external validation is also crucial, and it was carried out with the test set to further confirm the predictive ability. A QSAR model is considered to have high predictive power only if the square of the predictive correlation coefficient (R2pred) between the experimental and predicted activities is >0.6 for the test set (32).
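External validation can be illustrated with a short sketch. Two definitions of a test-set "predictive R2" are in common use, and the text does not state which one was applied, so both are shown here as assumptions: a PRESS-based form referenced to the training-set mean, and the squared Pearson correlation between experimental and predicted activities. The function names and the numbers are hypothetical.

```python
import numpy as np


def r2_pred_press(y_test, y_hat, y_train_mean):
    """1 - PRESS/SD, with SD taken around the training-set mean activity."""
    y_test = np.asarray(y_test, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    press = np.sum((y_test - y_hat) ** 2)
    sd = np.sum((y_test - y_train_mean) ** 2)
    return 1.0 - press / sd


def r2_pred_pearson(y_test, y_hat):
    """Squared Pearson correlation between experimental and predicted values."""
    r = np.corrcoef(np.asarray(y_test, float), np.asarray(y_hat, float))[0, 1]
    return r ** 2


# Hypothetical test set: predictions are all 0.5 log units too high
y_exp = np.array([6.0, 7.0, 8.0])
y_prd = y_exp + 0.5
print(r2_pred_pearson(y_exp, y_prd))     # insensitive to a constant offset
print(r2_pred_press(y_exp, y_prd, 7.0))  # penalizes the offset
```

The example shows why the choice matters: a systematic offset leaves the squared correlation untouched but lowers the PRESS-based value.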

3D-QSAR analysis

Computational method.

The comparative molecular field analysis study was performed using the SYBYL 6.9 molecular modeling package (footnote e) running on an SGI R2400 workstation. All CoMFA parameters were kept at their default values unless otherwise stated.

Molecular modeling and alignment.

The bioactive conformation and the molecular alignment rule are crucial variables in a CoMFA analysis. As the receptor-bound conformations of these inhibitors are not known, the lowest-energy conformations were taken as reasonable initial structures for the CoMFA analysis. Here, a conformational search was first performed in the Chem3D software to find the global energy-minimum conformation of the most active compound 21 by rotating its rotatable bonds. Energy minimizations were then performed using the Tripos force field (33) with a distance-dependent dielectric constant and the conjugate gradient method, with a convergence criterion of 0.21 kJ/mol. The rest of the molecules were built by changing the substituents of compound 21 and were minimized in the same way.

The most active compound 21 was used as the template onto which the rest of the molecules were aligned. The common fragment (atoms 1 to 19) is shown in Figure 2, and the aligned compounds are displayed in Figure 3.

Figure 3.

 Alignment of the 39 studied compounds.

CoMFA modeling.

Comparative molecular field analysis was performed using the QSAR option of SYBYL. The steric (Lennard-Jones 6–12 potential) and electrostatic (Coulomb potential) fields were calculated at each grid point using an sp3 carbon probe atom with a +1 charge, a grid spacing of 0.2 nm, and a van der Waals radius of 0.152 nm. The truncation values for both the steric and the electrostatic energies were set to 30 kcal/mol (125.4 kJ/mol).
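The idea behind the field calculation can be sketched as follows. This is a simplified illustration, not the SYBYL implementation: it assumes a constant dielectric of 1, a 6–12 potential written in the R-min form, geometric-mean well depths, and an illustrative probe well depth of 0.107 kcal/mol. Only the probe charge (+1), the probe radius (1.52 Å), the 2.0 Å grid spacing and the 30 kcal/mol truncation are taken from the text; all names are hypothetical.

```python
import numpy as np

COULOMB_K = 332.0637  # kcal*Angstrom/(mol*e^2), standard conversion constant
CUTOFF = 30.0         # kcal/mol, truncation value quoted in the text


def make_grid(lo, hi, spacing=2.0):
    """Regular 3D lattice of probe positions (Angstrom) spanning lo..hi."""
    axes = [np.arange(a, b + 1e-9, spacing) for a, b in zip(lo, hi)]
    return np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)


def probe_fields(grid, coords, charges, radii, well_depths,
                 probe_radius=1.52, probe_eps=0.107, probe_charge=1.0):
    """Steric and electrostatic probe energies at each grid point, truncated."""
    grid = np.asarray(grid, dtype=float)
    coords = np.asarray(coords, dtype=float)
    charges = np.asarray(charges, dtype=float)
    radii = np.asarray(radii, dtype=float)
    well_depths = np.asarray(well_depths, dtype=float)
    # Distance from every grid point to every atom, guarded against zero
    d = np.linalg.norm(grid[:, None, :] - coords[None, :, :], axis=2)
    d = np.maximum(d, 1e-6)
    elec = (COULOMB_K * probe_charge * charges / d).sum(axis=1)
    rmin = radii + probe_radius                # assumed combination rule
    eps = np.sqrt(well_depths * probe_eps)     # assumed combination rule
    ratio6 = (rmin / d) ** 6
    steric = (eps * (ratio6 ** 2 - 2.0 * ratio6)).sum(axis=1)
    # Truncate both fields so grid points inside atoms do not dominate
    return np.minimum(steric, CUTOFF), np.clip(elec, -CUTOFF, CUTOFF)


# One hypothetical atom at the origin, probed at 1 A and 10 A along x
steric_demo, elec_demo = probe_fields(
    grid=[[1.0, 0.0, 0.0], [10.0, 0.0, 0.0]],
    coords=[[0.0, 0.0, 0.0]], charges=[0.5], radii=[1.7], well_depths=[0.107])
print(steric_demo, elec_demo)
```

Each molecule contributes one row of such field values, and these rows form the descriptor columns that the PLS step then regresses against pIC50.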

The 3D-QSAR equations were generated using the partial least-squares (PLS) statistical method. The PLS algorithm with the leave-one-out (LOO) cross-validation method (34) was employed to obtain the highest cross-validation correlation coefficient (q2) and the optimum number of components N. The non-cross-validated models were assessed by the conventional correlation coefficient R2, the standard error of prediction Spre and the F value. To obtain confidence limits and test the stability of the obtained QSAR models, a bootstrapping analysis with 100 runs was also carried out.

Molecular docking

To explore the interaction mechanism between these ethynyl-3-quinolinecarbonitriles and c-Src kinase in more detail, a molecular docking study was performed with the program AutoDock 4.2 (footnote f).

The X-ray crystal structure (PDB ID: 2H8H) of c-Src kinase in complex with the quinazoline derivative AZD0530 was chosen as the template. Before docking, all water molecules and subunits were removed, and polar hydrogens and Kollman charges were added to the protein. Then, torsions were set for all compounds. Subsequently, the grid maps were generated, and the docking grid and the active site were defined using AutoGrid 4.0 (footnote g). The grid box was centered on the center of the ligand in the corresponding crystal structure complex. The Lamarckian genetic algorithm (LGA) was employed to explore the possible conformations of the Src inhibitors in the binding site. The docking settings were as follows: a maximum of 2 500 000 energy evaluations, an initial population of 300 randomly placed individuals, a maximum of 27 000 generations, a mutation rate of 0.02, a crossover rate of 0.80, an elitism value of 1, and 100 independent runs. The ligand was fully optimized inside the binding site during the docking simulations.
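For reference, the search settings listed above can be collected in one place and rendered as lines of an AutoDock 4 docking parameter file. This is a sketch under assumptions: the keyword names are those understood to be used by AutoDock 4 parameter files, the values are the ones quoted in the text, and the fragment is not a complete parameter file (map, ligand and output entries are omitted). The helper name is hypothetical.

```python
# Lamarckian GA settings quoted in the text, keyed by the assumed
# AutoDock 4 docking-parameter-file keywords.
LGA_SETTINGS = {
    "ga_pop_size": 300,           # initial population of random individuals
    "ga_num_evals": 2500000,      # maximum number of energy evaluations
    "ga_num_generations": 27000,  # maximum number of generations
    "ga_elitism": 1,              # top individuals that survive unchanged
    "ga_mutation_rate": 0.02,
    "ga_crossover_rate": 0.80,
    "ga_run": 100,                # independent docking runs
}


def dpf_lines(settings):
    """Render settings as 'keyword value' lines, one per parameter."""
    return [f"{key} {value}" for key, value in settings.items()]


if __name__ == "__main__":
    print("\n".join(dpf_lines(LGA_SETTINGS)))
```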

Results and Discussion

2D-QSAR studies

Considering the balance of the QSAR quality and the number of employed descriptors, a 2D-QSAR equation was obtained for 29 compounds in training set through a stepwise multiple linear regression analysis as follows:


where n is the number of compounds in the training set, R2 is the square of the correlation coefficient of regression, R2adj is the square of the adjusted correlation coefficient, r is the root-mean-square deviation, Sreg is the standard deviation of regression and Spre is that of prediction, F is the Fisher ratio value, p is the p value from the F-statistic, and q2 is the square of the LOO cross-validated coefficient. The selected parameters in eqn 1, as well as the deviations of regression and prediction, are listed in Table 2.

Table 2.   Selected parameters and calculated 2D-QSAR and CoMFA results of the studied compounds
No.   Q20 (a.u.)   VR2   QC15 (a.u.)   pIC50 (expt.)   2D-QSAR model (eqn 1, eqn 2, eqn 3)   3D-QSAR model (CoMFA (pred.), Residuals d)
a The regression (reg.) deviation (dev.) for all compounds: reg. dev. = pIC50 (reg.) − pIC50 (expt.).

b The prediction (pred.) deviation (dev.) for compound i (i = 1, …, n): pred. dev. = pIC50 (pred.)i − pIC50 (expt.)i, where pIC50 (pred.)i is calculated by the regression equation obtained after leaving compound i out.

c The square of the LOO cross-validation coefficient of compound i, computed by the new cross-validation procedure after leaving this datum point (No. i) out from the n compounds.

d Residual = pIC50 (pred.) − pIC50 (expt.).


From Table 2, we can see that the inhibitory activity (pIC50 = 5.569) of compound 14 is far lower than that of any other compound, and that in the first regression, eqn 1, the regression and prediction deviations of compound 14 are rather large, reaching 2.51 and 2.62, respectively; thus, compound 14 may be regarded as an outlier. We use a new approach (35) to identify outliers: for each compound i, the square of the LOO cross-validation coefficient is recomputed by the new cross-validation procedure after leaving that datum point (No. i) out from the n compounds. A compound whose removal gives an unduly high value can be considered an outlier, and a compound whose removal gives a low value can be regarded as an influential point. These per-compound q2 values for this series of compounds are also listed in Table 2.
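The outlier screen described in this paragraph can be sketched as below. This is a hedged illustration of the idea rather than the exact procedure of reference (35): for each compound i, the compound is removed and the LOO q2 of the remaining n − 1 compounds is recomputed, so that an outlier shows up as the removal giving the largest q2. Ordinary least squares is assumed, PRESS is obtained from the leverages (an exact shortcut for least-squares fits), and the data are synthetic with one planted outlier; all names are hypothetical.

```python
import numpy as np


def loo_q2(X, y):
    """LOO q2 for an OLS model, using PRESS_i = e_i / (1 - h_ii)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    Q, _ = np.linalg.qr(A)               # leverages are squared row norms of Q
    leverage = np.sum(Q ** 2, axis=1)
    press = np.sum((resid / (1.0 - leverage)) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def q2_after_removing_each(X, y):
    """For every compound i, the LOO q2 of the data set without compound i."""
    n = len(y)
    scores = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        scores[i] = loo_q2(X[keep], y[keep])
    return scores


# Synthetic data: 29 compounds, 3 descriptors, one planted outlier at index 13
rng = np.random.default_rng(1)
X_syn = rng.normal(size=(29, 3))
y_syn = 7.0 + X_syn @ np.array([0.8, -0.6, 0.3]) + rng.normal(scale=0.15, size=29)
y_syn[13] -= 2.5
scores_syn = q2_after_removing_each(X_syn, y_syn)
print("largest q2 is obtained on removing index", int(np.argmax(scores_syn)))
```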

From Table 2, we can readily see that removing compound 14 gives an extremely large leave-one-out q2 value (0.7789) for the training set, so compound 14 can be confirmed as an outlier. In addition, a careful analysis of the correlation between the pIC50 values and the molecular structures for this series of compounds reveals a peculiar fact: changing the phenyl to 2-pyridyl in the substituent R2 decreases the activity of the resulting compound. For example, on going from compound 20 (R2 = phenyl-4-CH2NMe2) to compound 9 (R2 = 2-pyridyl-4-CH2NMe2), the activity decreases 2.4-fold. However, on going from compound 19 (R2 = phenyl-3-CH2NMe2) to compound 14 (R2 = 2-pyridyl-3-CH2NMe2), the activity decreases greatly, by 272-fold, which is not in line with the general rule. Meanwhile, all compounds (except for 14) having the N23 atom in the C-7 substituent have higher activities than compounds having phenyl or pyridyl as the C-7 substituent (the reasons will be explained in Section 3D-QSAR model interpretation). This further supports the view that compound 14 is an outlier and that this compound may have a special interaction with the receptor. After omitting compound 14, a satisfactory model, eqn 2, was obtained as follows:


From eqn 1 to eqn 2, the correlation coefficient (R2) and especially the important cross-validated coefficient q2 increase greatly, from 0.582 to 0.856 and from 0.490 to 0.779, respectively; the F value also increases (from 11.60 to 47.65), whereas the r, Sreg, and Spre values decrease. Nevertheless, the deviations of regression and prediction for compound 4 in eqn 2 are still rather large (0.78 and 0.99, respectively), so, after eliminating compound 14, we recalculate the q2 value obtained on removing each remaining compound in turn and list these values in Table 2. From Table 2, we find that removing compound 4 also gives a very large value (0.834) for the training set, so compound 4 should be considered another outlier. Moreover, compound 4 has a completely neutral substituent R2 (i.e., the largest hydrophobic coefficient), and thus has weak binding affinity for plasma protein and poor solubility. Because the substituent R2 of these compounds is exposed to the solvent, good solubility is needed (the reason will be further explained in Section 3D-QSAR model interpretation). This compound is very different from all the other compounds, which is a further reason to regard compound 4 as an outlier. Therefore, deleting both compounds 4 and 14 as outliers is more reasonable. Finally, eqn 3, with better statistical quality, was developed from eqn 2:
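The quoted F values can be cross-checked against the quoted R2 values with the standard relation F = (R2/k) / ((1 − R2)/(n − k − 1)) for a linear model with k descriptors and n compounds. The sketch below assumes k = 3 for eqns 1 and 2 (inferred from the three descriptors VR2, Q20 and QC15 discussed for eqn 3, not stated explicitly for eqns 1 and 2), with n = 29 for eqn 1 and n = 28 after omitting compound 14. The function names are hypothetical.

```python
def f_statistic(r2: float, n: int, k: int) -> float:
    """Fisher ratio of a linear regression with k descriptors and n compounds."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R2, which penalizes the number of descriptors used."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


if __name__ == "__main__":
    # eqn 1: n = 29, R2 = 0.582 (reported F = 11.60)
    print(round(f_statistic(0.582, 29, 3), 2))
    # eqn 2: n = 28, R2 = 0.856 (reported F = 47.65)
    print(round(f_statistic(0.856, 28, 3), 2))
```

Under these assumptions the first value reproduces the reported 11.60, and the second comes out near 47.6, which agrees with the reported 47.65 to within the rounding of R2 to three decimals.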


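pIC50 = a + b·VR2 + c·Q20 + d·QC15, with b > 0, c < 0, d > 0 (numerical coefficients not recovered); n = 27, R2 = 0.889, q2 = 0.834   (3)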

The R2 value of eqn 3 increases to 0.889, from 0.582 for eqn 1 and 0.856 for eqn 2, and the cross-validated q2 increases to 0.834, from 0.490 for eqn 1 and 0.779 for eqn 2; moreover, the F value of eqn 3 is much greater than those of eqns 1 and 2, whereas the r, Sreg and Spre values decrease. As the q2 (0.834) of eqn 3 is 0.055 higher (>0.05) than that of eqn 2, this further shows that compound 4 should be regarded as an outlier. In addition, the value of q2 (0.834) is quite close to that of R2 (0.889), so eqn 3 should be more reliable than eqn 2. Therefore, the following discussion is based on the optimal model, eqn 3. The deviations of regression and prediction of eqn 3 are also listed in Table 2.

Then, the QSAR model eqn 3 was further applied to predict the inhibitory activities of the ten compounds in the test set, and the resulting predictive correlation coefficient for the test set reaches 0.793. The results obtained for the test set further demonstrate that this 2D-QSAR model is robust and predictive, and that it can offer useful theoretical guidance for understanding the action mechanism and directing the molecular design of compounds of this kind with high inhibitory activity. The plot of the predicted pIC50 values based on eqn 3 versus the experimental ones is shown in Figure 4. The predicted pIC50 values are in satisfactory agreement with the experimental ones.

Figure 4.

 Plot of predicted activities vs. experimental ones based on eqn 3, in which 27 compounds in the training set are expressed as dots and 10 compounds in the test set are expressed as triangles.

In general, the interaction between a drug and its receptor depends on two fundamental factors: the electrostatic match and the steric match (36). Equation 3 shows that the volume (VR2) of the substituent R2, the net charge (Q20) of the first atom of the substituent R1 and the net charge (QC15) of the C15 atom on ring-C are the main independent factors contributing to the anticancer activity of this series of compounds. This model shows that increasing the VR2 and QC15 values and decreasing the Q20 value can enhance the anticancer activity.

The correlation coefficient of VR2 versus pIC50 (= 0.814) shows that VR2 makes the largest contribution to pIC50, which implies that VR2 plays a crucial role in the interaction of the compound with the biomacromolecule. The positive sign of the VR2 term in eqn 3 indicates that the larger the VR2 value, the higher the activity of the compound.

From Table 2, we can see that the compounds with the smallest VR2 value (27, 28 and 35; VR2 = 71.7 for the H atom) have the lowest (pIC50 = 6.041, for compound 27) or comparatively low antitumor activities (pIC50 = 6.357 and 6.638 for compounds 28 and 35, respectively).

It is possible that there is a steric match between the substituent R2 and the receptor. Hennequin et al. found that the C-7 substituent of quinazoline-based inhibitors is exposed to the solvent area, and thus a bulky substituent of moderate length is favored at this site (37). Therefore, we postulate that the substituent R2 is part of the active zone and may fall into the large solvent area of the receptor, so that the activity of a compound whose R2 has a large volume can be greatly improved.

In addition, the electrostatic parameter Q20 exhibits a negative correlation with pIC50 (= −0.798), so the larger the Q20 value, the lower the activity of the compound. From Table 2, we can see that compound 21, with the smallest Q20 value (−0.5241 a.u.), has the greatest activity (pIC50 = 8.444) in this data set. As compounds 16, 22, 24, 25, 34 and 39, with larger Q20 values, have lower activities (their pIC50 values are 6.276–7.658), replacing the strongly electronegative O atom with an H atom greatly reduces the inhibitory activity of the compound. From the correlation analysis, we find that the total net charge (ΣQA) of the A-ring and the hydrophobic parameter (logPR1) of the substituent R1 are strongly correlated with Q20 (R = −0.975 and −0.898, respectively). That is, the larger ΣQA and logPR1 are, the lower Q20 is, and thus the greater the pIC50 is, because a large ΣQA means that the first atom of the substituent R1 linked to the A-ring should be a strongly electronegative atom. Moreover, high hydrophobicity may require a large aliphatic hydrocarbon substituent R1. Thus, we can conclude that a compound whose substituent R1 has a reasonably long alkyl chain and a strongly electronegative first atom should have potent inhibitory activity. Therefore, the charge on the first atom of the substituent R1 may relate to the electrostatic interaction between the ligand and the receptor, and the first atom of the substituent R1 in the ligand may be an electron-withdrawing part.

Another electronic parameter selected in eqn 3 is QC15; its correlation coefficient with pIC50 is 0.357, which is positive. The results in Table 2 also show that compound 28, with almost the smallest QC15 value (−0.3552 a.u.), has the lowest activity (pIC50 = 6.041). This means that, according to the law of polarity alternation (38), electronegative substituents linked to the C15 atom should enhance the activity. This conclusion is also in agreement with Thaimatt’s work (21), which strongly suggested that an electronegative substituent at the 2′-position (C15) of the phenyl ring-C might enhance the activity.

According to the QSAR model established above and the related discussion, the three selected parameters (VR2, Q20 and QC15) describe the molecular steric and electrostatic properties that strongly influence the biological activities of the molecules. Selecting a suitably large substituent R2, increasing the negative charge of the first atom of substituent R1, and increasing the net charge of the C15 atom may enhance the inhibitory activity of the compound.

CoMFA studies

In addition to the 2D-QSAR model, a CoMFA model was also established to predict and interpret the c-Src inhibitory activities. Figure 3 shows the aligned molecules within the grid box (grid spacing 2.0 Å) used to generate the CoMFA columns. The statistical parameters of the model are listed in Table 3.

Table 3.   Statistical parameters of the CoMFA model
Statistic index   N   q2   R2   SEE   F   R2bs   SDbs   Contribution (%)
Value   6   0.812   0.993   0.076   457.4   0.997   0.004   steric 58.1, electrostatic 41.9
* N is the optimal number of components, q2 is the square of the leave-one-out (LOO) cross-validation coefficient, R2 is the square of the non-cross-validation coefficient, SEE is the standard error of estimation, F is the F-test value, R2bs is the square of R from the bootstrapping analysis (100 runs), and SDbs is the standard deviation from the bootstrapping analysis.


For a reliable predictive model, the cross-validation coefficient q2 should be >0.5. This model has a cross-validation coefficient q2 of 0.812 with six components, a non-cross-validation coefficient R2 of 0.993, an SEE of 0.076 and an F value of 457.4, suggesting that the established CoMFA model is reliable and predictive for these compounds. Moreover, the R2pred value (0.689) indicates that the predictive ability of the CoMFA model is satisfactory. The R2bs of 0.997 and SDbs of 0.004 obtained from the bootstrapping analysis (100 runs) further confirm the statistical validity and robustness of the established model. The contributions of the steric and electrostatic fields are 58.1% and 41.9%, respectively, indicating that both fields have an important influence on the ligand–receptor interactions and that the steric field is somewhat more important than the electrostatic field. This is in good agreement with the 2D-QSAR result discussed above. The predicted pIC50 values of the training and test sets, as well as their residuals, are listed in Tables 2 and 4. The plot of the predicted pIC50 values versus the experimental ones for the CoMFA analysis is shown in Figure 5, in which most points are evenly distributed along the line Y = X, suggesting that the CoMFA model is of good quality.

Table 4.   Calculated results for 10 compounds in the test set using eqn 3, as well as predicted activities and residuals by CoMFA approach
No.   VR2   Q20   QC15   pIC50 (expt.)   eqn 3: pIC50 (calc.) a, Dev. b   CoMFA: pIC50 (pred.), Residuals c
a Predicted activities calculated by eqn 3.

b Dev. = pIC50 (calc.) − pIC50 (expt.).

c Residual = pIC50 (pred.) − pIC50 (expt.).

Figure 5.

 Plot of the predicted activities vs. experimental ones for the CoMFA model.

Docking analysis

It is well known that SKI-606 can strongly inhibit the phosphorylation of Src by binding to the ATP-binding site. The compounds studied here have structures similar to that of SKI-606 and are well documented as Src inhibitors (18–20), so, on the basis of the experiments and references, we can reasonably assume that they share the same active site as SKI-606, although we do not directly identify the active site in the docking analysis. Because docking analysis can offer more insight into the protein–ligand interactions and the structural features of the active site of the protein, all studied inhibitors were docked into the binding site of c-Src kinase with AutoDock 4.2 (footnote f).

Before docking the inhibitors, the reliability of the docking protocol was first validated. The ligand AZD0530 extracted from the X-ray structure of the complex was flexibly redocked into the active binding site of c-Src kinase. The root mean square deviation (RMSD) between the redocked conformation and the crystallographic conformation of AZD0530 was 0.98 Å, indicating that the docking procedure is acceptably reliable. This docking protocol can therefore be extended to search for the binding conformations of the other inhibitors with c-Src kinase.
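The redocking check reduces to an RMSD between two sets of matched atomic coordinates. A minimal sketch follows, assuming both poses are expressed in the same receptor frame (so no superposition is applied) and that atoms are listed in the same order in both arrays; the coordinates and the function name are hypothetical.

```python
import numpy as np


def pose_rmsd(coords_a, coords_b) -> float:
    """RMSD (same units as the input) between two poses with matched atoms."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("both poses must be (n_atoms, 3) arrays of equal shape")
    # Mean of squared per-atom displacements, then square root
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


if __name__ == "__main__":
    crystal = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]])
    redocked = crystal + np.array([0.0, 0.0, 1.0])  # every atom shifted by 1 A
    print(pose_rmsd(crystal, redocked))
```

No superposition is performed on purpose: for redocking, a pose that has the right shape but sits in the wrong place should count as a failure.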

All studied inhibitors and the quinolinecarbonitrile SKI-606 were docked into the binding site of c-Src in the same way as the crystallized inhibitor AZD0530. According to the docking results for all the studied compounds (including compounds 7 and 19), all compounds are adequately docked, and their orientations are similar except for compounds 4 and 14. Figure 6 shows that SKI-606 and compound 21 suitably occupy the adenine pocket of the ATP-binding site and bind at the hinge region through hydrogen bonds with amino acid residues. The aniline group of SKI-606 is buried in the hydrophobic pocket, while the 4-methylpiperazine is oriented toward the solvent region. Their H-bond interactions with the backbone NH of Met341 and the side chain OH group of Thr338 can be clearly observed (see the broken lines in Figure 6). To explore the detailed binding characteristics of these compounds, we selected the most active compound 21 for a deeper docking study; its binding model with c-Src kinase is shown in Figure 7.

Figure 6.

 Binding conformations of SKI-606 (magenta) and compound 21 (yellow).

Figure 7.

 (A) Docking conformation of the most potent inhibitor 21 and corresponding surface of c-Src at the binding site, in which the red and blue regions represent oxygen and nitrogen atoms respectively, whereas white regions represent carbon or hydrogen atoms. (B) The interactions between the binding site and compound 21.

The aniline ring moiety is located in a large hydrophobic pocket, in van der Waals contact with Gln274, Val281, Ala293, Ile294, Ala390, Ala403, and Asp404. The substituent R1 is blocked by the side chains of Ser345 and Asp348, which limits its extension. The side chain of substituent R2 sits at the entrance of the binding site formed by Ser342, Lys343, Gly344, Phe349, and Thr354, in the solvent-accessible region. The N1 atom of the quinoline ring can form a hydrogen bond with the backbone NH of Met341 at the hinge region of the protein. Another hydrogen bond is formed between the N12 atom and the side chain OH of Thr338. The N23 atom of substituent R2, being more electronegative, readily binds an H+ ion in aqueous solvent to form a positively charged NH group, suggesting a favorable H-bonding interaction with the backbone carbonyl of Lys343.

The three theoretically designed compounds have binding conformations and interactions with c-Src kinase similar to those of compound 21. The position of the substituent R2 in ring-D of compound 19 is the same as that in compound 21 but different from that in compound 14, where it lies in the vicinity of the ethynyl group and is blocked by the C-terminal domain residue Ser345. It is an interesting question which interactions or clashes cause the profound loss of activity in compound 14, which has the same substituent in ring-D as compound 19. The loss may be attributed to the quite different conformations of compounds 14 and 19. Figure 8 shows that the substituent at the 7-position of the quinoline of compound 14 is rotated by about 90° compared with compound 21, whereas that of compound 19 is moved only slightly. This implies that compound 14 has a different interaction mode with c-Src kinase, resulting in a weaker interaction than that of compound 19. Meanwhile, Figure 9 shows that compound 4 has a high hydrophobic coefficient for the substituent R2, suggesting that its binding mode differs from that of compound 21. Taken together, the above analyses consistently show that compounds 14 and 4 interact with c-Src in a different mode from the other compounds, and thus they can be regarded as outliers.

Figure 8.

 The interactions between the binding site and compounds 14 (the magenta one) and 21.

Figure 9.

 The interactions between the binding site and compounds 4 (the magenta one) and 21.

3D-QSAR model interpretation

The CoMFA model can be displayed as vivid 3D contour maps, which can provide a more exhaustive interpretation of the biological activity and the related molecular region information. The calculated CoMFA steric and electrostatic contour maps are shown in Figure 10A,B. The steric interactions are represented by green and yellow contours, while the electrostatic interactions are represented by red and blue contours. In the green regions of steric contour plot, bulky substituents enhance biological activity, while bulky substituents in yellow regions are likely to decrease activity. The blue contours indicate the regions where electropositive groups increase activity, and the red contours indicate the regions where electronegative groups increase activity.

Figure 10.

 CoMFA contour maps with compound 21 as reference: (A) steric field; (B) electrostatic field.

There are three large yellow contours around the substituent R1, suggesting that small groups in these positions favor the biological activity, because these areas can be blocked by the nearby residues Ser345 and Asp348. This result explains why compounds 2, 25, and 34, with the smallest H atom as substituent R1, are almost 10 times more active than their C-6 isomers 27 (R1=6-ethynyl-3-pyridyl, R2=H), 28 (R1=6-ethynyl-3-pyridyl, R2=H) and 35 (R1=6-ethynyl-(CH2)2N-Me-piperazine, R2=H). Notably, a small green contour embedded in one of the large yellow contours lies close to the terminal C atom of the –OEt group of substituent R1. It indicates that a suitably bulky group at this green position would be advantageous for the biological activity. This result explains why compound 21, with ethoxy as substituent R1, has higher activity than compound 22, with the smallest hydrogen as substituent R1. The activity is also strongly influenced by the length of the alkyl chain of substituent R1, and a length of up to 2 or 3 carbon atoms seems to be suitable. This may be the reason why compounds 21 and 37, with ethoxy as substituent R1, have higher activities than the other compounds. We can therefore conclude that highly active compounds should have a moderate-size substituent R1. This conclusion is also in good agreement with the above 2D-QSAR result.

Two further small yellow contours near ring-C suggest that bulky substituents at these sites (C15–C19) are likely to decrease the activity. The docking results show that the aniline ring extends deep into the large hydrophobic binding pocket surrounded by Gly274, Val281, Ala293, Ile294, Ala390, Ala403, and Asp404, so larger groups on ring-C are unfavorable. Meanwhile, residue Ala403 lies rather close to the chlorine atom attached to C15 of the aniline ring, suggesting a lipophilic interaction between the chlorine atom and Ala403 within the c-Src kinase selectivity pocket that increases the binding affinity. It is therefore not surprising that compounds 2 (7.328), 8 (7.921), 16 (8.276), 22 (7.658), 27 (6.357), 30 (8.409), and 34 (7.553), with the more electronegative –Cl attached to the C15 position of the aniline ring, have higher activities than the corresponding compounds 25 (7.051), 26 (7.469), 23 (7.538), 24 (6.921), 28 (6.041), 38 (7.678), and 39 (6.658), with H attached to C15. This explains why, in this series of compounds, all substituents on ring-C are small groups and the substituent at C15 is the more electronegative –Cl.
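The paired comparison in the preceding paragraph can be checked with a short script. This is a minimal sketch: the compound numbers and values are those quoted in the text, read here as pIC50 values (an assumption, since the paragraph does not label them), and the fold change is taken as the IC50 ratio 10^ΔpIC50. The names `pairs` and `compare` are illustrative.

```python
# (Cl-substituted compound, value) paired with (H analogue, value),
# in the order the two lists appear in the text. Values assumed to be pIC50.
pairs = [
    ((2, 7.328), (25, 7.051)),
    ((8, 7.921), (26, 7.469)),
    ((16, 8.276), (23, 7.538)),
    ((22, 7.658), (24, 6.921)),
    ((27, 6.357), (28, 6.041)),
    ((30, 8.409), (38, 7.678)),
    ((34, 7.553), (39, 6.658)),
]

def compare(pairs):
    """Return (Cl id, H id, delta pIC50, fold change in IC50) for each pair."""
    rows = []
    for (cl_id, cl_p), (h_id, h_p) in pairs:
        delta = cl_p - h_p          # positive when the Cl compound is more active
        fold = 10 ** delta          # IC50(H analogue) / IC50(Cl compound)
        rows.append((cl_id, h_id, round(delta, 3), round(fold, 2)))
    return rows

for cl_id, h_id, delta, fold in compare(pairs):
    print(f"{cl_id} vs {h_id}: delta pIC50 = {delta:.3f}, fold = {fold:.2f}")
```

Every difference is positive, ranging from 0.277 (compounds 2 and 25) to 0.895 (compounds 34 and 39), which corresponds to roughly a 2- to 8-fold gain in potency for the chlorinated compounds.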

Another big green contour is near the inline image-position of ring-D, but several small yellow contours lie around the inline image-, inline image-, and inline image-positions of ring-D, suggesting that bulky groups on the inline image-position and small ones on the inline image-, inline image- and inline image-positions are favorable to the activity of the compound. Figure 7 clearly shows that the substituent at the inline image-position of ring-D stretches outside the entrance of the binding groove of c-Src kinase, suggesting that bulky substituents of moderate length can be tolerated by the enzyme. In contrast, substituents at the inline image- and inline image-positions of ring-D lie in proximity to residues Ser342, Lys343, and Gly344, suggesting steric hindrance with the receptor. That may be the reason why the activities of compounds 13 and 11, with bulky substituents on the inline image-position, are higher than that of compound 9, with bulky groups on the inline image-positions. Moreover, the longer the substituent on the inline image-position of ring-D, the greater the activity of the compound. Such a trend can also be found by comparing compounds 17 and 22 with compounds 16 and 5. In addition, the side chain of the substituent at the C-7 position of the studied compounds is exposed to the solvent area, and thus certain bulky groups on its side chains can be tolerated by the enzyme. This docking result is also in good agreement with the 3D-QSAR model interpretation.

The electrostatic contour map of CoMFA is displayed in Figure 10B. A large blue polyhedron lies around ring-A, so the more positive the charge on ring-A, the greater the activity of the compound, indicating that the substituents attached to ring-A should be electron-withdrawing groups. This may be attributed to electrostatic interactions between the electropositive part of ring-A and the electron-rich O atom of –OH of residue Glu339 and the S atom of residue Met341 in docking. Figure 6 also shows that the quinoline ring of these compounds is sandwiched between the N-terminal domain (Leu273, Gly274) and the C-terminal domain (Leu393, Ser345) of the kinase (13) and forms a number of hydrophobic contacts, in accordance with the above 3D-QSAR result. In addition, two small red contours around the first atom of substituent R1 suggest that adding substituents at suitable sites so as to increase the negative charge of the first atom of substituent R1 may enhance the activity. The docking study consistently shows that the nearest residue is Tyr340, which can make more favorable contacts with the O atom of substituent R1. For example, compounds 8, 11, 26 and 38, with the more electronegative O atom as the first atom of substituent R1, have higher activities than compounds 2, 22, 25 and 39, with an H atom as substituent R1. The O atom is electron-withdrawing, so it reduces the electron density of ring-A (i.e., increases its electropositive character) and makes it more susceptible to nucleophilic attack. Another small red contour near the N23 atom also suggests that compounds with highly electronegative atoms at this position exhibit good activity.

The docking study (see Figure 7) has shown that the N23 atom, being more electronegative, is located in the solvent-accessible area away from the binding pocket; it therefore readily binds an H+ ion in aqueous solvent to form a positively charged NH group, resulting in a favorable H-bonding interaction with the backbone carbonyl of Lys343. This interaction may be key to the c-Src kinase inhibitory activity. It explains why compounds 9–13, 15–17, 19, 20, and 21–24, which have N23 atoms in their C-7 substituents, have higher activities than compounds 2, 3, 5, and 7, with pyridyl and phenyl as C-7 substituents; compound 14 is the exception. Moreover, the substituent R2 is exposed to the solvent area, and thus an R2 containing an N atom should have good physicochemical properties, such as solubility and permeability, to increase the binding ability to plasma protein. This result is also in line with the work of Hennequin (39), who reported that introducing a basic nitrogen at the C-7 position of the quinazoline core of anilinoquinazolines could significantly increase the solubility.

Design of new compounds with higher inhibitory activity

Based on the above discussion, Q20 and VR2 play vital roles in determining the inhibitory activity of the studied compounds. Thus, to design new compounds with higher inhibitory activity against c-Src, we should give priority to increasing the negative charge of the first atom of R1 while retaining an alkyl chain of suitable length, and to selecting a suitably large substituent R2. We take compounds 15 and 21 as templates (see Figure 11) and start the structural modification from them, because they have the highest activities in this series of compounds. The structural modification (or molecular design) focuses on the two active aromatic rings, ring-A and ring-C, especially on phenyl ring-A. Moreover, because a substituent at the 2′-position (C15) of the phenyl ring-C may increase the activity of the compound (21), we also considered substituents at this position to improve the activity.

Figure 11.

 Schematic maps of polarity interference for (A) compound 21 and (B) designed compounds D1–D3.

We now consider modifying the R1 structure to increase the negative charge of the first atom of the R1 substituent according to the law of polarity alternation (38) and the idea of polarity interference (40) (see Figure 11), with an N atom replacing the O atom as the first atom of the R1 substituent. In Figure 11, the sketches of polarity interference are drawn first: a solid arrowhead pointing toward the more electronegative atom represents the direction of the primary bond polarity, and a dotted arrowhead represents the direction of the induced polarity. To keep the schematic map simple, we draw the dotted arrowheads only for the induced bonds on α-site atoms. Based on the idea of polarity interference, the number of induced polarity arrows (dotted arrows) for the three bonds around the N20 atom of the designed compounds (Figure 11B) increases to six, from two around the O20 atom of compound 21, and they all point in the same direction as the primary bond polarity arrows (solid arrows). This means that the three atoms around the N20 atom all push electrons toward it. We can therefore expect the negative charge of the N20 atom of the designed compounds to be greatly increased, exceeding that of the O20 atom of compound 21.

According to the QSAR models established above and the design considerations, three new compounds with higher inhibitory activity against c-Src have been theoretically designed (Figure 12). Their structures and the three parameter values were calculated by the same methods as above; these values, together with the pIC50 values predicted by the 2D-QSAR and CoMFA models, are listed in Table 5.

Figure 12.

 Structural schematic diagrams of the designed compounds D1–D3.

Table 5.   Structures and computational results for the three designed compounds
Compounds | R1 | R2 | R | Ar | Q20 (a.u.) | VR2 | QC15 (a.u.) | pIC50 (pred.)

Table 5 shows that the predicted inhibitory activities of the three designed compounds (D1–D3) are all much higher (pIC50 = 8.485–8.952) than that of any of the 39 compounds in the training and test sets, because their Q20 values are considerably more negative (from −0.6200 to −0.6262 a.u.) and their VR2 values remain rather large (728–730). These results further suggest that our models have strong predictive ability and can be used in molecular design or structural modification.
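To put the predicted pIC50 range into concentration terms, the values can be converted back to IC50. The snippet below is a minimal sketch, assuming the usual definition pIC50 = −log10(IC50 in mol/L); the text does not state the concentration unit explicitly, so this is an assumption, and the function name `pic50_to_ic50_nM` is illustrative.

```python
def pic50_to_ic50_nM(pic50):
    """Convert pIC50 to IC50 in nM, assuming pIC50 = -log10(IC50 in mol/L)."""
    # 1 mol/L = 1e9 nM, hence the exponent 9 - pIC50.
    return 10 ** (9 - pic50)

# Lower and upper ends of the predicted range quoted for D1–D3.
for p in (8.485, 8.952):
    print(f"pIC50 {p} -> IC50 about {pic50_to_ic50_nM(p):.2f} nM")
```

Under this assumption, the predicted range corresponds to IC50 values of roughly 3.3 nM down to 1.1 nM.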


A theoretical study on a novel series of ethynyl-3-quinolinecarbonitriles acting as Src inhibitors has been carried out using DFT, MM2, statistical and CoMFA methods together with docking analysis. A 2D-QSAR model comprising two electrostatic properties (Q20 and QC15) and one steric property (VR2) was established; it shows excellent statistical quality (R2 = 0.889) and predictive ability (q2 = 0.834). A 3D-QSAR model with satisfactory predictive capability (q2 = 0.812) and robustness (inline image = 0.997) was also established; it indicates that both the steric field (58.1%) and the electrostatic field (41.9%) have an important influence on the ligand–receptor interaction. In particular, the appropriate binding orientations and conformations of these inhibitors interacting with c-Src kinase were located by the docking study.

The results obtained from the established 2D- and 3D-QSAR models and the docking analysis consistently show the following: (1) The electrostatic and steric interactions both play important roles in determining the biological activity of the studied compounds. (2) Selecting a bulky but hydrophilic group for substituent R2 (especially on the inline image-position of ring-D) may increase the activity, because such a substituent can be suitably located at the solvent entrance and can interact with Lys343. (3) Increasing the negative charge of the first atom of substituent R1 may enhance the activity, because it favors more contacts with Tyr340 and Asp348. (4) Introducing a more strongly electronegative group at the C15 position of ring-C may favor the inhibitory activity, because it can increase the net charge of the C15 atom and readily leads to a lipophilic interaction with Ala403.

Three new compounds with higher inhibitory activity against c-Src have been theoretically designed on the basis of the established 2D/3D-QSAR models, the docking results and several design considerations: taking compounds 15 and 21, which have the highest activities, as templates for structural modification; focusing on the active aromatic ring-A and the 2′-position (C15) of ring-C; and replacing the O atom with an N atom as the first atom of the R1 substituent to increase its negative charge according to the law of polarity alternation. These results will help in developing novel potential c-Src inhibitors.


  • a  Chem3D (2005) CambridgeSoft Corp., 100 Cambridge Park, MA, USA.

  • b  Gaussian 03, Revision D.01, Gaussian, Inc., Wallingford, CT, USA 2005.

  • c  HyperChem Ver. 7.0, Hypercube Inc.

  • d  SPSS vs 9.0; SPSS Inc.: Chicago, IL, USA 1999.

  • e  SYBYL 6.9[CP] (2001) St Louis, MO, USA: Tripos Associates, Inc.

  • f  AutoDock 4.2, The Scripps Research Institute, La Jolla, CA, USA.

  • g  AutoGrid 4.0, The Scripps Research Institute, La Jolla, CA, USA.


We gratefully acknowledge the support of this research by the National Natural Science Foundation of China (No. 20903026), the Science and Technology Planning Project of Guangdong Province (No. 2007B030702007) and the Natural Science Foundation of Guangdong (No. S2011010002483, S2011010002964). We also sincerely thank the Department of Biochemistry, College of Life Sciences, Sun Yat-Sen University for providing the computing environment for SYBYL 6.9.