Keywords:

  • 2-arylbenzimidazole;
  • checkpoint kinase inhibitors;
  • superaugmented eccentric distance sum connectivity topochemical indices;
  • topological indices

Abstract

  1. Top of page
  2. Abstract
  3. Methodology
  4. Results and Discussion
  5. Conclusion
  6. References

Four highly discriminating fourth-generation topological indices (TIs), termed superaugmented eccentric distance sum connectivity indices, and their topochemical versions (topochemical indices-1, -2, -3, and -4) have been conceptualized in this study. The values of these indices for all possible structures with three, four, and five vertices containing one heteroatom were computed using an in-house computer program. The proposed superaugmented eccentric distance sum connectivity topochemical indices exhibited exceptionally high discriminating power, low degeneracy, and high sensitivity toward both the presence and the relative position of heteroatom(s) for all possible structures with five vertices containing at least one heteroatom. Intercorrelation analysis revealed the absence of correlation of the proposed indices with Zagreb indices and the molecular connectivity index. Subsequently, the proposed TIs were successfully utilized for the development of models for the prediction of checkpoint kinase inhibitory activity of 2-arylbenzimidazoles. A data set comprising 47 differently substituted analogs of 2-arylbenzimidazoles was selected for the study. The values of various TIs for each analog in the data set were computed using an in-house computer program. The resulting data were analyzed, and suitable models were developed through decision tree (DT), random forest (RF), and moving average analysis (MAA). The performance of the models was assessed by calculating the specificity, sensitivity, overall accuracy, and Matthews correlation coefficient. A decision tree was constructed for the checkpoint kinase inhibitory activity to determine the importance of topological indices. The decision tree identified two of the proposed TIs, the superaugmented eccentric distance sum connectivity topochemical indices-3 and -4, as the most important indices.
The decision tree learned the information from the input data with an accuracy of 96% and correctly predicted the cross-validated (10-fold) data with an accuracy of 77%. Random forest correctly predicted the checkpoint kinase inhibitory activity with an accuracy of 83%. Single index-based models were also developed for the prediction of checkpoint kinase inhibitory activity using MAA. The accuracy of prediction of the single index-based models derived through MAA varied from a minimum of 90% to a maximum of 95%. The exceptionally high discriminating power, low degeneracy, and high sensitivity of the proposed indices toward branching and the presence of heteroatoms can be of immense use in drug design, isomer discrimination, similarity/dissimilarity studies, quantitative structure activity/property relationships, lead optimization, and combinatorial library design.

The identification and optimization of lead compounds in a rapid and cost-effective way are the most critical steps in drug discovery. The computer-aided drug discovery approach offers an alternative to the real world of synthesis and screening (1,2). Computational techniques have advanced rapidly over the past few decades and have played a major role in the development of a number of drugs now on the market or going through clinical trials (3,4). A quantitative structure activity/property relationship (QSAR/QSPR) is a mathematical relationship linking chemical structure and pharmacological activity/property in a quantitative manner for a series of compounds (5). It also reduces the number of compounds to be synthesized and promptly detects the most favorable compounds. Fundamentally, QSAR aims to identify relationships between some aspects of molecular structure and properties such as toxicity, pharmacodynamics, and pharmacokinetics (6).

The 2D approach has a number of advantages compared with the higher-dimensional QSAR methodologies. First, owing to the variety of molecular descriptors available, optimized coordinates are not always required. In fact, connectivity information (in the form of an adjacency matrix) alone can be used to develop QSAR models. As a result, models using topological descriptors can be built rapidly for very large sets of molecules. Second, this approach avoids the alignment step and thus can be used in the absence of experimental information regarding the binding of a molecule to its target.

The 2D QSAR makes use of TIs, which are numerical values associated with the chemical constitution for correlation of chemical structures with various physical properties, chemical reactivity, or biological activity (7). These are derived from the topological representation of molecules and can be considered structure-explicit descriptors (8). The TIs are among the most useful descriptors known today, as they can be rapidly computed for large numbers of molecules and also offer a simple way of measuring molecular branching, shape, size, cyclicity, symmetry, chirality, complexity, and heterogeneity of atomic environments in the molecule (9–14). Over the past two decades, the use of TIs in QSAR models has enhanced the scope of drug design by producing reliable estimates of the therapeutic and toxic potential of chemicals (15).

The genetic integrity of a cell is constantly challenged by radiation, chemical agents, and replication errors (16). These agents mainly cause double-strand breaks (DSB) and single-strand breaks (SSB), leading to genomic instability that may result in tumor development if left unrepaired (17). DNA damage is also exploited to treat cancer. Many of the conventional anticancer treatments (ionizing radiation, hyperthermia, pyrimidine and purine antimetabolites, alkylating agents, DNA topoisomerase inhibitors, and platinum compounds) at least partly damage the DNA of cells. As these treatments are not specifically selective for cancer cells, patients suffer serious side effects when receiving them (18). Thus, DNA damage causes the disease, is used to treat the disease, and is responsible for the toxicity of therapies for the disease (19).

In DNA damage response (DDR), eukaryotic cells activate checkpoint pathways to arrest the cell cycle (20–22). The checkpoints comprise a subroutine integrated into the larger DDR pathway that regulates a multifaceted response. Moreover, several checkpoint genes are essential for cell and organism survival (23–27) implying that these pathways are not only surveyors of occasional damage but are firmly integrated components of cellular physiology (22).

The DNA damage checkpoints are known to comprise signal transduction cascades that link the detection of DNA damage to several other processes, namely inhibition of progression through the cell cycle from G1 to S, through S, and from G2 into M, activation of DNA repair, and initiation of apoptosis (28). DNA damage is recognized by damage sensor proteins such as Mre11-Rad50-Nbs1 (MRN complex) and the breast and ovarian cancer locus 1 (BRCA1)-associated genome surveillance complex (BASC). These proteins recruit and activate the upstream Ataxia-telangiectasia mutated (ATM) and ATM and Rad3-related (ATR) kinases (17,29). Checkpoint kinases Chk1 and Chk2 are key downstream mediators of DDR through activation of an increasing number of substrates such as p53, NBS1, BRCA1, MDM2, Cdc25A, Cdc25C, and E2F1 (30–32). The relevance of these kinases in the maintenance of genome integrity is clearly indicated by the severe human genetic disorders and the predisposition to cancer associated with defects in these proteins (20,33–35).

Radiation and chemotherapy often have serious side effects that limit their efficacy as cancer therapies. Modulation of the checkpoint responses to these types of treatment appears to be a potential strategy to sensitize tumor cells to DNA-damaging agents (17). Checkpoint kinase 2 acts as a mediator of DNA damage signaling and also as a barrier to tumorigenesis (36). There is evidence in favor of the therapeutic value of Chk2 inhibitors (37,38). Checkpoint kinase 2 inhibitors are reported to augment the effect of various cytotoxic drugs, e.g. doxorubicin (39), cisplatin (40), and paclitaxel (41).

The side effects of radiation therapy have been reported to be more serious. As these side effects are in part determined by p53-mediated apoptosis, temporary suppression of p53 has been suggested as a therapeutic strategy to prevent damage to normal tissues during treatment of p53-deficient tumors (42,43). The p53 response to DNA breaks induced by radiation and certain chemical agents is controlled by Chk2 (36). Studies showed that Chk2 deficiency confers radioresistance and that Chk2 plays a critical role in p53 function in response to ionizing radiation (IR) by regulating its transcriptional activity and its stability, indicating the utility of Chk2 inhibitors as radioprotectants for normal cells (44,45). Thus, Chk2 inhibitors may be useful drugs for reducing the side effects of cancer therapy and other types of stress associated with p53 activation (46,47).

Preclinical studies of agents that target checkpoint kinases have provided impressive evidence that this approach will provide tumor-specific potentiating agents and may have broad therapeutic utility. Apart from 2-arylbenzimidazole (48), NSC 109555 (49), VRX0466617 (50), isothiazole carboxamides (51), and PV 1019 (52), only a few selective Chk2 inhibitors have been reported. Various inhibitors have been published for Chk1 (staurosporine, Go6976, SB-218078, ICP-1, CEP-3891, and AZD7762) (53) and for both Chk1 and Chk2 (TAT-S216A, UCN-01, and debromohymenialdisine (54,55), CEP-6367, and sulforaphane (18,56,57)).

The past decade has witnessed the development of checkpoint kinase inhibitors for the treatment of cancer. Three checkpoint kinase inhibitors have already entered clinical trials since 2005 (58). The pharmaceutical industry strives to explore novel scaffolds for checkpoint kinase inhibition.

In this study, four topological descriptors termed superaugmented eccentric distance sum connectivity indices, together with their topochemical versions, have been conceptualized and successfully utilized along with existing TIs for the development of models for the prediction of checkpoint kinase (Chk2) inhibitory activity of 2-arylbenzimidazoles.

Methodology

Calculation of topological indices

The values of the proposed indices were calculated for all possible structures with three, four, and five vertices containing one heteroatom (Figures 1 and 2) using an in-house computer program.

Figure 1.  Index values of the proposed topochemical indices for all possible structures with three, four, and five vertices containing one heteroatom. *Cpd no., compound number.

Figure 2.  Calculation of values of superaugmented eccentric distance sum connectivity topochemical index-1, index-2, index-3, and index-4 for three isomers of an 11-membered molecule (decylamine).

Superaugmented eccentric distance sum connectivity indices

The superaugmented eccentric distance sum connectivity indices proposed in this study can be defined as the inverse of the summation of quotients of the product of adjacent vertex degrees and the product of the squared distance sum and eccentricity of the concerned vertex for all vertices in a hydrogen-suppressed molecular graph. They can be expressed as follows:

  • $$\text{index-}N = \left[\sum_{i=1}^{n} \frac{M_i}{E_i^{N}\, S_i^{2}}\right]^{-1} \qquad (1)$$

where Mi is the product of degrees of all the vertices (vj) adjacent to vertex i, which can be easily obtained by multiplying all the non-zero row elements in the augmentative adjacency matrix; Ei is the eccentricity of vertex i; Si is the distance sum of vertex i; n is the number of vertices in the graph; and N is equal to 1, 2, 3, 4 for superaugmented eccentric distance sum connectivity indices-1, -2, -3, -4, respectively.
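
The definition above can be sketched in a few lines of Python for the topostructural (unweighted) case. This is only an illustrative sketch, not the in-house program used in the study: the function names and the two example graphs are hypothetical, heteroatom weighting is not modelled, and it is an assumption here that N enters as the exponent of the eccentricity.

```python
from collections import deque
from math import prod


def bfs_distances(adj, source):
    """Return shortest path lengths (edge counts) from source to all vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def sed_connectivity_index(adj, n_exp):
    """Inverse of sum_i M_i / (E_i**n_exp * S_i**2) over a connected graph.

    Assumption: n_exp (1 to 4) is the exponent applied to the eccentricity.
    """
    total = 0.0
    for i in adj:
        dist = bfs_distances(adj, i)
        e_i = max(dist.values())                 # eccentricity of vertex i
        s_i = sum(dist.values())                 # distance sum of vertex i
        m_i = prod(len(adj[j]) for j in adj[i])  # product of neighbour degrees
        total += m_i / (e_i ** n_exp * s_i ** 2)
    return 1.0 / total


# Hydrogen-suppressed graphs as adjacency lists (hypothetical examples).
path3 = {0: [1], 1: [0, 2], 2: [1]}        # three-vertex chain
ring3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # three-membered ring

print(sed_connectivity_index(path3, 1))
print(sed_connectivity_index(ring3, 1))
```

For the three-vertex chain with N = 1 the sum is 1/9 + 1/4 + 1/9 = 17/36, so the index is 36/17; for the three-membered ring every vertex contributes 4/4 = 1, so the index is 1/3.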

Similarly, the topochemical version of superaugmented eccentric distance sum connectivity indices can be defined as the inverse of the summation of quotients of the product of adjacent vertex chemical degrees and the product of the squared chemical distance sum and chemical eccentricity of the concerned vertex for all vertices in a hydrogen-suppressed molecular graph.

It can be expressed as follows:

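  • $$\text{topochemical index-}N = \left[\sum_{i=1}^{n} \frac{M_{ic}}{E_{ic}^{N}\, S_{ic}^{2}}\right]^{-1} \qquad (2)$$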

where Mic is the product of chemical degrees of all the vertices (vj) adjacent to vertex i, which can be easily obtained by multiplying all the non-zero row elements in the augmentative chemical adjacency matrix; Eic is the chemical eccentricity of vertex i; Sic is the chemical distance sum of vertex i; n is the number of vertices in the graph; and N is equal to 1, 2, 3, 4 for superaugmented eccentric distance sum connectivity topochemical indices-1, -2, -3, -4, respectively.

Superaugmented eccentric distance sum connectivity topochemical indices can be easily calculated from the chemical distance matrix (Dc), the chemical adjacency matrix (AC), and the augmentative chemical adjacency matrix. The calculation of the four proposed topochemical indices for three isomers of an 11-membered molecule (decylamine) is exemplified in Figure 2.

The index values of the proposed topochemical descriptors, which reflect the presence and the relative position of heteroatom(s), have been compiled in Figure 1 for all three-, four-, and five-membered isomers containing one heteroatom. The discriminating power and degeneracy of the superaugmented eccentric distance sum connectivity topochemical indices were investigated using all possible structures with three, four, and five vertices containing one heteroatom, and the results are given in Table 1. The intercorrelation of the proposed superaugmented eccentric distance sum connectivity indices with Wiener’s index, Zagreb indices, the molecular connectivity index, and eccentric connectivity indices was also investigated (Table 2).

Table 1.   Comparison of the discriminating power and degeneracy of the superaugmented eccentric distance sum connectivity topochemical indices-1 to -4 using all possible structures with three, four, and five vertices containing one heteroatom

                       Index-1      Index-2      Index-3      Index-4
For three vertices
  Minimum value        0.363        0.395        0.428        0.461
  Maximum value        2.484        3.809        5.396        7.16
  Ratio                1:6.843      1:9.643      1:12.61      1:15.54
  Degeneracy(a)        0/3          0/3          0/3          0/3
For four vertices
  Minimum value        0.089        0.099        0.109        0.119
  Maximum value        6.585        14.858       32.737       70.8
  Ratio                1:73.989     1:150.080    1:300.34     1:594.96
  Degeneracy           0/11         0/11         0/11         0/11
For five vertices
  Minimum value        0.039        0.047        0.057        0.067
  Maximum value        11.814       30.236       74.188       176.075
  Ratio                1:302.923    1:643.319    1:1301.54    1:2627.99
  Degeneracy           0/47         0/47         0/47         0/47

  (a) Degeneracy: number of compounds having same values/total number of compounds with same number of vertices.

Table 2.   Intercorrelation matrix

            χA      ξc      Zagreb-1  Zagreb-2  Wc      Index-1  Index-2  Index-3  Index-4
χA          1       0.939   0.59      0.619     0.743   −0.01    0.061    0.106    0.132
ξc                  1       0.599     0.662     0.67    −0.07    −0.567   0.062    0.093
Zagreb-1                    1         0.979     0.016   −0.62    −0.567   −0.53    −0.496
Zagreb-2                              1         0.045   −0.57    −0.502   −0.46    −0.422
Wc                                              1       0.548    0.58     0.593    0.594
Index-1                                                 1        0.993    0.98     0.965
Index-2                                                          1        0.996    0.988
Index-3                                                                   1        0.998
Index-4                                                                            1

  Zagreb-1 and Zagreb-2, the two Zagreb indices (M1 and M2 in the text); Index-1 to Index-4, the proposed superaugmented eccentric distance sum connectivity indices.

Topological indices

A total of 26 descriptors of diverse nature, including the proposed indices (Table 3) (59–75), were used in this study. Of these, only 14 indices were shortlisted on the basis of their non-correlating nature and classification ability. The shortlisted indices used in the present study are defined below.

Table 3.   Topostructural and topochemical indices

Code   Index                                                                        References
A1     Molecular connectivity topochemical index                                    (59,60)
A2     Eccentric adjacency topochemical index                                       (61)
A3     Augmented eccentric connectivity topochemical index                          (62)
A4     Superadjacency topochemical index                                            (63)
A5     Eccentric connectivity topochemical index                                    (64)
A6     Connective eccentricity topochemical index                                   (65)
A7     First Zagreb topochemical index                                              (66)
A8     Second Zagreb topochemical index                                             (66)
A9     Wiener’s topochemical index                                                  (67)
A10    Superaugmented eccentric connectivity topochemical index-1                   (68)
A11    Superaugmented eccentric distance sum connectivity topochemical index-1
A12    Superaugmented eccentric distance sum connectivity topochemical index-2
A13    Superaugmented eccentric distance sum connectivity topochemical index-3
A14    Superaugmented eccentric distance sum connectivity topochemical index-4
A15    Molecular connectivity index                                                 (69)
A16    Eccentric adjacency index                                                    (70)
A17    Augmented eccentric connectivity index                                       (71)
A18    Superadjacency index                                                         (63)
A19    Eccentric connectivity index                                                 (72)
A20    Connective eccentricity index                                                (73)
A21    Zagreb index, M1                                                             (74,75)
A22    Zagreb index, M2                                                             (74,75)
A23    Superaugmented eccentric distance sum connectivity index-1
A24    Superaugmented eccentric distance sum connectivity index-2
A25    Superaugmented eccentric distance sum connectivity index-3
A26    Superaugmented eccentric distance sum connectivity index-4

Wiener’s topochemical index (Wc)

Wiener’s topochemical index (67) is defined as the sum of the chemical distances between all pairs of vertices in a hydrogen-suppressed molecular graph. It is a refined form of the oldest and most widely used distance-based topological index, Wiener’s index (76), and this modified index considers the presence and the relative position of heteroatom(s) in a molecular structure. It can be expressed as

  • $$W_c = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} P_{i_c j_c} \qquad (3)$$

where Picjc is the chemical length of the path that contains the least number of edges between vertices i and j in the graph G, and n is the number of vertices in the hydrogen-depleted graph (67).
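
As a minimal illustration of the distance-based idea, the sketch below computes the plain (topostructural) Wiener index, in which every edge has unit length. It does not implement the chemical distances of the topochemical version, whose heteroatom weighting is not specified here; the function names and example graphs are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return shortest path lengths (edge counts) from source to all vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Half the sum of all entries of the distance matrix."""
    total = sum(sum(bfs_distances(adj, i).values()) for i in adj)
    return total // 2  # each unordered pair was counted twice


# Carbon skeletons of the two C4 isomers (hypothetical examples).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(wiener_index(n_butane), wiener_index(isobutane))
```

The chain gives 1 + 2 + 3 + 1 + 2 + 1 = 10 and the branched skeleton gives 3 × 1 + 3 × 2 = 9, showing how the index responds to branching.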

Zagreb indices (M1 and M2)

This pair of indices (74,75) denoted by M1 and M2 was introduced in 1972 and is defined as per the Equations 4 and 5.

  • $$M_1 = \sum_{i=1}^{n} d(i)^{2} \qquad (4)$$
  • $$M_2 = \sum_{\text{edges } \{i,j\}} d(i)\,d(j) \qquad (5)$$

where d(i) is the degree of vertex i, which can be defined as number of edges incident on a vertex i and d(i)d(j) is the weight of edge {i,j}.
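
A short Python sketch of the two Zagreb indices, under the standard definitions (sum of squared vertex degrees, and sum of edge weights d(i)d(j)). The function name and the example graphs are hypothetical, and the graphs are unweighted, so this corresponds to M1 and M2 rather than to their topochemical counterparts.

```python
def zagreb_indices(adj):
    """Return (M1, M2) for a graph given as an adjacency list."""
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    m1 = sum(d * d for d in degree.values())
    # Visit each undirected edge once by requiring i < j.
    m2 = sum(degree[i] * degree[j] for i in adj for j in adj[i] if i < j)
    return m1, m2


# Carbon skeletons of the two C4 isomers (hypothetical examples).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(zagreb_indices(n_butane))
print(zagreb_indices(isobutane))
```

For the chain the degrees are 1, 2, 2, 1, giving M1 = 10 and M2 = 2 + 4 + 2 = 8; for the branched skeleton the degrees are 3, 1, 1, 1, giving M1 = 12 and M2 = 9.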

Similarly, the Zagreb topochemical indices (66) are defined as per Equations 6 and 7.

  • $$M_1^{c} = \sum_{i=1}^{n} d_c(i)^{2} \qquad (6)$$

where dc(i) is the chemical degree of vertex i and n is the number of vertices.

  • $$M_2^{c} = \sum_{\text{edges } \{i,j\}} d_c(i)\,d_c(j) \qquad (7)$$

where dc(i)dc(j) is the chemical weight of edge {i, j} in the hydrogen-suppressed molecular graph and n is the number of edges.

Connective eccentricity index

Connective eccentricity index (73) can be defined as summation of the ratios of the degree of a vertex (Vi) and its eccentricity (Ei) for all vertices in the hydrogen suppressed molecular structure. It can be expressed by the following equation:

  • $$C^{\xi} = \sum_{i=1}^{n} \frac{V_i}{E_i} \qquad (8)$$

The eccentricity Ei of a vertex i in a graph G is the path length from vertex i to the vertex j that is farthest from i, that is, Ei = max d(i, j) over all vertices j in G.
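
The connective eccentricity index can be sketched as follows for an unweighted graph: each vertex contributes its degree divided by its eccentricity. The function names and example graphs are hypothetical, and the topochemical version (A6 in Table 3) would additionally require heteroatom weights that are not modelled here.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return shortest path lengths (edge counts) from source to all vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connective_eccentricity_index(adj):
    """Sum over vertices of degree / eccentricity."""
    total = 0.0
    for i in adj:
        eccentricity = max(bfs_distances(adj, i).values())
        total += len(adj[i]) / eccentricity
    return total


# Carbon skeletons of the two C4 isomers (hypothetical examples).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(connective_eccentricity_index(n_butane))
print(connective_eccentricity_index(isobutane))
```

The chain gives 1/3 + 2/2 + 2/2 + 1/3 = 8/3, whereas the branched skeleton gives 3/1 + 3 × (1/2) = 4.5.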

Data set

A data set (48) comprising 47 analogs of 2-arylbenzimidazole was selected for the present investigation. The basic structure for these analogs is depicted in Figure 3, and various substituents are listed in Figure 4. The values of the 26 descriptors (Table 3) used in this study were calculated for all the analogs in the data set using an in-house computer program. Compounds having reported IC50 values of ≤25 nM were considered to be active, whereas those possessing IC50 values >25 nM were treated as inactive for the purpose of the present study.

Figure 3.  Basic structures of 2-arylbenzimidazole analogs (48).

Figure 4.  Relationship of superaugmented eccentric distance sum connectivity topochemical indices, Zagreb topochemical Index, Wiener’s topochemical Index with Checkpoint Kinase (Chk2) inhibitors. (+) active compound, (−) inactive compound and (±) compound in transitional range.

Decision tree

A decision tree provides a useful solution for many classification problems in which large data sets are used and the information contained is complex. A decision tree (generally defined) is a tree whose internal nodes are tests (on input patterns) and whose leaf nodes are categories (of patterns). A decision tree assigns a class number (or output) to an input pattern by filtering the pattern down through the tests in the tree. Each test has mutually exclusive and exhaustive outcomes.

Decision trees are constructed beginning with the root of the tree and proceeding down to its leaves. Decision trees are a rapid and effective method of classifying data set entries and can provide good decision support capabilities (77,78). In this study, the decision tree was grown to identify the importance of TIs. In a decision tree, the molecules at each parent node are classified, based on the index value, into two child nodes. The prediction for a molecule reaching a given terminal node is obtained by majority vote of the training-set molecules reaching the same terminal node. In this study, the R program (version 2.1.0; University of Auckland, Auckland, New Zealand) along with the RPART library was used to grow the decision tree. The active compounds were labeled ‘A’ (n = 18) and the inactive compounds were labeled ‘B’ (n = 29). Each analog was assigned a biological activity, which was then compared with the reported Chk2 inhibitory activity.

Random forest

A random forest (RF) was grown for checkpoint kinase (Chk2) inhibitory activity. A random forest grows numerous classification trees. To classify a new object from an input vector, the input vector is put down each of the trees in the forest. Each tree gives a classification; that is, the tree ‘votes’ for that class. The forest chooses the classification having the most votes (over all the trees in the forest) (79). In this study, the RFs were grown with the R program (version 2.1.0) using the RF library.

Moving average analysis

Moving average analysis constitutes the basis for the development of single topological index-based models (70,80). For the selection and evaluation of range-specific features, exclusive activity ranges were discovered from the frequency distribution of response level, and the active range was subsequently identified by analyzing the resulting data by maximization of the moving average with respect to active compounds (<35% = inactive, 35–65% = transitional, >65% = active). The checkpoint kinase (Chk2) inhibitory activity assigned to each compound was compared with the reported biological activity. The average IC50 (nM) values for each range and activity were also calculated.
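
The percentage thresholds quoted above can be expressed as a small helper. This is a sketch of the labelling rule only, not of the full moving average procedure used to find the range boundaries; the function name and the example counts are hypothetical.

```python
def classify_range(n_active, n_total):
    """Label an index range from the percentage of active compounds in it."""
    percent_active = 100.0 * n_active / n_total
    if percent_active < 35:
        return "inactive"
    if percent_active <= 65:
        return "transitional"
    return "active"


# Hypothetical counts: (active compounds, total compounds) in a range.
print(classify_range(3, 24))   # 12.5% active
print(classify_range(4, 7))    # about 57% active
print(classify_range(12, 13))  # about 92% active
```

Every compound whose index value falls inside a range then inherits that range's label, which is what is compared with the reported activity.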

Data analysis

The sensitivity and specificity values were calculated; these represent the classification accuracies for the active and inactive compounds, respectively. The randomness of the model was also assessed by calculating the Matthews correlation coefficient (MCC). MCC values range between −1 and +1 and indicate the potential of the model. The MCC takes both the sensitivity and the specificity into account, and it is generally used as a balanced measure when dealing with imbalanced data (81).
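
These measures follow directly from the four cells of a confusion matrix. The sketch below uses the standard formulas; the function name is hypothetical, and the counts are the decision tree training-set and cross-validated-set counts reported in Table 4 (active compounds taken as the positive class).

```python
from math import sqrt


def classification_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, accuracy, and MCC from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, accuracy, mcc


# Training set: 17 of 18 actives and 28 of 29 inactives classified correctly.
train = classification_metrics(tp=17, fn=1, fp=1, tn=28)
print([round(value, 3) for value in train])

# Cross-validated set: 12 of 18 actives and 24 of 29 inactives correct.
cross_validated = classification_metrics(tp=12, fn=6, fp=5, tn=24)
print(round(100 * cross_validated[2], 1))  # overall accuracy in percent
```

For the training set this gives a sensitivity of 17/18, a specificity of 28/29, and an MCC of 475/522 (about 0.91); for the cross-validated set the overall accuracy is 36/47 (about 76.6%).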

The results are summarized in Tables 4 and 5 and Figures 5 and 6. The validation of the decision tree (DT)-based model and the self-consistency test were performed by the 10-fold cross-validation (CV) method, in which the data set was randomly split into 10 folds. The model was developed using nine of the folds, and the prediction was made on the remaining fold. The goodness of the DT-based model was also assessed by calculating the specificity and sensitivity. The 10-fold cross-validation results are presented in Table 4.
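
The splitting step of 10-fold cross-validation can be sketched in plain Python as below. The study itself used R, so this is only an illustration of the idea; the function name, the seed, and the use of compound positions 0 to 46 as identifiers are assumptions.

```python
import random


def k_fold_splits(n_items, n_folds=10, seed=0):
    """Yield (train, test) index lists; each item is tested exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


splits = list(k_fold_splits(47))
for train, test in splits:
    print(len(train), len(test))
```

With 47 compounds the folds cannot be equal: seven folds hold five compounds and three hold four, so each model is built on 42 or 43 compounds and tested on the rest.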

Table 4.   Confusion matrix for checkpoint kinase (Chk2) inhibitory activity and recognition rate of models based on decision tree and random forest (RF)

                                                  Number of compounds predicted
Model          Description           Ranges       Active     Inactive     Specificity (%)   Sensitivity (%)   Matthews correlation coefficient
Decision tree  Training set          Active       17         1            96.5              94.4              0.9
                                     Inactive     1          28
               Cross-validated set   Active       12         6            82.7              66.6              0.03
                                     Inactive     5          24
RF                                   Active       16         2            82.7              88.8              0.098
                                     Inactive     5          24

Table 5.   Proposed model for the prediction of checkpoint kinase inhibitors

Index / nature of range     Index value              Total compounds   Number of compounds   Overall accuracy    Average IC50 (nM)
                                                     in the range      predicted correctly   of prediction (%)
Proposed topochemical index (first model)
  Lower inactive            <140599.3                24                21                    90                  1239.44 (1414.2)
  Active                    140599.3–2076090.4       13                12                                        66.49 (10.95)
  Transitional              >2076090.4–<267916.7     7                 NA                                        123.38
  Upper inactive            ≥267916.7                3                 3                                         3620 (3620)
Proposed topochemical index (second model)
  Lower inactive            <1298135                 17                17                    94.59               1672.7 (1672.7)
  Transitional              1298135–<1859651         10                NA                                        205.46
  Active                    1859651–2357104          11                11                                        10.4 (10.4)
  Upper inactive            >2357104                 9                 7                                         1301.41 (1671.42)
Zagreb topochemical index
  Inactive                  <157.64                  19                18                    90.32               1508 (1590.444)
  Lower transitional        157.64–<171.64           9                 NA                                        132.33
  Lower active              171.64–184.285           6                 5                                         120.417 (8.5)
  Upper transitional        >184.285–<202.267        7                 NA                                        123.38
  Upper active              202.267–245.43           6                 5                                         14.4167 (9.1)
Wc
  Lower inactive            <2014.09                 16                16                    >99                 1152.25 (1152.25)
  Lower transitional        2014.09–<2223.32         9                 NA                                        146.7
  Active                    2223.32–2431.06          7                 7                                         10.843 (10.843)
  Upper transitional        >2431.06                 15                NA                                        165.7

  1. NA, not applicable.
  2. Values in brackets are based on correctly predicted analogs in the particular range.

Figure 5.  A decision tree for distinguishing active analogs (A) from inactive analogs (B). A13, superaugmented eccentric distance sum connectivity topochemical index-3; A14, superaugmented eccentric distance sum connectivity topochemical index-4; A6, connective eccentricity topochemical index.

Figure 6.  Average IC50 (nM) value of correctly predicted analogs of 2-arylbenzimidazole in various ranges of topological models.

Results and Discussion

The successful application of many topological descriptors is somewhat limited owing to low discriminating power and high degeneracy. There is always a strong need for the development of descriptors and approaches that could provide explicit information on the molecular aspects responsible for drug action (1). Moreover, pharmacogenomics (82), combinatorial chemistry (83,84), and high-throughput screening (85) make it possible to obtain and evaluate thousands of compounds in a short time. These technologies have generated new challenges for computational scientists, as they demand novel approaches to accelerated computer-aided lead discovery and optimization (86).

As the structure of a compound depends on the connectivity of its constituent atoms, TIs based on connectivity can reveal the role of structural and substructural information of molecules in estimating biological activity and evaluating toxicity. Topological indices developed for predicting physicochemical properties and biological activities of chemical substances can be used for drug design (87,88). The applications of TIs in drug design include lead discovery and lead optimization, virtual screening, structure activity/property studies, structure pharmacokinetics studies, and structure toxicity relationships. Recently, TIs have also been used in similarity/dissimilarity studies, in combinatorial chemistry, in studying the chirality of molecules, in isomer discrimination, and in assessing molecular complexity (1,3).

As shown in Figure 2, the value of topochemical index-1 changes by a factor of 11 (from 238.801 to 20.804), the value of topochemical index-2 changes by a factor of 30 (1554.158–52.646), the value of topochemical index-3 changes by a factor of about 77 (9810.431–127.118), and the value of topochemical index-4 changes by a factor of 203 (60235.7–296.55) with a minor change in the branching of an 11-membered molecule containing one heteroatom. These descriptors have high discriminating power, which is defined as the ratio of the highest to the lowest value for all possible structures with the same number of vertices. The discriminating power of topochemical indices-1, -2, -3, and -4 is 302.9, 643.31, 1301.54, and 2627.99, respectively, for all possible structures containing only five vertices (Table 1).
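
The discriminating power quoted here is simple arithmetic on the extreme values in Table 1, as the following sketch shows. The function name is hypothetical; the minimum and maximum values are those tabulated for five-vertex structures.

```python
def discriminating_power(min_value, max_value):
    """Ratio of the highest to the lowest index value."""
    return max_value / min_value


# (minimum, maximum) for five-vertex structures, taken from Table 1.
five_vertex_extremes = {
    "index-1": (0.039, 11.814),
    "index-2": (0.047, 30.236),
    "index-3": (0.057, 74.188),
    "index-4": (0.067, 176.075),
}

for name, (low, high) in five_vertex_extremes.items():
    print(name, round(discriminating_power(low, high), 2))
```

The ratios reproduce the tabulated values (about 302.92, 643.32, 1301.54, and 2627.99), all well above the threshold of 100 used in this study for fourth-generation descriptors.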

The high discriminating power of the proposed descriptors renders them extremely sensitive toward any change in molecular structure. Indices having discriminating power ≥100 for structures containing only five vertices are treated as ‘fourth-generation’ topological descriptors (68,89). None of the four proposed topochemical indices exhibited any degeneracy for all possible structures with three, four, and five vertices.

The extremely low degeneracy of the proposed indices ensures enhanced sensitivity toward minor changes in branching, connectivity, and molecular structure. The intercorrelation between the proposed superaugmented eccentric distance sum connectivity topochemical indices and other well-known TIs was also investigated. Pairs of TIs with r ≥ 0.97 are considered highly intercorrelated, those with 0.90 ≤ r ≤ 0.97 appreciably correlated, those with 0.50 ≤ r ≤ 0.89 weakly correlated, and finally pairs of TIs with r < 0.50 are not intercorrelated (90). As indicated in Table 2, the four proposed indices are not intercorrelated with the well-known χA, ξc, M1, and M2. However, these indices were found to be weakly intercorrelated with Wc and highly intercorrelated with each other, as these are based on similar principles/matrices. The pairs of indices χA and ξc, and M1 and M2, are highly intercorrelated, whereas χA and Wc, ξc and Wc, ξc and M1, and ξc and M2 are found to be weakly intercorrelated, while M1 and M2 are found not to be intercorrelated with Wc.
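
The correlation bands quoted above can be written as a small lookup. This is a sketch with two assumptions that should be read as such: the signed value of r is used exactly as stated in the text, and values between 0.89 and 0.90, which the stated bands leave unassigned, are placed in the weak band. The function name is hypothetical.

```python
def intercorrelation_class(r):
    """Map a correlation coefficient to the bands quoted in the text."""
    if r >= 0.97:
        return "highly intercorrelated"
    if r >= 0.90:
        return "appreciably correlated"
    if r >= 0.50:  # assumption: also covers the unassigned 0.89-0.90 gap
        return "weakly correlated"
    return "not intercorrelated"


# Example values taken from Table 2.
print(intercorrelation_class(0.998))  # proposed index-3 versus index-4
print(intercorrelation_class(0.594))  # Wc versus proposed index-4
print(intercorrelation_class(0.132))  # χA versus proposed index-4
```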

In this study, DT-, random forest (RF)-, and moving average analysis (MAA)-based models were developed for the prediction of checkpoint kinase (Chk2) inhibitory activity of 2-arylbenzimidazoles. The decision tree was built by utilizing 26 TIs of diverse nature. The index at the root node is the most important, and the importance of an index decreases as the length of the tree increases. The classification of 2-arylbenzimidazole analogs as active or inactive using a single tree, based on A13, A14, and A6, is illustrated in Figure 5 (each descriptor is denoted by an alphanumerical code that refers to Table 3). The decision tree identified A13 (superaugmented eccentric distance sum connectivity topochemical index-3) as the most important index. The decision tree classified the 2-arylbenzimidazole analogs in the training set with an accuracy of 96% and in 10-fold cross-validation with an accuracy of 76.6%. The specificity and sensitivity of the DT-based model in the training set were of the order of 96.5% and 94.4%, respectively (Table 4). The specificity and sensitivity of the DT-based model in the cross-validated set with respect to inactive analogs were of the order of 82.7% and 66.6%. The values of MCC for the DT-based model in the training set and cross-validated set are 0.9 and 0.03, respectively, suggesting the randomness and robustness of the model. The values of specificity, sensitivity, and MCC are shown in Table 4.

The RFs were grown with the 26 topological descriptors listed in Table 3. The importance of each node was determined by the mean decrease in accuracy. The RF classified 2-arylbenzimidazole analogs as either active or inactive with an accuracy of 83%. The specificity and sensitivity were of the order of 82.7% and 88.8%, respectively, and the value of MCC was found to be 0.098 (Table 4).

Using a single index at a time, four independent MAA-based models were developed using two of the proposed topochemical indices, a Zagreb topochemical index, and Wc. The proposed models are shown in Table 5. The methodology used in this study aims at the development of suitable models for providing lead molecules through exploitation of the active ranges in the proposed models. These models are unique and differ widely from conventional QSAR models. Both systems of modeling have their own advantages and limitations. In the present case, the modeling system adopted has the distinct advantage of identifying a narrow active range, which may be erroneously skipped during routine regression analysis in conventional QSAR modeling (68). As the ultimate goal of modeling is to provide lead structures, these active ranges can play a vital role in lead identification.

Retrofit analysis of the data (Figure 4 and Table 5) reveals that the MAA-based models derived from inline image, inline image, inline image, and Wc correctly predicted the checkpoint kinase (Chk2) inhibitory activity of the analogs with accuracies of 90%, 94.5%, 90.32%, and >99%, respectively. Transitional ranges were observed in all four models, indicating a gradual change in checkpoint kinase inhibitory activity. The active ranges of the models based on inline image and Wc predicted the Chk2 inhibitory activity of the analogs with an accuracy of >99%. As observed from Table 5 and Figure 6, the average IC50 of the correctly predicted analogs in the active ranges of all four models varied only from 8.5 to ∼11 nM, indicating exceptionally high potency. High accuracy of prediction combined with high potency renders the active ranges of the proposed models extremely beneficial for providing lead structures for the development of potent checkpoint kinase inhibitors.
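A retrofit check of a range-based model can be sketched as follows. This is a minimal illustration under stated assumptions: the range boundaries, index values, and observed activities are invented for the example and do not come from Table 5. It is also assumed here that compounds falling in a transitional range are left out of the overall accuracy, since no definite activity is assigned to them.

```python
def assign_range(value, ranges):
    """Return the label of the first range [lower, upper) that contains value."""
    for lower, upper, label in ranges:
        if lower <= value < upper:
            return label
    return None


def retrofit(compounds, ranges):
    """Count total and correctly predicted compounds per range label.

    Overall accuracy is computed over active and inactive ranges only
    (ASSUMPTION: transitional compounds are excluded from the ratio).
    """
    counts = {}
    for index_value, observed in compounds:
        label = assign_range(index_value, ranges)
        total, correct = counts.get(label, (0, 0))
        is_correct = label in ("active", "inactive") and label == observed
        counts[label] = (total + 1, correct + int(is_correct))
    classified = [tc for label, tc in counts.items()
                  if label in ("active", "inactive")]
    total = sum(t for t, _ in classified)
    correct = sum(c for _, c in classified)
    overall = correct / total if total else 0.0
    return counts, overall


# HYPOTHETICAL ranges of a single index: (lower, upper, assigned activity).
ranges = [
    (0.0, 10.0, "inactive"),
    (10.0, 15.0, "transitional"),
    (15.0, 30.0, "active"),
    (30.0, float("inf"), "inactive"),
]

# HYPOTHETICAL compounds: (index value, experimentally observed activity).
compounds = [
    (2.0, "inactive"), (8.0, "inactive"), (9.0, "active"),
    (12.0, "active"), (13.0, "inactive"),
    (16.0, "active"), (20.0, "active"), (25.0, "active"),
    (35.0, "inactive"), (40.0, "inactive"),
]

counts, overall = retrofit(compounds, ranges)
print(counts)
print(f"overall accuracy: {overall:.3f}")
```

Reporting counts per range, rather than a single pooled figure, makes it possible to see when a narrow active range is predicted more reliably than the model as a whole.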

Conclusion

The superaugmented eccentric distance sum connectivity topochemical indices, the novel molecular descriptors proposed in this study, exhibited exceptionally high discriminating power, low degeneracy, and high sensitivity toward both the presence and the relative position of heteroatom(s). Moreover, these indices were found not to correlate with important topological descriptors. These qualities ensure their utility in drug design, quantitative structure-activity/property relationships, combinatorial library design, isomer discrimination, and similarity/dissimilarity studies.

Subsequently, the proposed TIs, along with other TIs, were successfully employed to develop models for the Chk2 inhibitory activity of 2-arylbenzimidazoles through decision tree, RF, and MAA. The decision tree revealed that the proposed superaugmented eccentric distance sum connectivity topochemical index-3 (inline image) and superaugmented eccentric distance sum connectivity topochemical index-4 (inline image) are the most important indices. The exceptionally high degree of predictability of the resulting models offers vast potential for providing lead structures for the development of specific Chk2 inhibitors, which would help improve the therapeutic window of radiation therapy and chemotherapy by reducing their side effects on normal cells.

References

  • 1
    Estrada E., Uriarte E. (2001) Recent advances on the role of topological indices in drug discovery research. Curr Med Chem;8:1573–1588.
  • 2
    Hann M., Green R. (1999) Chemoinformatics – a new name for an old problem? Curr Opin Chem Biol;3:379–383.
  • 3
    Estrada E., Patlewicz G., Uriarte E. (2003) From molecular graphs to drugs. A review on the use of topological indices in drug design and discovery. Indian J Chem;42A:1315–1329.
  • 4
    Venkatesh S., Lipper R.A. (2000) Role of the development scientist in compound lead selection and optimization. J Pharm Sci;89:145–154.
  • 5
    Hansch C. (1969) A quantitative approach to biochemical structure-activity relationships. Acc Chem Res;2:232–239.
  • 6
    Greener M. (2005) QSAR: prediction beyond the fourth dimension. Drug Disc Dev;8:44–47.
  • 7
    Waterbeemd V.D., Carter R.E., Grassy G., Kubinyi H., Martin Y.C., Tute M.S., Willett P. (1997) Glossary of terms used in computational drug design. Pure Appl Chem;69:1137–1152.
  • 8
    Randic M. (1997) On characterization of chemical structure. J Chem Inf Comput Sci;37:672–687.
  • 9
    Katritzky A.R., Gordeeva E.V. (1993) Traditional topological indices vs electronic, geometrical, and combined molecular descriptors in QSAR/QSPR research. J Chem Inf Comput Sci;33:835–857.
  • 10
    Devillers J., Balaban A.T. (1999) Topological Indices and Related Descriptors in QSAR and QSPR. Singapore: Gordon and Breach Science Publishers.
  • 11
    Ponce Y.M. (2004) Total and local (atom and atom type) molecular quadratic indices: significance interpretation, comparison to other molecular descriptors, and QSPR/QSAR applications. Bioorg Med Chem;12:6351–6369.
  • 12
    Basak S.C., Grunwald G.D. (1994) Molecular similarity and risk assessment: analog selection and property estimation using graph invariants. SAR QSAR Environ Res;2:289–307.
  • 13
    Basak S.C., Bertelsen S., Grunwald G.D. (1994) Application of graph theoretical parameters in quantifying molecular similarity and structure-activity relationships. J Chem Inf Comput Sci;34:270–276.
  • 14
    Hu Q.N., Liang Y.Z., Fang K.T. (2003) The matrix expression, topological index and atomic attribute of molecular topological structure. J Data Sci;1:361–389.
  • 15
    Bagchi M.C., Maiti B.C., Bose S. (2004) QSAR of anti tuberculosis drugs of INH type using graphical invariants. J Mol Str (Theochem);679:179–186.
  • 16
    Massague J. (2004) G1 cell-cycle control and cancer. Nature;432:298–306.
  • 17
    Perona R., Amor V.M., Pinilla R.M., Injesta C.B. (2008) Role of CHK2 in cancer development. Clin Transl Oncol;10:538–542.
  • 18
    Kawabe T. (2004) G2 checkpoint abrogators as anticancer drugs. Mol Cancer Ther;3:513–519.
  • 19
    Kastan M.B., Bartek J. (2004) Cell-cycle checkpoints and cancer. Nature;18:316–323.
  • 20
    Hoeijmakers J.H. (2001) Genome maintenance mechanisms for preventing cancer. Nature;411:366–374.
  • 21
    Bartek J., Lukas J. (2001) Mammalian G1- and S-phase checkpoints in response to DNA damage. Curr Opin Cell Biol;13:738–747.
  • 22
    Zhou B.-B.S., Elledge S.J. (2000) The DNA damage response: putting check points in perspective. Nature;408:433–439.
  • 23
    Elledge S.J. (1996) Cell cycle checkpoints: preventing an identity crisis. Science;274:1664–1672.
  • 24
    Brown E.J., Baltimore D. (2000) ATR disruption leads to chromosomal fragmentation and early embryonic lethality. Genes Dev;14:397–402.
  • 25
    De Klein A., Muijtjens M., Os R.V., Vehoeven Y., Smit B., Carr A.M., Lehmann A.R., Hoeijmakers J.H.J. (2000) Targeted disruption of the cell-cycle checkpoint gene ATR leads to early embryonic lethality in mice. Curr Biol;10:479–482.
  • 26
    Liu Q., Guntuku S., Cui X.S., Matsuoka S., Cortez D., Tamai K., Luo G., Rivera S.C., Demayo F., Bradley A., Donehower L.A., Elledge S.J. (2000) Chk1 is an essential kinase that is regulated by ATR and required for the G(2)/M DNA damage checkpoint. Genes Dev;14:1448–1459.
  • 27
    Takai H., Tominaga K., Motoyama N., Minamishima Y.A., Nagahama H., Tsukiyama T., Ikeda K., Nakayama K., Nakanishi M., Nakayama K.I. (2000) Aberrant cell cycle checkpoint function and early embryonic death in Chk1(−/−) mice. Genes Dev;14:1439–1447.
  • 28
    Zhou B.B., Anderson H.J., Roberge M. (2003) Models of anti-cancer therapy targeting DNA checkpoint kinases in cancer therapy. Cancer Biol Ther;2:S16–S22.
  • 29
    Skladanowski A., Bozko P., Sabisz M. (2009) DNA structure and integrity checkpoints during the cell cycle and their role in drug targeting and sensitivity of tumor cells to anticancer treatment. Chem Rev;109:2951–2973.
  • 30
    Shiloh Y. (2003) ATM and related protein kinases: safeguarding genome integrity. Nat Rev Cancer;3:155–168.
  • 31
    Kastan M.B., Lim D. (2000) The many substrates and functions of ATM. Nat Rev Mol Cell Biol;1:179–186.
  • 32
    Bartek J., Lukas J. (2003) Chk1 and Chk2 kinases in checkpoint control and cancer. Cancer Cell;3:421–429.
  • 33
    Stracker T.H., Usuia T., John Petrini H.J. (2009) Taking the time to make important decisions: the checkpoint effector kinases Chk1 and Chk2 and the DNA damage response. DNA Repair;8:1047–1054.
  • 34
    Gent V., Hoeijmakers J.H.J., Kannar R. (2001) Chromosomal stability and the DNA double-stranded break connection. Nat Rev Cancer;2:196–206.
  • 35
    Khanna K.K., Jackson S.P. (2001) DNA double-strand breaks: signaling, repair and the cancer connection. Nat Genet;27:247–254.
  • 36
    Bartkova J., Horejsi Z., Koed K., Kramer A., Tort F., Zieger K., Guldberg P., Sehested M., Nesland J.M., Lukas C., Orntoft T., Lukas J., Bartek J. (2005) DNA damage response as a candidate anti-cancer barrier in early human tumorigenesis. Nature;434:864–870.
  • 37
    Pommier Y., Sordet O., Rao V.A., Zhang H., Kohn K.W. (2005) Targeting Chk2 kinase: molecular interaction maps and therapeutic rationale. Curr Pharm Des;22:2855–2872.
  • 38
    Antoni L., Sodha N., Collins I., Garrett M.D. (2007) CHK2 kinase: cancer susceptibility and cancer therapy two sides of the same coin. Nat Rev Cancer;7:925–936.
  • 39
    Castedo M., Perfettini J.-L., Roumier T., Andreau K., Yakushijin K., Horne D., Medema R., Kroemer G. (2004) The cell cycle checkpoint kinase Chk2 is a negative regulator of mitotic catastrophe. Oncogene;23:4353–4361.
  • 40
    Vakifahmetoglu H., Olsson M., Tamm C., Heidari N., Orrenius S., Zhivotovsky B. (2008) DNA damage induces two distinct modes of cell death in ovarian carcinomas. Cell Death Differ;15:555–566.
  • 41
    Chabalier-Taste C., Racca C., Dozier C., Larminat F. (2008) BRCA1 is regulated by Chk2 in response to spindle damage. Biochim Biophys Acta;1783:2223–2233.
  • 42
    Komarov P.G., Komarova E.A., Kondratov R.V., Christov-Tselkov K., Coon J.S., Chernov M.V., Gudkov A.V. (1999) A chemical inhibitor of p53 that protects mice from the side effects of cancer therapy. Science;285:1733–1737.
  • 43
    Evan G., Vousden K.H. (2001) Proliferation, cell cycle and apoptosis in cancer. Nature;411:342–348.
  • 44
    Hirao A., Kong Y.Y., Mastsuoka S., Wakeham A., Ruland J., Yoshida H., Liu D., Elledge S.J., Mak T.W. (2000) DNA damage-induced activation of p53 by the checkpoint kinase Chk2. Science;287:1824–1827.
  • 45
    Takai H., Naka K., Okada Y., Watanabe M., Harada N., Saito S., Anderson C.W., Appella E., Nakanishi M., Suzuki H., Nagashima K., Sawa H., Ikeda K., Motoyama N. (2002) Chk2-deficient mice exhibit radioresistance and defective p53-mediated transcription. EMBO J;21:5195–5205.
  • 46
    Jack M.T., Woo R.A., Motoyama N., Takai H., Lee P.W.K. (2004) DNA-dependent protein kinase and checkpoint kinase 2 synergistically activate a latent population of p53 upon DNA damage. J Biol Chem;279:15269–15273.
  • 47
    Zhou B.B., Bartek J. (2004) Targeting the checkpoint kinases: chemosensitization versus chemoprotection. Nat Rev Cancer;4:216–225.
  • 48
    Arienti K.L., Brunmark A., Axe F.U., McClur K., Lee A., Belvitt J., Neff D.F., Haung L., Crawford S., Pandit C.R., Karlsson L., Breitenbcher J.G. (2005) Checkpoint kinase inhibitors: SAR and radioprotective properties of a series of 2-arylbenzimidazoles. J Med Chem;48:1873–1885.
  • 49
    Jobson A.G., Cardellina J.H. II, Scudiero D., Kondapaka S., Zhang H., Kim H., Shoemaker R., Pommier Y. (2007) Identification of a bis-guanylhydrazone [4,4′-diacetyldiphenylurea-bis(guanylhydrazone); NSC 109555] as a novel chemotype for inhibition of Chk2 kinase. Mol Pharmacol;72:876–884.
  • 50
    Carlessi L., Buscemi G., Larson G., Hong Z., Wu J.Z., Delia D. (2007) Biochemical and cellular characterization of VRX0466617, a novel and selective inhibitor for the checkpoint kinase Chk2. Mol Cancer Ther;6:935–944.
  • 51
    Larson G., Yan S., Chen H., Rong F., Hong Z., Wu J.Z. (2007) Identification of novel, selective and potent Chk2 inhibitors. Bioorg Med Chem Lett;17:172–175.
  • 52
    Jobson A.G., Lountos G.T., Lorenzi P.L., Llamas J., Connelly J., Cerna D., Tropea J.E. et al. (2009) Cellular inhibition of checkpoint kinase 2 (Chk2) and potentiation of camptothecins radiation by the novel Chk2 inhibitor PV1019 [7-Nitro-1H-indole-2-carboxylic acid {4-[1-(guanidinohydrazone)-ethyl]-phenyl}-amide]. J Pharmacol Exp Ther;331:816–826.
  • 53
    Zabludoff S.D., Deng C., Grondine M.R., Sheehy A.M., Ashwell S., Caleb B.L., Green S. et al. (2008) AZD7762, a novel checkpoint kinase inhibitor, drives checkpoint abrogation and potentiates DNA-targeted therapies. Mol Cancer Ther;7:2955–2966.
  • 54
    Curman D., Cinel B., Williams D.E., Rundle N., Blockaaron W.D., Goodarzi A., Hutchinsi J.R., Clarkei P.R., Zhou B.-B., Lees-Miller S.P., Andersen R.J., Roberge M. (2001) Inhibition of the G2 DNA damage checkpoint and of protein kinases Chk1 and Chk2 by the marine sponge alkaloid debromohymenialdisine. J Biol Chem;276:17914–17919.
  • 55
    Yu Q., Rose J.L., Zhang H., Takemura H., Kohn K.W., Pommier Y. (2002) UCN-01 inhibits p53 up-regulation and abrogates radiation-induced G2-M checkpoint independently of p53 by targeting both of the checkpoint kinases, Chk2 and Chk1. Cancer Res;62:5743–5748.
  • 56
    Singh S.V., Antosiewicz A.H., Singh A.V., Lew K.L., Srivastava S.K., Kamath R., Brown K.D., Zhang L., Baskaran R. (2004) Sulforaphane-induced G2/M phase cell cycle arrest involves checkpoint kinase 2-mediated phosphorylation of cell division cycle. J Biol Chem;279:25813–25822.
  • 57
    Bucher N., Britten C.D. (2008) G2 checkpoint abrogation and checkpoint kinase-1 targeting in the treatment of cancer. Br J Cancer;98:523–528.
  • 58
    Janetka J.W., Ashwell S. (2009) Checkpoint kinase inhibitors: a review of the patent literature. Expert Opin Ther Pat;19:165–197.
  • 59
    Goel A., Madan A.K. (1995) Structure-activity study on anti-inflammatory pyrazole carboxylic acid hydrazide analogs using molecular connectivity indices. J Chem Inf Comput Sci;35:510–514.
  • 60
    Dureja H., Madan A.K. (2005) Topochemical models for prediction of cyclin-dependent kinase 2 inhibitory activity of indole-2-ones. J Mol Mod;11:525–531.
  • 61
    Gupta S., Singh M., Madan A.K. (2003) Novel topochemical descriptors for predicting anti-HIV activity. Indian J Chem;42A:1414–1425.
  • 62
    Bajaj S. (2005) Study on topochemical descriptors for the prediction of physicochemical and biological properties of molecules. Ph.D. Thesis, New Delhi, India: Guru Gobind Singh Indraprastha University.
  • 63
    Bajaj S., Sambhi S.S., Madan A.K. (2004) Prediction of carbonic anhydrase activation by tri-/tetrasubstituted pyridinium-azole drugs: a computational approach using novel topochemical descriptor. QSAR Comb Sci;23:506–514.
  • 64
    Kumar V., Sardana S., Madan A.K. (2004) Predicting anti-HIV activity of 2,3-diaryl-1,3-thiazolidin-4-ones: computational approach using reformed eccentric connectivity index. J Mol Mod;10:399–407.
  • 65
    Gupta S. (2002) Application and development of graph invariants of drug design. Ph.D. Thesis, Patiala, India: Punjabi University.
  • 66
    Bajaj S., Sambhi S.S., Madan A.K. (2005) Prediction of anti-inflammatory activity of N-arylanthranilic acids: computational approach using refined Zagreb indices. Croat Chem Acta;78:165–174.
  • 67
    Bajaj S., Sambhi S.S., Madan A.K. (2004) Predicting anti-HIV activity of phenethylthiazolethiourea (PETT) analogs: computational approach using Wiener’s topochemical index. J Mol Str (Theochem);684:197–203.
  • 68
    Dureja H., Gupta S., Madan A.K. (2008) Predicting anti-HIV-1 activity of 6-arylbenzonitriles: computational approach using superaugmented eccentric connectivity topochemical indices. J Mol Graph Mod;26:1020–1029.
  • 69
    Randic M. (1975) On characterization of molecular branching. J Am Chem Soc;97:6609–6615.
  • 70
    Gupta S., Singh M., Madan A.K. (2001) Predicting anti-HIV activity: computational approach using a novel topological descriptor. J Comput Aided Mol Des;15:671–678.
  • 71
    Bajaj S., Sambhi S.S., Madan A.K. (2006) Model for prediction of anti-HIV activity of 2-pyridinone derivatives using novel topological descriptor. QSAR Comb Sci;25:813–823.
  • 72
    Sharma V., Goswami R., Madan A.K. (1997) Eccentric connectivity index: a novel highly discriminating topological descriptor for structure-property and structure-activity studies. J Chem Inf Comput Sci;37:273–282.
  • 73
    Gupta S., Singh M., Madan A.K. (2000) Connective eccentric index: a novel topological descriptor for predicting biological activity. J Mol Graph Mod;18:18–25.
  • 74
    Gutman I., Ruscic B., Trinajstic N., Wilcox C.F. (1975) Graph theory and molecular orbitals. XII. Acyclic polyenes. J Chem Phys;62:3399–3405.
  • 75
    Gutman I., Randic M. (1977) Algebraic characterization of skeletal branching. Chem Phys Lett;47:15–19.
  • 76
    Wiener H. (1947) Structural determination of paraffin boiling points. J Am Chem Soc;69:17–20.
  • 77
    Kim H., Koehler G.J. (1995) Theory and practice of decision tree induction. Omega Int J Mgmt Sci;23:637–652.
  • 78
    Sprogar M., Kokol P., Zorman M., Podgorelec V., Yamamoto R., Masuda G., Sakamoto N. (2001) Supporting medical decisions with vector decision trees. Medinfo;10:552–556.
  • 79
    Breiman L. (2001) Random forests. Mach Learn;45:5–32.
  • 80
    Dureja H., Gupta S., Madan A.K. (2008) Topological models for the prediction of pharmacokinetic parameters of Cephalosporins using random forest, decision tree and moving average analysis. Sci Pharm;76:401–408.
  • 81
    Han L., Wang Y., Bryant S.H. (2008) Developing and validating predictive decision tree models from mining chemical structural fingerprints and high throughput screening data in Pubchem. BMC Bioinformatics;9:401.
  • 82
    Bailey D.S., Dean P.M. (1992) Pharmacogenomics and its impact on drug design and optimization. Annu Rev Med Chem;34:339–348.
  • 83
    Gallop M.A., Barrett R.W., Dower W.J., Fodor S.P.A., Gordon E.M. (1994) Applications of combinatorial technologies to drug discovery. 1. Background and peptide combinatorial libraries. J Med Chem;37:1233–1251.
  • 84
    Gordon E.M., Barrett R., Dower W.J., Fodor S.P.A., Gallop M.A. (1994) Applications of combinatorial technologies to drug discovery. 2. Combinatorial organic synthesis, library screening strategies, and future directions. J Med Chem;37:1386–1401.
  • 85
    Devlin J.P. (2000) High Throughput Screening. New York: Marcel Dekker.
  • 86
    Estrada E., Molina E. (2001) Novel local (fragment-based) topological molecular descriptors for QSPR/QSAR and molecular design. J Mol Graph Mod;20:54–64.
  • 87
    Galvez J., Garcia-Domenech R., Julian-Ortiz J.V., Soler R. (1995) Topological approach to drug design. J Chem Inf Comput Sci;35:272–284.
  • 88
    Ivanciuc O., Ivanciuc T., Klein D.J., Seitz W.A., Balaban A.T. (2001) Wiener index extension by counting even/odd graph distances. J Chem Inf Comput Sci;41:536–549.
  • 89
    Madan A.K., Dureja H. (2010) Eccentricity based descriptors for QSAR/QSPR, mathematical chemistry monographs, no. 9. In: Gutman I., Furtula B., editors. Novel Molecular Structure Descriptors – Theory and Applications II. Serbia: Croatian Chemical Society; p. 91–138.
  • 90
    Nikolic S., Kovacevic G., Milicevic A., Trinajstic N. (2003) The Zagreb indices 30 years after. Croat Chem Acta;76:113–124.