Four highly discriminating fourth-generation topological indices (TIs), termed superaugmented eccentric distance sum connectivity indices, as well as their topochemical versions (denoted by , , and ), have been conceptualized in this study. The values of these indices for all possible structures with three, four, and five vertices containing one heteroatom were computed using an in-house computer program. The proposed superaugmented eccentric distance sum connectivity topochemical indices exhibited exceptionally high discriminating power, low degeneracy, and high sensitivity toward both the presence and the relative position of heteroatom(s) for all possible structures with five vertices containing at least one heteroatom. Intercorrelation analysis revealed the absence of correlation of the proposed indices with Zagreb indices and the molecular connectivity index. Subsequently, the proposed TIs were successfully utilized for the development of models for the prediction of checkpoint kinase inhibitory activity of 2-arylbenzimidazoles. A data set comprising 47 differently substituted analogs of 2-arylbenzimidazoles was selected for the study. The values of various TIs for each analog in the data set were computed using an in-house computer program. The resulting data were analyzed, and suitable models were developed through decision tree (DT), random forest (RF), and moving average analysis (MAA). The performance of the models was assessed by calculating the specificity, sensitivity, overall accuracy, and Matthews correlation coefficient (MCC). A decision tree was constructed for the checkpoint kinase inhibitory activity to determine the importance of topological indices. The decision tree identified the proposed TIs –, – as the most important indices. The decision tree learned the information from the input data with an accuracy of 96% and correctly predicted the cross-validated (10-fold) data with an accuracy of 77%.
Random forest correctly predicted the checkpoint kinase inhibitory activity with an accuracy of 83%. Single index-based models were also developed for the prediction of checkpoint kinase inhibitory activity using MAA. The prediction accuracy of the single index-based models derived through MAA varied from a minimum of 90% to a maximum of 95%. The exceptionally high discriminating power, low degeneracy, and high sensitivity of the proposed indices toward branching and the presence of heteroatoms can be of immense use in drug design, isomer discrimination, similarity/dissimilarity studies, quantitative structure activity/property relationships, lead optimization, and combinatorial library design.
The identification and optimization of lead compounds in a rapid and cost-effective way are the most critical steps in drug discovery. The computer-aided drug discovery approach offers an alternative to the real world of synthesis and screening (1,2). Computational techniques have advanced rapidly over the past few decades and have played a major role in the development of a number of drugs now on the market or going through clinical trials (3,4). A quantitative structure activity/property relationship (QSAR/QSPR) is a mathematical relationship linking chemical structure and pharmacological activity/property in a quantitative manner for a series of compounds (5). It also reduces the number of compounds to be synthesized and promptly detects the most favorable compounds. Fundamentally, QSAR aims to identify relationships between aspects of molecular structure and properties such as toxicity, pharmacodynamics, and pharmacokinetics (6).
The 2D approach has a number of advantages compared with higher-dimensional QSAR methodologies. First, owing to the variety of molecular descriptors available, optimized coordinates are not always required. In fact, connectivity information (in the form of an adjacency matrix) alone can be used to develop QSAR models. As a result, models using topological descriptors can be built rapidly for very large sets of molecules. Second, this approach avoids the alignment step and thus can be used in the absence of experimental information regarding the binding of a molecule to its target.
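To illustrate how connectivity information alone yields numerical descriptors, the following minimal sketch computes a few classical indices from an adjacency matrix: the Zagreb indices M1 and M2, the Wiener index, and the eccentric connectivity index. It is not the in-house program used in this study, and it does not implement the proposed superaugmented indices. The function names, the example molecule (the hydrogen-suppressed graph of 2-methylbutane), and the assumption of a connected, unweighted graph are illustrative choices.

```python
from collections import deque


def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix for n vertices."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def distance_matrix(adj):
    """All-pairs shortest path lengths by breadth-first search.

    Assumes the graph is connected and unweighted.
    """
    n = len(adj)
    dist = [[-1] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[start][v] < 0:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist


def topological_indices(adj):
    """Return a few classical indices computed from connectivity alone."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    dist = distance_matrix(adj)
    eccentricity = [max(row) for row in dist]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return {
        # First Zagreb index: sum of squared vertex degrees.
        "M1": sum(d * d for d in degree),
        # Second Zagreb index: sum over edges of the product of end degrees.
        "M2": sum(degree[i] * degree[j] for i, j in pairs if adj[i][j]),
        # Wiener index: sum of distances over all unordered vertex pairs.
        "W": sum(dist[i][j] for i, j in pairs),
        # Eccentric connectivity index: sum of eccentricity times degree.
        "xi_c": sum(e * d for e, d in zip(eccentricity, degree)),
    }


# Hypothetical example: carbon skeleton of 2-methylbutane (5 vertices).
ISOPENTANE = adjacency_from_edges(5, [(0, 1), (1, 2), (2, 3), (1, 4)])
indices = topological_indices(ISOPENTANE)
```

Because every quantity is derived from the adjacency matrix, no optimized 3D coordinates or alignment are needed, which is why such descriptors scale to large sets of molecules.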
2D QSAR makes use of TIs, which are numerical values associated with chemical constitution for the correlation of chemical structures with various physical properties, chemical reactivity, or biological activity (7). These are derived from the topological representation of molecules and can be considered structure-explicit descriptors (8). TIs are among the most useful descriptors currently known, as they can be rapidly computed for large numbers of molecules and also offer a simple way of measuring molecular branching, shape, size, cyclicity, symmetry, chirality, complexity, and heterogeneity of atomic environments in the molecule (9–14). Over the past two decades, the use of TIs in QSAR models has enhanced the scope of drug design by producing reliable estimates of the therapeutic and toxic potential of chemicals (15).
The genetic integrity of a cell is constantly challenged by radiation, chemical agents, and replication errors (16). These agents mainly cause double-strand breaks (DSB) and single-strand breaks (SSB), leading to genomic instability that may result in tumor development if left unrepaired (17). DNA damage is also used to treat cancer. Many conventional anticancer treatments (ionizing radiation, hyperthermia, pyrimidine and purine antimetabolites, alkylating agents, DNA topoisomerase inhibitors, and platinum compounds) act at least partly by damaging the DNA of cells. As these treatments are not specifically selective for cancer cells, patients have suffered from serious side effects when taking these drugs (18). Therefore, DNA damage causes the disease, is used to treat the disease, and is responsible for the toxicity of therapies for the disease (19).
In DNA damage response (DDR), eukaryotic cells activate checkpoint pathways to arrest the cell cycle (20–22). The checkpoints comprise a subroutine integrated into the larger DDR pathway that regulates a multifaceted response. Moreover, several checkpoint genes are essential for cell and organism survival (23–27) implying that these pathways are not only surveyors of occasional damage but are firmly integrated components of cellular physiology (22).
The DNA damage checkpoints are known to comprise signal transduction cascades that link the detection of DNA damage to several other processes, i.e. inhibition of progression through the cell cycle from G1 to S, through S and from G2 into M, activation of DNA repair and initiation of apoptosis (28). DNA damage is recognized by damage sensor proteins such as Mre11-Rad50-Nbs1 (MRN complex) and breast and ovarian cancer locus 1 (BRCA1)-associated genome surveillance complex (BASC). These proteins recruit and activate the upstream Ataxia-telangiectasia mutated (ATM) protein and ATM and Rad 3-related (ATR) kinases (17,29). Checkpoint kinases Chk1 and Chk2 are downstream key mediators of DDR through activation of an increasing number of substrates such as p53, NBS1, BRCA1, MDM2, Cdc25A, Cdc25C, and E2F1 (30–32). The relevance of these kinases in the maintenance of genome integrity is clearly indicated by the severe human genetic disorders and the predisposition to cancer associated with defects in these proteins (20,33–35).
Radiation and chemotherapy, as therapies for cancer, often have serious side effects that limit their efficacy. Modulation of the checkpoints regulating responses to these treatments appears to be a potential strategy to sensitize tumor cells to DNA-damaging agents (17). Checkpoint kinase 2 acts as a mediator in DNA damage signaling and also as a barrier to tumorigenesis (36). There is evidence in favor of the therapeutic value of Chk2 inhibitors (37,38). Checkpoint kinase 2 inhibitors are reported to augment the effect of various cytotoxic drugs, e.g. Doxorubicin (39), Cisplatin (40), and Paclitaxel (41).
The side effects of radiation therapy have been reported to be more serious. As these side effects are in part determined by p53-mediated apoptosis, temporary suppression of p53 has been suggested as a therapeutic strategy to prevent damage to normal tissues during treatment of p53-deficient tumors (42,43). The p53 response to DNA breaks induced by radiation and certain chemical agents is controlled by Chk2 (36). Studies showed that Chk2 deficiency confers radioresistance and that Chk2 plays a critical role in p53 function in response to IR by regulating its transcriptional activity and its stability, indicating the utility of Chk2 inhibitors as radioprotectants for normal cells (44,45). Thus, Chk2 inhibitors may be useful drugs for reducing the side effects of cancer therapy and other types of stress associated with p53 activation (46,47).
Agents that target checkpoint kinases have provided impressive preclinical evidence that this approach will yield tumor-specific potentiating agents and may have broad therapeutic utility. Only a few selective Chk2 inhibitors have been reported, namely 2-arylbenzimidazole (48), NSC 109555 (49), VRX0466617 (50), isothiazole carboxamides (51), and PV 1019 (52). Various inhibitors have been published for Chk1 (Staurosporin, Go6976, SB-218078, ICP-1, CEP-3891, and AZD7762) (53) and for both Chk1 and Chk2 (TAT-S216A, UCN-01, and debromohymenialdisine (54,55), CEP-6367, and Sulforaphane (18,56,57)).
The past decade has witnessed the development of checkpoint kinase inhibitors for the treatment of cancer. Three checkpoint kinase inhibitors have already entered clinical trials since 2005 (58). The pharmaceutical industry strives to explore novel scaffolds for checkpoint kinase inhibition.
In this study, four topological descriptors termed as superaugmented eccentric distance sum connectivity indices and their topochemical versions have been conceptualized and successfully utilized along with existing TIs for development of models for prediction of checkpoint kinase (Chk2) inhibitory activity of 2-arylbenzimidazoles.
Results and Discussion
The successful application of many topological descriptors is somewhat limited owing to low discriminating power and high degeneracy. There is always a strong need for the development of descriptors and approaches that could provide explicit information on the molecular aspects responsible for drug action (1). Moreover, pharmacogenomics (82), combinatorial chemistry (83,84), and high-throughput screening (85) make it possible to obtain and evaluate thousands of compounds in a short time. These technologies have generated new challenges for computational scientists, as they demand novel approaches to accelerated computer-aided lead discovery and optimization (86).
As the structure of a compound depends on the connectivity of its constituent atoms, TIs based on connectivity can reveal the role of structural and substructural information of molecules in estimating biological activity and evaluating toxicity. Topological indices developed for predicting physicochemical properties and biological activities of chemical substances can be used for drug design (87,88). The applications of TIs in drug design include lead discovery and lead optimization, virtual screening, structure activity/property studies, structure pharmacokinetics studies, and structure toxicity relationships. Recently, they have also been used in similarity/dissimilarity studies, combinatorial chemistry, studies of molecular chirality, isomer discrimination, and molecular complexity (1,3).
As shown in Figure 2, the value of changes by a factor of about 11 (from 238.801 to 20.804), the value of changes by a factor of about 30 (from 1554.158 to 52.646), the value of changes by a factor of about 77 (from 9810.431 to 127.118), and the value of changes by a factor of about 203 (from 60235.7 to 296.55) with a minor change in the branching of an 11-membered molecule containing one heteroatom. These descriptors have high discriminating power, which is defined as the ratio of the highest to the lowest value for all possible structures with the same number of vertices. The discriminating power of , , and is 302.9, 643.31, 1301.54, and 2627.99, respectively, for all possible structures containing only five vertices (Table 1).
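The ratio used above can be written as a one-line function. The sketch below applies it to the four pairs of extreme values quoted from Figure 2; the function name and the list name are illustrative, and the pairs are taken in the order in which the text reports them.

```python
def discriminating_power(values):
    """Ratio of the highest to the lowest index value in a set of structures."""
    lowest, highest = min(values), max(values)
    if lowest <= 0:
        raise ValueError("index values must be positive to form a ratio")
    return highest / lowest


# (highest, lowest) values quoted in the text for the four proposed indices.
FIGURE2_EXTREMES = [
    (238.801, 20.804),
    (1554.158, 52.646),
    (9810.431, 127.118),
    (60235.7, 296.55),
]

fold_changes = [discriminating_power(pair) for pair in FIGURE2_EXTREMES]
```

Applied to the full set of index values for all structures with a given number of vertices, the same function gives the discriminating power as defined above.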
The extremely low degeneracy of the proposed indices ensures enhanced sensitivity toward minor changes in branching, connectivity, and molecular structure. The intercorrelation between the proposed superaugmented eccentric distance sum connectivity topochemical indices and other well-known TIs was also investigated. Pairs of TIs with r ≥ 0.97 are considered highly intercorrelated, those with 0.90 ≤ r ≤ 0.97 appreciably correlated, those with 0.50 ≤ r ≤ 0.89 weakly correlated, and finally the pairs of TIs with r < 0.50 are not intercorrelated (90). As indicated in Table 2, , , , and are not intercorrelated with the well-known χA, ξc, M1, and M2. However, these indices were found to be weakly intercorrelated with Wc and highly intercorrelated with each other, as they are based on similar principles/matrices. The pairs χA and ξc, and M1 and M2, are highly intercorrelated, whereas χA and Wc, ξc and Wc, ξc and M1, and ξc and M2 are weakly intercorrelated, and M1 and M2 are not intercorrelated with Wc.
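The intercorrelation bands quoted above can be expressed as a small classifier. This is a sketch only: the function names are hypothetical, the use of the absolute value of r is an assumption, and boundary values (r exactly 0.97, or between 0.89 and 0.90) are assigned to the higher and lower band respectively, since the published ranges overlap or leave a gap there.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def classify_intercorrelation(r):
    """Map a correlation coefficient onto the four bands used in the text."""
    magnitude = abs(r)  # assumption: sign of r is ignored
    if magnitude >= 0.97:
        return "highly intercorrelated"
    if magnitude >= 0.90:
        return "appreciably correlated"
    if magnitude >= 0.50:
        return "weakly correlated"
    return "not intercorrelated"


# Hypothetical index values for the same five structures.
index_a = [1.0, 2.0, 3.0, 4.0, 5.0]
index_b = [2.0, 4.0, 6.0, 8.0, 10.0]
label = classify_intercorrelation(pearson_r(index_a, index_b))
```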
In this study, decision tree (DT)-, random forest (RF)-, and moving average analysis (MAA)-based models were developed for the prediction of checkpoint kinase (Chk2) inhibitory activity of 2-arylbenzimidazoles. The decision tree was built using 26 TIs of diverse nature. The index at the root node is the most important, and the importance of an index decreases with its depth in the tree. The classification of 2-arylbenzimidazole analogs as active or inactive using a single tree, based on A13, A14, and A6, is illustrated in Figure 5 (each descriptor is denoted by an alphanumerical abbreviation that refers to Table 3). The decision tree identified A13 () as the most important index. The decision tree classified the 2-arylbenzimidazole analogs in the training set with an accuracy of 96% and in 10-fold cross-validation with an accuracy of 76.6%. The specificity and sensitivity of the DT-based model in the training set were of the order of 96.5% and 94.4%, respectively (Table 4). The specificity and sensitivity of the DT-based model in the cross-validated set with respect to inactive analogs were of the order of 82.7% and 66.6%, respectively. The values of MCC for the DT-based model in the training set and cross-validated set are 0.9 and 0.03, respectively, suggesting the randomness and robustness of the model. The values of specificity, sensitivity, and MCC are shown in Table 4.
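For reference, the four performance measures can be computed from a 2×2 confusion matrix as sketched below. The counts in the example are an assumption, not data from Table 4: they were chosen because, for 47 compounds, they reproduce the training-set percentages reported above (17 of 18 and 28 of 29 correctly classified in the two classes). The function name and argument names are illustrative.

```python
from math import sqrt


def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy, and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Assumed counts consistent with the reported training-set percentages.
training = classification_metrics(tp=17, tn=28, fp=1, fn=1)
```

MCC ranges from -1 to +1, with values near 0 indicating performance no better than random assignment, so it is a stricter summary than accuracy when the two classes are of unequal size.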
The RFs were grown with the 26 topological descriptors listed in Table 3. The importance of each node was determined by the mean decrease in accuracy. The RF classified the 2-arylbenzimidazole analogs as either active or inactive with an accuracy of 83%. The specificity and sensitivity were of the order of 82.7% and 88.8%, respectively, and the value of MCC was found to be 0.098 (Table 4).