Inhibition activity prediction for a dataset of drug candidates by combining fuzzy logic with MLR/ANN QSAR models

A simple, low-computational-cost hybrid of artificial intelligence methods was used for QSAR modeling. Approximately 90 pyridinylimidazole-based drug candidates with a range of potencies against p38α MAP kinase were investigated. To gain flexibility and a more effective capability for handling and processing real-world information, fuzzy set theory was introduced into the QSAR. An integration of multiple linear regression (MLR) and an artificial neural network (ANN) with adaptive neuro-fuzzy inference systems (ANFIS) was developed to predict the inhibition activity. The ANFIS algorithm was applied to identify suitable variables and then to find the optimal descriptors. The gradient descent with momentum backpropagation ANN was used to establish the nonlinear multivariate relationships between the chemical structural parameters and the biological response. A comparison between the results of the proposed linear and nonlinear regressions showed the superiority of QSAR modeling by the ANFIS-ANN method over MLR. The results demonstrated that ANFIS can be applied successfully as a feature selection method. The appearance of the Diam, Homo, and LogP descriptors in the model showed the importance of steric, electronic, and thermodynamic interactions between a drug and its target site, in the distribution of a compound within a biosystem, and in its interaction with competing binding sites.


| INTRODUCTION
The mitogen-activated protein (MAP) kinases are key signaling molecules. They are serine/threonine protein kinases (Wang, Wu, Ai, & Wang, 2015; Wang et al., 1998). They participate in diverse cellular events and are potential targets for treating inflammation, cancer, and other diseases. The MAP kinase p38 is responsive to environmental stresses and is involved in the production of cytokines during inflammation (Huang et al., 2012; Wang et al., 1998). Numerous diseases, including inflammatory and autoimmune diseases such as rheumatoid arthritis (RA) or inflammatory bowel disease (IBD), neurodegenerative conditions, cardiovascular events, and even cancer, could be associated with atypical regulation of protein kinase-mediated cell signaling. The progress of chronic inflammation is driven by an amplified systemic occurrence of several proinflammatory cytokines such as tumor necrosis factor-α (TNFα), whose biosynthesis and release are elevated by the activity of the mentioned protein kinases, for example, the signal molecule p38 mitogen-activated protein kinase (p38, p38 MAPKα) (Chen et al., 2001; Noble, Endicott, & Johnson, 2004; Schieven, 2005; Wang et al., 2015). The discovery of specific p38 inhibitors that block the production of inflammatory cytokines has generated significant interest in the MAP kinases as drug targets (Lee et al., 1994). Several small-molecule p38 MAPK inhibitors have been shown to effectively block the production of IL-1, TNF, and other cytokines in vitro and in animal tests (Adams, Badger, Kumar, & Lee, 2001; Gallagher et al., 1997; Liverton et al., 1999; Wagner & Laufer, 2006).
The toxic effects of chemical compounds on the body exhibit a certain specificity, which depends on the compound's chemical structure and reactivity (Nel, Xia, Mädler, & Li, 2006). Molecular descriptors and chemometrics are a powerful combined tool for studying pharmaceutical, toxicological, and environmental problems. The concept of molecular structure is one of the most important concepts in the development of scientific knowledge. Molecular structures are internal factors governing the physicochemical properties, environmental behavior, and toxicology of organic compounds. Compounds with similar molecular structures should have similar physicochemical properties, environmental fate, and toxicological effects; that is, there are inherent relations between molecular structures and their physicochemical properties and environmental behavioral and toxicological parameters (Tunkel, Mayo, Austin, Hickerson, & Howard, 2005). These relations can be characterized as mathematical models, termed structure-activity relationships (SARs) or quantitative structure-activity relationships (QSARs), collectively referred to as (Q)SARs. We have reviewed these concepts in detail in other papers (Abdolmaleki & Ghasemi, 2017; Abdolmaleki, Ghasemi, & Ghasemi, 2017a; Abdolmaleki, Ghasemi, & Ghasemi, 2017b; Ahmadi, Khazaei, & Abdolmaleki, 2014). As a matter of fact, reasoning based on molecular structure has been the main engine for the great development of the molecular sciences. Therefore, molecular descriptors now play a key role in scientific research and are applied in modeling several different properties in fields such as analytical chemistry, physical chemistry, medicinal and pharmaceutical chemistry, and environmental and toxicological studies (Chen, Li, Yu, Wang, & Qiao, 2008).
Artificial intelligence (AI) or computational intelligence (CI) techniques such as artificial neural networks and fuzzy logic (FL) are methods of choice for structure-activity and structure-property correlation. AI-based methods span a wide variety of applications in theoretical modeling and simulation. In recent years, they have been applied successfully in many areas such as knowledge discovery and data mining as well as statistical analysis and machine learning. These techniques have received considerable attention because of their ability to deal with imprecision and uncertainty and to learn in order to achieve tractability, robustness, and low solution cost. They mainly consist of fuzzy logic (FL), artificial neural networks, probabilistic reasoning, and evolutionary optimization (Ciaramella et al., 2005; Hou, Su, & Chang, 2008; Katritzky et al., 2001, 2010; Lin & Lee, 1996). Neural networks are often used in conjunction with optimization techniques for feature selection, ranging from simple greedy approaches such as forward selection or backward elimination to other AI techniques such as adaptive neuro-fuzzy inference systems (ANFIS), ant colony, and particle swarm (Agrafiotis, Cedeno, & Lobanov, 2002; Buyukbingol, Sisman, Akyildiz, Alparslan, & Adejare, 2007; Loukas, 2001; Mwense et al., 2006). FL has been applied in different fields, such as computational biology, physics, automated control, and decision-making support. The underlying idea in FL applications is that people are capable of making decisions using imprecise or uncertain knowledge, whereas traditional computer algorithms require precise information. An important advantage of fuzzy models is that they can incorporate knowledge from human experts naturally and conveniently, while traditional models fail to do so.
Other important properties of fuzzy models are their ability to handle nonlinearity and their interpretability. FL is the most popular constituent of the CI area since fuzzy systems are able to represent human expertise in the form of IF antecedent THEN consequent statements. In this domain, the system behavior is modeled through the use of linguistic descriptions (Babuška & Verbruggen, 2003; Efe & Kaynak, 2001; Nelles, 2001; Paripour, Ferrara, & Salimi, 2017; Zadeh, 1996; Zhou & Gan, 2008).
Integrating these methodologies so that the strengths of each are exploited synergistically is a driving force for synthesizing hybrid intelligent systems. Therefore, in the present work, a hybrid method consisting of ANFIS, MLR, and ANN is developed for a diverse set of an important group of medicinal compounds. In the following sections, the theory of the proposed methods is briefly described. Then, the results of the proposed system and the comparison results are presented. Finally, concluding remarks are made.

| METHODOLOGICAL BACKGROUND
Normally, in a QSAR/QSPR study, an optimization algorithm is used to select the descriptors containing the most information about the given property or activity. CI methods can be complementary to previous approaches and can be used to search efficiently over many disciplines of biomedicine and biochemistry.

An artificial neural network (ANN) consists of several layers of a large number of highly interconnected computational units called neurons, working in unison to solve specific problems. ANNs are considered information processing systems that have the ability to learn, recall, and generalize from training data. Basically, an artificial neural network consists of neurons, simple processing elements, which are activated as soon as their inputs exceed certain thresholds. The neurons are arranged in layers, which are connected so that the signals at the input are propagated through the network to the output (Gath & Geva, 1989).
ANNs are useful tools in QSAR/QSPR studies, particularly in cases where it is difficult to specify an exact mathematical model for describing a given structure-property relationship. In fact, the ANN is a popular strategy for nonlinear modeling in QSAR studies, and some researchers have indicated its capability for drug discovery (Deeb & Clare, 2007; Deeb & Drabh, 2010; Deeb & Goodarzi, 2010; Deeb & Hemmateenejad, 2007; Ghasemi, Mehridehnavi, Pérez-Garrido, & Pérez-Sánchez, 2018). Most of these works used neural networks based on the backpropagation learning algorithm, which has some disadvantages such as local minima, slow convergence, time-consuming nonlinear iterative optimization, and difficulty in explicitly determining the optimum network configuration. In contrast, the parameters of radial basis function neural networks (RBFNNs) can be adjusted by fast linear methods. RBFNNs have the advantage of short training times and are guaranteed to reach the global minimum of the error surface during training. The optimization of their topology and learning parameters is easy to implement (Walczak & Massart, 2000; Yao et al., 2004). An RBFNN can be described as a three-layer feed-forward structure: an input layer, a hidden layer, and an output layer. The input layer does not process the information; it only distributes the input vectors to the hidden layer. The transfer function in the hidden layer of RBF networks is called the kernel or basis function. For a detailed description, the reader is referred to the cited references. Each hidden layer unit represents a single radial basis function, with an associated center position and width. Each neuron in the hidden layer employs a radial basis function as a nonlinear transfer function to operate on the input data (Yao et al., 2004).

| Adaptive neuro-fuzzy inference systems (ANFIS)
Adaptive neuro-fuzzy inference systems are synthesized by appropriately integrating the neural and fuzzy system interpretations. Therefore, the resulting hybrid combination inherits the numeric power of NNs as well as the verbal power of FL (Efe, Kaynak, & Wilamowski, 2000; Jang, Sun, & Mizutani, 1997; Li, Huang, & Chen, 2004). An ANFIS structure having m inputs and a single output, with the product inference rule and a first-order Sugeno model, can be described as in (Noble et al., 2004), with f_i being described as in the rule consequent. The structural view of such a system is illustrated in Figure 1. The rule structure for an ANFIS utilizing a first-order Sugeno model has the following representation: When the consequent part of the rule structure is compared with that of rules in the SFS architecture, it is seen that the polynomial representation of the decision introduces higher parametric flexibility, extending the realization capability. The ANFIS structure has been utilized with gradient-based training strategies for the identification of nonlinear systems (Efe & Kaynak, 1999). An in-depth discussion of the use of the ANFIS structure, with numerous examples, is given in Jang et al. (1997).
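To make the first-order Sugeno structure concrete, the following minimal sketch evaluates such a system in its commonly stated form, which is assumed here rather than taken from the text: each rule reads IF x1 is A_i1 AND ... AND xm is A_im THEN f_i = p_i0 + p_i1 x1 + ... + p_im xm, the firing strength of a rule is the product of its membership degrees, and the output is the firing-strength-weighted average of the f_i. The Gaussian membership shape, the function names, and all numbers are illustrative assumptions.

```python
import math

def gaussian(x, center, width):
    """Gaussian membership degree of x, a value in (0, 1]."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def sugeno_output(x, rules):
    """First-order Sugeno inference with product inference.

    x     : list of m input values
    rules : list of (memberships, coeffs) pairs, where
            memberships = [(center, width), ...], one per input, and
            coeffs = [p0, p1, ..., pm] for f_i = p0 + p1*x1 + ... + pm*xm
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for memberships, coeffs in rules:
        # Firing strength w_i: product of the membership degrees.
        w = 1.0
        for xj, (c, s) in zip(x, memberships):
            w *= gaussian(xj, c, s)
        # Linear (first-order) rule consequent f_i.
        f = coeffs[0] + sum(p * xj for p, xj in zip(coeffs[1:], x))
        weighted_sum += w * f
        weight_total += w
    # Overall output: weighted average of the rule consequents.
    return weighted_sum / weight_total

# Hypothetical two-rule, two-input system (all numbers are illustrative).
rules = [
    ([(0.0, 1.0), (0.0, 1.0)], [1.0, 2.0, 0.0]),
    ([(1.0, 1.0), (1.0, 1.0)], [0.0, 0.0, 3.0]),
]
y = sugeno_output([0.5, 0.5], rules)
```

In ANFIS training, the membership parameters (centers and widths) and the consequent coefficients are the adjustable quantities; this sketch only evaluates a fixed system.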

| Datasets and descriptor generation
FIGURE 1 Structure of an ANFIS
The dataset consisted of experimental data for pyridinylimidazole-based compounds taken from the study of Laufer et al., as shown in Table 1 (Laufer, Hauser, Domeyer, Kinkel, & Liedtke, 2008). The biological data (IC50) from Laufer et al. used in this study were expressed in μM. In this case, log-transformed values did not significantly improve the model's prediction ability. First, the geometries of all 3D structures were optimized to energy minima using the AM1 method in the HyperChem program. Many descriptors were then calculated with the ChemOffice (CS ChemOffice 2005 molecular modeling software, version 9) and MMP+ (Molecular Modeling Pro Plus (MMPP), version 6.0, ChemSW Inc.) programs to study the similarity or dissimilarity of the compounds and to predict the relationships between physicochemical properties and biological activities. The full set of descriptors was screened to remove intercorrelated (redundant) descriptors and reduce multicollinearity (Buyukbingol et al., 2007).
In the ANFIS feature selection step, the dataset was partitioned into a training set (odd-indexed samples) and a checking set (even-indexed samples) to select the set of inputs that most influence the IC50, and overfitting was monitored by plotting the resulting errors. In the modeling step, for both MLR and ANN, 75% of the data were selected for the training set and 25% for the test set. Individual elements were selected randomly. The data values were extracted and written to separate training and testing data files. MATLAB (version 7.1, MathWorks, Inc.) and the Neural Networks Toolbox were used to implement the neural networks.
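The two partitioning schemes described above can be sketched as follows. This is an illustrative sketch only: the function names, the fixed random seed, and the placeholder compound IDs are assumptions, and positions are counted from 1 so that "odd-indexed" means the 1st, 3rd, 5th, ... samples.

```python
import random

def odd_even_split(samples):
    """Odd positions (1st, 3rd, ...) go to training, even positions to checking."""
    training = samples[0::2]   # 1-based odd positions
    checking = samples[1::2]   # 1-based even positions
    return training, checking

def random_split(samples, train_fraction=0.75, seed=0):
    """Randomly assign about train_fraction of the samples to the training set."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

compounds = list(range(1, 21))  # placeholder compound IDs, not the real dataset

# Split used for feature selection (training vs. checking).
fs_train, fs_check = odd_even_split(compounds)

# Split used for MLR/ANN modeling (75% training, 25% test).
model_train, model_test = random_split(compounds)
```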

| ANFIS feature selection
A critical issue with many biological datasets is the overwhelming number of possible features that could be used as input to a model. Standard statistical approaches can be used to reduce the number of features, but in many cases, thousands of features may still remain. It may also be the case that the researcher is most interested in identifying the key nonlinear relationships between features that are useful in an output decision, for which reducing the feature space using linear regression methods may not be well suited. In such cases, the researcher can use an evolutionary process to evolve the selection of features to be used as input to the model concurrently with the optimization of the model itself. Such a method rapidly identifies useful collections of features while simultaneously producing an optimized model. Therefore, an exhaustive search was performed over the available inputs to determine the set of input attributes most influential in predicting the activity of the mentioned drugs (the output). In this procedure, an ANFIS model is constructed for each combination of inputs, trained, and its performance reported. When the training and checking errors are comparable, this implies that there is no overfitting.
We obtained 31 descriptors out of the 49 descriptors in the dataset, and these were used to find the significant descriptors. The descriptors selected for the one-variable ANFIS model (after removing input attributes with RMSE > 0.6) are shown in Figure 2. We used the constraint of removing inputs with RMSE > 0.6 to avoid overfitting and overtraining. Models with two, three, or four variables could simply be built from the individual inputs that have the least errors in the plot. However, this will not necessarily be the optimal combination of inputs that results in the minimal training error. To verify this, we repeated the search for the optimal combination of two, three, and four input attributes.
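The exhaustive search over input combinations can be sketched as below. This is not the ANFIS procedure itself: to keep the example self-contained, each candidate ANFIS model is replaced by a simple k-nearest-neighbour regressor (an assumed stand-in), and the descriptor names A, B, C and all data are synthetic. Only the search logic, scoring every combination of a given size and ranking by checking RMSE, mirrors the description above.

```python
import math
from itertools import combinations

def knn_predict(train_x, train_y, query, k=3):
    """Stand-in model: mean activity of the k nearest training samples."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def exhaustive_search(train, check, names, n_inputs):
    """Score every combination of n_inputs descriptors by its checking RMSE."""
    (train_x, train_y), (check_x, check_y) = train, check
    results = []
    for combo in combinations(range(len(names)), n_inputs):
        tx = [[row[j] for j in combo] for row in train_x]
        cx = [[row[j] for j in combo] for row in check_x]
        preds = [knn_predict(tx, train_y, q) for q in cx]
        results.append((rmse(check_y, preds), [names[j] for j in combo]))
    return sorted(results)  # lowest checking error first

# Synthetic data: the activity depends on descriptor "A" only.
names = ["A", "B", "C"]
train_x = [[i, (i * 7) % 5, (i * 3) % 4] for i in range(12)]
train_y = [2.0 * row[0] for row in train_x]
check_x = [[i + 0.5, ((i + 1) * 7) % 5, ((i + 2) * 3) % 4] for i in range(11)]
check_y = [2.0 * row[0] for row in check_x]

ranking_1 = exhaustive_search((train_x, train_y), (check_x, check_y), names, 1)
ranking_2 = exhaustive_search((train_x, train_y), (check_x, check_y), names, 2)
best_single = ranking_1[0][1]
```

The number of candidate models grows combinatorially with the number of descriptors and the subset size, which is one reason to prune weak single inputs before searching larger combinations.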

| Development of the models
Building good predictive models is difficult, and the results are not always satisfactory. Moreover, the performance of a model may be acceptable in one domain but insufficient in another. Recent studies show that combined models minimize uncertainty and produce more robust predictors. Here, two approaches, MLR and ANN, were considered, both based on the descriptors selected by the ANFIS method.

| MLR method
The stepwise multiple linear regression procedure was used for model generation. In stepwise multiple linear regression, a linear equation is produced, but not all independent variables are used. Variables are added to the equation one at a time, and a new regression is performed after each addition. The new term is retained only if the equation passes a test for significance. The three descriptors selected by ANFIS were used as input for the regression analysis using the stepwise procedure, as implemented in the SPSS software package (version 11.5, SPSS Inc.). As a first step, a correlation matrix was computed for these three descriptors calculated for each molecule. Inspection of this matrix did not show any considerable correlation (R ≥ 0.90) between them. This dataset was obtained after eliminating one outlier (4) from the dataset; outliers were defined as ≥2SD, where SD is the standard deviation measured by SPSS. Among the suitable models obtained, the best MLR model was chosen for further evaluation. The best MLR model consists of two descriptors, Homo and LogP. The main goal of generating the MLR model was to develop a linear calibration model for the prediction of IC50.
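The two screening checks mentioned above, the pairwise correlation threshold (R ≥ 0.90) and the 2SD outlier rule, can be illustrated with a short sketch. The descriptor values and residuals below are invented for illustration, the function names are hypothetical, and applying the 2SD rule to residuals with the sample standard deviation is an assumption, since the text does not state which quantity was tested.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlated_pairs(descriptors, threshold=0.90):
    """Return descriptor pairs whose |R| meets or exceeds the threshold."""
    names = list(descriptors)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson_r(descriptors[names[i]], descriptors[names[j]])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], round(r, 3)))
    return flagged

def outliers_2sd(values):
    """Indices of values at least two sample standard deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [i for i, v in enumerate(values) if abs(v - mean) >= 2 * sd]

# Invented descriptor values for six hypothetical compounds.
descriptors = {
    "Diam": [1, 2, 3, 4, 5, 6],
    "Homo": [2, 1, 4, 3, 6, 5],
    "LogP": [3, 1, 2, 6, 4, 5],
}
redundant = correlated_pairs(descriptors)

# Invented residuals; the last one is far from the rest.
residuals = [0.1, -0.2, 0.0, 0.15, -0.1, 0.05, -0.05, 0.2, -0.15, 3.0]
flagged = outliers_2sd(residuals)
```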

| ANN method
TABLE 1 Biological activity of different pyridinylimidazole-based compounds, including the following: (I) 2-Halogen-substituted pyridin-4-yl- and quinolin-4-ylimidazole derivatives with a 4-fluorophenyl moiety at imidazole C4 or C5. (II) 2-Halogen-substituted pyridin-4-ylimidazole derivatives with a 3-trifluoromethylphenyl moiety at imidazole C4 or C5. (III) (Hetero)arylalkylaminopyridines (2-thioimidazole derivatives with a 4-fluorophenyl ring). (IV) (Hetero)arylalkylaminopyridines, (cyclo)alkylaminopyridines and (cyclo)alkylaminopyridines with an additional polar residue (2-thioimidazole derivatives with a 3-trifluoromethylphenyl ring). (V) (Cyclo)alkylaminopyridines and (cyclo)alkylaminopyridines with an additional polar group (2-thioimidazole derivatives with a 4-fluorophenyl ring). (VI) 2-(Thi)oxypyridinylimidazoles. (VII) Tetrasubstituted aminopyridinylimidazoles
ANNs are useful tools in QSAR/QSPR studies, particularly in cases where it is difficult to specify an exact mathematical model for describing a given structure-property relationship. Most of these works used neural networks based on the backpropagation learning algorithm, which has some disadvantages such as local minima, slow convergence, time-consuming nonlinear iterative optimization, and difficulty in explicitly determining the optimum network configuration (Walczak & Massart, 2000; Yao et al., 2004). In this work, the input parameters Diam, Homo, and LogP were normalized to the range (−1, 1) to simplify the comparison of the descriptors' effects on IC50. The feed-forward network was formed with six neurons in the hidden layer and one neuron in the output layer. Radbas and tansig transfer functions were chosen for the hidden layer and output layer, respectively. The learning algorithm was Levenberg-Marquardt, with mean squared error as the performance function. This algorithm is more efficient than the basic backpropagation (BP) algorithm and is highly recommended as a first-choice supervised algorithm, although it does require more memory than other algorithms.
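The network shape described above (three normalized inputs, six radbas hidden neurons, one tansig output neuron) can be sketched as a single forward pass. This is an illustrative sketch, not the trained model: the weights are random placeholders, the input values are invented, and the hidden neurons are assumed to apply radbas to a weighted sum plus bias. The transfer functions follow their usual definitions, radbas(n) = exp(−n²) and tansig(n) = tanh(n). Training (Levenberg-Marquardt) is not shown.

```python
import math
import random

def scale_to_unit_range(values):
    """Linearly map values onto [-1, 1] (minimum -> -1, maximum -> +1)."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def radbas(n):
    """Radial basis transfer function: exp(-n^2)."""
    return math.exp(-n * n)

def tansig(n):
    """Hyperbolic tangent sigmoid transfer function."""
    return math.tanh(n)

def forward(x, hidden_w, hidden_b, out_w, out_b):
    """One forward pass: radbas hidden layer, single tansig output neuron."""
    hidden = [radbas(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    return tansig(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

# Random placeholder weights for a 3-6-1 network (not fitted values).
rng = random.Random(0)
n_inputs, n_hidden = 3, 6
hidden_w = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
hidden_b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
out_w = [rng.uniform(-1, 1) for _ in range(n_hidden)]
out_b = rng.uniform(-1, 1)

# Invented LogP values, scaled before being fed to the network.
logp_scaled = scale_to_unit_range([1.2, 2.5, 3.1, 4.0])
prediction = forward([0.1, -0.4, logp_scaled[1]], hidden_w, hidden_b, out_w, out_b)
```

Because the output neuron uses tansig, predictions lie in (−1, 1) and would need to be mapped back to the IC50 scale with the inverse of the target scaling.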

| RESULTS AND DISCUSSION
The main goals of the present work were as follows: (a) to build regression models based on ANFIS feature selection in order to reveal the structure-activity relationship of novel 2,4,5- and 1,2,4,5-substituted 2-thioimidazoles, (b) to achieve a better understanding of the physicochemical basis of the inhibitory activity of these drugs and its mechanism, and (c) to compare the ability of the linear (MLR) and nonlinear (ANN) chemometric techniques to predict the activity of a diverse set of 2-thioimidazole derivatives. To fulfill these goals, one needs a very diverse dataset. As can be seen from

| ANFIS results
Using ANFIS to evolve the selection of features rapidly identifies useful collections of features while simultaneously producing an optimized model. The goal of ANFIS is to find a model or mapping that correctly associates the inputs (structural descriptors) with the output (biological activity). By this procedure, the most significant descriptors (inputs) in the fuzzy system were automatically identified. The ANFIS was simulated using the MATLAB Fuzzy Logic Toolbox. ANFIS turned out to be less appropriate for biological applications, since the number of adjustable network parameters grows exponentially with the number of molecular descriptors, owing to the network's complex architecture. Therefore, for practical use, the number of descriptors was restricted to at most 12 inputs (after removing input attributes with RMSE > 0.6). The plot and results from the exhaustive search (Figure 2) clearly indicate that the input attribute LogP is the most influential. More than one input attribute can be selected to build the ANFIS model, so we examined ANFIS models with two, three, and four variables. The plot demonstrates the result of selecting three inputs, in which Diam, Homo, and LogP were selected as the best combination of three input variables. With four inputs, however, the minimal training (and checking) error does not decrease significantly from that of the best three-input model, which indicates that the newly added attribute does not improve the prediction much. For better generalization, we always prefer a model with a simple structure. ANFIS dynamically constructs the initial (input) and final (output) membership functions (MFs) based on the nature of the data. The degree to which an object belongs to a fuzzy set is denoted by a membership value between 0 and 1. The genfis1 function generates an initial FIS from the training data, which is then fine-tuned by ANFIS to generate the final model.
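As an illustration of membership values between 0 and 1, the three membership function shapes compared in this work can be written out directly. These are plain Python re-implementations of the usual mathematical definitions, named after the corresponding toolbox functions for readability; the argument order and the parameter values below are assumptions made for this sketch.

```python
import math

def gaussmf(x, sigma, c):
    """Gaussian MF with width sigma and center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def gauss2mf(x, sigma1, c1, sigma2, c2):
    """Two-sided Gaussian MF; assumes c1 <= c2, with a plateau of 1 between them."""
    left = gaussmf(x, sigma1, c1) if x < c1 else 1.0
    right = gaussmf(x, sigma2, c2) if x > c2 else 1.0
    return left * right

def trimf(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b; assumes a < b < c."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Membership degrees of a normalized descriptor value in three illustrative sets.
x = 0.25
degrees = {
    "gaussmf": gaussmf(x, 0.5, 0.0),
    "gauss2mf": gauss2mf(x, 0.5, -0.2, 0.5, 0.2),
    "trimf": trimf(x, -1.0, 0.0, 1.0),
}
```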
FIGURE 2 Every input variable's influence on drug activity ((○) training and (*) checking errors)
Figure 3 indicates that the minimal checking error occurs at about epoch 30, which is marked by a circle. ANFIS was trained for 28 epochs with gauss2mf as a proper MF. Therefore, we selected the three-input ANFIS for further exploration. Notice that the checking error curve goes up after 30 epochs, indicating that further training overfits the data and produces worse generalization.
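Choosing the stopping point from the checking error curve amounts to locating its minimum, as in the following minimal sketch. The error values are invented to mimic a curve that falls, bottoms out, and then rises; they are not the values plotted in Figure 3, and the function name is hypothetical.

```python
def best_epoch(checking_errors):
    """Return the 1-based epoch with the lowest checking error."""
    index = min(range(len(checking_errors)), key=checking_errors.__getitem__)
    return index + 1

# Hypothetical checking-error curve (one value per training epoch).
checking_errors = [0.90, 0.72, 0.61, 0.55, 0.54, 0.56, 0.60, 0.66]
stop_at = best_epoch(checking_errors)
```

Training beyond this epoch lowers the training error further but raises the checking error, which is the overfitting signature described above.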
The ANFIS prediction can be compared against a linear regression model by comparing their respective RMSE values on the checking data. Thus, the performance of the ANFIS model with different MFs, namely gauss2mf, gaussmf, and trimf, was compared with that of a linear regression model. The RMSE values on the checking data for these ANFIS models were 0.535, 0.646, and 0.573, respectively. The much higher RMSE of the linear regression (6.547), compared with the low RMSE values of the ANFIS models, shows the superiority of the ANFIS method over linear regression and indicates that appropriate descriptors were selected. The final selected descriptors in our system belong to three groups: steric, electronic, and thermodynamic descriptors. As noted before, the correlation matrix for the three most significant descriptors showed no significant intercorrelation.
The attractive features of an ANFIS include ease of implementation, fast and accurate learning, strong generalization abilities, excellent explanation facilities through fuzzy rules, and easy incorporation of both linguistic and numeric knowledge for problem-solving (Roy & Chakraborty, 2013).

| MLR results
Linear regression methods such as MLR and PLS have been widely used for obtaining predictive and descriptive QSAR/QSPR models in our research group (Ahmadi et al., 2014; Ghasemi, Abdolmaleki, Asadpour, & Shiri, 2008; Ghasemi, Asadpour, & Abdolmaleki, 2007; Ghasemi, Saaidpour, & Brown, 2007; Rouhollahi, Shafieyan, & Ghasemi, 2007). In this QSAR study, stepwise MLR analysis was employed on the training dataset to establish the quantitative regression model. The MLR method provided a useful equation linking the structural features to the IC50 of the compounds in the following form: This QSAR model can be used to predict the activity of compounds based on these parameters. In the proposed model, one of the two variables (Homo, the highest occupied molecular orbital energy) is related to the electronic properties of the molecules (e.g., the partial charge distribution or the electronegativities of atoms), and the other descriptor, LogP, is related to the thermodynamic properties of the molecules.
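The fitted coefficients of the MLR equation are not reproduced in this text. For orientation only, a two-descriptor linear model in Homo and LogP has the general form below, where b_0, b_1, and b_2 are placeholder symbols for the regression intercept and coefficients and are not values from this study:

```latex
\mathrm{IC}_{50} = b_0 + b_1\,\mathrm{Homo} + b_2\,\mathrm{LogP}
```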
Validating a QSPR model with external data (i.e., data not used in the model development) is the best method of validation. The predictive power of the regression model developed on the selected training set is estimated from the predicted values for the prediction set chemicals. Multiple regression analysis of these data yielded a reasonably good fit (R = 0.8050) for the activity of the chemicals. The residual values obtained by the MLR modeling, shown in Figure 4, are the differences between the experimental and predicted IC50 values. The model obtained is good, bearing in mind the great variety of functionality: amino, alkyl, hydroxyl, carbonyl, and carboxyl groups, and aromatic rings. Furthermore, the existence of a trend in the residual plot shown in Figure 4 may indicate nonlinearity in this dataset. It is well known that biological systems are inherently nonlinear and dynamic; in addition, the cumulative probability plot (Figure 5) shows nonlinearity in the distribution of the dataset, which supports the nonlinearity of the dataset.

| ANN results
Inspection of the results in the prior section shows that this dataset is a typical nonlinear regression problem. This problem can be tackled using a neural learning algorithm. Since there might be a strongly nonlinear component in the relationship between IC50 and the three descriptors (selected by ANFIS), QSAR modeling was also performed using ANNs to obtain the ANN counterpart to Equation (1). The learning capability of the network was tested by varying the transfer functions in the hidden and output layers and the number of neurons in the hidden layer, as shown in Table 2. It was observed that the network generalizes best with six neurons in the hidden layer and the radbas and tansig transfer functions. The mean squared error was found to be 0.0197 for the training set and 0.0231 for the testing set. The training was terminated after 323 epochs.
Satisfactory results were obtained with the Levenberg-Marquardt method using the RBFNN. In a radial basis network, the hidden layer uses the radbas transfer function (Duch & Jankowski, 2001). The Levenberg-Marquardt method is extremely fast. Its advantage is that it converges faster around the minimum and gives more accurate results. Its only drawback is that it requires more memory than the backpropagation with momentum method. In contrast to backpropagation NNs, the parameters of RBFNNs can be adjusted by fast linear methods. RBFNNs have the advantage of short training times and are guaranteed to reach the global minimum of the error surface during training. The optimization of their topology and learning parameters is easy to implement (Walczak & Massart, 2000; Yao et al., 2004).

| Comparison and measures of performance
This section presents results obtained by building and training several neural network architectures with the gradient descent backpropagation algorithm, using real data as network input features. If enough data are available, it is preferable to divide them into a training set and an external test set. The test set is used for validating the model after training and functions as a completely new, unseen dataset. The model performance on the external test set acts as a generalization measure. In cases where no external test set was supplied in the original data, the data were sorted according to the y-values and samples were assigned to the test set randomly. To determine the accuracy of the results and compare the predictive abilities of the methods on the dataset, some kind of performance measure is needed. Two commonly used measures in multivariate analysis are the mean-square error (MSE) and the determination coefficient or squared correlation coefficient, also referred to as the R² value. A summary of the prediction performances of the proposed methods for the training and test data is given in Table 3.
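The two performance measures can be computed as in the following sketch. R² is taken here as the squared Pearson correlation between observed and predicted values, which is one of the two readings named above; the observed and predicted activities are invented numbers, and the function names are hypothetical.

```python
def mse(y_obs, y_pred):
    """Mean-square error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs)

def r_squared(y_obs, y_pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(y_obs)
    mo, mp = sum(y_obs) / n, sum(y_pred) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(y_obs, y_pred))
    soo = sum((o - mo) ** 2 for o in y_obs)
    spp = sum((p - mp) ** 2 for p in y_pred)
    return (sop * sop) / (soo * spp)

# Invented observed and predicted activities for four test compounds.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

test_mse = mse(observed, predicted)
test_r2 = r_squared(observed, predicted)
```

A lower MSE and a higher R² on the external test set both point to better generalization, which is how the two models are compared in Table 3.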
The comparison between the two methods shows that the ANN method is more effective. The lower the MSE value, the better the prediction performance of the model. Similarly, the higher the correlation coefficient (R), the better the prediction performance of the model. According to the correlation coefficient (R) for both the training and test data, the ANN is the better predictor. A correlation coefficient of ≥0.90 for the prediction set showed a good predictive ability for the model generated by the RBFNN. As a matter of fact, the prediction of the test set based on the descriptors selected by ANFIS revealed its suitability for feature reduction. The test result is relatively poor for the MLR model, as can be seen from its R value. The ANN results obtained in this QSAR study are better than the corresponding MLR results. The results were markedly improved in the ANN runs over the MLR analyses, indicating a strong nonlinear dependence of the inhibition data on the input descriptors used. The strength of the alternative multivariate methods compared with traditional multilinear regression (MLR) and PLS is their ability to handle complex nonlinear data. Comparison of the results for MLR and NN shows that neural networks can be used effectively in predicting the activity of pyridinylimidazole-based compounds and give more satisfactory results for practical problems. In summary, the NN was able to predict the test data.

| Model validation by external test criteria
Model validation is a critical phase of every QSAR modeling study. Much effort has focused on new validation methods, which have the potential to significantly improve the reliability of models and which can be interpreted computationally in many ways. Roy and Tropsha have introduced interesting statistical tools for the validation of QSAR models. Golbraikh and Tropsha showed, through the analysis of several datasets, that a leave-one-out cross-validated R² greater than 0.5 (LOO q² > 0.5) is necessary for assessing the predictive ability of a model but is not sufficient. They emphasized that test set or external validation is the best tactic to form a reliable QSAR model (Golbraikh & Tropsha, 2002). Generally, the following parameters are the most popular criteria for further validation of QSAR/QSPR models: R₀² and R′₀² are the correlation coefficients for regression through the origin for predicted versus observed and observed versus predicted activities, respectively, and the corresponding slopes of the regression lines through the origin are k and k′. In addition, Roy (Pratim Roy, Paul, Mitra, & Roy, 2009; Tropsha, Gramatica, & Gombar, 2003) proposed another statistical validation criterion, named modified R² (R²m), which measures the predictability of a model and is determined as follows: An R²m value greater than 0.5 (R²m > 0.5) indicates that the model possesses good external prediction ability.
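The criteria themselves are not reproduced in the text above. As commonly stated in the QSAR literature following Golbraikh and Tropsha (2002) and Roy and co-workers (2009), and given here as an assumption about what the omitted expressions contained, they read as follows, with R² the squared correlation coefficient between observed and predicted activities of the test set:

```latex
\begin{align}
  & q^{2} > 0.5, \qquad R^{2} > 0.6, \\
  & \frac{R^{2} - R^{2}_{0}}{R^{2}} < 0.1
    \quad \text{or} \quad
    \frac{R^{2} - R'^{2}_{0}}{R^{2}} < 0.1, \\
  & 0.85 \le k \le 1.15
    \quad \text{or} \quad
    0.85 \le k' \le 1.15, \\
  & R^{2}_{m} = R^{2} \left( 1 - \sqrt{\left| R^{2} - R^{2}_{0} \right|} \right).
\end{align}
```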
In this respect, the values of the mentioned statistical criteria for the validation of the MLR and ANN models are listed in Table 4. The results for all criteria in this table suggest a high ability of the ANN model to predict the bioactivity of candidate drugs.

| Explanation of descriptors
The best algorithms for QSAR modeling are those that are simple, transparent, easily interpretable, and easily portable. A transparent model can be defined as one that is based on fundamental physicochemical properties with a clear and unambiguous statement of how the model has been formulated. Such a model is capable of mechanistic interpretation. The transparent characteristics are usually achieved by proper mathematical algorithms. In other words, a QSAR model should be interpretable in terms of the parameters employed.
Interdisciplinary integration with subjects such as biochemistry and toxicology can deepen the understanding of the modes of toxic action and thereby improve the mechanistic interpretability of QSAR models. The essential interactions of pyridinylimidazole inhibitors with the ATP-binding cleft are briefly summarized as follows: (a) hydrogen donor/acceptor functions of the 2-aminopyridyl residues within the hinge region (mainly gaining activity); (b) space-filling lipophilic aryl residues binding to the hydrophobic back pocket (also hydrophobic region, mainly gaining selectivity); (c) interactions with the hydrophobic front region (also hydrophobic region, gaining both activity and selectivity); (d) further interactions with both the sugar pocket and the phosphate binding region (importance less clear, preferred positions to modify physicochemical properties) (Tong et al., 1997; Wang et al., 1998; Wilson et al., 1997).
As noted above, the descriptors that have significant effects on the inhibition activity of these drugs are Diam, Homo, and LogP. The Diam attribute of a compound depends mainly on molecular size and symmetry, and it influences intermolecular interactions. For a crystal, this physical property is governed by the hydrogen bonding ability of the molecules, the molecular packing in crystals (effects of molecular shape, size, and symmetry), and other intermolecular interactions such as charge transfer and dipole-dipole interactions in the solid phase (Ghasemi et al., 2008). Therefore, it influences solubility, and solubility controls IC50. The polarization of a molecule by an external electric field appears to correlate with its shape and size, and it can give rise to inductive interactions in the molecule. Thus, the most significant property of the molecular polarizability is its relation to the molecular bulk or molar volume. Polarizability values have been shown to be related to hydrophobicity and thus affect biological activities. Furthermore, the electronic polarizability of molecules shares common features with the electrophilic superdelocalizability (Karelson, Lobanov, & Katritzky, 1996). This suggests that, with larger substituents, the interactions between receptor residues and compounds are initially enhanced, resulting in an increase in activity. However, as substituents become too large, unfavorable steric repulsions between ligand and receptor begin to dominate, resulting in a decrease in activity (Chiu & So, 2004).
Earlier studies established that small (methyl) substituents at the exocyclic sulfur atom significantly improved inhibitory activity in both the isolated kinase assay and the cytokine release assay in human whole blood. This greater inhibitory potential was attributed to better entry of the inhibitor molecule into the binding cleft of p38, leading to stronger enzyme-drug interactions (Laufer, Striegel, & Wagner, 2002). For example, in the kinase activity assay, the ethylsulfanyl substituent of compound 6 proved to be nearly bioisosteric to the well-defined methylsulfanyl substituent (e.g., in compound 1, divergence factor of 1.7), additionally confirming the idea of stronger binding of the 4-fluorophenyl/pyridin-4-yl pharmacophore inside the ATP site when substituents at the exocyclic sulfur are not bulky. For compound 8, the large benzylsulfanyl moiety at the 2-position of the imidazole core diminished the p38 inhibitory potency in contrast to its sterically less demanding ethylsulfanyl analog 6 (3.8-fold) and particularly with respect to the "small" methylsulfanyl derivatives ML3375, 1, and 3. Consequently, the least displaced orientation of those modified "bulkier" drug molecules within the active site was responsible for the consistently reduced activity of the 3-trifluoromethylphenyl-substituted imidazoles (Laufer et al., 2008).
Our modeling shows that the Homo energy is an important factor in QSAR models. The energies of the Homo and Lumo are very popular quantum chemical descriptors. It has been shown that these orbitals play a major role in governing many chemical reactions (e.g., formation of charge transfer complexes) and in determining electronic band gaps (as a stability index) in solids. A large Homo-Lumo gap implies high stability for the molecule in the sense of its lower reactivity in chemical reactions. The Homo-Lumo gap has also been used as an approximation to the lowest excitation energy of the molecule. The energy of the Homo is directly related to the ionization potential and characterizes the susceptibility of the molecule toward attack by electrophiles. The energy of the Lumo is directly related to the electron affinity and characterizes the susceptibility of the molecule toward attack by nucleophiles. The concept of hard and soft nucleophiles and electrophiles has also been directly related to the relative energy of the Homo/Lumo orbitals. Hard nucleophiles have a low-energy Homo; soft nucleophiles have a high-energy Homo; hard electrophiles have a high-energy Lumo; and soft electrophiles have a low-energy Lumo (Karelson et al., 1996). According to the frontier molecular orbital (FMO) theory of chemical reactivity, the formation of a transition state is due to an interaction between the frontier orbitals (Homo and Lumo) of the reacting species. Frontier orbital electron densities on atoms provide a useful means for the detailed characterization of donor-acceptor interactions. Thus, the majority of chemical reactions take place at the position and in the orientation where the overlap of the Homo and Lumo of the respective reactants can reach a maximum. In the case of a donor molecule, the Homo density is critical to the charge transfer (electrophilic electron density), and in the case of an acceptor molecule, the Lumo density is important (nucleophilic electron density).
These indices have been employed in this QSAR study to describe drug-receptor interaction sites. According to the polyelectronic perturbation theory of Klopman and Hudson, drug-receptor interactions are under either charge or orbital control. Thus, the net atomic charges characterize electrostatic interactions, while the donor superdelocalizability characterizes the covalent component of the interaction (Karelson et al., 1996).
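
The relations between the frontier orbital energies and molecular properties discussed above can be made concrete with a short Python sketch. It is illustrative only and not part of the modeling workflow of this study. It assumes Koopmans-type approximations (ionization potential ≈ −E(Homo), electron affinity ≈ −E(Lumo)) and the common definition of chemical hardness as half the Homo-Lumo gap, neither of which is specified in the text above. The function name and the orbital energies in the usage lines are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FrontierOrbitalIndices:
    gap: float                   # Homo-Lumo gap (same unit as the inputs)
    ionization_potential: float  # approximated as -E(Homo)
    electron_affinity: float     # approximated as -E(Lumo)
    hardness: float              # assumed definition: gap / 2


def frontier_orbital_indices(e_homo: float, e_lumo: float) -> FrontierOrbitalIndices:
    """Derive simple reactivity indices from Homo and Lumo energies (e.g., in eV)."""
    if e_lumo < e_homo:
        raise ValueError("the Lumo energy must not lie below the Homo energy")
    gap = e_lumo - e_homo
    return FrontierOrbitalIndices(
        gap=gap,
        ionization_potential=-e_homo,  # Koopmans-type approximation (assumption)
        electron_affinity=-e_lumo,     # Koopmans-type approximation (assumption)
        hardness=gap / 2.0,
    )


# Hypothetical orbital energies in eV, chosen only to illustrate the arithmetic.
print(frontier_orbital_indices(e_homo=-9.0, e_lumo=-1.0))
```

In this picture, a larger gap corresponds to a harder and less reactive molecule, while a higher-lying Homo corresponds to a lower ionization potential and a better electron donor.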
The dataset (Table 1) shows the introduction and wide variation of substituents at the pyridine part of these 2-thioimidazoles, bridged by different heteroatoms (NH, O, S). The strength of the H-bonding between the pyridine-N of the inhibitor and the amide-NH of Met109 is influenced by the electron donating or withdrawing capacity of the substituents. The p38 inhibitory potency has been correlated with the electron density at the respective heterocyclic ring nitrogen (Laufer, Wagner, Kotschenreuther, & Albrecht, 2003). An electronegative halogen atom located at the 2-position of the pyridine-4-yl moiety significantly decreased the biological activity. Fluoropyridine 1, for example, exhibited a 2.8-fold lower inhibitory potency toward the isolated enzyme (Laufer et al., 2008).
In agreement with our previous study, the hydrophobicity parameter (LogP) plays an important role in QSAR models (Ghasemi, Abdolmaleki, & Mandoumi, 2009). The most popular quantitative scale for measuring the lipophilicity of compounds is the logarithm of the partition coefficient between 1-octanol and water (the log P parameter), introduced in detail by Hansch and Leo (Mannhold, Poda, Ostermann, & Tetko, 2009). Partition coefficients have been shown to correlate with measures of biological activity in a very wide variety of experimental systems, ranging from simple protein binding to animal and human in vivo effects. Log P is a very important indicator of transport and permeation through membranes, interaction with biological receptors and enzymes, toxicity, and biological potency. In environmental sciences, hydrophobicity is often used to predict solubility, the bioconcentration factor, and the organic adsorption coefficient (K oc) (Roberts, 2002). This is presumably because hydrophobic effects are important not only in the intermolecular interactions that occur between a drug and its target site but also in the distribution of a compound within a biosystem and its interaction with competing binding sites. The results of Livingstone's studies indicated that dye lipophilicity can be explained by hydrophobic and polarity structural parameters of the dye (Livingstone, 2000). The positive standardized coefficient of the LogP parameter is in accordance with physical considerations: compounds with higher hydrophobicity interact more strongly with the target site and thus show enhanced activity.
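
As a brief illustration of the log P scale mentioned above, the following Python sketch computes the logarithm of the 1-octanol/water partition coefficient from two equilibrium concentrations. The function name and the concentration values are hypothetical, and the sketch assumes that both concentrations refer to the same neutral species and are expressed in the same unit. It is not the procedure by which the LogP descriptor of this study was obtained.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the 1-octanol/water partition coefficient.

    Both concentrations must be positive and use the same unit.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical equilibrium concentrations (arbitrary but identical units).
print(log_p(conc_octanol=2.0e-3, conc_water=2.0e-5))
```

A positive value indicates that the compound prefers the octanol phase, that is, it is lipophilic; a negative value indicates a preference for water.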
Hydrogen bonding between the pyridine ring N of the inhibitor and the backbone amide-NH of Met109 in the otherwise lipophilic linker region (Ala51, Thr106, His107, Leu108, and Met109) is essential for the biological activity of the vicinal diaryl-substituted heterocyclic compound class. Appropriate amino substituents at the 2-position of the pyridine can enhance activity relative to unsubstituted (e.g., ML3375) and halogenated pyridine derivatives, as they are able to form a second H-bond with the linker region next to the existing one (here, between the exocyclic 2-amino group as H-donor and the main chain carbonyl oxygen of Met109 as H-acceptor). In addition, an electron donating amino group should increase the electron density at the pyridine, thereby intensifying the affinity of the inhibitor for the enzyme. Moreover, with suitable lipophilic side chains, further interactions with the surface-exposed front region (hydrophobic region) can be achieved. Therefore, in general, sufficiently lipophilic arylalkylamino residues favorably affected biological activity (compounds 13-23, 18). For condensed aromatic ring systems (15, 16, 19), the spatial geometry seemed to play a crucial role; however, multiple separate aromatic cycles in the aliphatic side chain resulted in increased IC50 values (24, 25) (Laufer et al., 2008).
A typical three-dimensional (3D) QSAR technique ultimately allows one to design molecules and predict their activities; its foundation is that the interaction force fields of a series of bioactive molecules with the same receptor are similar. Even when the 3D structures of receptors are unknown, one can deduce the properties of receptors, design new chemicals, and quantitatively estimate their activities by studying the surrounding interaction force fields of bioactive molecules and quantifying bioactivities (Cramer, Patterson, & Bunce, 1988; Marshall & Cramer, 1988).

| CONCLUSIONS
The ability to relate the 3D structure of drugs to their biological properties has grown with increasing knowledge of the 3D structure of biological molecules. Molecular structural descriptors used in 3D QSAR models are capable of mechanistic evaluation. The establishment of QSARs should be based on the proper analysis and understanding of mechanisms and, conversely, well-established QSAR models may facilitate mechanistic interpretation. In this study, it has been shown that the use of ANFIS offers a feasible method for the optimization of the knowledge base of fuzzy logic controllers. The ANFIS-driven feature selection process rapidly identifies useful collections of features while simultaneously producing an optimized model, which makes it a very useful tool for automated feature reduction and selection in QSAR studies. Using a real case, we showed the ability and feasibility of hybridizing ANFIS with neural networks, an approach that is more efficient than ANFIS-MLR for biological data. In agreement with the fact,