Evaluating the Predictivity of Virtual Screening for Abl Kinase Inhibitors to Hinder Drug Resistance

Virtual screening methods are now widely used in early stages of drug discovery, aiming to rank potential inhibitors. However, any practical ligand set (of active or inactive compounds) chosen for deriving new virtual screening approaches cannot fully represent all relevant chemical space for potential new compounds. In this study, we have taken a retrospective approach to evaluate virtual screening methods for the leukemia target kinase ABL1 and its drug-resistant mutant ABL1-T315I. ‘Dual active’ inhibitors against both targets were grouped together with inactive ligands chosen from different decoy sets and tested with virtual screening approaches with and without explicit use of target structures (docking). We show how various scoring functions and choice of inactive ligand sets influence overall and early enrichment of the libraries. Although ligand-based methods, for example principal component analyses of chemical properties, can distinguish some decoy sets from active compounds, the addition of target structural information via docking improves enrichment, and explicit consideration of multiple target conformations (i.e. types I and II) achieves best enrichment of active versus inactive ligands, even without assuming knowledge of the binding mode. We believe that this study can be extended to other therapeutically important kinases in prospective virtual screening studies.

The availability of crystal structures of many key drug targets and the low cost of computational methods now encourage the use of virtual screening (VS) in early stages of drug discovery. There is an enormous quantity of data regarding target structures and ligand binding, and VS should be expected to work best when all experimental knowledge is integrated appropriately into the methods. If the ligand set contains diverse or focussed scaffolds, then the training or parameterization of the VS method should be designed to account for this. Screening of focussed databases will best predict active ligands when trained against similar compounds, and screening of diverse sets will best identify active ligands if the variability of the target protein is adequately represented in the method. In this study, we examine VS approaches for the leukemia target receptor ABL1, a protein tyrosine kinase now well characterized by knowledge of multiple inhibitors and target conformations.
Inhibition of protein kinases by selective inhibitors has become a major therapeutic approach for many diseases, especially well established for cancer. Targeted inhibition of ABL1 and several related kinases by imatinib (Gleevec, Novartis) has become the successful front-line therapy for chronic myeloid leukemia (CML) and several solid tumors (1). Response to imatinib therapy in CML statistically is highly durable in the chronic phase; especially with early initiation of treatment; more advanced stages of the disease often involve relapse and imatinib resistance (2,3). Mutations of amino acids in the kinase domain of ABL1 are the most common cause of such resistance, affecting some 50-90% patients with acquired resistance (4)(5)(6). Among the various mutations, an isoleucine substitution at the 'gatekeeper' residue threonine (T315I) accounts for about 20% of the total burden of clinical resistance (5). This residue has been designated 'gatekeeper' due to its position that determines the size of a hydrophobic pocket in the active site of the kinase domain. Many small molecule inhibitors exploit this threonine residue for their specificity (7). Substitution of the gatekeeper residue has been observed as a major mechanism of acquired resistance for other tyrosine kinase drug targets, including c-KIT-T670I (8), EGFR-T790I (9), and PDGFRalpha-T74M/I (10).
Recent studies have shown a strong correlation between substitution of the gatekeeper residue and oncogenic transformation (11), and substitution of a threonine gatekeeper residue with a hydrophobic residue such as leucine is a mechanism of activation of several tyrosine kinases (12). Thus, the mechanism of resistance against Abl inhibitor drugs involves not only drug binding properties, but also the oncogenic transformation capacity of gatekeeper mutant itself. Second-generation CML drugs, such as dasatinib and nilotinib, have been introduced to combat or forestall resistant forms. However, many of these newer drugs do not eliminate resistance via the gatekeeper mutation (ABL1-T315I) (4,13), despite greater potency against wild-type protein (ABL1-wt) and most of the imatinib-resistant mutations (13)(14)(15). Therefore, developing ABL1 inhibitors that target resistance mutations, and in particular the ABL1-T315I gatekeeper mutation, currently remains a goal of leukemia drug research.
Known inhibitors of ABL1 that also inhibit the ABL1-T315I form are predominantly 'type II' inhibitors, targeting an inactive conformation of the kinase. These include ponatinib (in clinical trials, also known as AP2453416, along with others in earlier stages of development) (16,17). Type II inhibitors bind in a deep and mostly hydrophobic pocket that exists when the activation loop of a kinase adopts an inactive conformation in which the phenylalanine of the conserved DFG motif is removed from its hydrophobic packing position that becomes the pocket. Other characteristics of type II inhibitors include hydrogen bonding interactions, usually involving amide or urea moieties. In contrast, type I inhibitors bind to the active form of the kinase, in which the DFG phenylalanine is bound in its hydrophobic site, and the neighboring aspartate is positioned appropriately for its role in the phosphotransfer reaction of the kinase. Both type I and type II inhibitors typically bind to the hinge region that also anchors the ATP adenine via hydrogen bonds. Figure 1 shows type I and type II binding conformations of ABL1 kinase domain structures.
We studied a set of high-potency ABL1 inhibitors that can inhibit both ABL1-wt and ABL1-T315I forms ( Figure 2). Applying VS retrospectively to these and related inhibitors, we aimed to identify VS protocols that best identify active inhibitors dispersed in larger libraries. The protocols vary with respect to the chemical properties analyzed, and the amount and type of target structural information integrated into the procedures. Such optimized protocols would be best suited to screen libraries of ligands with unknown activity against ABL1 and mutant forms. The study can in principle be extended to other therapeutically important kinases and also provides information for the extent of structural information needed for success.

ABL1 inhibitor set
To create a library of inhibitors that inhibit both ABL1-wt and ABL1-T315I, representing a set of active compounds with decreased drug resistance potential, compounds with IC 50 values <100 nM in enzyme assays for ABL1-wt or ABL1-T315I were retrieved from the Kinase Knowledgebase (KKB, www.eidogen-sertanty.com). Of the inhibitors identified, 38 were inhibitory (IC 50 < 100 nM) for both the wild-type and mutant forms; 16 of these were ponatinib analogs. In addition, 141 were inhibitory for ABL1-wt alone (IC 50 for ABL1-T315 > 1 lM or no mutant binding data available). In contrast, all the high-potency inhibitors of ABL1-T315I were active against the wild-type target (IC 50 < 1 lM). Here, we study the dual high-potency (IC 50 < 100 nM) inhibitors in detail, as they possess in common one of the selectivity criteria for ABL inhibition therapy that aims to reduce the occurrence of drug resistance. Table 1 summarizes the sizes of the relevant inhibitor sets taken from the KKB database.
The diversity of this inhibitor set was analyzed by the Scaffold Hunter program (18). A scaffold is defined by the all carbon and heterocyclic rings, their aliphatic linker bonds, and atoms attached via a double bond (19). Scaffold Hunter extracts chemically meaningful compound scaffolds and iteratively removes one ring at a time to generate smaller compounds. Thereby, a hierarchical arrangement of parents and children is formed, yielding branches that are combined to form a tree ( Figure 3).

Inactive ligand sets
Three 'decoy' sets were chosen for inclusion into test libraries that combine active and inactive compounds. The largest set was retrieved from the Directory of Useful Decoys (DUD) (20), containing 6319 physically similar but topologically distinct ligands. As no decoy set chosen explicitly for ABL kinase domains is available from DUD, the decoy set for homologous kinase SRC was used for this study. A second set was taken from Glide (21). This set is 'universal', that is, neither 'kinase inhibitor-like' nor specifically 'non-kinase-inhibitory', consists of 1000 ligands and was created from one million druglike ligands. Finally, a set was chosen from the weak binding inhibitors (enzyme inhibition IC 50 = 100-1000 nM), containing 89 inhibitors. As weak binders, these might be considered the most challenging decoys.

ABL1 kinase domain structures
Five crystal structures of T315I mutants of ABL1 kinase domain in complex with inhibitors were taken for analysis, along with structures for four of these inhibitors that have been co-crystallized also with the ABL1-wt kinase domain. These structures, summarized in Table 2, were used for VS of dual active inhibitors and of inactive ligands. Because four pairs of structures, each with one inhibitor binding both the wt and T315I forms, are included, the test set includes a range of inhibitor-associated flexibilities, DFG conformational states, and allows direct comparisons of the effects of gatekeeper mutations.

Virtual screening studies
Protein preparation For docking, the single kinase domain structures, in complex with their native ligands, were analyzed by the protein preparation wizard of Schrodinger program (Schrodinger LLC, 2011, New York, NY, USA). Water molecules were deleted, bond orders assigned, and hydrogen atoms were Chem Biol Drug Des 2013; 82: 506-519 added. A restrained minimization was then performed with the OPLS2005 force field using the default constraint of 0.30 A RMSD. A grid box was then generated for each structure that included co-crystallized ligand and most of the binding cleft between the N-and C-lobes. The main chain nitrogen of Met318 at the hinge segment of kinase domain was included as constraint as a hydrogen bond donor for the docking runs.

Ligand preparation
Ligand preparation and the subsequent calculations were performed by modified KNIME (www.knime.org) workflows made up of Schrodinger modules. The co-crystallized ligands, the dual active inhibitors, and decoy sets mentioned in the ligand-based study were prepared using the OPLS2005 force field in the ligand preparation module of Schrodinger. The ligands were ionized as between pH 5-9, and the tautomers and stereoisomers were generated. Finally one lowest energy conformation from the generated conformer set was chosen for docking with Glide.

Docking and scoring protocol
The compounds of the libraries were classified into 'hits'a ranked listand 'inactives' using three different Glide docking protocols: high throughput virtual screening (HTVS), standard precision (SP), and extra precision (XP). For each ligand, Glide generates a set of low-energy conformations and then exhaustively searches the receptor active site to position the conformers. The docked poses pass through a series of hierarchical filters that evaluate the receptor-ligand interactions and are then energy-minimized on a precomputed grid of van der Waals and electrostatic energies for the receptor. The final scores are calculated according to the energy functions described elsewhere (22). In short, all docking functions use flexible ligand docking and same scoring scheme. But HTVS reduces the number of low-energy conformers through the docking filters. Moreover, HTVS reduces the thoroughness of the final torsional refinement and sampling of the ligand conformers. Compared with XP, SP is a softer method that can identify relatively weak binders by allowing 'less than perfect' poses. Therefore, SP is used in large-scale VS to identify ligands with a reasonable propensity to bind. Extra precision imposes severe penalties for poses that apparently violate physical chemistry rules. For example, charged and strongly polar groups should be adequately exposed to solvent. Extra precision thereby reduces false positives and can be used in lead optimization studies where only a limited number of compounds are considered for synthesis or other experiments.

MM-GBSA re-scoring
To estimate the free energy of binding between the receptor and the ligands, an implicit solvation model was used via the molecular mechanicsgeneralized Born surface (MM-GBSA) approach. Glide SP poses were re-scored using MM-GBSA in two ways: first, as a rigid receptor, and secondly, as a partially flexible receptor where any residue with an atom within 12 A of the ligand remained flexible.
The MM-GBSA is a postprocessing end-state method for calculating free energies of binding of molecules in solution.
Compared with more rigorous methods such as free energy perturbation and thermodynamic integration methods, MM-GBSA and the related method MM-PBSA are computationally more efficient. All these methods allow for rigorous free energy decomposition into contributions from different groups of atoms or types of interaction. In MM-GBSA, the binding free energy (DG bind) between a ligand (L) and a receptor (R) in forming the complex (RL) is calculated as: where DE MM , DG sol and DS denote the change in gas phase MM energy, solvation free energy, and the conformational entropy upon binding. DE MM is composed of A B C Figure 2: Scaffold tree of highaffinity dual inhibitors for ABL1-wt and ABL1-T315I. Imidazole is the parent scaffold that gives rise to all ponatinib analogs. (A) First two parent layers of the scaffold tree.
(B) Full extension of the imidazole containing scaffolds: the ponatinib containing scaffold is marked. (C) All inhibitors derived from ponatinib scaffold. The term 'analog' is used loosely in this article. The inhibitors that are visually similar to ponatinib in 2D sketches are termed analogs. Scaffold is a well-defined term in this article. A scaffold consists of all carbo-and heterocyclic rings, their aliphatic linker bonds, and atoms attached via a double bond. Therefore, the inhibitors that have similar structures but differ in heterocyclic atoms are not considered to have the same parent scaffold.   Docking analyses Two metrics were used to calculate the enrichment success of the virtual screening output 'hit' lists: the enrichment factor (EF) and the receiver operating characteristic (ROC) plot. The EF plots the percentage of actives as a function of the position in the ranked list versus percentage of all hits from the database. Active ligands or decoys were identified as hits once they pass the Glide docking filters mentioned above and can be ranked according to Glide docking scores. In an XY plot for EF calculation,

Y ¼
No. of actives identified as hits All active hits Â 100; and X ¼ Screened hits (Actives + Decoys) All active hits + All Decoy hits Â 100: The EF was calculated for 1%, 5%, and 10% of the total hits that contain active ligands and decoys. This method approximates and tests reasonable procedures of selecting compounds for testing after ranking compounds of unknown activity by VS.
Receiver operating characteristic plots true positive rates in Y-axis and the corresponding true positive rate in Xaxis: The area under the curve (AUC) of ROC plot is equivalent to the probability that a VS run will rank a randomly chosen active ligand over a randomly chosen decoy.
The EF and ROC methods plot identical values on the Y-axis, but at different X-axis positions. Because the EF method plots the successful prediction rate versus total number of compounds, the curve shape depends on the relative proportions of the active and decoy sets. This sensitivity is reduced in ROC plot, which considers explicitly the false positive rate. However, with a sufficiently large decoy set, the EF and ROC plots should be similar.

Ligand-only-based methods
In principle, (ignoring the practical need to restrict chemical space to tractable dimensions), given enough data on a large and diverse enough library, examination of the chemical properties of compounds, along with the target binding properties, should be sufficient to train cheminformatics methods to predict new binders and indeed to map the target binding site(s) and binding mode(s). In practice, such SAR approaches are limited to interpolation within structural classes and single binding modes, partly because of the amount of data available and also partly because of the consequently limited number of chemical descriptors considered. Here, in order to investigate to what extent the active inhibitors and decoys can be distinguished, the compounds were assigned chemical space coordinates according to the molecular descriptorbased principal component (PC) sets of ChemGPS-NPweb (23). These descriptors include some 40 molecular descriptors such as molecular weight, number of rotatable bonds, number of hydrogen bond donors/acceptors and were analyzed for active ligands, DUD decoys, and randomly selected high-potency (IC 50 < 100 nM) kinase inhibitors. The first three PCs from the ChemGPS-NPweb-based calculations can distinguish the inhibitor and decoy compound sets (with some overlap), but the ABL1 inhibitors are found scattered and indistinguishable within the volume populated by randomly chosen kinase inhibitors (IC 50 < 100 nM). The first four dimensions of the ChemGPS-NP PC calculation account for 77% of the data variance. For typical compound sets, PC1 represents size, shape, and polarizability; PC2 corresponds to aromatics and conjugation-related properties; PC3 describes lipophilicity, polarity, and H-bond capacity; and PC4 expresses flexibility and rigidity. A 3D plot was constructed from the threefirst PCs to display the distinctions between the various compound sets.
Correlation of molecular properties and binding affinity: The Canvas module of the Schrodinger suit of programs provides a range of methods for building a model that can be used to predict molecular properties. They include the common regression models, such as multiple linear regression, partial least-squares regression, and neural network model. Several molecular descriptors and binary fingerprints were calculated, also using the Canvas module of the Schrodinger program suite. From this, models were generated to test their ability to predict the experimentally derived binding energies (pIC 50 ) of the inhibitors from the chemical descriptors without knowledge of target structure. The training and test set were assigned randomly for model building.

Neural network regression
Neural networks are biologically inspired computational methods that simulate models of brain information processing. Patterns (e.g. sets of chemical descriptors) are linked to categories of recognition (e.g. binder/non-binder) via 'hidden' layers of functionality that pass on signals to the next layer when certain conditions are met. Training cycles, whereby both categories and data patterns are simultaneously given, parameterize these intervening layers. The network then recognizes the patterns seen during training and retains the ability to generalize and recognize similar, but non-identical patterns.

Diversity of the inhibitor set
The high-affinity dual inhibitors for wt and T315I ABL1 kinase domains can be divided roughly into two major scaffold categories: ponatinib-like and non-ponatinib inhibitors. The scaffold analysis shows that there are some 23 major scaffolds in these high-affinity inhibitors. Although ponatinib analogs comprise 16 of the 38 inhibitors, they are constructed from seven child scaffolds (Figure 2). These seven child scaffolds give rise to eight inhibitors, including ponatinib. However, these closely related inhibitors vary significantly in their binding affinity for the T315I isoform of ABL1, while wt inhibition values are similar (Figure 4). Figure 4 shows clearly that T315I affinity for ponatinib analogs vary according to variations in their hydrophobic binding interactions. For example, replacement of CF 3 by a chlorine atom causes a dramatic decrease in affinity for T315I. A similar effect can be observed for 4-methyl substitution at the piperazine ring. Thus, the ponatinib scaffold provides the greatest binding energy components via predominantly polar interactions, especially H-bonding at the hinge, but variations in the side chains and their mostly hydrophobic interactions cause the variations in binding affinity seen mostly for binding to the T315I isoform.

HTVS and SP docking with DUD decoys
Virtual screening docking runs were performed for the library of dual active compounds dispersed in the DUD decoy set against the nine ABL1 kinase domains as summarized in Table 2. For each kinase domain target structure, the co-crystallized ligand, the dual active inhibitors, and the DUD sets were docked using the HTVS and SP modes. The resulting ranked hit lists were characterized using the EF and ROC AUC methods (Table 3, Figure 5). The AUC values show thatwith a single exception -SP docking shows better results compared with the HTVS protocol ( Table 3). The exception occurs for docking against the PPY-A-bound ABL1-T315I structure. Docking to the type II receptor conformations in general provided much higher enrichment of active inhibitors. Nearly 99% enrichment was obtained by docking against each of the type II conformation structures of ABL1-T315I. For VS against a single target structure, the ROC AUC values from the SP docking highlight the type II ABL1-T315I kinase domain structure as the best choice.

Evaluation of early enrichment factors
The early EFs calculated for the VS runs are shown for the SP method in Table 4, highlighting the relative success of the docking runs to identify actives, filter away decoys, and rank actives over the remaining decoys in the hit list. Both the type II conformation targets provide the best results. As the best example, docking against the ponatinib-bound ABL1-T315I kinase domain structure, 34 (89%) of 38 active inhibitors versus only 1915 (30%) of 6319 decoys were identified as hits. At the EF1% level, 18 (47%) of these active inhibitors were already included. The superior performance of the type II conformation target structures is perhaps not surprising, given the preponderance of type II inhibitors in the dual active set. However, there are significant differences between the docking runs against the two type II target structures. Against the DCC-2036 bound kinase domains, enrichment of the active inhibitors was a bit higher, but at the cost of identifying more than 70% of decoys as hits. However, some of the discouragement of this result is compensated for by the relatively high early enrichment values. Using type I kinase domain conformations, more actives and decoys were identified as hitsup to 80% of the decoysand early enrichments were much poorer than using the type II conformation as docking target.
Binding energy prediction and enrichment with MM-GBSA Binding energies were calculated for the SP docked poses using MM-GBSA, which in theory should provide improved energy values and, by extension, should improve the ranking of the hit list. However, Table 5 shows that both the ROC AUC and enrichment values are decreased for type II conformation targets with MM-GBSA approach. For the type I, the results were mixed. Although the overall enrichments were generally increased compared with the SP and HTVS approaches, the early enrichment values are lowered in most cases. These values show that binding energies calculated by MM-GBSA approach could enrich the active inhibitors from decoys, but the performance was less satisfactory than SP docking energies.

VS with Glide decoys and weak inhibitors of ABL1
As it was most successful, the ponatinib-bound ABL1-T315I conformation was chosen for further VS studies to test the effects of alternate choices for decoys and alternate methods for binding energy calculations. Using either the 'universal' Glide decoys or ABL1 weak binders as decoy sets, ranked hit lists from SP and/or XP docking runs were either used directly or re-ranked using the MM-GBSA approach with a rigid receptor model or using the MM-GBSA approach with receptor flexibility within 12 A of the ligand. Table 6 summarizes the results. For the Glide decoys, SP docking was sufficient to eliminate 86% of decoys, partially at the cost of low early enrichment values, which MM-GBSA energy calculations were not able to improve. The ABL1 weak inhibitor set was used as the strongest challenge to VS runs, because these, as ABL1 binders, require highest accuracy in binding energy ranking for recognition. And indeed, SP docking eliminated only roughly 50%, in contrast to the results for the Glide 'universal' decoys. However, the XP docking was able to improve this to eliminate some 83%, at the cost, however, of eliminating a larger set of active compounds. Both ROC

512
Chem Biol Drug Des 2013; 82: 506-519 and early enrichment values show that XP docking performed better than random for the reduced set of compounds classified as hits, but only barely. The addition of MM-GBSA calculations with the rigid and flexible receptors did not offer significant improvement.

Ligand-based studies
Chemical space of active inhibitors Despite some overlap, active inhibitors and DUD decoys map to distinguishable volumes in chemical space (Fig-ure 6A). This itself provides information to filter sets of potential inhibitors to eliminate compounds that match decoys rather than inhibitors. In contrast, plotting ABL1-wt selective inhibitors versus dual active ABL1 inhibitors does not distinguish the sets ( Figure 6B) in the major PC dimensions.

Correlation of molecular properties and binding affinity
Multiple calculations were made to identify the strongest linear correlations between the molecular properties of the inhibitors and their experimental pIC 50 values. For ABL1wt, the numbers of hydrogen bond donors and rotatable bonds showed the strongest correlations (R 2 of 0.87 and À0.69, respectively). In contrast, for ABL1-T315I, only the number of rotatable bonds showed a strong correlation (R 2 = À0.59), consistent with loss of threonine as a hydrogen bonding acceptor in the ABL1-T315I mutant. In both cases, the number of rotatable bonds was found to negatively correlate with the pIC 50 values with moderate correlation, supporting the generally valid inhibitor design goal that minimizing flexibility will enhance binding (provided the ability to fit the binding site is maintained, of course).
Several methods (multiple linear regression, PLS regression, and neural network regression) were used to create  models for predicting the experimental binding affinity (pIC 50 ) from molecular properties. Even in the absence of clear correlations with individual molecular properties, such models can in principle be trained to recognize complex multifactorial patterns, given enough data. Here, the neural network-based regression provided the best correlation between the experimental and predicted values (Figure 7).

514
Chem Biol Drug Des 2013; 82: 506-519 of the receptor. Key variations are seen in the positions of the activation and the glycine-rich loops, which are of a scale too large for automated receptor flexibility algorithms to have a chance of correct prediction. However, they do cluster into clearly distinct groups (Figure 8), and representatives of the groups may be selected for use in drug discovery tasks. The extent of knowledge of drug target plasticity depends on extensive crystallography research, something not available for relatively new targets. On the other hand, for key target classes, such as protein kinases, it is quickly becoming the norm to have significant information regarding structural plasticity of the target in drug discovery programs.
By itself, knowledge of target plasticity is not sufficient for good predictivity of inhibitor binding properties. For example, the energy costs of reorganization must be taken into account, and these are not generally accessible to theoretical methods. Instead, one increasingly has recourse to databases of ligand binding energies. As these databases grow, the prediction of binding energies from known binding data and explicit consideration of the plasticity of target structures will improve. At some point, the size and diversity of the binding data alone may become sufficient for predictivity when used in 'highdata-volume' 3D-QSAR-type approaches. At present, as can be seen here and elsewhere in the literature, ligandalone data are not adequate for binding predictivity, outside of narrowly proscribed boundaries, and drug design methods benefit greatly from consideration of target structures explicitly.
For tyrosine kinases, notably including ABL, the distinction between 'DFG-in' and 'DGF-out' states arises from the conformation of the activation loop and generates the major classification of inhibitor types (I and II, respectively) Among the type I conformations, substantial variations can be found, especially concerning the glycine-rich loop and the conformation of the DFG motif, such that the classification becomes less clear. For example, the SX7 structure shows the DFG motif to occupy a conformation intermediate between 'DFG-in' and 'DGF-out' (Figure 7). Also, the danusertib-bound structure (PDB: 2v7a) shows the glycine-rich loop in an extended conformation, whereas the other eight structures show the loop in a shared bent conformation in close contact with inhibitors.
The 'DFG-in' conformation corresponds to the active state of the kinase, whereby the loop is extended and open,  the phenylalanine residue of DFG occupies a hydrophobicaromat binding site at the core of the kinase domain, and the aspartic acid is poised to coordinate a magnesium ion which in turn coordinates the beta and gamma phosphate groups of ATP. In the DFG-in conformation, the kinase domain can bind both ATP and protein substrate, and the adenine ring of the ATP can form hydrogen bonds to the hinge region of the kinase domain (24). In contrast, the 'DFG-out' conformation represents an inactive form of the kinase ( Figure 1C) and is generally incompatible with both nucleotide and protein substrate binding. This conformation was first seen in an ABL1 complex with imatinib (25), but has since been found for many inhibitors and many kinases. In this conformation, the DFG segment is rotated, removing the DFG aromat from its binding site and creating a cavity, which can tightly accommodate inhibitors. The phenylalanine side chain can also partially occlude the ATP binding pocket. ABL inhibitor complex structures in the PDB show both DFG-in and DFG-out conformations, for both wild-type and T315I forms, as described above. Type II inhibitors (DFG-out binders) block the conformational change to the DFG-in state and so bind only to the DFG-out conformation. Type I inhibitors may bind both DFG-in and DFG-out conformations. These two conformations do not define two distinct and rigid states for the protein, and a new pharmacophore type I 1/ 2 has been proposed recently, which includes inhibitors with all type I but few type II interactions (26). Although no inhibition data are publically available for SX7, the KKB shows a few congeners of SX7 that weakly inhibit FLT1, FGFR1, and Aurora kinase A. But four other ligands are highly potent inhibitors of both forms of ABL1 kinase. Therefore, the nine publicly available PDB structures form complete representative set for a thorough VS study. A similar study has been published recently that used several crystal structures of p38c to investigate the effects of combining hit lists from different crystal structures of the same target (27).

Ligand selection
The choice of target structures and the choice of ligands were guided by the aim to represent target plasticity on the one hand and to preserve desirable high-potency dual inhibition qualities on the other hand. The decoy sets were chosen to enable testing of the VS methods in general, but also to test the applicability of different types of decoy sets. The DUD set was chosen to best match ABL1 inhibitor-like ligands, and the Glide set is a universal decoy set. The DUD decoy set has been previously used for enrichment studies (28). The DUD data set has been shown to include analog bias in the active ligand set. However, a recent version of DUD data set (DUD-E) has been released to address analog bias by enhancing chemotype diversity by their Bemis-Murcko atomic frameworks (29). Alternate data sets are available that also aim to minimize analog bias, such as Maximum Unbiased Validation (MUV) (30). In this study, we have replaced the original active ligands from the DUD data set with the inhibitors retrieved from KKB, so the analog bias of the DUD active ligands is not present.
One interesting result was the differentiation between the type II receptor conformations, namely 3ik3 (ponatinib bound) and 3qrj (DCC-2036 bound). With SP docking, about 30% of DUD decoys were predicted as hits, whereas this was more than 50% for 3qrj. The early enrichment (EF1%) was also different between these conformations: 47.37% for 3ik3 and 61.11% for 3qrj. The enrichment is similar for EF5%. Thus, the type II conformation represented by the ponatinib-bound ABL1-T315I structure performed better for enriching active inhibitors; the large proportion of ponatinib like inhibitors in the dual active set probably accounts for this.
Directory of Useful Decoys decoy set has been previously used for enrichment studies (28). Using the Glide universal decoys, only 14.4% of decoys were predicted as hits. This is an encouraging indicator, especially during VS with unfocussed ligand library. The early enrichment values between DUD and Glide decoys are not easily comparable, however, because of the different total content of decoys in the hit setsinclusion of only few decoys in the hit list dramatically reduces the EF values. Therefore, low early enrichment values with a small decoy set (such as Glide decoys here) should be a discouraging indicator in VS.
Using weak ABL1 binders as the decoy setthe most challenging varietythe Glide XP method was remarkably able to eliminate some 80% of the decoys, whereas the SP method eliminated about 60%. After elimination, the overall enrichment (indicated by ROC AUC) values were similar.
Chemical space navigation ABL active inhibitors seem underrepresented in the KKB data set. More than 23000 ligands are recorded with IC 50 < 100 nM for all forms of kinases, but only 255 are active against ABL1 (wild-type and mutant forms). This has been shown in a recent study with more than 20 000 compounds against a 402-kinase panel (31). Of the 182 dual activity inhibitors, 38 showed high activity (IC 50 < 100 nM) for both the receptor forms. But 90 high-activity ABL1-wt receptor showed medium (IC 50 = 100-299 nM) or low (IC 50 = 300-1000 nM) activity for ABL1-T315I. A few inhibitorsless than 10showed high activity for ABL1-T315I, but medium to low activity for ABL1-wt.

Conclusion
In this study, VS methods were applied to test their ability to identify inhibitors of leukemia target kinase ABL1 and its drug-resistant mutant form T315I. Nine PDB structures of the ABL1 kinase domain, with and without the mutation, and representing different activation forms, were used for GLIDE docking. ABL1 inhibitors were retrieved from Kinase Knowledge Base (KKB) database and combined with decoy compounds from the DUD database. Enrichment factor and receiver operating characteristic (ROC) values calculated from the VS studies show the importance of selecting appropriate receptor structure(s) during VS, especially to achieve early enrichment.
In addition to the VS studies, chemical descriptors of the inhibitors were used to test the predictivity of activity and to explore the ability to distinguish different sets of compounds by their distributions in chemical space. We show that VS and ligand-based studies are complementary in understanding the features that should be considered during in silico studies.