Keywords:

  • chemotypes;
  • cyclooxygenases;
  • dual activity–difference maps;
  • selectivity;
  • structure–activity relationships;
  • visualization

Abstract

  1. Top of page
  2. Abstract
  3. Methods
  4. Results and Discussion
  5. Conclusions
  6. Acknowledgments
  7. References
  8. Supporting Information

Structure–activity characterization of molecular databases plays a central role in drug discovery. However, the characterization of large databases containing structurally diverse molecules with several end-points represents a major challenge. Chemoinformatic methods play an important role in elucidating structure–activity relationships in such databases. Herein, a general methodology, namely Chemotype Activity and Selectivity Enrichment plots, is presented. Chemotype Activity and Selectivity Enrichment plots provide graphical information concerning the activity and selectivity patterns of particular chemotypes contained in structurally diverse databases. As a case study, we analyzed a set of 658 compounds screened against cyclooxygenase-1 and cyclooxygenase-2. Chemotype Activity and Selectivity Enrichment plot analysis highlighted chemotypes enriched with molecules that are active and selective against cyclooxygenase-2, all in a simple 2D graphical representation. Additionally, the most active and selective chemotypes detected in Chemotype Activity and Selectivity Enrichment plots were analyzed separately using the previously reported dual activity–difference maps. These findings indicate that Chemotype Activity and Selectivity Enrichment plots and dual activity–difference maps are complementary chemoinformatic tools to explore the structure–activity relationships of structurally diverse databases screened against two biological end-points.

Abbreviations:

  • CASE: chemotype activity and selectivity enrichment
  • COX-1: cyclooxygenase-1
  • COX-2: cyclooxygenase-2
  • DAD: dual activity–difference
  • SAR: structure–activity relationships

Structure–activity relationships (SAR) play a central role in drug discovery (1). A number of predictive and descriptive methods can be employed to study them; commonly used methodologies include quantitative SAR, rule-based methods, neural networks and pharmacophore modeling (2–5). However, the applicability of these methodologies depends heavily on the structural nature of the database and on the available experimental information.

The chemotype-based classification of structurally diverse databases, associated with one or multiple targets, requires a robust and flexible strategy (6). An approach developed by Xu and Johnson (7,8) shows considerable promise in this regard. Their method decomposes molecules into characteristic structural patterns of variable resolution and complexity, called chemotypes, and provides tools for a hierarchical classification based on these chemotypes (6–9).

Chemotype-enrichment plots were previously designed to identify chemotype classes of active molecules with activity against one biological end-point in compound databases. The main goal of that approach is to characterize the relationship of occupancy to activity enrichment for a set of chemotypes at a given level of structural resolution (9). Hence, this kind of study provides an important tool to gain information about which molecular framework (e.g., molecular scaffold) is promising for further development of drug candidates (10).

As an extension of that approach, we present herein the Chemotype Activity and Selectivity Enrichment (CASE) plots as a novel graphical representation of the chemical structure, activity and selectivity patterns of a molecular database with activity against two biological end-points. Because chemotype analysis does not consider the side chains of the molecules, which are also crucial for biological activity, we extended the analysis to include the entire molecules. For this purpose, selected chemotypes identified in the CASE plot were further analyzed using the previously described dual activity–difference (DAD) maps, which are two-dimensional representations of pairwise activity differences designed to characterize activity landscapes of data sets with activity against two biological end-points (11–13). It is worth noting that DAD maps uncover regions in the landscape with similar SAR for two receptor subtypes as well as regions with inverse SAR (11–13).

As a case study, we carried out a comprehensive analysis of a molecular database with activity against cyclooxygenase-1 (COX-1) and cyclooxygenase-2 (COX-2). This database is of particular interest because activity data are available for a large number of derivatives. In addition, selective COX-2 inhibitors constitute an important class of anti-inflammatory drugs designed to reduce the gastrointestinal side effects caused by COX-1 inhibition (14,15). Furthermore, COX-2 has been proposed as a promising target in cancer therapy (15–17) as well as in neuropsychiatric disorders (15). It is worth mentioning that several studies have linked COX-2 selective inhibitors with an increased cardiovascular risk, and most of them have been withdrawn from the market (14). However, this class of drugs (e.g., celecoxib) still represents an important therapeutic option in chronic inflammatory diseases.

Results of this work show that chemotype characterization is a useful tool to study large molecular databases by simplifying the further multitarget SAR analysis.

Methods

Data set

The chemical structures and biological activities of 658 previously reported cyclooxygenase inhibitors were obtained from the Binding Database (18–20). A broad diversity of chemical structures and their inhibitory activities (in the form of IC50 values) against human COX-1 and COX-2 are contained in this database. Activities for COX-1 range from 1 nM (pIC50 = 9.00) to 15 500 000 nM (pIC50 = 1.81), with a median pIC50 of 5.15. Similarly, activities for COX-2 range from 0.08 nM (pIC50 = 10.10) to 1 010 000 nM (pIC50 = 3.00), with a median pIC50 of 6.44. SMILES representations of the compounds and their biological activities can be found in the Supporting Information, Table S1.
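
The pIC50 values quoted above are consistent with the standard conversion pIC50 = −log10(IC50 in mol/L). The following is a minimal Python sketch of that conversion, assuming IC50 values expressed in nanomolar; the function name `pic50_from_nanomolar` is illustrative and does not come from the original work.

```python
import math


def pic50_from_nanomolar(ic50_nanomolar: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nanomolar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # 1 nM = 1e-9 mol/L, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nanomolar)


# Range limits reported for COX-1 and COX-2 in the data set
for ic50 in (1, 15_500_000, 0.08, 1_010_000):
    print(f"IC50 = {ic50} nM -> pIC50 = {pic50_from_nanomolar(ic50):.2f}")
```

Applied to the four range limits given in the text, the sketch reproduces the reported pIC50 values to two decimal places.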

Chemotype-based classification

Molecules were classified into chemotype classes corresponding to different levels of structural resolution following the approach reported by Xu and Johnson (7,8) using molecular equivalence indices (MEQI). Two resolution levels were used in this study, namely cyclic systems and cyclic system skeletons (Figure 1). Cyclic systems were generated by removing side chains and cyclic system skeletons were obtained by setting all atoms to a single atom type (sp3 carbon) and all bonds to a single bond. To compute chemotypes, all counter ions and hydrogen atoms were removed. Exocyclic bonds of ketones, imines, sulfones, and sulfoxides were considered as part of the cyclic systems. However, to obtain a more general representation of the skeletons, exocyclic bonds of ketones, imines, sulfones, and sulfoxides were not considered part of cyclic system skeletons (9,21–23). Figure 1 exemplifies the different resolution levels used in this work for two molecules in the current database. Of note, both examples share the same chemotype at low-resolution level (cyclic system skeletons).

Figure 1.  Chemotype classification using two levels of structural resolution.

Activity-enrichment factor

Activity-enrichment factor (AEF) is defined as the proportion of active molecules in a particular chemotype relative to the proportion of active molecules in the entire database. The AEF for each chemotype in the database was calculated using the following equation (9):

$$\mathrm{AEF}(C_\lambda) = \frac{\mathrm{Act}(C_\lambda)}{\mathrm{Act}(C)} \tag{1}$$

where $\mathrm{AEF}(C_\lambda)$ is the AEF for the λth chemotype, which is defined as the fraction of active compounds that present the λth chemotype, $\mathrm{Act}(C_\lambda)$, with respect to the fraction of active compounds in the entire database, $\mathrm{Act}(C)$. $\mathrm{Act}(C_\lambda)$ was calculated as follows:

$$\mathrm{Act}(C_\lambda) = \frac{|C_\lambda^{+}|}{|C_\lambda|} \tag{2}$$

where $|C_\lambda|$ and $|C_\lambda^{+}|$ are the number of compounds and active compounds, respectively, in the corresponding chemotype class at a given resolution level.

The background activity, $\mathrm{Act}(C)$, was calculated as follows:

$$\mathrm{Act}(C) = \frac{|C^{+}|}{|C|} \tag{3}$$

where $|C|$ is the total number of compounds in the database and $|C^{+}|$ is the total number of active molecules in the data set C.

In this study, molecules with IC50 < 100 nM were defined as active. However, other activity thresholds can be used.
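
As an illustration of the AEF definition, the short Python sketch below computes the ratio of the active fraction in a chemotype to the active fraction in the whole database. The function name and argument names are illustrative assumptions; the counts used in the example are those listed in Table 1 for the entire database (211 actives out of 658 compounds) and for two cyclic systems.

```python
def activity_enrichment_factor(
    n_active_chemotype: int,
    n_chemotype: int,
    n_active_database: int,
    n_database: int,
) -> float:
    """AEF = (actives / compounds in the chemotype) / (actives / compounds in the database)."""
    if n_chemotype <= 0 or n_database <= 0 or n_active_database <= 0:
        raise ValueError("counts must be positive to define the enrichment factor")
    act_chemotype = n_active_chemotype / n_chemotype    # fraction of actives in the chemotype
    act_background = n_active_database / n_database     # background fraction of actives
    return act_chemotype / act_background


# Counts taken from Table 1 (actives defined as IC50 < 100 nM against COX-2)
print(round(activity_enrichment_factor(31, 82, 211, 658), 2))  # cyclic system PP97T
print(round(activity_enrichment_factor(15, 15, 211, 658), 2))  # cyclic system E5CGF
```

With these counts the sketch gives 1.18 for PP97T and 3.12 for E5CGF, matching the AEF column of Table 1.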

Selectivity-enrichment factor

Selectivity-enrichment factor (SEF) is defined as the proportion of selective molecules in a particular chemotype relative to the proportion of selective molecules in the entire database. The SEF for each chemotype in the database was calculated using the expression:

$$\mathrm{SEF}(C_\lambda) = \frac{\mathrm{Sel}(C_\lambda)}{\mathrm{Sel}(C)} \tag{4}$$

where $\mathrm{SEF}(C_\lambda)$ is the SEF for the λth chemotype, that is, the fraction of selective compounds toward a target/end-point in the λth chemotype, $\mathrm{Sel}(C_\lambda)$, with respect to the fraction of selective compounds in the entire database, $\mathrm{Sel}(C)$.

$\mathrm{Sel}(C_\lambda)$ was calculated as follows:

$$\mathrm{Sel}(C_\lambda) = \frac{|C_\lambda^{*}|}{|C_\lambda|} \tag{5}$$

where $|C_\lambda|$ and $|C_\lambda^{*}|$ are the total number of compounds and selective compounds, respectively, in the corresponding chemotype class at a given resolution level.

The background selectivity, $\mathrm{Sel}(C)$, was calculated as follows:

$$\mathrm{Sel}(C) = \frac{|C^{*}|}{|C|} \tag{6}$$

where $|C|$ is the total number of compounds in the database and $|C^{*}|$ is the total number of selective compounds in the data set C.

In this work, we employed the activity ratio (AR = [IC50 COX-1]/[IC50 COX-2]) to define selectivity. Two selectivity thresholds (T) were used to define highly selective (AR > 100) and low-selective (AR > 10) compounds, denoted as AR100 and AR10, respectively. The numbers of highly selective (C*100) and low-selective (C*10) compounds can be used to calculate SEF100 and SEF10, respectively, for each chemotype. Other selectivity thresholds can be used.
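
The selectivity criterion and the SEF can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the IC50 values in the first example are invented for demonstration only, and the compound counts in the second example are those listed in Table 1 for cyclic system PP97T and for the entire database.

```python
def activity_ratio(ic50_cox1: float, ic50_cox2: float) -> float:
    """AR = IC50(COX-1) / IC50(COX-2); both values must use the same units."""
    return ic50_cox1 / ic50_cox2


def is_selective(ic50_cox1: float, ic50_cox2: float, threshold: float) -> bool:
    """A compound counts as COX-2 selective at threshold T when AR > T."""
    return activity_ratio(ic50_cox1, ic50_cox2) > threshold


def selectivity_enrichment_factor(
    n_selective_chemotype: int,
    n_chemotype: int,
    n_selective_database: int,
    n_database: int,
) -> float:
    """SEF = (selective fraction in the chemotype) / (selective fraction in the database)."""
    sel_chemotype = n_selective_chemotype / n_chemotype
    sel_background = n_selective_database / n_database
    return sel_chemotype / sel_background


# Invented IC50 values (nM), for illustration only: AR = 15000 / 40 = 375
print(is_selective(15000.0, 40.0, threshold=100))  # selective at AR100
print(is_selective(600.0, 40.0, threshold=100))    # AR = 15, not selective at AR100
print(is_selective(600.0, 40.0, threshold=10))     # but selective at AR10

# Counts from Table 1 for PP97T (82 compounds; 43 with AR > 100, 66 with AR > 10)
print(round(selectivity_enrichment_factor(43, 82, 211, 658), 2))  # SEF100
print(round(selectivity_enrichment_factor(66, 82, 345, 658), 2))  # SEF10
```

Because AR > 100 implies AR > 10, every compound counted in C*100 is also counted in C*10, which is why the AR10 counts in Table 1 are never smaller than the AR100 counts.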

Chemotype activity and selectivity-enrichment plots

For a data set of N compounds tested against targets I and II and clustered in chemotype classes, patterns of activity and selectivity can be characterized using CASE plots, which are generated by plotting AEF against SEF for each chemotype. A general form of a CASE plot is presented in Figure 2. Chemotype Activity and Selectivity Enrichment plots are divided into four regions I–IV delimited by the background values of AEF and SEF (AEF = 1 and SEF = 1). Data points in region I correspond to chemotypes with a low proportion of active and selective molecules. Data points in region II represent chemotypes with a low proportion of active molecules, but with a high proportion of selective molecules. Points plotted in region III represent chemotypes with a high proportion of active molecules, but a low proportion of selective molecules, for example chemotypes with high activity against both targets. Finally, data points plotted in region IV denote chemotypes that are rich in active and selective molecules. Although a CASE plot does not directly provide information on the frequency of a chemotype, this information can be easily mapped onto the plot, for example by coloring the data points by chemotype frequency.
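
The assignment of a chemotype to one of the four regions can be expressed compactly. The Python sketch below is an illustration only: the function name is hypothetical, and treating a value exactly equal to the background (1) as not enriched is an assumption, since the text does not specify how ties are handled. The example values are the AEF and SEF100 entries listed in Table 1.

```python
def case_region(aef: float, sef: float, background: float = 1.0) -> str:
    """Return the CASE plot region (I-IV) for a chemotype from its AEF and SEF.

    Assumption: values exactly equal to the background are treated as not enriched.
    """
    high_activity = aef > background
    high_selectivity = sef > background
    if high_activity and high_selectivity:
        return "IV"   # rich in active and selective molecules
    if high_activity:
        return "III"  # rich in active, poor in selective molecules
    if high_selectivity:
        return "II"   # poor in active, rich in selective molecules
    return "I"        # poor in both


# (AEF, SEF100) pairs as listed in Table 1 for four cyclic systems
examples = {
    "PP97T": (1.18, 1.64),
    "RF2VF": (0.48, 2.40),
    "R79ZD": (1.04, 0.00),
    "ZZ2VF": (0.32, 0.97),
}
for code, (aef, sef) in examples.items():
    print(code, case_region(aef, sef))
```

A different background value can be passed if the plot is drawn on another scale, for example 0.0 for log-transformed enrichment factors.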

Figure 2.  General form of a chemotype activity and selectivity enrichment (CASE) plot showing four regions. Region I contains chemotypes with low activity and low selectivity; region II contains chemotypes with low activity but high selectivity; region III contains chemotypes with high activity but low selectivity; region IV contains chemotypes with high activity and high selectivity.

In this work, CASE plots were generated for COX-2 using two chemotype resolution levels (cyclic systems and cyclic system skeletons).

Dual activity–difference maps

The SAR of data sets tested against two biological end-points, for example molecular targets I and II, can be characterized using DAD maps which are based on pairwise comparisons (11–13). For a data set of N compounds tested against targets I and II, the DAD map depicts N(N–1)/2 pairwise potency differences corresponding to each possible pair of compounds in the data set against both targets. The potency differences for target T for each molecule pair are calculated with the following equation:

$$\Delta \mathrm{pIC}_{50}(T)_{i,j} = \mathrm{pIC}_{50}(T)_{i} - \mathrm{pIC}_{50}(T)_{j} \tag{7}$$

where $\mathrm{pIC}_{50}(T)_{i}$ and $\mathrm{pIC}_{50}(T)_{j}$ are the activities of the ith and jth molecules (j > i). In this work, T = COX-1, COX-2. Note that $\Delta \mathrm{pIC}_{50}(T)_{i,j}$ can have positive or negative values. As previously described (11–13), using the sign of the potency difference values in the activity–difference maps (as opposed to using the absolute potency difference) provides additional information concerning the direction of SAR.

A prototype DAD map is shown in Figure 3. Vertical and horizontal lines determine boundaries for low-/high-potency differences for targets I and II, respectively. In this work, data points were considered to have a low-potency difference if –1 ≤ ΔpIC50 ≤ 1 for each target. The boundaries give rise to five general zones labeled as Z1–Z5 (Figure 3). Structural changes of molecule pairs that fall into zone Z1 (either a small or a large structural change) have a similar impact on the activity against the two targets (either an increase or decrease in activity). Therefore, zone Z1 is associated with similar SAR of the pair of compounds for both targets. Data points that fall into zone Z2 indicate that the change in activity of the compounds in the pair is opposite for I and II. Thus, the structural changes in the pair of compounds in Z2 are associated with an inverse SAR; that is, they increase the activity for one target but decrease the activity for the other target. Data points in Z3 and Z4 correspond to pairs of molecules with the same or similar activity for one target (I or II, respectively), but different activity for the other target (II or I, respectively). Data points in zone Z5 denote pairs of compounds with similar activity (or identical if ΔActivity = 0 for both targets) against I and II. In other words, structural changes in the pair of compounds located in Z5 have little or no impact on the activity against the two targets. Of note, the classification of data points in an activity–difference map is independent of the structure similarity. Further details of DAD maps are described elsewhere (11–13).
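
The zone definitions above can be sketched in Python as follows. This is an illustrative sketch rather than the implementation used in the original work: the function names are hypothetical, the pIC50 values in the example are invented, and the mapping of Z3 and Z4 follows the wording of this section (Z3: similar activity for target I, different for target II; Z4: the reverse).

```python
from itertools import combinations


def dad_zone(delta_i: float, delta_ii: float, t: float = 1.0) -> str:
    """Assign a compound pair to zone Z1-Z5 from its potency differences for targets I and II."""
    large_i = abs(delta_i) > t
    large_ii = abs(delta_ii) > t
    if not large_i and not large_ii:
        return "Z5"  # little or no change against either target
    if large_i and large_ii:
        # same sign: similar SAR (Z1); opposite sign: inverse SAR (Z2)
        return "Z1" if delta_i * delta_ii > 0 else "Z2"
    # exactly one large difference: Z3 if it is for target II, Z4 if it is for target I
    return "Z3" if large_ii else "Z4"


def dad_points(pic50: dict) -> dict:
    """Return {(id_i, id_j): (delta_I, delta_II, zone)} for all N(N-1)/2 compound pairs."""
    points = {}
    for (id_i, (i1, i2)), (id_j, (j1, j2)) in combinations(pic50.items(), 2):
        delta_i, delta_ii = i1 - j1, i2 - j2
        points[(id_i, id_j)] = (delta_i, delta_ii, dad_zone(delta_i, delta_ii))
    return points


# Invented pIC50 values (target I, target II), for illustration only
example = {"cpd_a": (5.0, 8.2), "cpd_b": (7.4, 5.9), "cpd_c": (5.3, 8.0)}
for pair, (d1, d2, zone) in dad_points(example).items():
    print(pair, round(d1, 2), round(d2, 2), zone)
```

Keeping the sign of each difference, rather than its absolute value, is what allows Z1 (parallel changes) to be distinguished from Z2 (inverse changes).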

Figure 3.  General form of dual activity–difference (DAD) maps for targets I and II. The dashed lines intersect the axes at potency difference values of 0 ± t (e.g., t = 1, one log unit). The regions are defined as follows: Z1, structural modifications result in a significant decrease or increase of activity toward both targets; Z2, changes in structure significantly increase activity for one target while decreasing activity for the other target; Z3 and Z4, structural changes result in significant changes in activity toward one target, but no appreciable change toward the other.

Results and Discussion

Database characterization based on chemotype frequency analysis

The hierarchical classification among chemotypes at different resolution levels is known to be well suited for structurally large data sets (7,8). Xu and Johnson (7,8) developed a methodology to group chemotypes in five basic categories: complete 2D structures, cyclic systems, side chains, ring systems and functional groups. In this work, the molecular database was analyzed considering chemotypes at two different resolution levels (cyclic systems and cyclic system skeletons; Figure 1), which can be associated with molecular scaffolds and molecular skeletons, respectively (7–9). Chemotypes were classified and ranked based on their frequency in the database. The data set contains 191 cyclic systems and 86 cyclic system skeletons. The most frequent chemotypes at the two resolution levels are shown in Figure 4. The most frequent cyclic system is PP97T (12.5%), which corresponds to similar molecules with a common scaffold based on 1,5-diphenyl-1H-pyrazole (Figure 4A). It is worth noting that similar scaffolds based on three ring systems also have substantial frequencies, for example 6CEKT (5.5%), L2U5P (4.0%), USZ6T (4.0%), E5CGF (2.3%) and USKFM (2.1%). On the other hand, the most frequent cyclic system skeleton is C2NYM (34.5%), which comprises most of the scaffolds containing a 1,2-diarylheterocyclic or 1,2-diarylcarbocyclic system (Figure 4B).

Figure 4.  Most frequent cyclic systems (A) and cyclic system skeletons (B) found in the current cyclooxygenase inhibitors database (frequency ≥ 10). Chemotype identifier, frequency, and percentage are displayed.

Having identified and quantified the most frequent cyclic systems and cyclic system skeletons, the characterization of the activity and selectivity profile of each chemotype is addressed in the next section using the CASE plots.

Chemotype activity and selectivity-enrichment plots

In this work, CASE plots are proposed as a graphical representation to analyze activity and selectivity enrichment for each chemotype in a set of molecules with activity against two targets or biological end-points. These plots are an expansion of the previously described chemotype-enrichment plots, which are graphic representations that show the relationship between occupancy and activity enrichment for a set of chemotypes at a given level of structural resolution. It is worth mentioning that the previously described chemotype-enrichment plots were proposed for the analysis of a set of molecules with activity against a single target or biological end-point (9). Hence, chemotype-enrichment plots do not give information about selectivity for a particular chemotype. As described in the Methods, scaffolds rich in active and selective compounds were identified in the CASE plots (Figure 2).

Different levels in activity and selectivity can be used in CASE plots. Table 1 shows SEF values for the most frequent cyclic systems and cyclic system skeletons considering two thresholds for selectivity (SEF100 and SEF10). At the cyclic systems resolution level, chemotypes with high frequency in the database and high SEF (>1) at the two selectivity levels are PP97T, 6CEKT, L2U5P, QZ3TX, E5CGF, USKFM and RF2VF. It is important to consider different selectivity levels to visualize high- and low-selective chemotypes. An example of a chemotype for cyclic systems with low selectivity is USZ6T, which presents a SEF100 = 0, but a SEF10 = 1.03. Hence, the chemotype USZ6T is rich in molecules with low selectivity in the current data set. A similar analysis using cyclic system skeletons reveals that C2NYM, RDJ81, JK9S5 and 8E4BT are the most frequent chemotypes enriched in selective molecules.

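Table 1.   Most frequent cyclic systems and cyclic system skeletons analyzed at two thresholds of selectivity

| Code | Frequency | Actives (a) | Inactives | AEF | C*100 (b) | NSC100 (c) | C*100/NSC100 | SEF100 | C*10 (d) | NSC10 (e) | C*10/NSC10 | SEF10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Entire database | | | | | | | | | | | | |
| DB | 658 | 211 | 447 | – | 211 | 447 | 0.47 | – | 345 | 313 | 1.10 | – |
| Cyclic systems | | | | | | | | | | | | |
| PP97T | 82 | 31 | 51 | 1.18 | 43 | 39 | 1.10 | 1.64 | 66 | 16 | 4.13 | 1.54 |
| 6CEKT | 36 | 20 | 16 | 1.73 | 25 | 11 | 2.27 | 2.17 | 33 | 3 | 11.00 | 1.75 |
| ZZ2VF | 29 | 3 | 26 | 0.32 | 9 | 20 | 0.45 | 0.97 | 12 | 17 | 0.71 | 0.79 |
| L2U5P | 26 | 23 | 3 | 2.76 | 21 | 5 | 4.20 | 2.52 | 23 | 3 | 7.67 | 1.69 |
| USZ6T | 26 | 0 | 26 | 0.00 | 0 | 26 | 0.00 | 0.00 | 14 | 12 | 1.17 | 1.03 |
| C9L9P | 18 | 0 | 18 | 0.00 | 0 | 18 | 0.00 | 0.00 | 2 | 16 | 0.13 | 0.21 |
| 4ZLWP | 17 | 0 | 17 | 0.00 | 0 | 17 | 0.00 | 0.00 | 3 | 14 | 0.21 | 0.34 |
| QZ3TX | 16 | 15 | 1 | 2.92 | 13 | 3 | 4.33 | 2.53 | 15 | 1 | 15.00 | 1.79 |
| E5CGF | 15 | 15 | 0 | 3.12 | 12 | 3 | 4.00 | 2.49 | 14 | 1 | 14.00 | 1.78 |
| USKFM | 14 | 12 | 2 | 2.67 | 12 | 2 | 6.00 | 2.67 | 14 | 0 | – | 1.91 |
| Z3903 | 14 | 0 | 14 | 0.00 | 0 | 14 | 0.00 | 0.00 | 0 | 14 | 0.00 | 0.00 |
| RF2VF | 13 | 2 | 11 | 0.48 | 10 | 3 | 3.33 | 2.40 | 13 | 0 | – | 1.91 |
| R79ZD | 12 | 4 | 8 | 1.04 | 0 | 12 | 0.00 | 0.00 | 3 | 9 | 0.33 | 0.48 |
| 24N4H | 11 | 1 | 10 | 0.28 | 0 | 11 | 0.00 | 0.00 | 0 | 11 | 0.00 | 0.00 |
| QP8FX | 10 | 0 | 10 | 0.00 | 0 | 10 | 0.00 | 0.00 | 0 | 10 | 0.00 | 0.00 |
| VDXB1 | 10 | 1 | 9 | 0.31 | 0 | 10 | 0.00 | 0.00 | 0 | 10 | 0.00 | 0.00 |
| Cyclic system skeletons | | | | | | | | | | | | |
| C2NYM | 227 | 95 | 132 | 1.31 | 116 | 111 | 1.05 | 1.59 | 176 | 51 | 3.45 | 1.48 |
| 2MYHR | 39 | 3 | 36 | 0.24 | 0 | 39 | 0.00 | 0.00 | 0 | 39 | 0.00 | 0.00 |
| 5UPK5 | 31 | 3 | 28 | 0.30 | 9 | 22 | 0.41 | 0.91 | 14 | 17 | 0.82 | 0.86 |
| RDJ81 | 30 | 24 | 6 | 2.49 | 22 | 8 | 2.75 | 2.29 | 26 | 4 | 6.50 | 1.65 |
| 9UKMT | 23 | 0 | 23 | 0.00 | 0 | 23 | 0.00 | 0.00 | 4 | 19 | 0.21 | 0.33 |
| 5BH05 | 22 | 2 | 20 | 0.28 | 1 | 21 | 0.05 | 0.14 | 5 | 17 | 0.29 | 0.43 |
| JK9S5 | 18 | 17 | 1 | 2.95 | 15 | 3 | 5.00 | 2.60 | 17 | 1 | 17.00 | 1.80 |
| EK7D5 | 15 | 2 | 13 | 0.42 | 2 | 13 | 0.15 | 0.42 | 2 | 13 | 0.15 | 0.25 |
| 45FHF | 15 | 0 | 15 | 0.00 | 0 | 15 | 0.00 | 0.00 | 0 | 15 | 0.00 | 0.00 |
| 8E4BT | 14 | 2 | 12 | 0.45 | 10 | 4 | 2.50 | 2.23 | 14 | 0 | – | 1.91 |
| SM6KR | 13 | 4 | 9 | 0.96 | 0 | 13 | 0.00 | 0.00 | 3 | 10 | 0.30 | 0.44 |
| ANWMV | 13 | 0 | 13 | 0.00 | 0 | 13 | 0.00 | 0.00 | 0 | 13 | 0.00 | 0.00 |
| ZW6KT | 10 | 0 | 10 | 0.00 | 0 | 10 | 0.00 | 0.00 | 0 | 10 | 0.00 | 0.00 |

AEF, activity-enrichment factor; SEF, selectivity-enrichment factor; –, value not reported.
(a) Compounds were considered as actives against COX-2 when IC50 < 100 nM.
(b) C*100 (selective compounds) = number of compounds with AR > 100.
(c) NSC100 (non-selective compounds) = frequency − C*100.
(d) C*10 (selective compounds) = number of compounds with AR > 10.
(e) NSC10 (non-selective compounds) = frequency − C*10.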

Chemotype Activity and Selectivity Enrichment plots for cyclic systems and cyclic system skeletons are shown in Figure 5. As discussed in the Methods, CASE plots can be divided into four different regions I–IV (Figure 2). Points located in different regions represent different activity and selectivity patterns. Frequency values were mapped (Figure 5A) using a continuous color scale from green (less frequent) to red (more frequent).

Figure 5.  Chemotype activity and selectivity enrichment (CASE) plots for COX-2 inhibitors at two different chemotype resolutions. Chemotype Activity and Selectivity Enrichment plots are divided into regions I–IV representing different activity and selectivity enrichment patterns. (A) More frequent chemotypes (frequency ≥ 10); points are color-coded by frequency using a continuous scale from green (less frequent) to red (more frequent). (B) Less frequent chemotypes (frequency < 10).

Region IV is considered the most important zone in this study because it contains active and selective chemotypes. Region IV in Figure 5A identifies the most frequent (frequency ≥ 10), active (AEF > 1) and selective (SEF100 > 1) chemotypes. For cyclic systems, there are six examples, namely PP97T, 6CEKT, USKFM, L2U5P, QZ3TX and E5CGF. Chemotypes E5CGF, QZ3TX, L2U5P and USKFM have the highest values of AEF (3.12, 2.92, 2.76 and 2.67, respectively) and SEF100 (2.49, 2.53, 2.52 and 2.67, respectively). Nevertheless, PP97T is the most frequent cyclic system in region IV (AEF = 1.18 and SEF100 = 1.64) and in the entire database (frequency = 82). For cyclic system skeletons, three examples are located in region IV, namely C2NYM, RDJ81 and JK9S5. The chemotype JK9S5 has the highest values of AEF and SEF100 (2.95 and 2.60, respectively); however, it is not the most frequent chemotype. On the other hand, chemotype C2NYM has lower values of AEF (1.31) and SEF100 (1.59) than JK9S5 and RDJ81, but it is the most frequent cyclic system skeleton in the entire database (227 compounds).

Based on CASE plots and considering both chemotype resolution levels, it is clear that chemotypes in the region IV comprise the highest proportion of selective and active molecules; and all of them can be classified into 1,2-diarylheterocyclic or 1,2-diarylcarbocyclic systems. Interestingly, when both chemotype resolutions are compared, cyclic systems E5CGF, 6CEKT, USKFM and PP97T can be clustered in a general parent cyclic system skeleton C2NYM.

The information extracted from CASE plots is in agreement with previous reports in which 1,2-diarylheterocyclic and 1,2-diarylcarbocyclic systems were described as potent and selective COX-2 inhibitors (14,15). Interestingly, some FDA-approved anti-inflammatory drugs in clinical use can be classified among the most important chemotypes: celecoxib (cyclic system PP97T/cyclic system skeleton C2NYM) for human use and deracoxib (PP97T/C2NYM) for veterinary use. Also, other important selective COX-2 inhibitors that have been recently withdrawn from the market are rofecoxib, valdecoxib and parecoxib, which can be classified as chemotype C2NYM, whereas etoricoxib is classified as RDJ81.

In addition, the analysis of chemotypes in region III of CASE plots could be useful when chemotypes that comprise molecules with strong dual activity are desirable, for example R79ZD (2,3-dihydro-1-benzofuran system).

Some chemotypes with low frequency are also of interest as they may show high values of AEF and SEF. However, these chemotypes can mislead interpretations in the first analysis. For example, Figure 5B shows an important quantity of chemotypes located in region IV for cyclic systems and cyclic system skeletons. However, most of them are characterized by only one or two molecules in the current database, which is poor information to support the potential in activity and selectivity of these particular chemotypes.

Although region IV of the CASE plots provides interesting and valuable SAR information concerning the scaffolds, the chemotype analysis by itself does not include details about the influence of the side chains. This will be addressed in the next section.

Chemotype-based dual activity–difference maps

Having characterized the most important cyclic systems and cyclic system skeletons, a systematic SAR study was carried out with the entire 2D structures using DAD maps. Dual activity–difference maps were previously reported as a useful tool to characterize multitarget SAR (11,12). This method has been applied previously to databases where compounds share the same scaffold (11) as well as to structurally diverse compounds (12,13). Dual activity–difference maps were divided into five regions Z1–Z5 (see prototype map in Figure 3) using a threshold in potency differences of ±1 in pIC50, as detailed in the Methods (other threshold values can be used). Additionally, each chemotype can be represented on an independent DAD map showing N*(N*−1)/2 data points, where N* is the number of compounds with a given chemotype. Dual activity–difference maps focused on individual chemotypes provide SAR information based on side chain modifications around the corresponding chemotype.
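
The number of data points in a chemotype-focused DAD map, and the share of pairs falling in each zone, follow directly from the counts. The Python sketch below is illustrative (the function names are hypothetical); the chemotype frequencies come from Table 1 and the zone counts for PP97T from Table 2.

```python
from math import comb


def n_pairs(n_compounds: int) -> int:
    """Number of pairwise comparisons, N*(N* - 1)/2, for a chemotype with N* compounds."""
    return comb(n_compounds, 2)


def zone_percentages(zone_counts: dict) -> dict:
    """Percentage of pairs in each zone relative to the total over all zones."""
    total = sum(zone_counts.values())
    return {zone: round(100 * count / total, 1) for zone, count in zone_counts.items()}


# Chemotype frequencies from Table 1
print(n_pairs(82))   # cyclic system PP97T
print(n_pairs(227))  # cyclic system skeleton C2NYM

# Zone counts for PP97T as listed in Table 2
pp97t = {"Z1": 810, "Z2": 208, "Z3": 707, "Z4": 716, "Z5": 880}
print(zone_percentages(pp97t))
```

For PP97T the sketch gives 3321 pairs, and the zone counts sum to the same total, which is a simple consistency check on a tabulated DAD map.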

Chemotype distribution in DAD maps

Chemotypes previously identified in region IV of CASE plots are especially valuable for SAR characterization. Figure 6 shows DAD maps representing the six most frequent, active and selective cyclic systems (PP97T, 6CEKT, L2U5P, QZ3TX, E5CGF and USKFM) and three cyclic system skeletons (C2NYM, RDJ81 and JK9S5). Of note, in this analysis, only pairwise comparisons were performed for compounds sharing the same chemotype.

Figure 6.  Dual activity–difference maps for chemotypes located in region IV of CASE plots. Each point represents a pairwise comparison where both molecules share the same chemotype. Data points are color-coded to distinguish chemotypes, namely PP97T (gray), 6CEKT (brown), E5CGF (black), USKFM (pink), L2U5P (purple), QZ3TX (blue), C2NYM (red), RDJ81 (green) and JK9S5 (orange). For better visualization, each chemotype is depicted in an individual panel labeled with its code and number of pairwise comparisons.

The results show that, even though all selected chemotypes are structurally related, their distributions in DAD maps can differ. This is valuable information because different chemotypes can be associated with different activity and selectivity patterns. The quantification of data pairs located in regions Z1–Z5 for selected chemotypes is shown in Table 2. At the cyclic system resolution level, chemotype PP97T is widely distributed among Z1 and Z3–Z5, each with a frequency > 20%. This result suggests that PP97T has a diverse SAR, where changes in structure lead to changes in selectivity against COX-1 (Z3 = 21.3%) and COX-2 (Z4 = 21.6%) and to parallel changes against both targets (Z1 = 24.4%). Interestingly, PP97T has the lowest proportion of pairs in Z5 (26.5%) as compared with 6CEKT, E5CGF, USKFM, L2U5P and QZ3TX (47.1–67.7%); therefore, changes in the side chains of this chemotype lead to large changes in activity or selectivity in most cases. Chemotypes 6CEKT, USKFM and QZ3TX are frequent in regions Z3–Z4 (Z3 = 16.5–29.2%; Z4 = 17.5–22.2%); thus, changes in the structure of these chemotypes are highly related to changes in selectivity. These same chemotypes are also frequent in Z5 (47.1–58.2%) and show a substantial number of pairs with no changes in activity. Additionally, for these same chemotypes, low frequencies were found in Z1 (2.5–9.7%); therefore, only a small number of pairs is associated with parallel changes in activity. Chemotype E5CGF has a high frequency in Z3 (39%); thus, most of the changes in structure lead to changes in activity against COX-1. This chemotype is also frequent in Z5 (59%). Finally, chemotype L2U5P has a high frequency in Z4 (15.4%); therefore, structural changes in molecules with this chemotype are highly related to changes against COX-2. Data points containing this chemotype are also frequent in Z5 (67.7%). At the cyclic system skeleton level, C2NYM is widely distributed among Z1 and Z3–Z5, the same behavior as its child cyclic system PP97T. The same tendency was observed for JK9S5 (distributed among Z3–Z5) and for RDJ81 (distributed among Z1 and Z4–Z5) as compared with their child cyclic systems QZ3TX and L2U5P, respectively.

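Table 2.   Pairwise activity differences distribution in regions Z1–Z5 of DAD maps for each selected chemotype

| Chemotypes | Total (Z1–Z5) (a) | Z1 (b) | Z1 (%) (c) | Z2 | Z2 (%) | Z3 | Z3 (%) | Z4 | Z4 (%) | Z5 | Z5 (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cyclic systems | | | | | | | | | | | |
| PP97T | 3321 | 810 | 24.4 | 208 | 6.3 | 707 | 21.3 | 716 | 21.6 | 880 | 26.5 |
| 6CEKT | 630 | 61 | 9.7 | 10 | 1.6 | 122 | 19.4 | 140 | 22.2 | 297 | 47.1 |
| E5CGF | 105 | 1 | 1.0 | 0 | 0 | 41 | 39.0 | 1 | 1 | 62 | 59.0 |
| USKFM | 91 | 4 | 4.4 | 1 | 1.1 | 15 | 16.5 | 18 | 19.8 | 53 | 58.2 |
| L2U5P | 325 | 27 | 8.3 | 0 | 0 | 28 | 8.6 | 50 | 15.4 | 220 | 67.7 |
| QZ3TX | 120 | 3 | 2.5 | 0 | 0 | 35 | 29.2 | 21 | 17.5 | 61 | 50.8 |
| Cyclic system skeletons | | | | | | | | | | | |
| C2NYM | 25651 | 4778 | 18.6 | 1263 | 4.9 | 4829 | 18.8 | 7161 | 27.9 | 7620 | 29.7 |
| RDJ81 | 435 | 59 | 13.6 | 1 | 0.2 | 35 | 8.0 | 119 | 27.4 | 221 | 50.8 |
| JK9S5 | 153 | 6 | 3.9 | 0 | 0 | 37 | 24.2 | 30 | 19.6 | 80 | 52.3 |

DAD, dual activity–difference.
(a) Total pairs for a particular chemotype distributed among regions Z1–Z5.
(b) Total pairs that fall in a particular region of DAD map for each chemotype.
(c) Percent of pairs comprised in a particular region of DAD map relative to total pairs in all regions (Z1–Z5) for each chemotype.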

Structure–activity relationships with DAD maps

Some important structural modifications that lead to activity and selectivity changes for selected chemotypes can be analyzed using DAD maps. Figure 7 shows a DAD map with examples of pairs of molecules for chemotype PP97T in different regions. Also, panels Z1–Z5 depict the chemical structures and activities of selected pairs for both cyclooxygenases in each region of the DAD map.

Figure 7.  Dual activity–difference (DAD) map comprising 3321 pairwise comparisons for 82 compounds with the cyclic system PP97T. Selected examples are labeled with the compound codes. Chemical structures for selected pairs are depicted in panels Z1–Z4 (named after the zone in which each pair is located), and the IC50 values (nM) for COX-1 and COX-2 are shown below each structure.

Compound pairs 104_144 and 104_128 are examples of pairs in zone Z1, where structural changes are associated with similar changes in activity against both cyclooxygenases. For example, these pairs suggest that a trifluoromethyl group at position 3 in combination with chlorine or fluorine at position 4 increases the activity against both targets. Z2 is a particularly interesting zone because pairs located in this region show an inverse SAR, that is, a structural change that increases activity against one target decreases it against the other. Some examples in Z2 are pairs 146_147, 114_147, 113_150, and 113_128. In these examples, replacing the sulfonamide with a chlorine or a methoxy group results in selectivity toward COX-1; hence, the sulfonamide substituent is clearly important for gaining selectivity toward COX-2. Additionally, the examples in regions Z1 and Z2 suggest that hydrogen at position 4 (see compounds 146 and 114) favors selectivity toward COX-2 in 3-trifluoromethyl- and 3-difluoromethylpyrazole derivatives compared with halogen-substituted molecules (e.g., 144 and 128).

Pairs in region Z3 are associated with changes in activity against COX-1 and small or no changes against COX-2. Some examples are pairs 94_113 and 101_113, where the absence of the sulfonamide greatly increases activity against COX-1 but has little effect on activity against COX-2. Also, pair 141_108 suggests that changing the position of the phenyl sulfonamide leads to a substantial loss of selectivity.

Pairs in region Z4 are associated with changes in activity against COX-2 and small or no changes against COX-1. Data points 110_480 and 133_480 are examples of pairs in Z4. These pairs suggest that the presence of the acetylaminomethyl substituent leads to a loss of activity and selectivity against COX-2. This observation is also supported by comparing the well-known COX-2 selective inhibitor celecoxib with the non-selective compound 480; the two differ only in the acetylaminomethyl substituent. The same pairs in Z4 suggest that some changes in the phenyl ring at position 5 are tolerated without loss of selectivity. This observation also holds for pairs 110_124 and 114_124, located in Z5. Additional examples in Z4 are 119_134 and 119_120, where replacing the amino substituent at position 4 of the pyrazole ring with a halogen, such as chlorine or bromine, is favorable for COX-2 selectivity. Other substituents at position 4 of the pyrazole ring, such as fluorine or methylsulfonyl, reduce activity and selectivity (pair 97_104, located in Z5). Interestingly, pyrazole derivatives substituted at position 3 with trifluoromethyl or difluoromethyl (e.g., 146 and 114) are highly selective, as are pyrazole derivatives substituted with bromine or chlorine at position 4 (e.g., 134 and 120); however, derivatives that combine both substitution patterns show low selectivity (e.g., 144).

It is worth mentioning that the SAR discussed in this work is highly dependent on the current database; hence, additional observations could arise with different databases screened against the same targets.
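The zone assignments discussed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the threshold `t` of one log unit, the sign convention (first compound minus second), and the compound codes and pIC50 values in `toy` are all assumptions made for illustration. Only the qualitative meaning of Z1–Z4 is taken from the text; Z5 is assumed to collect pairs with small changes against both targets.

```python
from itertools import combinations
from math import log10


def pic50_from_nm(ic50_nm):
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L)."""
    return 9.0 - log10(ic50_nm)


def dad_zone(d_cox1, d_cox2, t=1.0):
    """Assign a compound pair to a zone from its two activity differences.

    t is an ASSUMED threshold (log units) separating large from small
    changes; it is not taken from this section of the paper.
    """
    big1, big2 = abs(d_cox1) >= t, abs(d_cox2) >= t
    if big1 and big2:
        # Same sign: similar SAR for both targets; opposite sign: inverse SAR.
        return "Z1" if d_cox1 * d_cox2 > 0 else "Z2"
    if big1:
        return "Z3"  # change against COX-1 only
    if big2:
        return "Z4"  # change against COX-2 only
    return "Z5"      # assumed: small change against both targets


def dad_map(pic50):
    """Map {code: (pIC50_COX1, pIC50_COX2)} to {(a, b): (dCOX1, dCOX2, zone)}."""
    points = {}
    for a, b in combinations(sorted(pic50), 2):
        d1 = pic50[a][0] - pic50[b][0]
        d2 = pic50[a][1] - pic50[b][1]
        points[(a, b)] = (d1, d2, dad_zone(d1, d2))
    return points


# Hypothetical compounds and pIC50 values, for illustration only.
toy = {"A": (5.0, 8.0), "B": (7.0, 6.0), "C": (5.2, 8.1), "E": (7.0, 8.0)}
for pair, (d1, d2, zone) in dad_map(toy).items():
    print(pair, round(d1, 2), round(d2, 2), zone)
```

Because every unordered pair is compared once, n compounds give n(n − 1)/2 data points, which for the 82 compounds in Figure 7 is 82 × 81 / 2 = 3321.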

Conclusions

Herein, we report a new graphical methodology called the Chemotype Activity and Selectivity Enrichment (CASE) plot, which adds selectivity information to the previously reported activity-enrichment plots. This methodology can be applied to study molecular databases screened against two biological end-points.

Briefly, CASE plots were constructed by plotting AEF and SEF for a set of molecules clustered into chemotypes. CASE plots were divided into four regions (I–IV) to obtain information about the activity and selectivity of each chemotype. As a case study, we reported a comprehensive analysis based on chemotype characterization for a set of 658 cyclooxygenase inhibitors. The most common, active, and selective cyclic systems (PP97T, 6CEKT, E5CGF, USKFM, L2U5P, and QZ3TX) and cyclic system skeletons (C2NYM, RDJ81, and JK9S5) in the current database were easily detected. It is worth noting that all chemotypes rich in active and selective molecules contain a 1,2-diarylheterocyclic or 1,2-diarylcarbocyclic system. This observation agrees with previous reports in which these systems are present in a large number of active and selective COX-2 inhibitors, including compounds approved for human and veterinary use (celecoxib and deracoxib, respectively). Compounds with the selected chemotypes (the most active and selective) were further characterized using DAD maps, which are based on pairwise comparisons of compounds tested against two targets. The DAD maps were divided into five zones (Z1–Z5), showing different selectivity distributions for each selected chemotype. The results show that the SAR depends strongly on the chemotype, at least for this database. CASE plots and chemotype-based DAD maps are general and complementary methodologies: whereas CASE plots are useful for chemotype analysis, DAD maps provide SAR information related to side chains. These methodologies can be used to study other similar or larger databases, providing a visual means of assessing activity and selectivity information.
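As a rough illustration of how the two coordinates of a CASE plot might be computed, the Python sketch below assumes that each enrichment factor is the fraction of active (or selective) compounds within a chemotype divided by the corresponding fraction in the whole data set. That definition, the cut-off of 1.0 separating the regions, and the toy records are assumptions made here; the authors' definitions are given in Methods and may differ. The regions are described in words because this section does not state how I–IV map onto the plot.

```python
from collections import defaultdict


def enrichment_factors(records):
    """Return {chemotype: (AEF, SEF)} for (chemotype, is_active, is_selective) records.

    ASSUMED definition: fraction of flagged compounds in the chemotype
    divided by the fraction of flagged compounds in the whole data set.
    """
    records = list(records)
    n = len(records)
    frac_active = sum(1 for _, a, _ in records if a) / n
    frac_selective = sum(1 for _, _, s in records if s) / n
    if frac_active == 0 or frac_selective == 0:
        raise ValueError("data set needs at least one active and one selective compound")

    groups = defaultdict(list)
    for chemotype, active, selective in records:
        groups[chemotype].append((active, selective))

    factors = {}
    for chemotype, members in groups.items():
        size = len(members)
        aef = (sum(1 for a, _ in members if a) / size) / frac_active
        sef = (sum(1 for _, s in members if s) / size) / frac_selective
        factors[chemotype] = (aef, sef)
    return factors


def region(aef, sef, cut=1.0):
    """Describe the plot region; cut=1.0 (no enrichment) is an ASSUMED boundary."""
    activity = "active-enriched" if aef > cut else "not active-enriched"
    selectivity = "selective-enriched" if sef > cut else "not selective-enriched"
    return f"{activity}, {selectivity}"


# Hypothetical chemotypes X and Y with four compounds each.
toy_records = [
    ("X", True, True), ("X", True, True), ("X", True, False), ("X", False, False),
    ("Y", True, False), ("Y", False, False), ("Y", False, False), ("Y", False, False),
]
for chemotype, (aef, sef) in enrichment_factors(toy_records).items():
    print(chemotype, aef, sef, region(aef, sef))
```

Under this assumed definition, a factor above 1 means the chemotype contains proportionally more active (or selective) compounds than the database as a whole.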

Acknowledgments

We thank Jacob Waddell and Walter A. Gloria-Greimel for proofreading the manuscript. The authors would like to express their sincere thanks to BindingDB for providing the studied structure and activity data; to Dr. Mark Johnson for providing the program MEQI; to Schrödinger, LLC, for providing Maestro 9.1; and to TIBCO Software Inc., for providing TIBCO Silver Spotfire 3.2. J.L. M-F. acknowledges the State of Florida for funding.

References

  • 1
    Iyer P., Wawer M., Bajorath J. (2011) Comparison of two- and three-dimensional activity landscape representations for different compound data sets. Med Chem Commun;2:113–118.
  • 2
    Ooms F. (2000) Molecular modeling and computer aided drug design. Examples of their applications in medicinal chemistry. Curr Med Chem;7:141–158.
  • 3
    López-Vallejo F., Medina-Franco J.L., Castillo R. (2006) Diseño de fármacos asistido por computadora. Educ Quím;17:452–457.
  • 4
    Kubinyi H. (1997) QSAR and 3D QSAR in drug design. 1. Methodology. Drug Discovery Today;2:457–467.
  • 5
    Kubinyi H. (1997) QSAR and 3D QSAR in drug design. 2. Applications and problems. Drug Discovery Today;2:538–546.
  • 6
    Schuffenhauer A., Varin T. (2011) Rule-based classification of chemical structures by scaffold. Mol Inf;30:646–664.
  • 7
    Xu Y.J., Johnson M. (2001) Algorithm for naming molecular equivalence classes represented by labeled pseudographs. J Chem Inf Comput Sci;41:181–185.
  • 8
    Xu Y.J., Johnson M. (2002) Using molecular equivalence numbers to visually explore structural features that distinguish chemical libraries. J Chem Inf Comput Sci;42:912–926.
  • 9
    Medina-Franco J.L., Petit J., Maggiora G.M. (2006) Hierarchical strategy for identifying active chemotype classes in compound databases. Chem Biol Drug Des;67:395–408.
  • 10
    Bajorath J., Peltason L., Wawer M., Guha R., Lajiness M.S., Van Drie J.H. (2009) Navigating structure–activity landscapes. Drug Discovery Today;14:698–705.
  • 11
    Pérez-Villanueva J., Santos R., Hernández-Campos A., Giulianotti M.A., Castillo R., Medina-Franco J.L. (2011) Structure-activity relationships of benzimidazole derivatives as antiparasitic agents: dual activity-difference (DAD) maps. Med Chem Commun;2:44–49.
  • 12
    Medina-Franco J.L., Yongye A.B., Perez-Villanueva J., Houghten R.A., Martinez-Mayorga K. (2011) Multitarget structure-activity relationships characterized by activity-difference maps and consensus similarity measure. J Chem Inf Model;51:2427–2439.
  • 13
    Mendez-Lucio O., Pérez-Villanueva J., Castillo R., Medina-Franco J.L. (2012) Activity landscape modeling of PPAR ligands with dual-activity difference maps. Bioorg Med Chem;20:3523–3532.
  • 14
    Bingham S., Beswick P.J., Blum D.E., Gray N.M., Chessell I.P. (2006) The role of the cylooxygenase pathway in nociception and pain. Semin Cell Dev Biol;17:544–554.
  • 15
    Dannhardt G., Kiefer W. (2001) Cyclooxygenase inhibitors – current status and future prospects. Eur J Med Chem;36:109–126.
  • 16
    Sobolewski C., Cerella C., Dicato M., Ghibelli L., Diederich M. (2010) The role of cyclooxygenase–2 in cell proliferation and cell death in human malignancies. Int J Cell Biol;2010:1–21.
  • 17
    Schneider C., Pozzi A. (2012) Cyclooxygenases and lipoxygenases in cancer. Cancer Metastasis Rev;30:277–294.
  • 18
    Chen X., Lin Y., Gilson M.K. (2001) The Binding Database: overview and user’s guide. Biopolymers;61:127–141.
  • 19
    Chen X., Lin Y., Liu M., Gilson M.K. (2002) The Binding Database: data management and interface design. Bioinformatics;18:130–139.
  • 20
    Liu T., Lin Y., Wen X., Jorissen R.N., Gilson M.K. (2007) BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic Acids Res;35:D198–D201.
  • 21
    Singh N., Guha R., Giulianotti M.A., Pinilla C., Houghten R.A., Medina-Franco J.L. (2009) Chemoinformatic analysis of combinatorial libraries, drugs, natural products, and molecular libraries small molecule repository. J Chem Inf Model;49:1010–1024.
  • 22
    Medina-Franco J.L., López-Vallejo F., Kuck D., Lyko F. (2011) Natural products as DNA methyltransferase inhibitors: a computer-aided discovery approach. Mol Divers;15:293–304.
  • 23
    Yoo J., Medina-Franco J.L. (2011) Chemoinformatic approaches for inhibitors of DNA methyltransferases: comprehensive characterization of screening libraries. Comput Mol Biosci;1:7–16.

Supporting Information

Table S1. SMILES representation of the structures and the pIC50 (−logIC50) values of the data set.

Filename                     Format   Size   Description
CBDD_12019_sm_TableS1.doc             972K   Supporting info item

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.