Despite the sequencing of the human genome, the rate of innovative and successful drug discovery in the pharmaceutical industry has continued to decrease. Leaving aside regulatory matters, the fundamental and interlinked intellectual issues proposed to be largely responsible for this are: (a) the move from ‘function-first’ to ‘target-first’ methods of screening and drug discovery; (b) the belief that successful drugs should and do interact solely with single, individual targets, despite natural evolution's selection for biochemical networks that are robust to individual parameter changes; (c) an over-reliance on the rule-of-5 to constrain biophysical and chemical properties of drug libraries; (d) the general abandoning of natural products that do not obey the rule-of-5; (e) an incorrect belief that drugs diffuse passively into (and presumably out of) cells across the bilayer portions of membranes, according to their lipophilicity; (f) a widespread failure to recognize the overwhelmingly important role of proteinaceous transporters, as well as their expression profiles, in determining drug distribution in and between different tissues and individual patients; and (g) the general failure to use engineering principles to model biology in parallel with performing ‘wet’ experiments, such that ‘what if?’ experiments can be performed in silico to assess the likely success of any strategy. These facts/ideas are illustrated with a reasonably extensive literature review.
Success in turning round drug discovery consequently requires: (a) decent systems biology models of human biochemical networks; (b) the use of these (iteratively with experiments) to model how drugs need to interact with multiple targets to have substantive effects on the phenotype; (c) the adoption of polypharmacology and/or cocktails of drugs as a desirable goal in itself; (d) the incorporation of drug transporters into systems biology models, en route to full and multiscale systems biology models that incorporate drug absorption, distribution, metabolism and excretion; (e) a return to ‘function-first’ or phenotypic screening; and (f) novel methods for inferring modes of action by measuring the properties of system variables at all levels of the ‘omes. Such a strategy offers the opportunity of achieving a state where we can hope to predict biological processes and the effect of pharmaceutical agents upon them. Consequently, this should both lower attrition rates and raise the rates of discovery of effective drugs substantially.
As illustrated in Fig. 1, classical drug discovery (or pharmacology or chemical genetics) started with an organism displaying a phenotype where there was a need for change (e.g. a disease) and involved the assay of various drugs in vivo to identify one or more that was efficacious (and nontoxic). There was no need to discover (let alone start with) a postulated mechanism of drug action; for a successful drug, this could come later (often much later) [1-3]. This approach is thus ‘function first’, and is equivalent in terms of (chemical) genetic or genotype–phenotype mapping to ‘forward’ genetics, and has led to the discovery of many drugs that are still in use (and mainly still without detailed knowledge of their mechanisms of action). By contrast, particularly as a result of the systematic (human) genome sequencing programmes, drug discovery largely changed to an approach that was based on the ability of chemicals to bind to or inhibit chosen molecular targets at low concentrations in vitro. This would then necessarily be followed by tests of efficacy in whole organisms. This approach is thus ‘target-first’, and is equivalent to ‘reverse’ genetics, and (despite some spectacular new molecules that work on selected patients, as well as the important rise of biologicals) has been rather ineffectual because the vast majority of small molecule drugs (90–95%) fail to go forward, even from the ‘first into humans’ phase, to become successful and marketable drugs; a set of phenomena known as ‘attrition’ [6-11]. This is not unexpected to systems biologists, who would see the distinction as being similar to the distinction between hypothesis-dependent and data-driven science [12, 13]. The present review aims to illustrate why this is the case, as well as what we might seek to do to improve matters. Figure 2 provides an overview of the present review, which begins by recognizing the role of robustness in biochemical networks.
The robustness of biochemical networks
Somewhat in contrast to designed and artificial network structures such as roads, railways and process plants, natural evolution has selected much less for cheapness and efficiency than for robustness to parameter changes (whether caused by mutation or otherwise) [15-26]. This is straightforwardly understandable in that an organism with a mutation messing up a whole pathway will soon be selected out, and so the selection pressure for robustness is very high. Typically, it is the network topologies and feedback structures themselves, rather than the exact parameter values involved, that are responsible for the robustness to parameter changes. However, another way to think about this is that, by diminishing the sensitivity of individual steps to particular changes in their parameters (or to inhibitors), no individual enzyme or target or inhibitor is likely to have much effect unless it affects many other steps by itself. This is easily achieved by having enzymes obeying Henri–Michaelis–Menten kinetics operating at (or below) their Km values (Fig. 3), where a certain amount of inhibition of them (other than uncompetitive inhibition) simply raises the concentration of their substrate and restores flux. (If the substrate of an enzyme has a concentration that is maintained essentially constant by regulatory mechanisms, then competitive inhibition of an enzyme that uses it in a minor pathway can be expected to be as effective in vivo as it is in the spectrophotometer.)
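The flux-restoring effect of a rising substrate concentration can be illustrated with a minimal sketch. It assumes a single enzyme fed by a constant upstream influx and inhibited competitively; the function names and all numerical values (Vmax, Km, Ki, influx) are arbitrary illustrations, not values taken from the literature reviewed here.

```python
def km_apparent(km, inhibitor=0.0, ki=1.0):
    """Apparent Km in the presence of a competitive inhibitor."""
    return km * (1.0 + inhibitor / ki)


def mm_rate(substrate, vmax, km_app):
    """Henri-Michaelis-Menten rate at a given substrate concentration."""
    return vmax * substrate / (km_app + substrate)


def steady_state_substrate(influx, vmax, km_app):
    """Substrate level at which the enzyme rate equals a constant influx."""
    if influx >= vmax:
        raise ValueError("influx must be below Vmax for a steady state to exist")
    return influx * km_app / (vmax - influx)


# Arbitrary illustrative parameters (assumptions, not measured values).
VMAX, KM, INFLUX = 10.0, 1.0, 2.0

# Without inhibitor the enzyme works below its Km.
s_free = steady_state_substrate(INFLUX, VMAX, km_apparent(KM))

# A competitive inhibitor at [I] = Ki doubles the apparent Km.
km_inh = km_apparent(KM, inhibitor=1.0, ki=1.0)
s_inhibited = steady_state_substrate(INFLUX, VMAX, km_inh)
flux_inhibited = mm_rate(s_inhibited, VMAX, km_inh)

print(s_free, s_inhibited, flux_inhibited)
```

In this toy system the inhibitor doubles the steady-state substrate concentration, while the flux through the step returns to the value set by the influx, which is the behaviour described above for enzymes operating at or below their Km.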
The corollary is clear: to have a major effect on a typical biochemical network, it is necessary to modulate multiple steps simultaneously (see below), such that any drug that acts solely on a single (molecular) target is unlikely to be successful. The same is true of schemes designed to increase the fluxes in pathways of biotechnological interest [29-35]. This distributed nature of flux control, which contributes to robustness, has long been established, and indeed is proven mathematically for certain kinds of networks via the theorems of metabolic control analysis [36-42]. These show that, by normalizing appropriately, the contributions (‘control coefficients’, also known as their local sensitivities) to a particular flux of all the steps in a biochemical pathway add up to 1, and thus most individual steps are likely to have only a small contribution.
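In symbols, the summation property just described can be written as follows. The notation is an assumption made here for illustration: J denotes the steady-state flux of interest, e_i the activity (or concentration) of the enzyme catalysing step i, and n the number of steps.

```latex
C^{J}_{i} \;=\; \frac{\partial \ln J}{\partial \ln e_{i}}
        \;=\; \frac{e_{i}}{J}\,\frac{\partial J}{\partial e_{i}},
\qquad
\sum_{i=1}^{n} C^{J}_{i} \;=\; 1 .
```

If control happened to be shared equally among, say, n = 10 steps, each coefficient would be 0.1, so that a 10% change in the activity of any one enzyme would alter the flux by only about 1% to first order.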
Polypharmacology as a desirable goal
If we are to design drugs that overcome this robustness, we need either to find individual molecules that hit a useful set of multiple targets [44-46] (for an example from neuropharmacology, see [47-51]) or use cocktails of drugs [24, 52-54], each of which hits mainly an individual target. The former is known as polypharmacology [44, 45, 55-67] or multi-target drug discovery [68, 69] and the recognition that we need to attack multiple targets in pharmacology is reflected in names such as ‘systems pharmacology’ [6, 70-88] or ‘systems medicine’ [89-93]. The use of cocktails is of course commonplace in diseases such as cancer and HIV-AIDS.
One issue is that finding a good subset of even a small number of targets from a large number of possible targets is a combinatorial optimization problem. All combinations of n drugs specific for n targets gives 2^n possibilities, whereas finding the best combination of even just three or four drugs or targets out of 1000 gives 166 million or 41 billion combinations, respectively, resulting in numbers that are too large for typical experimental analyses (but easily accessible computationally; see below).
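These counts are easy to reproduce. The short sketch below uses only the Python standard library; the helper name `subset_count` and the choice of n = 10 for the 2^n example are illustrative assumptions.

```python
import math


def subset_count(n_drugs):
    """Number of possible subsets (cocktails) of n drugs, including the empty one."""
    return 2 ** n_drugs


# Choosing 3 or 4 targets out of 1000 candidates.
triples = math.comb(1000, 3)
quadruples = math.comb(1000, 4)

print(subset_count(10))   # all subsets of 10 drugs
print(f"{triples:,}")     # about 166 million
print(f"{quadruples:,}")  # about 41 billion
```

Enumerating combinations on this scale is routine computationally, which is the point made above about in silico rather than experimental exploration.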
Polypharmacology in pharmacogenomics and personalized medicine
An important recognition, if not that recent in origin, is that every patient is different and thus their response to drugs will also be different [97-102]. As neatly phrased by Henney, quoting an 18th Century physician (Caleb Parry), ‘It is much more important to know what kind of patient has a disease than to know what kind of disease a patient has’. The essential combinatorial argument is straightforward: if we define for any character, such as the fasting low-density lipoprotein-cholesterol level, the ‘normal range’ to be the middle 95 percentiles, then any individual has a probability of 0.95 of being ‘normal’ (for that character). (This is conventional but thereby ignores systematic errors or biases.) The probability of being normal for two (independent) characters is thus 0.95^2 and, for n independent characters, is 0.95^n. This drops below 1% when n = 90, and there are of course thousands of characters. What is probably more unexpected, therefore, is not that individuals are different but that they display any similarity of response at all (in part, this presumably reflects the evolution and selection for robustness described above, and the fact that many characters are not of course entirely independent).
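The threshold quoted above can be checked directly. The sketch assumes, as the argument does, that the characters are independent; the function name is a hypothetical helper.

```python
def characters_needed(p_normal=0.95, threshold=0.01):
    """Smallest n for which p_normal**n falls below the threshold."""
    n = 0
    probability = 1.0
    while probability >= threshold:
        n += 1
        probability *= p_normal
    return n


n_required = characters_needed()
print(n_required, 0.95 ** n_required)
```

With a 95% ‘normal range’ per character, the probability of being normal for every character first falls below 1% at n = 90.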
From the point of view of polypharmacology, a drug that interacts usefully with n targets can more easily afford to ‘lose’ one of them (e.g. as a result of an inactivating single nucleotide polymorphism or other mutation) if n is large, whereas a drug that has only one target may provide a very strong variation in response between individuals. Assuming that adverse drug reactions are taken into account, a drug with multiple useful targets is thus likely to show significantly less variation in the response across populations. Drugs do of course require transporters to reach to their sites of action (see below) and this concept should also be included as part of the relevant polypharmacological analysis of multiple ‘targets’ (i.e. preferred macromolecules with which the drug is intended to interact).
How target-specific are the presently available marketed drugs?
The argument that one should seek to hit multiple targets raises the question of which proteins successful (and thus marketed) drugs actually bind to, given that many of them were in fact isolated on the basis of their ability to bind to a specific and isolated molecular target. What takes place in real cells, tissues and organisms, however, is very different: individual drugs [44, 46, 55, 57, 61, 63, 64, 66, 67, 106-145], and even intermediary metabolites [146-149], are now seen to bind to a great many more entities than just the single ‘target’ via which they were typically discovered. Drugs on average bind to six targets, whereas ligands in some classes typically bind to many more [44, 114]. This ‘drug promiscuity’ can be accounted for in terms of the comparatively limited number of protein motifs used in evolution, which are often related to each other [145, 153], as well as the fact that only a small number of biophysical forces determine binding; together, these make complete specificity generally implausible in small molecules and, as a consequence, bioactivity in one species is often enriched in other species [154, 155]. A typical example of promiscuity is outlined below.
An example: statins
Although low-density lipoprotein-cholesterol is widely regarded as a major determinant of cardiovascular diseases as a result of its appearance in atherosclerotic plaques, its correlation with disease when in its normal range is poor [115, 156]. Nonetheless, subsequent to the discovery of a ligand (later marketed as lovastatin) from Aspergillus terreus that would inhibit HMG-CoA reductase, and thus lower cholesterol, a great swathe of ‘statins’ have been marketed, and the epidemiological evidence that they can prolong life is good. It is again widely assumed that this is because they lower cholesterol, whereas this is neither logical, nor (as stated) true. Although there is a highly unfortunate tendency to lump all such molecules as ‘statins’ (presumably because they were discovered via their ability to inhibit HMG-CoA reductase), expression profiling studies straightforwardly show that they have no such unitary mode of action. The resolution of the paradox is uncomplicated. All statins consist of a substructure that mimics hydroxymethylglutarate (and not, incidentally, its CoA derivative), termed the ‘front end’, bound to a wide variety of other structures (the ‘back ends’). In most cases, it is likely the ‘back end’ that accounts for most of the biological activity, and mainly because such molecules are anti-inflammatory. In a previous review, more than 50 literature citations up to June 2008 were provided. More recent examples are now available [159-165]. A similar tale can be told for ‘glitazones’.
Drug biophysics and the rule-of-5
Lipophilicity is widely seen as an important concept in drug discovery, albeit that there is no doubt that drug promiscuity tends to increase with lipophilicity [107, 119, 122, 124, 126, 127, 144, 150, 166-172]. In an extremely influential review and later reprint, Chris Lipinski and colleagues, when seeking to minimize the number of drugs that failed for reasons of pharmacodynamics and pharmacokinetics, proposed four rules (known as the ‘rule-of-5’ or Ro5 because each rule contains elements that are multiples of 5). They predicted that poor absorption or permeation into cells for a molecule is more likely when the number of hydrogen-bond donors > 5, the number of hydrogen-bond acceptors > 10, the relative molecular mass > 500 and the calculated log P (cLog P) > 5. This last in particular is a measure of lipophilicity, and those who design chemical libraries will always seek molecules that obey the Ro5, including through experimental measurements of the partition coefficient log P [175-177] and/or the distribution coefficient log D. Note, however, that natural product-based drugs (still a major source of leads and indeed marketed drugs; see below) very rarely, if ever, obey the Ro5 and, indeed, even some synthetic drugs have very large molecular weights; for example, navitoclax dihydrochloride, a Bcl-2 inhibitor, has seven ring systems and a relative molecular mass of 1047.5. Indeed, there is an increasing recognition [154, 155, 166, 180-186] that over-reliance on Ro5 compliance would lose many desirable drugs, including known ‘blockbusters’.
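The four thresholds stated above translate directly into a simple filter of the kind commonly applied to chemical libraries. The sketch below is a minimal illustration only: the `Descriptors` class, the function name and both example molecules are hypothetical. In particular, the second example borrows only the relative molecular mass of 1047.5 quoted above; its other descriptor values are placeholders and not measured properties of navitoclax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Pre-computed molecular descriptors (assumed to be supplied by the user)."""
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_mass: float
    clogp: float


def ro5_flags(d):
    """Return the rule-of-5 criteria that a molecule exceeds."""
    flags = []
    if d.h_bond_donors > 5:
        flags.append("hydrogen-bond donors > 5")
    if d.h_bond_acceptors > 10:
        flags.append("hydrogen-bond acceptors > 10")
    if d.molecular_mass > 500:
        flags.append("relative molecular mass > 500")
    if d.clogp > 5:
        flags.append("cLog P > 5")
    return flags


# Hypothetical small molecule that passes all four criteria.
small = Descriptors(h_bond_donors=2, h_bond_acceptors=4,
                    molecular_mass=310.0, clogp=2.5)

# Placeholder molecule: only the mass (1047.5) comes from the text above.
large = Descriptors(h_bond_donors=2, h_bond_acceptors=4,
                    molecular_mass=1047.5, clogp=2.5)

print(ro5_flags(small))
print(ro5_flags(large))
```

A filter of this kind discards any molecule with a large mass regardless of its activity, which is how strict Ro5 compliance can exclude natural products and some successful synthetic drugs.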
Designing chemical libraries: the role of natural products in drug discovery
Originally, of course, all drugs were natural products, and even now natural products (or chemical moieties derived therefrom) continue to contribute to many useful and profit-making drugs. Notwithstanding, many drug companies have abandoned them. This makes little sense because they represent an exceptionally rich resource that occupies a distinct chemical space [188-204], and they continue to provide approximately half of all useful drugs [205-212]. The ability to detect novel and previously cryptic natural products, whether via pheromone activity and co-culture [214, 215], pharmacognosy, proteomics and metabolomics, or via (meta)genomics and genome mining, will increase greatly the utility of natural products in drug discovery. Their common role as iron chelators [221-224] makes them of special interest [26, 62, 225].
One reason given for the otherwise very odd loss of interest in natural products is that their high fraction of stereocentres often makes them difficult to manipulate chemically. Probably a more pertinent reason is that their failure to obey Lipinski's rules has led to the perception that they do not easily permeate cells. The facts of permeation speak otherwise, not least because, if they are active against intracellular targets as most are (and, in humans, are active orally, and thus must cross at least the gut epithelium), they must cross membranes easily enough. There remains a question as to how (Fig. 4).
The role of drug transporters
… what is certain today is that most molecules of physiological or pharmacological significance are transported into and out of cells by proteins rather than by a ‘passive’ solubility into the lipid bilayer and diffusion through it … 
Notwithstanding the above quotation (dating from 1999), it is widely assumed that drugs cross membranes according to their lipophilicity, via what little phospholipid bilayer sections of biological membranes may be uninfluenced by proteins (Fig. 4A). Actually, the evidence for this mode of transport is essentially non-existent (and, in truth, it is hard to acquire directly). There is an alternative view that we have reviewed extensively [151, 228-231], for which there is abundant evidence, as well as a number of recent reviews (e.g. from 2012 alone: [84, 232-267]); this is that transbilayer transport in vivo is negligible, and drugs cross biomembranes by hitchhiking on genetically encoded solute transporters that are normally involved in the intermediary metabolism of the host. In humans, there are more than 1000 of these, and a number of online databases exist [151, 250].
The evidence cited above comes in various flavours, although the most pertinent for our purposes are the many clear experimental examples that show precisely which genetically-encoded transporters are used to transport specific drugs. This is especially easily achieved, and can be made quantitative, when the drugs themselves are toxic (or can be added at toxic concentrations), as in yeast and trypanosomes [269-272]. It is important that the assays are at least semi-quantitative because binary (qualitative yes/no) assays that look for resistance when carriers are deleted may miss them. To emphasize once more, this is because multiple carriers can often transport each drug, and so the loss of just one is not normally going to confer ‘complete’ resistance. This redundancy probably underpins the widespread belief in ‘passive’ diffusion across membranes because ‘passive’ is often used erroneously as a synonym for ‘transporters that we do not happen to know about and that are in fact important’ [151, 231].
A flipside of this is illustrated by examples where there is clear evidence that the expression (profiles) of a subset of transporters substantially determines the efficacy of the drug in question. Gemcitabine, the best drug against pancreatic cancer, provides an excellent example because the drug is only efficacious when a suitable nucleoside transporter is well expressed in the target tissue [233, 245, 249, 252, 273-288].
Drug transporters: ‘barriers’, tissue and interspecies differences
As well as the historical change in an understanding of the mode of action of narcotics (‘general anaesthetics’), which went from entirely lipid-only views to one where the protein targets were identified and recognized [151, 231], there are at least three contrafactuals that those who believe in lipid-transport-only theories need to explain: (a) the fact that most drugs do not diffuse across the blood–brain barrier (and others) where the lipids are not significantly different [151, 231]; (b) the substantially varying tissue distributions [289-296]; and (c) the very large species differences in cellular drug uptake [297-299]. By contrast, the transporter-only view recognizes the possibility of varying degrees of tissue/individual/species enzyme distribution [289, 291, 293, 296, 300-305] and specificity, and their requirement for effecting transport provides a simple explanation for all these phenomena. In other words, the primacy of the need for transporters to effect drug transport into any cell at meaningful rates means that we need to seek to understand which drugs use which transporters. As noted above, if a drug can hitchhike on half a dozen transporters, a knockout of only one will tend to show little phenotypic effect, and thus careful quantitative methods may be necessary to discriminate which transporters are involved; in such cases, therefore, although the knowledge of the multiple transporters is interesting, it may not be that important to the function of getting drugs to intracellular targets.
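The consequence of transporter redundancy for knockout experiments can be illustrated with a deliberately simple sketch. It assumes that the transporters contribute additively and independently to uptake, with no compensatory changes in expression; the transporter names (T1 to T6) and their equal contributions are hypothetical.

```python
def residual_uptake_fraction(contributions, knocked_out):
    """Fraction of total uptake remaining after deleting the named transporters."""
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total uptake must be positive")
    remaining = sum(rate for name, rate in contributions.items()
                    if name not in knocked_out)
    return remaining / total


# Six hypothetical transporters, each carrying an equal share of the drug.
transporters = {f"T{i}": 1.0 for i in range(1, 7)}

single_knockout = residual_uptake_fraction(transporters, {"T1"})
full_knockout = residual_uptake_fraction(transporters, set(transporters))

print(round(single_knockout, 3), full_knockout)
```

Under these assumptions, deleting one of six carriers still leaves about five sixths of the uptake, a change that a yes/no resistance assay could easily miss but a quantitative one could detect.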
Overall, this recognition of the importance of drug transporters shows that the problem of understanding how drugs get into cells is not so much a problem of biophysics, but rather a problem of quantitative systems biology. What is meant by this is outlined below.
The need for quantitative biochemical network models
It is a commonplace in engineering that, if one aims to understand the system being designed, especially if it is complex, then it is necessary to have a parallel mathematical or computational in silico model of the artefact of interest. This has long been recognized in a few areas of biology (e.g. neurophysiology) [306, 307], although only more recently are we beginning to see human biochemical and physiological (and especially metabolic) network models of the type that we require [92, 308-319], both for the entire organism and for elements such as the liver, a liver cell or a macrophage [322, 323]. The development of these is best performed using crowd-sourcing or community-based methods [319, 324-326]. The great utility of such reconstructed networks [327-330] lies in areas such as: testing whether the model is accurate, in the sense that it reflects (or can be made to reflect) known experimental facts; analyzing the model to understand which parts of the system contribute most to some desired properties of interest; hypothesis generation and testing, allowing rapid analysis of the effects of manipulating experimental conditions in the model without having to perform complex and costly experiments (or to restrict the number that are performed); and testing what changes in the model would improve the consistency of its behaviour with experimental observations.
They also provide the necessary ground substance for inferencing modes of action of compounds with unknown or off-target effects (see below).
The metabolite-likeness of successful pharmaceutical drugs
Because we know the structures of successful, marketed drugs, it is possible to develop concepts such as drug-likeness [331-333] that capture the properties possessed by successfully marketed drugs. However, armed with the widely available metabolomics data indicating the metabolites that cells, tissues or body fluids typically possess [334-337], it is possible to investigate whether (because we consider that they must hitchhike on carriers used in intermediary metabolism) successful (i.e. marketed) drugs are more similar to human metabolites than to, say, the Ro5-compliant molecules typically found in drug discovery libraries. When such studies are performed, the answer is that most synthetic compounds in chemical databases are not metabolite-like, whereas successful drugs are indeed commonly metabolite-like [339-341]. This adds weight to the view that those seeking to discover new drugs should consider the metabolite-likeness of their molecules early in the discovery process, along with the question of which transporters they are likely to use. It also leads to the obvious recognition that it is important to incorporate into human metabolic network models the reaction steps that cover the metabolism of candidate and marketed pharmaceuticals (including their absorption, distribution and excretion).
Frequency encoding as part of biochemical signalling
Assays are an important part of the drug discovery process, although a simple binding or inhibition assay of a specific target (whether isolated or even when within a cell) does not clarify whether the inhibition serves any useful function. A particularly clear and interesting example comes from signalling pathways in which the signal is not based on amplitude (i.e. that might reasonably reflect an inhibition) but on frequency (that almost certainly will not, at least not directly). The transcription factor nuclear factor-kappa B (NF-κB) provides a good example.
Because a collection of nominally similar cells or unicellular organisms is not even close to being identical (thermodynamically, an ‘ensemble’), for fundamental statistical reasons, there is the question of how to correlate macroscopic measurements of metabolic or signalling molecules with phenotypic effects. In cases such as when the phenotype is the ability to replicate or divide, which is necessarily a single-cell property, one simply cannot make such a correlation, even in principle [342-344], and sometimes the variability of the expression profiles between single, axenic microbial cells of even single proteins is huge [345, 346].
Another specific case in which we cannot expect to relate the properties of collections of cells to a phenotype of interest is when they are not in a steady-state, and especially when they oscillate. This is exactly what happens in the NF-κB system. What we found, on comparing the behaviour of a mathematical model of the system [347, 348] (Fig. 5) with the behaviour of individual cells determined microscopically [349, 350], was that there is indeed a substantial oscillation in the distribution of NF-κB between the nucleus and the cytoplasm, and that this dynamic behaviour (rather than, say, a ‘static’ concentration of the NF-κB) can be related to changes in gene expression controlled by the transcription factor. More simply, macroscopic snapshots of the NF-κB concentration provide no information on the dynamics (and their heterogeneity), and it is the dynamics that is important: the protein signal is frequency-encoded [352, 353]. This phenomenon appears to be widespread, and also applies, for example, to p53-Mdm2 [354-359], ERK, Stat/Smad and elsewhere [362, 363]. Such studies indicate the need to study their interaction (and effects on biology) at as high a level of organization as possible, and certainly not solely by focussing on individual molecules. Analysis of cells (often called high-content screening) [364-381] is a start, although we need to return to ‘phenotypic’ screening at the level of the differentiated organism.
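Why population-level snapshots can hide single-cell dynamics is easy to demonstrate with a toy calculation. The sketch below is not the NF-κB model referred to above: it simply assumes that each cell shows a sinusoidal nuclear signal of identical amplitude and period, with phases spread evenly across the population. The period, cell number and function names are arbitrary assumptions.

```python
import math

PERIOD_MIN = 100.0  # arbitrary oscillation period in minutes (assumption)


def cell_signal(t, phase):
    """Nuclear signal of one cell at time t: oscillates between 0 and 2."""
    return 1.0 + math.cos(2.0 * math.pi * t / PERIOD_MIN + phase)


def population_mean(t, phases):
    """What a bulk (macroscopic) measurement would report at time t."""
    return sum(cell_signal(t, p) for p in phases) / len(phases)


N_CELLS = 50
# Cells are fully desynchronized: phases spread evenly around the cycle.
phases = [2.0 * math.pi * k / N_CELLS for k in range(N_CELLS)]

times = range(0, 100, 5)
single_cell = [cell_signal(t, phases[0]) for t in times]
snapshots = [population_mean(t, phases) for t in times]

print(min(single_cell), max(single_cell))
print(min(snapshots), max(snapshots))
```

In this idealized case each cell swings across the full range of the signal, whereas the population average stays essentially constant, so a bulk measurement would report no dynamics at all.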
Thus, we come full circle to the distinction made in Fig. 1. If we wish to discover new drugs that work effectively at the level of the organism, we need to move towards initial analyses that are conducted in differentiated organisms [382-394]. For financial and ethical reasons, this mainly means model organisms, with candidates including Saccharomyces cerevisiae [394, 395], Caenorhabditis elegans [154, 396-402], Drosophila melanogaster [403-405] and Danio rerio (zebrafish) [405-410]. (Because of the numbers of organisms involved, fragment-based discovery methods [172, 411-422] are preferable.) This will find us the effects, under circumstances where transporters are not a major issue, and will assess toxicity at once. What this will not necessarily clarify is the modes of action of the drugs; for this, appropriate analyses are needed, many of which can now be performed on a genome-wide scale [268, 269, 423, 424]. An important additional strategy is based on the use of inferencing methods.
Inferencing (parameters from measurement of variables)
In a typical biochemical network, the parameters are the topology of the network, the starting (or fixed) concentrations of enzymes, their kinetic properties (e.g. Km and Vmax) and the starting or ‘fixed’ concentrations of metabolites and effectors. pH and time are also usually treated as honorary parameters. The variables of the system are then the changes in metabolite concentrations or fluxes that occur when one of the parameters is changed (e.g. by adding a substrate or effector to the system). The issue (Fig. 6) is whether one can identify which parameters have changed by measuring changes in the variables alone (i.e. what effectors do is modify some of the parameters). The welcome answer is that one can [139, 425-435], although many of these problems are quite under-determined, and the numerical methods do not yet scale well. However, what this tells us is that the availability of candidate networks, together with series of ‘omics’ measurements of variables, does indeed allow the possibility of inferring the modes or molecular sites of action of polypharmacological agents when added to whole cells or organisms.
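The idea of inferring parameters from variables can be illustrated at the smallest possible scale: a single Henri–Michaelis–Menten step whose rates are measured before and after an effector is added. The sketch below is a toy under stated assumptions, not one of the inferencing methods cited above: the data are synthetic and noise-free, the fit is a brute-force least-squares grid search, and all names and values are hypothetical.

```python
def mm_rate(substrate, vmax, km):
    """Henri-Michaelis-Menten rate."""
    return vmax * substrate / (km + substrate)


def sum_squared_error(params, substrates, observed):
    """Misfit between the rates predicted by (vmax, km) and the observed rates."""
    vmax, km = params
    return sum((mm_rate(s, vmax, km) - v) ** 2
               for s, v in zip(substrates, observed))


def infer_parameters(substrates, observed, vmax_grid, km_grid):
    """Brute-force least-squares search over a grid of candidate parameters."""
    candidates = [(vmax, km) for vmax in vmax_grid for km in km_grid]
    return min(candidates,
               key=lambda p: sum_squared_error(p, substrates, observed))


SUBSTRATES = [0.5, 1.0, 2.0, 4.0, 8.0]
BASELINE = (10.0, 2.0)  # known (Vmax, Km) before the effector is added

# Synthetic "measurements" after adding an effector that lowers Vmax only.
observed = [mm_rate(s, 6.0, 2.0) for s in SUBSTRATES]

vmax_grid = [0.5 * i for i in range(2, 31)]  # 1.0 to 15.0
km_grid = [0.5 * i for i in range(1, 11)]    # 0.5 to 5.0

fitted = infer_parameters(SUBSTRATES, observed, vmax_grid, km_grid)
changed = [name for name, before, after
           in zip(("Vmax", "Km"), BASELINE, fitted) if before != after]

print(fitted, changed)
```

Here the measured variables (rates) are sufficient to recover which parameter the effector altered. In real networks the search space is vastly larger and the data are noisy, which is where the under-determination and scaling problems noted above arise.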
The present review has sought to identify a number of areas where we might beneficially look again at how useful medicines are discovered:
recognizing that the solution to failed target-first approaches that lead to attrition involves adopting function-first approaches
recognizing that this follows in part from the fact that very few diseases (and no complex ones) have a unitary cause, and thus poly-pharmacology approaches are required
recognizing the need for quantitative biochemical models that we can interrogate in silico and then validate
recognizing the major role of drug transporters in getting drugs to their sites of action (and stopping their accumulation at toxic levels)
recognizing that this involves a radical re-evaluation of the utility of the Ro5 as commonly used
recognizing that most transporters evolved and were selected to transport natural, endogenous metabolites, and that successful drugs are structurally ‘like’ metabolites
recognizing that this invites a major consideration of the benefits of natural products in drug discovery
recognizing that phenotypic screening is important, although establishing mechanisms and modes of action requires genome-wide analyses coupled with sophisticated inferencing methods.
Taking all these together will once again set us more securely on a path to successful drug discovery.