The use of mechanistic evidence in drug approval

Abstract The role of mechanistic evidence tends to be under‐appreciated in current evidence‐based medicine (EBM), which focusses on clinical studies, tending to restrict attention to randomized controlled studies (RCTs) when they are available. The EBM+ programme seeks to redress this imbalance, by suggesting methods for evaluating mechanistic studies alongside clinical studies. Drug approval is a problematic case for the view that mechanistic evidence should be taken into account, because RCTs are almost always available. Nevertheless, we argue that mechanistic evidence is central to all the key tasks in the drug approval process: in drug discovery and development; assessing pharmaceutical quality; devising dosage regimens; assessing efficacy, harms, external validity, and cost‐effectiveness; evaluating adherence; and extending product licences. We recommend that, when preparing for meetings in which any aspect of drug approval is to be discussed, mechanistic evidence should be systematically analysed and presented to the committee members alongside analyses of clinical studies.

clinical studies form only part of the evidence base. In particular, good quality evidence of mechanisms can be obtained from a wide variety of sources, not just clinical studies (Table 1). The EBM+ programme seeks to clarify the role of such evidence and make its evaluation more explicit, in the hope that considering these important forms of evidence in conjunction with the results of clinical studies can lead to further improvements in health outcomes. [2][3][4][5]
Drug approval is a "hard case" for the thesis that one should explicitly scrutinize evidence for mechanisms. This is because it is common EBM practice to hold that when randomized studies are available, as they almost always are in the case of drug approval, they should be considered in preference to, or even to the exclusion of, other kinds of evidence. As we shall argue, this is not always appropriate: randomized studies may not measure (or may be underpowered to measure) outcomes of interest, either beneficial or, especially, harmful, and in any case other evidence, eg, from mechanisms, can support or undermine the results of such studies. Either way, mechanistic evidence should always be taken into consideration.
This paper is structured as follows. First, in Section 2, we outline the ways in which evidence for mechanisms explicitly informs the drug approval process, through the phased approach to approval, including animal studies and human clinical studies. The roles of such evidence in these processes are well recognized. In other tasks related to drug approval, the roles of evidence for mechanisms are also crucial, but less well recognized and often implicit.
In the rest of the paper, we show that evidence for mechanisms is relevant to all the tasks that are important in drug approval: evaluating the efficacy of a drug; evaluating harms; evaluating the external validity of a claim about a study population; determining drug usage; extending the licence of a drug; evaluating the quality of a formulation; evaluating adherence; and evaluating cost effectiveness. Finally, in Section 12, we draw some conclusions.
Before proceeding, it will be useful to define the key concepts to which we appeal, to avoid ambiguity.
A complex-systems mechanism is a complex arrangement of entities and activities, organized in such a way as to be regularly or predictably responsible for the phenomenon to be explained. 6 An example of a complex-systems mechanism is the heart's mechanism for pumping blood. A mechanistic process consists of a spatio-temporal pathway along which certain features are propagated from the starting point to the end point. 7 An example of a mechanistic process is the process by which a signal is propagated from an artificial pacemaker to the heart.
We use the term mechanism to refer to either a complex-systems mechanism, or a mechanistic process, or some combination of the two. For example, the mechanism for pumping blood might be constituted by the complex-systems mechanism of an artificial pacemaker for producing a timing signal, the complex-systems mechanism of the heart itself, and the mechanistic process linking the two.
A clinical study for the claim that A is a cause of B repeatedly measures the values of a set of measured variables that includes A and B. In an experimental study, the measurements are made after an experimental intervention. If no intervention is performed, the study is an observational study.
A mechanistic study for the claim that A is a cause of B is a study that provides evidence of the details of the mechanism by which A is hypothesised to cause B. Note that a clinical study for the claim that A is a cause of C, where C is an intermediate variable on the mechanism from A to B, is also a mechanistic study for the claim that A is a cause of B, because it provides evidence of the details of the mechanism from A to B. A clinical study for the claim that A is a cause of B is not normally a mechanistic study for that claim, because, although it can provide indirect evidence that there exists some mechanism linking A and B, it does not normally provide evidence of the structure or features of that mechanism.
We emphasize here, as footnoted earlier, that evidence for mechanisms includes evidence of either the existence of a mechanism or evidence of the details of a mechanism. While mechanistic studies provide evidence of the details of a mechanism, clinical studies can provide evidence of the existence of a mechanism. Thus, high quality evidence for mechanisms can be obtained by a wide variety of means, as shown in Table 1. 3 A claim of effectiveness is a claim that a particular causal relationship holds in some target population of interest. A claim of efficacy is a claim that a particular causal relationship holds in some specific study population under particular controlled conditions. A claim of external validity (or applicability) is a claim that a particular causal relationship holds more widely than in a specific study population, controlled clinical setting, or experiment. Effectiveness is often established by establishing efficacy in a study population and then establishing external validity to a target population of patients.

| CLINICAL DRUG DISCOVERY AND DEVELOPMENT
In the drug discovery process, mechanistic evidence is widely acknowledged to be crucial. Contemporary drug discovery and development are exceedingly "target driven" (see section 7, Harms), starting with the characterization of a biological component that can serve as an intervention point to a disease mechanism, and proceeding to the design and synthesis or biological production of a compound able to interact with the target component. Once manufactured, a new compound needs to be evaluated for beneficial and adverse effects. This process is typically divided into phases (Table 2).
In the pre-clinical phase (phase zero), a compound is tested in animals to determine an appropriate dose for human trials and to characterize any major organ toxicity. In phase I, the compound is tested in healthy human volunteers, unless the drug is likely to have adverse effects that obviate this (eg, drugs used to treat cancers). In phase II, the compound is tested in a small number of patients affected with the targeted disease. Phase I to midway through phase II is focused on learning what doses of the drug are tolerated and how the drug affects major organ systems, and confirming that the drug is likely to be effective for the proposed indication, called "proof of concept" (typically: a randomized trial showing that drug A improves some interim measure C, a so-called biomarker, which is an indicator that the drug is likely to benefit the clinically relevant measure B). In phase III, the compound is tested in larger human trials in patients who have the disease. The latter part of phase II until the completion of phase III is focused on learning how best to use the drug in patients (determining appropriate dosing and learning how different patient characteristics influence dosing) and confirming that the drug is efficacious and sufficiently free of common harms in a sample of target patients.
Phase III trials, so-called pivotal trials, are conducted for regulatory approval and seek to confirm that the drug benefits patients on a clinically relevant outcome measure, which may be a biomarker or a direct measure of improvement of the disease. The design of these trials is informed by what has been learned throughout the drug's development. For instance, phase III trials will test the doses of the drug that have been identified in late phase II trials, and the selection of participants will be informed by what has been learned about the benefit to harm balance so far assessed (eg, patients with renal or hepatic impairment or those taking other medicines that may interact with the experimental treatment may be excluded from the trial).
A successful phase III trial is a basis for applying for a marketing authorization (the official term for the licence). Sometimes, approval is conditional on conducting further studies after approval, typically to monitor unexpected adverse effects or reactions.
Sheiner characterized clinical drug development as a series of "learn-confirm" cycles. 8 Learning components of clinical drug development seek to answer key questions about the drug and its actions on the body. Examples of these questions include: What are the mechanisms by which the drug enters the body, distributes throughout the body, and is cleared from the body? What doses of the drug are pharmacologically active? With what biological systems does the drug interact and how does it affect these systems? The answers to these questions are determined by establishing the complex-systems mechanisms and mechanistic processes at play. This knowledge then informs the design and interpretation of clinical studies, which seek to confirm that the drug is efficacious and sufficiently safe in the study population.
What is understood about the actions of the drug gets progressively more sophisticated, and the evidence regarding its clinical benefits more compelling, as it successfully progresses through clinical development. This is due to the interplay between the emergence of evidence for mechanisms and confirming that the drug benefits patients in rigorously designed randomized trials. While Sheiner focused on clinical drug development, it is important to note that this interplay does not stop at drug approval. The questions that consumers, clinicians, and regulators have about medicines go beyond the evidence provided by even the most compelling phase III randomized trials. Will the drug benefit a specific consumer given his or her characteristics? Is the drug effective in the kinds of patients who present to the clinic? Is the drug safe in patients with impaired renal or hepatic function, and what dose is appropriate? Are particular age groups at greater risk of adverse reactions? While partial answers to these questions will be provided by the studies conducted during the drug's development, it is important that both mechanistic evidence and evidence of clinical outcomes continue to evolve, especially given that clinical use of the drug will extend beyond the groups of patients represented in the original studies.
There is an increasing focus on appropriate evaluation throughout a drug's life-cycle. To do this well, to appropriately answer the questions of consumers, clinicians, and regulators, it is necessary to have reliable and relevant evidence for mechanisms and clinical outcomes.

| PHARMACEUTICAL QUALITY
Pharmaceutical tests are conducted to demonstrate that the drug, and its specific formulations, are sufficiently stable for clinical use. Key aspects that need to be determined are the rate at which the drug product loses potency and the identification and properties of any degradation products. The shelf life/expiry date of a drug product is determined by considering the rate at which potency is lost and the presence and toxicity of any degradation products. In the absence of toxic degradation products, the expiry date of a drug product is typically set such that the drug will retain greater than 90% of its labelled potency for the duration of its shelf life under recommended storage conditions.
Knowledge and evidence of mechanisms play a central role in ensuring and assessing the stability of drug formulations. Knowledge of the chemical characteristics of the drug informs the way that the drug will be formulated and stored. Key stability tests are undertaken on the medicinal product in the selected storage container when stored as recommended. For example, glyceryl trinitrate (nitroglycerin) is highly volatile, and tablets tend to lose potency over time. Glyceryl trinitrate tablets need to be stored in glass containers with a foil-lined cap, because loss of potency will be exacerbated if the tablets come into contact with plastic or other permeable packaging material; cotton wool, often included in drug containers, must not be packaged with glyceryl trinitrate tablets.
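The 90% potency criterion described above can be made concrete with a small calculation. The following is a minimal sketch only: it assumes that potency is lost by simple first-order kinetics, which the text does not state and which will not hold for every product, and the rate constant used is a hypothetical value chosen for illustration. Under that assumption, the latest admissible expiry time is the time at which the remaining fraction of labelled potency falls to 0.90.

```python
import math


def shelf_life_first_order(rate_constant: float, min_fraction: float = 0.90) -> float:
    """Time at which potency falls to `min_fraction` of the labelled value.

    Assumes first-order loss (an illustrative assumption, not taken from the text):
        fraction_remaining(t) = exp(-rate_constant * t)
    so  t = ln(1 / min_fraction) / rate_constant.
    The result is in the reciprocal of the units of `rate_constant`.
    """
    if rate_constant <= 0:
        raise ValueError("rate_constant must be positive")
    if not 0 < min_fraction < 1:
        raise ValueError("min_fraction must lie strictly between 0 and 1")
    return math.log(1.0 / min_fraction) / rate_constant


# Hypothetical degradation rate constant, per month, for illustration only.
k_per_month = 0.005
print(round(shelf_life_first_order(k_per_month), 1))  # 21.1 (months)
```

A faster loss of potency, such as might follow from unsuitable packaging, corresponds to a larger rate constant and hence a proportionally shorter shelf life in this sketch.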
The general approach to stability testing has evolved in response to advances in the understanding of mechanisms of degradation. This information guides decisions regarding appropriate storage and appropriate stability testing. This is especially important for biologic medicines, because of the concern that degradation products may cause an immune response. Another example of mechanisms of degradation informing appropriate stability testing is that of water-based drug products packaged in semipermeable containers. In addition to the routine stability tests outlined previously, these products also require tests to demonstrate that water loss under conditions of low relative humidity does not occur.

| PHARMACOKINETICS, PHARMACODYNAMICS, AND PHARMACOGENETICS
Currently, the mechanistic evidence that is always systematically evaluated in the drug approval process consists of studies of pharmacokinetics (PK) and pharmacodynamics (PD). Below, we briefly describe the roles of these, together with the closely related field of pharmacogenetics. Pharmacokinetics is the study of how a drug enters, distributes within, and clears the body. Pharmacodynamics is the study of how varying concentrations of the drug in the body produce therapeutic and adverse effects. Pharmacogenetics is the study of the genetic influences on drug pharmacokinetics and pharmacodynamics.
Together, these sciences provide insights into the complex-systems mechanism(s) that influence the concentration of the drug in the body and the relationship between the concentration of the drug and the drug's effects. Knowledge of a drug's pharmacokinetics, pharmacodynamics, and pharmacogenetics is rarely complete, but rather accumulates throughout drug development and subsequent clinical use. All three sciences have developed rapidly over the past two decades.
The clinical applicability of pharmacogenetics in particular is recent and is likely to play an increasingly significant role in clinical drug development, regulation, and clinical use. Concrete examples of the roles these sciences play in providing evidence for and from mechanisms for drug evaluation are provided below.

| DEVISING DOSAGE REGIMENS
The development of appropriate dosage recommendations provides an excellent example of the "learn-confirm" cycles that occur throughout clinical drug development and clinical use of the drug. Much work early in clinical drug development is focused on determining the drug's dose-response relationship (see Figure 1). In early-phase trials, this will be informed by the drug's pharmacology (pharmacodynamics), pharmacokinetic studies in healthy volunteers, and dose-ranging studies.
Dose-ranging studies seek to identify the smallest dose that produces a measurable effect on an outcome of interest (the "minimum effective dose") and the "maximum tolerated dose" (doses above which adverse effects occurred that required withdrawal of the drug in the majority of patients weight. This is because elimination of enoxaparin is influenced by renal function and metabolism, which tend to vary in predictable ways with body weight. The challenge, however, is that lean body weight is a better predictor of the clearance of enoxaparin than total body weight. The distinction is unimportant in the leaner patients that are often enrolled in clinical trials, but critical in the broader range of patients treated in routine care. 11,12 Dosing an obese patient using total body weight rather than lean body weight puts the patient at risk of toxicity. Understanding the mechanisms of enoxaparin's elimination informs appropriate dosing; a drug that is distributed throughout the body differently or eliminated differently will require a different approach.

| EFFICACY
For approval, a drug must be shown to be efficacious in patients with the targeted disease or condition. Phase III (pivotal) trials are meant to demonstrate this sufficiently well to merit licensing. Demonstrating efficacy requires showing that the treatment is correlated with improvement in the condition, and that any observed difference between the treatment and control groups is attributable to the treatment. The latter requires sufficient evidence for ruling out explanations of the correlation in terms of chance, bias, or confounding, so that the only remaining explanation is that there is a mechanism linking the intervention and the outcome that shows how the former is at least partly responsible for the latter. An ideally conducted trial would provide this evidence directly: if a sufficiently large correlation were observed in a perfectly randomized, perfectly representative, sufficiently large trial, that would provide very strong evidence that the correlation is causal, ie, that there is some mechanism of action that gives rise to the correlation. In practice, however, studies tend to be imperfect in various respects and so less conclusive. In such cases, it can be useful to consider the evidence in favour of the hypothesised mechanism of action. A well-established mechanism of action can support the efficacy claim, while a hypothesised mechanism that has little evidence or contrary evidence (ie, lack of biological plausibility) can undermine the efficacy claim.
At present, mechanistic considerations tend to be treated rather unsystematically at drug approval meetings. Often "the evidence" is taken to consist of reports of phase III trials, which are selected and analysed in detail in advance of the approval meeting, and subjected to further scrutiny at the meeting. On the other hand, discussion of mechanisms occurs principally at the meeting itself, mediated through the opinions of the experts and without its role or relevance being clear to all participants. The fact that evidence for mechanisms is part of the evidence base and can be crucial to evaluating efficacy is not widely recognized. However, such evidence can be analysed as systematically as evidence from phase III trials. 13 One obvious example of the crucial role for evidence for mechanisms in judgements of efficacy occurs when determining biosimilarity.
A biosimilar is a biological medicine that is very similar to another biological medicine that has already been approved for use. Often the burden of proof in phase III trials is much lower for biosimilar drugs than for other drugs. Instead, those assessing biosimilarity rely more on evidence of similarity of mechanism of action, particularly evidence of similarity of structure and function. 14 For example, Terrosa, a treatment for osteoporosis with active ingredient teriparatide, was approved by the European Medicines Agency (EMA) without a major new study, on the grounds of biosimilarity with Forsteo, a different formulation of teriparatide. 15,16 While biosimilarity refers to similarity of complex biological mole- Evidence for mechanisms also informs judgements of efficacy when evaluating the design of clinical studies and the appropriateness of the inferences drawn from their results. Determining whether a study design is of high quality and is based on sound science requires evidence for mechanisms, notably when assessing the diagnostic categories used in a study, whether the length of the trial was appropriate to demonstrate efficacy, and whether all plausible confounders were controlled for. 3 When clinical studies are found to be defective, evidence for mechanisms may be used as grounds to motivate requests for new studies.
As an example of the use of evidence for mechanisms to assess Quetiapine is an antipsychotic drug that was originally developed for the management of schizophrenia but has also been licensed for use in major depressive episodes in bipolar disorder (as "add-on" or  , and enzymes (eg, pancreatic enzymes). In some cases, the therapeutic target is not known but must exist; for example, the therapeutic target for lithium is not known, although the enzyme inositol-1-phosphatase, which it inhibits, is a strong candidate.
Adverse effects of drugs are also produced by actions on targets.
In some cases, the target is the same as that by which the beneficial effect is produced; such effects are called "on-target effects". However, most adverse effects are produced by actions on targets other than those that produce benefit; these are called "off-target effects".
The principles are illustrated in relation to the dose-related classification of adverse drug reactions (Figure 1).
In Figure 1, each curve is a theoretical dose-response (concentration-effect) curve. Adverse drug reactions follow three patterns in relation to the dose-responsiveness of the beneficial effect (in green) 27:
• hypersusceptibility reactions (blue), in which the reactions occur at doses or concentrations lower than those associated with benefit;
• collateral reactions (orange), in which the reactions occur at doses or concentrations in the same range as those associated with benefit;
• toxic reactions (red), in which the reactions occur at doses or concentrations higher than those associated with benefit, either through the same mechanism (solid line) or some other mechanism (dotted line).
The solid lines show on-target effects, the dotted lines off-target effects.
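The three dose-related patterns can be expressed as a simple decision rule. The sketch below is illustrative only: the function name, the idea of summarizing benefit by a single lower and upper dose bound, and the example doses are all assumptions introduced here, whereas real dose-response curves overlap and are not sharply bounded.

```python
def classify_reaction(reaction_dose: float, benefit_low: float, benefit_high: float) -> str:
    """Classify an adverse reaction by where it occurs relative to the beneficial dose range.

    `benefit_low` and `benefit_high` are hypothetical bounds on the doses
    (or concentrations) associated with benefit.
    """
    if benefit_low > benefit_high:
        raise ValueError("benefit_low must not exceed benefit_high")
    if reaction_dose < benefit_low:
        # Occurs below the doses associated with benefit.
        return "hypersusceptibility"
    if reaction_dose <= benefit_high:
        # Occurs in the same range as the doses associated with benefit.
        return "collateral"
    # Occurs above the doses associated with benefit.
    return "toxic"


# Hypothetical beneficial range of 10 to 50 (arbitrary dose units).
for dose in (2.0, 30.0, 120.0):
    print(dose, classify_reaction(dose, benefit_low=10.0, benefit_high=50.0))
```

The rule captures only the dose relationship. Whether a reaction is on-target or off-target is a separate question about mechanism, which the dose alone does not answer.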
Apart from adverse reactions that occur through exaggeration of the target effect (ie, some toxic reactions; red solid line in Figure 1 by the CHMP as of July 2017. 31 The CHM raised concerns that abaloparatide, which is intended for use in older women, had been tested only in healthy women, and raised concerns about the most frail patients. Grounds for concern included the fact that half of all patients in a trial developed anti-abaloparatide antibodies and that abaloparatide injection led to a marked increase in heart rate.
Sofosbuvir/velpatasvir is a combination therapy for hepatitis C.
The CHM considered this treatment on 24 March 2016 but raised a safety concern on the grounds of extrapolation: velpatasvir has been found to cause serious teratogenicity across three species (mouse, rat, and rabbit), and this robustness across species was thought to provide significant evidence of a possible teratogenic effect in humans.
Robustness of effect is important evidence of similarity of mechanism.

| COST EFFECTIVENESS
If the benefit to harm balance of a medication is acceptable, and the manufacturer receives a licence to market it, there remains the question of whether a health care system can afford to use it to treat members of its population, given that health care budgets are limited.
One way of deciding this is to calculate the cost-effectiveness of the medication, ie, whether the effect it offers gives good value for money. The usual method for doing this is to calculate the overall cost of using the medication and divide it by a measure of the quality of life that is gained by using it. The quality of life is assessed by a measurement called the quality adjusted life year or QALY. Each year of life is weighted by its quality, where a weight of 1 implies perfect health and a weight of 0 implies no health at all (ie, death), so that one QALY corresponds to one year lived in perfect health. QALYs are typically measured using instruments that elicit patients' answers to questions about their health. For example, one such instrument, the EQ5D, asks how problematic the individual finds mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. The difference between the QALYs before and after treatment, the QALY gain, is divided into the cost, giving an incremental cost-effectiveness ratio (ICER). In the UK, if an intervention has an ICER at or below the range of £20 000 to £30 000 per QALY gained, it is generally considered to be cost-effective and can be recommended for funding by the health care system.
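The arithmetic just described can be sketched in a few lines. This is a minimal illustration that follows the simplified description in the text (cost divided by QALY gain). The function names, the example cost and QALY figures, and the choice of the upper end of the quoted range as a single cut-off are assumptions made here for illustration and do not represent any appraisal body's actual procedure.

```python
def icer(cost: float, qaly_gain: float) -> float:
    """Cost per QALY gained: the cost divided by the QALY gain."""
    if qaly_gain <= 0:
        raise ValueError("QALY gain must be positive for the ratio to be meaningful")
    return cost / qaly_gain


def within_threshold(icer_value: float, upper_threshold: float = 30_000.0) -> bool:
    """True if the ICER does not exceed the chosen threshold (GBP per QALY).

    The default is the upper end of the range quoted in the text, used here
    as a single hypothetical cut-off.
    """
    return icer_value <= upper_threshold


# Hypothetical figures for illustration only.
example_cost = 12_000.0   # GBP
example_qaly_gain = 0.5   # QALYs gained
example_icer = icer(example_cost, example_qaly_gain)
print(example_icer, within_threshold(example_icer))  # 24000.0 True
```

Because the QALY gain is the denominator, any doubt about the size of the clinical effect, including doubt arising from a poorly supported mechanism of action, feeds directly into the ratio.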
Mechanisms are often not discussed by committees charged with determining the cost-effectiveness of therapeutic interventions, but understanding mechanisms can influence decisions in various ways.
In constructing pharmacoeconomic models that relate clinical outcomes to costs, it may be helpful to include mechanistic considerations. For example, in a multiple comparison of different types of antihypertensive drugs a decision will have to be made about whether to compare drugs with different mechanisms of action (eg, betablockers, diuretics, calcium channel blockers) and whether, within a pharmacological class, to include compounds with variable actions (eg, in the class of beta-blockers whether to compare full antagonists with partial agonists). 32 The cost-effectiveness of rituximab has been studied using a mechanism-based pharmacoeconomic model that included population pharmacokinetics and pharmacodynamics, linking serum rituximab concentrations to progression-free survival, simulating the effectiveness of rituximab in various clinical contexts. 33 These mechanisms served as inputs to economic models of follicular lymphoma, based on NICE appraisals.
If an intervention is claimed to be efficacious but the proposed mechanism of action is not biologically plausible, or is biologically implausible, or if there is no well attested mechanism, the claim of efficacy may be vitiated and the size of the QALY gain put in doubt.
In some cases, conflicting analyses can be informed by an appeal to mechanisms. For example, in an indirect comparison of two medications that both increased platelet counts in children with idiopathic thrombocytopenia, an analysis by the manufacturer of one of the medications suggested that there was no significant difference between the two compounds, while an independent analysis suggested otherwise. 34 The fact that the two treatments had different actions on the thrombopoietin receptor mediating platelet synthesis suggested that there was likely to be a difference, supporting the results of independent analysis. Although the data were too poor for a firm conclusion to be made about the size of the difference, this mechanistic argument, when taken with other considerations, helped the appraisal committee to reach a decision.

| ADHERENCE
The reasons people seek treatments, the reasons they adhere to the treatments offered, and the interaction between help seeking and subsequent adherence to treatment have been extensively investigated over many decades. 35
Because the functional defect (loss of regulation of ion and water transport) is known, and the mechanisms responsible for it are fairly well characterized, in vitro assays demonstrating that cells regain function in the presence of a drug are expected to provide a good biomarker of clinical success. Laboratory evidence of this effect in different CFTR mutant cells, together with trial evidence for previously approved indications, allowed the FDA to conclude that the drug will work in several cystic fibrosis genotypes not tested in clinical trials. 48 Such use of mechanistic evidence requires more than considering the biological plausibility of a treatment. Rather, one must explicitly evaluate the evidence that speaks to the operation of the mechanism, and the evidence must be of good quality.

| DISCUSSION AND RECOMMENDATIONS
Evidence-based medicine seeks to make evidence explicit and to develop explicit methods for evaluating it. In practice, present-day EBM focuses almost exclusively on clinical studies: it treats mechanistic evidence that arises from other sources as irrelevant or peripheral.
But mechanistic evidence is neither of those things: we have argued that evidence for mechanisms, ie, evidence that mechanisms exist and how they operate, is central to drug approval, because it informs the drug approval process in a wide variety of ways. We believe that the drug approval process would benefit from explicitly including mechanistic evidence as part of the assessment of manufacturers' applications for licences and in postmarketing surveillance, so that it can be appropriately scrutinized and, if need be, challenged. The