Address correspondence to Jacqueline A. French M.D., NYU Comprehensive Epilepsy Center, 223 East 34th Street, New York, NY 10016, U.S.A. E-mail: Jacqueline.firstname.lastname@example.org
Purpose: Monotherapy approvals have been difficult to obtain from the U.S. Food and Drug Administration (FDA), and have almost all been achieved using a trial design entitled “withdrawal to monotherapy” in treatment-resistant patients, which employs a so-called “pseudo-placebo” as a comparator arm. The authors submitted a white paper to the FDA advocating use of a virtual placebo historical control as an alternative to pseudo-placebo. Such an approach reduces patient risk that would result from exposure to pseudo-placebo. In this article, we present the data submitted to the FDA to justify a historical control.
Methods: We analyzed individual patient data from eight previously completed withdrawal to monotherapy studies, which we determined had similar design. All studies employed percent meeting predetermined exit criteria (denoting worsening of seizure control) as the outcome measure. Kaplan-Meier estimates of the percent exiting were calculated at 112 days.
Results: The percent meeting exit criteria were uniformly high, ranging from 74.9–95.9%. The eight studies appear to meet the criteria set forth for use of historical control. The estimate of the combined percent exit based on the noniterative mixed-effects model is 85.1%, with a lower bound of the 95% prediction interval of 65.3%, and 72.2% for an 80% prediction interval.
Conclusion: There is justification for proposing that these data can serve as a historical control for future monotherapy studies, obviating the need for a placebo/pseudo-placebo arm in trials intended to demonstrate the efficacy of approved drugs as monotherapy in treatment-resistant patients.
Of the nine new antiepileptic drugs (AEDs) approved for use in the United States over the last two decades, only four are approved for use in monotherapy (oxcarbazepine, felbamate, lamotrigine, and topiramate), and only two for initial monotherapy (oxcarbazepine and topiramate).
The difficulty in obtaining U.S. monotherapy approval for AEDs arises because the U.S. Food and Drug Administration (FDA) prefers clinical trials that contain an internal, interpretable control group, that is, one in which the test drug shows superiority to the control. This preference accords with the International Conference on Harmonization (ICH), a conference intended to provide a single regulatory path across the world for drug approval. The American Epilepsy Society (AES), the American Academy of Neurology (AAN), and the National Institute of Neurological Disorders and Stroke (NINDS) held a joint conference in March 2001 to discuss the approach to monotherapy approvals and to investigate new methodologies that might overcome regulatory barriers. They identified a number of issues:
(1) New AEDs, while providing important improvements in safety and efficacy, rarely demonstrate superiority in efficacy when compared to “standard” AEDs such as carbamazepine and phenytoin. To date, after many clinical trials, superiority to a typical therapeutic dose of an approved AED has never been demonstrated.
(2) Due to inability to demonstrate superiority to an active control, the only acceptable internal control consists of either placebo or a treatment that is intentionally less effective than the “standard” (also sometimes called a “pseudo-placebo”).
(3) This raises ethical concerns about patient safety, as there are documented cases of harm in patients randomized to subtherapeutic levels of active drug.
Most FDA approvals for monotherapy have been gained through a clinical trial design known as the “pseudo-placebo withdrawal to monotherapy study,” which assigns treatment-resistant patients to receive study drug or a suboptimal maintenance dose of a safe and effective active drug (e.g., valproic acid 1,000 mg/day, or low-dose of the study drug) (the so-called “pseudo-placebo”) (Pledger & Kramer, 1991). Once patients are randomized and titrated to the intended dose, they undergo a “withdrawal phase” when background AEDs are removed over a specified time frame, followed by a monotherapy phase, typically 2 months in duration (Fig. 1). The trial continues until either all phases are completed or patients reach prespecified endpoints.
The endpoints are described as “therapeutic failure” or “escape criteria.” They include clinically significant events such as a doubling of average monthly seizure rate; doubling of the highest 2-day consecutive seizure rate; emergence of new, more severe seizure type; or a clinically significant prolongation of generalized tonic–clonic seizures. Patients may escape either during the withdrawal phase or the monotherapy phase of the trial. Outcome is determined either by time to escape or percent of patients escaping in either arm. Many of these trials have been successfully completed, and some have led to FDA monotherapy indications (Sachdeo et al., 1992; Faught et al., 1993; Beydoun et al., 1997; Sachdeo et al., 1997; Gilliam et al., 1998; Beydoun et al. 2000; Sachdeo et al., 2001).
These trials are valid and interpretable, but there is concern as to whether they expose patients to excessive risk. Some of the pseudo-placebo doses that have been used, such as 1,000 mg/day of valproic acid or 100 mg/day of topiramate, would be effective doses in some patients, and the initial intent was to provide some protection from dangerous seizures. However, some doses used as pseudo-placebos (e.g., 600 mg of gabapentin, 300 mg of oxcarbazepine) have not been demonstrated to be effective doses in randomized controlled trials, and are generally lower than doses used even as initial monotherapy. Epilepsy is different from other conditions where placebo may be utilized (such as hypertension), since there is no way to exit a patient at a “warning” stage, before harm can occur. In a hypertensive patient, one can withdraw a patient on placebo if blood pressure increases, before it gets dangerously high. In the withdrawal to monotherapy design, the escape criteria consist of an increase or worsening of seizures, which by itself can be considered an adverse outcome. Because this design categorizes certain adverse events as escape criteria, finding them in a study report or publication may be difficult. Escape criteria are not considered “adverse events” unless the patient requires hospitalization, in which case they become serious adverse events (SAEs). For example, with topiramate 100 mg as the control, secondarily generalized tonic–clonic seizures occurred in 10% of patients who had not had them during the 2 years prior to study entry (Sachdeo et al., 1997).
The types of SAEs observed in withdrawal to monotherapy studies are indicative of the potential safety issues related to this study design. In the lamotrigine study, there were five SAEs in each group (500 mg lamotrigine and 1,000 mg valproate); however, the types of SAEs were very different. Patients in the lamotrigine group experienced chest pain, pneumonia, rash, Stevens-Johnson syndrome, and suicidal ideation, adverse events that one might expect from active treatment. Patients in the pseudo-placebo group experienced exacerbation of convulsions, increased seizures, paranoid ideation, status epilepticus, and sudden unexplained death (Gilliam et al., 1998). These adverse outcomes are potentially a direct result of trial design, not study treatment, and this must be taken into account when evaluating the ethics of these studies.
Alternative trial designs have been suggested, including noninferiority trials that compare one active treatment to another. These trials do not meet the standard for an interpretable control as defined by the ICH guidelines and the FDA, as there is not “historical evidence of sensitivity to drug effect,” that is, that similarly designed trials have “regularly distinguished effective treatments from less effective or ineffective treatments.” The ICH guidelines state unequivocally that “without this determination, demonstration of efficacy from showing of noninferiority is not possible and should not be attempted.” Because this criterion has not yet been demonstrated for any AED, the FDA would preclude use of a standard AED as an active control in a noninferiority design, although this method is currently used by European regulators for monotherapy approval.
In the absence of a clear option for an internal interpretable control, the FDA will consider the use of historical control as a viable alternative to a randomized comparison (Katz, 2006). Historical control can be defined as the expected behavior of a group of patients, based on their behavior as it has been reproducibly demonstrated in previous trials. Therefore, after several placebo-controlled trials, it may be possible to “predict” how patients treated with placebo might behave, without actually exposing them to the risk of placebo. The case for using a historical control, considered a weaker control than an internal control, is stronger when alternative trial designs are not feasible or ethically acceptable. We submitted a white paper to the FDA, which described the available pseudo-placebo group data from outpatient withdrawal to monotherapy studies and determined that there was a study population with predictable results that could be used as a historical control for future U.S. monotherapy studies. The FDA has, in principle, accepted this approach. At present, this essentially serves as the only FDA accepted path to monotherapy approval that is also considered acceptable by the epilepsy community. A number of studies are underway based on a historical control approach and are listed at http://www.clintrials.gov as historical control withdrawal to monotherapy studies. We now present the analysis that serves as the basis for the studies.
To create a valid historical control, it was essential to obtain all similarly designed trials that have been performed. The search included a review of the literature, using the terms controlled clinical trials, epilepsy, and monotherapy, for trials in which antiepileptic drugs were withdrawn leading to monotherapy; questioning of colleagues with extensive involvement in the performance of clinical trials to determine other trials performed with this methodology; and direct inquiry to companies with drugs in development to identify trials performed but not published.
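Table 1. Trial comparison summary (only the pseudo-placebo column, two cells from another column, and two footnotes are recoverable).
Pseudo-placebo comparators, in table order: 600 mg gabapentin; 1,000 mg valproate; 100 mg topiramate; 10 μg/ml felbamate; 300 mg oxcarbazepine; 300 mg oxcarbazepine; 15 mg/kg valproate; 15 mg/kg valproate; 6 mg tiagabine; 300 mg rufinamide.
Cells from another column: 1 (PHT or CBZ); 1 (OXC 2,400).
nUnpublished data courtesy of Abbott Laboratories.
oUnpublished data courtesy of Eisai.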
Once all trials were identified, each company was asked to provide primary individual patient data from the pseudo-placebo arm only of the trial in question. These data included patient number, age, race, sex, use of carbamazepine (CBZ) upon entry into the trial, time to escape duration, and censoring time, that is, the time until observation ended for reasons other than escape. All of the companies ultimately complied with this request, although different amounts of information were made available from different companies. Further information was obtained from study protocols where available, information available in the FDA Summary Basis of Approval (SBA), and study publication, when available.
Of primary importance in combining these trials as a possible historical control is the consistency of the design, particularly in relation to the population that enters the trial, and the escape criteria. We thus decided to exclude two trials that had substantial differences from the other trials (Appendix S2).
Trial 10 differed from the other trials in escape criteria. Criteria for this trial were (1) a doubling of baseline 2-day seizure frequency, (2) a generalized tonic–clonic seizure if it never occurred before, and (3) two generalized tonic–clonic seizures if none occurred during baseline. Almost the entire refractory partial seizure patient population has had at least one generalized tonic–clonic convulsion at some point in their history; therefore, escape criterion 2 would have applied only to a small subset of patients enrolled in the trial, leaving essentially only two escape criteria for the majority of patients. Because individual seizure counts were not available for patients in this trial, it is impossible to determine whether results would have been the same if the standard escape criteria had been used.
Trial 9 differed with regard to selected entry criteria. Patients could not have failed more than one AED and could have as few as one seizure per month at screen.
These criteria were different from those of the other trials, which enrolled patients with at least 2–4 seizures per month. Moreover, patients were eligible for the other trials only if they had failed at least one AED, whereas patients were eligible for trial 9 if they failed at most one AED. Thus trial 9 included less refractory patients.
Enrollment criteria (eight included trials)
Enrollment and patient characteristic data were available for the majority of the included trials. For the lone unpublished included trial (trial 4), only the primary data on exit, the escape criteria, and the number of patients enrolled were available; no information on background demographics was available. All trials enrolled patients with localization-related epilepsy.
Specific enrollment criteria for each trial and common escape criteria are provided in Appendix S3. Of note, all trials with available information listed similar exclusion criteria, including history of status epilepticus or progressive neurologic disease.
Demographics and baseline characteristics (included trials)
The randomized patient population in the “pseudo-placebo” groups appears similar across the clinical trials. As seen in Table 2, the mean patient age ranged between 35 and 38 years, and the median baseline seizure frequency ranged between 5.5 and 10.0 seizures per 4 weeks.
Table 2 (demographics and baseline characteristics; table body not recoverable apart from one cell, "64 (average of three groups)").
CBZ, carbamazepine; CPS, complex partial seizure; NA, not available.
aThe number of patients in pseudo-placebo arm.
bBaseline seizure frequency per 4 weeks.
cData not available.
dIn both arms combined.
To establish a historical control, it was imperative to establish that the trials had similar characteristics. It is, therefore, essential to ascertain that they had similar designs and escape criteria, and that they randomized patients with similar demographic characteristics. Study design was available for all included trials. The study design that was predominantly used is depicted in Fig. 1.
Trials are outlined in Table 1, including the nature of the pseudo-placebo used. Overall, the study designs were similar and used a “pseudo-placebo” comparator consisting of a low dose of the study drug or 15 mg/kg of valproic acid. All the trials had a randomized, double-blind, parallel-group design with a baseline phase followed by a double-blind phase divided into a withdrawal phase and a monotherapy phase. In all but one of the trials, patients were randomized prior to withdrawal of the baseline AED (Fig. 1). One trial (trial 5, Appendix S1) randomized patients after AED withdrawal. The withdrawal phase ranged from 4 to 10 weeks, and the monotherapy phase ranged from 10 to 16 weeks across the trials. All trials allowed either one or up to two baseline AEDs. In trial 5, the baseline AED had to be CBZ. For the six trials that allowed two AEDs at baseline, five required that one of the baseline AEDs be taken at <50% of the minimum recommended dose or that the serum concentration be <50% of the lower end of the reference range for the serum level. The minimum required seizure frequency during the baseline phase ranged between 2 and 4 seizures per 4 weeks in all trials, with the exception of trial 9, which required 1–4 seizures per 4 weeks. Appendix S3 contains the inclusion and exclusion criteria by trial. Table 1 shows the trial comparison summary. The primary endpoint was the same for all trials: percent of patients exiting the trial. Patients exited the trial if they experienced seizure worsening as identified by predetermined escape criteria (shown in Appendix S4). These criteria were uniform across the trials and can be grouped into four main types:
(1) A twofold increase in partial seizure frequency in any 28-day period compared to baseline. (All trials, although trial 1 used the highest four consecutive weeks in baseline and the others used the average frequency in baseline.)
(2) A twofold increase in the highest consecutive 2-day seizure frequency that occurred during the baseline phase. (All trials, although trial 2 required three seizures if the highest daily count in baseline was one seizure, and trial 8 did not use this criterion if the highest daily count was one seizure.)
(3) Occurrence of a single generalized tonic–clonic seizure if none had occurred in the previous 6 months (trial 6, trial 9), within 2 years of study entry (trial 1), during baseline (trials 3, 5, 7, 8), and “emergence of a more severe seizure type” (which would include generalized tonic–clonic seizure) (trial 2).
(4) A prolongation or worsening of seizure duration or frequency considered by the investigator to require intervention for all trials (although trials 2, 5, 7, and 9 require the worsening seizures to be generalized tonic–clonic) or episode of serial seizures/status epilepticus for trials 3, 7, 8, and an episode of status epilepticus for trial 1.
For seven of the eight trials included in the analysis, individual data on time to exit were made available. Kaplan-Meier estimates of the percent exiting were calculated as a function of time, together with the asymptotic standard error of the estimates (Kaplan & Meier, 1958) (Fig. 2). To use a consistent time frame, exit rates were calculated from the start of withdrawal of the background AED (or start of drug taper for trial 5) to 112 days, since the study lengths varied from 112 to 188 days. For trial 7, only the number of participants who entered the pseudo-placebo arm and the number meeting the escape criteria were available. For this trial, the percent exiting was approximated by the number exiting divided by the number entering. This approximation is conservative: it gives the lowest estimate of the percent exiting that any ordering of escapes and of stopping observation for other reasons could produce. In addition, all patients exited in trial 5, which leaves no variability from which to estimate a standard error. To provide an estimate of the percent exit and associated standard error for that trial, two successes (i.e., completed without meeting escape criteria) and two failures were added to the total (for a full explanation, see Agresti & Coull, 1998).
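The two per-trial calculations described above can be illustrated with a minimal Python sketch. It assumes that the "asymptotic standard error" is Greenwood's formula, which the text does not state explicitly, and the function names and the patient data in the usage note are hypothetical, not taken from any of the trials.

```python
import math


def km_percent_exit(times, exited, horizon=112):
    """Kaplan-Meier estimate of the fraction exiting by `horizon` days.

    times  : days from start of withdrawal to exit or end of observation
    exited : True if the patient met an escape criterion at that time,
             False if observation ended for another reason (censored)
    Returns (fraction_exiting, standard_error); the standard error uses
    Greenwood's formula (an assumption of this sketch).
    """
    at_risk = len(times)
    surv = 1.0        # estimated fraction still in the trial
    greenwood = 0.0   # running Greenwood sum
    for t in sorted(set(times)):
        if t > horizon:
            break
        leaving = [e for tt, e in zip(times, exited) if tt == t]
        d = sum(leaving)  # number of exits at time t
        if d:
            surv *= 1 - d / at_risk
            if at_risk > d:  # term is undefined when everyone at risk exits
                greenwood += d / (at_risk * (at_risk - d))
        at_risk -= len(leaving)  # exits and censored patients leave the risk set
    return 1 - surv, surv * math.sqrt(greenwood)


def plus_four_exit(n_exit, n_total):
    """Proportion exiting after adding two successes and two failures."""
    p = (n_exit + 2) / (n_total + 4)
    return p, math.sqrt(p * (1 - p) / (n_total + 4))
```

With hypothetical data, `km_percent_exit([10, 20, 30, 150], [True, False, True, True])` gives an exit fraction of 0.625 by day 112. When every patient exits, as in trial 5, the Kaplan-Meier standard error collapses to zero, which is why the adjusted proportion from `plus_four_exit` is used instead.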
The possible effect of potentially important variables (e.g., age, gender, race, withdrawal from CBZ) was examined by fitting a Cox model (Cox, 1972) stratified by trial with the possible explanatory variables as covariates. An unstratified Cox model with trial as covariate was used to examine effect of trial and randomization before or after baseline medication withdrawal. Only trials with individual data on the variable of interest were included.
After answering the question of whether these control groups form a reasonable set of historical controls for future trials, one is left with the question of how, in practice, one decides if a new drug improves upon the historical standard generated by these trials. Although there is growing statistical literature on preservation of effect (i.e., requirement that the treatment under investigation demonstrates efficacy greater than a specified fraction, such as 80% or 90% of the efficacy of the control) in active-controlled trials (commonly referred to as “delta”), no guidance exists from the statistical literature on how to gauge the success of a new treatment against an aggregated database of pseudo-placebo studies. Three possibilities were examined:
(1) A very conservative approach is to take the lowest of the lower bounds from the 95% confidence intervals from the control series of the different trials (calculated as described in preceding text) and declare a new drug superior only if the upper bound of the confidence interval on the percent exiting with the new drug is smaller than the lowest lower bound. Note that this approach has the undesirable property that one is penalized for every additional trial in the set of control series—the minimum of the lower bounds can only decrease with the addition of more trials.
(2) A liberal approach is to combine the data from all the control groups into one series and set the bar as the lower bound of the 95% confidence interval based on the combined series. An overall estimate of the percent exiting by 112 days and its standard error was obtained by a Kaplan-Meier estimate based on the data from all studies that provided individual data. This ignores any variability between trials.
(3) A more reasonable approach in the spirit of ICH E10 defines success based on the percent exiting from the different control groups and on how consistent that percent is across the different studies. To account for interstudy variability, a noniterative random-effects approach (DerSimonian & Laird, 1986) was used to determine the combined percent exit rate and standard error. As a sensitivity analysis, a mixed-effects model with study as a random effect was used to compute restricted maximum likelihood (REML) estimates for the combined percent exit rate and standard error (Normand, 1999).
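The noniterative random-effects combination named in approach (3) can be sketched as follows. This is a generic DerSimonian-Laird calculation on per-trial estimates and standard errors; it assumes pooling on the raw proportion scale, which the text does not specify, and the function name and inputs are illustrative only.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Noniterative random-effects pooling (DerSimonian & Laird, 1986).

    estimates  : per-trial fraction exiting
    std_errors : per-trial standard errors of those fractions
    Returns (pooled_estimate, pooled_standard_error, tau2), where tau2 is
    the estimated between-trial variance.
    """
    k = len(estimates)
    w = [1 / se ** 2 for se in std_errors]  # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q measures how much the trials disagree beyond sampling error
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # method-of-moments estimate, floored at 0
    w_star = [1 / (se ** 2 + tau2) for se in std_errors]  # random-effects weights
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sw_star
    return pooled, math.sqrt(1 / sw_star), tau2
```

For two hypothetical trials with exit fractions 0.70 and 0.90 and standard errors of 0.05 each, the pooled estimate is 0.80 with a between-trial variance of 0.0175. The pooled standard error rises to 0.10, showing how disagreement between trials widens the uncertainty compared with simply combining all patients as in approach (2).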
To remain conservative, a two-sided prediction interval (PI) was calculated on the percent exiting, based upon a projected sample size of 50 subjects and a pseudo-placebo exit rate of 80%. This provides a “bar” that is more conservative than the standard confidence interval, as it attempts to bound how a single future trial would behave, rather than the standard confidence interval, which attempts to bound the mean. The lower bound of the PI would set the bar for the level against which a new drug would have to show significant improvement. Explicitly, the mean M and standard error S for a single future trial are calculated based on the eight historical trials (including interstudy variability and the variability of a future trial). The bar is set at M-Z*S, where Z is the (1-alpha/2)*100 percentage point of the standard normal distribution, that is, 1.96 for a 95% PI and 1.28 for an 80% PI. A new trial would provide evidence of effect as monotherapy if the upper bound of a 95% confidence interval on the exit rate from the new trial was lower than the lower bound of the specified PI based on the historical controls.
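A minimal sketch of the bar M-Z*S and the resulting decision rule follows. The way S is assembled here (pooled variance plus between-trial variance plus the binomial variance of a future 50-patient arm with an 80% exit rate) is an assumption about how the stated sources of variability combine; the text does not give the formula. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def future_trial_sd(se_pooled, tau2, p_future=0.80, n_future=50):
    """S for a single future pseudo-placebo arm (assumed decomposition)."""
    var_future = p_future * (1 - p_future) / n_future  # binomial variance of the new arm
    return math.sqrt(se_pooled ** 2 + tau2 + var_future)


def lower_bar(m, s, level=0.95):
    """Lower bound M - Z*S of a two-sided prediction interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # 1.96 for 95%, 1.28 for 80%
    return m - z * s


def shows_effect(upper_ci_new, bar):
    """True if the new drug's upper 95% confidence bound falls below the bar."""
    return upper_ci_new < bar
```

As a consistency check, a combined estimate of M = 0.851 with S of about 0.101 gives `lower_bar(0.851, 0.101, 0.95)` of roughly 0.653 and `lower_bar(0.851, 0.101, 0.80)` of roughly 0.722. The value of S used here is back-calculated from the bounds quoted in this article and is not itself reported in the text.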
A total of 398 participants were enrolled in the eight included pseudo-placebo cohorts, although seven participants left the trial prior to the start of the baseline drug taper and were excluded from the analyses presented herein.
Table 3 summarizes the estimates of the percent exiting by 112 days in each trial. This was uniformly high, ranging from 74.9–95.9%, with half the trials having an estimated percent exiting between 77.2% and 87.5%. The estimate of the combined percent exit based on the noniterative mixed effects model is 85.1%, with a lower bound of the 95% PI of 65.3% and a lower bound of the 80% PI of 72.2%.
Table 3. Outcome for patients in the “pseudo-placebo” arms of included trials
Column headings: No. patients in pseudo-placebo arm; No. exit by day 112; Est. of % exit; Standard error (SE) (%); 95% lower confidence bound (%). Numeric cells are not recoverable.
Pseudo-placebo comparators, in row order: 600 mg gabapentin; 1,000 mg valproate; 100 mg topiramate; 10 μg/ml felbamate; 300 mg oxcarbazepine; 300 mg oxcarbazepine; 15 mg/kg valproate; 15 mg/kg valproate.
aOne participant left the trial prior to the start of baseline drug taper.
bSix participants left the study prior to the start of baseline drug taper.
cNote this differs from the original paper (Gilliam et al., 1998) where withdrawals for any reason were analyzed.
dTwo successes and two failures (exits) were added to estimate percent exit and the standard error.
Withdrawal from CBZ did not increase the likelihood of exit (p = 0.56). The overall estimate of CBZ effect was to raise the hazard rate of exiting by 8.0% (95% confidence interval: 19.4% decrease to 35.4% increase). There was an effect of medication withdrawal prior to randomization. Trial 5 removed background medication prior to randomization, so an additional sensitivity analysis was performed by removing trial 5 from the analysis. The lower bound of the 95% PI (excluding trial 5) was 67.0%, so given the similar response, trial 5 remained in the analysis.
There was no effect of age or gender. Nonwhites, who made up 18.6% of the cases, were less likely to exit (p = 0.027). The overall estimate was for nonwhites to have a 38.6% decrease in hazard (95% confidence interval: 72.7% decrease to 4.5% decrease).
Several potential confounders were considered, including the rate of AED withdrawal. Trial 1, which had the slowest rate of AED withdrawal, was one of the trials with the longest time to exit. Trial 2, which also had relatively slow withdrawal, did not have obviously delayed exit. In addition, changes in background AED over time were considered. Information regarding time of enrollment was not available; however, publication dates exist for seven of the eight trials as well as a date of enrollment for the unpublished trial. Publication dates ranged from 1992 (trial 7) to 2001 (trial 5). The concern would be decreasing exit rates over time. In any case, the more recent trials (5 and 6) had high exit rates, so timing of study does not appear to be a factor.
Eight separate trials with similar designs and escape criteria have similar outcomes for the pseudo-placebo arm with respect to the primary outcome—percent exiting. These trials would appear to meet the criteria set forth by ICH for use of historical control. Therefore, this group of trials might reasonably be used as historical controls for future trials of withdrawal to monotherapy using a similar design in a similar population.
Although there are some differences in the shape of the curves, the curves are surprisingly similar. The differences are less likely due to heterogeneous trial populations than to slight design differences between the trials, which have been addressed by treating study as a random effect. One difference was the rate of withdrawal of baseline antiepileptic medications. The absolute percent exit also might differ slightly because of different pseudo-placebo arms. These trials approximate the behavior of patients had they been withdrawn to placebo, which obviously would have been unethical. One could confidently say that the rate of exit with true placebo would have, if anything, been greater than with pseudo-placebo. However, the similarity of the results across trials is impressive and suggests that small differences in trials would not substantially impact the outcome.
There is some controversy regarding how to appropriately combine multiple trials to use as a historical control in a pivotal trial for registration of a new drug. In our white paper, we suggested the “bar” be set using the lower bound of the 95% PI based on the combined percent exit rates (65.3%) for a single trial or the lower bound of the 80% PI (72.2%) for two trials. This controls for interstudy variability; moreover, the PI approach accounts for the variability of a future trial as opposed to simply estimating the mean using a confidence interval approach. The PI approach used an assumed sample size of 50 patients and an expected failure rate of 80% to predict behavior of a future pseudo-placebo arm. These assumptions are reasonable given the average sample size of the combined pseudo-placebo trials (n = 49.4) and the estimated rate of withdrawal (85.1%). In addition, these estimates are method invariant given that the point estimates and standard errors from the mixed effects model are nearly identical to the results from the noniterative approach. Therefore, this use of historical controls might well provide an acceptable way for new drugs to obtain an indication for monotherapy without exposing patients to undue risk.
Some may argue that a fraction of the efficacy seen in the successful pseudo-placebo trials should be preserved. This is often referred to as “delta” in the noninferiority paradigm. Because the proposed design supports a superiority trial, we have not taken the approach of estimating and preserving delta.
One final issue regarding the performance of an open trial using historical control would be whether the knowledge that all patients are receiving drug would affect patient behavior. Although this answer cannot be known for certain, the impact is minimized by the fact that all patients in the pseudo-placebo controlled trials understood that they had been randomized to either of two active arms: one lower dose and one higher dose. Consent forms did not include the information that one arm included a dose that was expected to be less effective, although of course this information could have been transmitted verbally by the investigator.
In conclusion, we demonstrated that these data can serve as a historical control for future monotherapy studies of drugs that have already demonstrated efficacy in an add-on setting, obviating the need for a placebo/pseudo-placebo arm in trials intended to demonstrate the efficacy of drugs as monotherapy. Based upon these observations, the FDA has accepted the concept of historical controls in this setting. Several trials utilizing this design and planned for regulatory submission have begun (Sepracor, 2009; UCB, Inc, 2009a–d).
Dr French has served as a consultant for the following companies, with the proceeds going to the Epilepsy Study Consortium: Cypress Bioscience Inc, Eisai Medical Research, GlaxoSmithKline, Icagen Inc., Ikano, Johnson & Johnson PRD, Marinus, NeuroVista Corporation, Novartis, Ono Pharma U.S.A. Inc, Ovation, Pfizer, Sepracor Inc., SK Life Science Inc, Special Products, LTD, Supernus Pharmaceuticals, Taro Pharmaceuticals, UCB, Inc, Upsher-Smith, Valeant, and Vertex Pharmaceuticals. She has received grant support from Pfizer, UCB, Inc, Icagen, Ikano, SK Life Sciences, and Vertex. Dr Temkin has been a paid consultant for Novartis, Celgene, Eisai, GlaxoSmithKline, UCB, Pfizer, Johnson & Johnson, Schwarz, and PAR. Dr Warnock was a salaried employee of GlaxoSmithKline from August 2002 through May 2007. Dr Wang is a paid employee of Johnson & Johnson. This work was partially funded by GlaxoSmithKline.
We confirm that we have read the Journal’s position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.