Keywords:

  • Bias-adjusted estimation;
  • Drop-the-losers design;
  • Treatment selection;
  • UMVCUE

Abstract


The two-stage drop-the-loser design provides a framework for selecting the most promising of K experimental treatments in stage one, in order to test it against a control in a confirmatory analysis at stage two. The multistage drop-the-losers design is both a natural extension of the original two-stage design and a special case of the more general framework of Stallard & Friede (2008) (Stat. Med. 27, 6209–6227). It may be a useful strategy if deselecting all but the best performing treatment after one interim analysis is thought to pose an unacceptable risk of dropping the truly best treatment. However, estimation has yet to be considered for this design. Building on the work of Cohen & Sackrowitz (1989) (Stat. Prob. Lett. 8, 273–278), we derive unbiased and near-unbiased estimates in the multistage setting. Complications caused by the multistage selection process are shown to hinder a simple identification of the multistage uniform minimum variance conditionally unbiased estimate (UMVCUE); two separate but related estimators are therefore proposed, each possessing some of the UMVCUE's theoretical characteristics. For a specific example of a three-stage drop-the-losers trial, we compare their performance against several alternative estimators in terms of bias, mean squared error, confidence interval width, and coverage.

1 Introduction


The maximum likelihood estimate (MLE) of the treatment effect is often reported as standard at the end of a multistage trial. It is of course a precise and readily available estimator, but since it ignores the trial's sequential nature it is generally biased (Whitehead, 1986), and considerable research has been conducted into estimation methods that address this fact. Although many bias-adjusted estimation procedures have been proposed, and unbiasedness is certainly not the only characteristic by which an estimator can be judged, the only way to achieve an efficient and “purely” unbiased estimate is to execute the following procedure: (i) identify an unbiased estimate based on part of the data—Y say, (ii) identify complete, sufficient statistics for the parameter in question—Z say, and (iii) employ the Rao-Blackwell improvement formula to obtain E[Y | Z]—the uniform minimum variance unbiased estimate (UMVUE).
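The three-step recipe can be illustrated with a toy normal-mean example of our own (not from the paper): a single observation is unbiased for the mean, the sample mean is a complete sufficient statistic, and for iid normal data the Rao-Blackwell step E[Y | Z] collapses the former onto the latter, preserving the expectation while shrinking the variance by a factor of n.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 2.0, 1.0, 5, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
y = samples[:, 0]          # unbiased but noisy: a single observation
z = samples.mean(axis=1)   # complete sufficient statistic: the sample mean
# Rao-Blackwell: E[Y | Z] = Z for the iid normal mean, so the improved
# estimator is simply the sample mean: same expectation, smaller variance.
print(y.mean(), z.mean())  # both near mu = 2
print(y.var(), z.var())    # variance drops by roughly a factor of n
```

The same logic drives everything that follows; the difficulty in the multistage trial setting is purely in identifying the right Y, Z, and conditioning event.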

There are two distinct applications of the Rao-Blackwell approach to estimation in multistage trials. The first approach, as pioneered by Emerson & Fleming (1990) and further clarified by Liu & Hall (1999), applies to a group sequential trial with one active treatment and one control arm that stops when conclusive evidence (for or against the efficacy of the treatment) is first observed. The stage at which the trial stops—M say—is a random variable. Given M, a sufficient statistic of the data, Z, and a truncation adaptive stopping rule (Liu & Hall, 1999), one calculates the expectation of the first stage data, Y1 say, given the pair (M, Z) to obtain the truncation adaptable UMVUE. We refer to this approach as unconditional because, in a three-stage trial, for example, it produces an estimate of the treatment effect regardless of whether the trial stops at stage one, two, or three; it is therefore unbiased by definition when one averages across all possible realizations of the sequential trial.

However, in some circumstances one may feel it more appropriate to develop estimators that are UMVUE conditional on the occurrence of a particular subset of trial realizations. For example, in two-stage “drop-the-losers” trials, the best performing of K experimental treatments is selected after the first stage before being tested in isolation against a control group in a confirmatory analysis in stage two. The stage two estimate of the selected treatment, Y2 say, is unbiased, and Cohen & Sackrowitz (1989) obtain the UMVUE of the selected treatment, conditional on the order of the stage one treatment arm estimates, by calculating E[Y2 | Z, Q], Q denoting the stage 1 order statistic condition. We call this an example of a UMVCUE—C for conditional. UMVCUEs have also been proposed for use more generally in two-stage trials evaluating a single treatment that allow early stopping for futility only. This is because a strong argument can be made that estimation of the treatment's effect is only important when the trial does in fact continue to the final stage; see Pepe et al. (2009), Koopmeiners et al. (2012), and Kimani et al. (2013) for recent examples.
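As a rough illustration of the two-stage construction just described, the following Monte Carlo sketch (our own; it assumes K = 3 arms with known unit stage-wise variances and equal true means, the worst case for selection bias) computes a Cohen & Sackrowitz-style estimate as a truncated-normal expectation of the stage-two data given the pooled mean and the selection event, and compares its average error to that of the pooled MLE.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
K, s1, s2, reps = 3, 1.0, 1.0, 200_000
mu = np.zeros(K)                            # equal true means

x1 = rng.normal(mu, s1, size=(reps, K))     # stage-1 means, one per arm
srt = np.sort(x1, axis=1)
y1, c = srt[:, -1], srt[:, -2]              # winner and runner-up at stage 1
y2 = rng.normal(mu[x1.argmax(axis=1)], s2)  # stage-2 mean of the winner

w_var = 1.0 / (1/s1**2 + 1/s2**2)           # variance of the pooled MLE
w = w_var * (y1/s1**2 + y2/s2**2)           # complete sufficient statistic
sd = np.sqrt(s2**2 - w_var)                 # SD of Y2 given W
u = s2**2 * (w/w_var - c/s1**2)             # Y1 >= c  <=>  Y2 <= u
a = (u - w) / sd
# E[Y2 | W, Y1 >= c]: mean of an upper-truncated normal
umvcue = w - sd * norm.pdf(a) / norm.cdf(a)

print(w.mean(), umvcue.mean())  # pooled MLE biased upward; UMVCUE near 0
```

The key step is that, given the pooled mean W, the stage-one selection inequality Y1 >= c translates into an upper truncation on Y2, whose expectation has the familiar normal pdf-over-cdf form.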

The two-stage drop-the-losers design has been the focus of much attention in the research literature. Sampson & Sill (2005) and Wu et al. (2010) consider hypothesis testing methodology for this design, whereas Sill & Sampson (2007), Bowden & Glimm (2008), and Bebu et al. (2010) target point estimation. More generally, the vast majority of adaptive trial designs have also followed a two-stage strategy in the following sense: design adaptations (such as subgroup selection or sample size adjustment) are made at the first interim analysis, and the trial then proceeds under a fixed design (possibly with additional interim analyses) for its remaining duration. However, it is questionable whether a two-stage approach is always the best strategy. As an example, an adaptive trial was recently conducted into a treatment for maintaining lung function in patients with Chronic Obstructive Pulmonary Disease (COPD) (Barnes et al., 2010). The trial aimed to use its first stage data to select the most promising doses of a new drug, before testing them against a placebo in a confirmatory analysis at stage two. In the event, two out of the four doses were selected for continuation to the second stage.

By selecting two doses (instead of one), the study decreased its chances of accidentally discarding a dose that would ultimately be successful. Nonetheless, evaluating two experimental treatments in the confirmatory analysis is more challenging than evaluating one, since multiple testing corrections must be applied. Furthermore, if the better performing of the two experimental treatments ends up being recommended, then its estimate may subsequently be queried as “biased”. There may, therefore, have been an advantage in allowing selection of the single most promising treatment dose to occur after several interim analyses. With this in mind, a clear multistage analog of the two-stage drop-the-losers design exists: rather than selecting the best of K treatments at a single interim analysis, selection could be achieved by dropping a predetermined number of treatments at each stage due to (relatively) poor performance until only one remained. The downside of inserting additional interim analyses into any clinical trial is clearly an increased administrative burden. Yet, in the context of a drop-the-losers trial, doing so can markedly increase the probability of selecting the truly best treatment for a given number of patients, as is shown in Section 'The three-stage drop-losers design'.

The multistage drop-the-losers design is actually a special case of a more general design framework for testing multiple treatments proposed by Stallard & Friede (2008). In their paper, the decision to drop a treatment need not be dictated by a predetermined rule based on efficacy data but, if it is, the familywise error rate of the trial can be controlled in the strong sense. Stallard & Friede (2008) do not touch on the issue of estimation for their general design, although they highlight it as an area for future methods development. In this paper we focus on estimation, and attempt to derive the UMVCUE for the specific drop-the-losers case. A different derivation from that of Cohen & Sackrowitz (1989) is used. It is perhaps less intuitive than the original, but generalizes to an arbitrary number of stages far more easily. Furthermore, allowing additional stages of selection requires an increasingly strong and unexpected form of conditioning to be employed. In order to best elucidate the approach, we therefore start in Section 'The three-stage drop-losers design' by considering the extension to the three-stage case, leaving the general J-stage formula to the Appendix. In Section 'Simulation studies', we apply our estimation proposals to some specific three-stage trial examples, and compare their performance against several other estimation strategies. Interval estimation for the selected treatment is also considered. We discuss the issues raised and point to future research in Section 'Discussion'.

2 The three-stage drop-losers design


Imagine a three-stage trial initially involving K experimental treatments and a control group. The purpose of the first two stages is to identify which treatment has the most beneficial true effect, as reflected via an appropriate outcome measure. Throughout this paper we will assume that higher values of the outcome are more desirable. At the end of stage one, inline image experimental treatments are dropped, and the best of the L remaining experimental treatments is then selected at the end of stage two for confirmatory testing against the control group in stage three. We will refer to such a three-stage design as a “K:L:1” trial. Let inline image index the initial full set of treatments, with inline image referring to the control group. Let inline image denote the response of the ith patient on treatment k, which is normally distributed with mean inline image and variance inline image. At stage j (inline image), n subjects are randomized to each treatment arm still active in the trial, and the experimental treatments are evaluated according to a test statistic of the form:

  • display math(1)

Without loss of generality we will refer to the true mean of the selected treatment as μ1; the reasons for this subscript notation are given in due course. At the conclusion of the trial it would be natural to primarily focus on testing the null hypothesis inline image. In this paper, we focus on the task of estimating the contrast inline image. In the next subsection, we show empirically that, whilst K:L:1 trials can provide more power to select the truly best treatment for confirmatory testing (compared to an analogous two-stage drop-the-losers trial), the MLE for μ1 can be substantially biased. Section 'Initial motivation' serves merely to illustrate the problem. Further details regarding the notation, as well as the design and analysis of drop-the-losers trials, are covered from Section 'Notation for the estimation problem' onwards.

2.1 Initial motivation

Trial data are simulated under a 3:2:1 design. Each treatment arm at each stage is allotted n = 60 patients and the variance of each patient's individual treatment effect, inline image, is 50. At the end of stage 2, 300 patients have been allocated to the experimental treatments. This is contrasted with a traditional two-stage “3:1” trial, where the same number of patients (100 per arm) are used to select the best performing treatment after stage one. We assume that the vector of true mean effects for the three treatments is (1, Δ, 2). Figure 1 shows the proportion of simulations for which the truly best treatment (with mean equal to 2) is selected, as Δ is varied between 0 and 2. Each point is the average of 50,000 simulations. The 3:2:1 design gives a marginally higher power of selecting the best treatment than the 3:1 design. The results of two further simulation scenarios are also shown in Fig. 1 (left), namely:

  • A 5:3:1 trial (75 per arm per stage) compared with a 5:1 trial (120 per arm). True vector of treatment means (1, 1, 1, Δ, 2). 600 patients used to select best treatment in total.
  • A 6:2:1 trial (75 per arm per stage) compared with a 6:1 trial (100 per arm). True vector of treatment means (1, 1, 1, 1, Δ, 2). 600 patients used to select best treatment in total.
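The power comparison described above can be sketched in a few lines. The simulation below is a simplification of ours (it fixes Δ = 1.5 and uses 20,000 replicates rather than 50,000) comparing the 3:1 and 3:2:1 designs, each spending 300 patients on the selection phase as in the text.

```python
import numpy as np

rng = np.random.default_rng(3)
v2, reps = 50.0, 20_000
mu = np.array([1.0, 1.5, 2.0])          # true means (1, Delta, 2), Delta = 1.5
best = mu.argmax()

# 3:1 design: one selection stage, 100 patients per arm
x = rng.normal(mu, np.sqrt(v2 / 100), size=(reps, 3))
p_two_stage = (x.argmax(axis=1) == best).mean()

# 3:2:1 design: drop 1 arm after n = 60, pick best of 2 after another n = 60
x1 = rng.normal(mu, np.sqrt(v2 / 60), size=(reps, 3))
keep = np.argsort(x1, axis=1)[:, -2:]                 # top 2 arms at stage 1
x2 = rng.normal(mu[keep], np.sqrt(v2 / 60))           # their stage-2 means
cum = (x1[np.arange(reps)[:, None], keep] + x2) / 2   # cumulative MLEs
sel = keep[np.arange(reps), cum.argmax(axis=1)]
p_three_stage = (sel == best).mean()

print(p_two_stage, p_three_stage)
```

Both designs use 300 selection-phase patients in total, so the comparison isolates the effect of spreading the selection over two interim analyses.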

Figure 1. Left: power to select the truly best treatment for various two- and three-stage drop-the-losers designs as a function of Δ. Right: bias and coverage of the MLE in a K:2:1 design, as a function of K.


For these scenarios, the difference in power between the two- and three-stage designs is much more pronounced, and the case for switching to a three-stage design is stronger. Increasing the sample size of the trial (i.e., reducing the variance of the estimates) will always increase the probability of selecting the best treatment (assuming that one is truly better than the others). In our example the treatment with mean effect 2 is always best. For increasing sample sizes, the power curves approach 1.

Note that, if the same number of patients were randomized to the control group as to each experimental arm at each stage, the two- and three-stage designs featured above would have different numbers of controls at different stages. However, since we have assumed that the trials always continue to the final stage, the control group data only impact on the final stage analysis; they therefore do not affect the probability of selecting the best treatment.

Trial data are now simulated under a K:2:1 design for inline image and with n = inline image = 50 for all k. All treatments are assumed to have no (zero) effect. Figure 1 (right) plots the bias and coverage of the MLE for μ1 as a function of K. One can see that as the number of experimental treatments increases, the bias in the MLE increases and the coverage of its 95% confidence interval falls well below its nominal level. In the context of a two-stage drop-the-losers trial, the bias in the MLE of the selected treatment is maximized when all treatments have the same effect (see Carreras & Brannath, 2013, for a proof). We conjecture that this is true under three-stage drop-the-losers selection as well. Although Fig. 1 (right) therefore most probably represents a worst-case scenario for the MLE's performance, there is certainly room for alternative estimation strategies to be developed.
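A minimal simulation of this phenomenon, assuming unit stage-wise standard errors rather than the exact per-patient variances above, shows the pooled MLE's positive bias and degraded coverage under K:2:1 selection when all true means are zero:

```python
import numpy as np

rng = np.random.default_rng(11)
K, reps, s = 5, 50_000, 1.0    # K arms, all true means zero, stage-wise SE s

# K:2:1 selection: keep the top 2 after stage 1, take the best cumulative
# mean after stage 2, then run one confirmatory stage; the MLE pools all
# three stage-wise means of the selected arm with equal weight.
x1 = rng.normal(0, s, size=(reps, K))
keep = np.argsort(x1, axis=1)[:, -2:]
x1k = np.take_along_axis(x1, keep, axis=1)
x2 = rng.normal(0, s, size=(reps, 2))
pick = ((x1k + x2) / 2).argmax(axis=1)
idx = np.arange(reps)
y1, y2 = x1k[idx, pick], x2[idx, pick]
y3 = rng.normal(0, s, size=reps)
mle = (y1 + y2 + y3) / 3

half = 1.96 * s / np.sqrt(3)                 # naive 95% interval half-width
coverage = (np.abs(mle) <= half).mean()
print(mle.mean(), coverage)                  # positive bias; coverage < 0.95
```

Raising K strengthens the selection effect, reproducing the qualitative pattern of Fig. 1 (right).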

2.2 Notation for the estimation problem

For simplicity we will now assume that the within treatment variance term inline image is constant across treatments, and is hence referred to as v2. This means that ranking the treatments via the test statistic in Eq. (1) over the first two stages is equivalent to ranking by the values of the experimental treatment terms only. That is, the cumulative control group data at stage j, inline image, and the common square root term inline image can be ignored, leaving the experimental treatment MLEs to be directly compared head-to-head.

Let inline image (inline image) represent the estimate for the mean effect of treatment k using only those subjects recruited at stage j. In order to add more flexibility, let the originally fixed number of subjects per-arm per-stage, n, now equal inline image, so this number can vary across stages if required (as in Bowden & Glimm (2008)). Letting inline image, and inline image, then:

  • display math

We base all subsequent mathematical derivation on the inline image random variables (or transformations of them), leaving the individual patient data inline image notation behind. Let inline image represent the vector (ψ1, … ,inline image) where inline image is the ranking of treatment k using our design framework. Denote the event: inline image, inline image, … , inline image by the letter Q. It is useful to label the inline images so that event Q is satisfied. That is, so that inline image refers to the treatment that is ultimately ranked as kth best at the end of the trial. However, it is important not to confuse or equate this convenient labeling with explicitly conditioning on event Q—this is done in Sections 'Unbiased and bias-adjusted estimation of μ1' and 'Deriving the RB1 estimator'.

At the end of stage 1, the top L treatments are kept and the remainder are dropped. So, inline image (which are identical to the stage one MLEs) satisfy:

  • display math(2)

This enables inline image and inline image to be associated with treatments inline image after stage one. At stage two, the remaining L treatments are ranked according to their cumulative MLEs. So, assume that the remaining stage one statistics inline image and inline image satisfy:

  • display math(3)

This enables inline image and inline image to be associated with treatments inline image at stage 2. A schematic diagram of this selection process is shown in Fig. 2. Ultimately, the selected experimental treatment is therefore associated with stage-wise statistics inline image and, additionally, with Y13 ∼ N(μ1, inline image), the statistic for the sole experimental treatment tested at stage 3. Note that although inline image fulfills the inequalities in Eqs. (2) and (3), this does not directly imply that inline image or inline image. Note further that although the truly best treatment has the highest chance of being selected, μ1 is not equivalent to max inline image.


Figure 2. Schematic diagram showing the selection process of the three-stage trial.


At the third and final stage of the trial, we seek an efficient unbiased estimate of μ1 − μ0, where μ0 represents the mean parameter of the control group. As previously mentioned, the control group always progresses to the final stage of the trial, so μ0 can be trivially and unbiasedly estimated using all of the relevant data via its MLE. By contrast, the parameter μ1 is much more elusive; at the trial outset it is a discrete random variable with K possible values and, conditional on Q, it is not unbiasedly estimated by its final stage three MLE:

  • display math(4)

since Y11 and Y12 are conditionally biased with respect to μ1. We therefore focus on bias-adjusted estimation of μ1 for the remainder of the paper.

3 Unbiased and bias-adjusted estimation of μ1


Let inline image and inline image represent the vectors of mutually independent unselected treatment estimates at stages one and two. Further, let Y1 and Y2 represent the complete sets of normally distributed statistics at stages one and two so that, for example, inline image. The joint distribution, f(.), of the complete data inline image is:

  • display math(5)

where inline image, inline image and Π equals all terms involving inline image. The trio inline image, inline image and inline image are therefore complete sufficient statistics for all mean parameters unconditionally. We note that the conditional density inline image is essentially the same as Eq. (5), except that its support is restricted by Q. It must therefore be scaled up by the reciprocal of the probability of Q, in order to integrate to one over the restricted space.

In the spirit of Cohen & Sackrowitz (1989), who investigated two-stage UMVCUEs, a three-stage UMVCUE for μ1 would be a Rao-Blackwellization of the unbiased final stage estimate Y13 conditional on the complete sufficient statistic inline image and under Q. We now clarify precisely how Y13 is restricted in this setting. At stage one, we can state from Eq. (2) that (a): inline image, and from Eq. (3) (by removing a common term inline image) that (b):

  • display math

The set of Y13 values that satisfy conditions (a) and (b) defines the sampling distribution of inline image. From the definition of Z1, condition (b) implies that:

  • display math

and condition (a) implies that:

  • display math

Therefore, inline image must be less than

  • display math

Since T depends explicitly on the value of Y12 (as well as implicitly through Z1) it is natural to condition on (inline image) rather than on (inline image) in order to calculate inline image. We will refer to this as the “RB1” estimator, and denote it by the symbol inline image.

Lehmann & Scheffé (1950) have shown that a sufficient and complete statistic is also minimal sufficient (the converse is not true). Therefore, since (inline image) is sufficient but not minimal, it cannot be complete. Thus, the RB1 estimator will be unbiased and have a smaller variance than Y13 by the Rao–Blackwell theorem, but it is not the UMVCUE. We now calculate inline image, returning to a discussion of which properties can and cannot be claimed for it in Section 'Summarizing the RB1 and RB2 estimators'.

3.1 Deriving the RB1 estimator

We will derive the RB1 estimator in the following manner. We start by transforming:

  • display math

then show that Y13 is independent of (inline image) given (inline image), and finish by going from:

  • display math

Let the vector inline image and diagonal matrix Σ be the mean and variance of the inline image variables inline image written as inline image. They follow the ordering described in Section 'Notation for the estimation problem' and Fig. 2 so that, if inline image is the lth variable, the l-th entry of inline image is inline image and the (inline image)-th element of Σ is inline image. Using a standard result (e.g., Srivastava, 2002, Theorem 2.5.1), inline image follows the (inline image)-dimensional multivariate normal (MVN) distribution: inline image, where

  • display math

For convenience we have labeled the mean and variance parameters of inline image using the generic symbols inline image, not because they are all equal, but because they are equally irrelevant to the subsequent development. Next, we apply another well-known theorem on conditional multivariate normal distributions (e.g., Srivastava, 2002, Theorem 2.5.5). Let

  • display math

where inline image= inline image and inline image is equal to the inline image vector inline image = inline image. The inverse of the remaining inline image × inline image submatrix, inline image, equals

  • display math

where inline image. Defining inline image, and inline image = inline image, we can write the conditional distribution inline image as inline image where:

  • display math

and inline image = inline image = inline image. The conditional density of Y13 given Z1 and Y12 does not depend on inline image, inline image or μ. We therefore drop irrelevant terms by writing it as inline image. Only now do we condition on event Q, which acts to restrict inline image to be inline image—the value of T being fixed by the observed values of inline image and inline image (inline image). This yields the density

  • display math

where inline image is the indicator function for event Q and

  • display math

Taking the expectation of inline image yields

  • display math(6)
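The conditional multivariate normal theorem invoked in this derivation is easy to check numerically. In the sketch below (the covariance matrix is an arbitrary example of ours, not taken from the paper), the conditional mean and variance of the first component given the other two follow the standard partitioned formulas.

```python
import numpy as np

# For (X1 | X2 = x2), with Sigma partitioned as [[S11, S12], [S21, S22]]:
#   cond_mean = mu1 + S12 @ inv(S22) @ (x2 - mu2)
#   cond_var  = S11 - S12 @ inv(S22) @ S21
Sigma = np.array([[2.0, 1.0, 0.5],
                  [1.0, 2.0, 1.0],
                  [0.5, 1.0, 2.0]])
mu = np.zeros(3)
S11, S12 = Sigma[0, 0], Sigma[0, 1:]
S22 = Sigma[1:, 1:]
x2 = np.array([1.0, 2.0])              # observed values of (X2, X3)

coef = S12 @ np.linalg.inv(S22)        # regression coefficients
cond_mean = mu[0] + coef @ (x2 - mu[1:])
cond_var = S11 - coef @ S12

print(cond_mean, cond_var)             # -> 0.5 1.5 for this covariance
```

In the derivation above, the same formulas give the mean and variance of Y13 given (Z1, Y12) before the truncation by event Q is imposed.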

3.2 Deriving the RB2 estimator

The inline image estimator in (6) does not look like a direct analog of the two-stage UMVCUE of Bowden and Glimm (2008)—a correction to the two-stage MLE—because of the extra conditioning on Y12. One way of avoiding this extra conditioning would be to calculate inline image, for a condition inline image that does not depend on Y12. This can be achieved by defining inline image simply as the portion of Q coming from the stage two selection rule (b). We can then state that inline image where:

  • display math

The quantity inline image can then be obtained by following the same procedure as in Section 'Deriving the RB1 estimator', yielding an alternative estimator for μ1, which we will denote by inline image and refer to as the “RB2” estimator. It has the same form as (6) but with t replaced by the observed value inline image of inline image, with inline image equal to the MLE inline image, and with inline image equal to inline image.

3.3 Summarizing the RB1 and RB2 estimators

The RB2 estimator is the expected value of Y13 given the complete sufficient statistic and under condition inline image. So, if inline image truly represented the condition restricting inline image under drop-the-losers selection, then inline image would be the UMVUE given inline image. However, no such claim can be made because inline image is not the correct condition (the correct condition is Q), and so E [inline image] is conditionally biased with respect to μ1. Likewise, the RB1 estimator cannot claim to be the UMVUE, conditional on Q, because the statistic it conditions on is not minimal sufficient and hence not complete.

One could argue, however, that if we simply condition on Y12 and Q first, then (inline image) is a minimal sufficient statistic and therefore the RB1 estimator is a UMVCUE of sorts. This might be perceived as simply a semantic trick, and it is clear that conditioning on Y12 is not as natural as conditioning on Q alone. For this reason we will refer to inline image and inline image as simply Rao-Blackwellized estimators.

3.3.1 A general formula for the RB1 estimator

In the Appendix, we derive the inline image estimator for a J-stage drop-the-losers trial. This estimator will yield a more efficient, unbiased estimate of μ1 than using the stage J data alone. For the J-stage case, one is forced to condition on Z1 and inline image additional variables corresponding to the individual treatment effect estimates of the ultimately selected treatment at stages 2 to inline image. Although the precise values of inline image, inline image and t change, the J-stage estimator is identical in form to Eq. (6). Since it only ever requires the evaluation of a standard normal density and distribution function, it remains trivial to evaluate whatever the value of J.

3.4 Strength of unbiasedness of the RB1 estimator

Since μ1 is a random variable, it is convenient to consider the extended definition of bias due to Posch et al. (2005). That is, for a generic estimator inline image of μ1, the bias is given by:

  • display math(7)

Here inline image refers to the true mean of treatment k, the k-th element of μ. inline image is the condition that treatment k is selected under the design and inline image represents the bias of estimator inline image conditional on treatment k being selected. Although inline image = 0, it actually fulfills a stronger form of unbiasedness, namely that inline image = 0 inline image.

3.5 An alternative near-unbiased estimator

In Section 'Simulation studies', we show that the price paid by the RB1 estimator for unbiasedness is a substantial increase in mean squared error (MSE). Part of the motivation for developing the RB2 estimator was to see if it could trade off a small amount of bias for a reduction in MSE. An additional bias-adjusted (but not unbiased) estimator is now considered. Bebu et al. (2010) proposed a likelihood-based procedure for obtaining a bias-corrected MLE (which we will refer to as the BC-MLE) and confidence intervals for the selected treatment in a two-stage drop-the-losers trial. We now adapt their general approach to the specific example of a 3:2:1 drop-the-losers design in order to complement the simulations in Section 'Simulation studies'. Extensions to the general three-stage case are obvious, but when more treatment arms are added the computational effort in obtaining the BC-MLE increases markedly.

Assuming that the variances of all stage-wise statistics are known, the log-likelihood of the parameter vector inline image conditional on Q: ψ = (1, 2, 3) is proportional to

  • display math(8)

Here, inline image and Z1 are as defined in Section 'Unbiased and bias-adjusted estimation of μ1' after Eq. (5), inline image being equal to the selected treatment's MLE with variance inline image. inline image is defined here as the MLE of the second best treatment at stage 2, inline image, with mean μ2 and variance inline image = inline image = inline image. Y31 is the sole statistic on the treatment that ranked last at stage one, with mean μ3 and variance inline image. The penalizing term inline image represents the probability of event Q given μ. This is equivalent to the probability that all three elements of the multivariate density:

  • display math

are positive, and it can be approximated to a high degree of accuracy in R using the pmvnorm() function (Genz & Bretz, 2009). The conditional log-likelihood (8) can then be maximized to yield joint estimates for μ1, μ2, and μ3.
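The same conditional-likelihood idea can be sketched in a simplified one-stage analog (this is not the authors' Eq. (8); the observed values and variances are invented for illustration): the normal log-likelihood of the observed arm means is penalized by the log-probability of the selection event, computed from the joint distribution of the pairwise differences, and the result is maximized numerically.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal, norm

s2 = 1.0                        # assumed known variance of each arm's mean
x = np.array([1.8, 0.9, 0.4])   # observed means, labelled so x[0] was selected

def neg_cond_loglik(mu):
    # unconditional normal log-likelihood of the three observed means
    ll = norm.logpdf(x, loc=mu, scale=np.sqrt(s2)).sum()
    # selection penalty: P(X1 >= X2, X1 >= X3 | mu), via the joint normal
    # distribution of the differences (X1 - X2, X1 - X3)
    m = np.array([mu[0] - mu[1], mu[0] - mu[2]])
    C = np.array([[2 * s2, s2], [s2, 2 * s2]])
    p_q = multivariate_normal(mean=-m, cov=C).cdf(np.zeros(2))
    return -(ll - np.log(p_q))

fit = minimize(neg_cond_loglik, x0=x, method="Nelder-Mead")
print(fit.x)   # conditional MLE shrinks x[0] relative to the raw MLE
```

Because the penalty term increases with the gap between the selected arm and its rivals, maximizing the conditional likelihood pulls the selected arm's estimate downward, which is the direction of the BC-MLE correction seen in the simulations of Section 'Simulation studies'.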

4 Simulation studies


We now conduct a simulation study to compare the performance of the MLE, RB1, RB2, BC-MLE, and stage three estimator Y13 in estimating μ1 under drop-the-losers selection. Various parameter constellations for the true treatment means and the stage-wise variances are considered. We note that in a real trial setting the quantity of interest would be the treatment versus control group comparison inline image, but for reasons already discussed we can ignore estimation of μ0. By averaging over the results of all simulations, we obtain a Monte Carlo estimate for the estimators' bias as given in Eq. (7). By summing the estimators' squared errors across all simulations, we can approximate their MSE, defined analogously to Eq. (7).

4.1 Point estimation for a 3:2:1 trial

In a simple initial simulation, all three treatments are assumed to have a true mean effect of 0, so that μ1 ≡ 0. The number of patients recruited to each remaining treatment arm at each stage (inline image) is 100, 50, and 25, respectively. v2 = 50 so that inline image = inline image. Figure 3 shows the distribution of 100,000 realizations of the MLE, RB1 estimator, RB2 estimator, BC-MLE, and Y13. From this simulation their empirical biases and MSEs were (0.41, 0.00, 0.04, −0.26, 0.00) and (0.35, 0.85, 0.69, 0.80, 2.00), respectively. Y13 is unbiased but inefficient and at the other extreme the MLE is efficient but biased. The RB1 estimator is unbiased and has a substantially lower MSE than Y13 because it utilizes data from the first two stages. Incorrect conditioning induces a small amount of bias into the RB2 estimator but is accompanied by a substantial decrease in MSE. The BC-MLE reduces the magnitude of the bias in the MLE but is shown to overcorrect. Its modal value is, however, close to the true value of 0.


Figure 3. Distribution of the 5 estimators' estimates for inline image from a three-stage 3:2:1 drop-the-losers design, given stage-wise variances inline image = inline image. Each distribution is based on 100,000 simulations.


In Table 1, we show the bias and MSE of the estimators for this choice of parameter values and three additional true parameter constellations. Trial data were simulated with 50 patients per-arm per-stage so that inline image = 1. The same general pattern is observed across all scenarios, except the magnitudes of the biases and MSEs change.

Table 1. Bias and MSE of the various estimators over the four scenarios (50,000 simulations per scenario). In each case inline image = (1, 1, 1)

  Parameter values                  MLE      RB1      RB2   BC-MLE   Stage three Y13

  Bias
  (0, 0, 0)                       0.377   −0.005    0.038   −0.142    −0.004
  (0, inline image, inline image) 0.357    0.007    0.043   −0.132     0.007
  (0, inline image, 1)            0.310    0.004    0.041   −0.122     0.012
  (0, inline image, 2)            0.111   −0.005    0.010   −0.092     0.012

  MSE
  (0, 0, 0)                       0.388    0.690    0.563    0.579     0.982
  (0, inline image, inline image) 0.384    0.686    0.550    0.570     1.002
  (0, inline image, 1)            0.372    0.670    0.524    0.553     1.017
  (0, inline image, 2)            0.347    0.587    0.413    0.471     1.019

4.2 Interval estimation for a 3:2:1 trial

It is possible to derive an expression for the variance of the RB1 estimator using the delta method, as, for example, Koopmeiners et al. (2012) do in the context of a single arm two-stage trial with a binary endpoint. However, from Fig. 3 we see that, in the context of a three-stage drop-the-losers trial, the distribution of inline image is highly skewed. Therefore, even if we were to derive analogous expressions for the variance of these quantities, it would not appear sensible to use them to furnish symmetric confidence intervals around the point estimate. For this reason, we adapt the nonparametric bootstrap procedure, originally proposed by Pepe et al. (2009) for a single arm two-stage trial, to the three-stage drop-the-losers setting. Specifically, we apply the following resampling scheme to the trial data inline image:

  1. Produce a bootstrap sample of the first-stage data, with mean inline image.
  2. If inline image is ⩾ the original observed value inline image:
    1. Produce a bootstrap sample of the second-stage data, with mean inline image.
    2. If the stage-two MLE inline image is ⩾ the original observed value inline image:
      1. Produce a bootstrap sample of the third-stage data, with mean inline image.
      2. Calculate the RB1 estimator inline image from Eq. (6) given inline image and the original observed value t.

This is repeated until a large enough collection of inline images has been obtained to assess its sampling distribution accurately. Empirical quantiles of this distribution can then be read off to give confidence intervals for inline image. Once this procedure is implemented, it takes no extra effort to calculate bootstrapped confidence intervals for the RB2 estimator at the same time.
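The resampling scheme above can be sketched as follows. This is a hedged illustration: the thresholds `t1` and `t2` stand in for the original observed selection values in steps 2 and 2.2, and `estimator` stands in for the RB1 formula of Eq. (6), neither of which is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_rb1_ci(x1, x2, x3, t1, t2, estimator,
                     n_boot=1000, alpha=0.05, rng=rng):
    """Percentile bootstrap CI mirroring the rejection scheme above.

    x1, x2, x3 : arrays of raw stage-wise responses on the selected arm.
    t1, t2     : observed selection thresholds that the bootstrap
                 stage-one mean and stage-two cumulative MLE must meet
                 or exceed (supplied by the caller from the trial data).
    estimator  : maps the three bootstrap stage means to an estimate.
    """
    n1, n2, n3 = len(x1), len(x2), len(x3)
    reps = []
    while len(reps) < n_boot:
        m1 = rng.choice(x1, n1, replace=True).mean()
        if m1 < t1:                                 # fails stage-one rule
            continue
        m2 = rng.choice(x2, n2, replace=True).mean()
        if (n1 * m1 + n2 * m2) / (n1 + n2) < t2:    # fails stage-two rule
            continue
        m3 = rng.choice(x3, n3, replace=True).mean()
        reps.append(estimator(m1, m2, m3))
    lo, hi = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```

Replicates that violate either selection condition are discarded, so the accepted replicates approximate the conditional sampling distribution of the estimator given the observed selection path.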

Confidence intervals for the BC-MLE of μ1 are obtained by the profile likelihood approach described in Bebu et al. (2010). That is, we calculate the statistic inline image as twice the difference between the log-likelihood evaluated at the joint BC-MLEs for inline image and at the BC-MLEs for inline image under the constraint inline image. A (1 − α) level confidence interval for μ1 is then the set of values for which inline image ⩽ inline image.
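The profile likelihood interval amounts to inverting the likelihood-ratio statistic over a grid. A generic sketch, with a user-supplied profiled log-likelihood standing in for the conditional likelihood of Bebu et al. (2010):

```python
import numpy as np

def profile_likelihood_ci(loglik, mu_hat, grid, crit=3.8415):
    """Invert the likelihood-ratio statistic W(mu) to get a confidence set.

    loglik(mu) : log-likelihood profiled over all other parameters at
                 fixed mu (a hypothetical stand-in for the paper's
                 conditional likelihood).
    crit       : chi-square(1) upper quantile; 3.8415 gives the 95% level.
    Returns the endpoints of {mu : 2 * (loglik(mu_hat) - loglik(mu)) <= crit}.
    """
    l_max = loglik(mu_hat)
    inside = [mu for mu in grid if 2.0 * (l_max - loglik(mu)) <= crit]
    return min(inside), max(inside)
```

Each grid evaluation requires a constrained maximization of the likelihood, which is why this approach is markedly more expensive than the bootstrap procedure.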

Table 2 shows, for inline image, the average confidence interval width and the resulting coverage of the BC-MLE, RB1, and RB2 estimators for the four scenarios already introduced, with 50 patients per treatment arm per stage as before. Within each simulation, confidence intervals and coverage were assessed with respect to the true (fixed) value of μ1 and then averaged across simulations. Each bootstrapped confidence interval was based on 1000 simulated values of inline image. The overall figures are based on only 10,000 simulations: obtaining a confidence interval for the BC-MLE using the profile likelihood method required substantially more computational effort than the bootstrap procedure, and so this was the limiting factor.

Table 2. Coverage and mean confidence interval width for the BC-MLE, RB1, and RB2 estimators (10,000 simulations per scenario). In each case inline image = (1, 1, 1).

| Parameter values | RB1 Cov. | RB1 CI width | BC-MLE Cov. | BC-MLE CI width | RB2 Cov. | RB2 CI width |
|---|---|---|---|---|---|---|
| (0, 0, 0) | 0.963 | 3.31 | 0.949 | 2.92 | 0.951 | 2.91 |
| (0, inline image, inline image) | 0.958 | 3.27 | 0.950 | 2.90 | 0.949 | 2.87 |
| (0, inline image, 1) | 0.954 | 3.20 | 0.948 | 2.83 | 0.946 | 2.80 |
| (0, inline image, 2) | 0.956 | 2.96 | 0.945 | 2.57 | 0.956 | 2.49 |

Both the bootstrap and profile likelihood approaches appear to provide confidence intervals with nominal coverage. The RB1 estimator's confidence intervals are by far the widest. The narrowest intervals are obtained from the RB2 estimator, but they are only marginally narrower than those of the BC-MLE.

4.3 Further results for K:2:1 trials

Trial data are now simulated under a K:2:1 design with all treatment means equal to 0 (so that inline image), as in Section 'Initial motivation'. Figure 4 (top-left to bottom-right) plots the bias, MSE, coverage, and confidence interval width of the MLE, RB1, and RB2 estimators as a function of K. The properties of the stage three estimator Y13 are shown where informative (it is unbiased, with a known variance, so a simple symmetric confidence interval around it achieves its nominal coverage). Each point is based on 50,000 simulations. The BC-MLE is not evaluated because it becomes too computationally demanding; however, there is no reason to suspect that its performance significantly worsens as more treatment arms are added. Standard symmetric confidence intervals are used for the MLE, assuming a normal distribution and ignoring selection. While this could be termed a “standard” analysis, there is of course no reason to believe that this confidence interval will achieve its nominal coverage probability. As K increases, the bias of the RB2 estimator increases but stays at modest levels compared to the MLE. The RB1 estimator and Y13 are unbiased. The MSE of the MLE is substantially lower than those of the other two estimators when K = 3, but rises more quickly as K increases. The MSE of Y13 is 1, by definition.
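Why the naive symmetric interval undercovers can be seen in a deliberately simplified single-stage version of the selection problem: pick the best of K arms whose estimates are standard normal with true means 0, and use the interval that ignores selection. This sketch is ours and abstracts away the multistage structure, which changes the exact numbers but not the qualitative picture.

```python
import numpy as np

rng = np.random.default_rng(2)

def naive_coverage(K, n_sim=20000, rng=rng):
    """Coverage of the 'standard' symmetric 95% CI for the selected arm's
    mean under one-step selection of the best of K arms (all true means 0,
    unit-variance arm-level estimates).
    """
    best = rng.normal(0.0, 1.0, (n_sim, K)).max(axis=1)  # selected MLE
    return np.mean(np.abs(best) <= 1.96)  # does best +/- 1.96 cover 0?

cov3, cov8 = naive_coverage(3), naive_coverage(8)
# Coverage falls below the nominal 95% and deteriorates as K grows,
# since the maximum of K estimates is stochastically larger than any one.
```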


Figure 4. Bias, MSE, coverage (with respect to 95% confidence intervals), and average 95% confidence interval width of the MLE, RB1, and RB2 estimators as a function of K, for inline image = 0 and σ1 = σ2 = σ3 = 1. The stage three estimate Y13 is also shown where informative.


The coverage of the RB1 estimator's bootstrapped 95% confidence interval stays relatively constant over the range of K, but is always slightly conservative. The RB2 estimator's coverage starts to worsen as K increases, but not to the same extent as the MLE's. Table 3 shows, for these three estimators, the proportion of times that their 95% confidence intervals lie above or below the true value of μ1. In each case, being above the true value is far more likely. This can be understood from the fact that all three are more likely to over-estimate than under-estimate the true effect. For example, in the simulations shown in Fig. 3 the MLE, RB1, and RB2 estimators overestimate μ1 83%, 57%, and 61% of the time, respectively. Of course, in the case of the RB1 estimator, this tendency to overestimate is perfectly cancelled out by less frequent, but larger, underestimation, so that the estimator remains unbiased overall. While the above/below ratio of the RB1 estimator stays fairly constant (around 10:1), the above/below ratios of the RB2 estimator and the MLE increase rapidly with K. This reflects their increasing positive bias.

Table 3. Proportion of times the 95% confidence intervals for the RB1, RB2, and MLE estimators lie above or below μ1.

| K | RB1 % above | RB1 % below | RB2 % above | RB2 % below | MLE % above | MLE % below |
|---|---|---|---|---|---|---|
| 3 | 3.9 | 0.48 | 5.3 | 0.130 | 6.4 | 0.084 |
| 4 | 4.0 | 0.34 | 6.2 | 0.052 | 7.9 | 0.044 |
| 5 | 4.2 | 0.32 | 6.9 | 0.044 | 9.6 | 0.012 |
| 6 | 4.1 | 0.31 | 7.1 | 0.044 | 10.0 | 0.016 |
| 7 | 4.3 | 0.30 | 7.7 | 0.024 | 12.0 | 0.016 |
| 8 | 4.3 | 0.32 | 8.0 | 0.032 | 13.0 | 0.012 |

The MLE's average confidence interval width (a constant value of 3.92inline image) is far lower than that of the other two estimators: the RB1 estimator's confidence interval is on average 60% wider than the MLE's when K = 8, but the MLE suffers from suboptimal coverage as a result. The confidence interval width of Y13 is a constant value of 3.92, which gives an idea of the additional gain from using the RB1 estimator when unbiased estimation is required.

5 Discussion


In this paper, we have explored the issue of estimation for a multistage analog of the two-stage drop-the-loser design. Our main focus was to generalize the work of Cohen & Sackrowitz (1989) to enable efficient unbiased estimation for the selected treatment. In this regard, we can only claim to have been partially successful. The RB1 estimator is unbiased and has a lower variance than the final-stage estimator, but it is derived using an additional condition that is needed to overcome technical difficulties. Further work may reveal that this condition can be relaxed to yield more efficient unbiased estimators. Perhaps unsurprisingly, the RB1 estimator was shown to have a large MSE, due to its unbiasedness. The RB2 estimator was derived as an alternative; its less stringent conditioning (on the minimal sufficient statistic) resulted in an estimator with a small amount of bias but a greatly reduced MSE in the context of a 3:2:1 trial. However, its performance (in particular the coverage of its bootstrapped confidence intervals) worsened for K:2:1 trials as K increased.

Our derivation of the RB1 and RB2 estimators assumes that the within-treatment-arm variances (the inline images) and the number of patients randomized to each treatment arm at stage j (inline image) are equal across treatments. This means that ranking via the test statistic in Eq. (1) is equivalent to ranking by the mutually independent experimental treatment MLEs at each stage. Indeed, the independence property gained by ignoring the common control group data in the selection process is key to the proof. If these two conditions are not met then ranking via the test statistic and ranking via the MLE will not be equivalent, so the RB1 and RB2 estimators as stated here will be invalid. Our development also assumes that the inline image terms are known; a very different approach would be required if they were assumed unknown (Cohen & Sackrowitz, 1989). The BC-MLE approach of Bebu et al. (2010) is, in contrast, much better suited to the unknown-variance case, since the variances can simply be included as additional parameters in the conditional likelihood.

The multistage drop-the-losers design assumes that the trial always proceeds to the final stage. Whilst this could be criticized as nonsensical and inefficient when strong evidence exists to stop the trial early, it gives the trial a fixed sample size that may make it attractive to practitioners and funding bodies alike (Kairalla et al., 2012). One may, however, wish to augment this design with an efficacy/futility stopping rule, as do Stallard & Friede (2008) and Wu et al. (2010). Our approach can easily be adapted to yield an RB1 or RB2 estimator in this context, by recalculating (in the three-stage case) exactly how the sampling space of Y13 is restricted conditional on inline image and given that the trial reached the final stage. However, from an estimation perspective, conditioning on reaching the final stage only really makes sense when the trial can stop early for futility and not efficacy (Pepe et al., 2009). Moreover, in this case, whilst estimation of μ1 is still trivial, unbiased estimation of the treatment–control comparison inline image does not immediately follow, because the control group data must itself be corrected for the selection bias induced by the stopping rules. This extra complication has, however, been successfully addressed in recent work by Kimani et al. (2013).

Further research is needed to explore how best to design and analyze J-stage drop-the-losers trials from an operational planning and hypothesis testing perspective: for example, how to calculate the critical value for testing the selected treatment against the control at the final stage; whether it is possible to control the type I error rate of this test in the strong sense; and how to find multistage designs (e.g., the K and L in the three-stage context) that are optimal in terms of maximal power and minimal size. Some preliminary work on this subject can be found in a technical report (Wason & Bowden, 2012) available at https://sites.google.com/site/jmswason/supplementary-material. Software (in the form of R code) to reproduce our results can be found in the supplementary material accompanying this paper.

Acknowledgments


This work was funded by the Medical Research Council (grant numbers G0800860 and MR/J004979/1). The authors would like to thank the reviewers for their helpful comments which greatly improved the quality of this paper.

Appendix


The RB1 estimator for a J-stage drop-the-losers trial

In this section, we introduce a more general notation. Let inline image represent the number of experimental treatment arms active in the trial at stage j, for inline image. Clearly inline image and inline image. Define inline image to be the set of all treatment arms in the trial at stage j, so that:

  • display math

Let inline image represent the effect estimate of treatment u at stage j for inline image. Let inline image be the standard deviation of the estimates at stage j. Further define:

  • display math

Let inline image represent the J effect estimates associated with the (ultimately) selected treatment with true mean effect μ1. Let inline image represent the vector of the unselected treatment effect estimates at stage j. The vector of all treatment estimates across all stages can be written as inline image and has length N = inline image. The MLE of μ1 at stage j, inline image, is equal to

  • display math(9)

Define inline image and rewrite inline image in terms of Z1 and inline image as

  • display math

As for the three-stage example in Section 'The three-stage drop-losers design', in the first inline image stages of a J-stage trial we sequentially rank the treatment arms by the order of their cumulative MLEs, defined for each treatment remaining in the trial at stage j as in Eq. (9). The j-th stage imposes the restriction:

  • display math

Given Z1, when inline image satisfies all of inline image selection conditions required by event Q, it is restricted to be less than or equal to

  • display math(10)

where inline image represents the treatment effect estimate from stage u associated with the (inline image)-th largest cumulative MLE at stage j, and the second summation is only defined and evaluated if inline image. For example, in Section 'The three-stage drop-losers design' we set J = 3, and conditions (a) and (b) on Y13 correspond to setting j equal to 1 and 2 in formula (10), respectively. This suggests that when conditioning inline image on Q, we additionally need to condition on inline image when calculating the RB1 estimator. Let Y11, Y1, inline image and inline image represent the complete data from the J-stage trial. Following the previous development in Section 'Unbiased and bias-adjusted estimation of μ1', we transform the multivariate normal densities as follows:

  • display math

Upon demonstrating that inline image is independent of inline image given inline image we finish by going from

  • display math

The density inline image is inline image where:

  • display math

Letting inline image we define:

  • display math

where inline image= inline image, inline image is equal to the inline image vector inline image = inline image and inline image is the remaining inline image × inline image matrix. The conditional distribution inline image is inline image, where:

  • display math

for inline image, with inline image equal to the mean parameter vector of a and inline image. Using block-wise inversion techniques (e.g., Srivastava, 2002, Corollary A.5.2), inline image can be expressed as

  • display math

where M1 is the scalar inline image, inline image, and the values of M3 and M4 are unimportant. We now see that

  • display math

and inline image = inline image. Noting that this distribution does not depend on inline image or inline image we write inline image as

  • display math

and inline image equals

  • display math(11)

The J-stage RB2 estimator can be derived as in Section 'Deriving the RB2 estimator', by ignoring the first inline image selection steps that depend on inline image. However, the bias of the RB2 estimator is likely to increase with J.
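For reference, the block-wise inversion step invoked above (Srivastava, 2002, Corollary A.5.2) uses the standard identity for a partitioned matrix, stated here in generic notation rather than the paper's specific covariance blocks:

```latex
\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}^{-1}
= \begin{pmatrix} M_1 & M_2 \\ M_3 & M_4 \end{pmatrix},
\qquad
M_1 = \bigl(\Sigma_{11} - \Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21}\bigr)^{-1},
\quad
M_2 = -\,M_1\,\Sigma_{12}\,\Sigma_{22}^{-1}.
```

In our setting the top-left block corresponds to the selected treatment's component, so M1 reduces to a scalar.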

References


Supporting Information



| Filename | Format | Size | Description |
|---|---|---|---|
| bimj1460-sup-0001-Suppmat.zip | | 10K | Supplementary Material |
