Interim monitoring of sequential multiple assignment randomized trials using partial information

The sequential multiple assignment randomized trial (SMART) is the gold standard trial design to generate data for the evaluation of multistage treatment regimes. As with conventional (single‐stage) randomized clinical trials, interim monitoring allows early stopping; however, there are few methods for principled interim analysis in SMARTs. Because SMARTs involve multiple stages of treatment, a key challenge is that not all enrolled participants will have progressed through all treatment stages at the time of an interim analysis. Wu et al. (2021) propose basing interim analyses on an estimator for the mean outcome under a given regime that uses data only from participants who have completed all treatment stages. We propose an estimator for the mean outcome under a given regime that gains efficiency by using partial information from enrolled participants regardless of their progression through treatment stages. Using the asymptotic distribution of this estimator, we derive associated Pocock and O'Brien‐Fleming testing procedures for early stopping. In simulation experiments, the estimator controls type I error and achieves nominal power while reducing expected sample size relative to the method of Wu et al. (2021). We present an illustrative application of the proposed estimator based on a recent SMART evaluating behavioral pain interventions for breast cancer patients.


Introduction
Treatment of chronic diseases and disorders involves a series of treatment decisions made at critical points in the progression of a patient's health status. To optimize long-term health outcomes, these decisions must adapt to evolving patient information, including response to previous treatments. Strategies for adapting treatment decisions over time are formalized as treatment regimes, which comprise a sequence of decision rules, one per stage of intervention, that map accrued patient information to a recommended treatment (Chakraborty and Moodie, 2013; Tsiatis et al., 2020). The value of a regime is the expected utility if the regime is used to select treatments in the population of interest. A regime is optimal if it has maximal value. Much of the statistical literature on treatment regimes has focused on estimation and inference for optimal regimes (Kosorok and Laber, 2019). However, scientific interest often focuses on comparison of a small number of pre-specified treatment regimes, either with each other or against a control, on the basis of mean outcome.
The gold standard for data collection for the evaluation of treatment regimes is the sequential multiple assignment randomized trial (SMART; Lavori and Dawson, 2004; Murphy, 2005). A SMART contains multiple stages of randomization, with each stage corresponding to a key decision point. In a SMART, if, when, and to whom a treatment might be randomly assigned is allowed to depend on a patient's treatment and outcome history, leading to a rich and flexible class of designs. In the past decade, the use of SMARTs has increased dramatically; SMARTs have been conducted in a range of disease and disorder areas, including cancer (Wang et al., 2012; Thall, 2015; Kelleher et al., 2017), behavioral sciences (Almirall et al., 2014; Kidwell and Hyde, 2016), and mental health (Manschreck and Boshes, 2007; Sinyor et al., 2010). For a comprehensive list of SMARTs, see Bigirumurame et al. (2022).
Every SMART can be equivalently represented as randomizing subjects at baseline among a set of fixed regimes known as the trial's "embedded regimes." Primary analyses in a SMART often focus on comparisons of the embedded regimes against each other or a control (Lavori and Dawson, 2004; Murphy, 2005). These comparisons are often used for sizing a SMART (Seewald et al., 2020; Artman et al., 2020). For example, Figure 1 shows a two-stage SMART schema for behavioral interventions for pain management in cancer patients with eight embedded regimes (Kelleher et al., 2017; ClinicalTrials.gov, 2021). Each embedded regime takes the form "give intervention a; if response, give b; if non-response, give c;" e.g., give pain coping skills training (PCST) Full initially; if response, give maintenance; otherwise, give PCST-Plus. We discuss this study further in Section 7.
Interim monitoring allows early stopping for efficacy or futility, which can reduce cost and accelerate evaluation of candidate treatments. Group sequential methods allowing early stopping are well established for conventional clinical trials (Jennison and Turnbull, 2000). However, analogous methodology for SMARTs is limited. Wu et al. (2021) propose an interim test for a difference in mean outcome among embedded regimes in two-stage SMARTs. However, their approach is based on the inverse probability weighted estimator (IPWE), which does not incorporate baseline and accruing patient information that could be used to enhance efficiency (Zhang et al., 2013). Chao et al. (2020) consider interim analysis for a small-n, two-stage SMART restricted to the specific situation in which the same treatments are available at each stage and the goal is to remove futile treatments.
We develop a class of interim analysis methods for SMARTs based on an augmented inverse probability weighted estimator (AIPWE) for the value of a regime that increases statistical efficiency by using partial information from individuals with incomplete regime trajectories. Our method applies to SMARTs with an arbitrary number of stages and treatments, as well as those in which the set of allowable treatments depends on a patient's history. We present the statistical framework in Section 2. In Section 3, we review the AIPWE for the value of a regime when all participants have progressed through all stages.
We introduce the proposed Interim AIPWE in Section 4. In Section 5, we discuss testing procedures, stopping boundaries, and sample size formulae for interim analysis. In Section 6, we evaluate the empirical performance of the proposed procedure in a series of simulation experiments, and we present a case study based on the cancer pain management SMART in Section 7.

Statistical framework
Consider a SMART with K stages and a planned total sample size of N. Each subject completing the trial generates a trajectory of the form (X_1, A_1, X_2, A_2, ..., X_K, A_K, Y), where A_k ∈ A_k, k = 1, ..., K, is the treatment assigned at stage k; A_k is a finite set of treatment options at decision k; X_1 ∈ R^{p_1} comprises baseline subject variables; X_k ∈ R^{p_k}, k = 2, ..., K, comprises subject variables collected between stages k − 1 and k; and Y ∈ R is an outcome measured at the end of follow up, coded so that higher values are better. Let X̄_k = (X_1, ..., X_k) and Ā_k = (A_1, ..., A_k), and define H_1 = X_1 and H_k = (X̄_k, Ā_{k−1}), k ≥ 2, so that H_k is the information available at the time A_k is assigned. Let H_k = dom H_k and let 2^{A_k} denote the power set of A_k. We assume there exists a set-valued function Ψ_k : H_k → 2^{A_k} so that the set of allowable treatments for a subject with history h_k is Ψ_k(h_k) (van der Laan and Petersen, 2007; Tsiatis et al., 2020).
In this setting, a treatment regime is a sequence of decision rules, d = (d_1, ..., d_K), where d_k maps a history h_k to a treatment in Ψ_k(h_k). Let Y*(ā_K) denote the potential outcome under treatment sequence ā_K = (a_1, ..., a_K), and let X*_k(ā_{k−1}) denote the potential intermediate variables under sequence ā_{k−1}. Define H*_k(ā_{k−1}) = {X_1, X*_2(a_1), ..., X*_k(ā_{k−1}), ā_{k−1}} and H*_1(a_0) = H_1. The potential covariates and outcome for an individual receiving treatment according to regime d are obtained by setting each a_k = d_k{H*_k(ā_{k−1})}; denote the resulting potential outcome by Y*(d). The mean outcome, or value, of a regime d is V(d) = E{Y*(d)}.
In a SMART, primary analyses often focus on inference on V(d) for regimes d that are embedded in the trial. Let π_k(a_k, h_k) = P(A_k = a_k | H_k = h_k) be the probability (propensity) of being randomized to treatment a_k ∈ Ψ_k(h_k) at stage k for a subject with history h_k. It is well known that V(d) is identifiable under the following conditions: sequential randomization, under which A_k is conditionally independent of the potential outcomes and covariates given H_k (A_k ⊥⊥ {Y*(ā_K), X*_2(a_1), ..., X*_K(ā_{K−1})} | H_k, where ⊥⊥ denotes independence), which holds by design in a SMART; consistency, Y = Y*(Ā_K) and H_k = H*_k(Ā_{k−1}); and no interference among subjects (Tsiatis et al., 2020). Hereafter, we assume that these conditions hold.
Take d^ℓ, ℓ = 1, ..., L, to be the regimes embedded in the SMART and d^0 a possible control, e.g., a treatment or regime representing the standard of care. For definiteness, we consider two null hypotheses that address the efficacy of the embedded regimes: (1) homogeneity, H_0H: V(d^1) = ··· = V(d^L); and (2) superiority, H_0D: V(d^ℓ) − V(d^0) ≤ δ for all ℓ = 1, ..., L, for a clinically meaningful margin δ. These hypotheses are analogous to those used in multi-arm, multi-stage and platform trials (Jennison and Turnbull, 2000, Chapter 16; Wason, 2019). The control value V(d^0) may be fixed or estimated from an additional control arm. The methods presented here apply to futility testing with minor modification. Hypotheses that stop the trial for either a single regime or all regimes falling below an efficacy boundary are possible in this construction.

AIPWE for complete data
We briefly review the AIPWE of the value in the setting where one observes N complete independent and identically distributed trajectories (X_{1i}, A_{1i}, ..., X_{Ki}, A_{Ki}, Y_i), i = 1, ..., N. Define C^d_k = ∏_{j=1}^k 1{A_j = d_j(H_j)} to be an indicator that treatment is consistent with d through the first k decisions, and let π_k(a_k, h_k; θ_k) be a model for the stage-k propensity with parameter θ_k. Although the propensities are known in a SMART, estimating them based on correctly specified models can increase efficiency (Tsiatis, 2006b). Let θ_{k,N} be an estimator of θ_k. The form of the AIPWE for V(d) is (Zhang et al., 2013; Tsiatis et al., 2020, Section 6.4.4)

V̂_AIPW(d) = N^{−1} ∑_{i=1}^N [ C^d_{K,i} Y_i / ∏_{k=1}^K π_k(A_{ki}, H_{ki}; θ_{k,N}) + ∑_{k=1}^K {C^d_{k−1,i} − C^d_{k,i}/π_k(A_{ki}, H_{ki}; θ_{k,N})} / ∏_{j=1}^{k−1} π_j(A_{ji}, H_{ji}; θ_{j,N}) · L^d_k(X̄_{ki}) ],    (3)

where L^d_k(x̄_k) is an arbitrary function of x̄_k and we define 0/0 = 0. Setting L^d_k(x̄_k) ≡ 0 yields the IPWE that forms the basis for the approach of Wu et al. (2021); the IPWE uses only the observed outcomes and no covariate information. It can be an inefficient estimator for V(d) when there are covariates that are correlated with the outcome. The efficient choice for L^d_k(x̄_k) is E{Y*(d) | X̄*_k(d) = x̄_k}, noting that C^d_k = 1 is the event that all treatments received are consistent with d through decision k.
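To make the inverse weighting concrete, the following minimal Python sketch computes the IPWE for one regime in a simulated two-stage SMART with known 50/50 randomization. The generative model, variable names, and the regime considered are all illustrative assumptions, not the paper's simulation design.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5000

# Simulated two-stage SMART with 50/50 randomization at each stage.
A1 = rng.integers(0, 2, N)
X2 = rng.normal(A1, 1.0)               # intermediate covariate, depends on A1
A2 = rng.integers(0, 2, N)
Y = 1.0 + A1 + 0.5 * A2 + X2 + rng.normal(0.0, 1.0, N)

# Regime d: give A1 = 1, then A2 = 1 regardless of X2 (true value is 3.5).
C2 = (A1 == 1) & (A2 == 1)             # consistent through both decisions
pi1 = pi2 = 0.5                        # known randomization probabilities

# IPWE: regime-consistent subjects only, inverse-weighted by the propensities.
V_ipw = np.mean(C2 * Y / (pi1 * pi2))
```

Only about a quarter of subjects are consistent with this regime at both stages, so the IPWE discards most of the data; this is the inefficiency the augmentation terms in (3) are designed to recover.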
In practice, the functions L^d_k(x̄_k), k = 1, ..., K, are unknown, but they can be estimated using Q-learning as follows (Tsiatis et al., 2020, Section 6.4.2). Posit a model Q_K(x̄_K, ā_K; β_K) for E(Y | X̄_K = x̄_K, Ā_K = ā_K). Obtain an estimator β_{K,N} for β_K by an appropriate regression method, e.g., least squares, and take as pseudo-outcome Q_{k+1}{X̄_{k+1,i}, Ā_{ki}, d_{k+1}(H_{k+1,i}); β^d_{k+1,N}}, the predicted outcome using the fitted model when individuals receive consistent treatments at stage k + 1. Then, obtain β^d_{k,N} by a suitable regression method using the pseudo-outcomes as the response, e.g., least squares. For individuals with only one treatment available at stages k to K, we use the pseudo-outcome carried forward from the subsequent stage (Tsiatis et al., 2020, Section 6.4.2).
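The backward recursion above can be sketched in a few lines for the same toy two-stage setting; the linear Q-function specifications and the regime (A1 = 1, A2 = 1) are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4000
A1 = rng.integers(0, 2, N)
X2 = rng.normal(A1, 1.0)
A2 = rng.integers(0, 2, N)
Y = 1.0 + A1 + 0.5 * A2 + X2 + rng.normal(0.0, 1.0, N)

def ols(Z, y):
    # Least-squares coefficients for design matrix Z.
    return np.linalg.lstsq(Z, y, rcond=None)[0]

# Stage 2: posit a linear Q-function for E(Y | A1, X2, A2) and fit it.
Z2 = np.column_stack([np.ones(N), A1, X2, A2])
b2 = ols(Z2, Y)

# Pseudo-outcome: predicted Y when stage-2 treatment follows d (A2 = 1).
Z2_d = np.column_stack([np.ones(N), A1, X2, np.ones(N)])
Ytilde = Z2_d @ b2

# Stage 1: regress the pseudo-outcome on the stage-1 design (1, A1).
Z1 = np.column_stack([np.ones(N), A1])
b1 = ols(Z1, Ytilde)

# Value of the regime (A1 = 1, A2 = 1): stage-1 prediction at A1 = 1.
V_q = b1[0] + b1[1]
```

In the full estimator these fitted Q-functions supply the L^d_k terms of the AIPWE rather than being used alone as a value estimate.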

Interim AIPW estimator
The interim AIPW estimator (IAIPWE) uses partial information from individuals who have yet to complete follow up at the times interim analyses are conducted; the IAIPWE includes the IPWE and AIPWE for complete data as special cases. Assume that the enrollment process is independent of all subject information and that the time between stages is fixed, as is the case for many SMARTs. Let S be the number of planned analyses. Let Γ(t) ∈ {0, 1} be an indicator that a participant has enrolled in the SMART at study time t, where t = 0 denotes the start of the study (in calendar time). In addition, let κ(t) ∈ {0, 1, ..., K} be the furthest stage reached by a participant at time t, with Γ(t) = 0 ⇒ κ(t) = 0; and let ∆(t) ∈ {0, 1} be an indicator that a participant has completed follow up, i.e., they have completed all K stages and have had their outcome ascertained. Thus, the number of participants enrolled at time t is n(t) = ∑_{i=1}^N Γ_i(t). We evaluate either the fixed set of L embedded regimes {d^ℓ}_{ℓ=1}^L for null hypothesis (1), or the embedded regimes along with a control regime, d^0, for null hypothesis (2). The control regime may be estimated from a separate trial arm or may have a predetermined fixed value. We use superscript ℓ to indicate that a quantity is being computed for regime d^ℓ; e.g., β^ℓ_{k,N} is shorthand for the corresponding Q-function parameter estimator. We define the "full data" under regime d^ℓ as the complete trajectory a participant would generate upon finishing all K stages; the observed data W^ℓ(t) for an individual at time t therefore comprise Γ(t), κ(t), ∆(t), and the portion of the trajectory accrued by time t. For a given time t and regime d^ℓ, let R^ℓ(t) ∈ {1, ..., 2K, ∞} be a discrete coarsening variable, defined so that R^ℓ(t) = ∞ corresponds to a participant having completed follow up and being consistent with d^ℓ for all treatment decisions at time t. For R^ℓ(t) < ∞, ⌊R^ℓ(t)/2⌋ is the number of stages at which a participant is consistent with d^ℓ at time t, and R^ℓ(t) mod 2 encodes whether the number of consistent stages is due to time-related censoring, i.e., not having yet completed the current stage, or to having been assigned a treatment that is inconsistent with d^ℓ. See Appendix A for an example of how R^ℓ(t) is determined.
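One plausible encoding consistent with the description above is sketched below; the paper's exact construction is given in its Appendix A, so this helper, its arguments, and the odd/even convention (odd = deviated from the regime, even = censored by time) are assumptions for illustration.

```python
def coarsening_level(kappa, delta, consistent):
    """Sketch of the coarsening variable R(t) for one participant.

    kappa      : furthest stage reached at time t (0, ..., K)
    delta      : 1 if follow up is complete (outcome ascertained)
    consistent : list of booleans, regime-consistency at each completed stage
    """
    if delta and all(consistent):
        return float("inf")   # completed follow up, consistent throughout
    r = 0
    for k in range(kappa):
        if consistent[k]:
            r += 2            # consistent through stage k + 1
        else:
            return r + 1      # odd: coarsened by an inconsistent assignment
    return r                  # even: coarsened by time-related censoring
```

For example, a participant who has finished stage 1 consistently but has not yet reached decision 2 would get R(t) = 2, while one assigned an inconsistent stage-1 treatment would get R(t) = 1.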
The observed data W^ℓ(t) are a coarsened version of the full data W*^{d^ℓ}. The coarsening is monotone in that the full data coarsened to level R^ℓ(t) = r are a many-to-one function of the full data coarsened to level R^ℓ(t) = r + 1 at time t. Moreover, the data are coarsened at random, as P{R^ℓ(t) = r | W*^{d^ℓ}} = P{R^ℓ(t) = r | W^ℓ(t)} (Tsiatis, 2006b, Chapter 7; Zhang et al., 2013), which follows from the consistency and sequential randomization assumptions in Section 2. Define the coarsening hazard function λ_r(t) = P{R^ℓ(t) = r | R^ℓ(t) ≥ r, W^ℓ(t)} to be the conditional probability that an individual is coarsened to level r given that they are at risk of being coarsened. Because the data are coarsened at random, λ_r(t) is a function of the observed data. Let the probability that an individual is coarsened after r be K_r(t) = P{R^ℓ(t) > r | W^ℓ(t)}, which is also a function of the observed data. Let K_{r,n(t)}(t) be an estimator of K_r(t). Let ν_k(t) = P{κ(t) ≥ k | Γ(t) = 1}, k = 1, ..., K; ν_{K+1}(t) = P{∆(t) = 1 | Γ(t) = 1}; and let k(r) map coarsening level r to the corresponding decision k. We can express both λ_r(t) and K_r(t) in terms of the propensities π_k(A_k, H_k) and the progression probabilities ν_k(t). It is straightforward to posit models for π_k(A_k, H_k) or ν_k(t) using logistic regression or simple averages and estimate λ_r(t) and K_r(t). The IAIPWE for regime d^ℓ at time t, given in (4), replaces the treatment-consistency weights in (3) with inverse coarsening probabilities K_{r,n(t)}(t) and includes augmentation terms indexed by coarsening level r, where L_{k(r)}(x̄_{k(r)}) is an arbitrary function of x̄_{k(r)}. The estimator is doubly robust and thus guaranteed to be consistent in a SMART with a specified enrollment process. We include a proof in Appendix B. Similar to the AIPWE, we estimate the efficient choice of the unknown functions L_{k(r)} via Q-learning; however, because the IAIPWE uses individuals with incomplete treatment trajectories, the fitting procedure is modified. Construct an estimator β_K for β_K by an appropriate regression method, e.g., least squares, using only individuals who have completed all treatment stages, i.e., ∆(t) = 1, and subsequently fit the stage-k(r) models for k(r) = K − 1, ..., 1.
Estimating β_{k(r)} requires pseudo-outcomes, which may be missing for individuals who have been observed through stage k(r) + 1, i.e., κ(t) > k(r), but have no observed outcome Y or estimable pseudo-outcome from stages k(r) + 2 or later.
In such cases, we define the pseudo-outcomes for estimating β_{k(r)} as the predicted values from the most recently fitted Q-function. This approach uses individuals with incomplete information to fit the Q-functions for greater efficiency. When all observed individuals have completed their regimes, this strategy is equivalent to the pseudo-outcome method outlined in Section 3. We obtain β_{k(r)} by a suitable regression method, using Q_{k(r)+1}(x̄_{k(r)+1}, ā_{k(r)+1}; β_{k(r)+1}), replaced by the nested prediction Q_{k(r)+1}(x̄_{k(r)+1}, ā_{k(r)+1}; β_{k(r)+1}, ..., β_K) when necessary. To make clear the connection between the IAIPWE and the (A)IPWE, we express V̂^ℓ_IA(t) in (4) in an alternate form. For definiteness, consider K = 2 decisions at fixed times, and let ν_2(t) and ν_3(t) be estimated by their sample analogues ν̂_{2,n(t)}(t) and ν̂_{3,n(t)}(t); the resulting alternate expression, (5), separates completed and in-progress participants. If Γ_i(t) = ∆_i(t) = 1 for all i, so that n(t) = N, as at the time of the final analysis, (5) reduces to the AIPWE (3) with K = 2. The augmentation terms in (4) use partial information from participants who are enrolled at the time of an interim analysis but who do not yet have complete follow up. In contrast, the IPWE (obtained by setting L_k(X̄_{ki}) ≡ 0, k = 1, 2, ..., K) uses data only from those subjects who are consistent with the regime under consideration at all stages of the study and who have completed the trial. The AIPWE (3) also uses information only from subjects who have completed the trial, but it additionally uses a series of regression models, one for each stage, to impute information for subjects who are not consistent with the regime under consideration starting from the stage at which their treatment first deviates from the regime. The IAIPWE (4) furthermore uses data from all subjects in fitting the regression models in the AIPWE and thereby uses more information and further improves efficiency.
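As a toy illustration of how partial follow up enters the weighting, the sketch below computes an interim IPW-type value estimate in which the estimated completion probability plays the role of the coarsening survivor function K_r(t). This is a deliberate simplification of the full IAIPWE (no augmentation terms), and the generative model and snapshot probabilities are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 8000

# Same toy generative model as in the earlier sketches.
A1 = rng.integers(0, 2, N)
X2 = rng.normal(A1, 1.0)
A2 = rng.integers(0, 2, N)
Y = 1.0 + A1 + 0.5 * A2 + X2 + rng.normal(0.0, 1.0, N)

# Interim snapshot: enrollment/completion indicators, independent of the data.
enrolled = rng.random(N) < 0.8           # Gamma_i(t)
done = enrolled & (rng.random(N) < 0.5)  # Delta_i(t)

n_t = enrolled.sum()
nu3 = done.sum() / n_t                   # estimate of nu_{K+1}(t)

C2 = (A1 == 1) & (A2 == 1)               # consistency with the regime
pi = 0.25                                # pi_1 * pi_2 for this regime

# Interim IPW-type estimator: completers only, reweighted both for
# randomization and for the chance of having completed follow up by time t.
V_interim = np.sum((done & C2) * Y / (nu3 * pi)) / n_t
```

Reweighting by 1/ν̂ keeps the estimator centered on the value even though only a fraction of enrollees have outcomes; the augmentation terms of the actual IAIPWE then recycle the in-progress participants' covariates for further variance reduction.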
As our goal is to use the IAIPWE for interim monitoring and analyses, we need to characterize its sampling distribution. The following result shows that the IAIPWE for the embedded regimes is asymptotically normal; we use this result to construct tests and decision boundaries in subsequent sections. A proof is given in Appendix C.
Theorem 1 Let V̂(t) = {V̂^1_IA(t), ..., V̂^L_IA(t)} be the stacked value estimators at time t across all regimes, and let n(t)/N →_p c, a constant. Under standard regularity conditions stated in Appendix C, √N{V̂(t) − V(t)} converges in distribution to a mean-zero multivariate normal with covariance matrix Σ. A consistent estimator Σ̂ of Σ can be obtained using the sandwich estimator or the bootstrap. Comparisons among the L + 1 regimes can be constructed using a contrast vector and are asymptotically normal via a simple Taylor series argument (see Appendix C). When there is no control regime, V̂(t) is indexed only by ℓ = 1, ..., L.

Hypothesis testing
For simplicity, consider S = 2 planned analyses at study times t_1 (interim analysis) and t_2 (final analysis). We present the extension to an arbitrary S in Appendix D. We discuss the interim analysis procedure in the context of superiority; the procedure for homogeneity follows under minor modifications.
Define the test statistics at analysis time t_s as Z^ℓ(t_s) = {V̂^ℓ_IA(t_s) − V̂^0_IA(t_s) − δ}/se{V̂^ℓ_IA(t_s) − V̂^0_IA(t_s)}, ℓ = 1, ..., L, where V̂^0_IA(t_s) can be estimated as the sample average of the responses Y_i for individuals receiving d^0 and the denominator is obtained from the approximate normal sampling distribution for V̂(t_s) in Theorem 1. If regime means are compared to a fixed control value V^0, replace V̂^0_IA(t_s) by V^0. At each analysis s, we propose to stop the trial if any test statistic exceeds a stopping boundary c_α(s), which will be discussed in the next section. Heuristically, the testing procedure at significance level α across all t_s is as follows: (1) at the interim analysis time t_1, if Z^ℓ(t_1) > c_α(1) for any ℓ, stop the trial and reject the null; otherwise, continue enrollment; (2) at the final analysis time t_2, reject the null if Z^ℓ(t_2) > c_α(2) for any ℓ.
A trial with more than two planned analyses repeats step (1) for all interim analyses, terminating when a test statistic is greater than the corresponding stopping boundary. This formulation can be adapted to any set of hypotheses involving functions of the values of regimes of interest. For example, testing the homogeneity hypothesis (1) would involve calculation of chi-square test statistics based on the distributions of V̂(t_s), s = 1, 2, analogous to Wu et al. (2021), which would be compared to corresponding stopping boundaries.
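The stopping rule above is mechanically simple; a minimal sketch (the helper name and return format are assumptions for illustration) is:

```python
def run_group_sequential(z_stats, boundaries):
    """Apply the stopping rule: at each analysis s, stop and reject if any
    regime's test statistic exceeds the boundary c_alpha(s).

    z_stats    : list over analyses, each a list of L test statistics
    boundaries : list of stopping boundaries c_alpha(s), one per analysis
    """
    for s, (zs, c) in enumerate(zip(z_stats, boundaries), start=1):
        winners = [l for l, z in enumerate(zs, start=1) if z > c]
        if winners:
            return {"analysis": s, "rejected": winners}
    # No boundary crossed at any analysis: fail to reject.
    return {"analysis": len(boundaries), "rejected": []}
```

For instance, with Pocock-style boundaries (2.66, 2.66), an interim statistic of 2.9 for regime 2 stops the trial at the first analysis.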

Stopping boundaries
We discuss boundary selection and sample size calculations for the superiority null hypothesis (2), which involves multiple comparisons of embedded regimes against a control regime. We seek to determine stopping boundaries c_α(s), s = 1, 2, that control the family-wise error rate across all planned analyses at level α; i.e., such that P_{H_0}{Z^ℓ(t_s) > c_α(s) for any ℓ = 1, ..., L and s = 1, 2} ≤ α. (6) Common approaches to calculating boundaries that satisfy (6) include the Pocock boundary, which takes c_α(s) = c_α for some c_α for s = 1, 2 (Pocock, 1977); the O'Brien-Fleming (OBF) boundary, which takes c_α(s) = ι c_α (O'Brien and Fleming, 1979), where ι is the reciprocal of the square root of the statistical information (e.g., inverse of the variance of the numerator of the associated Z-score) available at analysis s divided by the statistical information available at final analysis S; or the broader α-spending approach (DeMets and Lan, 1994). If the information proportion between the interim and final analysis varies by regime, practitioners may elect to use a regime-dependent ι in the spirit of OBF. For a detailed discussion of if and when each boundary type might be preferable, see Jennison and Turnbull (2000).
Define the stacked vector of sequential test statistics Z = {Z^1(t_1), ..., Z^L(t_1), Z^1(t_2), ..., Z^L(t_2)}. Boundaries that satisfy (6) can be obtained via the joint cumulative distribution function of Z under null hypothesis (2).
Theorem 2 Under the null hypothesis (2) and the conditions of Theorem 1, Z converges in distribution to a multivariate normal with mean zero and covariance matrix Σ_{H_0}, where Σ_{H_0} is a block matrix with diagonal blocks Corr{Z^1(t_s), ..., Z^L(t_s)} and off-diagonal blocks ι^{−1} Corr{Z^1(t_s), ..., Z^L(t_s)}, and ι is the reciprocal of the information proportion between interim analysis s and final analysis S.
A proof of Theorem 2 and discussion on calculating ι and the correlation between the Z-statistics are provided in Appendix E. In practice, computation of c_α can be done numerically. Either the correlation of the test statistics or the variance of all components of the estimator must be specified to compute the stopping boundaries. We approximate c_α through integration of the corresponding multivariate normal distribution of Z. Under the information monitoring approach (Tsiatis, 2006a), the correlation between sequential test statistics for the same regime simplifies to the square root of the ratio of the information available between the two time points. Because of incomplete information for participants enrolled but who have not yet completed the trial, this quantity does not simplify to the square root of the ratio of the interim sample size to the final planned sample size. The off-diagonal elements of the covariance matrix, Σ_{H_0}, may be non-zero for overlapping embedded regimes. For these reasons, it may be difficult to specify Σ_{H_0}. An alternative is to specify generative models for the observed data, i.e., a mean model and distributions for associated covariates, propensities, and enrollment at times of interim analyses, and estimate the correlation structure empirically via simulation.
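One simple way to carry out the numerical integration is Monte Carlo over the joint normal distribution of Z. The sketch below finds a Pocock-type common boundary for one regime with two analyses; the correlation value √0.5 (information proportion 1/2) and one-sided α = 0.05 are assumed inputs for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def pocock_boundary(corr, alpha=0.05, draws=200_000):
    """Common boundary c with P(max component of Z > c) = alpha under H0,
    found by Monte Carlo integration of the joint normal distribution."""
    chol = np.linalg.cholesky(corr)
    Z = rng.standard_normal((draws, corr.shape[0])) @ chol.T
    return np.quantile(Z.max(axis=1), 1.0 - alpha)

# One regime, two analyses, sequential correlation sqrt(1/2).
rho = np.sqrt(0.5)
corr = np.array([[1.0, rho], [rho, 1.0]])
c = pocock_boundary(corr)
```

The resulting boundary sits between the single-test critical value 1.645 and the Bonferroni-like value for two independent looks, reflecting the positive correlation between the sequential statistics.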
The choice of the models and estimators for λ_r(t), K_r(t), and L_{k(r)}(x̄_{k(r)}) impacts the correlation structure of Σ_{H_0} and can result in correlated value estimators across non-overlapping embedded regimes, i.e., regimes that involve different stage 1 treatment options. If cohorts enroll sequentially and interim analyses are planned such that each enrolled cohort has completed follow up before an analysis (i.e., ∆_i(t_s) = Γ_i(t_s) for all i for s = 1, 2), then the test statistics at each analysis use the standard AIPWE (3) computed using data from all participants who have entered the trial. Therefore, stopping boundaries for trials with such enrollment procedures are subsumed by this method.

Power and sample size
With stopping boundaries c_α = {c_α(1)1_L, c_α(2)1_L} ∈ R^{2L}, for 1_L an L-vector of ones, and a specified alternative H_A, the power of the testing procedure is P_{H_A}{Z^ℓ(t_s) > c_α(s) for some ℓ = 1, ..., L and s = 1, 2}, (8) where under H_A, Z has expected value µ_A. As the mean under the alternative, µ_A, is a function of the sample size, so too is (8). Thus, to achieve nominal power 100(1 − β)%, one can set (8) equal to 1 − β and solve for the sample size. Although our results hold for a general alternative hypothesis H_A, we proceed under the simplifying assumption that Σ_{H_0} = Σ_{H_A}, i.e., that the covariance is the same under H_{0D} and H_A. In our implementation, we use a grid search for a fixed enrollment process and ratio between interim sample sizes to find the total planned sample size N that attains the desired power. When the augmentation terms are zero, the analyst must specify the correlation among estimators of the regimes, the information proportion for analyses, the alternative mean outcomes, and the variance of the mean outcomes. When augmentation terms are non-zero, all generative models must be specified to determine the sample size and corresponding power. Specification of all generative models required for the IAIPWE at the design stage may be challenging. Accordingly, a practical strategy is to power the trial, and thus determine N, conservatively based on the IPWE but base interim analyses on the more efficient IAIPWE, which can lead to increased power and smaller expected sample size.
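The grid-search strategy can be sketched as follows for a single regime with two analyses; the effect size, outcome standard deviation, boundary value, and information fractions are all illustrative assumptions rather than recommended design inputs.

```python
import numpy as np

rng = np.random.default_rng(4)

def power_at(N, delta, sigma, corr, bounds, draws=100_000):
    """Monte Carlo power for one regime with analyses at information
    fractions 1/2 and 1, under a normal approximation to Z."""
    mu = np.array([delta * np.sqrt(0.5 * N) / sigma,
                   delta * np.sqrt(N) / sigma])        # drift under H_A
    Z = rng.standard_normal((draws, 2)) @ np.linalg.cholesky(corr).T + mu
    return np.mean((Z > np.asarray(bounds)).any(axis=1))

rho = np.sqrt(0.5)
corr = np.array([[1.0, rho], [rho, 1.0]])
bounds = (1.88, 1.88)                                  # Pocock-type boundary

# Grid search: smallest N on the grid attaining 80% power.
delta, sigma = 0.25, 1.0
N_req = next(N for N in range(50, 1001, 10)
             if power_at(N, delta, sigma, corr, bounds) >= 0.80)
```

The same loop applies to the IAIPWE design problem once the information fractions and correlation are derived from the specified generative models rather than assumed.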
As previously stated, the covariance structure, Σ_{H_0}, depends on the enrollment process through the information proportion at the time of analysis. Thus, one can compute the maximum power for a fixed sample size under differing enrollment processes using (8) adjusted for the differences in the information proportion at the time of the analysis. One can also consider other objectives such as minimizing the time to decision or the cost of the trial using these same procedures.

Test for homogeneity
Exploiting the previous developments, we formulate a sequential testing procedure using Z(t_s) = {Z^1(t_s), ..., Z^L(t_s)} for the global null hypothesis (1), i.e., that all regimes are equal. We derive a χ² statistic using Theorem 2. Let C = [I_{L−1} | −1_{L−1}], where I_q is the (q × q) identity matrix and 1_q a q-vector of ones. Let Σ_{H_0}(t_s) be the (L × L) submatrix of Σ_{H_0} corresponding to the covariance of Z(t_s), and let µ_A(t_s) be the (L × 1) vector corresponding to the alternative mean at time t_s. The sequential Wald-type test statistic at time t_s is

T(t_s) = {C Z(t_s)}′ {C Σ_{H_0}(t_s) C′}^{−1} {C Z(t_s)},

which follows a χ² distribution with degrees of freedom υ = rank{C Σ_{H_0}(t_s) C′} and, under the alternative, noncentrality parameter {C µ_A(t_s)}′ {C Σ_{H_0}(t_s) C′}^{−1} {C µ_A(t_s)}. Following the methods in previous sections, the stopping boundaries now come from a χ² distribution. Using simulation, we estimate the stopping boundaries using the correlation structure of Z such that {c_α(1), c_α(2)} satisfy the type I error rate. The Pocock boundaries still satisfy c_α(s) = c_α; however, the OBF-type boundaries satisfy {c_α(1), c_α(2)} = {ι² c_α, c_α}, with ι as defined in Section 5.2. After calculating the stopping boundaries, we use the distribution of Z for relevant power and sample size calculations. We estimate the total planned sample size required to attain power 1 − β numerically; see Appendix F for details on implementation.
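The Wald-type statistic is a few lines of linear algebra; in this sketch the inputs (a stacked Z vector and its covariance) are placeholders supplied by the user, as the text describes.

```python
import numpy as np

def homogeneity_statistic(z, sigma):
    """Wald-type statistic for H0: all regime values equal, using the
    contrast matrix C = [I_{L-1} | -1_{L-1}] from the text."""
    L = len(z)
    C = np.hstack([np.eye(L - 1), -np.ones((L - 1, 1))])
    diff = C @ z                      # pairwise differences versus regime L
    V = C @ sigma @ C.T               # covariance of the contrasts
    return float(diff @ np.linalg.solve(V, diff))

# Example with L = 3 regimes and (for illustration) identity covariance.
stat = homogeneity_statistic(np.array([2.0, 1.0, 1.0]), np.eye(3))
```

With identity covariance and statistics (2, 1, 1), the contrasts are (1, 0), CΣC′ = [[2, 1], [1, 2]], and the statistic equals 2/3, to be compared against a χ² boundary with υ = 2 degrees of freedom.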

Simulation experiments
We report on extensive simulations to evaluate the performance of the IAIPWE. In our simulation settings, the IPWE corresponds to the proposed method of Wu et al. (2021). We present results here based on 1000 Monte Carlo replications for the schema shown in Figure 1. We evaluate the type I error rates, power, and expected sample sizes for fixed interim analysis times for the null hypothesis H_0D in (2) and alternative hypothesis H_AD: V(d^ℓ) − V(d^0) > δ for at least one ℓ. We also investigate the benefit of leveraging partial information through the IAIPWE over an IPWE in trials with sample size determined by the IPWE. Finally, we consider how the proportion of enrolled individuals having reached different stages of the trial at an interim analysis affects performance. We consider both Pocock and OBF boundaries. We use correctly specified Q-functions for augmented estimators. Appendix G includes results for additional schemas and settings; the results are qualitatively similar.
Table 1 summarizes the total planned sample size to achieve power 80% under a specified alternative (VP2 for VP1 and VP2, and VP3 for VP3), the proportion of early rejections of null (2), the proportion of total rejections of null (2), the expected sample size, and the expected stopping time. Results are given both for the total sample size that achieves the desired power for each individual estimator (a) and for the total sample size for the IPWE to achieve the desired power (b). The slight differences among the total planned sample sizes in (a) and (b) are due to Monte Carlo error. All estimators achieve nominal power and type I error rate. The IAIPWE requires a smaller total planned sample size to achieve nominal power. The IAIPWE also exhibits the highest early rejection rate under true alternatives, demonstrating the efficiency gain from the augmentation terms, and therefore achieves lower expected sample sizes and earlier expected stopping times. The AIPWE slightly underperforms relative to the IPWE due to overestimation of the variance using the sandwich matrix for small n(t_1). It is well known that the performance of the sandwich estimator can deteriorate in small samples. As such, alternative estimation of the covariance matrix, such as the empirical bootstrap, can be used. The IAIPWE is less affected by overestimation of the variance than the AIPWE. When the total sample size is selected based on the IPWE and an augmented estimator is used, the type I error rate is controlled and the study achieves higher power.
Table 2 summarizes estimation performance at the interim and final analyses, where a relative mean squared error (MSE, relative to the IPWE) greater than one implies that the indicated estimator is more efficient than the IPWE. The estimators are all consistent, as expected. Both the AIPWE and IAIPWE are more efficient than the IPWE at both analyses, and the IAIPWE is more efficient than the AIPWE. At the interim analysis, the standard errors for the IPWE underestimate the sampling variation in most cases, whereas the standard errors for the AIPWE overestimate the sampling variation. The IAIPWE consistently estimates the sampling variation, with the exception of regime 6 at the interim analysis.
In the second scenario, we investigate how different enrollment processes affect the proportion of early rejections for hypothesis (2) with S = 2 analyses. To vary the rate of enrollment, we select in which of four time periods ([0, 500], [501, 600], [601, 700], and [701, 1000]) an individual enrolls using a multinomial distribution. Within each period, individuals enroll uniformly. Results for the Pocock stopping boundaries under (VP2) are given in Table 3. The sample sizes are determined to achieve 80% power under (VP2), and the interim analysis is conducted on day 700. Both the total planned and expected sample sizes are lower for the IAIPWE than for the IPWE or AIPWE. The proportion of early rejections is higher when more individuals have progressed further through the study, due to the increased information available at the time of analysis. All methods attain the desired power, and the IAIPWE achieves earlier expected stopping times and lower expected sample sizes than the IPWE and AIPWE.
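The enrollment mechanism just described is straightforward to simulate; the period probabilities below are illustrative assumptions, not the values used in the paper's Table 3.

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate_enrollment(N, probs, periods):
    """Draw enrollment days: pick a period via a multinomial draw, then
    enroll uniformly within that period."""
    idx = rng.choice(len(periods), size=N, p=probs)
    lo = np.array([p[0] for p in periods], dtype=float)[idx]
    hi = np.array([p[1] for p in periods], dtype=float)[idx]
    return rng.uniform(lo, hi)

periods = [(0, 500), (501, 600), (601, 700), (701, 1000)]
days = simulate_enrollment(284, [0.4, 0.2, 0.2, 0.2], periods)
```

Shifting probability mass toward the early periods increases the fraction of participants who have completed the trial by the day-700 interim analysis, which is exactly the lever varied in this scenario.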
In Appendix G, we present results for two additional, common designs: the schema in Figure 1 with a control arm and a schema in which responders are not re-randomized. The additional simulations demonstrate that the IAIPWE performs well even under misspecification of the Q-functions. In small samples, the IAIPWE variance may be overestimated, resulting in the estimated proportion of information at interim analyses being inflated. OBF boundaries may be conservative in these cases. The IAIPWE performs well with multiple interim analyses and for the χ² testing procedure for H_0H.

Table 1: For the schema in Figure 1, interim analysis performance results for testing hypothesis (2) against H_AD with a fixed control value using Pocock and OBF boundaries. VP indicates the true value pattern. Method indicates the estimator used. The total planned sample size N is determined by either each method (a) or by the IPWE (b). Total planned sample sizes are determined to maintain a nominal type I error rate of α = 0.05 and achieve a power of 80% under the respective value patterns, using alternative (VP2) to determine the sample size for the null (VP1). Early Reject and Total Reject are the rejection rates at the first analysis and for the overall procedure, respectively. E(SS) is the expected sample size, i.e., the average number of individuals enrolled in the trial when the trial is completed. E(Stop) is the expected stopping time, i.e., the average number of days that the trial ran. Monte Carlo standard deviations are given in parentheses.
Case study: cancer pain management SMART

We present a case study based on a recently completed trial evaluating behavioral interventions for pain management in breast cancer patients (Kelleher et al., 2017; ClinicalTrials.gov, 2021). A schematic for the trial is shown in Figure 1. Initially, patients are randomized with equal probability to one of two pain coping skills training interventions: five sessions with a licensed therapist (PCST-Full) or one 60-minute session (PCST-Brief) with a licensed therapist. After eight weeks (end of stage one), participants who achieve a 30% reduction in pain from baseline are deemed responders and randomized with equal probability to maintenance therapy or no further intervention. Non-responders who received PCST-Full are randomized with equal probability to either two full sessions (PCST-Plus) or maintenance. Non-responders who received PCST-Brief are randomized with equal probability to PCST-Full or maintenance. The eight embedded regimes are given in Figure 1. Follow up occurs eight weeks after administration of the stage two intervention and again six months later. Here, we take the outcome of interest to be percent reduction in pain from baseline at the final six-month assessment and the primary analysis to be the evaluation of the eight embedded regimes via the null hypothesis H_0D in (2) as described below. Because the data from the trial are not yet published, we simulate the trial based on the protocol. We consider five baseline covariates: height X_{1,1}, weight X_{1,2}, presence/absence of comorbidities X_{1,3}, use of pain medication X_{1,4}, and whether or not the participant is receiving chemotherapy X_{1,5}. We observe the response status R_2, percent reduction in pain X_{2,0}, and degree of adherence X_{2,1} at the first follow up at the end of stage one. Participants enroll uniformly over 1000 days, the end of stage one occurs eight weeks after enrollment, and the outcome Y is ascertained eighteen weeks after the end of stage one and thus six months after enrollment. The distributions of covariates and outcomes are given in Appendix H. We take N = 284 to match the sample size of Kelleher et al. (2017).
An interim analysis is planned for day 500 and a final analysis at the trial conclusion, a maximum of 1182 days. We test the null hypothesis (2) against the alternative that any regime achieves greater than a 22.5% reduction in pain (fixed control value); see Appendix H. We consider both Pocock and OBF boundaries, for which, to achieve a type I error of α = 0.05 using our IAIPWE procedure, c_{α=0.05} = (2.66, 2.66) and (4.20, 2.43), respectively. For the AIPWE and IPWE, the Pocock and OBF boundaries are (2.66, 2.66) and (4.30, 2.44), respectively. In this setting, the correlation structure for Z is similar for all estimators; therefore the Pocock boundaries are the same even with the difference in available information at the interim analysis. As a result, the Pocock boundaries illustrate in part why we expect more early rejections under a true alternative for the IAIPWE than for the other estimators. By construction, the different OBF boundaries demonstrate the impact of the increased information available using the IAIPWE at the interim analysis.

Figure 2: For the schema in Figure 1, interim analysis performance results for testing null hypothesis (2) against H_AD with a fixed control. Results include the Pocock boundaries (dashed), OBF boundaries (dotted), value estimates (circles) and 95% confidence bounds, and test statistics (rhombus) at the interim (left) and final (right) analysis time for the behavioral pain management case study data set using the IAIPWE.
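The boundary constants above come from the joint null distribution of the interim and final Z-statistics. As a hedged illustration of how such constants can be calibrated, the following Monte Carlo sketch solves for one-sided Pocock and OBF constants for a single Z-statistic monitored at assumed information proportions (0.5, 1.0); the function names and proportions are ours, and the paper's constants are larger because they also account for testing all L regimes simultaneously.

```python
import numpy as np

# Assumed setup: S = 2 analyses, information proportions (0.5, 1.0).
rng = np.random.default_rng(0)
n_sim = 1_000_000
info = np.array([0.5, 1.0])
corr = np.sqrt(info[0] / info[1])  # corr(Z_1, Z_2) under independent increments

# Simulate correlated standard-normal Z statistics under the null.
z1 = rng.standard_normal(n_sim)
z2 = corr * z1 + np.sqrt(1 - corr**2) * rng.standard_normal(n_sim)

def crossing_prob(b1, b2):
    # Probability the trajectory crosses either boundary under the null.
    return np.mean((z1 > b1) | (z2 > b2))

def solve(kind, alpha=0.05, lo=0.5, hi=5.0):
    # Bisect for the constant c: Pocock uses the same c at both looks,
    # OBF scales c by the inverse square root of the information proportion.
    for _ in range(60):
        c = (lo + hi) / 2
        p = (crossing_prob(c, c) if kind == "pocock"
             else crossing_prob(c / np.sqrt(info[0]), c / np.sqrt(info[1])))
        lo, hi = (c, hi) if p > alpha else (lo, c)
    return c

c_pocock, c_obf = solve("pocock"), solve("obf")
```

Replacing the bivariate normal with the full joint distribution of the regime-specific statistics would reproduce the multiplicity-adjusted constants used in the case study.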
The interim analysis occurs 500 days after trial enrollment begins, at which point 51.4% of the total planned sample size N has been enrolled, 46.8% of the N planned participants have progressed to the second decision, and 34.9% have completed the trial. Figure 2 summarizes the estimated value for each regime at the time of analysis and the corresponding Z-statistic. Exact numbers are recorded in tabular format in Appendix I. Regime 1, which starts with PCST-Full, triggers early stopping based on the test statistic exceeding the OBF boundary. Regimes 1 and 3 trigger early stopping based on test statistics exceeding the Pocock boundary. The standard errors are smaller than those obtained using the IPWE or AIPWE, which are included in Appendix I. The IPWE and AIPWE trigger early stopping with regimes exceeding the Pocock boundary, but fail to trigger early stopping under the OBF boundary. The decision to stop the trial early reduces the sample size from the total possible 284 subjects to 146 and the length of the study by 96 weeks. Early stopping permits earlier implementation of behavioral interventions for pain management in breast cancer patients, potentially helping more individuals and avoiding assignment of less efficacious regimes to those who otherwise would have enrolled in the trial.

Discussion
We proposed interim analysis methods for SMARTs that gain efficiency by using partial information from participants who have not yet completed all stages of the study.The approach yields a smaller expected sample size than competing methods while preserving type I error and power.Simulations demonstrate a potential for substantial resource savings.
We have demonstrated the methodology in the case of two-stage SMARTs with an interim analysis focused on evaluation of efficacy. However, the methods extend readily to studies with K ≥ 2 decision points, multiple interim looks, and general hypotheses including futility. We have considered Pocock and OBF boundaries, though the approach can be adapted to any monitoring method, such as information-based monitoring (Tsiatis, 2006a) and the use of α-spending functions (DeMets and Lan, 1994).
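As one concrete alternative, α-spending in the style of DeMets and Lan (1994) allocates the type I error over the information proportion t ∈ (0, 1]. A minimal sketch of the two standard spending functions (function names are ours):

```python
from math import e, log, sqrt
from statistics import NormalDist

def obf_spending(t, alpha=0.05):
    """Lan-DeMets O'Brien-Fleming-type spending function (two-sided form):
    spends very little alpha at low information proportion t."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / sqrt(t)))

def pocock_spending(t, alpha=0.05):
    """Lan-DeMets Pocock-type spending function: spends alpha more evenly."""
    return alpha * log(1 + (e - 1) * t)
```

Both functions spend the full α at t = 1, and the OBF-type function spends far less than the Pocock-type function early in the trial, which matches the conservatism of OBF boundaries at low information noted in our simulations.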
We have made the simplifying assumptions throughout that: (i) the time between stages is fixed, which is the case for many SMARTs; and (ii) the final outcome is observed on all individuals by the end of the trial (thus excluding the possibility of dropout). The extension to random times between stages is non-trivial. Simulations included in Appendix G suggest that the IAIPWE (incorrectly assuming fixed transition times) performs well when the time per stage varies with subject outcomes. Due to variability in enrollment, an analysis at a predetermined time may have a realized power slightly different from the nominal power, depending on the number of individuals enrolled and their realized trajectories at the time of analysis. In such cases, planning the interim analysis based on available sample size rather than a predetermined time may be preferred. Extensions to additional levels of coarsening, such as those due to dropout, attrition, or time-to-event outcomes, require additional augmentation terms or changes to the functions λ_r(t), K_r(t), and L_{k(r)}(x_{k(r)}). For a comprehensive review of the considerations involved, see Chapter 8 of Tsiatis et al. (2020). A modified multiple imputation strategy may also be used for missing data, following that of Shortreed et al. (2014).
As demonstrated in our simulation experiments, the sandwich covariance estimator can overestimate the variance of the values and lead to conservative stopping boundaries when the number of parameters is close to the sample size. Interim analyses typically have larger sample sizes, so this issue is unlikely to occur in practice. The information proportion can be checked at each interim analysis to verify the planned proportions against the realized values. The IAIPWE stopping boundary and sample size calculations also pose the challenge of positing models. Although we have studied the performance of the IAIPWE under these conditions to evaluate its properties fully, we anticipate that trialists will prefer to power a SMART based on the IPWE to avoid making the additional model assumptions. We advocate this approach in practice, as it can assuage concerns about misspecified models while still benefiting from the efficiency gains of the IAIPWE. If a trial does reach the final analysis, using the AIPWE offers efficiency gains by effectively performing covariate adjustment. Here, the covariates to be used in the Q-functions should be specified before the trial begins.
The framework presented here forms the basis for additional methodology for interim monitoring of SMARTs with random times between stages and specialized endpoints. The IAIPWE has potential use in adaptive trials in which the randomization probabilities, or even the set of treatments, vary with accumulating information (Jennison and Turnbull, 2000, Chapter 17; Wang and Yee, 2019). We will report on these developments in future work.
Consider the cases r odd and r even separately. Simplify the probability statements using our propensities and enrollment notation, and the fact that the term is 0 for all individuals coarsened before r. We can express the denominator through the individuals who have been consistent with the regime through k(r) for r odd. Further algebra and simplifications from K = 2 yield the estimator in the paper. If the time to the next treatment varies by treatment assignment, then the probability of coarsening due to time conditioned on the treatment assignment is no longer expressed with ν as defined, and proper adjustments can be made. For arbitrary l, the estimator converges to V(d). By Lemma 10.4 of Tsiatis (2006b), the IAIPW estimator is therefore consistent if the second term is 0. First, consider the case where the propensity and proportion models are correctly specified. Then for r = 1, . . ., 2K, the hazard functions are correctly specified, i.e., λ_{r,i}(t; θ*_p) = λ*_r(t) for all r = 1, . . ., 2K. Define W_{r,i} as the random vector {I(R_i(t) = 1), . . ., I(R_i(t) = r − 1), W_i}. Then, by iterated expectations and the definition of the hazard functions, for all r,

Now consider the case of arbitrary functions. Under the assumption of coarsening at random and using iterated expectations, the second term is again 0. Therefore the estimator is consistent if either the propensity or regression models are correctly specified.
conditions from Section B, with covariance Σ_{T_s} given by (7.10) of Boos and Stefanski (2013) and the discussion below. Let 1_T be the SL × S(L + 1) matrix of block diagonal matrices [−1_{L×1} | I_{L×L}]. Then, by Slutsky's theorem, where

Discussion
We can determine the value of ι given in Theorem 2 by finding the value of Σ_H for a general hypothesis H, since the ι^{−1} are entries (1, L + 1), (2, L + 2), . . ., (L, L + L) of Σ_H. In the case that the information proportion varies between regimes, the ι^{−1} will be regime-specific.
The matrix Σ_H can be computed analytically. First, one must find the estimating equations used for all estimated parameters and stack these over all planned analyses. Then, the covariance Σ_ψ of the stacked estimating equations can be computed following Boos and Stefanski (2013). However, this gives the covariance of all estimated parameters, whereas we are interested in the covariance of the estimated values. Construct the block diagonal matrix bdiag(1_1, . . ., 1_S), where 1_s is the matrix [0 | I_{L+1,L+1}] such that 1_s θ = V, as given in Appendix C. Then Σ_{T_s} = bdiag(1_1, . . ., 1_S) Σ_ψ bdiag(1_1, . . ., 1_S)′. Let 1_T be the matrix of S block diagonal matrices [−1_{L×1} | I_{L×L}]. Then, following an application of Slutsky's theorem,

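A small numerical sketch of this projection step, with hypothetical dimensions (L regimes plus a control, S analyses, p nuisance parameters per analysis) and a placeholder for Σ_ψ:

```python
import numpy as np

L, S = 3, 2            # assumed: L regimes vs. a control, S planned analyses
p = 2                  # assumed: nuisance parameters per analysis
dim = p + L + 1        # parameters per analysis: nuisances, then L + 1 values

# Selection matrix 1_s = [0 | I_{L+1}] picks the values out of theta_s.
sel = np.hstack([np.zeros((L + 1, p)), np.eye(L + 1)])
# Contrast block [-1 | I_L]: each regime's value minus the control's.
con = np.hstack([-np.ones((L, 1)), np.eye(L)])

P = np.kron(np.eye(S), sel)      # bdiag(1_1, ..., 1_S)
T = np.kron(np.eye(S), con)      # 1_T

rng = np.random.default_rng(1)
G = rng.standard_normal((S * dim, S * dim))
Sigma_psi = G @ G.T              # placeholder covariance of stacked estimating equations

Sigma_Ts = P @ Sigma_psi @ P.T   # covariance of the stacked value estimates
Sigma_H = T @ Sigma_Ts @ T.T     # covariance of the value contrasts across analyses
```

In practice Σ_ψ would be the sandwich covariance of the stacked estimating equations rather than the random placeholder used here; the two projections are then simple matrix congruences.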
Responders Receive a Single Treatment Option
We consider the trial design in Figure 3 and test the null hypothesis H_0D for superiority. Enrollment times are drawn uniformly between 0 and 1000 days, and follow-up times occur every 100 days. The first interim analysis is conducted when 30% of individuals have completed the trial, which, under the enrollment mechanism, corresponds to approximately 50% enrollment. We generate two baseline covariates X_{1,1} ∼ Uniform(25, 75) and X_{1,2} ∼ Bernoulli(0.5), as well as an interim outcome X_{2,1} ∼ Uniform(0, 1) and response status R_2 ∼ Bernoulli(0.4); for notational consistency, R_2 is considered part of X_2. The initial treatment is generated as A_1 ∼ Bernoulli(0.5); the second treatment is generated as A_2 | R_2 = 0 ∼ Bernoulli(0.5), and A_2 | R_2 = 1 is 0. Outcomes are normally distributed with variance σ² = 100 and conditional mean determined by β = (β_0, β_1, . . ., β_12)′. Values of β were chosen to encode three value patterns (VPs): (VP1) all regimes are equivalent (β = (10, 0.5, 12.5, 0, 0, 0, 12.5, 12.5, 0, 0, 0, 0, 0)′); (VP2) there is a single best embedded regime (β = (10, 0.5, 12.5, 0, 0, 0, 12.5, 12.5, 0, 5, 0, 0, 0)′); and (VP3) embedded regimes starting with A_1 = 0 are optimal (β = (12.5, 0.5, 12.5, −2.5, 0, 0, 12.5, 12.5, 0, 0, 0, 0, 0)′). In (VP2), embedded regime 4 attains a higher value of 50.5, and in (VP3), embedded regimes 1 and 2 attain the higher value of 50.0. All other regimes for each VP have value 47.5. Clinically meaningful differences of δ = 3 and 2.5 from the fixed control mean value V(d_0) = 47.5 and nominal power 80% are used for the sample size calculations for (VP2) and (VP3), respectively.

Figure 3: Schema in which responders have only one treatment available at stage K = 2. The design embeds four regimes of the form "Give intervention a; if non-response, give b; otherwise, if response, give c." Regimes 1, . . ., 4 take (a, b, c) to be (0, 2, 4), (0, 3, 4), (1, 5, 7), and (1, 6, 7), respectively.

Table 5 summarizes the true value pattern, the estimator used, the total planned sample size to achieve desired power under a specified alternative, the proportion of early rejections of H_0D, the proportion of total rejections, the expected sample size, and the expected stopping time for analyses using the IPWE, AIPWE, and IAIPWE. Results are presented both when the sample size is determined for each estimator and when the sample size for the IPWE is used for all estimators. The results may differ for each estimator because the IPWE and AIPWE use only individuals with complete trajectories. We calculate the total planned sample size to achieve power 80% under (VP2) as the sample size for investigating the type I error rate under true (VP1), and otherwise to achieve power 80% under the respective VPs. The expected sample size is the average number of individuals enrolled in the trial when the trial is stopped, regardless of their contribution to the estimator used. We test the null H_0D against the alternative H_AD for δ = 0 using the testing procedure outlined in Section 5, with S = 2 planned analyses at day 500 and, if applicable, trial completion. We see that type I error rates are controlled and the nominal power is attained across the three value patterns and stopping boundaries. The sandwich estimator of the variance of the values overestimates the asymptotic variance for small N, which results in deflated early rejections for the AIPWE in these scenarios. The use of partial information for individuals by the IAIPWE results in smaller expected sample sizes and earlier expected stopping times under the alternative compared with the IPWE or AIPWE. As anticipated from performance in conventional single-stage clinical trials, OBF boundaries may be too conservative if the analysis is performed when the proportion of information is low.
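A minimal sketch of the data-generating mechanism just described; the conditional mean below is a hypothetical stand-in, since we do not reproduce the full β parameterization here.

```python
import numpy as np

def simulate_smart(n, rng):
    """Simulate one two-stage trial under the design of Figure 3 (sketch).

    Covariate, response, and treatment distributions follow the text;
    the mean model `mu` is a placeholder, not the paper's beta model.
    """
    X11 = rng.uniform(25, 75, n)        # baseline covariate X_{1,1}
    X12 = rng.binomial(1, 0.5, n)       # baseline covariate X_{1,2}
    A1 = rng.binomial(1, 0.5, n)        # first treatment, randomized 1:1
    X21 = rng.uniform(0, 1, n)          # interim outcome
    R2 = rng.binomial(1, 0.4, n)        # response status
    # Responders have a single stage-two option; non-responders re-randomized.
    A2 = np.where(R2 == 1, 0, rng.binomial(1, 0.5, n))
    mu = 10 + 0.5 * X11                 # hypothetical stand-in for the beta model
    Y = mu + rng.normal(0, 10, n)       # sigma^2 = 100
    return dict(X11=X11, X12=X12, A1=A1, X21=X21, R2=R2, A2=A2, Y=Y)
```

For example, `simulate_smart(5000, np.random.default_rng(0))` returns one simulated trial in which every responder receives the single available stage-two option.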
Table 5: For the schema in Figure 3, interim analysis performance results for testing H_0D against H_AD with a fixed control value using Pocock and OBF boundaries. Value pattern indicates the true value pattern. Method indicates the estimator used. The total planned sample size N is determined either by each method (a) or by the IPWE (b). Total planned sample sizes are determined to maintain a nominal type I error rate of α = 0.05 and achieve a power of 80% under the respective value patterns, using alternative (VP2) to determine the sample size for the null (VP1). Early Reject and Total Reject are the rejection rates at the first analysis and for the overall procedure, respectively. E(SS) is the expected sample size, i.e., the average number of individuals enrolled in the trial when the trial is completed. E(Stop) is the expected stopping time, i.e., the average number of days that the trial ran. Monte Carlo standard deviations are given in parentheses.

Table 6 contains the same information as Table 5 when the Q-functions are misspecified by not including terms for X_{1,1}. The sample size to attain the desired power increases for the IAIPWE and AIPWE but still remains smaller than for the IPWE. The type I error rates are still controlled and the nominal power attained if the stopping boundaries are chosen under the misspecification. Table 7 contains the same information as Table 5 when testing the null hypothesis of homogeneity H_0H using the χ² testing procedure. We again use the sample size determined to achieve power 80% under (VP2) as the sample size for investigating the type I error rate under true (VP1). We consider S = 2 planned analyses at day 500 and, if applicable, trial completion. Table 7 shows that there is a slight increase in the total planned sample size required to achieve power equivalent to that for testing H_0D. The true OBF boundaries for a χ² test make early stopping statistically improbable at interim analyses, which is reflected in the low early rejection rates and the difference between the expected sample size using OBF boundaries and the expected sample size when performing a single analysis. The procedure achieves nominal power with a slightly inflated type I error rate. Finally, we consider S = 3 with interim analyses at days 500 and 700. We test H_0D against H_AD with δ = 0. Table 8 contains the same entries as Table 5 together with the proportion of rejections that occur at the first analysis s = 1, at the second analysis s = 2 if the trial continued, and the total rejections if H_0D was rejected at any analysis. The IAIPWE again has the lowest expected sample size and earliest expected stopping times. For OBF boundaries, the total planned sample size is marginally higher for the IAIPWE than the AIPWE due to Monte Carlo error. Both nominal type I error rates and power are achieved.
Table 8: For the schema in Figure 3, interim analysis performance results for testing H_0D against H_AD with a fixed control value using Pocock and OBF boundaries. Summary of results as described in Table 5 with S = 3 planned analyses on days 500, 700, and trial end. Rejections for s = 1 and s = 2 are given as the proportion of rejections that occur at that analysis without a prior rejection.

Variable Time Between Analyses
For illustrative purposes, we consider the case when the time between stages is not fixed. The outcomes are distributed as in the previous setting where responders receive a single treatment. Enrollment times are drawn uniformly between 0 and 1000 days. The second treatment is assigned at a follow-up which occurs uniformly between 90 and 110 days after enrollment. The final outcome is observed uniformly between 90 and 110 days after the second treatment is assigned.
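Under these timing assumptions, each subject's progression status at an analysis time t — the quantity that determines how much partial information the IAIPWE can use — can be sketched as follows (names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 284
enroll = rng.uniform(0, 1000, n)                # enrollment day
t_stage2 = enroll + rng.uniform(90, 110, n)     # second treatment assigned
t_final = t_stage2 + rng.uniform(90, 110, n)    # final outcome observed

def progression(t):
    """Counts of subjects at each level of progression at analysis time t."""
    return {
        "not_enrolled": int(np.sum(enroll > t)),
        "stage1_only": int(np.sum((enroll <= t) & (t_stage2 > t))),
        "stage2_only": int(np.sum((t_stage2 <= t) & (t_final > t))),
        "completed": int(np.sum(t_final <= t)),
    }
```

An IPWE-based interim analysis would use only the "completed" group, whereas the IAIPWE additionally draws on the "stage1_only" and "stage2_only" groups.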
Table 9: For the schema in Figure 3, interim analysis performance results for testing H_0D against H_AD with a fixed control value using Pocock and OBF boundaries. Summary of results as described in Table 5 when the time between follow-ups is not fixed.

Table 9 contains the same entries as Table 5 when the time between follow-ups is no longer fixed. We see that there is little difference in the performance of the estimators compared to when the time between stages is fixed. This suggests that the IAIPWE is robust to variability in the time between follow-ups provided this variability remains independent of the treatments. We consider an additional schema with the inclusion of a control arm, shown in Figure 4. Individuals are randomly assigned to treatments labeled PCST-Full (0), PCST-Brief (1), or Control (2) with equal probability and enrolled in the trial as stated in trial design 1. We encode the control arm as treatment A_1 = 2 to align our notation with the mean model from the simulations in Section 6. Interim analyses are conducted at day 700 and trial end to mitigate over-estimation of the variance at day 500 for small n(t_1). The outcome has variance 100 and mean

Motivating Design with an Estimated Control Arm
Let (VP2) indicate that embedded regimes l = 1, . . ., 4 attain a higher mean outcome than the standard of care by δ = 3. Let (VP3) indicate that embedded regimes l = 1, . . ., 4 attain a higher mean outcome than the standard of care by δ = 5.
Table 10 summarizes the results with entries as in Table 5. We use the sample size determined to achieve power 80% under (VP2) as the sample size for investigating the type I error rate under (VP1); otherwise, the sample size is determined under the respective VP. We test H_0D against the alternative H_AD for δ = 0 with S = 2 planned analyses. As expected, estimation of a control arm increases the sample size required to achieve the same power as when the control is fixed. Therefore, a more extreme treatment difference is required to obtain a similar overall sample size when estimating the mean under a control arm in a SMART. All methods attain the desired power, and the IAIPWE consistently has lower expected sample sizes than the IPWE and AIPWE. In some cases the AIPWE and IPWE have similar expected stopping times due to the minimal difference in estimated information proportion or the over-estimated variance at the first analysis by the AIPWE. However, when the sample size for all estimators is determined using the IPWE, the superiority of the AIPWE is seen in uniformly earlier stopping times and lower expected sample sizes. Differences in the total planned sample sizes for the IPWE under (a) and (b) are due to Monte Carlo error.

It is sufficient to show independent increments by showing that, for the influence functions of the estimator V_l^IA(t_s), cov{IF_l(t_s), IF_l(t_{s'})} = var{IF_l(t_s)} for t_s < t_{s'}. By construction, E{IF_l(t_s)} = 0. Furthermore, Σ_{r=0}^{2K} C_r = 1 for individuals enrolled in the study. We begin by showing that these terms have a martingale structure; in particular, we show that the expectation of the cross terms for r′ = r ± 1 is 0. Therefore, for both k = 1, 2, we have shown that independent increments holds.

Figure 1: Schema for the SMART evaluating regimes involving behavioral interventions for pain management in breast cancer patients, embedding eight regimes of the form "Give intervention a; if non-response, give b; otherwise, if response, give c." The embedded regime determined by a = PCST-Full, b = PCST-Plus, c = Maintenance is shown with dashed lines around the treatments. Regimes l = 1, . . ., 8 take (a, b, c) to be (Full, Plus, Maintenance), (Full, Plus, No Intervention), (Full, Maintenance, Maintenance), (Full, Maintenance, No Intervention), (Brief, Full, Maintenance), (Brief, Full, No Intervention), (Brief, Maintenance, Maintenance), and (Brief, Maintenance, No Intervention), respectively. This figure appears in color in the electronic version of this article.

Figure 4: Schema for the SMART evaluating regimes involving behavioral interventions for pain management in breast cancer patients with an additional arm for standard of care.

Table 2 :
For the schema in Figure 1, interim analysis performance results for testing hypothesis (2) against H_AD with a fixed control value under Pocock boundaries under (VP2) and sample size N based on the method. MC Mean is the Monte Carlo mean of the estimates, MC SD is the Monte Carlo standard deviation of the estimates, ASE is the Monte Carlo mean of the standard errors, and MSE Ratio is the ratio of the Monte Carlo mean squared error for the IPWE divided by that of the indicated estimator for the three estimates at the interim analysis (a) and final analysis (b) for B = 1000 simulations. The true values under (VP2) for regimes (1, . . ., 8) are (49.5, 49.5, 49.5, 49.5, 47.5, 47.5, 47.5, 47.5).

Table 3 :
For the schema in Figure 1, interim analysis performance results for testing hypothesis (2) against H_AD with a fixed control value using Pocock boundaries under varying enrollments. The interim analysis is conducted on day 700. The percentages p_1, p_2, and p_3 are the expected percentages of individuals to have completed the trial, to have made it only to stage two, and to have made it only to stage one, respectively. Method indicates the estimator used. The total planned sample size N is determined by each method. Total planned sample sizes are determined to maintain a nominal type I error rate of α = 0.05 and achieve a power of 80% under (VP2). Early Reject and Total Reject are the rejection rates at the first analysis and for the overall procedure, respectively. E(SS) is the expected sample size, i.e., the average number of individuals enrolled in the trial when the trial is completed. E(Stop) is the expected stopping time, i.e., the average number of days that the trial ran. Monte Carlo standard deviations are given in parentheses.

Table 4 :
For the schema in Figure 1, interim analysis performance results for testing hypothesis H_0D against H_AD with a fixed control value under Pocock boundaries under (VP1) and (VP3) with method-based sample size N. MC Mean is the Monte Carlo mean of the estimates, MC SD is the Monte Carlo standard deviation of the estimates, ASE is the Monte Carlo mean of the standard errors, and MSE Ratio is the ratio of the Monte Carlo mean squared error for the IPWE divided by that of the indicated estimator for the three estimates at the interim analysis (a) and final analysis (b) for B = 1000 simulations.

Table 6 :
For the schema in Figure 3, interim analysis performance results for testing H_0D against H_AD with a fixed control value using Pocock and OBF boundaries. Summary of results as described in Table 5 when the Q-functions are misspecified.

Table 7 :
For the schema in Figure 3, interim analysis performance results for testing H_0H against H_AH with a fixed control value using Pocock and OBF boundaries under the χ² testing procedure. Summary of results as described in Table 5.

Table 10 :
For the schema in Figure 4, interim analysis performance results for testing H_0D against H_AD with a fixed control value using Pocock and OBF boundaries. Summary of results as described in Table 5.

Table 12 :
Pocock boundaries, OBF boundaries, estimated values ×10^{−1} (standard errors), and test statistics at the interim and final analysis times for the behavioral pain management case study data set using the IPWE. Results are presented for the interim analysis (a) and then the final analysis (b).

Table 13 :
Pocock boundaries, OBF boundaries, estimated values ×10^{−1} (standard errors), and test statistics at the interim and final analysis times for the behavioral pain management case study data set using the AIPWE. Results are presented for the interim analysis (a) and then the final analysis (b).

Table 14 :
Interim analysis performance results for testing null hypothesis H_0D against H_AD with a fixed control. Results include the Pocock boundaries, OBF boundaries, value estimates ×10^{−1} (standard errors), and test statistics at the interim and final analysis times for the behavioral pain management case study data set using the IAIPWE. Results are presented for the interim analysis (a) and then the final analysis (b).