An order restricted multi-arm multi-stage clinical trial design

Multi-arm multi-stage (MAMS) designs are one family of designs that can noticeably improve efficiency in the later stages of drug development. They allow several arms to be studied concurrently and gain efficiency by dropping poorly performing treatment arms during the trial as well as by allowing early stopping for benefit. Conventional MAMS designs were developed for the setting in which treatment arms are independent, and hence can be inefficient when an order in the effects of the arms can be assumed (e.g., when considering different treatment durations or different doses). In this work, we extend the MAMS framework to incorporate the order of treatment effects when no parametric dose-response or duration-response model is assumed. The design can identify all promising treatments with high probability. We show that the design provides strong control of the family-wise error rate and illustrate the design in a study of symptomatic asthma. Via simulations we show that the inclusion of the ordering information leads to better decision-making compared to a fixed sample design and a MAMS design. Specifically, in the considered settings, reductions in sample size of around 15% were achieved in comparison to a conventional MAMS design.


DECISION RULES FOR THE 3-ARM 2-STAGE DESIGN
Other decision rules at the interim analyses could be used within the proposed design. As in Section 3 of the main body of the text, we consider the 3-arm 2-stage example. We consider two alternatives, described in Table 1 and Table 2.
The first alternative decision rule (Table 1) differs from the one described in Section 3 of the main paper in the cells coloured in red. For the described combination of the decision rules, the equation for the FWER is: [...] The power equation to reject both hypotheses is: [...]
The second alternative (Table 2) differs from the original rule described in the main paper in the red cells. For the described combination of the decision rules, the equation for the FWER is: [...] The power equation to reject both hypotheses is: [...]
We compare the three decision rules using triangular [1] boundaries, with the same bounds for both treatments. The design is powered at 80% to reject both hypotheses. The Pocock [2] and O'Brien & Fleming [3] boundaries for both treatments were considered as well, but the differences in the bounds and the sample size were negligible.
The differences in power among the three combinations of decision rules are reported in Figure 1 with the following notation:
• decision rule 1 (decrule1): stop the trial when $Z_1^{(1)} \geq u_1$ and stop the trial when $Z_1^{(1)} \leq l_1$, $l_1 < Z_1^{(2)} < u_1$.
No major differences are observed among the three decision rules in Figure 1, because the probability of rejecting both hypotheses is very similar for all three. Furthermore, the differences in the ESS are negligible (the three designs differ by only one patient per arm per stage). It follows that, also when the triangular bounds are used at the interim analyses, the choice of decision rule has minimal effect on the power and the ESS. Thus, one could decide which rules to use depending on the clinical context.

POWER COMPARISON ORD AND FSD
Let us consider a 3-arm 2-stage ORD. Let $u_1^{(1)} = u_1^{(2)} = u_2^{(1)} = u_2^{(2)} = u_1$ and $l_1^{(1)} = l_1^{(2)} = l_1 = -u_1$ be the critical values such that Equation (1) holds under the global null hypothesis. Assume that the interim analysis is done after half of the total sample size has been observed and consider an equal allocation ratio to all arms. Let $c$ be the critical bound for the fixed balanced sample design (FSD) and $n$ the sample size per arm per stage, and assume $\sigma = 1$.
Lemma 1 states that, under these assumptions, $u_1 < c$, so that the 3-arm 2-stage ORD is always more powerful than the FSD with the same total sample size.
Proof. Lemma 1 is proven by contradiction. Assume that $u_1 \geq c$.
For the ORD, the critical values are found to satisfy the following equality under $H_0$: [...] For the FSD, the critical bound $c$ is found in order to satisfy [...], and if $u_1 \geq c$ then [...] Therefore, from Equation (7), it follows that [...], which is a contradiction, and therefore the Lemma is proven.
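The setting used in Lemma 1 (two treatment arms compared with a shared control, equal allocation, one interim analysis at half of the total sample size) can be explored by simulation. The following Python sketch is an illustration only, assuming normally distributed outcomes with known standard deviation and cumulative z-statistics; the function names are hypothetical and the ORD decision rules are not implemented. It checks two standard properties of this setting: the statistics of the two arms are correlated (0.5) through the shared control, and the stage-1 and stage-2 statistics of the same arm are correlated (the square root of 1/2) because stage 2 reuses the stage-1 data.

```python
import random
from math import sqrt


def simulate_stagewise_z(theta1, theta2, n, sigma, rng):
    """Simulate one trial and return cumulative z-statistics per stage.

    n is the number of patients per arm per stage (an assumption of this
    sketch). The result is
    [(z_stage1_arm1, z_stage1_arm2), (z_stage2_arm1, z_stage2_arm2)].
    """
    true_means = (0.0, theta1, theta2)  # control, arm 1, arm 2
    # Sample mean of each arm within each stage: N(mean, sigma^2 / n).
    stage_means = [
        [rng.gauss(m, sigma / sqrt(n)) for m in true_means] for _ in range(2)
    ]
    z_by_stage = []
    for j in (1, 2):
        # Cumulative sample mean of each arm after stage j.
        cum = [sum(stage_means[s][a] for s in range(j)) / j for a in range(3)]
        scale = sqrt(j * n / 2.0) / sigma  # 1 / SE of a difference in means
        z_by_stage.append(((cum[1] - cum[0]) * scale, (cum[2] - cum[0]) * scale))
    return z_by_stage


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def estimate_correlations(n_sim=20000, n=50, seed=2024):
    """Estimate (between-arm, between-stage) correlations under the global null."""
    rng = random.Random(seed)
    z_s1_a1, z_s1_a2, z_s2_a1 = [], [], []
    for _ in range(n_sim):
        stage1, stage2 = simulate_stagewise_z(0.0, 0.0, n, 1.0, rng)
        z_s1_a1.append(stage1[0])
        z_s1_a2.append(stage1[1])
        z_s2_a1.append(stage2[0])
    return pearson(z_s1_a1, z_s1_a2), pearson(z_s1_a1, z_s2_a1)


if __name__ == "__main__":
    between_arms, between_stages = estimate_correlations()
    print("correlation between arms at stage 1:", round(between_arms, 3))
    print("correlation between stages for arm 1:", round(between_stages, 3))
```

These correlations are what make the joint rejection probabilities of the two designs differ from a product of marginal probabilities.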
Let $n_1$ be the number of patients per arm at the first stage and $n_2$ at the second stage. From Equation (2), $P(\text{rejecting } H_{01} \text{ and } H_{02} \mid \theta)$ for a 3-arm 2-stage ORD can be written as: [...] For the FSD with $n$ patients per arm it holds that $P(\text{rejecting } H_{01} \text{ and } H_{02} \mid \theta) =$ [...]

Let $\sigma = 1$ and fix the maximum sample size for both designs. Therefore, if $n$ is the sample size per arm for the FSD, then $n_1 = n/2$, and if $n_1 = n_2/2$ then $n_2 = n$. Given that [...], it follows that $P_{ORD}(\text{rejecting } H_{01} \text{ and } H_{02} \mid \theta) - P_{FSD}(\text{rejecting } H_{01} \text{ and } H_{02} \mid \theta) =$ [...] Using Lemma 1, it holds that [...] Through some further analytical steps it can be shown that $P_{ORD}(\text{rejecting } H_{01} \text{ and } H_{02} \mid \theta) - P_{FSD}(\text{rejecting } H_{01} \text{ and } H_{02} \mid \theta) >$ [...] Given the analytical complexity of Equation (9), we evaluate it numerically using R, choosing $\alpha = 0.05$ and $u_1 = 1.876$, and $\alpha = 0.1$ and $u_1 = 1.527$. Moreover, we consider $n = 10, 50$ and $\theta^{(1)} \in (0, 2)$, $\theta^{(2)} \in (0, 2)$. Figure 2 shows that the difference in power to reject both hypotheses between the ORD and the FSD is always greater than zero for the chosen values of $u_1$ and $\alpha$ and for the considered sample sizes.
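The fixed-sample side of this comparison can be approximated by simulation. The Python sketch below estimates the probability of rejecting both hypotheses in a fixed-sample design by Monte Carlo, assuming normally distributed outcomes with standard deviation 1, a shared control arm and the same number of patients in every arm. The name `fsd_power_both` is hypothetical, and the critical bound `C_PLACEHOLDER` is an arbitrary illustrative value, not the bound used in the paper. The ORD power, which depends on the decision rules and on Equation (2), is not reproduced here.

```python
import random
from math import sqrt

C_PLACEHOLDER = 2.0  # illustrative critical bound only; NOT the paper's value


def fsd_power_both(theta1, theta2, n, c, sigma=1.0, n_sim=20000, seed=1):
    """Monte Carlo estimate of P(z1 > c and z2 > c) in a fixed-sample design.

    theta1, theta2: true mean differences of arms 1 and 2 versus control.
    n: patients per arm. c: critical bound applied to both z-statistics.
    """
    rng = random.Random(seed)
    se_mean = sigma / sqrt(n)       # SE of one arm's sample mean
    se_diff = sigma * sqrt(2.0 / n)  # SE of a difference of two sample means
    hits = 0
    for _ in range(n_sim):
        control = rng.gauss(0.0, se_mean)  # shared by both comparisons
        z1 = (rng.gauss(theta1, se_mean) - control) / se_diff
        z2 = (rng.gauss(theta2, se_mean) - control) / se_diff
        if z1 > c and z2 > c:
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    # Small grid over the sample sizes and effect range mentioned in the text.
    for n in (10, 50):
        for theta in (0.25, 0.5, 1.0):
            power = fsd_power_both(theta, theta, n, C_PLACEHOLDER)
            print(f"n={n:3d} theta1=theta2={theta:4.2f} power={power:.3f}")
```

Replacing the placeholder bound with the design's actual fixed-sample critical value, and pairing this with an analogous simulation of the two-stage design, would give a simulation-based check of the power difference shown in Figure 2.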

CASE STUDY: NUMERICAL RESULTS
Table 3 reports the results of the simulations that revisit the NCT01257230 trial using the ORD when one interim analysis is planned after half of the total population has been observed.

COMPARISON OF THE CRITICAL VALUES BETWEEN A 3-ARM 2-STAGE ORD AND A STANDARD MAMS
Figure 4 shows how critical bounds of different shapes differ between the 3-arm 2-stage ORD and the standard MAMS design, which selects all promising arms to proceed to the next stage. The bounds for the ORD are found so as to control Equation (1) under $H_0$ at level $\alpha = 0.05$ when the same bounds $(u_j, l_j)$, $j \in \{1, 2\}$, are used for each treatment arm, whereas the bounds for the MAMS design are found using the R [5] package proposed by Jaki et al. [6]. It is worth noting that, overall, the critical bounds for the ORD are smaller at each stage than those of the standard MAMS.
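To make the notion of boundary shape concrete, the following Python sketch computes Pocock-type (constant) and O'Brien & Fleming-type (decreasing) efficacy bounds for a two-stage design with equally sized stages. It is a simplified illustration under stated assumptions: a single treatment comparison, a one-sided error rate and no futility bound. It therefore does not reproduce the ORD or MAMS bounds of Figure 4, and all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
RHO = sqrt(0.5)  # correlation of stage-1 and cumulative stage-2 z-statistics


def prob_no_rejection(u1, u2, steps=2000):
    """P(Z1 < u1 and Z2 < u2) for the correlated pair (Z1, Z2).

    Integrates phi(z1) * P(Z2 < u2 | Z1 = z1) over z1 in [-8, u1] with
    Simpson's rule (steps must be even).
    """
    lower = -8.0
    h = (u1 - lower) / steps
    total = 0.0
    for i in range(steps + 1):
        z1 = lower + i * h
        if i == 0 or i == steps:
            weight = 1
        elif i % 2 == 1:
            weight = 4
        else:
            weight = 2
        conditional = STD_NORMAL.cdf((u2 - RHO * z1) / sqrt(1.0 - RHO * RHO))
        total += weight * STD_NORMAL.pdf(z1) * conditional
    return total * h / 3.0


def solve_constant(shape, alpha, lo=0.5, hi=6.0, tol=1e-6):
    """Bisection for c such that bounds (c*shape[0], c*shape[1]) spend alpha."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        error = 1.0 - prob_no_rejection(mid * shape[0], mid * shape[1])
        if error > alpha:
            lo = mid  # bounds too low: type I error too large
        else:
            hi = mid
    return (lo + hi) / 2.0


def pocock_bounds(alpha):
    """Same bound at both stages."""
    c = solve_constant((1.0, 1.0), alpha)
    return (c, c)


def obf_bounds(alpha):
    """Bound proportional to 1/sqrt(information fraction): high early, lower late."""
    shape = (sqrt(2.0), 1.0)
    c = solve_constant(shape, alpha)
    return (c * shape[0], c * shape[1])


if __name__ == "__main__":
    alpha = 0.05  # one-sided level, as used for the bounds discussed in the text
    print("Pocock-type bounds:", [round(b, 3) for b in pocock_bounds(alpha)])
    print("O'Brien & Fleming-type bounds:", [round(b, 3) for b in obf_bounds(alpha)])
```

The two shapes spend the same overall error differently: the constant shape makes early stopping easier, whereas the decreasing shape keeps the final bound close to that of a fixed-sample test.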

ACKNOWLEDGMENTS
Tables 1 and 2. Alternative decision rules at the interim analysis. Recovered cell entries: "proceed with 2", "proceed with 1, 2", "drop both arms" in one table and "proceed with 2", "proceed with 1, 2", "proceed with 1, 2" in the other. Cells coloured in red correspond to decision rules that differ from those described in Section 3 of the main paper.