Modelling semi‐attributable toxicity in dual‐agent phase I trials with non‐concurrent drug administration

In oncology, combinations of drugs are often used to improve treatment efficacy and/or reduce harmful side effects. Dual‐agent phase I clinical trials assess drug safety and aim to discover a maximum tolerated dose combination via dose‐escalation; cohorts of patients are given set doses of both drugs and monitored to see if toxic reactions occur. Dose‐escalation decisions for subsequent cohorts are based on the number and severity of observed toxic reactions, and an escalation rule. In a combination trial, drugs may be administered concurrently or non‐concurrently over a treatment cycle. For two drugs given non‐concurrently with overlapping toxicities, toxicities occurring after administration of the first drug yet before administration of the second may be attributed directly to the first drug, whereas toxicities occurring after both drugs have been given some present ambiguity; toxicities may be attributable to the first drug only, the second drug only or the synergistic combination of both. We call this mixture of attributable and non‐attributable toxicity semi‐attributable toxicity. Most published methods assume drugs are given concurrently, which may not be reflective of trials with non‐concurrent drug administration. We incorporate semi‐attributable toxicity into Bayesian modelling for dual‐agent phase I trials with non‐concurrent drug administration and compare the operating characteristics to an approach where this detail is not considered. Simulations based on a trial for non‐concurrent administration of intravesical Cabazitaxel and Cisplatin in early‐stage bladder cancer patients are presented for several scenarios and show that including semi‐attributable toxicity data reduces the number of patients given overly toxic combinations. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.


Introduction
In oncology, phase I clinical trials are conducted to evaluate the toxicity profile of a novel agent. The aim is to identify the maximum tolerated dose (MTD), defined to be the largest dose that is expected to cause unacceptable toxicity in a specified proportion of patients [1]. The desired proportion is known in practice as the target toxicity level (TTL) and is denoted here as Γ. What is considered as unacceptable toxicity will depend on the drug, disease and patient population. In practice, unacceptable toxicity is known as dose-limiting toxicity (DLT) and is usually restricted to the observation of one or more grade 3 or higher toxic reactions, as defined by the National Cancer Institute's Common Terminology Criteria for Adverse Events [2]. For trials of cytotoxic drugs, such a dose is assumed to be the most promising for reducing tumour size, with a constrained potential for inducing DLTs in patients.
Combinations of drugs are often required to effectively treat cancer patients. There may be synergistic benefits in combining two or more cytostatic/cytotoxic agents, such as increasing the potential for reducing the size of tumours [3]. In addition, with more advanced molecularly targeted therapies, different drugs that form a treatment regimen may be used to deal with cellular heterogeneity within tumours, thus effectively combating tumours before they become resistant to drugs [4]. Combinations of chemotherapeutic drugs may be administered concurrently or non-concurrently over a treatment cycle, and such a choice is usually dependent on the disease being treated, the treatments being used, and the biological mechanism via which the treatments act [5][6][7][8][9]. Administering treatments concurrently is often undertaken in order to quickly kill tumour cells and/or prevent remaining tumour cells from developing immunity to particular drugs and thus improve treatment efficacy with respect to disease-free survival and overall survival [10]. However, whilst concurrent administration may be clinically efficacious, it may lead to severe toxicities in patients because of the large doses and high dose-intensities that patients receive. Administering drugs non-concurrently over a cycle may reduce the likelihood of patients experiencing severe toxic reactions whilst still providing the clinical benefit of combining treatments to treat tumours [3].
In a trial, clinicians are responsible for determining whether any observed toxicity is attributable to one or more of the treatments, or a consequence of disease progression [11]. In a combination trial of two known drugs, the issue of misattributing toxicity to an incorrect source (that is, attributing a treatment-related toxicity to disease, or a disease-related toxicity to treatment) is less likely, because both drugs have been studied previously and their toxicity profiles are reasonably well known. However, in dual-agent phase I trials of two drugs with similar toxicity profiles, there is the issue of drug-related non-attributable (NA) toxicity (for the remainder of this work, toxicity is that caused by the experimental agents, and disease-related toxicity is ignored). In single-agent trials where only one drug is escalated and no other therapies are administered, a DLT observed in a patient is due to the dose of that particular drug; that is, the DLT is attributable to the drug. In dual-agent trials, however, there may be further ambiguity. Because two drugs are being administered, depending on the trial context, it may not be possible to attribute a DLT to a particular agent under investigation; such a DLT is said to be NA. The situation may be even more complex than this: a synergistic interaction between both drugs may lead to toxicity even though each drug given alone is deemed safe [12]. Yin and Yuan [13] considered modelling the four possible toxicity outcomes (DLT/no DLT due to drug A coupled with DLT/no DLT due to drug B) via a contingency table approach, which can be used when there are non-overlapping toxicities for drugs A and B. When there are overlapping toxicities and toxicities that cannot be attributed to specific drugs, the outcomes can be collapsed into a simpler DLT/no DLT outcome for each combination.
Consider an example trial of two drugs A and B, again with overlapping toxicities, where the aim is to identify the MTD combination with respect to the occurrence of first-cycle DLTs that are drug related (i.e. we do not consider disease-related toxicity). In such a trial, drug A is administered at the start of the treatment cycle, and drug B is administered at a much later time point within the cycle (e.g. several days later) if and only if the patient does not experience a DLT after receiving drug A. Any DLT observed before drug B is administered is attributable to drug A only. However, after a patient receives drug B, any observed toxicity may be due to drug A (in the form of delayed toxicity), drug B or a synergistic combination of both drug A and drug B, as mentioned previously. Such toxicity may be regarded as NA. This example trial involves attributable and NA toxicity occurring; we define this mixture of attributable and NA toxicity as semi-attributable (SA) toxicity. By incorporating details of when doses are administered and whether a DLT was observed before or after drug B was given, if at all, we may be able to better determine whether drug A and/or drug B should be escalated and ideally avoid early onset toxicities from drug A alone, meaning more patients are likely to receive the full dose combination that is believed to be more efficacious than each treatment given as monotherapies.
The inclusion of non-concurrent drug administration has yet to be considered in statistical methodological research for phase I trials, and no novel designs for combination trials have incorporated this detail. Therefore, we propose methodology for designing a dual-agent phase I trial of treatments with overlapping toxicities, where it is not clear whether drug A, drug B or both drugs are responsible for causing toxicities, and these treatments are given non-concurrently. Section 2 describes a real-life dual-agent phase I trial that motivates this work, and Section 3 presents a method for a trial where the doses of both drugs can be adapted between patients in order to estimate one or more MTD combinations. Section 4 details a simulation study that compares our work with a design that assumes drugs are being given concurrently, and Section 5 describes the results with respect to both accuracy in MTD combination recommendations and chance of dosing patients at unsafe dose combinations. We conclude this paper with a summary of our work, including limitations and areas of further research.

Motivational trial
The work in this paper is motivated by a submitted protocol for a dual-agent phase I dose-escalation trial featuring the non-concurrent administration of Cabazitaxel (A) and Cisplatin (B) intravesically (via a catheter into the bladder) in patients diagnosed with recurrent high-risk non-muscle invasive bladder cancer (at stages tumour in situ, Ta or T1) who have previously received standard treatment of intravesical Bacillus Calmette-Guérin (ClinicalTrials.gov Identifier NCT02202772). Both drugs have similar toxicities associated with intravenous administration (urinary tract infections, renal problems and nausea), and it is believed that this will also be the case for intravesical administration. Treatment cycles are weekly (7 days), with Cabazitaxel being administered to patients on the morning of day 1 and Cisplatin being administered on the morning of day 5 only if no DLT attributable to Cabazitaxel is observed in the patient before the administration of Cisplatin. Therefore, any DLT in the first cycle occurring before Cisplatin is administered is due to Cabazitaxel, whereas a DLT occurring after the administration of Cisplatin may be due to Cabazitaxel alone, Cisplatin alone or a combination of the two. Patients will receive a maximum of 6 weeks of treatment. A 2000 mg/100 ml dose of gemcitabine is also administered to patients during the treatment cycle (on the morning of day 3), which has previously been shown to be well tolerated when given intravesically at this concentration to patients with non-muscle invasive bladder cancer [14] (we do not consider modelling the fixed dose of gemcitabine in our work, but if we were to, our modelling approach would be amended accordingly; see discussion). Initially, the toxicity profile of a 2 × 4 dose combination grid formed by two dose levels of Cabazitaxel ({a_1, a_2} = {2.5, 5} mg/100 ml) and four dose levels of Cisplatin ({b_1, … , b_4} = { , 66, 80, 100} mg/100 ml) was to be investigated.
However, this was later amended to be five dose combinations from this set of eight because of sample size limitations. The definition of DLT is deemed to be the observation of excessive toxicity (at least one grade 3 or grade 4 toxicity as defined per the National Cancer Institute's Common Terminology Criteria for Adverse Events) in the first cycle (week) of treatment. In summary, the investigators wish to identify the dose combination that, when given in the schedule stated, has an estimated probability of DLT over a cycle close to 0.25. Using this as a motivational study, we consider how a dual-agent dose-escalation study with non-concurrent drug administration can be designed so that exploration of a full dose-toxicity surface, whilst reducing dosing at overly toxic combinations, can be achieved and compare its operating characteristics to an existing approach that does not account for SA toxicity.

Methods
We present the methodology proposed to incorporate SA toxicity for dual-agent phase I dose-escalation trials of non-concurrently administered drugs, as well as the approach where such detail is omitted (an entirely NA approach).

Semi-attributable toxicity
Consider a dual-agent trial where a_j denotes the j-th dose level of drug A (j = 1, … , J) and b_k denotes the k-th dose level of drug B (k = 1, … , K). For each patient, drug A is administered at time 0, and drug B is administered at pre-planned time t_B provided no DLT has been observed in the patient before time t_B. An entire cycle is observed for the time window [0, T], with t_B < T (Figure 1). To incorporate the concept of SA toxicity as defined in Section 1, let Y_i be a trinary outcome variable for patient i such that

Y_i = 0 if no DLT is observed in [0, T];
Y_i = 1 if a DLT is observed in [0, t_B), before drug B is given;
Y_i = 2 if a DLT is observed in [t_B, T], after both drugs have been given.

Let π_{t_B}(a_j; θ_{t_B}) denote the probability of observing a DLT before time t_B when drug A is given alone, conditional on dose level a_j and vector of model parameters θ_{t_B}. The following conditions must hold for π_{t_B}:

(i) π_{t_B}(0; θ_{t_B}) = 0;
(ii) π_{t_B}(a_j; θ_{t_B}) ⩽ π_{t_B}(a_{j′}; θ_{t_B}) for a_j ⩽ a_{j′};
(iii) for fixed a_j and parameters, π_{t_B}(a_j; θ_{t_B}) is non-decreasing in t_B.

Condition (ii) states the assumption of monotonicity used in dose-escalation studies; that is, holding t_B and parameters fixed, the probability of DLT is non-decreasing as the dose of drug A increases. Let π_T(a_j, b_k; θ_T) denote the probability of observing a DLT over the interval [0, T] when combination (a_j, b_k) is given. Here, θ_T denotes the vector of model parameters for π_T, with π_T satisfying the following conditions:

(iv) π_T(a_j, b_k; θ_T) ⩾ π_{t_B}(a_j; θ_{t_B}) for all b_k;
(v) π_T(a_j, b_k; θ_T) ⩽ π_T(a_{j′}, b_k; θ_T) for a_j ⩽ a_{j′};
(vi) π_T(a_j, b_k; θ_T) ⩽ π_T(a_j, b_{k′}; θ_T) for b_k ⩽ b_{k′}.

Condition (iv) states that the addition of any dose of another agent (i.e. drug B), whilst the dose of drug A is held constant, will give a combination with probability of DLT over time window [0, T] greater than or equal to that of drug A given alone when observed over time window [0, t_B). Conditions (v) and (vi) are the assumption of monotonicity (condition (ii)) for drug A and drug B, respectively.
Note that 1 − π_T(a_j, b_k; θ_T) is the probability of not observing a DLT in the time period [0, T]. Within this context, one may assume that π_{t_B} and π_T are related in some way. One simplifying assumption is that π_{t_B} is linearly related to π_T, that is,

π_{t_B}(a_j; θ_{t_B}) = γ π_T(a_j, 0; θ_T).

Under this assumption, θ_{t_B} = {θ_T, γ}, where 0 ⩽ γ < 1 is a fraction that can be estimated in the model, or fixed at t_B/T, say. Although it is unrealistic that γ will ever equal 1, this scenario may be understood as observing no DLTs after time t_B due to drug A, with drug B never being administered in the first cycle. If we were permitted to vary t_B in our trial, then such a situation would be easier to interpret. However, we assume for this trial that t_B is fixed.
The previous assumption relating π_{t_B} to π_T means that only a choice of probability function for π_T is required; thus, π_{t_B} shares the same parameter vector θ_T as π_T, but with π_{t_B} also dependent on γ. This is valid because π_T(a_j, 0; θ_T) is the probability of DLT over the interval [0, T] solely due to drug A and thus must be greater than or equal to π_{t_B}(a_j; θ_{t_B}), in keeping with the conditions stated previously. Under this specification, the probabilities relating to each outcome (Y_i = 0, 1 or 2) being observed are modelled as follows:

P(Y_i = 0) = 1 − π_T(a(i), b(i); θ_T),
P(Y_i = 1) = γ π_T(a(i), 0; θ_T),
P(Y_i = 2) = π_T(a(i), b(i); θ_T) − γ π_T(a(i), 0; θ_T),

where a(i) denotes the dose of drug A given to patient i and b(i) denotes the dose of drug B given to patient i (if at all). It can easily be shown that π_T(a(i), b(i); θ_T) ⩾ γ π_T(a(i), 0; θ_T), so that all three probabilities are non-negative and sum to one. Given the probability function choices stated previously, the outcome Y_i for patient i has a categorical distribution with the probabilities above. Therefore, the likelihood contribution of patient i to the overall likelihood is

L_i(θ_T, γ | Y_i) = {1 − π_T(a(i), b(i); θ_T)}^{[Y_i = 0]} {γ π_T(a(i), 0; θ_T)}^{[Y_i = 1]} {π_T(a(i), b(i); θ_T) − γ π_T(a(i), 0; θ_T)}^{[Y_i = 2]},

where [Y_i = y] is the Iverson bracket, which takes value 1 if Y_i = y, where y ∈ {0, 1, 2}, and 0 otherwise. Therefore, after observing n patients, the overall likelihood is L(θ_T, γ | D_n) = ∏_{i=1}^{n} L_i(θ_T, γ | Y_i), where D_n denotes the set of all accrued trial data, that is, dose combinations (a(i), b(i)) and trinary DLT responses Y_i, for patients i = {1, … , n}. Using Bayes' theorem, with prior distribution f(θ_T, γ) for the parameter vector, the posterior distribution, denoted g(θ_T, γ | D_n), is

g(θ_T, γ | D_n) ∝ L(θ_T, γ | D_n) f(θ_T, γ).

Once g(θ_T, γ | D_n) has been calculated, the posterior distributions for the parameters can be used to compute the posterior distribution of π_T(a_j, b_k; θ_T), the probability of DLT at dose combination (a_j, b_k) over the interval [0, T], for all dose combinations. Because the investigators wish to identify the combination (a(*), b(*)) whose probability of DLT is closest to the TTL Γ, this metric can be used to determine the next dose-escalation step from patient n to patient n + 1.
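To make the trinary model concrete, here is a minimal Python sketch (not the authors' code; the function names are ours) of the outcome probabilities and the per-patient likelihood contribution, assuming the full-cycle DLT probability at the combination, the full-cycle drug-A-alone probability and γ are supplied:

```python
def outcome_probs(pi_T_ab, pi_T_a0, gamma):
    """Probabilities of the trinary outcome Y for one patient.

    pi_T_ab: P(DLT over [0, T]) at combination (a, b)
    pi_T_a0: P(DLT over [0, T]) for drug A given alone at dose a
    gamma:   fraction of drug-A toxicity assumed to occur before t_B
    """
    p1 = gamma * pi_T_a0            # DLT before t_B (attributable to drug A)
    p2 = pi_T_ab - gamma * pi_T_a0  # DLT in [t_B, T] (non-attributable)
    p0 = 1.0 - pi_T_ab              # no DLT over the whole cycle
    return p0, p1, p2


def likelihood_contribution(y, pi_T_ab, pi_T_a0, gamma):
    """Categorical likelihood contribution for observed outcome y in {0, 1, 2}."""
    return outcome_probs(pi_T_ab, pi_T_a0, gamma)[y]
```

The three probabilities are non-negative and sum to one whenever the model conditions hold, mirroring the categorical distribution of Y_i.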
Specifically, let N(a(n), b(n)) be the neighbourhood of dose combination (a(n), b(n)), that is, all dose combinations in the dose combination grid immediately adjacent (vertically, horizontally and diagonally) to the combination given to patient n. Then the dose combination for patient n + 1 may be expressed mathematically as

(a(n + 1), b(n + 1)) = argmin_{(a_j, b_k) ∈ N(a(n), b(n))} | π̂_T(a_j, b_k; θ_T) − Γ |,

where π̂_T(a_j, b_k; θ_T) could be chosen to be, say, the posterior median of π_T(a_j, b_k; θ_T). In the case where two dose combinations are equally close to the TTL on the probability scale, one may consider choosing the combination with the smallest dose a_j of drug A, because this minimizes π_{t_B}(a_j; θ_{t_B}) and we ideally want patients to receive both drugs in such a trial, although other approaches proposed by investigators may be considered. The constraint of dosing in the neighbourhood of the previously administered dose combination can also be dropped if investigators are happy to make larger changes to the doses of each agent than one-level increases/decreases.
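The neighbourhood-constrained escalation rule can be sketched as follows (an illustrative implementation with 0-indexed dose levels; `post_median` stands for a J × K array of posterior medians of the DLT probability, and we assume the current combination remains a candidate, so the dose may be repeated):

```python
def neighbourhood(j, k, J, K):
    """All grid combinations adjacent (incl. diagonals) to (j, k),
    keeping the current combination itself as a candidate."""
    return [(j + dj, k + dk)
            for dj in (-1, 0, 1) for dk in (-1, 0, 1)
            if 0 <= j + dj < J and 0 <= k + dk < K]


def next_combination(current, post_median, target):
    """Choose the neighbour whose estimated DLT probability is closest to
    the TTL; ties are broken by the smaller dose level of drug A."""
    j, k = current
    J, K = len(post_median), len(post_median[0])
    return min(neighbourhood(j, k, J, K),
               key=lambda jk: (abs(post_median[jk[0]][jk[1]] - target), jk[0]))
```

For example, on a 2 × 2 grid with estimated probabilities [[0.05, 0.10], [0.20, 0.40]] and Γ = 0.25, the rule moves from (0, 0) to (1, 0).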

Comparison to non-attributable toxicity
The previous model construction describes the idea of SA toxicity when drugs are administered non-concurrently. This may be compared with the simpler setting where no distinction is made between a DLT observed in patient i before time t_B and one observed afterwards. Under such a scenario, only whether a DLT occurred within the interval [0, T], that is, whether Y_i ≠ 0 or Y_i = 0, would be utilized. Furthermore, even if a DLT occurs in patient i before t_B and thus drug B is not administered, b(i) is recorded as the dose that would have been given. This is simply referred to as the NA approach. Therefore, the NA approach only considers whether a DLT occurred or not at a combination and does not incorporate any information regarding which agent(s) caused toxicity, or whether drug B was given or not. The likelihood under the NA approach for patient i is

L_i(θ_T | Y_i) = π_T(a(i), b(i); θ_T)^{[Y_i ≠ 0]} {1 − π_T(a(i), b(i); θ_T)}^{[Y_i = 0]},

where a(i) and b(i) are the doses of drugs A and B that patient i is due to receive, and the prior distribution h(θ_T) on θ_T is used to obtain the posterior distribution g(θ_T | D_n) for parameter vector θ_T. The likelihood contributions under the SA and NA approaches for patient i given DLT outcome Y_i = y for y = {0, 1, 2} are given in Table I.

Table I. Likelihood contribution of patient i dependent on modelling of toxicity and dose-limiting toxicity outcome.
In the case where multiple dose combinations are equally close to the TTL on the probability scale (call this set C), one may use weighted randomization to choose a dose combination [15], where each dose combination selection probability is weighted by n_c^{−1}, the inverse of the number of patients treated at each candidate combination c ∈ C, that is,

P(select c) = n_c^{−1} / Σ_{c′ ∈ C} n_{c′}^{−1}.
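A short sketch of this weighted randomization (illustrative only; `n_treated` maps each candidate combination to the number of patients already treated there):

```python
import random


def selection_probabilities(candidates, n_treated):
    """P(select c) proportional to 1 / n_c for each candidate c in C."""
    inv = [1.0 / n_treated[c] for c in candidates]
    total = sum(inv)
    return {c: w / total for c, w in zip(candidates, inv)}


def weighted_random_choice(candidates, n_treated, rng=None):
    """Draw one combination with the selection probabilities above."""
    probs = selection_probabilities(candidates, n_treated)
    rng = rng or random.Random(0)
    return rng.choices(candidates, weights=[probs[c] for c in candidates], k=1)[0]
```

With two candidates treated 2 and 4 times, the less-explored one is chosen with probability 2/3, pushing experimentation towards under-sampled combinations.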

Dose-escalation algorithm
Here, we present a summary of the dose-escalation algorithm to be used in the simulation study. Assume there are a maximum of N patients available, who will be enrolled in cohorts of size c such that N is divisible by c. Based on the previous discussion and methodology outlined, dose escalation/de-escalation proceeds as follows:

(1) For n = 0, dose the first cohort of c patients at the lowest combination (a_1, b_1). Obtain the posterior distribution of π_T(a_1, b_1; θ_T); if the posterior probability that π_T(a_1, b_1; θ_T) exceeds Γ is greater than some upper threshold, stop the trial. Otherwise, set n = c.

(2) For n ⩽ N:
(a) Let N(a(n), b(n)) be the set of all neighbouring dose combinations to combination (a(n), b(n)), the combination prescribed to the previous cohort (including the n-th person).
(b) Dose the next cohort of c patients at the combination in N(a(n), b(n)) that minimizes |π̂_T(a_j, b_k; θ_T) − Γ|, where π̂_T(a_j, b_k; θ_T) is the posterior median of the distribution of π_T(a_j, b_k; θ_T). If there exists more than one such combination that minimizes the aforementioned function, choose the combination that also minimizes π_{t_B}(a_j; θ_{t_B}) (for SA) or use the weighted randomization described previously (for NA).
(c) Observe the DLT outcomes of the new cohort and update the accrued trial data.
(d) Update the posterior distributions of the model parameters and of π_T(a_j, b_k; θ_T) for all combinations; apply the early-termination rule of step (1) and, if the trial continues, set n = n + c.

(3) If n = N, the recommended dose combinations at the end of the trial, M, are those that have an estimated posterior median probability of DLT (using the posterior distributions obtained in step (2d)) over the interval [0, T] within [Γ − δ, Γ + δ], for some small δ > 0, and have previously been experimented on during the trial. Trials that are terminated early for safety concerns will not recommend an MTD, and dose combinations that are suitably close to the TTL but have not been experimented at will not be recommended.
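The final recommendation rule amounts to a filter over the dose grid; a hypothetical sketch (`post_median` and the set of experimented combinations are assumed to come from the escalation steps):

```python
def recommended_mtds(post_median, experimented, target, delta):
    """Final MTD set: combinations whose posterior median DLT probability
    lies in [target - delta, target + delta] and that were actually tried
    during the trial."""
    return [(j, k)
            for j, row in enumerate(post_median)
            for k, p in enumerate(row)
            if abs(p - target) <= delta and (j, k) in experimented]
```

Note that a combination close to the target on the probability scale is still excluded if it was never experimented on, matching the rule above.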

Simulation study
A simulation study to evaluate the performance of modelling SA toxicity versus NA toxicity in a trial with non-concurrent administration of Cabazitaxel (A) and Cisplatin (B), as discussed in Section 2, was conducted, with a TTL of 0.25. All methods were compared on the basis of the percentage of patients that received dose combinations with true DLT probabilities within the interval [Γ − δ, Γ + δ], the percentage of patients that received dose combinations with true DLT probabilities much higher than the TTL (commonly known as overdoses) and the distribution of MTD recommendations at the end of the trial. We also consider the mean bias and root mean-squared error (RMSE) for each model parameter θ ∈ θ_{t_B} and define each of these measures as follows:

Bias(θ) = |A|^{−1} Σ_{l ∈ A} (θ̂_l − θ_TR),
RMSE(θ) = √( |A|^{−1} Σ_{l ∈ A} (θ̂_l − θ_TR)² ),

where θ̂_l is the posterior median estimate of parameter θ at the end of the l-th trial, θ_TR is its true value, A is the set of all trials that did not stop early and |A| is the size of A. We limit these calculations to the set A because trials that do stop early do not yield any MTD combination estimates and their parameter estimates are biased towards larger values; the worth of the mean bias and RMSE metrics lies in how close parameter estimates (and thus the estimated MTD contour) are to the truth for each scenario.
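These two metrics can be computed directly from the posterior-median estimates collected over the completed trials; a plain-Python sketch:

```python
import math


def mean_bias(estimates, true_value):
    """Mean of (estimate - truth) over the set A of trials that ran to completion."""
    return sum(e - true_value for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Root mean-squared error over the same set of trials."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))
```

For instance, estimates of 1.0 and 2.0 for a true value of 1.5 give zero mean bias but an RMSE of 0.5, illustrating why both metrics are reported.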

Dose-toxicity model
To model the probability of DLT π_T(a_j, b_k; θ_T), we used the Farlie-Gumbel-Morgenstern (FGM) copula model [16], which has been previously investigated by Yin and Yuan [13]:

π_T(a_j, b_k; θ_T) = p_j^α + q_k^β − p_j^α q_k^β + p_j^α q_k^β (1 − p_j^α)(1 − q_k^β)(e^ψ − 1)/(e^ψ + 1),

where p_j and q_k are skeleton probabilities of DLT for actual dose levels a_j and b_k, respectively, when administered as monotherapies, α and β are non-negative marginal parameters, ψ ∈ ℝ is an interaction parameter and 0 ⩽ γ < 1. Therefore, θ_{t_B} = {α, β, ψ, γ} and θ_T = {α, β, ψ}. This model was chosen for its ability to model antagonistic interaction (when ψ < 0), synergistic interaction (when ψ > 0) and independent action/no interaction (when ψ = 0), and also for its parsimony, although other models may be considered [12,17].
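As a sketch, an FGM-type dose-toxicity surface can be evaluated as follows (this reflects our reading of the model form, with the interaction term mapped into (−1, 1); the published parameterization should be checked against [16] before reuse):

```python
import math


def fgm_dlt_probability(p_skel, q_skel, alpha, beta, psi):
    """FGM-type probability of DLT at a combination (illustrative sketch).

    p_skel, q_skel: skeleton DLT probabilities for the monotherapy doses
    alpha, beta:    non-negative marginal power parameters
    psi:            real-valued interaction parameter (0 = no interaction)
    """
    u = p_skel ** alpha                    # marginal DLT probability, drug A
    v = q_skel ** beta                     # marginal DLT probability, drug B
    interaction = (math.exp(psi) - 1.0) / (math.exp(psi) + 1.0)  # in (-1, 1)
    return u + v - u * v + u * v * (1.0 - u) * (1.0 - v) * interaction
```

With ψ = 0 the expression reduces to the independent-action probability u + v − uv; ψ > 0 inflates the combination toxicity (synergy) and ψ < 0 deflates it (antagonism).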
For modelling of SA toxicity, γ may be treated as an additional parameter in the model. A sensible choice of prior distribution for γ is one supported on [0, 1), such as a Beta distribution, with mean t_B/T. This choice is recommended because the median of γ will then be close to t_B/T, reflecting the prior belief that the risk of a drug-A DLT accrues roughly uniformly over the cycle.

Priors
We consider dose-toxicity scenarios over a 4 × 4 dose combination grid formed by {a_1, a_2, a_3, a_4} and {b_1, b_2, b_3, b_4}. Prior distributions for the marginal parameters α and β are chosen to be similar to those of Yin and Yuan [13], and a normal prior with mean 0 and variance 10 is proposed for ψ. This is so that a priori, the means (and medians) of the marginal parameters are equal to 1, and the mean (and median) of the interaction parameter equal to 0 indicates an assumption of non-interaction between the two drugs. Furthermore, given the marginal prior probabilities p_j and q_k stated previously, vague prior probability distributions for the marginal and joint probabilities of DLT for each drug when given alone and in combination are obtained. In practice, marginal prior distributions may be elicited from clinicians based on monotherapy trials.
A cycle of treatment is considered to be 1 week; therefore, for the SA approach, T = 7 and t_B = 4, because 4 days elapse between the administration of Cabazitaxel and Cisplatin. Simulations were performed with a prior on γ such that E(γ) = t_B/T = 0.571 and the prior median of γ equals 0.595. For the NA and SA approaches, the threshold for determining whether the trial is terminated early is 0.80; as well as being a sensible choice, this threshold also corresponds to terminating the trial early should two DLTs be observed in the first cohort of two patients, regardless of when they occur in the observable interval [0, T]. MTD selection at the end of the trial was determined using the rules outlined in Subsection 3.3 with δ = 0.025. This was chosen in order to limit MTD selection to a 5% window of probability around the TTL and is also based on other works that implement a similar constraint [15], although the choice of δ may be related to the number of dose combinations and also the belief of how flat/steep the dose-toxicity surface is, based on previous data and expert opinion.
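The early-termination check can be evaluated directly from posterior samples of the DLT probability at the lowest combination; a hypothetical sketch:

```python
def stop_early(posterior_samples, target, threshold=0.80):
    """Stop the trial if the posterior probability that the lowest
    combination's DLT probability exceeds the TTL, estimated by the
    fraction of posterior samples above the target, exceeds the
    threshold (0.80 in the simulation study)."""
    exceed = sum(1 for s in posterior_samples if s > target) / len(posterior_samples)
    return exceed > threshold
```

For example, if 90% of the posterior mass at (a_1, b_1) lies above Γ = 0.25, the rule terminates the trial.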

Scenarios
Using the model in Subsection 4.1 and prior distributions specified in Subsection 4.2, we generated six true dose-toxicity scenarios for our simulation study. The true probabilities of DLT over [0, T] per combination under each scenario are specified in Table II, and the underlying dose-toxicity surfaces with true parameter values and MTD contour are shown in Figure 2. Scenario 1 is generated by using the prior means/medians of each parameter. Under scenario 2, the MTD is the largest dose combination, with all other combinations deemed safe. Under scenario 3, there are two MTD combinations and one combination above the TTL; furthermore, the dose-toxicity surface is slightly asymmetric. Scenarios 4, 5 and 6 show very asymmetric dose-toxicity surfaces: scenario 4 has several combinations on or near the MTD contour, with higher doses of drug A more toxic than higher doses of drug B; scenario 5 is similar to scenario 4, but with no interaction and one MTD combination; under scenario 6, the increase in toxicity is much faster as drug B is escalated relative to when drug A is escalated, and half of the 16 dose combinations have a probability of DLT of 0.40 or larger. We also require true probabilities of DLT for π_{t_B}(a_j; θ_{t_B}) = γ π_T(a_j, 0; θ_T) in order to conduct our simulation study. All that is required is specification of the true underlying value of γ, denoted γ_TR. In this simulation study, with t_B = 4 and T = 7, the previous scenarios are investigated with γ_TR equal to either 2/14, 8/14 or 13/14. Under the SA approach, where the prior mean of γ is 8/14, setting γ_TR = 2/14 represents a scenario with a lower-than-expected probability of DLT in the time interval [0, t_B), and setting γ_TR = 13/14 represents a scenario with a higher-than-expected probability of DLT in the time interval [0, t_B). Table III displays the different true scenarios for π_{t_B}(a_j; θ_{t_B}) under each of the scenarios for π_T given in Table II, and the varying values of γ_TR.

Computational specifications
For each scenario considered (Subsection 4.3), 1000 simulations were run for a maximum of 60 patients, who were dosed in cohorts of two patients. Simulations were conducted in the software package R [18] and OpenBUGS v3.2.2 [19] via the BRugs package [20]. We use a Gibbs sampling MCMC approach to estimate the posterior distributions of all relevant model parameters, which are used to determine the   posterior distributions of the probability that each response is observed at every dose combination. For all simulations, two chains were run, each with a burn-in period of 500 iterations and posterior sample of 4000 iterations, with thinning occurring every two iterations. Gelman-Rubin plots and autocorrelation plots from OpenBUGS were checked to ensure both chains converged and that autocorrelation was not present.
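The posterior computations here rely on Gibbs sampling in OpenBUGS; purely for illustration (this is not the authors' implementation), a generic random-walk Metropolis sampler over a joint parameter vector could look like this:

```python
import math
import random


def metropolis(log_target, init, n_iter=2000, step=0.25, seed=7):
    """Random-walk Metropolis sampler: perturb every coordinate with
    Gaussian noise and accept with the usual Metropolis probability."""
    rng = random.Random(seed)
    theta = list(init)
    log_p = log_target(theta)
    samples = []
    for _ in range(n_iter):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        log_p_new = log_target(proposal)
        if math.log(rng.random()) < log_p_new - log_p:  # accept/reject step
            theta, log_p = proposal, log_p_new
        samples.append(list(theta))
    return samples
```

In practice one would pass the log of the unnormalized posterior (log-likelihood plus log-prior) as `log_target`, discard a burn-in, thin the chain and check convergence diagnostics, as described above.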

Results
We first consider early trial behaviour under both NA and SA approaches and observe how the contour plots of π_T(a_j, b_k; θ_T) change when we observe the response of the first cohort of two patients, dosed at (a_1, b_1), under both the NA and SA approaches. We see that the recommendations do not differ between approaches for the first two cohorts, but the resultant posterior median parameter estimates that describe the shape of the dose-toxicity surfaces are very different (Figure 3). Under the SA approach, observing DLTs before t_B leads to a dose-toxicity surface that shows drug A to be exceedingly toxic; additionally, on observing DLTs after time t_B, the dose-toxicity surface reflects a belief that drug B is likely to be more toxic than under the NA approach, because no DLT was observed in time interval [0, t_B), when the patient had received drug A only. This is also shown when comparing the NA approach when (Y_1, Y_2) = (0, 1) and the SA approach when (Y_1, Y_2) = (0, 2), when a DLT is observed after time t_B in one patient. Table V shows the distribution of patients dosed at combinations with true DLT probabilities falling in certain intervals, as well as the mean and standard deviation of the DLT rates, across all simulations for each dose-toxicity scenario and value of γ_TR, using the NA and SA approaches, respectively. Under the SA approach, we observe similar or increased experimentation at combinations with DLT probability within 5% of the TTL Γ = 0.25; scenario 6 shows that under the SA approach, over 16.5% of patients are dosed within this interval, relative to 14.4% under the NA approach. When considering a wider probability interval of (0.2, 0.3], one observes similar results, with fewer patients receiving doses with probability of DLT greater than 30% and 40%; again for scenario 6, 50.8% of patients receive combinations with DLT probabilities between 0.20 and 0.30 under the NA approach, which is several percentage points below the SA approach (53-54.3%).
For scenarios 4 and 5, under the NA approach, we have 41.5% and 56.9%, respectively, which is less than nearly all SA approaches in the same scenarios (43.5-45.1% and 54.6-58.6%, respectively). Overall, the mean DLT rate for the SA approach (for all underlying values of γ_TR studied) is less than or equal to that under the NA approach, although Figure 4 illustrates that the mean DLT rates of the two approaches remain fairly similar as the number of patients increases. Table V also shows the mean percentage (and standard deviation) of DLTs in each trial that occur before time t_B. Although the changes are small, a slightly reduced DLT rate is observed under the SA approach when γ_TR = 13/14 relative to the NA approach, that is, when the percentage of DLTs occurring before time t_B is higher; this can be seen in all scenarios. Table VI shows the percentage of trials recommending each dose combination after all patients have been evaluated for the NA and SA methods, respectively, as well as the mean bias and RMSE for model parameters α, β and ψ. With respect to MTD recommendations and their true DLT probabilities, the SA approach when γ_TR equals 8/14 or 13/14 has in general higher recommendation percentages within the target window around the TTL. With respect to the bias and RMSE, it is observed that the mean bias and RMSE for the interaction parameter ψ are fairly similar across all methods per scenario, and the bias seems to indicate that the final parameter estimates are close to 0; given that the prior on ψ is reasonably vague, it is likely that changes in the dose-toxicity surface are determined by the marginal parameters α and β. The results for RMSE on scenarios 4, 5 and 6 suggest that when toxicity increases slowly for one drug at the marginal level (seen in drug B for scenarios 4 and 5, and drug A for scenario 6), more precise parameter estimates are obtained, because more experimentation is permitted at increasing dose levels of that drug. The converse can be seen for the RMSE of the parameter relating to the other agent, whose true probability of DLT increases much faster.

Sensitivity analysis
We also conducted a sensitivity analysis to assess how the SA approach performs relative to the NA approach when γ_TR increases with a_j, that is, higher dose levels of drug A have an increased probability of DLT in time window [0, t_B) relative to the case where γ_TR is constant for all a_j. We compared both approaches on scenario 5, so the true probabilities of DLT over the interval [0, T] were identical to those given for scenario 5 in Table II, but with γ_TR(a_j) = 1/14 for j = 1, 3/14 for j = 2, 5/14 for j = 3 and 8/14 for j = 4. Therefore, the probability of DLT due to drug A before time t_B was not directly proportional to the full-cycle probability, with true values across (a_1, a_2, a_3, a_4) of (0.01, 0.06, 0.12 and 0.22). We simulated 1000 trials per approach, under the same conditions detailed in Subsection 4.4 (results are provided in the Supporting Information to this paper). We found a trade-off between experimentation and MTD recommendation, with the SA approach identifying the MTD combination (in this scenario, (a_1, b_2)) 1.9% more often than the NA approach; furthermore, the NA approach recommended 1.5% more overdoses (DLT probability over [0, T] greater than 0.30) than the SA approach. However, with regard to experimentation, 1.7% more patients received the true MTD combination under the NA approach than the SA approach, and 1% more patients received combinations with true DLT probability greater than 0.40 under the SA approach.

Conclusions
In this paper, modifications to standard dual-agent dose-escalation methodology that may be used for clinical trials with non-concurrent administration of agents over a cycle have been investigated. By changing the structure of the likelihood and modelling the response as a trinary categorical variable, rather than a simple binary variable, improvements to the performance of model-based trial design are observed in several scenarios: slight increases in the percentage of patients receiving dose combinations with true DLT probabilities close to the TTL Γ, reductions in the percentage of patients receiving dose combinations with true DLT probabilities much higher than the TTL Γ, and improvements in the distribution of MTD recommendations at the end of the trial. For the simulation study in Sections 4 and 5, the SA approach dosed more patients at target combinations and fewer patients at overdoses relative to the NA method, and recommended target combinations (or those close to target combinations on the probability scale) more often than the NA method, although it did not universally outperform the NA method.
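The change to the likelihood structure described above can be illustrated with a minimal sketch. Writing p_early for P(DLT in [0, t_B)) and p_total for P(DLT in [0, T]) at a given combination (the paper's parametric forms for these probabilities are not reproduced here), the three outcome categories partition the sample space and contribute as follows:

```python
import math

def log_lik_contribution(outcome, p_early, p_total):
    """Log-likelihood contribution of one patient under the trinary
    (semi-attributable) response model.  The three category
    probabilities (1 - p_total), p_early and (p_total - p_early)
    sum to one."""
    if outcome == "none":    # no DLT over the whole cycle
        return math.log(1.0 - p_total)
    if outcome == "early":   # DLT before drug B: attributable to drug A
        return math.log(p_early)
    if outcome == "late":    # DLT after both drugs: attribution ambiguous
        return math.log(p_total - p_early)
    raise ValueError(outcome)

# Under the binary (NA) model the same data would contribute only
# log(p_total) or log(1 - p_total), discarding the timing of the DLT.
```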
However, we acknowledge that there are limitations with the work presented here. As stated previously, the submitted protocol includes a 2000 mg/100 ml fixed dose of gemcitabine to be administered on day 3, between the administration of Cabazitaxel (day 1) and Cisplatin (day 5). The methodology introduced here is intended to incorporate dual-agent dose-escalation methodology into statistical model-based trial design, and therefore focuses on the two agents with adjustable dose levels as per the submitted protocol. The addition of a third agent, albeit at a fixed dose that has been shown to be well-tolerated in patients at the proposed concentration, introduces further complexity into the modelling framework; Yin and Yuan [21] acknowledge such an extension via the use of copula regression. Such an expansion to a three-agent dose-escalation problem, along with exploration of new methodology for SA toxicity, is particularly challenging and requires an extremely detailed analysis of operating characteristics. Furthermore, we only consider drug-related toxicities in our methodology and not disease-related toxicity, or the problem of toxicity misattribution [11]. However, as mentioned in Section 1, both drugs will have been studied separately in single-agent phase I trials, so misattributing disease-related toxicity to drugs and vice versa is less likely than in monotherapy trials.
A further consideration is the choice of model for the dose-toxicity surface. As stated previously, the model used in this research was chosen because of its parsimony, its ability to satisfy all of the aforementioned assumptions in Section 3, and its ability to model different forms of interactive behaviour. This does not serve as a formal recommendation of this particular model, and other binary regression models are available for use in dual-agent phase I dose-escalation trials [12]. Before deciding how to model the dose-toxicity surface, investigators and statisticians should discuss the various aspects of a proposed trial in order to consider all possible options for the trial conduct. It may be the case that other dose-toxicity models proposed in the literature, novel extensions of these or indeed entirely new methodology developed specifically for a particular trial will serve as the best method [17]. Further to the research shown here, we also considered an alternative parameterisation of the probability of DLT and assessed how end-of-trial MTD recommendations differed; there was very little difference from those shown under the proposed model in equations 3, 4 and 5.
One key point of discussion is the assumption that the probability of DLT in the time interval [0, t_B) at dose a_j of drug A is linearly proportional to the probability of DLT in the time interval [0, T] at dose combination (a_j, 0). This is a rather neat and simple assumption regarding the nature of the dose-toxicity relationship between the two drugs. The sensitivity analysis in Subsection 5.3 investigated model performance when this assumption was not true for one scenario and found that the SA approach was better than the NA approach at correctly identifying MTD combinations and selecting doses near the MTD, but the NA approach was slightly better with regard to experimentation. Perhaps a more advanced idea would be that the time to a DLT occurring follows some exponential distribution, dependent on the type and number of drugs given. One may instead assume that the probability of DLT by time t_B is proportional to some other, possibly non-linear, function e of the probability of DLT by time T. Alternatively, the probability of DLT by t_B may not be related to that by T at all, requiring two different probability functions to be chosen. However, because dose-escalation decisions are made based on the probability of DLT over [0, T] alone, this would require some modification to the function that determines which dose combination to give to the next cohort, and indeed which to recommend at the end of the trial. As it stands, the current simplifying proportionality assumption seems sensible and reduces the complexity of this dose-escalation approach. With respect to accuracy of dose-toxicity modelling, it may be the case that the choice of probability function over [0, T] has a far bigger role to play than the postulated link between the probabilities over [0, t_B) and [0, T].
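The exponential-time alternative mentioned above can be sketched numerically. Assuming, purely for illustration, that T = 14 and t_B = 8 (so that the linear link corresponds to TR = 8/14) and a hypothetical rate parameter lam, the time to DLT within the cycle can be taken as a truncated exponential; the implied probability of DLT by t_B then differs from the linear link:

```python
import math

T, t_B = 14.0, 8.0  # illustrative cycle length and time of drug B

def p_dlt_by(t, p_T, lam):
    """P(DLT by time t) when the time to DLT, conditional on a DLT
    occurring within [0, T], follows an exponential with rate lam
    truncated to [0, T]."""
    return p_T * (1.0 - math.exp(-lam * t)) / (1.0 - math.exp(-lam * T))

p_T = 0.30                             # hypothetical P(DLT in [0, T])
linear = p_T * t_B / T                 # linear link: TR = t_B / T
expo = p_dlt_by(t_B, p_T, lam=0.2)     # exponential-time alternative

# With lam > 0 the exponential link front-loads DLTs, so expo > linear;
# as lam -> 0 the exponential link recovers the linear one.
```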
Based on the work conducted in this paper, the methodology relating to SA toxicity may be applied to dual-agent trials that incorporate non-concurrent drug administration, but should be tailored to the specific trial. If information relating to non-overlapping toxicities is known, or if the clinician can distinguish drug-related toxicity from disease-related toxicity, then the dose-toxicity model may be modified so that such information can be used to guide dose-escalation/de-escalation. Furthermore, if pharmacokinetic/pharmacodynamic data can be used to help predict the probability of DLT occurring over a particular interval, and to inform the potential for carry-over effects both at the point of administering drug B and even between cycles, then this could be incorporated to make dose-escalation methods more advanced and realistic. The work presented here marks a novel and firm starting point for considering individual trial aspects to tailor advanced Bayesian methodology to a clinical research question of interest.
Considering the results obtained and the limitations identified, further areas of research can be explored. Aside from modifications already addressed, such as model choice, the assumptions linking the probabilities of DLT by times t_B and T, and the inclusion of more than two drugs in the model, it would be particularly interesting to consider how multiple toxicities and their gradings influence dose-escalation/de-escalation decisions for drugs administered non-concurrently. An additional point of interest would be to consider the occurrence of DLTs outside of the first cycle of treatment, where DLT responses are traditionally recorded, particularly in a dual-agent trial with non-concurrent drug administration. A tougher practical consideration would be to see whether the time of administration of drug B could be adapted during the trial so that more patients are given the full combination, while still keeping the sequential administration structure.