Contrasting principal stratum and hypothetical strategy estimands in multi‐period crossover trials with incomplete data

Complete case analyses of complete crossover designs provide an opportunity to make comparisons based on patients who can tolerate all treatments. It is argued that this provides a means of estimating a principal stratum strategy estimand, something which is difficult to do in parallel group trials. While some trial users will consider this a relevant aim, others may be interested in hypothetical strategy estimands, that is, the effect that would be found if all patients completed the trial. Whether these estimands differ importantly is a question of interest to the different users of the trial results. This paper derives the difference between principal stratum strategy and hypothetical strategy estimands, where the former is estimated by a complete‐case analysis of the crossover design, and a model for the dropout process is assumed. Complete crossover designs, that is, those where all treatments appear in all sequences, and which compare t treatments over p periods with respect to a continuous outcome are considered. Numerical results are presented for Williams designs with four and six periods. Results from a trial of obstructive sleep apnoea‐hypopnoea (TOMADO) are also used for illustration. The results demonstrate that the percentage difference between the estimands is modest, exceeding 5% only when the trial has been severely affected by dropouts or if the within‐subject correlation is low.

of this thinking appears in an addendum ICH E9(R1), (International Council for Harmonisation, 2019), to the ICH E9 guidelines on statistical principles in clinical trials.
Identifying the most pertinent estimand requires careful consideration of many features that may arise in the conduct of the study, such as intercurrent events or patient withdrawal. Once an estimand has been identified, the design, data collection and analysis need to be chosen accordingly. The ICH E9(R1) addendum discusses several estimands, such as the treatment policy strategy, which aligns closely with the intention-to-treat (ITT) principle, or the hypothetical strategy, in which it is assumed that all patients adhere to the trial as planned.
An estimand that might have particular clinical relevance is the one which compares the effect of treatments among those patients who can tolerate them: a principal stratum strategy (PSS). In a trial comparing a new treatment, A, with the standard, S, this approach would compare those who receive, and tolerate, A with those receiving S who would have tolerated A had they received it (and vice versa). In a parallel group trial those allocated to S do not also receive A, and vice versa, so the relevant subgroups cannot be identified. However, crossover designs, allocating as they do several trial treatments to each patient, provide greater opportunities for the implementation of a PSS, because comparisons are within patients. For example, inference might be based on the population of patients who can tolerate all treatments being studied. This naturally leads to the use of complete crossover designs, where all treatments are present in all treatment sequences, and the use of a complete case (CC) analysis, which excludes all patients who do not provide a response in every treatment period.
This contrasts with a common analysis which uses restricted maximum likelihood (REML) to fit a linear mixed-effects model to all available data. With the assumption that data are missing at random (MAR) (Rubin, 1976), this will provide an unbiased estimate of a hypothetical strategy (HS) estimand.
While specification of a target estimand can clarify the aim of the study and guide the design and analysis, trials will often provide data of interest to a range of stakeholders, not all of whom may wish to focus on the same estimand. The designation of primary and secondary estimands has been discussed (Leuchs et al., 2015; Mallinckrodt et al., 2017): for example, clinicians and regulators may focus on a PSS estimand, while drug developers may favor an HS estimand. This paper addresses the difference between PSS and HS estimands for treatment effects in complete crossover trials, where the former is estimated using a CC analysis. This is done by computing the expectation of the treatment effect in a CC analysis in terms of the treatment parameter for the HS estimand. The latter will be the expectation from the standard linear mixed effects analysis, if the missing data are MAR. The two would coincide if no patients dropped out, so the difference will inevitably depend on a model for the dropout probabilities. This is essentially the analysis presented for the AB/BA design in Matthews et al. (2014).
Other contributions to the issues that arise in crossover designs with missing data include the effect of data missing not at random (Ho et al., 2012), joint modeling of dropout and response (Basu & Santra, 2010; Wang & Chinchilli, 2021) and missingness being related to threshold exceedance (Liu, 2011). A different strand of research focuses on the fact that while the planned design, $\mathcal{D}$, is assumed to be connected, that is, all treatment contrasts are estimable, the design that results after missing data are taken into account, $\tilde{\mathcal{D}}$, may be disconnected (Bate et al., 2008; Godolphin, 2006; Godolphin & Godolphin, 2019; Low et al., 1999; Majumdar et al., 2008; Prescott & Mansson, 2001; Varghese et al., 2002). These papers take a different approach to the one in the present paper, as they assume that data are missing completely at random (MCAR) and analyze all available data. Nevertheless, our calculations allow us to comment on the issue of connectedness as it applies to the approach presented here.
In Section 2, the proposed analysis and associated notation are introduced, along with the form of the missingness process. In Section 3, the expectation of the CC estimate of the treatment effect, the PSS estimand, is evaluated in terms of the HS estimand. Section 4 presents numerical results for selected four- and six-period complete crossover designs based on Williams' squares. Results on disconnection in this context are given in Section 5 and discussion and possible extensions are in Section 6.

Estimands and estimation
It is assumed that the design compares $t$ treatments over $p$ periods and that the outcome variable is continuous. In particular, if the response of patient $i$ in period $j$ is $y_{i,j}$, then the model usually adopted for a crossover study is some variant of
$$y_{i,j} = \pi_j + \tau_{d(i,j)} + \xi_i + \epsilon_{i,j}, \qquad i = 1, \dots, n;\ j = 1, \dots, p. \tag{1}$$
Here, $\xi_i$ and $\epsilon_{i,j}$ are normally distributed random patient effects and residuals, with mean 0 and variances $\sigma_\xi^2$ and $\sigma_\epsilon^2$, respectively. The effect of period $j$ is $\pi_j$, the effect of treatment $k$, $k = 1, \dots, t$, is $\tau_k$ (with $\tau_1 = 0$ for identifiability) and $d(i,j)$ denotes the treatment allocated to patient $i$ in period $j$. We use $\pi$, $\tau$ to denote the vectors of the period and treatment effects, although the latter will usually require further qualification. We also assume that there is no treatment by period interaction. In a crossover design, such a term is closely related to treatment carryover effects (in the AB/BA design the two are wholly confounded), so some practitioners may include a first-order carryover term in (1): indeed, much of the work on connectedness cited in Section 1 does do this. However, the adequacy of the usual form of carryover to account for any realistic mechanism by which the effect of a treatment might persist has been questioned, perhaps most strongly in Senn (2002, Chapter 10). It is now common to use (1) and justify the absence of carryover on non-statistical grounds, such as requiring observations in successive periods to be sufficiently separated in time: this is the stance taken in the present paper.
It is also assumed that each patient provides a single outcome in each period. Although patients may be observed on a number of occasions during a period, this assumption is probably less restrictive than might be imagined. The principal focus may be on the final observation in a period, at least partly to minimize the chance of a carryover effect, or the analysis may use a summary of the observations made on the patient during the period. If some of the components of this summary are missing, then whether the summary is deemed to be missing, or it is calculated using the available observations, is likely to depend on the context and the judgment of the analyst.
The random terms in (1) give rise to an equicorrelation, or compound symmetry, structure for the dispersion of the vector of observations on patient $i$, $y_i$, where $\mathrm{var}(y_{i,j}) = \sigma^2 = \sigma_\xi^2 + \sigma_\epsilon^2$ and, for $j \ne j'$, $\mathrm{cov}(y_{i,j}, y_{i,j'}) = \rho\sigma^2 = \sigma_\xi^2$, with $\rho$ the intra-class correlation coefficient. The $p \times p$ matrix $R$ is defined to have ones on the diagonal, with all other elements being $\rho$, so the dispersion matrix of a completely observed patient would be $\sigma^2 R$.
If model (1) was applied to the population where it is supposed that all patients provide results in every period, then $\tau$ would be an HS estimand, and to distinguish it from other estimands it is denoted by $\tau_{HS}$. In practice, some patients may provide fewer than $p$ observations and a widely used method of estimation (see, e.g., Jones and Kenward 2014, Chap. 5) is to apply REML to all available data. If the data were MAR, this method would give an unbiased estimate of $\tau_{HS}$ (Kackar & Harville, 1984).
If REML and Equation (1) are applied only to the population of patients who provide an observation in all periods, that is, a CC analysis, then $\tau$ constitutes a PSS estimand, which we denote by $\tau_{PSS}$. The expectation of the treatment estimator from this analysis, $\hat\tau_{PSS}$, will be $\tau_{PSS}$, but this can also be expressed in terms of $\tau_{HS}$ if we postulate a model for the missingness process which gives rise to the difference between the populations. The first step is to derive an expression for $\hat\tau_{PSS}$ found from a CC analysis.
The vector of observations, $y_i$, on any patient included in the CC analysis will have dimension $p$. For this population, the $mp \times (p + t - 1)$ design matrix for the model in Equation (1) is $X = (X_1^T, X_2^T, \dots, X_m^T)^T$ and $X_i$ is the $p \times (p + t - 1)$ design matrix for patient $i$. It will be assumed that the design is composed of $S$ distinct treatment sequences: the design matrix will be the same for all patients allocated to a given sequence, so we write $\mathcal{X}_s$, $s = 1, \dots, S$, for the design matrices corresponding to the sequences. If patient $i$ is allocated to sequence $s$, $X_i$ is identified with $\mathcal{X}_s$. While the design allocates $n_s$ patients to sequence $s$, only $m_s$ of these are CCs, so $m = \sum_s m_s$. If the responses in $y_i$ are ordered by time, then each $\mathcal{X}_s$ will be of the form $(I_p \mid T_s)$, where $I_p$, the $p \times p$ identity matrix, is associated with the period effects and $T_s$ is a $p \times (t - 1)$ matrix describing the treatment allocation in sequence $s$.
The responses on different patients are assumed to be independent, so the treatment estimator for the CC analysis, $\hat\tau_{PSS}$, can be expressed as
$$\hat\tau_{PSS} = M^{-1} \sum_{s=1}^{S} m_s (T_s - \bar T)^T \hat R^{-1} \bar y_s, \tag{2}$$
where
$$M = \sum_{s=1}^{S} m_s (T_s - \bar T)^T \hat R^{-1} (T_s - \bar T), \qquad \bar T = \frac{\sum_s m_s T_s}{\sum_s m_s}. \tag{3}$$
Here, $\bar y_s$ is the mean of all the complete $y_i$ allocated to sequence $s$, and $\hat R$ is $R$ evaluated at $\rho = \hat\rho$, the REML estimator of $\rho$. An important observation, demonstrated in Web Appendix A, is that for the CC analysis of complete crossover designs and $R$ having compound symmetry, the matrices $M^{-1} (T_s - \bar T)^T \hat R^{-1}$, $s = 1, \dots, S$, are invariant with respect to $\hat\rho$, and hence so is $\hat\tau_{PSS}$. Consequently, Equation (2) can be greatly simplified by setting $\hat\rho$ equal to 0, so that $\hat R$ is replaced by $I_p$.
In order to make progress with evaluating $E(\hat\tau_{PSS})$ in terms of $\tau_{HS}$, it is necessary to specify more precisely how missing data arise.
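One way to see why $\hat\rho$ drops out of the CC estimator is sketched below. This is not the argument of Web Appendix A, which is not reproduced here, and it rests on assumptions about notation: $T_s$ denotes the $p \times (t-1)$ treatment part of the design matrix for sequence $s$, $\bar T$ is the average of the $T_s$ weighted by the numbers of CCs $m_s$, $1_p$ is a vector of ones, and each treatment is taken to occur equally often in every sequence (as in a Williams square), so that $(T_s - \bar T)^T 1_p = 0$.

```latex
% Inverse of a compound-symmetric correlation matrix R = (1-\rho) I_p + \rho 1_p 1_p^T
R^{-1} = \frac{1}{1-\rho}\left( I_p - \frac{\rho}{1 + (p-1)\rho}\, 1_p 1_p^T \right)

% With (T_s - \bar T)^T 1_p = 0 the rank-one term vanishes:
(T_s - \bar T)^T R^{-1} = \frac{1}{1-\rho}\,(T_s - \bar T)^T

% Hence, writing M_0 = \sum_s m_s (T_s - \bar T)^T (T_s - \bar T),
M = \sum_s m_s (T_s - \bar T)^T R^{-1} (T_s - \bar T) = \frac{1}{1-\rho}\, M_0 ,
\qquad
M^{-1} (T_s - \bar T)^T R^{-1} = M_0^{-1} (T_s - \bar T)^T
```

The factor $1/(1-\rho)$ cancels, so under these assumptions the weights applied to the sequence means do not involve $\rho$, which is why setting $\hat\rho = 0$ changes nothing.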

Missingness process
Attention will be restricted to monotonic missing patterns, so if a patient is missing in period $j$, they are also missing in period $j'$ for all $j' > j$. Patients with no responses at all cannot be accommodated, so patients are assumed to drop out after any of periods 1 to $p - 1$, or not to drop out at all. If patient $i$ drops out immediately after period $j \ge 1$, we define the dropout indicator $D_i$ to take the value $j$, so $D_i = p$ denotes a CC: for a study with $p = 4$ the possibilities are illustrated below, where o denotes an observed value and x is a missing value.

Period 1    Period 2    Period 3    Period 4    Dropout indicator
o           x           x           x           1
o           o           x           x           2
o           o           o           x           3
o           o           o           o           4
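A model for the missingness process is required and this is often specified in terms of suitable logistic regressions on the outcomes, but subsequent calculations will be expedited if a probit link is assumed. We will use a model of the form
$$\Pr(D_i = j \mid D_i \ge j, y_i) = \Phi\Big(\alpha_{j0} + \sum_{k=1}^{j+1} \alpha_{jk}\, y_{i,j+2-k}\Big), \qquad j = 1, \dots, p-1, \tag{4}$$
where $\Phi(\cdot)$ is the distribution function of a standard normal variable: note that $\Pr(D_i = p \mid D_i \ge p, y_i) = 1$. The models allow the probability of dropout to depend on all values previously observed on the patient and, when $\alpha_{j1} = 0$ for all $j$, correspond to MAR, unless $\alpha_{jk} = 0$ for all $k \ge 1$, which corresponds to MCAR. In practice, submodels with many fewer parameters may be sufficient but the level of generality in Equation (4) will be maintained for now.
Calculations used in the investigation of expectation in Section 3 require an expression for the probability that patient $i$ provides a CC, i.e. $\Pr(D_i = p \mid y_i)$. This can be built up sequentially from $\Pr(D_i = 1 \mid D_i \ge 1, y_i) = \Pr(D_i = 1 \mid y_i)$ and, for $j > 1$,
$$\Pr(D_i = j \mid y_i) = \Pr(D_i = j \mid D_i \ge j, y_i) \prod_{l=1}^{j-1} \{1 - \Pr(D_i = l \mid D_i \ge l, y_i)\}. \tag{5}$$
It follows from this and Equation (4) that the probability of a CC is
$$\Pr(D_i = p \mid y_i) = \prod_{j=1}^{p-1} \Phi\Big(-\alpha_{j0} - \sum_{k=1}^{j+1} \alpha_{jk}\, y_{i,j+2-k}\Big). \tag{6}$$
In the next section, it will be convenient to rewrite this product using $p$-dimensional vectors $a_j$, $j = 1, \dots, p-1$, defined using an appropriate pattern of zeros, so that Equation (6) becomes
$$\Pr(D_i = p \mid y_i) = \prod_{j=1}^{p-1} \Phi(-\alpha_{j0} - a_j^T y_i). \tag{7}$$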
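The product form of the CC probability can be illustrated numerically. The sketch below assumes the simplified dropout model used later in the numerical illustrations, in which dropout after a period depends only on that period's value and the next (possibly unobserved) value, with common coefficients across periods; the function name and argument names are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal distribution function


def prob_complete_case(y, alpha0, alpha1, alpha2):
    """Pr(complete case | y) under an assumed probit dropout model.

    Dropout immediately after period j has probability
    Phi(alpha0 + alpha1 * y[j+1] + alpha2 * y[j]), where y[j+1] is the
    value that would be lost first and y[j] is the last observed value.
    """
    prob = 1.0
    for j in range(len(y) - 1):  # dropout can occur after periods 1..p-1
        linear = alpha0 + alpha1 * y[j + 1] + alpha2 * y[j]
        prob *= _PHI(-linear)  # probability of NOT dropping out at this point
    return prob


if __name__ == "__main__":
    # MCAR: the outcomes play no role, so the result is Phi(-alpha0)**(p-1).
    print(prob_complete_case([10.0, 9.0, 9.5, 10.0], -2.5, 0.0, 0.0))
    # MAR-type dependence on the last observed value only.
    print(prob_complete_case([10.0, 9.0, 9.5, 10.0], -2.5, 0.0, 0.1))
```

Setting `alpha1 = 0` removes the dependence on the unobserved value, and setting both slope coefficients to zero gives the MCAR case.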

Conditional expectation
Expression (2) for the CC estimator, $\hat\tau_{PSS}$, depends on the random variables $m_s$ as well as $\bar y_s$, so the first step is to evaluate $E(\hat\tau_{PSS} \mid m_1, \dots, m_S)$, which in turn requires the evaluation of $E(\bar y_s \mid m_1, \dots, m_S)$ for each sequence $s$. This is shown to be $E(y(s) \mid D(s) = p)$ in Web Appendix B in the Supporting information, where we write $y(s)$ for $y_i$ to emphasize that here we consider only those patients allocated to sequence $s$: $D(s)$ is defined analogously. The unconditional mean of the observations on an individual allocated to sequence $s$ is written as $\mu_s$, and as the unconditional expectation is taken over the hypothetical population with no missing values, $\mu_s = \pi + T_s \tau_{HS}$. Consequently, $E(y(s) \mid D(s) = p)$ can be found as
$$E(y(s) \mid D(s) = p) = \frac{\int y \,\Pr(D(s) = p \mid y)\, \phi_p(y; \mu_s, \sigma^2 R)\, dy}{\int \Pr(D(s) = p \mid y)\, \phi_p(y; \mu_s, \sigma^2 R)\, dy}, \tag{8}$$
where $\phi_p(\cdot; \mu, \Sigma)$ is the density of a $p$-dimensional multivariate normal distribution. If Equation (7) is substituted in the above then the integrals can be evaluated using the closed skew-normal distribution (González-Farías et al., 2004). To see this, we first recall that a random variable $y \in \mathbb{R}^p$ has a closed skew-normal distribution, $CSN_{p,q}(\mu, \Sigma, C, \nu, \Delta)$, if its density is
$$f(y) = \frac{\phi_p(y; \mu, \Sigma)\, \Phi_q(C(y - \mu); \nu, \Delta)}{\Phi_q(0; \nu, \Delta + C \Sigma C^T)}, \tag{9}$$
where $(\mu, \Sigma)$ and $(\nu, \Delta)$ are, respectively, $p$- and $q$-dimensional means and dispersion matrices and $C$ is a $q \times p$ matrix. The distribution function of a $q$-dimensional multivariate normal distribution with mean $\nu$ and dispersion $\Delta$ is denoted by $\Phi_q(\cdot; \nu, \Delta)$. In order to evaluate (8) we need to identify the numerator in Equation (9) with $\Pr(D(s) = p \mid y)\, \phi_p(y; \mu_s, \sigma^2 R)$: it will then follow that the denominator in Equation (8) must coincide with the denominator in Equation (9). This is achieved by setting $\mu = \mu_s$, $\Sigma = \sigma^2 R$ and choosing $C$, $\nu$ and $\Delta$ so that the expression in Equation (7) can be identified with $\Phi_q(C(y - \mu_s); \nu, \Delta)$. If we choose $q = p - 1$ and $\Delta = I_{p-1}$ then $\Phi_q(\cdot; \nu, \Delta)$ will become a product of $p - 1$ scalar normal distribution functions as in Equation (7). To reproduce the arguments in the factors in Equation (7) we need to define the $j$th row of $C$ to be $-a_j^T$.
For example, in a four-period design where missingness depends only on the current and immediately preceding value (i.e., $\alpha_{jk} = 0$, $k > 2$), then
$$C = -\begin{pmatrix} \alpha_{12} & \alpha_{11} & 0 & 0 \\ 0 & \alpha_{22} & \alpha_{21} & 0 \\ 0 & 0 & \alpha_{32} & \alpha_{31} \end{pmatrix}.$$
The identification is completed by noting that the $j$th element of $\nu$, $j = 1, \dots, p - 1$, needs to be the unconditional expectation of $\alpha_{j0} + a_j^T y(s)$: this is $\alpha_{j0} + a_j^T \mu_s = \Theta_{sj}$, say. It is convenient to write $\nu_s$ for the $p - 1$ dimensional vector of these quantities.
With these definitions, it follows that for sequence $s$
$$E(y(s) \mid D(s) = p) = \frac{\int y\, \phi_p(y; \mu_s, \sigma^2 R)\, \Phi_{p-1}(C(y - \mu_s); \nu_s, I_{p-1})\, dy}{\Phi_{p-1}(0; \nu_s, I_{p-1} + \sigma^2 C R C^T)},$$
which is the expectation of a $CSN_{p,p-1}(\mu_s, \sigma^2 R, C, \nu_s, I_{p-1})$ distribution. The moment generating function of the closed skew-normal distribution is available in González-Farías et al. (2004), and using this an expression for the mean is derived in Web Appendix C.
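The effect of conditioning on complete cases can be checked in the simplest setting. The sketch below is an illustration under stated assumptions, not the calculation of Web Appendix C: it takes $p = 2$, so there is a single dropout opportunity and the closed skew-normal mean reduces to a scalar truncated-normal formula, and it compares that closed form with a seeded simulation. The function names, the parameter values and the vector `a` of probit coefficients on $(y_1, y_2)$ are all hypothetical.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf, cdf


def cc_mean_two_period(mu, sigma2, rho, alpha0, a):
    """Closed-form E(y | complete case) for p = 2.

    A patient is retained with probability Phi(-alpha0 - a[0]*y1 - a[1]*y2).
    """
    cov = [[sigma2, rho * sigma2], [rho * sigma2, sigma2]]  # sigma^2 * R
    cov_a = [cov[0][0] * a[0] + cov[0][1] * a[1],
             cov[1][0] * a[0] + cov[1][1] * a[1]]
    theta = alpha0 + a[0] * mu[0] + a[1] * mu[1]   # mean of the probit index
    lam = 1.0 + a[0] * cov_a[0] + a[1] * cov_a[1]  # variance of index plus probit noise
    root = math.sqrt(lam)
    psi = STD.pdf(theta / root) / (root * STD.cdf(-theta / root))
    return [mu[0] - cov_a[0] * psi, mu[1] - cov_a[1] * psi]


def cc_mean_simulated(mu, sigma2, rho, alpha0, a, n=100_000, seed=1):
    """Monte Carlo estimate of the same conditional mean."""
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    total, kept = [0.0, 0.0], 0
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y1 = mu[0] + sd * z1
        y2 = mu[1] + sd * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        # Retained when a standard normal draw falls below minus the index.
        if rng.gauss(0, 1) <= -(alpha0 + a[0] * y1 + a[1] * y2):
            total[0] += y1
            total[1] += y2
            kept += 1
    return [total[0] / kept, total[1] / kept]


if __name__ == "__main__":
    args = ([1.0, 2.0], 1.0, 0.5, -1.0, [0.3, 0.4])
    print(cc_mean_two_period(*args))
    print(cc_mean_simulated(*args))
```

With positive coefficients in `a`, patients with larger responses are more likely to drop out, so the complete-case mean falls below the unconditional mean in both periods.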
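Provided the $m_s$ values permit the inversion of $M$, then
$$E(\hat\tau_{PSS} \mid m_1, \dots, m_S) = \tau_{HS} + \sigma^2 M^{-1} \sum_{s=1}^{S} m_s (T_s - \bar T)^T R\, C^T \psi_s, \tag{12}$$
where it should be remembered that $M$ is now
$$M = \sum_{s=1}^{S} m_s (T_s - \bar T)^T (T_s - \bar T), \qquad \bar T = \frac{\sum_s m_s T_s}{\sum_s m_s}.$$
The expression for the $j$th element of the $p - 1$ dimensional vector $\psi_s$ is derived in Web Appendix C, and can be written as follows, where for clarity we omit the subscript $s$ from $\nu$ and note that $\Lambda = I_{p-1} + \sigma^2 C R C^T$ does not depend on $s$,
$$\psi_j = \frac{\phi(\Theta_j; 0, \Lambda_{jj})\, \Phi_{p-2}\big(0;\ \nu_{-j} - \lambda_j \Theta_j / \Lambda_{jj},\ \Lambda_{-j} - \lambda_j \lambda_j^T / \Lambda_{jj}\big)}{\Phi_{p-1}(0; \nu, \Lambda)}.$$
Here, $\Theta_j$ is the $j$th element of $\nu$ and $\nu_{-j}$ is $\nu$ omitting $\Theta_j$.
The $j$th diagonal element of $\Lambda$ is $\Lambda_{jj}$, $\Lambda_{-j}$ is $\Lambda$ omitting the $j$th row and $j$th column and $\lambda_j$ is the $j$th column of $\Lambda$, omitting $\Lambda_{jj}$.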
Equation (12) expresses a key result: it shows that the difference between the expectation of $\hat\tau_{PSS}$ (conditional on the numbers of completely observed sequences) and $\tau_{HS}$ is a suitable combination of the elements of $\psi_s$, which measures the effect of including only CCs in the estimation. Taking the expectation over the $m_s$ allows the difference between $\tau_{PSS}$ and $\tau_{HS}$ to be evaluated.

Unconditional expectation
The estimand $\tau_{PSS}$ is the unconditional expectation of the CC treatment estimator, and this is found by taking the expectation of Equation (12) with respect to $m_s$, $s = 1, \dots, S$. These random variables are independent with $m_s \sim \text{binomial}(n_s, P_s)$, where $P_s$ denotes the probability that an individual allocated to sequence $s$ is a CC. This probability can be evaluated as
$$P_s = \int \Pr(D(s) = p \mid y)\, \phi_p(y; \mu_s, \sigma^2 R)\, dy = \Phi_{p-1}(0; \nu_s, \Lambda). \tag{14}$$
The expectation of Equation (12) can be approximated using a Taylor expansion about the mean of $m_1, \dots, m_S$, that is, about $n_1 P_1, \dots, n_S P_S$. A first-order expansion gives $\tau_{PSS} = E(\hat\tau_{PSS}) \approx E(\hat\tau_{PSS} \mid n_1 P_1, \dots, n_S P_S)$, whereas a second-order expansion, Equation (15), adds a correction term involving a quantity that is defined at the end of Web Appendix D.

ILLUSTRATIONS FOR FOUR- AND SIX-PERIOD DESIGNS
The difference between the estimands, $\tau_{PSS} - \tau_{HS}$, derived in Section 3.2, will be illustrated using two crossover designs, which are shown in Table 1. The designs are based on Williams squares (Williams, 1949), which are forms of Latin squares, so have equal numbers of treatments and periods, and are balanced so that each treatment follows every other treatment equally often. One design has four periods and comprises two Williams squares: this will be referred to as the Tomado design, as it was the design used by the investigators in the TOMADO trial of treatments for sleep apnoea-hypopnoea (Quinnell et al., 2014). It is reasonable to be concerned that problems with patients dropping out may be more severe in longer trials, so this is investigated using a design comprising two six-period Williams squares, which can be found as an example of a perpetually connected design in Godolphin and Godolphin (2019). Williams squares have optimal or near optimal properties for a model including a carryover effect of treatment. Although many trials now eschew such a term in favor of washout periods, these designs have been chosen because many practitioners still use them as their balance seems inherently attractive.

TABLE 1 The four- and six-period designs used in Section 4: columns are sequences and rows are periods, cell entries are treatments.
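The balance property of a Williams square can be checked directly. The sketch below uses a standard cyclic construction for an even number of treatments and is not necessarily the pair of squares shown in Table 1; treatments are labelled from 0, each inner list is one sequence (so the layout is transposed relative to Table 1), and the function names are hypothetical.

```python
from collections import Counter


def williams_square(t):
    """Return a t x t Williams square (t even) as a list of sequences."""
    if t % 2 != 0 or t < 2:
        raise ValueError("this construction needs an even number of treatments")
    first = [0]
    low, high = 1, t - 1
    while len(first) < t:  # interleave 1, t-1, 2, t-2, ...
        first.append(low)
        low += 1
        if len(first) < t:
            first.append(high)
            high -= 1
    # Remaining sequences are cyclic shifts of the first one.
    return [[(x + shift) % t for x in first] for shift in range(t)]


def carryover_counts(sequences):
    """Count how often each treatment immediately follows each other one."""
    counts = Counter()
    for seq in sequences:
        for previous, current in zip(seq, seq[1:]):
            counts[(previous, current)] += 1
    return counts


if __name__ == "__main__":
    for t in (4, 6):
        square = williams_square(t)
        print(t, square, set(carryover_counts(square).values()))
```

In a single square built this way every ordered pair of distinct treatments occurs exactly once in successive periods, which is the balance referred to in the text.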

Estimand difference and level of missingness
The difference in estimands is illustrated numerically with an example contrast for each design. These are shown for the Tomado and six-period designs in Figures 1 and 2, respectively, with the values assumed for the parameters in Equation (1) given in the captions. Whether an observation is missing is assumed to depend only on the observation itself and the immediately preceding observation, that is, $\alpha_{jk} = 0$, $k > 2$ and, for simplicity, that $\alpha_{j0} = \alpha_0$, $\alpha_{j1} = \alpha_1$, $\alpha_{j2} = \alpha_2$. This model allows the effects of MNAR ($\alpha_1 \ne 0$), MAR ($\alpha_1 = 0$, $\alpha_2 \ne 0$) and MCAR ($\alpha_1 = \alpha_2 = 0$) processes to be assessed. The percentage difference between the estimands for the contrast is $100(\tau_{PSS} - \tau_{HS})/\tau_{HS}$ = ED, say, with the numerator found from Equation (15)  = 2 for all . These values for the $\alpha$s have been chosen to ensure that the missingness probabilities (4), given the values assumed for the parameters in (1), are realistic and lead to plausible percentages of CCs, cf. Figures 1 and 2.
Summaries of the ED and CCs are shown in Table 2 for $\rho = 0.3, 0.8$ over the region $-1 < \alpha_1, \alpha_2 < 1$. In Figure 1 the ED for the contrast $\tau_4 - \tau_2$ from Tomado with a within-unit correlation $\rho = 0.3$ is largely less than 12% in magnitude, and is close to zero in the vicinity of $\alpha_1 = \alpha_2 = 0$, corresponding to MCAR. The ED values are smaller when the within-unit correlation is higher, being generally less than 5% when $\rho = 0.8$, cf. Table 2. The upper plots in Figures 1 and 2 show a broadly elliptical pattern, with bias being least along the line $\alpha_2 = \alpha_1$ and largest along $\alpha_2 = -\alpha_1$, which is consistent with the result for the AB/BA design (Matthews et al., 2014). The largest differences between the estimands seen in Table 2 arise in small parts of the $(\alpha_1, \alpha_2)$ region, close to the extremes where $\alpha_1 = -\alpha_2$: this is where the missingness probabilities depend on the change in response between successive periods, rather than their general level.
The corresponding results for the six-period Williams design are very similar for ED with, as might be anticipated, slightly lower values for percentage of CCs for the longer design (cf. Table 2).
From Table 2 and from the contour plots, it is seen that the ED is always negative, indicating that the contrast in $\tau_{PSS}$ is closer to zero than that in $\tau_{HS}$. This is also consistent with the result in Matthews et al. (2014).
The missingness process corresponds to MAR when $\alpha_1 = 0$. In this case, the standard REML analysis of all available data would provide an unbiased estimate of $\tau_{HS}$. The ED and percentage CCs for the CC analysis when data are MAR are shown in the top row of Figure 3 for both designs and for $\rho = 0.3, 0.8$. The ED is less than 5% in magnitude for all $\alpha_2$ when $\rho = 0.8$, and is less than 15% when $\rho = 0.3$. In the latter case, the ED is less than 5%, provided the percentage of CCs is above about 85%. The lower row in Figure 3 shows a very similar picture when $\alpha_1$ varies, while $\alpha_2 = 0$, an instance of data MNAR. In all cases, the ED is zero when data are MCAR, that is, $\alpha_1 = \alpha_2 = 0$, but there is some loss of cases because of the role of $\alpha_0 = -2.5$. From Equation (6), the probability of a CC in this instance is $\Phi(-\alpha_0)^{p-1}$, which is 98% for the four-period designs and 97% for the six-period design, as confirmed in Table 2. These values change to 81% and 71% if $\alpha_0 = -1.5$, as can be seen in Web Appendix E.
The results are presented in terms of the percentage difference in estimands, calculated as $100(\tau_{PSS} - \tau_{HS})/\tau_{HS}$, but based on calculations using an assumed value for the contrast in $\tau_{HS}$. Results for other values of the contrast (not shown) illustrate that the percentage difference is largely unaffected by the size of the contrast, so percentage difference is a suitable summary. This accords with the approximation derived for the AB/BA design in Matthews et al. (2014), where the estimand difference is shown to be approximately proportional to the HS estimand. If the contrast in $\tau_{HS}$ is zero, then the CC estimator of the contrasts appears to be zero across the $(\alpha_1, \alpha_2)$ plane (see Web Appendix E), suggesting that in this case the contrast in $\tau_{PSS}$ also vanishes.
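The MCAR percentages quoted above follow from the product form of the CC probability and are easy to reproduce. The sketch below is a minimal check using only the Python standard library; the function name is hypothetical.

```python
from statistics import NormalDist


def mcar_cc_probability(alpha0, periods):
    """Probability of a complete case under MCAR: Phi(-alpha0) ** (periods - 1)."""
    return NormalDist().cdf(-alpha0) ** (periods - 1)


if __name__ == "__main__":
    for alpha0 in (-2.5, -1.5):
        for periods in (4, 6):
            pct = 100 * mcar_cc_probability(alpha0, periods)
            print(f"alpha0={alpha0}, periods={periods}: {pct:.0f}% complete cases")
```

The exponent is the number of dropout opportunities, one after each of the first `periods - 1` periods, which is why the longer design retains slightly fewer complete cases for the same intercept.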

Illustrations from the TOMADO trial
The TOMADO trial (Quinnell et al., 2014) compared three mandibular advancement devices with no intervention for the treatment of mild to moderate obstructive sleep apnoea-hypopnoea, using the four-period design in Table 1: the outcome considered here is the Epworth Sleepiness Scale (ESS). Treatments 1 to 4 are, respectively, no intervention, self-fitted (SP1), semi-bespoke (SP2), and bespoke (bMAD).
Ninety patients were randomized in the trial, but seven did not complete even the first period and provided no data: these patients play no part in our analysis. Seventy-four patients provided complete data and some data were available on 83 patients, providing 314 observations. Of the 83 patients, 10 were allocated to each of sequences 2, 5, and 6, with 9 allocated to sequence 8 and 11 to each of the other sequences. Analysis of all available data using a linear mixed effects model and REML gave the estimates of $\pi_1, \dots, \pi_4, \tau_2, \dots, \tau_4$ as 10.65, 9.88, 9.69, 10.10, −1.51, −2.15, −2.37 with estimates for $\rho$, $\sigma^2$ of 0.61 and 16.63, respectively. The corresponding estimates from an analysis of the 74 CCs were very similar: the period and treatment parameters were 10.58, 9.93, 9.67, 10.09, −1.47, −2.30, −2.41, with $\rho$ and $\sigma^2$ being 0.62 and 16.43, respectively. In both analyses, the standard errors of these estimates of period parameters are approximately 0.52, and approximately 0.40 for the treatment parameters. Figure 4A,B correspond to the contour plots in Figure 1 but assume the parameter values from the analysis of all available data (the plots are indistinguishable from those based on the CC analysis). The scale of the outcome in TOMADO, ESS, differs from that used in Figure 1, with larger means and variances. As such the region $-1 < \alpha_1, \alpha_2 < 1$ would include very extreme missingness probabilities, so a range $-0.2 < \alpha_1, \alpha_2 < 0.2$ is more appropriate.

FIGURE 3 ED and expected percentage of complete cases for the Tomado (solid line) and six-period Williams (dashed line) designs for $\rho = 0.3$ and for Tomado (dotted line) and six-period Williams (dot-dashed line) designs for $\rho = 0.8$: for all cases = 2 for all . Selected contrast is $\tau_4 - \tau_2$ for the four-period design and $\tau_5 - \tau_2$ for the six-period design. Top row of plots shows ED and percentage expected number of CCs as $\alpha_2$ varies with $\alpha_1 = 0$, that is, corresponding to missing at random (MAR). Bottom row of plots shows corresponding plots as $\alpha_1$ varies with $\alpha_2 = 0$, so cases here are MNAR. In all cases, $\alpha_0 = -2.5$. All plots show the MCAR case when the abscissa is zero.
Values for $\alpha_0, \alpha_1, \alpha_2$ cannot be estimated from the data. However, an illustration of the effect on the difference in the PSS and HS estimands of different values of these parameters can be provided by locating triples $(\alpha_0, \alpha_1, \alpha_2)$, where the expected proportion of CCs matches the observed proportion, 74/83. Starting from the MCAR case, where $\alpha_1 = \alpha_2 = 0$, $\alpha_0 = -\Phi^{-1}(\sqrt[3]{74/83}) = -1.78$, the solid line in Figure 4C is the locus in the $(\alpha_1, \alpha_2)$ plane, where the expected proportion of CCs is within 0.01 of 74/83. The other lines give the corresponding locus when $\alpha_0$ takes the values given in the caption to Figure 4. The curves in Figure 4D show the ED for the SP1-bMAD contrast plotted against $\alpha_1$, as $(\alpha_1, \alpha_2)$ track along the corresponding locus in Figure 4C. The ED is zero for the MCAR case, and departs from zero by up to −4.5% at the extremes of the loci shown in Figure 4C. This variation in ED occurs because of the changes in the values of $\alpha_1$ and $\alpha_2$, and not because of any change in the level of missingness, which is fixed at 74/83.
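The MCAR starting value of the intercept can be recovered by inverting the MCAR expression for the proportion of complete cases. The sketch below does this with the standard library; the function name is hypothetical, and the three dropout opportunities correspond to the four-period TOMADO design.

```python
from statistics import NormalDist


def mcar_intercept(n_complete, n_total, periods):
    """Intercept alpha0 for which Phi(-alpha0) ** (periods - 1) equals
    the observed proportion of complete cases."""
    per_period_retention = (n_complete / n_total) ** (1.0 / (periods - 1))
    return -NormalDist().inv_cdf(per_period_retention)


if __name__ == "__main__":
    # 74 complete cases out of 83 patients with data, four periods.
    print(round(mcar_intercept(74, 83, 4), 2))
```

The cube root appears because the probability of completing the trial is the product of three equal per-period retention probabilities when the data are MCAR.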

SOME COMMENTS ON DISCONNECTED DESIGNS
It was pointed out in Section 1 that missingness could mean that $\tilde{\mathcal{D}}$ is disconnected. However, disconnection with regard to CC analyses is rather different to that in the research cited in Section 1, which generally assumes that data are MCAR and all available data are analyzed using a model with a carryover treatment effect. In this paper, where $\tilde{\mathcal{D}}$ comprises only the CCs, and using model (1), $\tilde{\mathcal{D}}$ cannot be disconnected unless there are some sequences with no CCs. To see this write Equation (3) as $M = \sum_s m_s M_s$, with all $m_s > 0$, and note that disconnection would imply the existence of a non-zero vector $v$ such that $v^T M v = 0$. It would follow that $v^T M_s v = 0$ for all $s$, as the $M_s$ are non-negative definite, so $v^T (\sum_s n_s M_s) v = 0$, that is, $\mathcal{D}$ is disconnected, contrary to assumption.
As a consequence, an upper bound on the probability that $\tilde{\mathcal{D}}$ is disconnected is $1 - \prod_{s=1}^{S} [1 - (1 - P_s)^{n_s}]$, with $P_s$ given in Equation (14). When the $n_s$ are reasonably large, as in the TOMADO trial, this bound is likely to demonstrate that there is little chance that $\tilde{\mathcal{D}}$ is disconnected. Using parameter values from the TOMADO trial, and assuming the missingness process has $\alpha_{jk} = 0$ for $k > 2$, and $\alpha_0 = -1.78$, then the medians of the $P_s$ over the region $-0.2 < \alpha_1, \alpha_2 < 0.2$ are around 0.85 and occur at similar points in the plane. For example, at $(\alpha_1, \alpha_2) = (0.144, 0.168)$, the $P_s$ are 0.89, 0.85, 0.88, 0.87, 0.88, 0.85, 0.88, 0.88: with the $n_s$ specified in Section 4.2 the above upper bound on the probability of disconnection is less than $10^{-7}$. However, in smaller trials the upper bound given above will be much less useful: some comments on a more careful analysis are in Web Appendix F.
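The bound can be evaluated for the TOMADO example from the quoted complete-case probabilities and the sequence sizes reported in Section 4.2 (10 patients in each of sequences 2, 5 and 6, 9 in sequence 8 and 11 in the others). The sketch below is a minimal illustration with a hypothetical function name; it uses `log1p` and `expm1` only to keep the very small result numerically stable.

```python
import math


def disconnection_upper_bound(cc_probs, n_allocated):
    """Upper bound 1 - prod_s [1 - (1 - P_s) ** n_s] on the probability that
    at least one sequence contributes no complete cases."""
    log_all_sequences_present = sum(
        math.log1p(-((1.0 - p) ** n)) for p, n in zip(cc_probs, n_allocated)
    )
    return -math.expm1(log_all_sequences_present)


if __name__ == "__main__":
    cc_probs = [0.89, 0.85, 0.88, 0.87, 0.88, 0.85, 0.88, 0.88]  # sequences 1..8
    n_allocated = [11, 10, 11, 11, 10, 10, 11, 9]                 # sequences 1..8
    print(disconnection_upper_bound(cc_probs, n_allocated))
```

Each factor in the product is the probability that a given sequence has at least one complete case, so the bound is driven by the smallest sequences and the lowest complete-case probabilities.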

DISCUSSION
In this paper, we have argued that a CC analysis of a complete crossover design provides a way to estimate a PSS estimand: although couched in different terms, this is essentially the point made in Matthews et al. (2014) for the AB/BA design. This estimand will give some indication of the treatment efficacies among those able to tolerate all the treatments, so is likely to be relevant to those involved with administering the treatments in practice. A complete crossover design is able to do this in a way that a parallel group design cannot because recruits are, at some stage of the trial, offered all the treatments under investigation. It is perhaps surprising that this potential advantage of crossover trials was not mentioned in the ICH E9(R1) guideline (International Council for Harmonisation, 2019). The ICH guideline discusses other estimands and these may be relevant to other users. The treatment policy strategy estimand aligns most closely with an ITT approach and it is notable that writing 35 years ago Lewis (1987) cast doubt on the relevance of the ITT principle for crossover designs. Knowledge of the HS estimand may be useful for those involved with development of the treatments, as it is the quantity that would be estimated from Equation (1) if all planned observations were made. The populations on which the HS and PSS estimands are based differ by the patients who drop out of the study before they complete their allocated treatments. With our definition, an estimate of $\tau_{PSS}$ can always be obtained from a CC analysis. If all available data are analyzed using REML this is also true for $\tau_{HS}$, provided that any missing data are MAR.
While the method developed in Sections 2 and 3 can be applied to any complete crossover design, the numerical illustrations are largely based on a Markovian missingness process and the widely used Williams' crossover designs. Perhaps the principal observation from Figures 1 and 2 is that the relative difference between the PSS and HS estimands, ED, decreases as the correlation between observations in successive periods increases. Data are MCAR at the origin in the $(\alpha_1, \alpha_2)$ region, where the ED must vanish, and it is important to note that the ED remains small in an extended region around the origin for both designs used for the illustration. The ED is less than 3% across most of the plotted region when $\rho = 0.8$ and only exceeds 5% in relatively small regions close to $(\alpha_1, \alpha_2) = (-1, 1)$ and $(1, -1)$. The ED is rather larger when $\rho = 0.3$ but this is probably very much at the lower end of the range of $\rho$ likely to be encountered in crossover trials. The justification for using a crossover trial would not be strong for such low $\rho$. The value for the real TOMADO data is larger than 0.6, and the 10 values presented in Elbourne et al. (2002) for the correlations between outcomes are between 0.49 and 0.91 with a median of 0.75.
CCs are analyzed because these patients have been able to tolerate all the treatments. In practice, some patients who would have tolerated all treatments may fail to complete the trial for reasons not related to the treatment. If the investigator can be assured that the reasons for a patient dropping out of the trial are unrelated to their outcomes, that is, they are MCAR, then it may be that an analysis estimating $\tau_{PSS}$ could incorporate partial information from these patients. The results in Figures 1 and 2 show that $\tau_{PSS}$ is closer to zero than $\tau_{HS}$ and that the two coincide if all missing data are MCAR, so it is reasonable to conjecture that this adjustment to the population on which $\tau_{PSS}$ is based would reduce the discrepancy between $\tau_{PSS}$ and $\tau_{HS}$. The missingness model that we have used, Equation (4), is of a form widely used in the missing data literature. The use of the probit, rather than the more usual logistic, link is a minor adjustment made for mathematical convenience. The model assumes that the chance of dropout is related to the values of the outcome variables. This would be realistic in many settings, such as pain relief trials, where patients may not tolerate ineffective treatments. However, there may be other settings where the probability of dropout might need to be modeled otherwise, perhaps by including terms for allocated treatment. This possibility is something which might be addressed in further work.
If data are MAR, that is, along the $\alpha_1 = 0$ axis in Figures 1, 2, and 4, then $\tau_{HS}$ can be estimated using REML on all available data. Away from this axis, such estimates will be biased. The estimate of $\tau_{PSS}$ obtained from a CC analysis, together with plots such as Figure 4C,D, evaluated at the estimated model parameters, may provide the analyst with some information on the likely range of values for $\tau_{HS}$. Our results suggest that this difference is unlikely to be large unless so many observations are missing that this would, in itself, undermine the credibility of the study.

ACKNOWLEDGMENTS
This work was supported by the NIHR Research Methods Opportunity Funding Scheme (Grant number: RMOFS 2012/05). The authors are grateful to Dr Tim Quinnell for permission to use the data from the TOMADO trial (NIHR HTA Programme, Project number 08/110/03), to Dr Michael Grayling for his helpful comments and especially to the reviewers and Associate Editor, whose comments led to a radical change in the presentation.

DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analyzed in this paper.