Longitudinal analysis of pre‐ and post‐treatment measurements with equal baseline assumptions in randomized trials

Abstract For continuous outcomes in randomized controlled trials, longitudinal analysis of pre- and posttreatment measurements as bivariate responses has recently become one of the analytical methods used to compare two treatment groups. Under random allocation, the means and variances of pretreatment measurements are expected to be equal between groups, but the covariances and posttreatment variances are not. Under random allocation with unequal covariances and posttreatment variances, we compared the asymptotic variances of the treatment effect estimators in three longitudinal models. The data-generating model has equal baseline means and variances, and unequal covariances and posttreatment variances. The model with equal baseline means and unequal variance-covariance matrices has a redundant parameter. In large samples, these two models keep a nominal type I error rate and have high efficiency. The model with equal baseline means and equal variance-covariance matrices wrongly assumes equal covariances and posttreatment variances. This model keeps a nominal type I error rate only under equal sample sizes, where it has the same high efficiency as the data-generating model. In conclusion, longitudinal analysis with equal baseline means performed well in large samples. We also compared the asymptotic properties of the longitudinal models with those of the analysis of covariance (ANCOVA) and the t-test.

between groups are often observed in clinical trials, particularly in placebo-controlled trials. This variance-covariance structure has been discussed (Yang & Tsiatis, 2001). Chen (2006) considered, in simulation studies, a structure with unequal variance-covariance matrices, including unequal pretreatment variances. Winkens, van Breukelen, Schouten, and Berger (2007) examined the structure with equal pretreatment variances and unequal covariances and posttreatment variances analytically, but they used completely heterogeneous covariance matrices in their data analysis. Both Chen (2006) and Winkens et al. (2007) commented on the efficiency loss caused by not assuming equal pretreatment variances. The structure with equal variance-covariance matrices is also used (Chen, 2006; Winkens et al., 2007). The differences in asymptotic properties between the models with and without the constraints of equal baseline assumptions remain unclear. Several longitudinal models can be considered, depending on the assumptions about the pretreatment means and the elements of the variance-covariance matrices. Liang and Zeger (2000) used the model with equal pretreatment means under random allocation, and we also use this mean structure. Chen (2006), Winkens et al. (2007), and Yang and Tsiatis (2001) studied longitudinal analysis of pre- and posttreatment measurements in randomized trials and compared it with ANCOVAs or t-tests. However, detailed comparisons among the various models have not been conducted.
We compare the properties of these models. The ANCOVA has higher power to detect a treatment effect than the t-test under standard assumptions. In previous studies, we examined the asymptotic properties of four ANCOVA models in cases where the covariances and posttreatment variances differ between groups (Funatogawa, Funatogawa, & Shyr, 2011). The actual type I error rate of the usual ANCOVA with equal slopes (coefficients of pretreatment measurements) and equal residual variances is asymptotically at the nominal level only under equal sample sizes, whereas that of the ANCOVA with equal slopes and unequal residual variances is asymptotically at the nominal level even under unequal sample sizes. Under unequal sample sizes, the assumption of unequal residual variances is therefore important. However, the efficiency of the latter model is relatively low. The ANCOVA with unequal slopes has higher efficiency but cannot keep a nominal type I error rate irrespective of the variance assumptions. Yang and Tsiatis (2001) showed that the longitudinal model with equal baseline means and variances and unequal covariances and posttreatment variances has the same efficiency as the ANCOVA with unequal slopes. It would be beneficial to determine whether there are methods that have high efficiency and also keep a nominal type I error rate.
In this paper, we consider three longitudinal models and compare these with four ANCOVAs and the t-test on change and the t-test on posttreatment measurements. We investigate whether these models asymptotically keep a nominal type I error rate and the order of efficiency analytically based on the asymptotic variances of the treatment effect estimators and the model-based asymptotic variances. We also conduct simulation studies using the Kenward-Roger approximation (Kenward & Roger, 1997) for longitudinal models. The structure of this paper is as follows. In Section 2, we show a motivating example. In Section 3, we show the asymptotic properties of longitudinal models as well as the ANCOVAs and t-tests. In Section 4, we compare longitudinal models with the ANCOVAs and t-tests through simulation studies and actual data analysis. In Section 5, we offer a conclusion and discussion.

A MOTIVATING EXAMPLE
As an actual example of unequal covariances and posttreatment variances between groups, we show the results of a placebo-controlled, randomized trial of succimer in children (Rogan et al., 2001). The blood lead levels of a subsample of 100 children are analyzed in Fitzmaurice, Laird, and Ware (2004). The sample sizes are equal between the two groups. These data at baseline and one week after administration have been used previously as an example of unequal covariances and posttreatment variances in order to show the properties of ANCOVAs under random allocation. In this paper, we added small random values to provide the data. Because unequal sample sizes affect the results, we also produced data with unequal sample sizes by randomly removing half of the data from one group, as shown in Figure 1. The left panel shows all succimer data and half of the placebo data, and the right panel shows half of the succimer data and all placebo data. We analyze these data in Section 4.
The means and variances of the pretreatment measurements were almost the same between groups because of random allocation, whereas the covariances and the means and variances of the posttreatment measurements obviously differed. In the placebo group, the changes after the intervention were small, and the correlation between the pre- and posttreatment measurements was relatively strong. In the succimer group, the changes after the intervention were large, and the amount of change differed among individuals. Consequently, the correlation between the pre- and posttreatment measurements was weak, and the variance of the posttreatment measurements was larger than that of the placebo group and also larger than that of the pretreatment measurements.

Data-generating model
We describe a data-generating model that assumes only random allocation. We consider a randomized trial with two treatment groups. Let $Y_{ijk}$ denote, for the $i$-th ($i = 1, 2$) treatment group, the measurement of the $j$-th ($j = 1, \ldots, n_i$) subject at the $k$-th ($k = 1, 2$) time. $Y_{ij1}$ and $Y_{ij2}$ are the pre- and posttreatment measurements, respectively. We assume that the pair $(Y_{ij1}, Y_{ij2})$ has finite second moments and is distributed with the following mean vector and variance-covariance matrix:
$$\begin{pmatrix} \mu_{pre} \\ \mu_{ipost} \end{pmatrix}, \qquad \begin{pmatrix} \sigma^2_{pre} & \sigma_{icov} \\ \sigma_{icov} & \sigma^2_{ipost} \end{pmatrix}.$$
Random allocation guarantees equal values of $\mu_{pre}$ and $\sigma^2_{pre}$ between groups, but not of $\mu_{ipost}$, $\sigma_{icov}$, or $\sigma^2_{ipost}$. Because $\mu_{pre}$ is equal between groups, $\mu_{2post} - \mu_{1post}$ can be considered as the treatment effect. Here, we do not assume normality.
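As an illustration of this data structure, the following minimal sketch draws pre- and posttreatment pairs for two groups that share the pretreatment mean and variance but differ in covariance and posttreatment variance. Several choices are assumptions made only for the sketch: normal errors are used for convenience (the model itself does not require normality), the variance and covariance values are borrowed from the parameter setting of the simulation studies, and the means, the seed, the sample size, and the helper names `draw_pair` and `sample_cov` are hypothetical.

```python
import math
import random


def draw_pair(rng, mean_pre, mean_post, var_pre, cov, var_post):
    """Draw one (pre, post) pair with the given means and 2x2 covariance."""
    # Cholesky factor of [[var_pre, cov], [cov, var_post]]
    l11 = math.sqrt(var_pre)
    l21 = cov / l11
    l22 = math.sqrt(var_post - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return mean_pre + l11 * z1, mean_post + l21 * z1 + l22 * z2


def sample_cov(xs, ys):
    """Unbiased sample covariance (sample variance when xs is ys)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


rng = random.Random(2024)  # hypothetical seed
n = 20000                  # hypothetical per-group sample size

# Means (26, 14, 24) are hypothetical; (var_pre, cov, var_post) follow the
# simulation setting: (25, 15, 59) for Group 1 and (25, 23, 30) for Group 2.
group1 = [draw_pair(rng, 26.0, 14.0, 25.0, 15.0, 59.0) for _ in range(n)]
group2 = [draw_pair(rng, 26.0, 24.0, 25.0, 23.0, 30.0) for _ in range(n)]

pre1, post1 = zip(*group1)
pre2, post2 = zip(*group2)

# Pretreatment variances should agree between groups; covariances should not.
print("pre variances:", sample_cov(pre1, pre1), sample_cov(pre2, pre2))
print("covariances:  ", sample_cov(pre1, post1), sample_cov(pre2, post2))
```

With a large sample, the two pretreatment variances should be close to each other while the covariances and posttreatment variances clearly differ, which is the pattern the model is meant to capture.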

Longitudinal models and asymptotic properties
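We consider three longitudinal models for pre- and posttreatment measurements under random allocation in this section, and four ANCOVAs additionally in the next section for comparison. Table 1 summarizes these models. The first longitudinal model is the data-generating model, and we abbreviate it as EMVUV, from Equal baseline Means and Variances and Unequal covariances and posttreatment Variances. The model is
$$Y_{ijk} = \beta_{int} + \beta_{t}\, x_{t,ijk} + \beta_{trt}\, x_{trt,ijk} + \epsilon_{ijk},$$
where $x_{t,ijk}$ is 1 at posttreatment and 0 at pretreatment, $x_{trt,ijk}$ is 1 at posttreatment in the second group and 0 otherwise, and $\epsilon_{ijk}$ is a random error. Here, $\beta_{int} = \mu_{pre}$, $\beta_{int} + \beta_{t} = \mu_{1post}$, and $\beta_{int} + \beta_{t} + \beta_{trt} = \mu_{2post}$. $\beta_{t}$ is the time effect in the first group and $\beta_{t} + \beta_{trt}$ is the time effect in the second group. We are interested in $\beta_{trt} = \mu_{2post} - \mu_{1post}$, the difference in the posttreatment means between groups.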
In EMUV, Equal baseline Means and Unequal Variance matrices, the variance-covariance matrices differ between groups. This model has a redundant parameter, because $\sigma^2_{1pre}$ and $\sigma^2_{2pre}$ are used instead of the common $\sigma^2_{pre}$. In EMEV, Equal baseline Means and Equal Variance matrices, the variance-covariance matrices are common between groups. This model is incorrect when the covariances and posttreatment variances differ between groups, because $\sigma_{cov}$ and $\sigma^2_{post}$ are used instead of $\sigma_{1cov}$, $\sigma_{2cov}$, $\sigma^2_{1post}$, and $\sigma^2_{2post}$. For each model, the maximum likelihood estimators of $\beta = (\beta_{int}, \beta_{t}, \beta_{trt})'$ are obtained by $\hat{\beta} = (X' V^{-1} X)^{-1} X' V^{-1} Y$, where $Y$ is the response vector, $X$ is the design matrix, and $V$ is the variance-covariance matrix of $Y$ under the assumed model. The model-based asymptotic variance of $\hat{\beta}_{trt}$ is given by the corresponding element of $\mathrm{Cov}(\hat{\beta}) = (X' V^{-1} X)^{-1}$.
We provide the asymptotic variances of $\hat{\beta}_{trt}$ for the three longitudinal models. Table 2 summarizes the asymptotic variances. Here and throughout, we neglect $o(1/n)$ terms, with $n = n_1 + n_2$. For EMVUV, the asymptotic variance of $\hat{\beta}_{trt}$ is given as formula (I) in Table 2. EMUV has a redundant parameter, and its asymptotic variance of $\hat{\beta}_{trt}$ is the same as formula (I) under random allocation with $\sigma^2_{1pre} = \sigma^2_{2pre}$. For EMEV, the assumed model is incorrect. The asymptotic variance of $\hat{\beta}_{trt}$ under random allocation is the (3,3)-th element of the covariance matrix of $\hat{\beta}$ under the data-generating model, and it is given as formula (II) in Table 2. However, the model-based asymptotic variance of $\hat{\beta}_{trt}$ under random allocation is the (3,3)-th element of $E[(X' V^{-1} X)^{-1}]$. Table 3 shows the model-based asymptotic variances that are biased relative to the asymptotic variances given in Table 2, together with the bias, defined as the model-based asymptotic variance minus the asymptotic variance. For EMEV, there are both conservative and liberal cases under unequal sample sizes, and there is no bias under equal sample sizes. Under equal sample sizes, the asymptotic variances of EMVUV, EMUV, and EMEV are the same, and they are given as formula (VI) in Table 2.
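The model-based variance $(X' V^{-1} X)^{-1}$ can be evaluated numerically for a given covariance structure. The sketch below does this for the mean structure with intercept, time, and treatment-by-time parameters, treating the two group covariance matrices as known; that is an assumption made for illustration, since in practice they are estimated. The helper names and the use of the simulation-study covariance values are also choices of this sketch, not part of the original analysis.

```python
def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def model_based_var_trt(n1, n2, v1, v2):
    """(3,3) element of (X' V^-1 X)^-1 with known group covariances v1, v2."""
    # Per-subject design rows (intercept, time, treatment-by-time),
    # first row pretreatment, second row posttreatment.
    x_group1 = [[1, 0, 0], [1, 1, 0]]
    x_group2 = [[1, 0, 0], [1, 1, 1]]
    info = [[0.0] * 3 for _ in range(3)]
    for n, x, v in ((n1, x_group1, v1), (n2, x_group2, v2)):
        block = matmul(matmul(transpose(x), inv2(v)), x)
        for i in range(3):
            for j in range(3):
                info[i][j] += n * block[i][j]
    return inverse(info)[2][2]


# Covariance values taken from the simulation setting; treated as known here.
V1 = [[25.0, 15.0], [15.0, 59.0]]
V2 = [[25.0, 23.0], [23.0, 30.0]]
print(model_based_var_trt(300, 300, V1, V2))
```

Passing the same matrix for both groups corresponds to the equal-variance working assumption, so the function can also be used to see how the model-based variance changes when the covariance structure is misspecified.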
TABLE 2 Asymptotic variances of the treatment effect estimators for the longitudinal models, ANCOVAs, and t-tests under random allocation in the cases of (A) arbitrary allocation ratios and (B) equal sample sizes

ANCOVA_ESUV
( where * 1 = 1 ∕ 2 1res , * 2 = 2 ∕ 2 2res , * = 2 2 − * 1 ( 1 + 2 )∕( * 1 + * 2 ), and
In the ANCOVA with unequal slopes, the difference in the expected values of posttreatment measurements between groups depends on the pretreatment measurement, and the treatment effect is often estimated at the observed mean of the pretreatment measurements. In the ANCOVA with unequal slopes and equal residual variances, ANCOVA_USEV, the OLS estimator of the treatment effect at the observed mean is a consistent estimator of the treatment effect at the true mean (Yang & Tsiatis, 2001). This OLS estimator is identical to the GLS estimator for the ANCOVA with unequal slopes and unequal residual variances, ANCOVA_USUV. The asymptotic variance of the treatment effect estimator is the same as formula (I) in Table 2 (Yang & Tsiatis, 2001). However, it differs from the model-based asymptotic variances given in Table 3. In ANCOVA_USUV, irrespective of equal or unequal sample sizes, and in ANCOVA_USEV under equal sample sizes, the model-based variance underestimates the asymptotic variance by $(\sigma_{1cov} - \sigma_{2cov})^2 \{(n_1 + n_2)\sigma^2_{pre}\}^{-1}$, so the tests are always liberal. This occurs because the model-based variance is not corrected for estimating the unknown pretreatment expectation (Chen, 2006; Winkens et al., 2007). In ANCOVA_USEV under unequal sample sizes, there are both conservative and liberal cases.
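To make the idea of estimating the treatment effect at the observed pretreatment mean concrete, the following minimal sketch fits a separate least-squares line in each group and evaluates the difference between the two fitted lines at the overall observed mean of the pretreatment measurements. The function names and the toy data are hypothetical, and the sketch gives only the point estimate, not a variance.

```python
def ols_line(xs, ys):
    """Intercept and slope of the least-squares line of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def effect_at_observed_mean(pre1, post1, pre2, post2):
    """Group 2 minus group 1 fitted posttreatment value at the pooled pre mean."""
    a1, b1 = ols_line(pre1, post1)
    a2, b2 = ols_line(pre2, post2)
    xbar = (sum(pre1) + sum(pre2)) / (len(pre1) + len(pre2))
    return (a2 + b2 * xbar) - (a1 + b1 * xbar)


# Hypothetical toy data lying exactly on the lines post = 1 + 0.5 * pre
# (group 1) and post = 3 + 1.0 * pre (group 2).
pre1, post1 = [0.0, 2.0, 4.0], [1.0, 2.0, 3.0]
pre2, post2 = [1.0, 3.0, 5.0], [4.0, 6.0, 8.0]
print(effect_at_observed_mean(pre1, post1, pre2, post2))
```

Because the slopes differ, the estimate depends on the evaluation point, which is itself estimated from the data; this is the source of the extra variance term that the model-based variance of the unequal-slope ANCOVA omits.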
The point estimate of the t-test on change with unequal variances, t-test_ChangeUV, is the same as that with equal variances, t-test_ChangeEV. Under random allocation with $\sigma^2_{1pre} = \sigma^2_{2pre}$, the asymptotic variance of the treatment effect estimator is given as formula (IV), or (VIII) for $n_1 = n_2$, in Table 2. The point estimate of the t-test on posttreatment measurements with unequal variances, t-test_PostUV, is the same as that with equal variances, t-test_PostEV, and the asymptotic variance of the treatment effect estimator is given as formula (V), or (IX) for $n_1 = n_2$, in Table 2.
TABLE 3 Biased model-based asymptotic variances of the treatment effect estimators, bias, and bias under equal sample sizes

Columns: Methods; Model-based asymptotic variance of $\hat{\beta}_{trt}$; Bias^a; Bias under $n_1 = n_2$
EMEV, ANCOVA_ESEV: bias under $n_1 = n_2$ is 0
^a The bias is calculated as the model-based asymptotic variance minus the asymptotic variance.

Comparison of asymptotic variances
In this section, we compare the asymptotic variances of the treatment effect estimators. Let $V_{I}$ to $V_{V}$ be the asymptotic variances for formulae (I)-(V) shown in Table 2. We compare $V_{I}$ with the other asymptotic variances. The difference $V_{II} - V_{I}$, formula (1), is nonnegative. Therefore, $V_{I} \le V_{II}$, and equality holds under equal sample sizes. Although equality also holds under equal covariances, we consider data with unequal covariances in this paper. Formula (1) is the same as the difference in the asymptotic variances between ANCOVA_ESEV and ANCOVA_USEV in the previous studies on ANCOVAs. Similarly, $V_{I} \le V_{III}$ follows from the difference in the asymptotic variances between ANCOVA_ESUV and ANCOVA_USEV in those studies.

Simulation studies
For continuous pre- and posttreatment measurements in randomized trials, we compared the following methods: three longitudinal models, four ANCOVAs, and four t-tests. We compared the actual type I error rates of these methods, with a two-sided nominal type I error rate of 5%, and the root relative mean squared errors (RRMSE) in simulated experiments. We simulated 100,000 data sets with unequal covariances and posttreatment variances from the following model:
$$\begin{pmatrix} Y_{ij1} \\ Y_{ij2} \end{pmatrix} \sim \mathrm{MVN}\left( \begin{pmatrix} \mu_{pre} \\ \mu_{ipost} \end{pmatrix}, \begin{pmatrix} \sigma^2_{pre} & \sigma_{icov} \\ \sigma_{icov} & \sigma^2_{ipost} \end{pmatrix} \right),$$
where $Y_{ijk}$ is, for the $i$-th ($i = 1, 2$) treatment group, the measurement of the $j$-th ($j = 1, \ldots, n_i$) subject at the $k$-th ($k = 1, 2$) time, and MVN represents a multivariate normal distribution. If the actual type I error rate is 5%, the 95% confidence interval is 4.86-5.14% with 100,000 data sets.
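The quoted interval follows from the binomial standard error of an estimated rejection rate. A minimal check, assuming the usual normal approximation with a critical value of 1.96 (the function name is hypothetical):

```python
import math


def monte_carlo_interval(rate, n_datasets, z=1.96):
    """Normal-approximation interval for a rejection rate estimated by simulation."""
    se = math.sqrt(rate * (1.0 - rate) / n_datasets)
    return rate - z * se, rate + z * se


lower, upper = monte_carlo_interval(0.05, 100_000)
print(f"{100 * lower:.2f}% to {100 * upper:.2f}%")
```

Simulated type I error rates outside this range are therefore unlikely to be explained by Monte Carlo error alone.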
We set the parameters of the variance-covariance matrices as $(\sigma^2_{pre}, \sigma_{icov}, \sigma^2_{ipost}) = (25, 15, 59)$ for Group 1 ($i = 1$) and $(25, 23, 30)$ for Group 2 ($i = 2$). For the parameter setting, we refer to the estimates from the blood lead data introduced in Section 2 and to the settings of previous studies. Note that the means and variances of pretreatment measurements are equal between groups. In this parameter setting, the residual variances of Group 1 and Group 2 are about 50 and 10 in the ANCOVA, 54 and 9 in the t-test on change, and 59 and 30 in the t-test on posttreatment measurements. In each method, Group 1 has the larger variance. The slopes in the ANCOVA are 15/25 = 0.6 and 23/25 ≈ 0.9, respectively, and the correlation coefficients between pre- and posttreatment measurements are $15/\sqrt{25 \times 59} \approx 0.39$ and $23/\sqrt{25 \times 30} \approx 0.84$, respectively.

The sample sizes in each group are set to $(n_1, n_2) = (300, 300)$, $(400, 200)$, and $(200, 400)$ as large sample-size situations. In the first case, the sample sizes are equal; in the second case, the group with the larger sample size has the larger variance ($n_1 > n_2$); and in the last case, the group with the larger sample size has the smaller variance ($n_1 < n_2$). Similarly, the sample sizes are set to $(n_1, n_2) = (45, 45)$, $(60, 30)$, and $(30, 60)$ as moderate sample-size situations.

We use the Kenward-Roger approximation (Kenward & Roger, 1997) for the longitudinal models and the Satterthwaite approximation for the degrees of freedom (Satterthwaite, 1946) for the ANCOVAs and t-tests. To obtain the actual type I error rates, we calculate the proportions of data sets in which a significant difference was detected under $\mu_{1post} = \mu_{2post}$. Note that the proportions of data sets for which the true treatment effect is included in the 95% confidence interval (coverage proportions) under $\mu_{1post} \ne \mu_{2post}$ are obtained by subtracting the actual type I error rates (%) from 100%. In the simulation studies and in the analysis of numerical examples in the next section, we use SAS 9.4: PROC MIXED for the longitudinal models and ANCOVAs, and PROC TTEST for the t-tests.
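The derived quantities quoted for this parameter setting (ANCOVA slopes and residual variances, variances of the change scores, and correlations) follow directly from the three covariance parameters of each group. The sketch below recomputes them; the function and dictionary names are hypothetical, and the formulas are the standard ones for a bivariate distribution.

```python
import math

# (pretreatment variance, covariance, posttreatment variance) per group,
# as stated in the parameter setting.
PARAMS = {1: (25.0, 15.0, 59.0), 2: (25.0, 23.0, 30.0)}


def derived_quantities(var_pre, cov, var_post):
    """Quantities implied by one group's 2x2 variance-covariance matrix."""
    return {
        "slope": cov / var_pre,                         # ANCOVA slope
        "residual_var": var_post - cov ** 2 / var_pre,  # ANCOVA residual variance
        "change_var": var_pre - 2.0 * cov + var_post,   # variance of post minus pre
        "correlation": cov / math.sqrt(var_pre * var_post),
    }


for group, values in PARAMS.items():
    rounded = {k: round(v, 2) for k, v in derived_quantities(*values).items()}
    print(group, rounded)
```

For Group 2 the ANCOVA residual variance computes to 8.84, which the text describes as about 10.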
The program codes of the longitudinal models are provided in the Appendix. The program codes of the ANCOVAs are provided elsewhere. Source code to reproduce the results is available as Supporting Information on the journal's web page (http://onlinelibrary.wiley.com/doi/10.1002/bimj.201800389/suppinfo).
The results are shown in Table 4. In large sample sizes, the actual type I error rates of the two longitudinal models with unequal variances (EMVUV and EMUV) were close to the nominal level, even under unequal sample sizes. Those of the ANCOVA with equal slopes and unequal residual variances (ANCOVA_ESUV) and of the t-tests with unequal variances were also close to the nominal level. The actual type I error rates of the longitudinal model with equal variance matrices (EMEV), ANCOVA_ESEV, and the t-tests with equal variances were close to the nominal level only under equal sample sizes. The ANCOVAs with unequal slopes (ANCOVA_USs: ANCOVA_USUV and ANCOVA_USEV) did not keep a nominal type I error rate even under equal sample sizes. The actual type I error rates of the methods with equal variances (EMEV, ANCOVA_ESEV, ANCOVA_USEV, and the t-tests with equal variances) were less than 5%, that is, conservative, when the group with the larger variance had the larger sample size ($n_1 > n_2$), and more than 5%, that is, liberal, when the group with the larger variance had the smaller sample size ($n_1 < n_2$).
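The direction of these deviations can be anticipated with a rough large-sample calculation for the equal-variance t-test: compare the variance that the pooled estimator reports with the true variance of the mean difference, and convert the ratio into an approximate rejection rate. The sketch below is a heuristic under stated assumptions (normal critical value, pooled variance replaced by its expectation) and is not the calculation used in the paper; the function names are hypothetical, and the variances 54 and 9 are the change-score variances of the simulation setting.

```python
import math


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_type1_pooled(n1, n2, var1, var2, z_crit=1.959964):
    """Approximate two-sided type I error of a pooled-variance test.

    Heuristic: the pooled variance is replaced by its expectation and the
    test statistic is treated as normal.
    """
    true_var = var1 / n1 + var2 / n2
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    model_var = pooled * (1.0 / n1 + 1.0 / n2)
    # The nominal cutoff on the model-based scale corresponds to this cutoff
    # on the true standardized scale.
    z_effective = z_crit * math.sqrt(model_var / true_var)
    return 2.0 * (1.0 - normal_cdf(z_effective))


for n1, n2 in [(300, 300), (400, 200), (200, 400)]:
    print(n1, n2, round(approx_type1_pooled(n1, n2, 54.0, 9.0), 4))
```

Under equal sample sizes the two variances coincide and the approximate rate stays at the nominal level; when the larger group has the larger variance the model-based variance is too large (conservative), and in the opposite case it is too small (liberal).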
The RRMSEs of EMVUV and ANCOVA_USs (ANCOVA_USUV and ANCOVA_USEV) were the smallest in large sample sizes irrespective of allocation ratios. However, ANCOVA_USs did not keep a nominal type I error rate. The RRMSEs of EMUV, which has redundant unequal baseline variances, were the second smallest. In equal sample sizes, the RRMSEs of EMEV and ANCOVA_ESEV were also the second smallest. The RRMSEs of the best and second-best models were similar, and the differences were too small to be detected in Table 4. The ANCOVA with equal slopes and unequal residual variances (ANCOVA_ESUV) was less efficient compared to EMVUV, EMUV, and ANCOVA_USs, but the loss of efficiency was small compared to the t-tests on change. The t-test on posttreatment measurements was not efficient.
In moderate sample sizes, the results were similar to those in large sample sizes. Under $(n_1, n_2) = (30, 60)$, the actual type I error rates were 5.11% for EMVUV and EMUV, slightly larger than the nominal level of 5%. Under $(n_1, n_2) = (45, 45)$, the actual type I error rates were slightly larger than the nominal level for the methods with equal variances (EMEV, ANCOVA_ESEV, t-test_ChangeEV, and t-test_PostEV). Under $(n_1, n_2) = (30, 60)$, EMEV and ANCOVA_ESEV showed slightly better efficiency than EMVUV, EMUV, and ANCOVA_USs, but these models with equal variances did not keep a nominal type I error rate.
TABLE 4 Actual type I error rates and RRMSEs in the simulation studies for the data with unequal covariances and unequal variances of posttreatment measurements under random allocation with large and moderate sample sizes

We then examined the influence of unequal sample sizes by removing half of the data from one group: $(n_1, n_2) = (50, 25)$ as the case in which the group with the larger sample size has the larger variance, and $(n_1, n_2) = (25, 50)$ as the case in which the group with the larger sample size has the smaller variance. These data are shown in Figure 1. The SEs of the methods with unequal variances (EMVUV, EMUV, ANCOVA_USUV, ANCOVA_ESUV, and t-test_ChangeUV) under $(n_1, n_2) = (50, 25)$ were smaller than those under $(n_1, n_2) = (25, 50)$. These SEs under reduced sample sizes were larger than those under equal sample sizes, $(n_1, n_2) = (50, 50)$. In contrast, the SEs of the methods with equal variances (EMEV, ANCOVA_USEV, ANCOVA_ESEV, and t-test_ChangeEV) under $(n_1, n_2) = (50, 25)$ were larger than those under $(n_1, n_2) = (25, 50)$ and larger than the SEs of the above methods with unequal variances under $(n_1, n_2) = (50, 25)$. The SEs of the methods with equal variances under $(n_1, n_2) = (25, 50)$ were even smaller than those under the equal sample sizes $(n_1, n_2) = (50, 50)$, although the former data are a reduced sample of the latter. These results were caused by the wrong assumption of equal variances, and these models were conservative under $n_1 > n_2$ and liberal under $n_1 < n_2$. The results of these numerical examples correspond to those of the simulation studies. Because the t-test on posttreatment measurements does not take the baseline measurements into account and the ratio of posttreatment variances was not relatively large in this example, the results of this method differed from those of the other methods.

DISCUSSION
Longitudinal models with equal baseline means and unequal covariances and posttreatment variances (EMVUV and EMUV) asymptotically keep a nominal type I error rate and are efficient in randomized trials. The assumption of equal baseline variances is not important for efficiency. Statistical models asymptotically do not keep nominal type I error rates if the model-based asymptotic variance differs from the asymptotic variance; these models, including ANCOVAs and t-tests, are shown in Table 3. The discrepancies arise for two reasons: the equal variance assumption and the unequal slope assumption. Under equal sample sizes, the discrepancies caused by the equal variance assumption disappear, but the discrepancies caused by the unequal slope assumption remain. The longitudinal model with equal variance matrices (EMEV) asymptotically keeps a nominal type I error rate, and its efficiency is the same as that of EMVUV and EMUV, under equal sample sizes, but it does not keep a nominal type I error rate under unequal sample sizes. The details of the order of the asymptotic efficiency of the treatment effect estimators among ANCOVAs have been provided previously. Based on Tables 2 and 3, EMVUV and EMUV correspond to the ANCOVAs with unequal slopes (ANCOVA_USUV and ANCOVA_USEV) regarding only the point estimate of the treatment effect. ANCOVA_USUV and ANCOVA_USEV have the same high efficiency as EMVUV and EMUV, but asymptotically do not keep a nominal type I error rate. EMEV corresponds to the ANCOVA with equal slopes and equal residual variances (ANCOVA_ESEV) regarding both the point estimate and the model-based variance, so EMEV and ANCOVA_ESEV have the same characteristics. The ANCOVA with equal slopes and unequal residual variances (ANCOVA_ESUV) asymptotically keeps a nominal type I error rate, and its loss of efficiency is small under either equal or unequal sample sizes. When there are missing data, the longitudinal models use all observed data, but the ANCOVAs and t-tests on change use only paired data.
Unequal allocation of patients to treatments is sometimes applied in clinical trials. The following properties are known for the t-test with equal variances, that is, Student's t-test (Algina, 2005). Under unequal sample sizes and unequal variances, the actual type I error rate is not at the nominal level. The actual rate is above the nominal level when the group with the larger sample size has the smaller variance, and below the nominal level when the group with the larger sample size has the larger variance. These properties also apply to the ANCOVAs with equal residual variances and to the longitudinal model with equal variance matrices. Because large discrepancies from the nominal rate can occur, the longitudinal model with equal variance matrices should not be used when the sample sizes, the covariances, and the variances of posttreatment measurements are unequal between groups. A conceptual data-generating model under random allocation with two random subject effects, one each for the true pre- and posttreatment measurements, has also been discussed. The number of variance parameters in this model is too large to estimate from pre- and posttreatment measurement data. Crager (1987) considered a model assuming a common random subject effect for the true pre- and posttreatment measurements. The variance-covariance matrices are equal between the two groups, and the model corresponds to EMEV in this paper.
In this paper, we examined longitudinal models with equal baseline means because of random allocation. Under random allocation, the longitudinal model with unequal baseline means and unequal variance-covariance matrices has redundant parameters. Let $\mu_{ipre}$ be the pretreatment mean of the $i$-th treatment group. The treatment effect is estimated by $(\mu_{2post} - \mu_{2pre}) - (\mu_{1post} - \mu_{1pre})$ based on this model. It corresponds to the t-test on change with unequal variances regarding the point estimate and the model-based variance, and the two are identical if there are no missing data. The asymptotic variance is $(\sigma^2_{1pre} - 2\sigma_{1cov} + \sigma^2_{1post})/n_1 + (\sigma^2_{2pre} - 2\sigma_{2cov} + \sigma^2_{2post})/n_2$. Under random allocation with $\sigma^2_{1pre} = \sigma^2_{2pre}$, the asymptotic variance is given as formula (IV) for arbitrary allocation ratios or formula (VIII) for $n_1 = n_2$ in Table 2. The treatment effect can also be estimated by $\mu_{2post} - \mu_{1post}$ based on this model. It corresponds to the t-test on posttreatment measurements with unequal variances, and the two are identical if there are no missing data. These estimators are less efficient. The assumption of equal baseline means is therefore important for efficiency.
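The efficiency ordering can be illustrated numerically for the parameter setting of the simulation studies. The sketch below compares the variance of the difference in mean change, the variance of the difference in posttreatment means, and the variance of the generalized least squares estimator under equal baseline means. The last expression is a closed form derived for this sketch under the assumption of known covariance matrices (it is not transcribed from Table 2), and all function names are hypothetical.

```python
# (pretreatment variance, covariance, posttreatment variance) for Groups 1 and 2.
GROUPS = ((25.0, 15.0, 59.0), (25.0, 23.0, 30.0))


def var_ttest_change(n1, n2, groups=GROUPS):
    """Variance of the difference in mean change between groups."""
    return sum((vp - 2.0 * c + vq) / n for (vp, c, vq), n in zip(groups, (n1, n2)))


def var_ttest_post(n1, n2, groups=GROUPS):
    """Variance of the difference in posttreatment means between groups."""
    return sum(vq / n for (_, _, vq), n in zip(groups, (n1, n2)))


def var_equal_baseline_gls(n1, n2, groups=GROUPS):
    """GLS variance with a common baseline mean and known covariances.

    Assumes a common pretreatment variance; derived for this sketch.
    """
    (vp, c1, q1), (_, c2, q2) = groups
    res1 = q1 - c1 ** 2 / vp  # residual variance, Group 1
    res2 = q2 - c2 ** 2 / vp  # residual variance, Group 2
    return res1 / n1 + res2 / n2 + (c1 - c2) ** 2 / ((n1 + n2) * vp)


for n1, n2 in [(300, 300), (400, 200), (200, 400)]:
    print(n1, n2,
          round(var_equal_baseline_gls(n1, n2), 4),
          round(var_ttest_change(n1, n2), 4),
          round(var_ttest_post(n1, n2), 4))
```

In each allocation the estimator that exploits the equal baseline means has the smallest variance, followed by the analysis of change and then the analysis of posttreatment measurements alone.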