
Sample size determination for comparing several survival curves with unequal allocations


  • Susan Halabi,

    Corresponding author
    1. Department of Biostatistics and Bioinformatics, Duke University Medical Center, Box 3958, Durham, NC 27710, U.S.A.
  • Bahadur Singh

    1. Lineberger Comprehensive Cancer Center-UNC and Cancer Center Biostatistics, Duke University, Durham, NC 27710, U.S.A.


Ahnn and Anderson derived sample size formulae for unstratified and stratified designs assuming equal allocation of subjects to three or more treatment groups. We generalize the sample size formulae to allow for unequal allocation. In addition, we define the overall probability of death to be equal to one minus the censored proportion for the stratified design. This definition also leads to a slightly different definition of the non-centrality parameter than that of Ahnn and Anderson for the stratified case. Assuming proportional hazards, sample sizes are determined for a prespecified power, significance level, hazard ratios, allocation of subjects to several treatment groups, and known censored proportion. In the proportional hazards setting, three cases are considered: (1) exponential failures–exponential censoring, (2) exponential failures–uniform censoring, and (3) Weibull failures (assuming the same shape parameter for all groups)–uniform censoring. In all three cases of the unstratified design, the censoring distribution is assumed to be the same for all treatment groups. For the stratified log-rank test, the censoring distribution is assumed to be the same across the treatment groups and the strata. Further, formulae are developed to provide approximate powers for the test, based upon the first two or the first four moments of the asymptotic distribution.
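As a rough illustration of how the probability of death (one minus the censored proportion) can be obtained under the three failure and censoring models listed above, the following is a minimal sketch, not the authors' implementation. It assumes a rate parameterization (exponential survival exp(-hazard * t), Weibull survival exp(-(rate * t) ** shape)) and uniform censoring on [0, max_follow_up]; all function and parameter names are hypothetical. Averaging the group-level probabilities by the allocation proportions is an assumption made here via the law of total probability, and the paper's exact definition may differ.

```python
import math


def death_prob_exp_exp(hazard, censor_rate):
    """P(failure observed before censoring): exponential failure, exponential censoring."""
    return hazard / (hazard + censor_rate)


def death_prob_exp_uniform(hazard, max_follow_up):
    """Exponential failure with censoring uniform on [0, max_follow_up]."""
    x = hazard * max_follow_up
    return 1.0 - (1.0 - math.exp(-x)) / x


def death_prob_weibull_uniform(rate, shape, max_follow_up, steps=10_000):
    """Weibull failure, S(t) = exp(-(rate * t) ** shape), uniform censoring.

    The integral of S(t) over [0, max_follow_up] is approximated
    with the midpoint rule (assumed adequate for illustration).
    """
    h = max_follow_up / steps
    integral = h * sum(
        math.exp(-((rate * (i + 0.5) * h) ** shape)) for i in range(steps)
    )
    return 1.0 - integral / max_follow_up


def overall_death_prob(group_probs, allocation):
    """Allocation-weighted average of group-level death probabilities.

    `allocation` holds the proportion of subjects in each group
    (possibly unequal) and must sum to 1.
    """
    if abs(sum(allocation) - 1.0) > 1e-9:
        raise ValueError("allocation proportions must sum to 1")
    return sum(w * p for w, p in zip(allocation, group_probs))


if __name__ == "__main__":
    # Hypothetical three-group design with allocation 50:25:25 and
    # hazards 0.5, 0.75, 1.0 under uniform censoring on [0, 3].
    hazards = [0.5, 0.75, 1.0]
    probs = [death_prob_exp_uniform(lam, 3.0) for lam in hazards]
    print(overall_death_prob(probs, [0.5, 0.25, 0.25]))
```

With shape equal to 1 the Weibull case reduces to the exponential case, which gives a quick consistency check on the numerical integration.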

The simulations yield two major findings. First, the simulated power of the log-rank test does not depend on the censoring mechanism. Second, for a significance level of 0.05 and power of 0.80, the required sample size n is independent of the censoring pattern. Moreover, there is very close agreement between the exact (asymptotic) and simulated powers when a sequence of alternatives is close to the null hypothesis. Two-moment and four-moment power series approximations also yield powers in close agreement with the exact (asymptotic) power. With unequal allocations, our simulations show that the empirical powers are consistently above the prespecified target power of 0.80 when 50 per cent of the patients are allocated to the treatment group with the smallest hazard. Copyright © 2004 John Wiley & Sons, Ltd.
