Power and sample size when multiple endpoints are considered
Version of Record online: 2 AUG 2007
Copyright © 2007 John Wiley & Sons, Ltd.
Special Issue: Special Issue on Multiplicity in Pharmaceutical Statistics
Volume 6, Issue 3, pages 161–170, July/September 2007
How to Cite
Senn, S. and Bretz, F. (2007), Power and sample size when multiple endpoints are considered. Pharmaceut. Statist., 6: 161–170. doi: 10.1002/pst.301
- Issue online: 24 AUG 2007
Keywords: multiple testing; multiple endpoints
A common approach to analysing clinical trials with multiple outcomes is to control the probability, for the trial as a whole, of making at least one incorrect positive finding under any configuration of true and false null hypotheses. Popular approaches are Bonferroni corrections or structured approaches such as closed-test procedures. As is well known, such strategies, which control the family-wise error rate, typically reduce the type I error rate for some or all of the tests of the individual null hypotheses to below the nominal level. In consequence, there is generally a loss of power for individual tests. What is perhaps less well appreciated is that, depending on approach and circumstances, the test-wise loss of power does not necessarily lead to a family-wise loss of power. In fact, it may be possible to increase the overall power of a trial by carrying out tests on multiple outcomes without increasing the probability of making at least one type I error when all null hypotheses are true.
We examine two types of problems to illustrate this. Unstructured testing problems arise typically (but not exclusively) when many outcomes are being measured. We consider the case of more than two hypotheses tested with a Bonferroni approach and, for illustration, assume that compound symmetry holds for the correlation of all variables. Using the device of a latent variable, it is easy to show that power is not reduced as the number of variables tested increases, provided that the common correlation coefficient is not too high (say, less than 0.75). We then consider structured testing problems. Here, multiplicity problems arising from the comparison of more than two treatments, as opposed to more than one measurement, are typical. We conduct a numerical study and conclude again that power is not reduced as the number of tested variables increases.