Sample size planning should reflect the primary objective of a trial. If the primary objective is prediction, the sample size determination should focus on prediction accuracy instead of power. We present formulas for the determination of training set sample size for survival prediction. Sample size is chosen to control the difference between optimal and expected prediction error. Prediction is carried out by Cox proportional hazards models. The general approach considers censoring as well as low-dimensional and high-dimensional explanatory variables. For dimension reduction in the high-dimensional setting, a variable selection step is inserted. If not all informative variables are included in the final model, the effect estimates are biased towards zero. The bias affects the prediction error, and its magnitude is influenced by the sample size. For variable selection, we consider two approaches: the least absolute shrinkage and selection operator (LASSO) and univariable selection. For univariable selection, we can calculate input parameters for the sample size formula. For the LASSO, supportive simulations are necessary to choose the input parameters appropriately. We investigate the performance of the proposed formulas in simulations. The simulation results support the validity of the sample size formulas. An application to a real data example illustrates the practical implementation of the method. Copyright © 2012 John Wiley & Sons, Ltd.