An evaluation of penalised survival methods for developing prognostic models with rare events



Prognostic models for survival outcomes are often developed by fitting standard survival regression models, such as the Cox proportional hazards model, to representative datasets. However, these models can be unreliable if the datasets contain few events, which may be the case if either the disease or the event of interest is rare. Specific problems include predictions that are too extreme, and poor discrimination between low-risk and high-risk patients. The objective of this paper is to evaluate three existing penalised methods that have been proposed to improve predictive accuracy. In particular, ridge, lasso and the garotte, which use penalised maximum likelihood to shrink coefficient estimates and in some cases omit predictors entirely, are assessed using simulated data derived from two clinical datasets. The predictions obtained using these methods are compared with those from Cox models fitted using standard maximum likelihood. The simulation results suggest that Cox models fitted using maximum likelihood can perform poorly when there are few events, and that significant improvements are possible by taking a penalised modelling approach. The ridge method generally performed best, although lasso is recommended if variable selection is required. Copyright © 2011 John Wiley & Sons, Ltd.
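
For readers unfamiliar with the three penalised methods named in the abstract, the following is a minimal sketch of how they are commonly formulated for the Cox model. The notation is an assumption introduced here for illustration and is not taken from the paper: \ell(\beta) denotes the Cox partial log-likelihood, \lambda \ge 0 a tuning parameter (typically chosen by cross-validation), p the number of predictors, and \hat\beta^{ML} the standard maximum likelihood estimate. The paper's exact parameterisation may differ.

```latex
% Assumed notation (not from the source):
%   \ell(\beta)        Cox partial log-likelihood
%   \lambda \ge 0      tuning parameter controlling the amount of shrinkage
%   \hat\beta^{ML}     unpenalised maximum (partial) likelihood estimate
\begin{align*}
\hat\beta^{\text{ridge}} &= \arg\max_{\beta}\Big\{ \ell(\beta) - \lambda \sum_{j=1}^{p} \beta_j^{2} \Big\}, \\
\hat\beta^{\text{lasso}} &= \arg\max_{\beta}\Big\{ \ell(\beta) - \lambda \sum_{j=1}^{p} |\beta_j| \Big\}, \\
\hat\beta^{\text{garotte}}_j &= \hat c_j\,\hat\beta^{\text{ML}}_j, \qquad
\hat c = \arg\max_{c_1,\dots,c_p \ge 0}\Big\{ \ell\big(c_1\hat\beta^{\text{ML}}_1,\dots,c_p\hat\beta^{\text{ML}}_p\big) - \lambda \sum_{j=1}^{p} c_j \Big\}.
\end{align*}
```

Under this formulation, setting \lambda = 0 recovers the standard maximum likelihood fit in all three cases. The ridge penalty shrinks every coefficient towards zero but retains all predictors, whereas the lasso and garotte penalties can set some coefficients exactly to zero, which is why they can omit predictors entirely.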