Efficient approximate k-fold and leave-one-out cross-validation for ridge regression


  • Rosa J. Meijer (corresponding author), Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Postzone S5-P, 2300 RC Leiden, The Netherlands
  • Jelle J. Goeman (corresponding author), Phone: +31-71-5269722, Fax: +31-71-5268280


In model building and model evaluation, cross-validation is a frequently used resampling method. Unfortunately, this method can be quite time-consuming. In this article, we discuss an approximation method that is much faster and can be used in generalized linear models and Cox's proportional hazards model with a ridge penalty term. Our approximation method is based on a Taylor expansion around the estimate of the full model; in this way, all cross-validated estimates are approximated without refitting the model. The tuning parameter can then be chosen based on these approximations, so that it can be optimized in less time. The method is most accurate when approximating leave-one-out cross-validation results for large data sets, which is otherwise the most computationally demanding situation. To demonstrate the method's performance, we apply it to several microarray data sets. An R package, penalized, which implements the method, is available on CRAN.
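To give a flavor of why cross-validated estimates can be obtained without refitting, the sketch below shows the classical special case: for *linear* ridge regression, the leave-one-out residuals follow exactly from the full fit and the leverages of the ridge hat matrix, via e_i / (1 − h_ii). This is a well-known shortcut, not the paper's Taylor-expansion method (which extends the idea approximately to GLMs and the Cox model); all data and the tuning parameter value are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + rng.standard_normal(n)

lam = 1.0  # ridge tuning parameter (arbitrary illustrative value)

# Full-data ridge fit: beta = (X'X + lam*I)^{-1} X'y
A = X.T @ X + lam * np.eye(p)
beta = np.linalg.solve(A, X.T @ y)
fitted = X @ beta

# Leverages h_ii of the ridge hat matrix H = X (X'X + lam*I)^{-1} X'
h = np.diag(X @ np.linalg.solve(A, X.T))

# Shortcut: leave-one-out residuals from the full fit, no refitting
loo_resid_fast = (y - fitted) / (1 - h)

# Brute force for comparison: refit the model n times
loo_resid_slow = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    Ai = X[mask].T @ X[mask] + lam * np.eye(p)
    bi = np.linalg.solve(Ai, X[mask].T @ y[mask])
    loo_resid_slow[i] = y[i] - X[i] @ bi

print(np.allclose(loo_resid_fast, loo_resid_slow))  # agree up to rounding
```

The shortcut replaces n refits with a single fit plus the diagonal of the hat matrix, which is what makes leave-one-out cross-validation cheap in the linear case; outside the linear case no exact identity exists, which is where an approximation such as the paper's Taylor expansion becomes useful.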