Keywords:

  • distribution;
  • over-fitting;
  • partial least squares (PLS);
  • simulation;
  • validation

Abstract

This paper presents a modified version of the NIPALS algorithm for PLS regression with a single response variable. This version, denoted CF-PLS, offers significant advantages over standard PLS. First, it strongly reduces over-fitting of the regression. Second, under the null hypothesis, R² follows a Beta distribution that depends only on the number of observations, which allows a probabilistic framework to be used to test the validity of a component. Third, models generated with CF-PLS have comparable, if not better, predictive ability than models fitted with NIPALS. Finally, the scores and loadings of CF-PLS are directly related to R², which makes the model and its interpretation more reliable. Copyright © 2011 John Wiley & Sons, Ltd.
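
The idea of a null distribution for R² that depends only on the number of observations can be illustrated with a minimal simulation sketch. The abstract does not give the Beta parameters for CF-PLS, so the code below does not implement CF-PLS. As an assumption for illustration only, it uses the classical result for a single predictor that is independent of a normally distributed response, where R² follows Beta(1/2, (n - 2)/2) with mean 1/(n - 1). The function names (`r_squared`, `simulate_null_r2`, `empirical_p_value`) are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def simulate_null_r2(n_obs, n_sim=20000, seed=0):
    """Draw R² values for independent Gaussian x and y (the null case)."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_sim):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        values.append(r_squared(x, y))
    return values


def empirical_p_value(observed_r2, null_values):
    """Fraction of simulated null R² values at least as large as observed."""
    hits = sum(1 for v in null_values if v >= observed_r2)
    return hits / len(null_values)


n_obs = 20
null_r2 = simulate_null_r2(n_obs)

# Assumed classical reference (not the CF-PLS result): Beta(a, b) has mean
# a / (a + b); with a = 1/2 and b = (n - 2)/2 this equals 1 / (n - 1).
a, b = 0.5, (n_obs - 2) / 2
theoretical_mean = a / (a + b)
empirical_mean = sum(null_r2) / len(null_r2)

print("theoretical mean of R² under the null:", theoretical_mean)
print("simulated mean of R² under the null:  ", empirical_mean)
print("empirical p-value for R² = 0.30:      ", empirical_p_value(0.30, null_r2))
```

In this sketch, the simulated mean depends only on `n_obs`, which is the property that makes a probabilistic test of a component possible: an observed R² can be compared against the null distribution for the same number of observations.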