Model choice and squared prediction errors in PLS regression

Abstract

Squared prediction errors (SPE) in $\mathbf{x}$ are discussed in relation to the choice between the conventional PLSR and bidiagonalization models and algorithms, an issue that concerns residual and prediction consistency, with a focus on process monitoring and fault detection. Our analysis leads to the conclusion that conventional PLSR based on the NIPALS algorithm gives ambiguous SPE values for process faults. The basic reason is that the sample residuals are not found as projections onto the orthogonal complement of the space where the scores and the regression solution are located, and where the statistical $T^2$ limit is also defined. The alternative non-orthogonalized PLSR and bidiagonalization (Bidiag2) algorithms, as well as a simple reformulation of the NIPALS algorithm (RE-PLSR), give unambiguous SPE values, and the last two of these also retain orthogonal score vectors. Although the prediction results from all of these methods are identical in theory, our conclusion is that methods in which the $T^2$ and SPE values for process faults are uncorrelated should be preferred. Tests with added $\mathbf{x}$ errors on real data do not indicate that this conclusion should be altered because of such errors. Copyright © 2011 John Wiley & Sons, Ltd.
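
The distinction between the two kinds of residual can be illustrated with a minimal NumPy sketch. It is not the authors' code, and it rests on several assumptions: a single response (PLS1), synthetic random data, two components, and hypothetical function names (`nipals_pls1`, `spe_nipals`, `spe_orthogonal`). The non-orthogonalized variant is represented here simply by an orthogonal projection onto the span of the weight vectors W, which is where the regression vector lies. The sketch places a simulated fault entirely inside that space and compares the two SPE values.

```python
import numpy as np

def nipals_pls1(X, y, n_components):
    """Conventional PLS1 via NIPALS with deflation of X (illustrative sketch)."""
    Xa = X.copy()
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))  # orthonormal weight vectors
    P = np.zeros((n_vars, n_components))  # loading vectors
    q = np.zeros(n_components)            # inner regression coefficients
    for a in range(n_components):
        w = Xa.T @ y
        w /= np.linalg.norm(w)
        t = Xa @ w                        # score vector for this component
        tt = t @ t
        p_a = Xa.T @ t / tt
        q[a] = y @ t / tt
        Xa = Xa - np.outer(t, p_a)        # deflate X
        W[:, a], P[:, a] = w, p_a
    return W, P, q

def spe_nipals(x, W, P):
    """SPE from the conventional residual x - P t, with t = (W'P)^{-1} W'x."""
    t = np.linalg.solve(W.T @ P, W.T @ x)
    e = x - P @ t
    return e @ e

def spe_orthogonal(x, W):
    """SPE from the orthogonal projection of x onto the complement of span(W)."""
    e = x - W @ (W.T @ x)
    return e @ e

# Synthetic, centered data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))
X -= X.mean(axis=0)
y = X @ np.array([1.0, -0.5, 0.3, 0.0, 0.2]) + 0.1 * rng.standard_normal(30)
y -= y.mean()

W, P, q = nipals_pls1(X, y, n_components=2)

# Both formulations give the same regression vector, hence the same predictions.
b_nipals = W @ np.linalg.solve(P.T @ W, q)
b_orth = W @ np.linalg.lstsq(X @ W, y, rcond=None)[0]

# A hypothetical fault lying entirely inside span(W).
x_fault = W @ np.array([3.0, -2.0])
spe_n = spe_nipals(x_fault, W, P)
spe_o = spe_orthogonal(x_fault, W)

print("same regression vector:", np.allclose(b_nipals, b_orth))
print("SPE, conventional residual:", spe_n)
print("SPE, orthogonal projection:", spe_o)
```

Under these assumptions, the fault changes the scores only, so the orthogonal-projection SPE stays at numerical zero, whereas the conventional residual reports a nonzero SPE for the same sample. This is one way to see why the same fault can register in both statistics under conventional PLSR.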
