Original Research Article
A PLS kernel algorithm for data sets with many variables and fewer objects. Part 1: Theory and algorithm
Version of Record online: 30 MAR 2005
Copyright © 1994 John Wiley & Sons, Ltd.
Journal of Chemometrics
Volume 8, Issue 2, pages 111–125, March/April 1994
How to Cite
Rännar, S., Lindgren, F., Geladi, P. and Wold, S. (1994), A PLS kernel algorithm for data sets with many variables and fewer objects. Part 1: Theory and algorithm. J. Chemometrics, 8: 111–125. doi: 10.1002/cem.1180080204
- Issue online: 30 MAR 2005
- Manuscript Accepted: 5 AUG 1993
- Manuscript Received: 22 APR 1993
Keywords:
- PLS regression algorithm
- Many-variable data sets
A fast PLS regression algorithm is presented for large data matrices with many variables (K) and fewer objects (N). For such data matrices the classical algorithm is computer-intensive and memory-demanding. Recently, Lindgren et al. (J. Chemometrics, 7, 45–49 (1993)) developed a quick and efficient kernel algorithm for the case with many objects and few variables. The present paper focuses on the opposite case, i.e. many variables and fewer objects. A kernel algorithm is presented based on the eigenvectors of the ‘kernel’ matrix XXTYYT, which is a square, non-symmetric matrix of size N × N, where N is the number of objects. Using the kernel matrix and the association matrices XXT (N × N) and YYT (N × N), it is possible to calculate all score and loading vectors and hence conduct a complete PLS regression, including diagnostics such as R2. This is done without returning to the original data matrices X and Y. The algorithm is presented in equation form, with proofs of some new properties, and as MATLAB code.
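The core construction in the abstract — extracting score vectors from the N × N kernel matrix XXTYYT rather than from the much larger X and Y — can be sketched as follows. This is a minimal illustration in Python/NumPy, not the authors' published MATLAB code; the deflation scheme (projecting the association matrices against each extracted score) and the normalization choices are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M = 10, 50, 3                       # few objects (N), many variables (K)
X = rng.standard_normal((N, K))
Y = rng.standard_normal((N, M))

XXt = X @ X.T                             # association matrix, N x N
YYt = Y @ Y.T                             # association matrix, N x N

def kernel_pls_scores(XXt, YYt, n_comp):
    """Extract PLS score vectors from the N x N kernel matrix XX'YY',
    never touching the original K-wide matrix X."""
    N = XXt.shape[0]
    T, U = [], []
    XXt, YYt = XXt.copy(), YYt.copy()
    for _ in range(n_comp):
        kernel = XXt @ YYt                # N x N, square but non-symmetric
        vals, vecs = np.linalg.eig(kernel)
        # Leading eigenvalue of a product of PSD matrices is real and >= 0
        t = np.real(vecs[:, np.argmax(np.real(vals))])
        t /= np.linalg.norm(t)            # X-score for this component
        u = YYt @ t
        u /= np.linalg.norm(u)            # Y-score for this component
        # Deflate the association matrices so later components come out
        # orthogonal to t (an assumed, simple deflation scheme)
        G = np.eye(N) - np.outer(t, t)
        XXt = G @ XXt @ G
        YYt = G @ YYt @ G
        T.append(t)
        U.append(u)
    return np.column_stack(T), np.column_stack(U)

T, U = kernel_pls_scores(XXt, YYt, 2)
```

Because every matrix handled inside the loop is N × N, the cost per component depends on the (small) number of objects, not on the number of variables K — which is the point of the algorithm for K >> N.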