Research Article
A systematic evaluation of the benefits and hazards of variable selection in latent variable regression. Part I. Search algorithm, theory and simulations
Article first published online: 14 MAY 2002
DOI: 10.1002/cem.730
Copyright © 2002 John Wiley & Sons, Ltd.
Additional Information
How to Cite
Baumann, K., Albert, H. and von Korff, M. (2002), A systematic evaluation of the benefits and hazards of variable selection in latent variable regression. Part I. Search algorithm, theory and simulations. Journal of Chemometrics, 16: 339–350. doi: 10.1002/cem.730
Publication History
- Issue published online: 14 MAY 2002
- Article first published online: 14 MAY 2002
- Manuscript Accepted: 11 MAR 2002
- Manuscript Revised: 8 OCT 2001
- Manuscript Received: 2 JAN 2001
Keywords:
- cross-validation;
- variable selection;
- PLS;
- PCR;
- tabu search
Abstract
Variable selection is an extensively studied problem in chemometrics and in the area of quantitative structure–activity relationships (QSARs). Many search algorithms have been compared so far. Less well studied is the influence of different objective functions on the prediction quality of the selected models. This paper investigates the performance of different cross-validation techniques as objective functions for variable selection in latent variable regression. The results are compared in terms of predictive ability, model size (number of variables) and model complexity (number of latent variables). It is shown that leave-multiple-out cross-validation with a large percentage of data left out performs best. Since leave-multiple-out cross-validation is computationally expensive, a very efficient tabu search algorithm is introduced to lower the computational burden. The tabu search algorithm needs no user-defined operational parameters and optimizes the variable subset and the number of latent variables simultaneously.
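
As an illustration of the idea of using leave-multiple-out cross-validation as the objective function for a candidate variable subset, the following is a minimal sketch, not the authors' implementation. It assumes principal component regression (PCR) as the latent variable method, random splits with a fixed fraction of objects left out, and mean squared prediction error as the score. The function names (`pcr_fit_predict`, `lmo_cv_mse`) and the defaults `leave_out=0.5` and `n_splits=50` are hypothetical choices made for this example; the abstract does not specify them. The tabu search itself is not shown.

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, X_test, n_components):
    """Principal component regression: fit on the training set, predict the test set."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # SVD of the centred training block; rows of Vt are the loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    T = Xc @ V  # training scores on the retained latent variables
    coef, *_ = np.linalg.lstsq(T, y_train - y_mean, rcond=None)
    return (X_test - x_mean) @ V @ coef + y_mean

def lmo_cv_mse(X, y, subset, n_components, leave_out=0.5, n_splits=50, seed=0):
    """Leave-multiple-out CV mean squared prediction error for one variable subset.

    Assumption: splits are drawn at random (Monte Carlo style) and the same
    seed is reused for every subset, so competing subsets see identical splits.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n_test = max(1, int(round(leave_out * n)))
    Xs = X[:, subset]
    sq_err = 0.0
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        pred = pcr_fit_predict(Xs[train], y[train], Xs[test], n_components)
        sq_err += np.sum((y[test] - pred) ** 2)
    return sq_err / (n_splits * n_test)
```

A search procedure would call `lmo_cv_mse` for each candidate pair of variable subset and number of latent variables and keep the pair with the lowest error. Reusing one seed across candidates is a design choice of this sketch: it makes the comparison between subsets paired rather than dependent on different random splits.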
