A comparison of methods for analysing regression models with both spectral and designed variables
Abstract
In many situations, designed experiments are performed to find the relationship between a set of explanatory variables and one or more responses. Often, factors other than those included in the design also influence the results. Information about these so-called nuisance factors can sometimes be obtained by measuring them with spectroscopic methods. The question is then how to analyse this kind of data, i.e. a combination of an orthogonal design matrix and a spectroscopic matrix with hundreds of highly collinear variables. In this paper we introduce a method that iteratively combines partial least squares (PLS) and ordinary least squares (OLS), and we compare its performance with that of other methods: direct PLS, OLS, and a combination of principal component analysis and least squares. The methods are compared using two real data sets and simulated data. The results show that incorporating external information from spectroscopic measurements extracts more information from the experiment and lowers the variance of the parameter estimates. We also find that the introduced algorithm cleanly separates the information from the spectral and design matrices. It also has some advantages over PLS: it shows lower bias and is less influenced by the relative weighting of the design and spectroscopic variables. Copyright © 2005 John Wiley & Sons, Ltd.
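The abstract does not spell out the steps of the iterative PLS/OLS combination, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a single response, fits the design variables by OLS, extracts PLS scores from the spectra using the current OLS residuals, refits by OLS on the design variables plus those scores, and repeats. The function names `pls1_scores` and `ls_pls_sketch`, the NIPALS-style PLS1 step, the fixed iteration count and the simulated data (a replicated two-factor design with one latent nuisance factor `u` seen only through the spectra) are all assumptions made for illustration.

```python
import numpy as np

def pls1_scores(Z, r, n_comp):
    """NIPALS-style PLS1 scores of column-centred Z for a single response r."""
    Z = Z - Z.mean(axis=0)              # new array; the caller's Z is untouched
    r = r - r.mean()
    T = np.empty((Z.shape[0], n_comp))
    for a in range(n_comp):
        w = Z.T @ r                     # weights: covariance of each variable with r
        w /= np.linalg.norm(w)
        t = Z @ w                       # score vector
        p = Z.T @ t / (t @ t)           # loading vector
        Z = Z - np.outer(t, p)          # deflate the spectra
        r = r - t * (r @ t) / (t @ t)   # deflate the response
        T[:, a] = t
    return T

def ls_pls_sketch(X, Z, y, n_comp=1, n_iter=5):
    """Hypothetical iterative OLS/PLS combination (requires n_iter >= 1)."""
    n = len(y)
    D = np.column_stack([np.ones(n), X])            # design matrix with intercept
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)    # initial OLS on the design only
    for _ in range(n_iter):
        resid = y - D @ beta                        # part of y not explained by the design
        T = pls1_scores(Z, resid, n_comp)           # spectral scores related to that part
        coef, *_ = np.linalg.lstsq(np.column_stack([D, T]), y, rcond=None)
        beta, gamma = coef[:D.shape[1]], coef[D.shape[1]:]
    return beta, gamma, T

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, n_wl = 40, 50
base = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
X = np.tile(base, (n // 4, 1))                      # replicated 2^2 factorial design
u = rng.normal(size=n)                              # latent nuisance factor
profile = np.exp(-0.5 * ((np.arange(n_wl) - 25) / 5.0) ** 2)
Z = np.outer(u, profile) + 0.01 * rng.normal(size=(n, n_wl))   # collinear "spectra"
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + 3.0 * u + 0.1 * rng.normal(size=n)

D = np.column_stack([np.ones(n), X])
beta_ols, *_ = np.linalg.lstsq(D, y, rcond=None)
rss_ols = np.sum((y - D @ beta_ols) ** 2)           # design-only residual sum of squares

beta, gamma, T = ls_pls_sketch(X, Z, y, n_comp=1)
rss_ls_pls = np.sum((y - D @ beta - T @ gamma) ** 2)
print("RSS, OLS on design only:", rss_ols)
print("RSS, combined sketch:   ", rss_ls_pls)
```

In this sketch the design effects keep their own OLS coefficients (`beta`), while the spectral information enters only through a small number of scores (`gamma`, `T`), which is one way the two sources of information could be kept apart. The number of components and iterations would in practice have to be chosen by validation.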