Volume 18, Issue 10
Research Article

A comparison of methods for analysing regression models with both spectral and designed variables

Kjetil Jørgensen

Corresponding Author

E-mail address: kjetil.jorgensen@tine.no

TINE BA, Center for R&D, PO Box 50, N‐4358 Kleppe, Norway

Department of Chemistry, Biotechnology and Food Science, Section for Bioinformatics and Analytical Methods, Norwegian University of Life Sciences, N‐1432 Ås, Norway

Vegard Segtnan

MATFORSK, Osloveien 1, N‐1430 Ås, Norway

Kari Thyholt

Mills DA, PO Box 4644 Sofienberg, N‐0506 Oslo, Norway

Tormod Næs

MATFORSK, Osloveien 1, N‐1430 Ås, Norway

Department of Mathematics, University of Oslo, Blindern, Oslo, Norway

First published: 31 March 2005
Citations: 37

Abstract

In many situations one performs designed experiments to find the relationship between a set of explanatory variables and one or more responses. Often there are other factors that influence the results in addition to the factors that are included in the design. To obtain information about these so‐called nuisance factors, one can sometimes measure them using spectroscopic methods. The question then is how to analyse this kind of data, i.e. a combination of an orthogonal design matrix and a spectroscopic matrix with hundreds of highly collinear variables. In this paper we introduce a method that is an iterative combination of partial least squares (PLS) and ordinary least squares (OLS) and compare its performance with other methods such as direct PLS, OLS and a combination of principal component analysis and least squares. The methods are compared using two real data sets and simulated data. The results show that incorporating external information from spectroscopic measurements extracts more information from the experiment and lowers the variance of the parameter estimates. We also find that the introduced algorithm cleanly separates the information from the spectral and design matrices. Compared with PLS, it also shows lower bias and is less influenced by the relative weighting of the design and spectroscopic variables. Copyright © 2005 John Wiley & Sons, Ltd.
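
The abstract does not spell out the steps of the iterative PLS/OLS combination, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a single response y, fits the design matrix by OLS, extracts PLS scores from the spectra using the current residuals, refits y on the design columns plus those scores, and repeats. The names pls1_scores and ls_pls, the number of components, the stopping rule and the simulated data are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def pls1_scores(Z, r, n_comp):
    """Return PLS1 (NIPALS) scores of spectra Z for a single response r."""
    Z = Z - Z.mean(axis=0)          # centre the spectra (copy, input untouched)
    r = r - r.mean()                # centre the response
    scores = []
    for _ in range(n_comp):
        w = Z.T @ r                 # weight vector: covariance with response
        w /= np.linalg.norm(w)
        t = Z @ w                   # score vector
        p = Z.T @ t / (t @ t)       # spectral loading
        Z = Z - np.outer(t, p)      # deflate the spectra
        r = r - t * (t @ r) / (t @ t)  # deflate the response
        scores.append(t)
    return np.column_stack(scores)


def ls_pls(X, Z, y, n_comp=1, n_iter=20, tol=1e-10):
    """Assumed iterative scheme: OLS on the design X, PLS on the spectra Z."""
    D = np.column_stack([np.ones(len(y)), X])      # intercept + design columns
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)   # step 1: plain OLS
    for _ in range(n_iter):
        resid = y - D @ beta                       # part not explained by design
        T = pls1_scores(Z, resid, n_comp)          # step 2: PLS scores from Z
        A = np.column_stack([D, T])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # step 3: joint OLS
        new_beta = coef[:D.shape[1]]
        done = np.linalg.norm(new_beta - beta) < tol
        beta = new_beta
        if done:
            break
    return beta, coef[D.shape[1]:], T


# Simulated illustration (hypothetical data, not taken from the paper):
# a replicated 2^3 factorial design plus one unmeasured nuisance factor h
# that is visible only through the spectra.
rng = np.random.default_rng(0)
X = np.tile(np.array(list(itertools.product([-1.0, 1.0], repeat=3))), (2, 1))
h = rng.normal(size=X.shape[0])
peak = np.exp(-((np.arange(50) - 25) / 5.0) ** 2)   # one spectral band
Z = np.outer(h, peak) + 0.01 * rng.normal(size=(X.shape[0], 50))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + 3.0 * h + 0.05 * rng.normal(size=X.shape[0])

D = np.column_stack([np.ones(len(y)), X])
beta_ols, *_ = np.linalg.lstsq(D, y, rcond=None)
beta, gamma, T = ls_pls(X, Z, y, n_comp=1)

rss_ols = float(np.sum((y - D @ beta_ols) ** 2))
rss_joint = float(np.sum((y - D @ beta - T @ gamma) ** 2))
print("OLS design effects:   ", beta_ols[1:])
print("LS-PLS design effects:", beta[1:])
print("RSS (OLS, joint):     ", rss_ols, rss_joint)
```

In this sketch the design effects are read from beta and the spectral contribution from gamma and T, which is one way to keep the two sources of information apart. Because the PLS scores are centred, the intercept absorbs the mean of the nuisance factor.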

