Evaluation of temperature compensation methods for a near‐infrared calibration to predict the viscosity of micellar liquids

Near‐infrared (NIR) spectroscopy is a popular technique for the measurement of chemical and physical properties in‐line using predictive models. The success of these models in industrial settings, in terms of accuracy and precision, often relies on the removal or avoidance of non‐linear spectral changes associated with fluctuating process parameters like temperature. In this work, a NIR calibration model developed to predict the viscosity of micellar liquids in‐line is used to evaluate various methods designed to account for temperature fluctuations. The viscosity of these liquids can vary on average by ±0.5 Pa s with a 1 °C change in temperature. The methods trialled include global linear techniques, a multivariate filter (generalised least squares weighting [GLSW]) and direct standardisation. The performances of these techniques were compared against one another based on root mean square error of prediction (RMSEP), prediction bias and rank. The best method was found to be GLSW, which was the least complex (five latent variables) and showed the lowest RMSEP (0.429 Pa s). This study provides insight into the use of recognised methods to remove temperature‐induced spectral variation in a PLS model developed to predict viscosity, where both NIR spectra and the property of viscosity itself are sensitive to temperature.

for them to perform at their best and as such calibration data needs to be free of fluctuations from external parameters like temperature and pressure.
Increases in temperature can cause molecules to leave the ground-state energy level and move up to higher energy levels, which affects infrared spectra. It is well known that changes in temperature cause spectral shifts and changes in absorption intensity for liquid samples, 10,11 resulting in non-linearity in the spectra. This has been particularly well documented for aqueous systems, where peaks representative of hydrogen bonds show clear spectral changes with increasing temperature because hydrogen bonds break and more free O-H groups absorb. 10,12 Little work has been conducted on the other regions of the NIR spectrum, particularly those related to C-H vibrations.
From the above, it is clear that temperature fluctuations are a significant problem in NIR spectroscopy. The aim of this study is to explore different ways to correct for temperature fluctuations in NIR spectral data used to predict the viscosity of micellar liquids. The viscosity of micellar liquids is an important quality control parameter and as such needs to be monitored and controlled. Because of the complex microstructures of micellar liquids, measuring their viscosity online or inline is difficult, and this property is therefore typically measured offline, adding significant delays to the overall process. Inline NIR spectroscopy has shown potential for predicting the viscosity of micellar liquids in situ using PLS regression modelling. 13,14 Now that proof of principle for this application has been established, the challenge lies in understanding how temperature affects the spectra, and thus the model, and how these effects can be reduced or removed. Although the viscosity of these systems is very sensitive to temperature (typically ±0.5 Pa s per 1 °C), it is unknown whether these effects are as significant in the NIR spectra, especially as the fluctuations in temperature (28 °C to 32 °C) are small compared with those in other temperature compensation studies. 15-21 Furthermore, temperature compensation studies in PLS models have mainly been reported in relation to measuring chemical properties. Fu et al 15 successfully applied generalised least squares weighting (GLSW) to correct for temperature effects when measuring alcohol content in aqueous solutions, which proved useful for control purposes in the wine industry. The popular global implicit (GI) models have been used by Thamasopinkul et al, 16 Yao et al, 22 Zhang et al 23 and Ngowsuwan et al 24 to measure moisture and sugar content of honey, solids content in watermelon juice, fat and protein content in complex food systems and sweetness of oranges via Brix value, respectively.
The latter also demonstrated direct standardisation (DS) as a potential temperature compensation method, showing it to be comparable with the GI method. A study by Swierenga et al 12 investigated the GI and robust variable selection methods to reduce temperature-induced variations in spectra used to measure the density of heavy oils. Spectra were acquired at 95 °C, 100 °C and 105 °C, and both methods showed significant improvements in model performance compared with a local temperature model built at 100 °C. To the best of our knowledge, there have been no other studies on the use of temperature correction techniques in PLS models used to measure physical properties. Therefore, this work investigates the effect of temperature on the spectra of micellar liquids of varying viscosity, where temperature is known to affect both the NIR spectra and the viscosity of the liquid itself. The aim is then to examine several different methods to correct for this temperature-induced spectral variation in order to improve the accuracy and precision of a PLS model developed to measure their viscosity. The methods trialled include DS, GLSW, GI modelling and inclusion of temperature as a dependent or independent variable. The paper begins with a description of the temperature compensation techniques employed in this study, followed by the experimental details, results and conclusions.

| GI: Global modelling-implicit
The popular GI model involves incorporating in the calibration sample data collected at numerous temperatures, covering the expected range for the application. 12,16,25-27 This typically results in a more complex model because more latent variables are needed to describe the additional variance. A major drawback of this method is the need to collect many more spectra at various temperatures to ensure that the model space covers the range of expected temperatures.

| GX: Global modelling-temperature as an independent variable
The GX model adds the measured temperature of the sample, at the time spectra were collected, to the spectral data matrix. 17 Therefore, independent variables used to develop the calibration are made up of the spectral absorbances of interest and the temperature at which they were collected. Temperature is an important process parameter that would usually be monitored online and so can be used in the development and implementation of such models. However, in the case that temperature is not being measured during the process or at the same point as the spectra are measured, this method would not be applicable. By including temperature as an additional independent variable, it is thought that the model might perform better by recognising temperature effects more readily and offsetting them in some way. With the implementation of GX, the X block is autoscaled, instead of mean centred, to account for amplitude differences between the spectral and temperature data.
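As a rough illustration of the GX data arrangement, the following Python sketch appends temperature as an extra column and autoscales the augmented X block. NumPy is assumed, and the function name `build_gx_block` and the synthetic numbers are hypothetical; they are not taken from this study or from PLS Toolbox.

```python
import numpy as np

def build_gx_block(spectra, temperature):
    """Append temperature as one extra X column, then autoscale every column.

    Autoscaling (mean centring plus division by the column standard deviation)
    puts absorbances and temperature on a comparable scale.
    """
    X = np.column_stack([spectra, temperature])
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    return (X - mean) / std, mean, std

# Synthetic stand-in data: 10 spectra with 5 wavenumber channels each
rng = np.random.default_rng(0)
spectra = 0.5 + 0.01 * rng.normal(size=(10, 5))
temperature = np.repeat([28.0, 29.0, 30.0, 31.0, 32.0], 2)

X_gx, x_mean, x_std = build_gx_block(spectra, temperature)
```

New spectra would need to be scaled with the calibration `x_mean` and `x_std` rather than their own statistics before prediction.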

| GY: Global modelling-temperature as a dependent variable
The GY model adds the measured temperature of the sample, at the time spectra were collected, to the measured data matrix (the Y block). 17 The PLS2 algorithm is therefore used to simultaneously develop calibration models for both the parameter of interest (viscosity) and the temperature. By having to predict the two variables simultaneously, it is thought that the model can identify which spectral regions are affected by temperature changes. As with GX, this method cannot be used in processes where temperature is not being monitored.

| G2Step: Global modelling-temperature as a dependent variable in two-step model
The G2Step model first develops a calibration model for temperature, made up of a single latent variable. 17 Typically, the first latent variable of the PLS model describes the parameter of interest, so in this case is assumed to be rich in temperature-related spectral information. The second step involves using the residuals from the temperature calibration to develop a model for the parameter of interest (viscosity) where it is thought most information pertaining to the temperature will have already been removed. For this work, the data were not preprocessed (only mean centred) prior to step one: the development of the temperature calibration. In step two, no additional preprocessing was applied to the data either, as seen with all viscosity models developed in this study (more on this in Section 4.1). The success of this method highly depends on the relationship between the parameter of interest and temperature.
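The two-step idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `pls1` helper is a textbook NIPALS PLS1 written for this example (not the PLS Toolbox routine used in this work), the data are synthetic, and the choice of three latent variables in step two is arbitrary.

```python
import numpy as np

def pls1(X, y, n_lv):
    """Textbook NIPALS PLS1 on mean-centred data.

    Returns the regression vector, the X residuals left after n_lv latent
    variables have been removed, and the score matrix.
    """
    E = X - X.mean(axis=0)
    f = y - y.mean()
    W, P, T, q = [], [], [], []
    for _ in range(n_lv):
        w = E.T @ f
        w = w / np.linalg.norm(w)      # weight vector
        t = E @ w                      # scores
        tt = t @ t
        p = E.T @ t / tt               # X loadings
        c = (f @ t) / tt               # y loading
        E = E - np.outer(t, p)         # deflate X
        f = f - c * t                  # deflate y
        W.append(w); P.append(p); T.append(t); q.append(c)
    W, P, T = np.array(W).T, np.array(P).T, np.array(T).T
    b = W @ np.linalg.solve(P.T @ W, np.array(q))
    return b, E, T

# Synthetic spectra with a temperature-driven and a viscosity-driven component
rng = np.random.default_rng(0)
n, n_channels = 40, 30
temp = rng.uniform(28.0, 32.0, n)
visc = rng.uniform(2.0, 9.0, n)
X = (0.01 * np.outer(temp, rng.normal(size=n_channels))
     + 0.01 * np.outer(visc, rng.normal(size=n_channels))
     + 1e-3 * rng.normal(size=(n, n_channels)))

# Step one: temperature calibration with a single latent variable
_, X_resid, T_temp = pls1(X, temp, 1)

# Step two: viscosity calibration built on the step-one residuals
b_visc, _, _ = pls1(X_resid, visc, 3)
visc_fit = (X_resid - X_resid.mean(axis=0)) @ b_visc + visc.mean()
```

To predict a new sample, its spectrum would first have to be centred and deflated with the step-one weight and loading; that bookkeeping is omitted here for brevity.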

2.5 | GLSW: Multivariate filter-generalised least squares weighting
GLSW downweights spectral variables using a covariance matrix generated from the differences between similar samples, where similarity is defined by their measured value. The extent of the weighting depends on the adjustable parameter alpha (α). Values for α typically range from 0.0001 to 1.0. Low α values increase the weighting of the filter, whereas high values reduce its effect. Several values of α were trialled (0.2, 0.02, 0.002, 0.0002 and 0.00002) to get the best performance from the filter without removing information related to viscosity. The optimal value was found to be 0.0002; further decreases in α increased model error, suggesting that variance in the data associated with viscosity was being removed. Generally, applying the filter reduces the complexity of the model and improves its predictive ability.
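For readers unfamiliar with the filter, the sketch below shows one commonly described form of GLSW in Python (NumPy assumed): the covariance of the difference spectra is eigendecomposed and each direction is shrunk by 1/sqrt(s²/α + 1). The function name `glsw_filter`, the unscaled covariance and the toy four-channel data are assumptions for illustration and may differ from the software implementation used in this work.

```python
import numpy as np

def glsw_filter(X_diff, alpha):
    """Build a GLSW-style filter matrix G from difference spectra.

    X_diff holds differences between spectra of samples that should look alike.
    Directions with large variance in X_diff are shrunk the most, and a smaller
    alpha shrinks them harder.
    """
    C = X_diff.T @ X_diff                    # clutter covariance (unscaled)
    evals, V = np.linalg.eigh(C)             # C = V diag(evals) V^T
    evals = np.clip(evals, 0.0, None)        # guard against tiny negative values
    d = np.sqrt(evals / alpha + 1.0)
    return (V / d) @ V.T                     # G = V D^-1 V^T

# Toy case: all difference spectra lie along a single "temperature" direction
clutter = np.array([1.0, 1.0, 0.0, 0.0]) / np.sqrt(2.0)
X_diff = np.outer(np.linspace(-1.0, 1.0, 20), clutter)
G = glsw_filter(X_diff, alpha=0.0002)

signal = np.array([0.0, 0.0, 1.0, 0.0])      # orthogonal to the clutter direction
filtered_clutter = G @ clutter
filtered_signal = G @ signal
```

Spectra would be filtered as `X @ G` before calibration and prediction. In this toy case the clutter direction is almost eliminated while the orthogonal direction is left essentially unchanged.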

| Direct standardisation
DS was originally developed as a calibration transfer method, in which spectra from a secondary instrument are transformed to match spectra measured on a primary instrument so that the same calibration model can be employed. The same procedure has been shown to remove temperature effects, with spectra normalised so that they appear to have been collected at a single temperature. 24,28 For temperature compensation, the normalisation involves generating a transformation matrix from spectra acquired at a single temperature (30 °C) and the remaining spectra at a range of temperatures (28 °C, 29 °C, 31 °C and 32 °C). Applying the resulting transfer function to the spectra is intended to remove the temperature-related variance.
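The transformation step can be sketched in Python with a least-squares estimate of the transfer matrix via the pseudoinverse, which is the classical DS formulation. NumPy is assumed; the function name `ds_transform`, the synthetic spectra and the use of one matrix per off-reference temperature are illustrative assumptions rather than details confirmed by this study.

```python
import numpy as np

def ds_transform(X_ref, X_other):
    """Least-squares matrix F such that X_other @ F approximates X_ref.

    Rows of X_ref and X_other must be the same samples, measured at the
    reference temperature and at the other temperature, respectively.
    """
    return np.linalg.pinv(X_other) @ X_ref

# Synthetic example: spectra at 28 C are a mixed-channel version of those at 30 C
rng = np.random.default_rng(0)
X_30 = rng.normal(size=(6, 4))                   # 6 samples, 4 channels
mix = np.eye(4) + 0.1 * np.eye(4, k=1)           # small bleed into the next channel
X_28 = X_30 @ mix

F = ds_transform(X_30, X_28)
X_28_corrected = X_28 @ F                        # mapped to look like 30 C spectra
```

With real spectra there are usually far more channels than transfer samples, so the fit on the transfer set is exact by construction and the quality of the correction has to be judged on independent samples.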

| Samples
A total of 55 samples were used, with viscosities varying between 2 and 9 Pa s. All samples were made up of deionised water, sodium lauryl ether sulphate (SLES), cocoamidopropyl betaine (CAPB) and sodium chloride. The sample set comprised five formulations with differing surfactant concentrations. Spectra were recorded at 28 °C, 29 °C, 30 °C, 31 °C and 32 °C (total of 275 spectra). The sample set was split into calibration and validation sets using the Kennard-Stone algorithm, resulting in 43 calibration samples (215 spectra) and 12 validation samples (60 spectra).
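A minimal Python sketch of Kennard-Stone selection is given below for orientation (NumPy assumed). It uses Euclidean distance on one feature row per sample; the function name `kennard_stone` and the toy data are hypothetical, and the variables on which the split was actually performed are not specified here.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return indices of n_select rows chosen by the Kennard-Stone rule.

    Start with the two most distant rows, then repeatedly add the row whose
    distance to its nearest already-selected row is largest.
    """
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select:
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected

# Toy one-dimensional example with five samples
X_toy = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
calibration_idx = kennard_stone(X_toy, 3)
```

Selecting at the sample level, and then assigning all five temperature spectra of a sample to the same set, keeps the calibration and validation sets independent.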

| Data collection
The spectra were acquired with a Bruker Matrix F FTNIR (Karlsruhe, Germany) fibre-coupled to a transmission process probe with a path length of 2 mm (Excalibur XP 20 [Hellma, Müllheim, Germany]). The spectral range covered the whole of the NIR region (12,000-4,000 cm⁻¹). Spectra were acquired at 10 cm⁻¹ (2,074 data points), averaged over five scans, using a background of air. A water bath was used to heat the samples gradually, and spectra were acquired at 28 °C, 29 °C, 30 °C, 31 °C and 32 °C (±0.2 °C). Temperature was monitored with a Pico Technology PT-104 (Cambridgeshire, UK) temperature data logger using platinum resistance thermometers (PRTs).

| Data processing
PLS calibration models were developed using PLS Toolbox chemometrics software (PLS_Toolbox_8.0.1, Eigenvector Research Inc., Manson, WA, USA). Optimising the model involved trialling various preprocessing techniques and reviewing different spectral regions. The performance of the temperature compensation methods was assessed using three measures: the root mean square error of prediction (RMSEP), which reflects predictive ability; the bias of prediction, a measure of systematic error; and the rank of the model, which represents model complexity.
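For clarity, the two error statistics can be computed as in the short Python sketch below (NumPy assumed). The function name and the example values are hypothetical, and the sign convention for bias (predicted minus measured) is an assumption.

```python
import numpy as np

def rmsep_and_bias(y_measured, y_predicted):
    """RMSEP is the root mean square residual; bias is the mean residual."""
    residuals = np.asarray(y_predicted, float) - np.asarray(y_measured, float)
    rmsep = float(np.sqrt(np.mean(residuals ** 2)))
    bias = float(np.mean(residuals))
    return rmsep, bias

# Illustrative viscosities in Pa s (not data from this study)
rmsep, bias = rmsep_and_bias([2.0, 4.0, 6.0], [2.5, 4.5, 5.5])
```

Here every prediction is off by 0.5 Pa s, so the RMSEP is 0.5 Pa s, while the bias is smaller because the errors partly cancel.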

| PLS model
A preliminary study on the use of in-line NIR to predict the viscosity of micellar liquids has been reported. 13 Model development in the present work follows the same method as that study in terms of variable selection and preprocessing methods. As an initial assessment, that study focused on a single micellar liquid formulation, showing viscosity-rich information in the spectral regions of the second overtones of O-H and C-H for the viscosity range of interest (Model 2 by Haroon et al 13 ). The present study expands on this by including five different formulations in the model space; the best model, with the lowest errors of prediction, was found to be concentrated in the region of 8,913-7,177 cm⁻¹, where the only spectral feature present is the second overtone of C-H. Previously, it was stated that the variance found in the overtone of O-H may be related to the amount of bound water in the micelle structures 13 ; however, it is now thought to be representative of water content because, for a single formulation in the viscosity range of interest, a decrease in water content is accompanied by an increase in viscosity. The removal of these variables also likely reduces the overall effect of the temperature fluctuations on model performance, as the temperature sensitivity associated with hydrogen bonds is significantly higher than that of C-H bonds. 29,30 The model used in the present work uses no preprocessing (only mean centring), unlike the initial study where standard normal variate (SNV) was used to correct for scattering effects. With this new model, it was found that scattering and absorption data together contain more information than absorption alone. 14 As the increase in viscosity of the liquids is due to the build-up of micellar networks, the scattering data are likely presenting some information related to the size of these micelles.
For the temperature compensation methods below, the model being further developed uses the spectral region between 8,913 and 7,177 cm⁻¹ with no additional preprocessing.

| Influence of temperature on micellar liquids and their NIR spectra
Physically, the micellar liquids of interest in this study are very sensitive to temperature; for a 1 °C change, the viscosity can change by ±0.5 Pa s on average. These micellar networks are termed wormlike micelles or living polymers. They are described as long, semiflexible rods with two main characteristics: reptation, their movement when in an entangled state, likened to that of reptiles; and reversible scission, meaning they are constantly breaking and recombining due to the dynamic equilibrium between the monomers and the micelles. 31-34 The living polymer model for wormlike micelles states that with increasing temperature, the average length of the micelles decreases exponentially. 35 The viscosity of the system depends on the build-up of the structure, that is, the size of the micelles; therefore, an increase in temperature will result in a decrease in viscosity as the average length of the micelles reduces. As a quality control parameter, the maximum acceptable error for this measurement is ±0.5 Pa s. Temperature tends to fluctuate between 28 °C and 32 °C in this process, a range of 4 °C that results in the viscosity varying over a range of 2 Pa s purely due to temperature changes. Figure 1A shows the full spectra for a micellar liquid with a measured viscosity of 4.96 Pa s (±0.20 Pa s) collected at the five temperatures of interest. The spectral changes are very small, almost unobservable at this scale. For this work, it was found that the spectral region most suited to predicting the viscosity of these samples lies between 8,913 and 7,177 cm⁻¹ (more on this in Section 4.1), where the only spectral feature present is the second overtone of C-H (8,900-7,900 cm⁻¹). Figure 1B focuses on this region and shows that an increase in temperature does result in small spectral shifts, particularly noticeable at the lower temperatures (28 °C and 29 °C).
Clearly, the sensitivity of micellar liquid viscosity to temperature does not carry over into the NIR spectra, as the small shifts seen in the spectra are consistent with what would be expected over such a small temperature range. Table 1 shows the temperature compensation methods trialled in this study along with application data specific to each method and notes about their general performance and implementation. Table 2

TABLE 1 Temperature compensation methods trialled in this study

| Method | Model details | Performance and implementation |
| --- | --- | --- |
| Global modelling-implicit (GI) | • Spectra at all five temperatures included in the calibration | • Applicable to many situations with large or small temperature differences • Lots of spectral data required • Results in a complex model |
| Global modelling-temperature as an independent variable (GX) | • Spectra at all five temperatures included in the calibration with temperature appended to the data matrix • X block autoscaled (not mean centred) | • Applicable to many situations with large or small temperature differences • Lots of spectral data required • Temperature needs to be simultaneously measured |
| Global modelling-temperature as a dependent variable (GY) | • Spectra at all five temperatures included in the calibration • Use of PLS2 algorithm to simultaneously model for temperature and viscosity | • Applicable to many situations with large or small temperature differences • Lots of spectral data required • Temperature needs to be simultaneously measured • Longer model development (use of PLS2) |
| Global modelling-temperature as a dependent variable in two-step model (G2STEP) | … | … |

Adding temperature to either the Y or X block as an additional variable showed no benefit over the popular GI model. Using the residuals of a temperature model (G2STEP) provides similar model statistics to the DS model but remains complex with eight latent variables, one fewer than the GI method, suggesting that the temperature variations are present in other latent variables. As mentioned earlier, this technique depends highly on the relationship between the parameter of interest and temperature. This technique shows further improvement upon the GI method, suggesting that although physically the viscosity depends strongly on temperature, in the NIR spectra, signals related to viscosity are mostly independent of signals related to temperature. DS did not produce any significant benefits over the best global method or GLSW in terms of predictive ability or complexity.
Figure 2 shows a plot of temperature against bias for the models GI, GLSW and DS and a local model built at 30 °C. All the compensation models show fairly similar prediction bias values (<0.02), but the bias for the local model diverges when predicting samples measured at the lower temperatures (28 °C and 29 °C), demonstrating the success of global calibration, DS and GLSW in reducing the impact of temperature on calibration models. The DS model seems to underpredict across the range of temperatures, with the GI and GLSW models showing similar prediction biases. Figure 3 presents residual plots at each temperature measured for the best methods and a local model built at 30 °C. A maximum error of ±0.5 Pa s has been deemed acceptable for the measurement of the viscosity of micellar liquids for quality control. Figure 3 clearly shows the improvement in the predictive ability of the model with the use of GLSW compared with the local model built at 30 °C. For the local model, at the lower temperatures, the residuals vary considerably, with the maximum deviation being about 5.6 Pa s. These large deviations from the measured value show that although the spectral changes due to temperature are small, they have a marked effect on the predictive ability of the model. In Figure 1B, spectral differences between 30 °C and the lower temperatures are larger than those between 30 °C and the higher temperatures, consistent with the behaviour seen in the residual plot. With the use of GLSW, these deviations are significantly reduced, with almost all predictions being within the bounds of ±0.5 Pa s.
These results are consistent with other studies showing that standardisation techniques, multivariate filters and global models are useful tools in the development of robust NIR calibrations where temperature fluctuates. 12,24,30,36 As mentioned previously, these methods have mainly been employed to improve PLS models built to measure chemical properties, with little work on physical properties. Clearly, the same effects are present in both cases, and although the viscosity of micellar liquids is temperature dependent, this influence does not carry over into the NIR spectra. As demonstrated by this study, established temperature compensation techniques are therefore still valid and beneficial.

| CONCLUSION
The predictive performance of NIR calibration models is greatly affected by temperature fluctuations. This study evaluated the effect of temperature on NIR spectra of micellar liquids of varying viscosity, and ways to remove these effects in order to develop a robust PLS calibration model. Several approaches are presented and compared, including linear global models, GLSW and DS. The effects of temperature on the NIR spectra were found to be very small and not related to the temperature-sensitive viscosity changes that these liquids present. The main objective was to determine the best method of temperature compensation for this data set, and superior performance was displayed when applying the multivariate filter GLSW. This produced the lowest prediction errors and the least complex model (five latent variables), with the majority of the validation set predicted within ±0.5 Pa s of the measured value.
Further work will involve validation with respect to temperature, flowrate and viscosity at pilot scale and investigations into the implementation of the NIR into the process using computational fluid dynamics to determine if there is an ideal position and also to help in calibration transfer studies.