Robust Fourier transformed infrared spectroscopy coupled with multivariate methods for detection and quantification of urea adulteration in fresh milk samples

Abstract Urea is added to milk as an adulterant to give it whiteness, increase its consistency, and improve the solid-not-fat percentage, but an excessive amount of urea in milk overburdens the kidneys and can cause kidney damage. Here, a sensitive methodology based on near-infrared (NIR) spectroscopy coupled with multivariate analysis is proposed for the robust detection and quantification of urea adulteration in fresh milk samples. In this study, 162 fresh milk samples were used, consisting of 20 nonadulterated samples (without urea) and 142 samples with urea adulterant. Eight different percentage levels of urea adulterant, that is, 0.10%, 0.30%, 0.50%, 0.70%, 0.90%, 1.10%, 1.30%, and 1.70%, were prepared, each in triplicate. A Frontier NIR spectrophotometer (BSEN60825-1:2007) by Perkin Elmer was used to scan the absorption of each sample in the wavenumber range of 10,000-4,000 cm⁻¹, using a 0.2 mm path length CaF₂ sealed cell at a resolution of 2 cm⁻¹. Principal components analysis (PCA), partial least-squares discriminant analysis (PLS-DA), and partial least-squares regression (PLSR) were applied for the multivariate analysis of the collected NIR spectral data. PCA was used to reduce the dimensionality of the spectral data and to explore the similarities and differences between the fresh milk samples and the adulterated ones. PLS-DA discriminated between the nonadulterated and adulterated milk samples. The R-square and root mean square error (RMSE) values obtained for the PLS-DA model were 0.9680 and 0.08%, respectively. Furthermore, a PLSR model was built using the training set of NIR spectral data. For this PLSR model, a leave-one-out procedure was used as the internal cross-validation criterion, and the R-square and RMSE values were found to be 0.9800 and 0.56%, respectively. The PLSR model was then externally validated using a test set. The root mean square error of prediction (RMSEP) obtained was 0.48%. The present study was intended to contribute toward the development of a robust, sensitive, and reproducible method to detect and determine the urea adulterant concentration in fresh milk samples.


| INTRODUCTION
Analytical methods based on the NIR spectral region share several significant characteristics: they are fast, nondestructive, and noninvasive; the probing radiation beam penetrates deeply; they are suitable for in-line use; they are nearly universally applicable (to any molecule containing C-H, N-H, S-H, or O-H bonds); and they demand minimal sample preparation. The combination of these characteristics with instrumental control and data treatment has made it possible to coin the term Near-Infrared Technology (Celio, 2003).
Unfortunately, milk is one of the most vulnerable targets for economically motivated adulteration (Moore, Spink, & Lipp, 2012), and the adulterants can cause serious illness in consumers, which may lead to death in some cases. Milk adulterants mainly include vegetable proteins, whey, added water, and milk from different species (Singh & Gandhi, 2015). The major hazardous adulterants of milk include urea, formalin, ammonium sulfate, boric acid, detergents, caustic soda, salicylic acid, hydrogen peroxide, benzoic acid, melamine, and sugars. Urea adulteration up to 500 mg/L can result in cancer and kidney failure (De et al., 2011; De Toledo et al., 2017). As the allowed limit for urea in fresh milk, some researchers have recommended a range of 10-14 milligrams per deciliter (mg/dl), while others have recommended a range of 8-12 mg/dl (Penn State Extension report). Cow milk containing urea as a contaminant has been reported to cause ulcers, acidity, kidney stones, and indigestion (Ezhilan et al., 2017), and urea-adulterated milk is considered to overburden the kidneys (Kandpal, Srivastava, & Negi, 2012). Milk adulterated with excessive starch can leave undigested starch in the colon, which can cause diarrhea and in some cases can be fatal for diabetic patients (Singuluri & Sukumaran, 2014).
Adulteration, or the addition of illegal additives to food products, is becoming a global issue for consumers. Owing to the lack of adequate monitoring policies, underdeveloped and developing countries face a higher risk to human health (Azad & Ahmed, 2016).
Nowadays, milk adulteration is being carried out in more sophisticated ways (Azad & Ahmed, 2016), whereas the standard methods for food protein analysis rely mainly on measuring nitrogen content with classical detection techniques (Garcia et al., 2012). It has therefore become difficult to differentiate the nitrogen of the milk protein from that of the nitrogen-rich chemicals commonly used as adulterants (Qin et al., 2017). Hence, there is a direct need for cutting-edge research and for the dissemination and implementation of more advanced techniques to detect these adulterants. In a previous study (Mabood et al., 2017), we reported an NIRS method coupled with chemometrics to authenticate the level of goat milk adulteration in camel milk. The present study was intended to contribute toward the development of a robust, highly sensitive, and reproducible Fourier transform near-infrared spectroscopy (FT-NIRS) method, aided by chemometric methods, to determine the urea adulterant concentration in cow milk. The fresh cow milk samples were intentionally adulterated with various concentrations of commercial urea and then submitted to NIR spectral measurements. Multivariate analysis was finally applied to authenticate and quantify the levels of adulteration.

KEYWORDS
milk adulteration, NIR spectroscopy, partial least-squares discriminant analysis, partial least-squares regression, principal components analysis, urea

| Preparation of the urea adulterated fresh milk samples
In this study, 162 fresh milk samples were used, consisting of 20 nonadulterated samples (without urea) and 142 samples with urea adulterant. Eight different percentage levels of urea adulterant were used, that is, 0.1%, 0.3%, 0.5%, 0.7%, 0.9%, 1.1%, 1.3%, and 1.7%, each prepared in triplicate. The measured NIR spectral data were split into two sets: a training set comprising 70% of the data, used for building the PLSR model, and a test set comprising the remaining 30% of the spectra, used for external validation of the PLSR model.
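The 70/30 partition described above can be sketched as follows. The text does not say how samples were assigned to each set, so the random shuffle, the seed, and the function name `split_train_test` are assumptions made purely for illustration.

```python
import random

def split_train_test(n_samples, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx) for a random split of sample indices.

    Random assignment and the seed are assumptions; the source only
    states the 70% / 30% proportions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

# 162 spectra, as in the study
train_idx, test_idx = split_train_test(162)
print(len(train_idx), len(test_idx))  # 113 49
```

With replicated samples, one may prefer to keep all replicates of a sample in the same set so that the test set stays independent of the training set; this sketch does not do that.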

| Fourier transform near-infrared spectroscopic analysis
A Frontier NIR spectrophotometer (BSEN60825-1:2007) by Perkin Elmer was used for measuring the absorption of each milk sample in the wavenumber range of 10,000-4,000 cm⁻¹, using a 0.2 mm path length CaF₂ sealed cell at a resolution of 2 cm⁻¹.

| Multivariate analysis
Principal components analysis (PCA), partial least-squares discriminant analysis (PLS-DA), and partial least-squares regression (PLSR) were applied for the multivariate analysis of the measured NIR spectral data using the Unscrambler version 9.00 and Microsoft Excel 2010 software. PCA was used to reduce the dimensionality of the spectral data and to explore the similarities and differences between the fresh milk samples and the ones adulterated with urea.
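As a rough illustration of the PCA step, the sketch below computes scores, loadings, and the fraction of variance explained per component through a singular value decomposition of the mean-centred data. It uses NumPy rather than the Unscrambler software named above, and the synthetic "spectra" (a single Gaussian band scaled by concentration plus noise) are invented stand-ins, not the study's data.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA by SVD of the mean-centred matrix (rows = samples, columns = variables)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)               # fraction of total variance per PC
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T                # one column per PC
    return scores, loadings, explained[:n_components]

# Synthetic stand-in data (assumption): 20 samples, 50 spectral variables.
rng = np.random.default_rng(0)
conc = np.repeat([0.0, 0.5, 1.0, 1.5], 5)
band = np.exp(-0.5 * ((np.arange(50) - 25) / 4.0) ** 2)
X = np.outer(conc, band) + 0.01 * rng.standard_normal((20, 50))

scores, loadings, explained = pca_scores(X)
print(explained)  # share of the spectral variation carried by PC1 and PC2
```

The score plot (PC1 against PC2) is what would be inspected for grouping of adulterated and nonadulterated samples, while the loadings indicate which spectral variables drive each component.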
PLS-DA was used to discriminate between adulterated and nonadulterated milk samples. Furthermore, the PLSR models were also built to quantify the levels of urea in the fresh milk samples. The PLSR model was externally validated using the test set of samples.

| NIR spectra
The NIR spectral data obtained by running all the adulterated and nonadulterated fresh milk samples through the FT-NIR spectrophotometer are shown in Figure 1.
Prior to applying the chemometric methods to the near-infrared spectral data, spectral transformations such as baseline correction, 1st derivative with Savitzky-Golay smoothing, and standard normal variate (SNV) were applied. The preprocessing was applied to the NIR spectra to remove noise and to minimize the effect of scattering caused by suspended particles in the fresh milk samples (see Table 1). The optimal spectral transformation was selected on the basis of the R², RMSE, and RMSEP values of the PLSR models: the best preprocessing treatment is the one with the minimum values of RMSE, RMSEP, and number of factors and the maximum value of R².
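Two of the pretreatments named above, SNV and the Savitzky-Golay 1st derivative, can be sketched in a few lines. The 5-point window, the quadratic polynomial, the chaining of the two steps, and the absorbance values are all assumptions for illustration; the text does not report the settings actually used.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own standard deviation."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(a - mean) / sd for a in spectrum]

# Savitzky-Golay first-derivative weights for a 5-point window and a
# quadratic polynomial. Window size and order are assumptions.
SG_WEIGHTS = (-2, -1, 0, 1, 2)
SG_NORM = 10.0

def sg_first_derivative(spectrum, step=1.0):
    """Smoothed first derivative of an evenly spaced spectrum.
    The two points at each edge are dropped rather than extrapolated."""
    half = len(SG_WEIGHTS) // 2
    derivative = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        weighted = sum(w * a for w, a in zip(SG_WEIGHTS, window))
        derivative.append(weighted / (SG_NORM * step))
    return derivative

# Hypothetical absorbance values, for illustration only.
absorbances = [0.52, 0.55, 0.61, 0.70, 0.74, 0.71, 0.63, 0.58, 0.54]
pretreated = sg_first_derivative(snv(absorbances))
```

SNV acts on each spectrum independently, which is why it helps against sample-to-sample scatter differences, while the derivative removes constant baseline offsets.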
As can be seen from Table 1 ...
FIGURE 2 The 1st derivative transformed NIR spectra for the fresh milk samples.
... this case, PC contains 56% of the total spectral variation in X. It also indicates the spectral regions that contribute most to the PCA model.
The chemical structure of the urea molecule is shown in Figure 3c.
The absorbance spectrum of urea exhibited two broad absorption bands at 4,650 and 4,550 cm⁻¹, associated with symmet- ...
Similarly, the PLS-DA model also showed discrimination between the milk samples, as shown in Figure 4.
The PLS-DA model in Figure 4 shows that the 0% concentration ...
To examine the variation in the spectral data while building the PLS-DA model, the X-factor loading plot was also built and is shown in Figure 5.

| PLS regression results
Furthermore, a PLSR model was also built on the NIR spectral data in order to quantify the levels of the urea adulterant in the fresh milk samples, as shown in Figure 6. The PLSR model was built by using 70% of the NIR spectral data, that is, the training set.
A PLS regression model constructs a set of orthogonal components that maximize the correlation between the NIR spectral data (X) and the concentration (Y), and it provides a predictive equation for Y in terms of the X variables for future unknown samples. Figure 7 shows the generalized procedure, including the validation methods, of the multivariate PLS regression analysis applied to the NIR spectral data. It shows that the NIR spectral data of all the adulterated and nonadulterated fresh milk samples were first transformed by applying the 1st derivative spectral pretreatment. After that, the spectral data of the urea adulterated milk sam- ...
The R-square and root mean square error (RMSE) values for the PLSR model in Figure 6 were found to be 0.986 and 0.612%, respectively. The RMSE is a statistical measure used to check the prediction ability of the PLSR model using "pseudo" external samples obtained with the leave-one-out procedure. The best PLSR model is the one with the smallest RMSE together with a high correlation. It is calculated as in Equation 1:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \qquad (1)$$

where $y_i$ is the measured value (actual % of adulteration), $\hat{y}_i$ is the % of adulteration predicted by the model, and $n$ is the number of segments left out in the cross-validation procedure, which is equal to the number of samples in the training set. A smaller RMSE indicates a better prediction ability of the PLSR model.
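To make the leave-one-out RMSE concrete, here is a minimal sketch of a single-response PLS regression (NIPALS-style PLS1) with leave-one-out cross-validation in NumPy. It is not the Unscrambler implementation used in the study; the function names, the two-factor setting, and the synthetic data are assumptions chosen only so the example runs on its own.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """Fit PLS1 and return (x_mean, y_mean, beta) for prediction."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector
        t = Xr @ w                      # scores
        tt = t @ t
        p = Xr.T @ t / tt               # X loadings
        c = yr @ t / tt                 # y loading
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - c * t                 # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, beta

def loo_rmse(X, y, n_factors):
    """Leave-one-out RMSE: each sample is predicted by a model fitted without it."""
    residuals = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        x_mean, y_mean, beta = pls1_fit(X[keep], y[keep], n_factors)
        y_hat = y_mean + (X[i] - x_mean) @ beta
        residuals.append(y[i] - y_hat)
    return float(np.sqrt(np.mean(np.square(residuals))))

# Synthetic stand-in data (assumption): urea levels in %, three replicates each.
rng = np.random.default_rng(0)
levels = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.7]
y = np.repeat(levels, 3)
band = np.exp(-0.5 * ((np.arange(50) - 25) / 4.0) ** 2)
X = np.outer(y, band) + 0.01 * rng.standard_normal((y.size, band.size))

rmse_cv = loo_rmse(X, y, n_factors=2)
print(rmse_cv)
```

Here the number of left-out segments equals the number of training samples, matching the definition of n in Equation 1. Repeating the call for several values of `n_factors` is one way to compare candidate models by their RMSE.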
To show the variation in the spectral data while building the PLSR model, the factor loading plot is shown in Figure 8. Once the PLS regression model was established, it was assessed using the external test set comprising 30% of the NIR spectral data, as shown in Figure 9. Figure 9 shows that the PLS regression model displayed very good prediction ability, with a prediction error of RMSEP = 0.483% and a high correlation coefficient (R = 0.99).
The RMSEP is a statistical measure used to assess the prediction ability of the PLS model with totally new samples (not used during the calibration process), and it is calculated using Equation 2:

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n_t} (y_{t,i} - \hat{y}_{t,i})^2}{n_t}} \qquad (2)$$

where $y_{t,i}$ is the measured value (actual % of adulteration), $\hat{y}_{t,i}$ is the % of adulteration predicted by the model, and $n_t$ is the number of samples in the test set. RMSEP expresses the average error to be expected in future predictions when the calibration model is applied to unknown samples.
Based on the minimum value of RMSEP (model with three factors), the PLS regression model can be applied to unknown fresh milk samples for the detection and quantification of urea adulteration.
FIGURE 8 Factor loading plot for factor 1
FIGURE 9 Partial least-squares prediction plot for the test set of fresh milk samples
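The RMSEP of Equation 2 is a short computation once test-set predictions are available. The sketch below assumes the measured and predicted adulteration levels are given as plain lists; the numbers are hypothetical and are not taken from the study.

```python
import math

def rmsep(y_measured, y_predicted):
    """Root mean square error of prediction over an external test set."""
    if len(y_measured) != len(y_predicted):
        raise ValueError("measured and predicted values must have equal length")
    n_t = len(y_measured)
    squared_errors = ((y - y_hat) ** 2 for y, y_hat in zip(y_measured, y_predicted))
    return math.sqrt(sum(squared_errors) / n_t)

# Hypothetical test-set values in % urea, for illustration only.
measured = [0.1, 0.5, 0.9]
predicted = [0.2, 0.4, 1.0]
error = rmsep(measured, predicted)  # about 0.1, in the same units as y
```

Because the result is in the units of the response, it can be read directly as a typical prediction error in % urea.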

| CONCLUSION
The results of this study revealed that NIR spectroscopy coupled with multivariate methods can be deployed as a robust, sensitive, and nondestructive technique for detecting and quantifying urea adulteration in fresh milk samples. The study showed that the PLS-DA model can be used to discriminate the milk samples adulterated with urea from the fresh (unadulterated) milk samples.
Furthermore, the PLSR models may be used to quantify the level of the urea adulterant in milk samples (https://extension.psu.edu/interpretation-of-milk-urea-nitrogen-mun-values).

ACKNOWLEDGMENTS
The laboratory, instrumental, and consumable facilities were provided by the University of Nizwa, Oman.

CONFLICT OF INTEREST
The authors declare that they do not have any conflict of interest.
Ethical Review: This study does not involve any human or animal testing.
Informed Consent: Written informed consent was obtained from all study participants.