Chemometric analysis combined with FTIR spectroscopy of milk and Halloumi cheese samples according to species’ origin

Abstract Food adulteration is an issue of major concern, as numerous foodstuffs and beverages do not conform to their labeling. Our research interest is in the authenticity of dairy products, particularly cheese. Adulteration of dairy products is a well-known phenomenon, and there are numerous published studies specifically on the authenticity of cheese. Substitution of a portion of the fat and/or proteins, adulteration with milk from other species, and mislabeling of ingredients are some of the main issues that the science of dairy product authenticity regularly faces. Dairy products can be discriminated by several chemical or microbiological methods presented in the literature. In addition, chemometric analysis is an important tool for interpreting large sets of measurements. The aim of this study is to discriminate between various samples of milk, which is the primary ingredient of dairy products. Milk samples of different trademarks were analyzed. These data were combined with Halloumi cheese samples for chemometric discrimination of species' origin. The innovative point of this study is that it is the first time that a research study on dairy products has included Halloumi cheese, a traditional Cypriot cheese that has not been well studied until now. The first step of the methodology was freeze-drying (lyophilization) of the samples. Fourier transform infrared (FTIR) spectroscopy was chosen for their chemical characterization. The measurements were interpreted by chemometric analysis using SIMCA software. In this study, FTIR data combined with chemometrics gave a very good discrimination of the samples according to their species' origin. Chemometric methods such as PCA and OPLS-DA were used with great success. In the future, this model will be studied with regard to the geographical origin of the samples.


| INTRODUCTION
Milk is used as an ingredient in many food products for human nutrition. The characteristics of many dairy products are related to the quality and to the species' origin of the milk used, as mixtures of different kinds of milk, at specific ratios, contribute to giving special properties to the final product. Milk quality is also important in the production of all types of cheese. The commercial value of these dairy products is often determined by the exact composition of the milk mixture used (Lamanna, Braca, Di Paolo, & Imparato, 2011).
Goat and sheep milk are of higher value than cow milk and are used to produce a variety of specialty cheeses.
Because milk and dairy products are consumed by large segments of society, unscrupulous producers are motivated to adulterate them in order to maximize their profit, with negative effects on product quality (Cirak, Icyer, & Durak, 2018).
Adulteration of either goat or sheep milk with cow milk results in nonauthentic milk products (Maudet & Taberlet, 2001; Nicolaou, Xu, & Goodacre, 2010). The European Union has legislation in place for the correct display of the constituents of dairy products, protecting their authenticity (European Commission, 2001).
For milk, the determination of species adulteration is a potential drawback of biological techniques. Some authors state that during the analysis of milk with biological techniques, proteolysis and heat denaturation can cause the loss of antibody-specific epitopes (Hurley, Elyse Ireland, Coleman, & Williams, 2004; Nicolaou et al., 2010). In addition, biological techniques, such as DNA-based ones, can be impractical for routine industrial use, and quantification may be hindered by environmental factors such as mastitis, or by milk processing factors such as heat treatment (Bania, Ugorski, Polanowski, & Adamczyk, 2001; López-Calleja et al., 2005; Cheng, Chen, & Weng, 2006; Mašková & Paulíčková, 2006; Nicolaou et al., 2010). Heat treatment is an integral part of Halloumi production. In addition, sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) is considered a time-consuming approach, which must be followed by in-gel digestion to identify the marker peptides of milk and to discriminate the samples regarding species' origin (Calvano, De Ceglie, Monopoli, & Zambonin, 2012).
Halloumi cheese is a semi-hard to hard, elastic, rindless, and easily sliced cheese. It belongs neither to the rinded cheeses nor to the cheeses with holes (eyes). Gibbs, Morphitou, and Savva (2004) stated that Halloumi cheese contains more calcium and less cholesterol than Cheddar cheese. In addition, it is the only cheese in the world which can be consumed uncooked, fried, grilled, baked, or boiled (Gibbs et al., 2004).
It has a white to yellow color, which may vary depending on the species' origin of the milk. Goat and sheep milk give a white color, whereas cow milk turns Halloumi into a yellowish cheese (Kaminarides, Rogoti, & Mallatou, 2000; Lteif, Olabi, Baghdadi, & Toufeili, 2009; Robinson, Haddadin, & Abdullah, 1991). Halloumi cheese is the primary traditional cheese in Cyprus, but so far it is neither a PDO nor a PGI cheese.
The traditional recipe of Halloumi cheese demands the use of goat and/or sheep milk; however, cow milk alone is often used by the large dairy industry. In addition, local Halloumi cheesemakers use unique empirical processing conditions and milk proportions; thus, product characteristics may vary from dairy to dairy throughout the island.

Further variation may be expected when it is made outside Cyprus, where different standards may apply. The consumers of Halloumi cheese in Cyprus prefer homemade products to industrial ones; however, prices of homemade Halloumi cheese vary widely.
The species' origin of milk must be labeled on the package of the product (Moatsou, Hatzinaki, Psathas, & Anifantakis, 2004). The substitution of goat or sheep with cow milk in cheese can be determined by using the European reference method (European Commission Regulation, 1996). The method is based on isoelectric focusing of γ-caseins after treatment of cheese casein fraction with plasmin.
Nevertheless, the substitution of sheep with goat milk cannot be determined with this method, and this is also a common issue in the field of food adulteration (Moatsou et al., 2004; Recio et al., 2004). In the 1980s, national legislation was passed in Cyprus to protect the dairy industry, which stated that a substantial proportion of goat and/or sheep milk must be contained within any cheese sold as Halloumi, without any fixed percentages or thresholds (Welz, 2017). This was changed in 2012, when a minimum of fifty-one percent milk from sheep and goat became mandatory, as stated in the Official Gazette of the Republic of Cyprus, 30 November 2012, Issue No 4628, p. 4786 (Welz, 2017).
The aim of this study is to discriminate between various milk samples. Milk samples of different trademarks were analyzed. The main goal of this study is the development of a method, based on spectroscopy, capable of determining and identifying the adulteration in Halloumi cheese due to mislabeling regarding species' origin of milk. It is the first time that a research study related to dairy products has included Halloumi cheese, which has not been well studied until now.
The first step of the methodology was freeze-drying (lyophilization) of the samples. FTIR was chosen in order to chemically characterize the samples. The advantages of FTIR spectroscopy are that assessments can be accurately reproduced and analyzed without any experimentation and without damaging the sample. It can give results even with a minimum sample amount and does not require any additional substances or chemicals; thus, it reduces the cost and time of analysis (Cirak et al., 2018; Ketty et al., 2017; Nicolaou et al., 2010; Pappas et al., 2008; Terouzi et al., 2016). Moreover, the measurements were interpreted by chemometric analysis using SIMCA software. Chemometric methods such as PCA and OPLS-DA were used with great success.

| Samples analyzed
Twenty-eight milk samples and seventy-four Halloumi cheese samples of different species' origin were purchased from local supermarkets in Cyprus and were analyzed to compare their infrared spectra regarding species' origin.

| Lyophilization
A Christ Alpha 1-2 freeze dryer was used. The condenser temperature was 233 K, and the final pressure in the drying chamber was 3 mPa.
The freeze-drying procedure for 3 ml of each milk sample or 5 g of each Halloumi cheese sample required 8 or 5 hr, respectively, and the residue after homogenization was used for FTIR measurements.

| FTIR Studies
The FTIR spectra were measured in duplicate on a Shimadzu FTIR-8900 spectrometer employing a KBr beam splitter. Samples were recorded as pressed KBr pellets.
Twenty scans were co-added at a nominal resolution of 8 cm⁻¹ in the 4,000-400 cm⁻¹ region. The samples were recorded against a background of air to minimize the interference due to carbon dioxide and water vapor in the atmosphere. The region between 2,700 and 1,900 cm⁻¹ was removed prior to multivariate data analysis because it includes the absorbance of carbon dioxide (CO₂), specifically the absorption at 2,360 cm⁻¹ known to be caused by atmospheric carbon dioxide (Ketty et al., 2017). In addition, this region may contribute more noise than chemical information, because of the lack of molecular vibrations in this part of the spectra.
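As an illustration of the exclusion step described above, the following is a minimal Python sketch, assuming the spectra are held as a NumPy array with one row per sample and a matching wavenumber axis. The function name `drop_region`, the 4 cm⁻¹ grid spacing, and the random demonstration data are hypothetical and are not taken from the study, in which the spectra were handled in Excel and SIMCA.

```python
import numpy as np

def drop_region(wavenumbers, spectra, low=1900.0, high=2700.0):
    """Remove the columns whose wavenumber lies inside [low, high] (cm^-1)."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    keep = (wavenumbers < low) | (wavenumbers > high)
    return wavenumbers[keep], spectra[:, keep]

# Hypothetical axis: 400-4,000 cm^-1 in steps of 4 cm^-1 (assumed spacing).
wn = np.arange(400.0, 4001.0, 4.0)
# Three synthetic "spectra" standing in for measured absorbances.
demo_spectra = np.random.default_rng(0).random((3, wn.size))

kept_wn, kept_spectra = drop_region(wn, demo_spectra)
```

The mask keeps everything strictly outside the excluded interval, so the CO₂ band near 2,360 cm⁻¹ no longer enters the multivariate model.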

| Data analysis
All the spectra were imported into Excel before chemometrics.
Multivariate statistical analysis, including unsupervised principal component analysis (PCA) and supervised orthogonal partial least-squares discriminant analysis (OPLS-DA), was performed using SIMCA software (version 15.0.2; Umetrics; Sweden). Prior to fitting OPLS-DA, a preliminary PCA was performed for data overview. Scaling to unit variance (UV) and mean-centering were applied before analysis. The ellipse in the plots defines Hotelling's T² confidence region, which is a multivariate generalization of Student's t test and provides a 95% confidence interval for the observations. The number of important components chosen is denoted by the symbol A. R²X and R²Y represent the cumulative modeled variation, explaining the quality of the model, and Q² is an estimate of model predictive ability, calculated by cross-validation analysis of variance (CV-ANOVA). Values of R²X, R²Y, and Q² not less than 0.5 suggest a robust model with predictive reliability (Yang et al., 2016). The difference between R²X(cum) and Q²(cum) must be lower than 0.2-0.3 (Eriksson et al., 2006; Fotakis et al., 2016). Regarding the predictability of the OPLS-DA models, the misclassification percentage was calculated. Furthermore, on the cross-validation (CV) score plot, if some samples cross over to the other side, this indicates that their class assignment is uncertain. In addition, model validation was performed by permutation tests repeated 100 times.
To indicate the validity of the original models, both the blue and green regression lines of the Q² and R² points, respectively, should intersect the vertical axis at, or below, zero. Similarly, using the p-value approach, the p-value should be less than 0.05.

FIGURE 1 Representative samples of milk database regarding species' origin

FIGURE 2 Representative samples of Halloumi cheese database regarding species' origin, where black shapes and numbers (1-6) show the general regions with differences between the species' origins, and red letters (a-j) show the specific bands which are different between the two origins

TABLE 1 Type of relative intensity of absorption for important subregions regarding species' origin
Lastly, another dataset was used in conjunction with the model constructed here to gauge the repeatability of the model in this study.
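The preprocessing and PCA overview described in this section can be sketched outside SIMCA as follows. This is a minimal illustration under stated assumptions, not the software's implementation: unit-variance scaling uses the sample standard deviation, PCA is obtained from a singular value decomposition, R²X(cum) is taken as the fraction of the scaled sum of squares captured by the A retained components, and Hotelling's T² is computed per observation from the scores. All function names and the random demonstration matrix are hypothetical.

```python
import numpy as np

def uv_scale(X):
    """Mean-center each variable and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant variables unscaled
    return (X - X.mean(axis=0)) / std

def pca_scores(X, n_components=2):
    """Return scores, loadings, and cumulative R2X for A components."""
    Z = uv_scale(X)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    r2x_cum = float((s[:n_components] ** 2).sum() / (s ** 2).sum())
    return scores, loadings, r2x_cum

def hotelling_t2(scores):
    """Hotelling's T2 of each observation in the score space."""
    var = scores.var(axis=0, ddof=1)
    return ((scores ** 2) / var).sum(axis=1)

# Synthetic matrix: 20 "samples" by 50 "wavenumber variables".
X_demo = np.random.default_rng(0).normal(size=(20, 50))
scores, loadings, r2x = pca_scores(X_demo, n_components=2)
t2 = hotelling_t2(scores)
```

Observations with the largest T² values lie furthest from the model center in the score plot; comparing them with a 95% critical limit (which requires an F-distribution quantile and is omitted here) corresponds to checking whether they fall outside the ellipse.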

| Characterization of FTIR spectra of milk samples
The differences between goat-sheep and cow milk samples are depicted in Figure 1. Between the two origins in Figure 1, clear differences can be seen at the following bands:
• 1: 3,450 cm⁻¹: it is related to -OH stretching in hydroxyl groups,
• 2: 2,930-2,850 cm⁻¹: it is associated with C-H bending in fatty acids,
• 3: 1,745 cm⁻¹: it is correlated with the degree of carboxyl methyl esterification of sugars,
• 4: 1,683 cm⁻¹ (broad peak): it has been assigned to the carbonyl (C=O) stretching (amide I), and it may overlap with the broad and weak peak at 1,644 cm⁻¹, which is correlated with the nonremoved water (bending vibration),
• 5: 1,548 cm⁻¹ and 6: 1,453 cm⁻¹: they correspond to N-H bending with a contribution from C-N stretching (amide II),
• 7: 1,397 cm⁻¹: it is associated with C-H bending of esters and aliphatic chains of fatty acids,
• 8: 1,245 cm⁻¹: it corresponds to C-H bending and C-N stretching with a contribution from N-H bending (amide III),
• 9: 1,168 cm⁻¹: it is due to -NH₂ deformation,
• 10: 1,100 cm⁻¹: it is assigned to COH bending and C-C stretching with a contribution from OH bending (Pappas et al., 2008),
• 11: 1,100-1,060 cm⁻¹: it is associated with O=P-O (phosphate group stretching) covalently bound to casein proteins, which can also be observed in milk samples (Etzion, Linker, Cogan, & Shmulevich, 2004).
Variations between samples depend on the respective functional groups, and therefore, it was possible to construct chemometric models.

| Characterization of FTIR spectra of Halloumi cheese
In general, a spectrum obtained from a cheese can have the following characteristic bands, as shown in Figure […]. When the results of the measurements are observed, it is seen that the spectra of cow milk and goat-sheep milk yield absorption values within more or less the same areas. These particular areas, too, can be used for classifying the species' origin of milk based on the different relative intensities of absorption found for the respective milk types, as shown in Table 1.
The regions that differ between goat-sheep and cow products, summarized in Table 1, are also depicted in Figure 2 for Halloumi cheese samples. Between the two origins in Figure 2, clear differences can be seen at the bands (a-j) as follows:
• a: 3,450 cm⁻¹: it is related to -OH stretching in hydroxyl groups,
• b: 2,930-2,850 cm⁻¹: it is associated with C-H bending in fatty acids,
• c: 1,745 cm⁻¹: it is correlated with the degree of carboxyl methyl esterification of sugars,
• d: 1,683 cm⁻¹ (broad peak): it has been assigned to the carbonyl (C=O) stretching (amide I), and it may overlap with the broad and weak peak at 1,644 cm⁻¹, which is correlated with the nonremoved water (bending vibration),
• e: 1,548 cm⁻¹ and f: 1,453 cm⁻¹: they correspond to N-H bending with a contribution from C-N stretching (amide II). More specifically, small differences between the two spectra can be observed from amino acid side chain vibrations due to tyrosine at about 1,515 cm⁻¹, phenylalanine at about 1,498 cm⁻¹, proline at about 1,453 cm⁻¹, and sometimes a small absorption at 1,438 cm⁻¹,
• g: 1,397 cm⁻¹: it is associated with C-H bending of esters and aliphatic chains of fatty acids,
• h: 1,245-1,243 cm⁻¹: it corresponds to C-H bending and C-N stretching with a contribution from N-H bending (amide III),
• i: 1,168 cm⁻¹: it is due to -NH₂ deformation, and
• j: 1,100 cm⁻¹: it is assigned to COH bending and C-C stretching with a contribution from OH bending (Elbassbasi, Kzaiber, Ragno, & Oussama, 2010; Pappas et al., 2008).
[…] and it showed that only the subregions of 3,500-3,300 and 2,900-2,800 cm⁻¹, which correspond to -OH stretching in hydroxyl groups and C-H bending in fatty acids, respectively, should proceed to chemometric analysis. The fingerprint area at approximately 1,600-650 cm⁻¹ was expected to be significant according to the data in Table 1; however, the loading plot showed that it was not.

FIGURE 6 (a) Score scatter plot (t2/t1) from PCA modeling of the training set merged with the test set regarding species' origin of milk samples by using the subregions 3,500-3,300 and 2,900-2,800 cm⁻¹ from FTIR spectra. A = 2 components, R²X(cum) = 0.995, Q²(cum) = 0.994, (b) 3D presentation of the score scatter plot (t2/t1/Num) of (a)

FIGURE 7 Score scatter plot (t2/t1) from PCA modeling of the final database regarding species' origin of milk samples by using the subregions 3,500-3,300 and 2,900-2,800 cm⁻¹ from FTIR spectra. A = 2 components, R²X(cum) = 0.995, Q²(cum) = 0.994
After that, a training set was set up with the twelve representative […] show the importance of the model produced for the test set, as they are both close to 1.
After that, merging of the two sets (i.e., training and test sets) had to take place. Figure 6a […] Moreover, to validate the goodness of fit and the predictability of these results, a random permutation test with 100 permutations was employed, as seen in Figure 9. The criteria for validity that both […]

| Combination of the milk database with Halloumi cheese samples
The combination of the milk database with the Halloumi cheese samples was initially tested with PCA. A loading plot (not shown here) showed that only variables 1-390, which correspond to 1,150-400 cm⁻¹ in the FTIR spectra, should be included in the analysis. This subregion contains absorptions due to -NH₂ deformation, COH bending, and C-C stretching with contributions from OH bending, O=P-O (phosphate group stretching) covalently bound to casein proteins, C=O from polysaccharides, and C=C stretching of acids. As a result, the model in Figure 10 was constructed with validation values R²X(cum) = 0.974, R²Y(cum) = 0.686, and Q²(cum) = 0.659. In addition, Table 5 shows that three samples are outliers and that 97.03% of the samples were correctly classified into the two classes. Finally, removal of the outlier samples, that is, M19, H33, and H56, showed a clear improvement of the OPLS-DA model in Figure 11 regarding the classification outlined in Table 6. The difference R²X(cum) − Q²(cum) is 0.26 (less than 0.3), indicating that the model is good with high predictive ability. Table 7 shows a stable correct classification of the samples at 97%. In Figure 12, the ROC curve shows that a good model was constructed, with AUC equal to 0.999 for both classes.
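The two summary figures quoted above, the percentage of correct classification and the area under the ROC curve (AUC), can be computed from predicted class scores as in the following minimal Python sketch. It assumes that each sample has a continuous predicted score and a binary class label; the AUC is obtained from the pairwise (Mann-Whitney) definition rather than from SIMCA's own routine. The function names, the 0.45 decision threshold, and the six demonstration values are hypothetical and do not reproduce the data of this study.

```python
import numpy as np

def correct_classification_rate(true_labels, predicted_labels):
    """Fraction of samples whose predicted class equals the true class."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    return float((true_labels == predicted_labels).mean())

def roc_auc(scores, labels):
    """AUC as the probability that a positive sample outscores a negative one."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

# Hypothetical predicted scores for six samples (1 = class A, 0 = class B).
demo_labels = np.array([1, 1, 1, 0, 0, 0])
demo_scores = np.array([0.9, 0.8, 0.4, 0.5, 0.2, 0.1])

demo_predicted = (demo_scores > 0.45).astype(int)  # assumed decision threshold
rate = correct_classification_rate(demo_labels, demo_predicted)
auc = roc_auc(demo_scores, demo_labels)
```

An AUC close to 1 means that almost every sample of one class receives a higher score than every sample of the other class, which is independent of the particular threshold used for the classification table.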
A random permutation test with 100 permutations for the OPLS-DA model in Figure 11 took place and it is presented in Figure 13.
The criteria for validity include the following: all blue Q² values to the left are lower than the original points to the right, and the regression line of the Q² points intersects the vertical axis at, or below, zero. The R² values always demonstrate some degree of confidence, although when all R² points are lower than the original point to the right, this also indicates a valid model (Eriksson et al., 2006).
The traditional recipe of Halloumi cheese demands the use of goat and/or sheep milk; however, cow milk alone is used by the large dairy industry. The low production levels of goat and sheep milk reinforce this practice, as does the fact that cow milk is available all year round and has a lower cost (Borkova & Snaselova, 2005; Pellegrino, Cattaneo, Masotti, & Psathas, 2010). Therefore, it is important that this research study managed to develop a method capable of determining and identifying the adulteration in Halloumi cheese due to mislabeling regarding species' origin of milk.
FIGURE 12 ROC curve for the OPLS-DA model in Figure 11

FIGURE 13 Random permutation test with 100 permutations for the OPLS-DA model in Figure 11
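The logic of a permutation test such as the one in Figure 13 can be sketched as follows. This is an illustration under stated assumptions and not the SIMCA procedure: a one-component PLS regression on a 0/1 class vector stands in for OPLS-DA, Q² is estimated by seven-fold cross-validation, and the horizontal axis is taken as the absolute correlation between the permuted and the original class vector. All function names and the synthetic two-class data are hypothetical.

```python
import numpy as np

def pls1_fit(X, y):
    """Fit a one-component PLS regression (stand-in for OPLS-DA)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = Xc.T @ yc
    w = w / np.linalg.norm(w)      # weight vector of unit length
    t = Xc @ w                     # scores
    q = (t @ yc) / (t @ t)         # regression of y on the scores
    return x_mean, y_mean, w * q

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def r2_q2(X, y, n_folds=7):
    """R2 from the full fit and Q2 from interleaved k-fold cross-validation."""
    ss_tot = ((y - y.mean()) ** 2).sum()
    fitted = pls1_predict(pls1_fit(X, y), X)
    r2 = 1.0 - ((y - fitted) ** 2).sum() / ss_tot
    folds = np.arange(len(y)) % n_folds
    press = 0.0
    for k in range(n_folds):
        held_out = folds == k
        model = pls1_fit(X[~held_out], y[~held_out])
        press += ((y[held_out] - pls1_predict(model, X[held_out])) ** 2).sum()
    return r2, 1.0 - press / ss_tot

def permutation_test(X, y, n_perm=100, seed=0):
    """Refit the model on shuffled class labels and collect R2 and Q2."""
    rng = np.random.default_rng(seed)
    r2_0, q2_0 = r2_q2(X, y)
    corr, r2s, q2s = [1.0], [r2_0], [q2_0]
    for _ in range(n_perm):
        y_perm = rng.permutation(y)
        r2, q2 = r2_q2(X, y_perm)
        corr.append(abs(np.corrcoef(y, y_perm)[0, 1]))
        r2s.append(r2)
        q2s.append(q2)
    # Intercepts of the regression lines through the R2 and Q2 points.
    r2_intercept = np.polyfit(corr, r2s, 1)[1]
    q2_intercept = np.polyfit(corr, q2s, 1)[1]
    return {"r2_original": r2_0, "q2_original": q2_0,
            "r2_perm": np.array(r2s[1:]), "q2_perm": np.array(q2s[1:]),
            "r2_intercept": r2_intercept, "q2_intercept": q2_intercept}

# Synthetic two-class data: 40 samples, 30 variables, 5 of them informative.
rng_demo = np.random.default_rng(1)
y_demo = np.repeat([0.0, 1.0], 20)
X_perm_demo = rng_demo.normal(size=(40, 30))
X_perm_demo[y_demo == 1.0, :5] += 2.0
result = permutation_test(X_perm_demo, y_demo, n_perm=100, seed=2)
```

A model is judged against the criteria given above by comparing `q2_original` with the values in `q2_perm` and by inspecting `q2_intercept`; a model that only fits noise would give permuted Q² values similar to the original one.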

| CONCLUSIONS
The data enabled the differentiation of milk samples regarding species' origin. The classification was satisfactory although similar variables existed among the samples. The subregions at 3,500-3,300 and 2,900-2,800 cm⁻¹ were significant for predicting the species' origin of milk samples, and they correspond to -OH stretching in hydroxyl groups and C-H bending in fatty acids, respectively.
When the milk samples were combined with the Halloumi cheese samples, only the subregion 1,150-400 cm⁻¹ was important for discrimination regarding species' origin. This subregion contains absorptions due to -NH₂ deformation, COH bending, and C-C stretching with contributions from OH bending, O=P-O (phosphate group stretching) covalently bound to casein proteins, C=O from polysaccharides, and C=C stretching of acids.
The results augur well for the use of chemometric methods in the characterization of dairy products. Good results were obtained using both PCA and OPLS-DA. OPLS-DA appeared to have applicability for the chemometric analysis of the dairy products, since the total correct classification ability was very high as shown in misclassification tables and other validation tools. The FTIR spectroscopy technique coupled with chemometrics seems to be a powerful tool for controlling the authenticity of Halloumi.
The model constitutes an interesting approach and a preliminary work for future predictions of unknown Halloumi cheese samples regarding species' origin. In the future, this model will also be studied regarding geographical origin of the samples. The proposed methodology contributes to adding extra value to Cypriot traditional Halloumi cheese.

AUTHOR CONTRIBUTIONS
The experiments, including the chemometric analysis, and the first draft of this paper were carried out by MT. RK performed the first chemometric evaluation of the study. CRT edited and rewrote the first draft and produced the final paper. CRT also determined and refined the methodology, contributed to result interpretation, and provided overall supervision.

ETHICAL STATEMENT
There are no conflicts of interest regarding the work reported herein.
This study does not involve human or animal subjects.