Fraud detection and quality assessment of olive oil using ultrasound

Abstract Food safety is now recognized as one of the most important human priorities, and new policies have been implemented to strengthen regulation in the food industry. Extra virgin olive oil (EVOO) has many health benefits. Because of its nutritional value and high price, EVOO is frequently adulterated. Ultrasound has many advantages in food studies: it is fast and nondestructive for quality evaluation. In this study, four ultrasonic properties of oil were extracted at five levels of adulteration (5%, 10%, 20%, 35%, and 50%) to detect EVOO fraud. 2 MHz ultrasonic probes were used with the DOI 1,000 STARMANS diagnostic ultrasonic device in a “probe holding mechanism.” The four extracted ultrasonic features were “percentage of amplitude reduction, time of flight (TOF), the difference between the first and second maximum amplitudes of the domain (in the time–amplitude diagram), and the ratio of the first and second maximum of amplitude.” Seven classification algorithms, “Naïve Bayes, support vector machine, gradient boosting classifier, K‐nearest neighbors, artificial neural network, logistic regression, and AdaBoost,” were used to classify the preprocessed data. The Naïve Bayes algorithm provided the highest accuracy (90.2%), followed by the support vector machine and gradient boosting classifier (88.2% each).

In diagnostic ultrasonic systems, propagation of ultrasonic waves through air is difficult; air in the path of the waves attenuates them. In air-coupled ultrasonic systems, this is not a serious problem (Fathizadeh & Aboonajmi, 2017). Usually, there is an air gap between the ultrasonic probe and the test sample's container, so a coupling material is needed. The couplant can be oil (Meftah & Mohd Azimin, 2012), water (Haeggström & Luukkala, 2001; Zhao et al., 2003), and so on. In diagnostic ultrasonic tests, it is also important to keep the temperature of the test sample constant, because ultrasonic properties vary with temperature, and ignoring these changes can lead to errors in the results. If the temperature changes during the test, the results should be corrected by applying the appropriate coefficients. According to Equation (2), as the temperature decreases, the speed of the ultrasonic wave increases, and vice versa. Increasing the frequency of the ultrasonic waves also increases the wave velocity (McClements & Povey, 1992).
Diagnostic ultrasound has been widely used in food quality assessment, agriculture, livestock, fisheries, etc. This approach has been used in various studies, including detection of fruit ripening and harvest time (Mizrach, 2008; Morrison & Abeyratne, 2014), detection of foreign bodies in packaged foods (Zhao et al., 2003), and determination of egg quality (Aboonajmi et al., 2010). Research on fruit quality assessment may be performed on sliced fruit (destructive) or on whole fruit (nondestructive). In either case, the basis of the test is the same: it relies on changes in ultrasonic properties such as velocity and wave attenuation. Some of the studies on food quality assessment by ultrasonic technology are given in Table 1.
The most common approach for evaluating olive oil quality in food laboratories is chromatography, which is a destructive test. There are generally two types of chromatography: gas chromatography (GC) and high-performance liquid chromatography (HPLC). Chromatography is based on separating the sample (in this study, olive oil) into its components.
There are several methods for analyzing and classifying data, such as artificial neural network (ANN), gradient boosting classifier (GBC), support vector machine (SVM), K-nearest neighbors (KNN), blockchain, and many other models. These models are widely applied in food quality evaluation and classification. Whereas most of the well-established methods of olive oil quality assessment, such as GC, are destructive and slow, the presented method is nondestructive, fast, and easy to use. The ultrasound parameters of each material are unique and act as a fingerprint of that material, so this study used the diagnostic ultrasonic method for olive oil quality evaluation. Several different classification models were also used for analysis and classification, so determining the best classification method is another aim of the study.

| MATERIALS AND METHODS
In this study, a

| Preparation of samples
The EVOOs used in this study were obtained from a well-known factory in a town named Lowshan in the North of Iran. The oils were extracted by the cold press method. Fatty acid profiles were determined in a food laboratory with an "Agilent Technologies (Model 7890B)" gas chromatography system. Gas chromatography (GC) is a method for detecting and separating the volatiles of a substance, usually a liquid or gas. The EVOOs were mixed with common frying oils sold in markets (sunflower oil, canola oil, and corn oil) to make 6 classes, with adulteration levels of 5%, 10%, 20%, 35%, and 50% by mass.
For each fraud class, four samples of 100 g net weight were prepared, and the tests on each sample were repeated seven times, so a total of 28 tests were done for each class. As mentioned above, three important factors that affect the quality of olive oil are light, temperature, and oxygen. Therefore, the samples were kept in a dark place at a temperature below 25°C. To minimize oxidation, the amount of oxygen in the sample container must be minimized, so the sample containers were filled completely.
The profile of fatty acids is shown in Table 2.
The results of chromatography experiments on the samples showed that the percentage of some fatty acids such as oleic acid (C18:1) and palmitic acid (C16:0) decreased by increasing the percentage of EVOO in the blend, which was not unexpected.

| The guide of probe system
A thin (one millimeter) glass was used to make the oil sample container. Based on the dimensions of the transmitter and receiver probes of the ultrasonic system (the outer diameter of the probe in the contact zone with the sample container was about 30 mm), a small container for holding the oil sample between the two probes was designed and manufactured according to Figure 3. According to item 1 in Figure 2b, the distance between the transmitter and receiver probes was 12 mm; after subtracting the thickness of the two glass walls, the wave travels just ten millimeters in the oil sample. As ultrasonic waves pass from one material to another, they continue directly or indirectly, or are reflected or refracted at a certain angle. In this study, as shown in Figure 4, the wave path includes several materials, so there are several wave reflections and multiple peaks in the time-amplitude diagram.
Wave reflection takes place whenever the ultrasonic wave passes from one material to another (that is, from the transmitter probe to glass, from glass to oil, from oil to glass, and from glass to the receiver probe).
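The path geometry described above lends itself to a short worked example. The following is a minimal sketch, not the authors' code: it derives the 10 mm oil path from the 12 mm probe spacing and 1 mm glass walls given in the text, and converts a time of flight into a mean wave speed. The function names and the 6.8 µs time of flight are hypothetical, and the transit time through the glass is ignored.

```python
# Geometry taken from the text: 12 mm probe spacing, 1 mm glass walls.
PROBE_SPACING_MM = 12.0
GLASS_THICKNESS_MM = 1.0


def oil_path_length_mm(spacing_mm=PROBE_SPACING_MM, wall_mm=GLASS_THICKNESS_MM):
    """Distance travelled in oil: probe spacing minus the two glass walls."""
    return spacing_mm - 2 * wall_mm


def wave_speed_m_per_s(path_mm, tof_us):
    """Mean speed over the path; mm/us equals km/s, so scale by 1000 for m/s."""
    if tof_us <= 0:
        raise ValueError("time of flight must be positive")
    return path_mm / tof_us * 1000.0


path = oil_path_length_mm()
# Hypothetical time of flight in the oil only (not a value measured in the study).
speed = wave_speed_m_per_s(path, tof_us=6.8)
```

Because the path length is fixed by the container, any change in time of flight between samples maps directly to a change in wave speed.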
Combining low-value, low-price oils with extra virgin olive oils changes physical properties such as density and homogeneity, and has a direct effect on velocity, attenuation, and refraction coefficient. A diagnostic ultrasonic system combined with a machine learning system can be readily used to detect various purities of extra virgin olive oil.
In this study, the preprocessing and data classification code was written and run in Python 3.7.
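As an illustration of how the four ultrasonic features could be computed from a sampled time-amplitude trace, here is a minimal sketch. It is not the study's preprocessing code. The feature definitions are assumptions: peaks are taken as the first two local maxima above a threshold, the time of flight is the time of the first peak, and the amplitude reduction is measured against a known transmitted amplitude. All names and the toy trace are hypothetical.

```python
def local_peaks(signal, threshold):
    """Indices of local maxima whose amplitude exceeds `threshold`."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]


def extract_features(signal, dt_us, transmitted_amplitude, threshold):
    """Compute the four features from one trace (assumed definitions)."""
    peaks = local_peaks(signal, threshold)
    if len(peaks) < 2:
        raise ValueError("need at least two peaks above the threshold")
    first, second = peaks[0], peaks[1]
    a1, a2 = signal[first], signal[second]
    return {
        # Percentage loss between transmitted and first received amplitude.
        "amplitude_reduction_pct": 100.0 * (transmitted_amplitude - a1)
        / transmitted_amplitude,
        # Arrival time of the first peak.
        "tof_us": first * dt_us,
        # Difference and ratio of the first two peaks.
        "peak_difference": a1 - a2,
        "peak_ratio": a1 / a2,
    }


# Toy trace sampled every 0.5 us (illustrative values only).
trace = [0.0, 0.1, 0.8, 0.2, 0.0, 0.1, 0.5, 0.1, 0.0]
features = extract_features(
    trace, dt_us=0.5, transmitted_amplitude=1.0, threshold=0.3
)
```

The threshold keeps small noise ripples from being counted as peaks; its value would need tuning for real signals.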

| RESULTS AND DISCUSSION
Before modeling the data with most classification algorithms (such as neural networks, SVM, and KNN), it is necessary to normalize the data; with raw data, the features contribute unequally to the model, which is undesirable. Figure 5 shows the effect of normalization on the data distribution. Normalization is especially important when the features are on different scales. To evaluate the data dispersion, all the data (four ultrasound features) were analyzed with a box plot and a histogram. These diagrams showed that the data contain a considerable number of outliers. The charts also indicate that the data are more widely dispersed than normal and need preprocessing. Preprocessing improves the accuracy of the results, and sometimes the data cannot be analyzed without it.
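The normalization and outlier inspection described above can be sketched as follows. The text does not say which normalization was applied, so min-max scaling is an assumption here, and the 1.5 × IQR rule is simply the usual box-plot convention for flagging outliers. The function names and sample values are hypothetical.

```python
import statistics


def min_max_scale(values):
    """Rescale one feature to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def iqr_outliers(values, k=1.5):
    """Values outside the box-plot whiskers (k times the interquartile range)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical time-of-flight readings in microseconds, one of them suspicious.
tof_us = [6.7, 6.8, 6.8, 6.9, 7.0, 9.5]
flagged = iqr_outliers(tof_us)
scaled = min_max_scale(tof_us)
```

Note that min-max scaling is itself sensitive to outliers, so it may be preferable to inspect or handle flagged values before scaling.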
In multivariate statistical analysis, there are different methods for measuring the dependence or relationship between two random variables. The correlation between two variables indicates how well the value of one can be predicted from the other. Table 3 shows the correlation matrix between the features.
Correlation matrix values range from −1 (maximum inverse correlation) to +1 (maximum correlation). A correlation coefficient of zero indicates that the two parameters are not linearly related.
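A correlation matrix of this kind can be computed with the Pearson coefficient. The sketch below is a plain-Python illustration under that assumption (the text does not name the coefficient used); the column names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict of named feature columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical feature columns (illustrative values only).
data = {
    "tof_us": [6.7, 6.8, 6.9, 7.0],
    "peak_ratio": [1.9, 1.7, 1.5, 1.3],
}
matrix = correlation_matrix(data)
```

The diagonal of the matrix is always 1, and the matrix is symmetric, so only the upper or lower triangle carries information.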
The relationship between the ultrasonic properties and the desired class (correlation coefficient) was also compared (Figure 6).
According to this diagram, the "amplitude reduction from the first peak to the second peak in the time-amplitude diagram" and the "percentage of amplitude reduction from the transmitter to the receiver" showed the highest correlation with the class.

FIGURE 7
Principal component analysis

FIGURE 8
Treatment of the "difference between the first and second peak of amplitude" feature with change in fraud
According to Table 4, the "amplitude reduction from the first peak to the second peak in time-amplitude diagram" was the most important feature, at 32.22%. The attenuation coefficient of the oil is obtained from this feature. "The ratio of the first peak to the second peak of amplitude" had the least effect on classification, at 18.15%. The trend of "the difference between the first and second maximum amplitudes of the domain" with changes in oil adulteration is given in Figure 8. Increasing the percentage of fraud oil in the sample increases the difference between the values of the first two peaks; in other words, a higher percentage of fraud oil leads to lower amplitude and higher signal attenuation.
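One common way to obtain an attenuation coefficient from two peak amplitudes is the logarithmic amplitude ratio divided by the extra distance travelled between the peaks. The sketch below illustrates that relation. It is an assumption that this is the calculation meant here. It also assumes that the second peak is an echo that made one additional round trip through the 10 mm oil path (20 mm extra), and it neglects reflection losses at the glass interfaces. The amplitudes are hypothetical.

```python
import math


def attenuation_np_per_m(first_peak, second_peak, extra_path_m):
    """Attenuation coefficient (Np/m) from two peaks separated by extra_path_m."""
    if first_peak <= 0 or second_peak <= 0 or extra_path_m <= 0:
        raise ValueError("amplitudes and path length must be positive")
    return math.log(first_peak / second_peak) / extra_path_m


def nepers_to_decibels(value_np):
    """Convert nepers to decibels (1 Np = 20 / ln(10) dB)."""
    return value_np * 20.0 / math.log(10.0)


# Hypothetical amplitudes; 0.020 m assumes one extra round trip in 10 mm of oil.
alpha_np = attenuation_np_per_m(0.8, 0.5, extra_path_m=0.020)
alpha_db = nepers_to_decibels(alpha_np)
```

Under these assumptions, a larger drop from the first to the second peak gives a larger coefficient, which matches the trend described for higher adulteration levels.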

| Classification algorithm
To validate classification algorithms, the data must be divided into two sets, "train data" and "test data." This requires a sufficient amount of data; with too few data, the results will not be valid.
In this study, 70% of the data were assigned to train data and 30% to test data. These percentages were determined empirically.
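The 70/30 split can be sketched in a few lines. This is an illustrative sketch rather than the authors' code. It assumes that all 6 classes × 4 samples × 7 repetitions (168 measurements) entered the split, which the text does not state explicitly, and it rounds the test-set size up. The function name and seed are hypothetical.

```python
import math
import random


def train_test_split_indices(n_samples, test_fraction=0.30, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = math.ceil(test_fraction * n_samples)
    return indices[n_test:], indices[:n_test]


# Assumed data count: 6 classes x 4 samples x 7 repetitions.
n_total = 6 * 4 * 7
train_idx, test_idx = train_test_split_indices(n_total)
```

With this rounding the test set would hold 51 measurements. Accuracies of 46/51 and 45/51 round to 90.2% and 88.2%, which is consistent with the reported figures, although the source does not give the test-set size.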

FIGURE 9
Classification results with 7 different models

and 74.51%, respectively). These methods might show higher accuracy with a larger quantity of data.
The results of the data classification with the models mentioned above are given in Figure 9.
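To make the classification step concrete, the following is a minimal Gaussian Naïve Bayes sketch in plain Python. It is not the implementation used in the study, and the Gaussian variant is itself an assumption, since the text does not say which Naïve Bayes variant was applied. The two-feature toy data, class labels, and function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate per-class prior, feature means, and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(params):
        prior, means, variances = params
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
        return math.log(prior) + log_likelihood

    return max(model, key=lambda label: log_posterior(model[label]))


def accuracy(model, X, y):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(predict_gaussian_nb(model, row) == label for row, label in zip(X, y))
    return hits / len(y)


# Toy normalized features for two adulteration classes (illustrative only).
X_train = [[0.10, 0.20], [0.15, 0.25], [0.12, 0.22],
           [0.80, 0.90], [0.85, 0.95], [0.82, 0.88]]
y_train = [0, 0, 0, 50, 50, 50]
model = fit_gaussian_nb(X_train, y_train)
test_accuracy = accuracy(model, [[0.11, 0.21], [0.83, 0.91]], [0, 50])
```

Working in log space avoids numerical underflow when many small likelihoods are multiplied, which matters more as the number of features grows.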

| CONCLUSION
The results showed that among the seven different classification

ACKNOWLEDGMENT
We would like to thank Dr.