Artificial Q‐Grader: Machine Learning‐Enabled Intelligent Olfactory and Gustatory Sensing System

Abstract Portable and personalized artificial intelligence (AI)‐driven sensors that mimic the human olfactory and gustatory systems have immense potential for the large‐scale deployment of Internet of Things (IoT) devices and autonomous monitoring systems. In this study, an artificial Q‐grader is developed that uses surface‐engineered zinc oxide (ZnO) thin films as the artificial nose and tongue and AI‐based statistical data analysis as the artificial brain to identify both aroma and flavor chemicals in coffee beans. A poly(vinylidene fluoride‐co‐hexafluoropropylene)/ZnO thin film transistor (TFT)‐based liquid sensor serves as the artificial tongue, and an Au, Ag, or Pd nanoparticle/ZnO nanohybrid gas sensor serves as the artificial nose. To classify the flavor of coffee beans (acetic acid (sourness), ethyl butyrate and 2‐furanmethanol (sweetness), and caffeine (bitterness)) and their origin (Papua New Guinea, Brazil, Ethiopia, and Colombia (decaffeinated)), a rational combination of TFT transfer and dynamic response curves is used to capture the liquid‐ and gas‐dependent electrical transport behavior, and principal component analysis (PCA)‐assisted machine learning (ML) is implemented. The PCA‐assisted ML model distinguishes the four target flavors with >92% prediction accuracy, and an ML‐based regression model predicts the flavor chemical concentrations with >99% accuracy. The classification model also distinguishes the four types of coffee beans with 100% accuracy.


Visualization of decision boundary map for classification on PC spaces.
In explainable (or interpretable) ML, a methodology is needed that conveys intuitively why a given decision was made, rather than simply reporting scores. Along these lines, visualizing ML results makes it possible to compare, analyze, and observe the unique characteristics of the various types of training models. Principal component analysis (PCA) has been widely adopted not only as a common unsupervised ML classifier but also for data compression by dimensionality reduction. In this study, by utilizing principal

$R^2$ can be calculated using the residual scatter from the fitting line (black dashed line) as follows:

$$R^2 = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2}$$

where $y$ is the real value, $\hat{y}$ is the predicted value, and $\bar{y}$ is the mean of the real value at the specified corresponding analyte concentration. Here, $y - \hat{y}$ is the residual scatter of the ML prediction from the real value, and $y - \bar{y}$ describes the total variance. In general, because a higher $R^2$ indicates a model that fits the real values well, the accuracy of the training model can be estimated using the $R^2$ value. The ML results exhibit high accuracy with the Gaussian and neural network regression models. The scored accuracies are summarized in Supporting Information Table S4.

Table S4. Prediction accuracy of training models for the concentration of acetic acid (Fig. S10), ethyl butyrate (Fig. S11), caffeine (Fig. S12), and 2-furanmethanol (Fig. S13).
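As a minimal sketch of how such an R^2 score can be evaluated, the function below computes one minus the ratio of the summed squared residuals to the summed squared deviations from the mean. It assumes that the mean is taken over all real values passed in; the function name `r_squared` and the sample numbers are illustrative only and are not data from this work.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    y_mean = sum(y_true) / len(y_true)
    # Residual scatter: real value minus predicted value
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))
    # Total scatter: real value minus mean of the real values
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all real values are identical")
    return 1.0 - ss_res / ss_tot


# Synthetic illustration only (assumed values, not measurements)
y_real = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
score = r_squared(y_real, y_hat)
```

A score close to 1 means that the residual scatter is small compared with the total scatter of the real values.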
Similar to the decision boundary map for the classification models, the regression surface of the predicted concentration can be displayed in the two-dimensional principal-component (PC) space for each training model (Supporting Information Figures S10-S13). From the regression surface map, we can readily ascertain that the predicted concentrations for untrained data also fall within the corresponding concentration boundaries.
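The idea of a regression surface over two PC coordinates can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this work: the data are synthetic, PCA is done by a plain SVD, and an ordinary linear least-squares fit stands in for the regression models compared in the study. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for sensor data: 60 samples x 21 parameters (assumed).
n_samples, n_features = 60, 21
log_conc = rng.uniform(-7.0, -6.0, size=n_samples)  # hypothetical log10(M) targets
loadings = np.linspace(0.5, 1.5, n_features)        # assumed parameter sensitivities
X = np.outer(log_conc, loadings) + 0.01 * rng.normal(size=(n_samples, n_features))

# PCA via SVD of the mean-centred data; keep the first two PCs.
X_mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - X_mean, full_matrices=False)
scores = (X - X_mean) @ vt[:2].T                    # shape (60, 2)

# Linear least-squares fit of the target on the PC scores plus an intercept.
A = np.column_stack([scores, np.ones(n_samples)])
coef, *_ = np.linalg.lstsq(A, log_conc, rcond=None)

# Evaluate the fitted model on a grid of PC coordinates: the regression surface.
pc1 = np.linspace(scores[:, 0].min(), scores[:, 0].max(), 50)
pc2 = np.linspace(scores[:, 1].min(), scores[:, 1].max(), 50)
g1, g2 = np.meshgrid(pc1, pc2)
surface = coef[0] * g1 + coef[1] * g2 + coef[2]     # shape (50, 50)

# R^2 of the fit on the training samples.
pred = A @ coef
r2 = 1.0 - np.sum((log_conc - pred) ** 2) / np.sum((log_conc - log_conc.mean()) ** 2)
```

The `surface` array can be drawn as a filled contour map over (PC1, PC2), with held-out samples overlaid to check which predicted concentration band they fall into.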

Supplementary Note 3.
Except for the tree classifier, all models show 100% prediction accuracy. The lower accuracy of the tree classifier is likely due to its simplified algorithm for determining decision boundaries, which can be further optimized for better performance.
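Prediction accuracy of this kind is typically read off a confusion matrix. The sketch below shows one way to build such a matrix and derive the accuracy from its diagonal; the helper names and the labels are hypothetical and are not results from this work.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Illustrative labels only: two samples per class, one misclassified sample.
classes = ["A", "B", "C", "D"]
y_true = ["A", "A", "B", "B", "C", "C", "D", "D"]
y_pred = ["A", "A", "B", "C", "C", "C", "D", "D"]
cm = confusion_matrix(y_true, y_pred, classes)
acc = accuracy(cm)
```

A model with 100% prediction accuracy has nonzero entries only on the diagonal of this matrix.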

Figure S4. F 1s, C 1s, O 1s, and Zn 2p core-level XPS spectra of ZnO thin films and PVDF-

Figure S5. Transfer characteristics of surface-engineered TFT-based sensors at initial state

Figure S6. Transfer characteristics of surface-engineered TFT-based sensors before and after

Figure S7. Confusion matrix of modulated classifiers such as decision tree model, linear

Figure S8. Classification boundary map with configured training model with strongly

Figure S9. Summarized accuracy of altered training model by utilizing the original 21-

component analysis (PCA), a data pretreatment that reduces the classes expressed by 21 TFT-characteristic parameters to two PC coordinates is performed as an intermediate step between the data acquisition process and ML (classification and regression). By applying this PCA pretreatment, we obtained three nontrivial results beyond simply demonstrating the accuracy values of the ML training models: i) the decision boundary map can be expressed as a function of two-dimensional PC coordinates (Figure 3g and Supporting Information Figures S8 and S10-S13); ii) the inherent and unique characteristics of the various training models can be compared in the visualized decision boundary maps (Supporting Information Figures S8 and S10-S13).
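A PCA pretreatment followed by a decision boundary map can be sketched as below. This is an illustration under assumptions, not the workflow of this study: the 21-parameter data are synthetic, PCA is done by a plain SVD, and a nearest-centroid rule is swapped in as the simplest possible classifier in place of the training models compared here. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: 4 classes x 30 samples, each with 21 parameters (assumed).
n_classes, n_per_class, n_features = 4, 30, 21
latent = np.array([[-3.0, -3.0], [-3.0, 3.0], [3.0, -3.0], [3.0, 3.0]])
mixing = np.zeros((2, n_features))
mixing[0, :10] = 1.0   # first latent axis drives parameters 0-9
mixing[1, 10:] = 1.0   # second latent axis drives parameters 10-20
y = np.repeat(np.arange(n_classes), n_per_class)
X = latent[y] @ mixing + 0.3 * rng.normal(size=(y.size, n_features))

# PCA pretreatment: project the 21 parameters onto two PC coordinates.
X_mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - X_mean, full_matrices=False)
scores = (X - X_mean) @ vt[:2].T

# Nearest-centroid classifier in PC space (a stand-in model).
centroids = np.array([scores[y == k].mean(axis=0) for k in range(n_classes)])


def predict(points):
    """Assign each 2-D point to the class with the closest centroid."""
    dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)


# Decision boundary map: predicted class at every node of a PC1-PC2 grid.
g1, g2 = np.meshgrid(
    np.linspace(scores[:, 0].min(), scores[:, 0].max(), 200),
    np.linspace(scores[:, 1].min(), scores[:, 1].max(), 200),
)
grid = np.column_stack([g1.ravel(), g2.ravel()])
boundary_map = predict(grid).reshape(g1.shape)

train_accuracy = float(np.mean(predict(scores) == y))
```

Coloring `boundary_map` by class and overlaying the PC scores gives a picture of where each model places its boundaries, which is what allows different training models to be compared visually rather than by score alone.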

Figure S10. Regression results for acetic acid. (a-f) Relationships of predicted analyte concentration vs. real analyte concentration for 10⁻⁶ to 10⁻⁷ M of acetic acid by applying linear

Figure S11. Regression results for ethyl butyrate. (a-f) Relationships of predicted analyte concentration vs. real analyte concentration for 10⁻⁶ to 10⁻⁷ M of ethyl butyrate by applying linear

Figure S12. Regression results for caffeine. (a-f) Relationships of predicted analyte concentration vs. real analyte concentration for 10⁻⁶ to 10⁻⁷ M of caffeine by applying linear

Figure S16. Chemical identification of pristine ZnO and metal-functionalized ZnO by XPS.

Figure S17. Photographs of heated coffee beans at 150 °C with altering time of (a) 0, (b) 30,

Figure S18. (a) Extracted gas response of pristine ZnO and metal NPs hybridized ZnO for Ethiopia coffee beans at 250 °C. (b) Representative constant (Rmax − Rsat, Rsat − Rmin) for over

Figure S19. Parallel coordinate plot of five-representative PC scores for each ZnO-based

Figure S20. Double-PCA results. (a) PCA scatter plot for four-different types of coffee vapors.

Figure S21. Classification boundary map of four-different coffee vapors with varied training

Figure S22. Confusion matrix of varied classifiers using double-PCA data manipulation such

Figure S23. Single PCA results. (a) PCA scatter plot for four-different coffee vapors with initial

Figure S24. ML results using initial-PCA feature data. Classification boundary map of four-

Figure S25. Confusion matrix of varied classifiers using initial-PCA data manipulation such

Figure S26. Summarized prediction accuracy of varied training model to compare with data

Figure S27. Comparison of the explained variance plot (left) for single-PCA process from 44-

Figure S28. Schematic diagram of the coffee aroma measurement system, including the gas

Table S5. Summarized gas responses acquired from ZnO, Au NPs-ZnO, Ag NPs-ZnO, and Pd NPs-ZnO based gas sensors for four types of coffee beans.