Mildew detection in rice grains based on computer vision and the YOLO convolutional neural network

Abstract
At present, detection methods for rice microbial indicators are usually based on microbial culture or sensory detection methods, which are time‐consuming or require expertise and thus cannot meet the needs of on‐site rice testing when the rice is taken out of storage or traded. In order to develop a fast and non‐destructive method for detecting rice mildew, in this paper, micro‐computer vision technology is used to collect images of mildewed rice samples from 9 image locations. Then, a YOLO‐v5 convolutional neural network model is used to detect moldy areas of rice, and the mold coverage area is estimated. The relationship between the moldy areas and the total number of bacterial colonies in the image is obtained. The results show that the precision and the recall of the established YOLO‐v5 model in identifying the mildewed areas of rice in the validation set were 82.1% and 86.5%, respectively. Based on the mean mildewed area identified by the YOLO‐v5 model, the precision and recall for light mold detection were 100% and 95.3%, respectively. The proposed method based on micro‐computer vision and the YOLO convolutional neural network can be applied to the rapid detection of mildew in rice taken out of storage or traded.

| 861   SUN et al.
small samples (Gong et al., 2021). Under these circumstances, rice mildew is relatively easy to miss, and thus more convenient, fast, and accurate detection methods are needed, which would facilitate the testing of larger samples.
Non-destructive testing methods based on optical and volatile gas signals have the advantages of being fast and requiring no preprocessing (Chen et al., 2021; Wen et al., 2022). Using such methods, significant changes in color, odor, morphology, and texture can be observed in normal and severely mildewed cereals (Wan et al., 2019).
In recent years, a series of methods have emerged that utilize odor sensors, spectral technology, hyperspectral imaging technology, and computer vision in combination with chemometrics to identify and detect moldy grains. These methods have outstanding real-time performance and have high application value in grain quality evaluation.
In their work using odor sensors, Shen et al. (2016) used electronic nose technology combined with linear and least square discriminant analysis models to detect cereal samples infected with Aspergillus flavus, Aspergillus parasiticus, and Penicillium, with the overall detection rates of the two models reaching 100% and 97.4%, respectively. Jiarpinijnun et al. (2020) achieved early detection of fungal infection in brown rice using electronic nose equipment combined with principal component analysis, partial least squares regression models, and support vector machine models. Wang et al. (2021) used colorimetric sensor technology and a fivefold cross-validation method to determine the optimal parameters of a qualitative identification model for wheat mildew and the number of principal components (PCs). Using a support vector machine recognition model, they achieved a 100% correct recognition rate for independent wheat samples. However, the detection process involving odor sensors requires long time intervals for volatile molecule adsorption and deionization and is thus not suitable for rapid testing.
In their work using spectroscopy and hyperspectral imaging, Cong et al. (2018) achieved non-destructive detection of the total number of mold colonies in rice using a model combining gray wolf optimization and support vector regression. Jiang et al. (2018)

However, conventional computer vision methods are unable to detect rice samples with mild mildew contamination, and it is difficult to detect mildly mildewed grains with a total viable bacterial count (TVC) of 10^5-10^6 CFU/g using methods at conventional observational scales (Sun et al., 2022; Zhou et al., 2008). Sun et al. (2022) used micro-computer vision (MCV) technology to improve the sensitivity to mildew features and used an advanced YOLO CNN to recognize the location of A. niger, Penicillium orange, and Aspergillus griseus on single rice grains with accuracies of 89.26%, 91.15%, and 90.19%, respectively. Conventional computer vision methods can directly detect obvious bacterial colonies in mildewed rice images; however, bacterial colonies of mildly mildewed rice are small, and MCV techniques are required for feature extraction (Barsanti et al., 2022; Sun et al., 2022). At present, detection methods for rice mold based on MCV only realize the detection of a single kind of mold on isolated rice grains. Detection methods suitable for multiple grain analysis and multiple mold types have not been presented.
Smaller observation scales also contain more complex image information, and powerful recognition models are required to recognize images acquired using MCV. The YOLO (You Only Look Once) CNN is an advanced deep learning model with strong image analysis and target detection capabilities and can locate multiple targets in images (Redmon et al., 2016). Compared with earlier networks such as LeNet and AlexNet, the YOLO model has a deeper network structure, and its recognition speed is higher than that of R-CNN and Fast R-CNN; the detection time of targets in images is usually less than 0.1 s. It has been widely used in the fields of face recognition, disease and pest monitoring, transportation, and medical imaging, but its adoption in the field of microbiology is still limited (Cao et al., 2019; Chen et al., 2021; Jiang et al., 2021; Li et al., 2022). The YOLO CNN model is suitable for the recognition of mold in complex microscopic rice images. YOLO-v5 is the fifth generation of the YOLO model, and its training and recognition speeds and model size are improved compared with those of previous generations; thus, it is currently the most commonly used version of YOLO.
In this paper, in order to promote the practical application of MCV in rice mold detection, rice samples were loaded into Petri dishes, and images of moldy rice were captured using MCV combined with a 9-point acquisition method. Then, a YOLO-v5 model was established for detecting mildew areas in images, and its accuracy was verified. Finally, the proportion of mildew areas in each sample image was obtained using this model, and correlation analysis was conducted with the actual total number of colonies to determine the proportion threshold of mildewed areas in mildly mildewed samples. The complete detection method allows the detection of mild rice mildew using MCV and the YOLO model.

| Simulated storage of rice after inoculation
In this study, we selected a rice variety that is naturally moldy in the field (indica rice, purchased from Huolonggang Town, Wuhu City, Anhui Province, China) as the research object. The steps for mold inoculation and simulated storage are as follows:
1. The rice sample was placed in an oven and baked at 80°C for 4 h to remove the original field mold that resided on the rice grains. The dried rice grains were placed in a set of 60 mm circular culture dishes (10 g of rice in each culture dish).
2. The naturally moldy rice was washed with sterile distilled water to prepare a spore suspension. Polymerase chain reaction (in vitro amplification) testing of the suspension showed that the mold on the surface of the rice came from Aspergillus and Fusarium. The spore suspension was inoculated in potato glucose agar. The resulting culture dish was placed in an incubator with constant temperature and humidity (temperature 28°C, relative humidity 90%) for 24-36 h. The concentration of the spore suspension was measured using a plate counting method, and the suspension was then diluted to 1.5 × 10^5 CFU/mL.
3. Seventy Petri dishes containing 10 g of rice grains were prepared, and 1 mL of spore suspension was inoculated into each culture dish, which was shaken well to fully impregnate each grain of rice. Then, the dishes were stored under constant temperature and humidity conditions of 28°C and 90% relative humidity for 13 days, and a group of samples (10 Petri dishes) was inspected every 48 h to determine when a high degree of mold formation had been reached, i.e., when the mold content per gram of grain was greater than 10^6 CFU/g.
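As a rough check on the nominal inoculation level implied by step 3, the arithmetic can be written out as below. This is only a back-of-the-envelope estimate: it assumes that the spores are distributed evenly over the 10 g of rice and that no viable field mold survives the baking step.

```latex
\[
\frac{1\ \text{mL} \times 1.5 \times 10^{5}\ \text{CFU/mL}}{10\ \text{g}}
  = 1.5 \times 10^{4}\ \text{CFU/g}
\]
```

Under these assumptions, the starting load is roughly two orders of magnitude below the 10^6 CFU/g level used in step 3 to mark a high degree of mold formation.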

| Microscopic image acquisition of rice samples
In this study, a microscope lens combined with a Dahua A7A20MU30 color array industrial camera (purchased from Hangzhou Huarui Technology Co., Ltd., China) was used to collect MCV images of moldy rice at 9 points in each culture dish. In this process, it is important to ensure that the 9 sample points are evenly distributed and do not overlap each other, so that all details are clearly captured. Thus, 9 MCV images were obtained for each culture dish (10 g of rice) (as shown in Figure 1), with a captured image size of 2560 × 1876 pixels and a spatial resolution of about 0.02 mm per pixel. During simulated storage, a group of samples (10 Petri dishes) was taken out approximately every 48 h for image collection, and 90 images were obtained each time. A total of 630 microscopic images of moldy rice were obtained over 13 days.

| Image labeling
The YOLO CNN is a deep learning algorithm used for target detection and thus requires manual labeling of the target areas. In order to highlight the mildewed (i.e., target) areas in the MCV rice images, each image was divided into four parts, resulting in a resolution of 1280 × 938 pixels for each part. As shown in Figure 2, the LabelImg image marking tool developed in Python was used to mark mildewed areas in all MCV images. In order to extract the mildew areas more accurately, sporangia were selected as the main recognition object, as they have a certain optical structure and area layout in images.
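The division of each captured image into four parts can be sketched as follows. This is a minimal illustration that assumes a 2 × 2 grid, which is the layout implied by a 2560 × 1876 image yielding 1280 × 938 parts; the function name `quadrant_boxes` is hypothetical and is not taken from the original work.

```python
def quadrant_boxes(width: int, height: int) -> list[tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) pixel boxes for the four equal tiles of an image."""
    if width % 2 or height % 2:
        raise ValueError("width and height must both be even to split into equal tiles")
    half_w, half_h = width // 2, height // 2
    return [
        (0, 0, half_w, half_h),           # top-left tile
        (half_w, 0, width, half_h),       # top-right tile
        (0, half_h, half_w, height),      # bottom-left tile
        (half_w, half_h, width, height),  # bottom-right tile
    ]


# Capture size reported in the text: 2560 x 1876 pixels.
boxes = quadrant_boxes(2560, 1876)
tile_sizes = [(right - left, bottom - top) for left, top, right, bottom in boxes]
print(tile_sizes)  # every tile is 1280 x 938 pixels
```

Boxes in this (left, top, right, bottom) order can be passed directly to a cropping routine such as Pillow's `Image.crop`.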

| Establishment of the YOLO-v5 model
The fifth-generation YOLO model (YOLO-v5) was used to establish a model for identifying moldy rice areas. The architecture and

where tp and fp are the numbers of true and false positives, respectively.
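For reference, the standard definitions of precision and recall, which are assumed here to be the ones used, can be written as follows. The symbol fn (the number of false negatives) is introduced for this sketch and is not defined in the text above.

```latex
\[
\text{Precision} = \frac{tp}{tp + fp},
\qquad
\text{Recall} = \frac{tp}{tp + fn}
\]
```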

| Analysis of the relationship between the area of mildewed rice areas and the total number of bacterial colonies
In order to obtain a more accurate relationship between the mildewed rice area and the total number of bacterial colonies, it is necessary to repeat experiments on rice inoculation and simulated storage. In this experiment, a group of samples (10 Petri dishes) was extracted every 48 h during the simulated storage period of 13 days to determine the TVC. As the current detection target for moldy rice refers to the brown rice portion after hulling, half of the rice was hulled. The purpose of this experiment was to evaluate whether the degree of unhulled rice mildew can be used to estimate the degree of mildew in brown rice in actual production processes by observing the fitting degree of the TVC curves of the two samples. Therefore, the rice grains in each culture dish (containing 10 g of rice) were roughly divided into two parts by mass. After weighing and recording the mass data, they were poured into test tubes. One portion was kept in its unhulled state, while the other was hulled using an inspection huller (Rizhao Liang'an Storage Equipment Co., Ltd.) to obtain a certain amount of brown rice.
Next, 10 mL of sterile distilled water was added to each of the 20 test tubes, and the contents were mixed and vibrated using an SK-1 rotary mixer (Jiangsu Jintan Yitong Electronics Co., Ltd.) to fully elute the mold spores on the unhulled and brown rice grain surfaces, and unhulled and brown rice spore suspensions were obtained. On the first day, after diluting the suspension 10 times, 0.5 mL was extracted and inoculated into potato glucose agar supplemented with chloramphenicol. The resulting 20 Petri dishes (90 mm circular Petri dishes) were placed in a constant temperature and humidity incubator (temperature 28°C, relative humidity 90%) for 36 h, and the total number of unhulled and brown rice colonies was calculated using the colony plate counting method.
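The conversion from a plate count to a per-gram TVC can be sketched as below. It follows the relation stated for Equation (3), i.e., colonies per dish multiplied by the dilution ratio and divided by the corresponding grain mass. The function name, parameter names, and example numbers are hypothetical, and `dilution_ratio` is assumed to already include any volume corrections (eluent volume, plated volume), since the text does not spell these out.

```python
def tvc_per_gram(colony_count: float, dilution_ratio: float, grain_mass_g: float) -> float:
    """TVC in CFU/g: colonies counted in the dish times the dilution ratio, per gram of grain."""
    if grain_mass_g <= 0:
        raise ValueError("grain mass must be positive")
    return colony_count * dilution_ratio / grain_mass_g


# Hypothetical plate: 85 colonies, overall dilution ratio of 1000, 5.2 g of grain.
example_tvc = tvc_per_gram(85, 1000, 5.2)
print(f"{example_tvc:.3g} CFU/g")
```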
During simulated storage, the dilution factor was gradually

where i is the index of the image location examined; k is the index of the 9 captured regions of the rice sample; M is the area of the mildewed area in one of the 4 divided images; and S is the total area of the divided image (1280 × 938). Then, regression analysis was performed on the TVC and the area of the mildewed rice areas.
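One reading that is consistent with the variable definitions above is that the MAI of a sample is the mean of M / S over all divided images of that sample (9 regions × 4 divided images = 36 tiles). The sketch below assumes that reading. It also assumes that M for a tile is the summed pixel area of the detected boxes without merging overlaps, and all names and box coordinates are hypothetical.

```python
TILE_AREA = 1280 * 938  # S: pixel area of one divided image


def box_area(box: tuple[int, int, int, int]) -> int:
    """Pixel area of an (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


def mildew_area_index(tiles: list[list[tuple[int, int, int, int]]]) -> float:
    """Mean of M / S over all tiles; `tiles` holds one list of detected boxes per tile."""
    if not tiles:
        raise ValueError("at least one tile is required")
    ratios = [sum(box_area(b) for b in boxes) / TILE_AREA for boxes in tiles]
    return sum(ratios) / len(ratios)


# Hypothetical sample: 36 tiles, two of which contain detections.
sample = [[] for _ in range(36)]
sample[0] = [(100, 100, 160, 150)]                    # 60 x 50 = 3000 px
sample[5] = [(10, 10, 50, 40), (200, 300, 260, 360)]  # 1200 + 3600 = 4800 px
mai = mildew_area_index(sample)
print(f"MAI = {mai:.6f}")
```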

| Confusion matrix for the identification results of the rice mildew area detection model
The confusion matrix of detection results of the rice mold spot area detection model is shown in Table 1. In the training set and the verification set, the precision with which the mildew areas were detected reached 89.3% and 86.5%, respectively, while the recall of the mildew areas was 90.5% and 86.5%, respectively.
Figure 4 shows an MCV image of the rice mold detected by the model, which was taken when the grain was mildly moldy (TVC within the range of 10^5-10^6 CFU/g). As can be seen from the figure, the proposed model can identify all mildewed areas in the image effectively, independent of the number of rice grains and the type of mold. In addition, the recognition speed of this model is very fast, and the time to analyze a single image was about 0.04 s.

| Fitting degree of the bacterial count curve of unhulled and brown rice
Figure 5 shows the change in the TVC of unhulled and brown rice. With the increase in simulated storage days, the TVC of both rice types showed an exponential upward trend, and the mold growth rate of unhulled rice was faster than that of brown rice. On the 9th day of simulated storage, the TVC value was within the range of 10^5-10^6 CFU/g, and the grain was in a slightly moldy state; on the 11th day, the TVC value was greater than 10^6 CFU/g, and the grain was in a severely moldy state. After correlation analysis, the determination coefficient R^2 of the logarithm of the TVC values of the unhulled rice and brown rice was

| Analysis of the relationship between rice TVC and the area of mildewed areas identified by the model
Figure 6 shows the relationship between the TVC of unhulled and brown rice and the area of the moldy rice area determined by the model (denoted using the letter S). According to the trend line equations, the area of mildewed areas identified by the model was linearly correlated with the logarithm of TVC for unhulled rice and brown rice. The R^2 of the regression models was 0.8059 and 0.7794, respectively. Specifically, the MAI values in 29 samples without significant mold (TVC < 10^5 CFU/g, detected using brown rice) were lower than 0.001, while only 2 samples with light mold had MAI values below 0.001. The MAI values in 41 samples with light mold (10^5 ≤ TVC < 10^6 CFU/g, detected with brown rice) were all higher than 0.001. It can be seen that there is a clear threshold value for MAI for both unhulled and brown rice (MAI = 0.001) that can be used to distinguish between rice with no obvious mold and rice with a relatively light degree of mildew. Using this threshold MAI value, the precision and recall for light mold detection were 100% (41/41) and 95.3% (41/43), respectively.

Although near-infrared spectroscopy can detect slight mold with some accuracy, its spatial resolution is very low. If this technology is applied to rice detection, changes in the content of the fatty acids in rice will cause changes in their near-infrared spectroscopic characteristics, which will increase the detection error (Cong et al., 2018). The MCV technology used in this study has low cost and good real-time detection performance. Compared with traditional computer vision (Sun et al., 2016), the obtained microscopic images have higher resolution and capture detailed information from the rice samples, thereby achieving higher detection accuracy for moldy rice.
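The threshold-based screening reported above (MAI = 0.001, with a precision of 41/41 and a recall of 41/43) can be reproduced with a few lines of code. In this sketch, the function names are hypothetical, and the counts fp = 0 and fn = 2 are inferred from the reported fractions rather than read from a published confusion matrix.

```python
MAI_THRESHOLD = 0.001  # threshold value reported in the text


def is_light_mold(mai: float, threshold: float = MAI_THRESHOLD) -> bool:
    """Flag a sample as lightly moldy when its MAI exceeds the threshold."""
    return mai > threshold


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion counts."""
    return tp / (tp + fp), tp / (tp + fn)


# 41 light-mold samples flagged, none flagged wrongly, 2 light-mold samples missed.
precision, recall = precision_recall(tp=41, fp=0, fn=2)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```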
Compared with the study by Sun et al. (2022), the detection target for the experiment in this study was 10 g of rice rather than single grains of rice. By identifying and analyzing the mildewed areas of grouped samples, the obtained results can play a more effective role in practical applications of mold detection in large batches of rice. In this study, a bacterial suspension prepared by eluting spores and colonies was inoculated on the surface of naturally moldy rice, which resulted in more and richer types of molds. While MCV image capture will inevitably lead to a reduction in the visual field of the sample, in this study, a 9-point acquisition method is adopted to collect microscopic images of grouped rice grains and to comprehensively process and analyze 9 images of the same group of rice samples. Thus, the obtained image data are more representative.
Because the current detection target for mildew in rice is the brown rice portion inside the rice husk, in this study, the degree of fitting of the TVC values of unhulled and brown rice was analyzed, and it was found that the determination coefficient R^2 between the two values was close to 1. The RMSE value of predicting the TVC value of brown rice using the TVC value of unhulled rice was relatively small, and the degree of mildew in unhulled rice follows a similar growth pattern to that encountered in brown rice for the same incubation time. Therefore, the degree of mildew in unhulled rice can be used to estimate the degree of mildew in brown rice, thus eliminating the tedious steps of hulling during rice mildew detection, which makes the detection process faster and more efficient. Moreover, in this study, the types of molds that contaminated the rice were not clearly distinguished. Regardless of whether it is unhulled or brown rice, the thresholds obtained clearly distinguish between no significant mold and mild mold change, so the experimental results will not be affected by different mold diffusion capabilities and visibility differences.
Through this study, it was found that the established YOLO-v5 model can be used to detect mildewed areas in grouped samples of rice effectively, and the method is capable of detecting even slightly moldy rice.
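The agreement between the two TVC series discussed above can be quantified as sketched below: a least-squares line is fitted to the log10 TVC of brown rice as a function of the log10 TVC of unhulled rice, and R^2 and RMSE are computed from the fit. The exact procedure used in the study is not described, so this is only one plausible way of obtaining such figures, and the TVC values in the example are hypothetical.

```python
import math


def linear_fit(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_rmse(y_true: list[float], y_pred: list[float]) -> tuple[float, float]:
    """Coefficient of determination and root mean square error of a prediction."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / len(y_true))


# Hypothetical TVC values in CFU/g for paired unhulled and brown rice samples.
unhulled_tvc = [2.0e4, 9.0e4, 4.0e5, 2.5e6, 1.2e7]
brown_tvc = [8.0e3, 3.0e4, 1.5e5, 9.0e5, 5.0e6]
log_unhulled = [math.log10(v) for v in unhulled_tvc]
log_brown = [math.log10(v) for v in brown_tvc]

slope, intercept = linear_fit(log_unhulled, log_brown)
predicted = [slope * v + intercept for v in log_unhulled]
r2, rmse = r2_and_rmse(log_brown, predicted)
print(f"R^2 = {r2:.4f}, RMSE = {rmse:.4f} (log10 CFU/g)")
```

Working in log10 units keeps the error measure from being dominated by the largest counts, which span several orders of magnitude over the storage period.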

| CONCLUSION AND PROSPECT
In this article, a YOLO-v5 model is established to automatically detect mildewed areas in MCV images of rice mildly contaminated by mold. Using this model, mildewed unhulled rice samples with a TVC of 10^5-10^6 CFU/g can be detected using the MAI threshold.
Compared with traditional methods, this method is real-time and low-cost, and can be widely used in rice safety evaluation in the rice market. In the future, this method should be tested using rice samples collected in grain depots or markets to verify its practicality.
Moreover, in order to obtain a model with higher precision and recall, the segmentation performance of U-Net or DeepLab on mildewed areas of rice in MCV images can be tested.

CONFLICT OF INTEREST STATEMENT
All the authors declare that they have no conflict of interest.
FIGURE 6 Relationship between the TVC of unhulled rice (a) and brown rice (b) and the area of mildewed areas identified by the model.

used an array spectrometer based on principal component analysis to perform cluster analysis on wheat samples infected with different molds and established a linear discriminant analysis model with an accuracy rate of over 90%. Chen et al. (2021) identified 21 characteristic substances from volatile organic compounds in rice samples using gas chromatography-ion mobility spectrometry. Combining principal component analysis and k-means clustering, they established a clustering model that can be used to quickly identify the degree of rice mildew. Jia et al. (2022) established a back-propagation neural network model with an ant colony optimization classification model using a combination of standard normal variate and uninformative variable elimination to process hyperspectral image data of five grades of corn seed mold in the same variety. Liu et al. (2023) proposed a method for identifying sunflower seed mildew grades based on near-infrared diffuse reflectance and transmission fusion spectroscopy, along with a one-dimensional convolutional neural network (CNN). However, generic near-infrared spectrometers can only measure sample points and require the sample to be homogenized, which is not suitable for rapid rice contamination detection. Moreover, multispectral and hyperspectral equipment is expensive, and the image data acquisition times are long.

In computer vision applications, Pan et al. (2017) used computer vision, support vector machines, partial least squares discriminant analysis, and the successive projections algorithm to collect and analyze images of rice samples with different degrees of mildew, achieving an overall accuracy in distinguishing the mildewed parts of over 90%. Chen et al. (2019) used an independently developed machine vision system to simultaneously detect four types of defects in red indica rice: broken grains, chalkiness, breakage, and spots, with recognition accuracy rates of over 93%. Due to the limitations of field size and resolution, traditional computer vision has low sensitivity to mildew detection. Compared with traditional machine learning, deep learning technology has obvious advantages in classification ability and speed. Sun et al. (2016) compared the effect of traditional machine learning and the LeNet5 CNN on image recognition of mildewed rice, and the results showed that the early LeNet5 deep learning algorithm had a great advantage in recognition speed and accuracy.

FIGURE 1 MCV images of unhulled rice (9 imaging regions in the same culture dish).

main functions of the model were consistent with those of Sun et al. (2022), which were used to extract target features from images, then aggregate the target features and build a model for identifying the same targets. In this study, a total of 38,061 mildewed areas in 2520 microscopic images of rice grains were identified. The rice microscopic images were randomly divided into training sets and test sets at a 6:4 ratio, resulting in 1512 training set images and 1008 test set images. In order to improve training speed and reduce memory consumption, a YOLO-v5s model was adopted, which is a version of the YOLO-v5 model with fewer layers and nodes for easier training and deployment. The model was built using PyCharm (JetBrains, Czech Republic) and the YOLOv5 master toolbox (https://github.com/ultralytics/yolov5). The YOLO-v5 hyperparameters were as follows: the input image resolution was set to 1280, the batch size was set to 8, the model learning rate was set to 0.01, and the number of epochs was set to 50. Techniques such as Mosaic enhancement and image rotation were also applied. In order to reduce the probability of the model misjudging the background and improve the fitting degree of the output bounding box and mildewed areas, the confidence threshold of the model was set to 0.5. After the training was completed, the optimal model was determined based on the change in the box loss, the object loss, and the mean average precision at an intersection-over-union (IoU) threshold of 0.5 (mAP_0.5). Then, to verify the accuracy of the model, the confusion matrix, precision, and recall of the detection results of mildewed areas were obtained.
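The random 6:4 division of the 2520 divided images described above can be sketched as follows. The file names, the fixed seed, and the helper name `split_dataset` are hypothetical; the original work does not state how the random division was implemented.

```python
import random


def split_dataset(items, train_fraction=0.6, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the 2520 divided images.
image_names = [f"tile_{i:04d}.jpg" for i in range(2520)]
train_set, test_set = split_dataset(image_names)
print(len(train_set), len(test_set))  # 1512 1008
```

Note that the four tiles cut from one captured image end up in different subsets under a purely random split like this one; grouping tiles by source image or by culture dish before splitting would be a stricter alternative.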
FIGURE 2 Marking of the mildewed areas (b) in the original MCV image (a) of the rice.

increased by an order of magnitude based on the growth of mold colonies, and a total of 7 sets of measurements were obtained, with each set containing 20 mass measurements and 20 TVC values. Subsequently, the data were processed and analyzed. The equation for calculating the TVC per gram of rice (or brown rice) is shown in Equation (3). Then, a correlation analysis was conducted on the changes in the TVC of rice and brown rice, and the determination coefficient R^2 was calculated. Finally, the proposed YOLO model was used to identify the mildewed areas in the images. The mean proportion of mildewed area in each image (MAI) was calculated for each grain image. The calculation method is shown in Equation (2):

Figure 3 shows the changes in box loss, object loss, and mAP_0.5 during model training. As can be seen from the figure, during the initial stage of training, the box loss in the training set and the verification set decreased rapidly. After 30 epochs, the loss value of the

$$\mathrm{TVC}\ (\mathrm{CFU/g}) = \frac{\text{Total number of colonies in each culture dish} \times \text{Dilution ratio}}{\text{Corresponding grain mass of rice (or brown rice)}} \tag{3}$$

FIGURE 3 Parameter variation during model training. (a) Box loss; (b) object loss; (c) mAP_0.5.

0.9926. The root mean square error (RMSE) of predicting the TVC value of brown rice using the TVC value of unhulled rice was 0.2025.
FIGURE 4 Mildewed areas detected by the YOLO-v5 model in a rice MCV image.

The role of the YOLO model in this study was essentially to perform image segmentation, i.e., to divide the image into mildewed areas and background areas. However, mildewed areas in images are generally continuous and have complex shapes. The target boxes identified by the model or labeled via manual work rarely coincide completely with the real mildewed area. In order to formulate a model with higher precision and recall for mildewed area segmentation, CNN models for accurate image segmentation such as U-Net or DeepLab can be used (Srinitya et al., 2023), but the object area labeling for these models is more complex than that of the YOLO model.

In recent years, scientific research on the identification and detection of moldy grains using electronic noses, hyperspectral imaging, and near-infrared spectroscopy has been continuously emerging. Jiarpinijnun et al. (2020) used an electronic nose combined with GC-MS analysis technology to detect characteristic volatile odors resulting from fungal contamination on brown rice grain, but this technology requires skilled operators, and the detection process is costly and time-consuming, and consequently not suitable for rapid on-site detection. Jia et al. (2022) used hyperspectral technology to detect moldy corn seeds. Hyperspectral cameras are costly, and extracting hyperspectral image data requires significant data processing capabilities from operators. Therefore, this method is also not suitable for the practical production testing process. In addition, the above two methods can only detect moldy grains but cannot distinguish between different moldiness degrees. In this study, by analyzing the relationship between the TVC value of rice and the area of the mildewed areas identified by the model, it is possible to effectively distinguish between non-significantly moldy rice and slightly moldy rice. Liu et al. (2023) used near-infrared spectroscopy to perform the classification of sunflower seed mold grades.

FIGURE 5 Changes in TVC of unhulled and brown rice. (a) TVC variation. (b) Relation between TVC of unhulled rice and brown rice.

Conceptualization (equal); data curation (equal); funding acquisition (equal); methodology (equal); software (equal). Mengdi Tang: Formal analysis (equal); investigation (equal); writing - original draft (equal). Shu Li: Investigation (equal); writing - review and editing (equal). Siyuan Tong: Investigation (equal); writing - review and editing (equal).

ACKNOWLEDGMENTS
This work was financially supported by the Natural Science Foundation of Anhui Province, China: 2008085QC143. The authors acknowledge all of the support received. We would like to thank MogoEdit (https://www.mogoedit.com) for its English editing during the preparation of this manuscript.

FUNDING INFORMATION
This work was financially supported by the Natural Science Foundation of Anhui Province, China: 2008085QC143 (Project leader: Ke Sun).