Characterization of bacterial community and flavor differences of different types of Douchi

Abstract According to the appearance and technology, traditional fermented Douchi can be divided into dried Douchi and wet Douchi. However, there are few reports on the difference of bacterial community structure between them or the influence of bacterial community on product flavor. In this study, high‐throughput sequencing technology and electronic nose were used to measure the bacterial diversity and flavor of 40 Douchi samples, and the correlation between them was explored by multivariate statistical means combined with COG database. Results showed that the cumulative average relative abundance of Firmicutes and Proteobacteria in the samples was as high as 95.93%, and the former was the core bacteria phylum. On the whole, the dominant bacteria in Douchi were Bacillus (50.67%), Staphylococcus (14.07%), Enterococcus (2.54%), Proteus (1.61%), Brevibacillus (1.46%), Providencia (1.26%), Weissella (1.24%), and Ureibacillus (1.19%). LEfSe analysis indicated that Bacillus can be used as a biomarker in dried fermented soybeans. Meanwhile, dried samples contained more intensive aromatic substances, but were significantly lower in W6S (selectivity to hydrogen) and W3S (methane‐aliph) compared with the wet samples. Aneurinibacillus and Brevibacillus were helpful to the formation of aromatic flavor in Douchi, but Vagococcus and Corynebacterium were the opposite. Gene and microbial phenotypic prediction showed that microorganisms in dried Douchi use protein more efficiently, while in wet Douchi, microbial energy metabolism was more vigorous. The pathogenic potential of microorganisms in dried samples was higher than that in wet. This study can sound the alarm for improving the safety of home‐brewed Douchi and provide guidance for the subsequent screening of strains that enhance the flavor of fermented soybeans.

can accurately identify hard-to-culture and low-abundance microorganisms, providing a new tool for exploring the diversity and function of microorganisms in fermented food. Eugenia (Jiménez et al., 2017) explored the diversity of lactic acid bacteria in Peruvian traditional fermented potatoes by high-throughput sequencing; Justyna (PolKa et al., 2015) used this technique to analyze the bacterial diversity of Italian sausages at different maturity stages; Zhang (Zhang et al., 2018) has investigated the microbial diversity of Douchi made from different raw materials. The implementation of these research topics provides a good idea for analyzing the bacterial diversity of Douchi in this study. E-nose technology can realize rapid digital evaluation of volatile flavor substances in food. The measurement is precise and less-affected by subjective factors (Loutfi et al., 2015), thus extensively employed in the fermented food industry Jung et al., 2017).
In this study, 21 dried and 19 wet Douchi samples were collected from farmers in Enshi, Hubei Province, China. The microbial diversity was determined with Illumina MiSeq high-throughput sequencing. Meanwhile, e-nose system was used to detect the intensity of flavor indexes, and a correlation analysis was performed with the dominant genus. Furthermore, the function and phenotype of bacterial microbes were predicted based on PICRUSTs and BugBase. This study aims to provide a theoretical basis for the quality improvement in production and the formulation of relevant standard.

| Sample collection and metagenomes extraction
This study adopted a completely randomized design. A total of 40 Douchi samples were collected local farmers' homes in Enshi Tujia and Miao Autonomous Prefecture, Hubei Province, China, including 21 dried samples (D1-D21) and 19 wet samples (W1-W19). The main raw material is soybean, which was planted by farmers themselves.
All samples were prepared by traditional methods; that is, they were naturally fermented without adding any starter. The main difference between the two types of samples is whether added boiled soybean soup or not. Wet Douchi was fermented with enough soybean soup to immerse the soybeans, while no water was added during the fermentation of dried Douchi.
The total DNA was extracted from 3.0 g of each sample according to the instruction of food DNA Genomic Extraction Kit (QIAGEN).
The purity, concentration, and integrity of the extracted DNA were detected by spectrophotometry and 1% agarose gel electrophoresis.
Qualified DNA samples were stored in a −20°C refrigerator for use.
The 40 qualified DNA amplicons, diluted to a concentration of 100 nmol/L, were sequenced by MiSeq high-throughput sequencing platform in Shanghai Majorbio Co., Ltd.

| Quality control and bioinformatics analysis
The pair-ended sequences generated through MiSeq sequencing were merged referring to the overlap relationship of paired fragment sequences. Sequences were divided into different samples by referring to the barcodes. The barcodes and primers in the aligned sequences were removed to obtain high-quality information for further analysis. Sequences with less than 50 bp of bases after cutting were discarded.
Bacterial community analysis and diversity assessment were performed on the retained sequences using the QIIME (v1.9.1) platform (Ju & Zhang, 2015). The procedure was as follows: (a) calibrating and aligning the high-quality sequences with PyNAST (Awasthi et al., 2017); (b) using UCLASS to cluster the aligned sequences based on 100% similarity to establish a complete representative gene set of 16S rRNA; (c) clustering the aligned sequences based on 97% similarity by UCLASS to generate operational taxonomic units (OTUs) (Hou et al., 2016); (d) identifying and deleting chimeras in the constructed OTU matrix through ChimeraSlayer (Wei et al., 2016); (e) performing homology comparison among the representative reads with RDP, Greengenes, and SILVA databases, annotating taxonomic positions at the phylum, class, order, family and genus levels, and combining annotation results generated by different databases (Balvoiūt & Huson, 2017); (f) drawing phylogenetic tree based on OTU representative sequence with FastTree (Yu et al., 2017); (g) as-

| Flavor quality evaluation through e-nose technology
Dispense 30 g Douchi sample into three e-nose vials, heat in water bath (Yiheng) at 60°C for 15 min, let it stand at room temperature for 20 min, insert e-nose probes into the sample, and collect volatile compounds using 10 sensitive metal sensors. Before testing, the sensors performed an automatic cleaning of 95 s, followed by 60 s sample determination. Response value was measured every second. The response curve tended to level off after 40 s of measurement. Therefore, response value at 49 s, 50 s, and 51 s was averaged to construct the data matrix. Response value was defined as the ratio of G to G0, which represented the resistance of metal sensors during sample test and the air test, respectively. The reading was 1 during automatic cleaning. The greater the value deviates from the value of 1, the higher the concentration of volatile compounds was (Zhang et al., 2014).

| Functional and phenotypes prediction of bacterial
Quality control qualified sequences were clustered into OTU matrix by UCLASS with the identity threshold of 97% according to the standard database of PICRUSTs (Gavin et al., 2018). Taxonomic positions of representative sequences were annotated with the Greengenes database.
The PICDRUSTs software was used to predict the functional potential of bacterial flora in Douchi samples. At the same time, the annotated OTU matrix was uploaded to the online website (https://bugba se.cs. umn.edu/) for microbiome phenotype prediction (Ward et al., 2017).

| Statistical analysis
Linear discriminant analysis effect size (LEfSe) method based on a normalized relative abundance matrix was used to identify the significant differences between cheese samples (Jin et al., 2018). Pearson correlation analysis and Wilcoxon test were applied to calculate the correlation and significance between bacteria genera (with an average relative abundance of more than 0.5%) and the relative intensity of flavor indexes generated by e-nose. Cytoscape software (v3.5.1) was used to map the correlation diagram; correlation and significance were also calculated with Pearson correlation analysis and Wilcox test between the gene function and the bacteria genera that amounted for more than 0.5% of the sequences. The result was visualized through a heat map, and all the figures were plotted by R software (v3.3.2) as well as Origin software (v2018, OriginLab Corp).

| Microbial community diversity and abundance
A total of 1,865,973 high-quality 16S rRNA sequences were ob-  Table S1.
In this study, the observed species curve and the Shannon index curve were used to further evaluate the sequencing depth to determine whether it met the requirements of subsequent bioinformatics analysis ( Figure S1a). It could be seen from the figures that both the rarefaction curve and the Shannon curve tended to be flattened with the increase of sequencing depth, indicating that the amount of data satisfied the requirements. Though the sparse curve did not plateau, the Shannon curve reached saturation, meaning that some additional new species may be discovered by expanding the coverage, but the vast majority of microbial and bacterial diversity was captured.

| Compare bacterial distribution of different types of Douchi
A total of 9 bacterial phyla were identified among all Douchi samples (Figure 1a), of which 5 were the major phyla (average relative abundance more than 0.1%), included Firmicutes (85.93%), Proteobacteria (10.00%), Bacteroidetes (0.96%), Actinobacteria (0.61%), and Cyanobacteria (0.16%). Firmicutes, present in the samples with relative abundance all above 46%, were the absolute dominant phyla. Secondly, Proteobacteria were also found in all samples, with lower abundance in some samples though. At the genus level, a total of 192 bacterial genera were identified (Figure 1b), of which 18 were dominant genus (average relative abundance more than 0.50%). Eight out of the eighteen were reported a relative abundance higher than 1.0%, namely Bacillus (50.67%), Staphylococcus Firmicutes and Bacteroidetes are widespread in fermented food (Marco et al., 2017). Bacteroidetes abundance differs significantly between the two Douchi types. It was documented that Flavobacteria (phyla: Bacteroidetes) lived mainly in aquatic environment, especially in liquid fermented food (Rios-Castillo et al., 2018). This was highly consistent with our results. We found Bacteroidetes only in wet samples, not in the 21 dried samples. At the genus level, Bacillus and Staphylococcus, two genera colonized in all samples, were key to the fermentation process (Shin & Jeong, 2015). Significance testing of dominant genus indicated that there were significant differences (p < .05) between the two types in terms of Staphylococcus, Brevibacillus, Ureibacillus, and Pseudomonas. Studies showed that Bacillus produced a large number of enzymes such as proteases (Gaurav et al., 2015) and amylases (Akhani et al., 2017), which could be used commercially for industrial production. As a result, the presence of Bacillus can accelerate the fermentation process of Douchi and provide substances required by other bacteria (Zhu et al., 2017).
Staphylococcus appeared in all the samples as a dominant genus with significant differences, exhibiting good adaptability to the environment of Douchi fermentation. Previous studies have shown its proteolytic function and antibiotic activity during Douchi fermentation (Furuhata et al., 2010;Jeyaram et al., 2008), revealing its F I G U R E 1 The relative abundance of Dominant phyla (a) and genera (b)  importance in the process (Tae et al., 2010). Yang (Yang et al., 2016) revealed that the relative abundances of the other predominant bacteria such as Brevibacillus, Ureibacillus and Pseudomonas were positively related to the fermentation duration, reaching the peak after 1 week of fermentation. Researches also demonstrated that Brevibacillus fermentation could produce glutamate (Momose & Takagi, 2014), Ureibacillus could decompose cellulose to butanol (Mariem et al., 2018), and most Pseudomonas could produce one kind or more kinds of protease, lipase, and lecithin enzymes (Dogan & Boor, 2003), thereby enhancing the flavor of the Douchi.   leading to divergence in the flora structure (Bortolini et al., 2016).
Result showed that difference not only existed between different types, but also among the samples of the same type and the former was larger. Inter-type differences may attribute to makers' empirical procedure and varied standard as well as the surrounding environment even if they use the same type of fermentation (Moretro & Langsrud, 2017).
In order to further identify the differences among the abundant groups, LEfSe was performed with a threshold score of 2.0 ( Figure S3). A total of 45 mycobacteria with significantly different abundance were identified in the study (p < .05), of which 36 and 9 were crucial to wet and dried Douchi, respectively. The LEfSe performed to further pinpoint the differential taxa in these microbiota communities, showed that the mean relative abundance of most bacterial groups in the wet samples was high.

| Comparative analysis of flavor quality of different types of Douchi
Following the bacterial diversity, analysis was a flavor quality assessment through e-nose technique (Hao et al., 2017). A comparative study was done by classifying the samples into two groups according to their feature ( TA B L E 1 Performance description of electronic nose sensors and flavor difference analysis of two types of fermented soybeans wet samples only led in W6S and W3S among the 10 flavor quality parameters and differences were significant between the two groups (p < .05).
A correlation map was drawn using the dominant genus and flavor quality data (Figure 4).

| Functional annotation of unigenes across the COG database
PICRUSTs gene prediction generated a total of 4,267 COGs which belonged to 23 functional categories (Figure 5b). Comparison between the two types of Douchi revealed that category E (amino acid transport and metabolism), G (carbohydrate transport and metabolism), and K (transcription) were dominant in all samples.
According to Wilcox test, four out of the 23 functional categories had significant differences between the two groups (p < .05) were category G (carbohydrate transport and metabolism), C (energy production and conversion), O (posttranslational modification, protein turnover, and chaperones), and T (signal transduction mechanisms). There were apparently more category G sequences and less category C, O, and T in dried samples compared with the wet ones (p < .05).
As expected, most of the sequences were functionally assigned to genes related to energy metabolism, protein metabolism, and ion transporters. Energy and protein metabolism were essential for the survival and related functions of microorganisms. Ion transport was also indispensable to basic bacterial metabolism. Ions served as enzyme cofactors to achieve a wide variety of biochemical reactions.
The dried samples overtook the wet in the amount of category G genes, indicating that the former was more efficient in protein utilization. In contrast, there were more gene sequences with potential functions of category C, O and T in wet samples. In bacteria, the generation of energy relied primarily on substrate phosphorylation, a process in which glucose was broken down into acetate (Gao et al., 2017). The fermentation of Douchi involved unique microbial metabolic pathways of carbohydrates, which produced diversified volatile short-chain fatty acids (SCFA). Besides, high expression of genes associated with signal transduction mechanisms and others may enhance the signal sensing ability of bacteria, thereby enabling them to utilize nutrients and eliminate or avoid toxic substances better. As an essential substance in organic body, protein should be translated, modified, and converted before exerting its specific role.
This suggested that the metabolism in wet Douchi may be more active. These functional categories were extremely important to the fermentation of Douchi. Their high expression showed that bacteria can improve fermentation efficiency, shorten production cycle, and increase utilization rate of raw materials.
In order to determine the potential function of single genus, this study performed a rank correlation and significance analysis for the abundance matrix of COGs functional categories and the dominant genus ( Figure 5a). The analysis found that category K (transcription) and C (energy production and conversion) were basically similar in This paper also predicted the bacterial phenotype in Douchi samples ( Figure S4). There were some yet insignificant differences between the two groups regarding to Oxygen Utilizing,    and Corynebacterium were negatively associated.

ACK N OWLED G EM ENTS
The research was supported by Hubei University of Arts and Science Foundation for Cultivation fund for teachers' scientific research ability: "technological innovation team"(2020kypytd009).

CO N FLI C T O F I NTE R E S T
The authors have declared no conflict of interest.