An examination of data from the American Gut Project reveals that the dominance of the genus Bifidobacterium is associated with the diversity and robustness of the gut microbiota

Abstract Bifidobacterium and Lactobacillus are beneficial for human health, and many strains of these two genera are widely used as probiotics. We used two large datasets published by the American Gut Project (AGP) and a gut metagenomic dataset (NBT) to analyze the relationship between these two genera and the community structure of the gut microbiota. The meta‐analysis showed that Bifidobacterium, but not Lactobacillus, is among the dominant genera in the human gut microbiota. The relative abundance of Bifidobacterium was elevated when Lactobacillus was present. Moreover, these two genera showed a positive correlation with some butyrate producers among the dominant genera, and both were associated with alpha diversity, beta diversity, and the robustness of the gut microbiota. Additionally, samples harboring Bifidobacterium but not Lactobacillus showed higher alpha diversity and were more robust than those carrying only Lactobacillus. Further comparisons with other genera validated the important role of Bifidobacterium in the gut microbiota robustness. Multivariate analysis of 11,744 samples from the AGP dataset suggested Bifidobacterium to be associated with demographic features, lifestyle, and disease. In summary, Bifidobacterium members, which are promoted by dairy and whole‐grain consumption, are more important than Lactobacillus in maintaining the diversity and robustness of the gut microbiota.


| Diversity analysis
Alpha diversity was calculated using the vegan package (Zapala & Schork, 2006) in R software. Six indexes were applied in the analysis: the Shannon index, Chao1 index, observed ASVs, ACE index, inverse Simpson index, and Pielou index. Principal coordinate analysis (PCoA) was conducted using Bray-Curtis dissimilarity (Bray & Curtis, 1957). To assess whether the presence of the two genera was a significant factor in explaining variation in the gut microbiota, we converted the continuous abundance of each genus into a categorical explanatory variable. Taking Bifidobacterium as an example, we defined two categories according to its presence or absence: samples with Bifidobacterium and samples without Bifidobacterium. Permutational multivariate analysis of variance (PERMANOVA) was then applied with 9,999 permutations in R (Zapala & Schork, 2006).
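The indexes and the dissimilarity named above have simple closed forms. The following is a minimal sketch in plain Python, not the vegan code used in the study; the function names and the "with"/"without" labels are illustrative assumptions. It shows three of the six alpha diversity indexes, the Bray-Curtis dissimilarity, and the presence/absence coding used as the explanatory factor.

```python
import math

def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)

def pielou(counts):
    """Pielou evenness J = H' / ln(S), where S is the number of observed taxa."""
    observed = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(observed) if observed > 1 else 0.0

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0

def presence_labels(genus_counts):
    """Turn a continuous abundance into the two-level factor (hypothetical labels)."""
    return ["with" if c > 0 else "without" for c in genus_counts]

# Toy counts for two samples over four taxa (made-up numbers).
sample_1 = [10, 10, 10, 10]
sample_2 = [37, 1, 1, 1]
print(shannon(sample_1), shannon(sample_2))
print(bray_curtis(sample_1, sample_2))
```

A perfectly even sample maximizes the Shannon and Pielou indexes for a given number of taxa, which is why the second toy sample scores lower than the first.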

| Construction of microbial networks
Microbial network analysis has been employed to examine keystone taxa and relationships among the microbial community, which can provide useful information for further intervention (Banerjee, Schlaeppi, & van der Heijden, 2018). In the present study, we applied SParse InversE Covariance Estimation for Ecological ASsociation Inference (SPIEC-EASI), a statistical method for the inference of microbial ecological networks from amplicon sequencing datasets (Kurtz et al., 2015). The network was constructed based on relative abundance at the genus level following the instructions at https://github.com/zdk123/SpiecEasi. Considering that increasing the rep.num argument may result in better performance (Kurtz et al., 2015), networks were constructed using the SPIEC-EASI package in R with the default parameters, except that the parameters nlambda and rep.num were each set to 100 (Liu et al., 2017). The degree statistic is a measure of the centrality of a node, with higher values indicating that the node is involved in more ecological interactions. We assessed the robustness of the different microbial association networks to random node removal ("attack") (Albert, Jeong, & Barabasi, 2000; Iyer, Killingback, Sundaram, & Wang, 2013) using natural connectivity (Jun, Barahona, Yue-Jin, & Hong-Zhong, 2010) as a general measure of graph stability. We also measured how the natural connectivity of the microbial network changed when nodes and their associated edges were removed from the network (Mahana et al., 2016).
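Natural connectivity is the logarithm of the average of exp(λ_i) over the eigenvalues λ_i of the adjacency matrix, which equals ln(trace(exp(A)) / N). The sketch below is an illustration under stated assumptions rather than the pipeline used in the study: it takes an unweighted adjacency matrix as a list of lists, evaluates the trace with a truncated Taylor series instead of an eigendecomposition (adequate only for small toy graphs), and uses hypothetical function names.

```python
import math
import random

def degrees(adj):
    """Degree of each node: the number of edges it takes part in."""
    return [sum(row) for row in adj]

def natural_connectivity(adj, terms=60):
    """ln(trace(exp(A)) / N), using trace(exp(A)) = sum_k trace(A^k) / k!."""
    n = len(adj)
    if n == 0:
        return 0.0
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A^0
    trace_exp = float(n)  # contribution of the k = 0 term
    factorial = 1.0
    for k in range(1, terms):
        power = [[sum(power[i][m] * adj[m][j] for m in range(n)) for j in range(n)]
                 for i in range(n)]
        factorial *= k
        trace_exp += sum(power[i][i] for i in range(n)) / factorial
    return math.log(trace_exp / n)

def remove_nodes(adj, removed):
    """Drop the given nodes together with all of their edges."""
    keep = [i for i in range(len(adj)) if i not in removed]
    return [[adj[i][j] for j in keep] for i in keep]

def attack_curve(adj, seed=0):
    """Natural connectivity after removing 0, 1, 2, ... nodes in random order."""
    rng = random.Random(seed)
    order = list(range(len(adj)))
    rng.shuffle(order)
    return [natural_connectivity(remove_nodes(adj, set(order[:k])))
            for k in range(len(adj))]  # the last point keeps a single node

# Toy network: a fully connected graph on four nodes.
toy = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
print(degrees(toy))
print(attack_curve(toy))
```

Comparing two networks then amounts to comparing how quickly their curves fall as the fraction of removed nodes grows; removal can also be ordered by degree or betweenness instead of at random.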

| Regression analysis
Because of excessive zero abundance in the read counts and the overdispersion, a multiple zero-inflated negative binomial (ZINB) regression model (Alan, 2015) was used to determine the differential abundance in the analysis of Bifidobacterium. The ZINB model consists of two different components: a logistic regression component for modeling excessive zeros and a negative binomial regression component for modeling the remaining count values. Missing data in each categorical variable were included in a separate hidden category (Hill, 2006). Overall fitted mean proportions were calculated by the average predicted value (APV) method (Albert, Wang, & Nelson, 2014), in which Bifidobacterium count values are divided by the mean total read counts under each exposure status. The variables of host features were selected based on the record number and biological relevance, and 16 variables were retained for further study, namely, age, sex, race, geographical location, whole-grain consumption, vegetable consumption, fruit consumption, milk and cheese consumption, C-section, feeding patterns, antibiotic exposure, IBD, IBS, autoimmune disease, cardiovascular disease, and food allergy. To allow clear interpretation of the result, we divided consumption frequency into three categories, "high frequency," "low frequency," and "never". We divided race into five categories, namely, "Caucasian White" (CW), "African-American" (AA), "Hispanic" (HI), "Asian-Pacific" (AP), and "Other". We also divided geographical location into four new categories, namely, "North America" (NA), "Europe" (EU), "Oceania" (OC), and "Other".
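The two-component structure of the ZINB model can be made concrete through its probability mass function. The sketch below is illustrative only: it assumes the common mean/dispersion parametrization (mean mu, dispersion theta) and a fixed zero-inflation probability pi_zero, whereas in the regression setting both mu and pi_zero would depend on the host covariates. The function names are hypothetical.

```python
import math

def nb_pmf(k, mu, theta):
    """Negative binomial pmf with mean mu > 0 and dispersion (size) theta > 0."""
    log_p = (math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
             + theta * math.log(theta / (theta + mu))
             + k * math.log(mu / (theta + mu)))
    return math.exp(log_p)

def zinb_pmf(k, pi_zero, mu, theta):
    """ZINB pmf: a point mass at zero (probability pi_zero) mixed with NB(mu, theta)."""
    count_part = (1.0 - pi_zero) * nb_pmf(k, mu, theta)
    return pi_zero + count_part if k == 0 else count_part

def zinb_mean(pi_zero, mu):
    """Expected count under the mixture."""
    return (1.0 - pi_zero) * mu

# Made-up parameters: 30% structural zeros, NB mean of 5 reads, dispersion 2.
print(zinb_pmf(0, 0.3, 5.0, 2.0), nb_pmf(0, 5.0, 2.0))
print(zinb_mean(0.3, 5.0))
```

The probability of a zero under the mixture exceeds that of the plain negative binomial, which is the property that lets the model absorb the excess zeros in the read counts.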
Differences between groups were tested using the Wilcoxon rank-sum test. When multiple hypotheses were considered simultaneously, p-values were adjusted to control the false discovery rate with the method described previously (Benjamini & Hochberg, 1995).
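The Benjamini-Hochberg adjustment cited above can be written in a few lines. This is a minimal sketch with a hypothetical function name and made-up p-values; each p-value is scaled by m / rank, and a running minimum taken from the largest rank downward keeps the adjusted values monotone.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

# Made-up raw p-values from four hypothetical comparisons.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```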

| Bifidobacterium is a dominant genus in the human gut microbiota
Based on the criteria for defining dominant genera outlined in the Methods section, only 8.0% (22/276) of the bacterial genera among the 2,186 samples were dominant. However, this small number of genera accounted for an average of 64.4% of the relative abundance (Figure 1a). Bifidobacterium was among the dominant genera, whereas Lactobacillus was not. Subsamples from the USA and UK also showed that Bifidobacterium, but not Lactobacillus, was a dominant genus (Tables A4-A6 in Appendix 1). The significance of the overlap test suggested that the distribution of these two genera exhibited a close connection (Figure 1b, p < .001). We also validated the result using another online statistic service (http://nemates.org/MA/progs/overlap_stats.html), and the result also revealed a close connection between Bifidobacterium and Lactobacillus (p < 3.6 × 10⁻⁶).
Among the remaining 1,409 ASVs, 6 and 13 ASVs were annotated as Bifidobacterium and Lactobacillus, respectively (Table A7 in Appendix 1). The relative abundance of each ASV annotated as Bifidobacterium or Lactobacillus varied significantly, with only some ASVs dominating each genus (Figure A2). Although the limited amplicon length makes it difficult to classify ASVs at the species level (Figure A3 and Figure A4), we still found that some ASVs showed high identity (98.6%-100.0%) to species commonly used as probiotics, namely, Bifidobacterium_1 and Lactobacillus_9 (Lactobacillus brevis). These ASVs also exhibited high relative abundance for Bifidobacterium and Lactobacillus.
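The text does not state which statistic underlies the overlap test. One common choice, assumed here purely for illustration, is the upper tail of the hypergeometric distribution: the probability of observing at least the recorded number of samples carrying both genera if the two genera were distributed independently. The function name and all counts below are hypothetical.

```python
from math import comb

def overlap_p_value(total, with_a, with_b, with_both):
    """P(overlap >= with_both) when with_b samples are drawn at random from
    `total` samples, of which with_a carry genus A (hypergeometric upper tail)."""
    upper = min(with_a, with_b)
    favourable = sum(comb(with_a, k) * comb(total - with_a, with_b - k)
                     for k in range(with_both, upper + 1))
    return favourable / comb(total, with_b)

# Made-up counts: 2,186 samples, 1,000 with genus A, 300 with genus B, 200 with both.
print(overlap_p_value(2186, 1000, 300, 200))
```

An overlap far above the roughly 137 samples expected by chance for these made-up counts yields a very small p-value, which is the pattern interpreted in the text as a close connection between the two genera.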

| Bifidobacterium and Lactobacillus are associated with the diversity of the gut microbiota
To explore the relationship between Bifidobacterium and Lactobacillus, we focused our analysis on the increase in these two genera when codetected. The relative abundance of Bifidobacterium was increased significantly when Lactobacillus was present (Figure 2a). At the same time, the relative abundance of Lactobacillus did not increase significantly when Bifidobacterium was present (Figure 2b).
In addition, we found significantly increased levels of a portion of the Bifidobacterium and Lactobacillus ASVs when these genera were codetected (Figure A5). Considering the mutual influence between these two genera, we hypothesized that they might also be closely connected with other dominant genera. We found that Bifidobacterium and Lactobacillus showed a positive correlation with Blautia, Faecalibacterium, Anaerostipes, Agathobacter, and Subdoligranulum, all of which are potential butyrate producers. At the same time, we also found a negative correlation of these two genera with some other potential butyrate producers (Figure 2c) (Vital, Howe, & Tiedje, 2014).
Other factors that affect butyrate producers in the gut microbiota may therefore exist.
Furthermore, we compared the alpha diversity of the gut microbiota in the AGP dataset and found that alpha diversity increased with the number of codetected genera (Bifidobacterium and Lactobacillus) (Figure 3a,b and Figure A6). In addition, samples containing Bifidobacterium and not Lactobacillus showed a higher Simpson index than did those containing only Lactobacillus. The association between the two genera and the diversity of the gut microbiota was obvious for the US samples, but that for the UK samples was weaker (Figure A7). We visualized beta diversity by PCoA according to Bray-Curtis dissimilarities (Figure 3c-e). An additional PERMANOVA analysis based on categorical variables of their abundance showed that the presence of Bifidobacterium and Lactobacillus was a significant factor in the variation of the gut microbiota (p < .001). Approximately 1% of the variance in beta diversity was explained by the presence of the two genera (R² = .010, .010, and .013, respectively), which is comparable to the effect sizes of many microbiome covariates (Falony et al., 2016).
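The R² and p-value reported by PERMANOVA come from partitioning squared pairwise distances into within-group and between-group sums of squares and then permuting the group labels. The sketch below illustrates that calculation for a single factor on toy data; it is not the R implementation used in the study, the function names are hypothetical, and 999 permutations are used instead of 9,999 only to keep the example quick.

```python
import random

def _partition(dist, labels):
    """Return (SS_total, SS_within) computed from squared pairwise distances."""
    n = len(labels)
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for group in sorted(set(labels)):
        members = [i for i in range(n) if labels[i] == group]
        pairs = sum(dist[i][j] ** 2 for a, i in enumerate(members) for j in members[a + 1:])
        ss_within += pairs / len(members)
    return ss_total, ss_within

def pseudo_f(dist, labels):
    """Pseudo-F = (SS_between / (k - 1)) / (SS_within / (n - k))."""
    n, k = len(labels), len(set(labels))
    ss_total, ss_within = _partition(dist, labels)
    if ss_within == 0:
        return float("inf")
    return ((ss_total - ss_within) / (k - 1)) / (ss_within / (n - k))

def permanova(dist, labels, permutations=999, seed=0):
    """One-factor PERMANOVA; returns (R2, pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    ss_total, ss_within = _partition(dist, labels)
    r2 = (ss_total - ss_within) / ss_total
    f_observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        # Small tolerance so that ties are not lost to floating-point rounding.
        if pseudo_f(dist, shuffled) >= f_observed * (1 - 1e-12):
            hits += 1
    return r2, f_observed, (hits + 1) / (permutations + 1)

# Toy data: six samples on a line, three "with" and three "without" the genus.
points = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(a - b) for b in points] for a in points]
toy_labels = ["with"] * 3 + ["without"] * 3
print(permanova(toy_dist, toy_labels))
```

In the toy data the grouping explains almost all of the variation, whereas an R² near .010, as reported above, means that the presence of a genus accounts for about 1% of the total sum of squares.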

| Robustness of microbial networks related to Bifidobacterium and Lactobacillus
Analysis of the entire network constructed using the genus data from the AGP dataset showed that Bifidobacterium and Lactobacillus were not highly connected in the microbial network, suggesting that they were not keystone taxa for the cohort. However, notably, these two genera were connected to the largest cluster via Peptoclostridium and Collinsella; furthermore, Bifidobacterium and Lactobacillus were connected to each other (Figure A8). To further explore the effect … (Figure 4c, p = 7.46 × 10⁻⁹). We then compared the resilience of the networks to disturbance using random node removal to simulate an "attack" on the networks (Mahana et al., 2016). In the absence of either Bifidobacterium or Lactobacillus, the natural connectivity of the microbial network decreased faster than when the genus was present (Figure 4d,e).
In addition, the natural connectivity of the microbial network constructed for the samples containing Lactobacillus but not Bifidobacterium decreased faster than that of the network for samples containing Bifidobacterium but not Lactobacillus (Figure 4f). Node removals ordered by degree and by betweenness gave the same result for natural connectivity (Figure A9). Taken … for Bacteroides and Lachnospiraceae_Other (Figure A10a). The results showed that the ability of Bifidobacterium to sustain the gut microbiota robustness under attack was comparable to that of the most frequently connected genus examined (Figure A10b-d).

| The effect of Bifidobacterium and Lactobacillus on the gut microbiota
We validated the influence of Bifidobacterium and Lactobacillus on the gut microbiota using genus data from the NBT dataset, which were annotated based on reference genomes with a similarity of >85% at the genus level. Due to the sequencing depth, all 1,267 samples showed positive results for the two genera (Table A9 in Appendix 1). Therefore, we divided the samples into two groups, a higher group and a lower group, according to the median value of relative abundance.
Spearman's correlation analysis showed a positive correlation between the relative abundance of the two genera (rho = .449, p < 2.2 × 10⁻¹⁶, Figure 5a). In addition, the samples with higher relative abundances of Bifidobacterium and Lactobacillus showed higher alpha diversities, similar to the result found for the AGP dataset (Figure 5b,c). There was also a significant association between beta diversity and a higher relative abundance of Bifidobacterium or Lactobacillus (Figure 5d,e and Figure A11). Natural connectivity decreased faster in the group with a lower relative abundance of Bifidobacterium or Lactobacillus than in the group with a higher relative abundance, though the difference was not as noticeable as in the results for the AGP dataset (Figure A12).
… (Table A10 in Appendix 1). For example, the relative abundance of Bifidobacterium was associated with demographic features included in the present study, namely, age, sex, race, and geographical location (Figure 6a-d). In terms of lifestyle, we found that whole-grain, milk, and cheese consumption were associated with an increased abundance of Bifidobacterium, whereas a high frequency of vegetable and fruit consumption was associated with a lower abundance of Bifidobacterium (Figure 6e-h). Breastfeeding in infancy showed a close connection with a higher abundance of Bifidobacterium, even though our cohort consisted of adults (Figure 6j). Notably, a high relative abundance of Bifidobacterium was associated with IBD and recent antibiotic exposure (Figure 6k,l). However, people with IBS, autoimmune disease, and food allergy had a lower relative abundance of Bifidobacterium than did unaffected individuals (Figure 6m,n,p). These results also showed that the relative abundance of Bifidobacterium was not associated with cardiovascular disease or C-section (Figure 6i,o).

| DISCUSSION
We found the following through analysis of the AGP dataset: (1) Bifidobacterium was a common genus, but Lactobacillus was not; (2) the abundances of Bifidobacterium and Lactobacillus were positively correlated, especially at the ASV level; (3) samples containing the two genera showed higher alpha diversity; (4) Bifidobacterium was more helpful than Lactobacillus in sustaining the robustness of the gut microbiota based on the inferred microbial network; (5) demographic features, lifestyle, and diseases were closely connected with the relative abundance of Bifidobacterium.
Dominant taxa with large biomasses or major energy transformations might influence a broad array of processes, such as denitrification or organic matter decomposition (Banerjee et al., 2018). Based on the results of our analysis, Bifidobacterium had a higher relative abundance and a wider prevalence than Lactobacillus, suggesting a stronger influence on gut microbiota processes. The Bifidobacterium-mediated effect is an important issue that needs to be addressed in relation to strain-specific beneficial properties (Presti et al., 2015). Although we explored each ASV to improve classification accuracy, the lengths of the sequenced amplicons made it difficult to classify them at the species level. Nevertheless, our results suggested that the most abundant ASV belonging to Bifidobacterium (Bifidobacterium_1) showed high identity to B. longum, B. adolescentis, and B. breve, which are frequently used probiotics.
Our results suggested that the relative abundance of Bifidobacterium increased when Lactobacillus was present. The cooccurrence network and the NBT dataset also showed a close correlation between these two genera. These observations suggest that cooperation may exist between these two genera. This relationship may explain why multistrain probiotics appear to show greater efficacy than single-strain probiotics (Chapman, Gibson, & Rowland, 2011). However, other factors could lead to the same observation, such as taking probiotics and dairy products containing Bifidobacterium and Lactobacillus. Cross-feeding interactions have been studied between selected strains of Bifidobacterium/Lactobacillus and butyrate-producing bacteria that consume lactate (Moens, Verce, & De Vuyst, 2017). Our results suggest that the positive correlation between Bifidobacterium/Lactobacillus and butyrate-producing bacteria may reflect one of the beneficial roles played by these two genera in the host.
FIGURE 3 Alpha diversity and beta diversity of the 1,836 samples. Shannon index (a) and Simpson index (b) for the four groups. Statistical tests were performed using the Wilcoxon rank-sum test. PCoA was based on Bray-Curtis dissimilarity considering the presence of Bifidobacterium (c), Lactobacillus (d), and the number of these two genera (e). *: p < .001 (PERMANOVA, permutation = 9,999)
The present study confirmed that the presence of these two genera is associated with higher alpha diversity. Interestingly, Bifidobacterium has a strong effect on the alpha diversity of the gut microbiota through mechanisms that may include starch-degrading activity (Ryan, Fitzgerald, & van Sinderen, 2006). Moreover, our results suggested that Bifidobacterium and Lactobacillus are not only associated with alpha diversity but may also be related to the microbial structure. A previous study indicated that the fish gut microbiota was less affected by spatial differences resulting from environmental factors via increases in the abundance of a certain strain (Giatsis et al., 2016). This finding indicates that some types of bacteria may help sustain the robustness of the gut microbiota. Indeed, according to the results of our present study, Bifidobacterium helps sustain global network connectivity.
Bifidobacterium helps the microbiota resist the effects of other factors, such as a high-fat diet and antibiotics (Kristensen et al., 2016). Moreover, comparison with another six genera supported the important role of Bifidobacterium in the gut microbiota. Microbial keystone taxa are highly connected taxa that, individually or together, exert considerable influence on microbiome structure and function (Banerjee et al., 2018). Nonetheless, Bifidobacterium did not exhibit high connectivity with other genera, indicating that it may not be a keystone taxon.
However, according to Angulo's study, manipulation of driver species, which are not always highly interconnected, may control the entire network (Angulo, Moog, & Liu, 2019). Therefore, Bifidobacterium and Lactobacillus might be potential drivers of the bacterial network. In addition, the role of Peptoclostridium and Collinsella in the gut microbiota still needs to be explored, as these genera were the only two found to be closely connected with Bifidobacterium and Lactobacillus.
Considering the increasing global incidence of many diseases, changes in lifestyle and diet have been proposed to contribute to disease emergence by altering gut microbial ecology (Blaser, 2006), and many strains of Bifidobacterium have been used to improve health. However, it is uncertain whether intake of Bifidobacterium strains can ameliorate the symptoms of conditions such as IBS (Cozmapetruţ et al., 2017), allergy (Mennini et al., 2017), and diarrhea (Laursen et al., 2017), even in clinical trials. These findings suggest that the association between disease and Bifidobacterium is questionable. In the present study, we found that the relative abundance of Bifidobacterium is under the influence of demographic features. Indeed, it has been reported that age, geography, and ethnic origins are factors that influence the abundance of Bifidobacterium (Deschasaux et al., 2018; Kato et al., 2017). In terms of lifestyle, we observed that higher consumption of whole grains and dairy products was associated with a higher abundance of Bifidobacterium in the gut microbiota (Martinez et al., 2013). However, C-section did not appear to influence the abundance of Bifidobacterium in adults, even though it is associated with Bifidobacterium colonization in infants (Hesla et al., 2014). This finding suggests that a lifelong effect of C-section on Bifidobacterium is unlikely. The decreased abundance of Bifidobacterium related to higher consumption of vegetables and fruits may be due to other factors not included in the present study, which is a limitation of this work. A small sample number may be another factor leading to this unexpected result (Table A1 in Appendix 1). Surprisingly, exposure to antibiotics was associated with an increased relative abundance of Bifidobacterium, a finding that needs to be investigated further.
One plausible explanation for this increase could be the use of probiotics, considering that Bifidobacterium_1 showed identity to species commonly used as probiotics (Figure A3); however, this information was not included in the metadata. Increased relative abundance of Bifidobacterium in the gut microbiota may be helpful for controlling IBS (Han, Wang, Seo, & Kim, 2017), autoimmune disease (Uusitalo et al., 2016), and food allergy (Mennini et al., 2017), as the relative abundance of Bifidobacterium was lower in patients with these conditions than in unaffected individuals. However, all these results, together with those presented here, are mostly correlational; the relationship between Bifidobacterium and human diseases, and whether Bifidobacterium could be a treatment option, still need to be established.
… the background information was not sufficiently detailed to allow a solid conclusion to be drawn, with some ambiguous information; many factors influence the relative abundance of Bifidobacterium, which makes it difficult to interpret the results of the association between …
FIGURE 6 Predicted relationships between Bifidobacterium abundance and host features based on the ZINB model. The overall fitted mean proportions (%) of Bifidobacterium and age (a); sex (b); race (c); geographical location (d); whole-grain consumption (e); vegetable consumption (f); fruit consumption (g); milk and cheese consumption (h); C-section (i); fermented plant consumption (i); feeding patterns (j); antibiotic exposure (k); IBD (l); IBS (m); autoimmune disease (n); cardiovascular disease (

| CONCLUSIONS
In summary, our results showed a close connection between Bifidobacterium and Lactobacillus. The genus Bifidobacterium was important for the diversity and robustness of the gut microbiota.
Increasing the intake of whole grains and dairy products may be a good way to increase the abundance of Bifidobacterium.

ACKNOWLEDGMENTS
We thank Daniel McDonald, the American Gut Project manager, for his suggestions on the present study. We also thank Yongfei Hu from China Agricultural University for his helpful comments.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTIONS
Yuqing Feng contributed to conceptualization; Yuqing Feng, Na Lyu, Fei Liu, and Shihao Liang contributed to formal analysis; Baoli Zhu contributed to funding acquisition; Yuqing Feng and Yunfeng Duan contributed to writing-original draft preparation; Zhenjiang Xu contributed to writing-review and editing.

ETHICAL APPROVAL
None required.

DATA AVAILABILITY STATEMENT
All data used for this paper are available at ebi.ac.uk/ena (accession #ERP012803) for the AGP dataset and meta.genomics.cn/meta/data-

TABLE A3 Workflow of American Gut Project data processing

Steps    Contents
Step 1   Downloaded 19,327 samples (25 Jan. 2018)
Step 2   Excluded non-fecal samples: 15,259 fecal samples left
Step 3   12,127 fecal samples with over 10,000 reads
Step 4   11,744 samples passed the quality control of DADA2
Step 5   Deleted ASVs with a distribution of less than 1% or not belonging to bacteria
Step 6   Deleted blooming bacteria
Step 7   Excluded samples with diseases