Keywords:

  • faecal bacteria;
  • 16S rRNA;
  • correlation network;
  • within-individual variation

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Materials and methods
  5. Results
  6. Discussion
  7. Acknowledgements
  8. References
  9. Supporting Information

In the last decade, an extensive effort has been made to characterize the human intestinal microbiota by means of SSU rRNA gene sequence and metagenomic analysis. Relatively few studies have followed intestinal bacterial communities over time to assess their stability in the absence of perturbation. In this study, we have monitored the faecal bacteria of three healthy subjects during 15 consecutive days. The global community structure was analysed through SSU rRNA gene sequencing. In agreement with previous studies, we found that the between-subject variation in community structure was larger than the within-subject variation. The composition was fairly stable throughout, although daily fluctuations were detected for all genera and phylotypes at 97% of sequence identity. While the core shared between subjects was very small, each subject harboured a stable high-abundance core composed of a small number of bacterial groups (9% of the phylotypes accounted for between 74% and 93% of the sequences). This may suggest that studies aimed at linking the microbiota composition with disease risk should be limited to the numerically most dominant phylotypes, as the rest appear transient. Networks of potential interactions between co-occurring genera were also subject-specific, even for the same bacterial genus, which might reflect host-specific selective pressures and historic events.


Introduction

The human large intestine is colonized by complex microbial communities with several hundreds of species and the highest cell densities of all human-associated microbiota. Functions of the gut microbiota include assistance in nutrient digestion, colonization resistance to pathogens and regulation of host metabolism and immune system (Guarner & Malagelada, 2003; Turnbaugh et al., 2006; Kelly et al., 2007; Neish, 2009; Qin et al., 2010). The gut has been traditionally considered sterile at birth and rapidly colonized by bacteria of maternal origin, but recent studies have shown that meconium, the earliest stools of a newborn, composed of materials ingested during the stay in the uterus, is nonsterile (Jiménez et al., 2008). Over the first years of life, an ecological succession takes place that culminates in the complex adult pattern (Favier et al., 2002; Palmer et al., 2007). Then, it is generally accepted that, in the absence of perturbation, the adult human gut microbiota is composed of stable climax communities inhabiting the different niches found along and across the intestine.

Many factors are known to influence the structure of the microbiota of adult humans, such as host genotype (Ley et al., 2005; Khachatryan et al., 2008; Benson et al., 2010), disease (Turnbaugh et al., 2006; Qin et al., 2010), diet (Harmsen et al., 2000; De Filippo et al., 2010) and stochastic events such as colonization history (Mulder et al., 2009) and external disturbances (De La Cochetiere et al., 2005; Dethlefsen & Relman, 2010; van Vliet et al., 2009). Modifications of the gut microbiota related to physiological changes associated with ageing have also been reported in older people (Woodmansey, 2007; Claesson et al., 2010). However, little is known about the short- and long-term effects that these factors may have on the composition of the microbiota. To assess the response of the gut microbiota to external perturbations or its role in pathologic states, it is important to know first the ‘normal’ temporal dynamics of this ecosystem.

Human faecal communities have been considered stable because the temporal variation found within individuals is smaller than between individuals. The stability has typically been examined with several samples collected at intervals of weeks or months, finding that the microbiota of healthy individuals remains fairly constant over these long time intervals (Franks et al., 1998; Zoetendal et al., 1998; Vanhoutte et al., 2004). However, some variation has been found in the abundance of certain taxa, while others remain more constant (Vanhoutte et al., 2004). Recently, studies of the temporal variation using shorter time periods confirmed the stability of the community composition at a lower phylogenetic level but also revealed a high degree of dynamics, with relatively large fluctuations around the average level (Dethlefsen et al., 2008; Dethlefsen & Relman, 2010; Caporaso et al., 2011).

Ecological interactions between the members of gut communities, like nutritional associations, niche adaptation, growth stimulation, resource competition, and interference mechanisms, also contribute to the shape and stability of these ecosystems. Despite its importance, little is known about the relationships between gut bacteria. A first step for the study of these interactions is searching for patterns of co-occurring bacteria. This can be carried out with cross-sectional studies (see e.g. Arumugam et al., 2011) or with longitudinal ones as well (this study). The latter are helpful to control for the high level of interpersonal variation in the community assembly as the same ecosystem is followed over time.

We are interested in the short-term intrinsic temporal dynamics of human intestinal microbiota, as well as in the potential interactions between the community members. The objectives of the present study were to analyse the daily variation in the community structure of faecal bacteria from three healthy subjects over 2 weeks and to assess correlations between shifts in the relative abundance of specific bacteria to reconstruct potential interaction networks.

Materials and methods

Subjects' characteristics

Three White male volunteers (hereinafter referred to as individuals A, B and C) provided faecal samples daily over a period of fifteen consecutive days. All participants gave prior informed written consent to the study protocol, which was approved by the Ethics Committee of Hospital Universitario La Fe (Valencia, Spain). A, B and C were 40, 39 and 29 years old, respectively. A was of normal weight, whereas B and C were class I obese according to the body mass index (BMI; 30 ≤ BMI < 35). None had a history of gastrointestinal disease or systemic comorbidities, or had received recent (in the last 3 months) treatment with antibiotics, immunomodulating therapy, antidiarrhoeal medication or laxatives. They followed a Mediterranean diet that remained unchanged throughout the follow-up.

Sample collection and DNA extraction

Faecal samples were self-collected by the volunteers in the morning in tubes containing phosphate-buffered saline (PBS) and stored in their home freezers until they were handed in. Afterwards, samples were stored at −80 °C until further processing. Before DNA extraction, samples were resuspended in PBS and centrifuged at low speed to remove faecal debris as far as possible. Then, DNA was extracted from supernatants using the QIAamp DNA Stool Mini Kit (QIAGEN) and its protocol for isolation of DNA for pathogen detection.

PCR amplification and sequencing of bacterial 16S rRNA genes

The 16S rRNA genes were amplified using the broad range bacterial primers B8F (5′-AGAGTTTGATCMTGGCTCAG-3′) and B357R (5′-TGCTGCCTCCCGTAGGAGT-3′) modified with the 454 Life Sciences adaptors A and B, respectively, and with barcodes to tag each PCR product. The PCR conditions were 5 min of initial denaturation at 95 °C followed by 20 cycles of denaturation (30 s at 95 °C), annealing (30 s at 52 °C) and elongation (30 s at 72 °C). PCR products were purified by filtration, and equal amounts of the PCR products from different samples were pooled. The mixtures were sent for pyrosequencing with primer A on a 454 sequencer using the GS FLX chemistry for samples of subject A and the GS FLX Titanium chemistry for samples of subjects B and C (Roche). This was due to the change in the sequencing chemistry while the study was ongoing.

Sequence analysis

Sequences with low-quality scores (< 20) and short read lengths (< 200 nt for samples of subject A and < 250 nt for samples of subjects B and C) were removed. The remaining sequences were checked for potential chimeras using the chimera.slayer tool incorporated into the mothur software (Schloss et al., 2009). The taxonomic affiliation of the sequences was determined using the classifier tool of the Ribosomal Database Project-II (RDP) with a bootstrap cut-off of 70% (Cole et al., 2007, 2009). Sequences were clustered at 97% of sequence identity using the cluster tool of the usearch 5.0 package (Edgar, 2010). Input sequences to cluster were previously sorted by decreasing abundance as recommended for 16S sequences. The resulting phylotypes were used to study sample composition at the ‘species’ level.
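
The read-filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the source does not state how the quality threshold of 20 was applied, so applying it to the mean per-base quality of a read is an assumption here, and the function name `keep_read` and the example reads are hypothetical.

```python
from statistics import mean


def keep_read(seq, quals, min_len, min_mean_q=20):
    """Return True if a read passes the length and quality filters.

    Assumption: the quality cut-off is applied to the mean per-base score.
    """
    if len(seq) < min_len:
        return False
    return mean(quals) >= min_mean_q


# Hypothetical reads: (sequence, per-base quality scores)
reads = [
    ("ACGT" * 60, [30] * 240),  # 240 nt, good quality
    ("ACGT" * 70, [15] * 280),  # 280 nt, low quality
    ("ACGT" * 40, [35] * 160),  # 160 nt, short
]

# Minimum length was 200 nt for subject A and 250 nt for subjects B and C.
kept_a = [keep_read(s, q, min_len=200) for s, q in reads]
kept_bc = [keep_read(s, q, min_len=250) for s, q in reads]
```

With these thresholds, a 240 nt read that would be kept for subject A is discarded for subjects B and C, which is one reason the text treats between-subject comparisons with some caution.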

Statistical analyses

Exploratory statistical analyses

We assessed the differences between and within subjects according to the bacterial composition using correspondence analysis (CA) and analysis of similarities (anosim) at genus and phylotype levels (Quinn & Keough, 2002; pp. 459 and 484, respectively) as implemented in the vegan package (Oksanen et al., 2010) for the statistical environment R (R Development Core Team, 2010). We also computed the Chao1 index of richness and the Shannon index of diversity for each of the 45 samples at both genus and phylotype levels. This was performed with the QIIME pipeline (Caporaso et al., 2010), resampling 1000 times from each daily sample using the smallest sample size of all samples.
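
As an illustration of the two indices and of the resampling to a common depth, the following Python sketch may help. It is not the QIIME implementation: the bias-corrected form of Chao1, the natural logarithm in the Shannon index, the helper name `rarefied_mean` and the example counts are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefied_mean(counts, depth, index, n_iter=1000, seed=0):
    """Average an index over random subsamples of `depth` sequences."""
    rng = random.Random(seed)
    # One entry per sequence, labelled by the taxon it belongs to.
    pool = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    values = []
    for _ in range(n_iter):
        subsample = Counter(rng.sample(pool, depth))
        values.append(index(list(subsample.values())))
    return sum(values) / n_iter


# Hypothetical counts of sequences per taxon in one sample (102 sequences)
counts = [50, 30, 10, 5, 2, 2, 1, 1, 1]
richness = chao1(counts)
mean_h = rarefied_mean(counts, depth=60, index=shannon, n_iter=100)
```

Subsampling every sample to the same depth before computing the indices keeps samples with very different sequencing effort comparable, which matters here because the three subjects differed several-fold in sequences per sample.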

Multivariate time series modelling

From a statistical point of view, the data on daily taxa abundances can be regarded as a multivariate time series. The potential interactions between taxa are expected to be reflected in the correlations between taxa, but some temporal correlation is also expected to be present in the data. In other words, if Y = (Yit) represents the matrix with the number of sequences of taxon i = 1,…, n found in the sample collected on day t = 1,…, T for a given individual, both rows and columns present correlation structures of a different nature. Abundances in a given row are likely to be affected by temporal correlation, whereas values in a specific column may be subject to the correlations generated by the underlying interactions between taxa. To model both correlation structures simultaneously, we applied a Bayesian hierarchical model to the follow-up data for each individual.

Our model specification is as follows. Let Yt = (Y1t,…, Ynt)′ be the taxonomic distribution of sequences on day t. Our model first assumes that Yt follows a multinomial distribution

  $$Y_t \mid N_t, \pi_t \sim \mathrm{Multinomial}(N_t, \pi_t)$$

where Nt is the total number of sequences on day t and πt = (π1t,…, πnt)′, πit being the unknown proportion in which taxon i is present in the community on day t. The proportions πit are in turn decomposed, on the log-odds scale, into

  $$\log\left(\frac{\pi_{it}}{1 - \pi_{it}}\right) = \alpha_i + \nu_{it} + \varepsilon_{it}$$

where αi is a taxon-specific intercept that picks up the average relative abundance of taxon i over the T = 15 days, and νit and εit are random effects intended to pick up time-structured and unstructured variation, respectively. To this end, we chose a normal prior distribution for εit and a multivariate random walk of order one for νt, t = 1,…, T

  $$\nu_t \mid \nu_{t-1}, \Sigma \sim N_n(\nu_{t-1}, \Sigma), \quad t = 1, \ldots, T$$

where Σ is the n × n variance-covariance matrix between taxa abundances. For convenience, we take ν0 to be the n × 1 zero vector. This conditional specification is a particular case of the intrinsic multivariate conditional autoregressive (MCAR) models (Kim et al., 2001; Gelfand & Vounatsou, 2003), for which the full conditional distribution is

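  $$\nu_t \mid \nu_{-t}, \Sigma \sim N_n\left(\frac{1}{D_{tt}} \sum_{t'} W_{tt'}\, \nu_{t'},\; \frac{1}{D_{tt}}\, \Sigma\right)$$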

that is, νt follows a multivariate normal distribution centred on the average of its temporal neighbours, with a variance-covariance matrix inversely proportional to the number of neighbours. The joint distribution of ν = (ν11,…,νn1, ν12,…,νn2,…, ν1T,…,νnT)′ is a zero-mean multivariate normal distribution with precision matrix Ω = (D − W) ⊗ Σ⁻¹, where W is a T × T matrix with Wtt′ = 1 if time points t and t′ are adjacent and Wtt′ = 0 otherwise, D is a T × T diagonal matrix with Dtt equal to the number of neighbours of time point t (i.e. D11 = DTT = 1 and Dtt = 2 ∀ t = 2,…,T − 1) and ⊗ represents the Kronecker product for matrices. The matrix D − W is singular, which makes this distribution improper. However, with our choice of W and D, Ω satisfies the so-called symmetry condition that ensures propriety of the posterior. In practice, this impropriety is overcome using the proper full conditionals for νt and imposing n sum-to-zero constraints. See for example Banerjee et al. (2004, pp. 247–251) for further details.
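
The structure of the precision matrix can be made concrete with a short Python sketch. It builds D − W for T = 15 days and shows why the matrix is singular: every row sums to zero, so a constant vector lies in its null space. The function names `rw1_structure` and `kron` and the 2 × 2 matrix standing in for Σ⁻¹ are hypothetical and chosen only for illustration.

```python
def rw1_structure(T):
    """Return the T x T matrix D - W for a first-order random walk in time."""
    # W[t][s] = 1 when days t and s are adjacent, 0 otherwise.
    W = [[1 if abs(t - s) == 1 else 0 for s in range(T)] for t in range(T)]
    # D is diagonal, with the number of neighbours of each day.
    D = [[sum(W[t]) if t == s else 0 for s in range(T)] for t in range(T)]
    return [[D[t][s] - W[t][s] for s in range(T)] for t in range(T)]


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


T = 15
Q = rw1_structure(T)
diag = [Q[t][t] for t in range(T)]    # 1 at both ends, 2 elsewhere
row_sums = [sum(row) for row in Q]    # all zero, hence D - W is singular

# Toy precision structure for n = 2 taxa over 3 days (values are made up).
sigma_inv = [[2.0, -1.0], [-1.0, 2.0]]
omega = kron(rw1_structure(3), sigma_inv)  # 6 x 6 matrix
```

The zero row sums are what the sum-to-zero constraints mentioned above compensate for: without them, the overall level of each νi would not be identifiable separately from the intercept αi.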

We fitted our model using Markov chain Monte Carlo (MCMC) simulation techniques as implemented in the WinBUGS software (Lunn et al., 2000) and the R2WinBUGS package (Sturtz et al., 2005) for the R statistical software (R Development Core Team, 2010). We ran two chains with 50 000 iterations, discarded the first 10 000 as burn-in and kept every 40th to reduce autocorrelation in the chains. Therefore, inference for each parameter is based on a thinned sample of size 2000 from its posterior distribution.
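
The size of the thinned posterior sample follows directly from these settings, as the following small Python check shows. The function name `retained_draws` is hypothetical, and the sketch assumes that thinning keeps one draw every 40 iterations after burn-in; the exact bookkeeping in WinBUGS may differ.

```python
def retained_draws(n_chains, n_iter, burn_in, thin):
    """Number of posterior draws kept after burn-in and thinning, over all chains."""
    kept_per_chain = len(range(burn_in, n_iter, thin))
    return n_chains * kept_per_chain


total = retained_draws(n_chains=2, n_iter=50_000, burn_in=10_000, thin=40)
print(total)  # 2000
```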

Exploring putative interactions between taxa with Graphical Gaussian Networks (GGN)

GGNs (Schäfer & Strimmer, 2005a,b) have been used to recover gene regulation network structures using gene expression data as input. They aim at predicting interaction networks between genes. Here, however, we apply GGNs for the first time to explore patterns of association between taxa using the partial correlations between their abundance profiles. A strong partial correlation between two species is indicative of some form of association. Estimating the partial correlation matrix is difficult because it is typically sparse and of large dimension. GGNs, however, allow the partial correlation matrix to be estimated efficiently from the variance-covariance matrix. We therefore applied GGNs to the covariance matrix Σ (which measures covariances between taxa abundances on the log-odds scale) obtained with the above Bayesian model to infer a network of potential associations between taxa. This was performed using the R package corpcor (Schäfer et al., 2010). The statistical significance of the estimated partial correlations was assessed using the algorithms proposed by Opgen-Rhein & Strimmer (2007). The idea is to model the partial correlations with a mixture of two components. The first component tries to capture the null partial correlations, whereas the second component is intended to pick up the sizeable ones. Opgen-Rhein & Strimmer (2007) suggest a method to estimate the mixture components and also a ‘local false discovery rate’ (LFDR) procedure to assess the statistical significance of each partial correlation. They show that their methods perform very well both in simulations and in applications to large-scale real expression data in the context of gene association networks. We applied these methods as implemented in the R package GeneNet (Schaefer et al., 2009). These analyses were carried out for each individual separately.
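
The relationship between a covariance matrix and its partial correlations can be sketched in plain Python. This is only the textbook identity, rho_ij = −p_ij / sqrt(p_ii p_jj) with P the inverse of the covariance matrix, and not the shrinkage-based estimation provided by corpcor; the function names and the 3 × 3 example are hypothetical.

```python
import math


def invert(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity matrix.
    A = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]


def partial_correlations(cov):
    """Partial correlation matrix from a covariance matrix via its inverse."""
    prec = invert(cov)
    n = len(cov)
    return [[1.0 if i == j
             else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]


# Toy example: taxa 0 and 2 are linked only through taxon 1.
precision = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
cov = invert(precision)
pcor = partial_correlations(cov)
```

In this toy case the covariance between taxa 0 and 2 is non-zero, yet their partial correlation vanishes once taxon 1 is accounted for. This is the sense in which partial correlations point to direct associations rather than indirect ones.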

The output is a graph, with nodes corresponding to taxa and edges representing statistically significant partial correlations between taxa (taking as such that the probability, 1-LFDR, for the partial correlation to be different from zero is above 0.95). Graphics were generated with the R package Rgraphviz (Gentry et al., 2008).

The nonredundant sequences from this study have been deposited in the GenBank database under the accession numbers JN547818–JN555583.

Results

After sequencing all 45 samples, we obtained an average of 1500 sequences per sample for A, 4165 for B and 6510 for C. The average read length was approximately 250 nt in A and approximately 350 nt in B and C. The difference in read length is the result of the slightly different sequencing technologies employed for subject A and subjects B and C, but in both cases, the variable regions V1 and V2 of the 16S rRNA gene were covered. The comparison between subjects could be affected by the potential bias introduced during the PCR by the different sequencing adaptors or by the impact of the read length on the taxonomic assignment. However, we think that the impact on the estimate of the variation between subjects would be small, and that there would be none on the analysis within each subject, which is our main objective.

The Chao1 index of richness indicated a good coverage for most samples at genus level (Table S1). The coverage values decreased at phylotype level, with larger differences between observed and expected number of OTUs (Table S2).

Between-subject variation

Most members of the sampled communities belonged to a small number of genera within the Bacteroidetes phylum (61% of the sequences on average in samples from A, approximately 86% in B and C) and the Firmicutes phylum (26% in A, approximately 10% in B and C). The composition of the faecal microbiota of B and C was quite similar from phylum to genus levels. At the genus level, samples from A were dominated by Alistipes (23%), Bacteroides (22%), several genera of Porphyromonadaceae such as Barnesiella (8%) and Parabacteroides (2%), and several genera of Ruminococcaceae such as Faecalibacterium (4%) and Oscillibacter (2%; Fig. 1a). Samples from B and C were dominated by Prevotella (76% in B and 72% in C), Bacteroides (5% in B and 12% in C) and beta-proteobacteria of the genus Sutterella (approximately 3%; Fig. 1b and c).

Figure 1. Daily fluctuations in the relative abundance of the main bacterial genera (average abundance ≥ 5‰) in subjects A (a), B (b) and C (c). Daily variation in the bacterial diversity (Shannon index) at genus and phylotype levels (d).

At a finer taxonomic scale (phylotypes defined at 97% of sequence identity), the faecal microbiota was highly specific to each individual. About half of the phylotypes detected in each subject were exclusive to him, whereas 20% were shared with the other two subjects. The greater similarity between B and C was also seen at this level, as they had more phylotypes in common than each of them with A. Most of the sequences were concentrated in a small fraction of the phylotypes. Thus, 11% of the phylotypes in A and 4% of the phylotypes in B and C had a relative abundance of at least 5‰, and all together accounted for 86%, 80% and 89% of their respective number of sequences. The most prevalent genera in A, Alistipes, Bacteroides and Barnesiella, had 9%, 10% and 3% of the total number of phylotypes, whereas only 16% of the phylotypes in B, and 10% in C, belonged to Prevotella.

Between-subject differences in the community structure were larger than dissimilarities between samples from the same subject. Therefore, CA plots clearly discriminated the samples of the three subjects at genus and phylotype levels (Fig. 2). The first CA axis separated A from B and C, and the second separated B from C. The ANOSIM found significant differences between subjects (genus level: R = 0.686, P = 0.001; phylotype level: R = 1, P = 0.001). At genus level, the between-subject median rank of distances between samples was 1.5 times that of within A and 4 times that of within B and C. At phylotype level, between-subject variation was 2.6, 4.5 and 8.6 times that of within A, B and C, respectively, thus reflecting even larger differences in microbiota composition between individuals at this finer taxonomic resolution.
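
For readers unfamiliar with the ANOSIM statistic, the following Python sketch computes R from a distance matrix and a grouping of samples, using the usual definition based on mean ranks of between-group and within-group distances. It is not the vegan implementation, it omits the permutation test that yields the P value, and the function names and the four-sample example are hypothetical.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    m = len(pairs)  # number of pairwise distances
    return (sum(between) / len(between) - sum(within) / len(within)) / (m / 2)


# Toy example: two samples per subject, placed on a line.
positions = [0, 1, 10, 11]
groups = ["A", "A", "B", "B"]
dist = [[abs(p - q) for q in positions] for p in positions]
r_stat = anosim_r(dist, groups)
```

In this toy case every between-group distance exceeds every within-group distance, so R equals 1, the value reported above at the phylotype level.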

Figure 2. Correspondence analysis at genus (a) and phylotype (b) levels. Percentages correspond to the fraction of inertia explained by each axis.

Within-subject variation

The communities experienced daily fluctuations in the bacterial diversity according to the Shannon index (Fig. 1d and Tables S1 and S2), although these values varied around a constant level. This is consistent with the fact that the structure of the sampled faecal communities remained quite stable over time. Relative abundances showed daily fluctuations for all genera but no temporal trends were observed (Fig. 1c). All genera with an abundance of at least 5‰ were permanent members of the communities, whereas genera with prevalence below that threshold were sometimes lost and recovered, although some were constantly present too. The taxa with the highest prevalence showed the lowest relative fluctuations in time. In A, Alistipes, Bacteroides and Barnesiella experienced an average daily relative change of 2.9%, 3.8% and 6.4% in their abundances, respectively, whereas in B and C, the average daily relative variation in Prevotella was approximately 1.8%.

At a subgeneric level, a reduced number of phylotypes were detected throughout the study within each subject (Fig. 3a, which shows the number of phylotypes detected on just 1 day, on 2 days, etc.). The individual-specific core (phylotypes detected on every day of the study period) comprised 38 phylotypes in A, 56 in B and 44 in C, which corresponded to approximately 9% of the total number of phylotypes within each subject. Conversely, most of the phylotypes detected within each subject were found on only a few days (Fig. 3a). The core phylotypes belonged to the most prevalent genera and accounted for most of the sequences in the samples (74% of the total number of sequences in A, 90% in B and 93% in C; Fig. 3b, which shows the cumulative percentage of sequences belonging to phylotypes detected on just 1 day, on 2 days, etc.). Again, phylotypes showed fluctuations in their abundance over time, but the trend was constant. Only 0.2% of the phylotypes were detected in all subjects on all days during the follow-up, while 2.6% of the phylotypes were simultaneously detected throughout in the more similar subjects B and C. With a less restrictive definition of the bacterial core, we found that 0.5% of the phylotypes were detected in all three subjects on at least 13 out of 15 days.
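
The definition of the individual-specific core can be expressed as a short Python sketch: a phylotype belongs to the core if it has at least one sequence on every sampled day. The function name `core_summary` and the three-day table are hypothetical, and pooling sequences over all days to compute the core's share is an assumption of this example.

```python
def core_summary(table):
    """Summarize the core of one subject.

    `table` maps each phylotype to its list of daily sequence counts.
    Returns the core phylotypes, their fraction of all phylotypes and
    their fraction of all sequences (pooled over days).
    """
    core = [p for p, counts in table.items() if all(c > 0 for c in counts)]
    total_seqs = sum(sum(counts) for counts in table.values())
    core_seqs = sum(sum(table[p]) for p in core)
    return core, len(core) / len(table), core_seqs / total_seqs


# Hypothetical counts for four phylotypes over three days
table = {
    "otu1": [90, 85, 95],
    "otu2": [5, 0, 3],
    "otu3": [0, 2, 0],
    "otu4": [5, 13, 2],
}
core, frac_phylotypes, frac_sequences = core_summary(table)
```

Here two of four phylotypes form the core but hold 290 of the 300 sequences, mirroring on a small scale the pattern reported above, where about 9% of the phylotypes accounted for 74% to 93% of the sequences.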

Figure 3. Community structure at phylotype level in subjects A, B and C. Occurrence of phylotypes during the 15-day study period, that is, number of phylotypes detected on just 1 day, on 2 days, etc. (a). Combined average relative abundance of phylotypes detected on just 1 day, on 2 days, etc., computed over all samples where they occur (b).

Correlations between co-occurring genera

Using GGN, we built interaction networks from the statistically significant partial correlations between genera estimated with the Bayesian model. These networks are represented in a graph, where nodes correspond to genera and edges represent interactions (Fig. 4 shows the subgraphs of the networks including the statistically significant correlations). Most genera showed correlations with a small number of other genera, while a small number of genera were correlated with many. The main components of the faecal communities scarcely correlated with other genera. For instance, Alistipes correlated (positively) only with Barnesiella in A. Prevotella was negatively correlated with Alistipes in B, and with no genus in C. Bacteroides, the second genus in abundance, was correlated with five, three and two genera in A, B and C, respectively. The most connected genera were Paraprevotella in A, Faecalibacterium, Alistipes and Odoribacter in B, and Coprococcus, Escherichia/Shigella and Blautia in C. No significant correlations were found for many of the genera in the samples.

Figure 4. Gaussian graphical networks representing the interactions between genera within subjects A (a), B (b) and C (c). Red and blue edges represent positive and negative partial correlations, respectively. Yellow nodes are those of genera with an average relative abundance ≥ 5‰. Genera for which no significant partial correlations were found are not shown in the networks.

The sign of the correlation did not depend on the degree of phylogenetic relatedness. Both positive and negative correlations were found between closely and distantly related genera. For example, in subject A, Paraprevotella (Bacteroidetes-Bacteroidia-Bacteroidales-Prevotellaceae) had a positive correlation with Odoribacter and a negative correlation with Barnesiella, two genera within Bacteroidetes-Bacteroidia-Bacteroidales-Porphyromonadaceae. In subject C, Butyricimonas (Bacteroidetes-Bacteroidia-Bacteroidales-Porphyromonadaceae) had a positive correlation with Blautia and a negative correlation with Coprococcus, two genera within Firmicutes-Clostridia-Clostridiales-Lachnospiraceae.

The patterns of correlations were highly subject-specific, even for B and C, in which the community structure was quite similar. Typically, the same genus correlated with different genera in different individuals. For example, in subject B, Bacteroides co-occurred with Odoribacter, while in subject C, it co-occurred with Parabacteroides and Alistipes. Only five pairwise correlations were found in more than one individual: Bacteroides-Odoribacter (A and B), Bacteroides-Parabacteroides (A and C), Odoribacter-unclassified Proteobacteria (A and B), Coprococcus-Ruminococcus (A and C) and Blautia-unclassified Lachnospiraceae (A and C). Furthermore, nine pairwise correlations were positive in one subject but negative in another: Blautia-Coprococcus, Blautia-Dorea, Blautia-Ruminococcus, Eubacterium-Ruminococcus, Coprococcus-Ruminococcus, Coprococcus-unclassified Porphyromonadaceae, Bacteroides-Subdoligranulum, unclassified Bacteroidales-unclassified Porphyromonadaceae and Odoribacter-unclassified Veillonellaceae.

We did not find a relationship between the diversity of the community and the complexity of the associated interaction network. Even though the bacterial community of A was the most diverse at genus level (Shannon index 2.3), and those of B and C harboured a similar diversity (Shannon index 1.1), the highest number of genera in the network and the largest average number of links per genus were observed in B and the lowest ones in C.

Discussion

In this study, we have monitored the faecal bacterial communities of three subjects during fifteen consecutive days. We analysed the daily variation in the composition of the microbiota and the degree of stability of specific members of the community. The analysis of the time series also revealed the patterns of correlations between the abundance profiles of specific bacterial groups, thus indicating potential cooperative and competitive interactions between them.

In accordance with current knowledge of the human gut microbiota, we found that the faecal microbiota is host-specific and fairly stable in the absence of perturbation, the within-subject variability being much smaller than between subjects (Franks et al., 1998; Zoetendal et al., 1998; Vanhoutte et al., 2004; Dethlefsen & Relman, 2010; Caporaso et al., 2011). The level of prevalence of the predominant members of the faecal communities of our three study subjects was sustained over time, with abundances fluctuating around constant average values. This also held true for less abundant members, although some of them were not persistently found in our sampling. Additionally, a considerable number of bacterial groups were detected at very low abundance in one or few samples across the time series, probably representing transient members of the communities. Similar observations were made at genus and phylotype (97% of identity) levels.

Two previous studies have examined the daily variation of the faecal microbiota using high-throughput sequencing technologies to study the temporal dynamics at subgeneric levels. Dethlefsen & Relman (2010) monitored the response of faecal microbiota to antibiotic perturbation in three subjects and took daily samples in the periods surrounding each antibiotic course. They found that just a few OTUs maintained uniform abundance, even in the antibiotic-free intervals. Caporaso et al. (2011) sampled three body sites daily, including faeces, in two healthy subjects for 15 and 6 months, respectively. They reported that only a small fraction of the OTUs within a single body site was present across all or nearly all time points, and from this, the authors conclude that a minimal temporal core exists. In a shorter temporal window, Booijink et al. (2010) detected morning–afternoon variation in human ileal microbiota that exceeded the fluctuations between samples collected at the same time point of the day. We found that only a small fraction of the 97% OTUs in our samples was systematically present throughout the follow-up (9% of the total number of OTUs within each subject). However, as their combined abundance is high (74%, 90% and 93% in subjects A, B, and C, respectively), we consider that a high-abundance core faecal microbiota exists also at the subgeneric level. In our view, this evidence is indicative of a dynamic ecosystem with a stable core. These findings may also suggest that studies aimed at linking the prevalence of specific phylotypes with environmental exposures and/or disease risk should be limited to the numerically most dominant phylotypes, as the rest appear transient.

The bacterial communities sampled in this study differed not only on the basis of their composition, but also in the correlation patterns found between their members. Only a few of the correlations were shared between subjects. The vast majority of genera usually correlated with others that varied between subjects, even when the community composition was quite similar (subjects B and C). Furthermore, correlations of opposite sign were detected for some pairs of genera in different subjects. This disparity of community assemblies could imply that each individual may be offering somewhat different niches to gut bacteria (regarding pH, temperature, secretions, retention time, etc.) in which the same species can establish different microbial interactions, even though the gut environment is overall similar in all subjects. At the same time, the identity of the interacting bacteria could be affected by host selection of commensal/mutualistic microorganisms, as well as by the order in which they arrive in colonization processes and selection by already established microorganisms (Van den Abbeele et al., 2011). Because of the functional redundancy, several microorganisms can potentially occupy a specific niche within the intestinal habitats. The first ones to arrive can become established and then select for cooperating or nonoverlapping microorganisms, as well as exclude competitors. Related to this, Mulder et al. (2009) found a long-lasting impact of early life colonization on the intestinal community composition in pigs. Functional studies would be needed to disentangle the biological meaning of the correlations detected between abundance profiles.

We used a Bayesian model to estimate the covariance matrix between relative abundances of taxa (on the log-odds scale), while accounting for the temporal autocorrelation in these abundances. The posterior mean of the covariance matrix Σ (estimated using all the samples of a given individual) was used to estimate the partial correlation matrix, which was in turn the input for the GGN-based methods to detect associations between taxa. Although these methods cope well with sparse matrices of relatively large dimension, some of the statistically significant associations we found involve taxa with low relative abundances, which makes any biological interpretation difficult.

The persistent diversity and individuality of human gut communities can be explained by a combination of factors that vary between individuals, such as host genotype and diet, but also by less predictable events, such as the colonization history during community assembly and external perturbations with long-term effects on the gut microbiota (Dethlefsen et al., 2006), an example of which is antibiotic treatment. Several studies evaluating its effect on the gut microbiota have found an important loss of diversity followed by a rapid return to the pretreatment community composition. The ability of these communities to recover their original structure suggests the existence of selective forces shaping them (De La Cochetiere et al., 2005; Dethlefsen & Relman, 2010). Also, the microbiota itself is thought to account for some of its diversity through the modification of intestinal niches and the interactions established between its members (Dethlefsen et al., 2006; Van den Abbeele et al., 2011). Our data suggest that specific microbial interactions are set within each individual, which may be an important factor contributing to the interindividual variability and the temporal stability of the gut microbiota.

Recently, Arumugam et al. (2011) identified three main types of microbial assemblies (called enterotypes) in the human gut after clustering faecal samples obtained in different studies based on their species composition or gene pools. Incidentally, we noticed that subjects B and C in our study, the obese ones, can be included in the enterotype enriched in Prevotella. Looking further into Arumugam's analysis of the pyrosequencing-based 16S rRNA gene samples from Turnbaugh et al. (2009), we noticed that 17 of the 20 individuals of the Prevotella-enriched enterotype were obese, whereas 87 of the 134 individuals classified in the other enterotypes were obese. This gives (using the epitools R package; Aragon, 2010) a statistically nonsignificant odds ratio OR = 3.04 (P = 0.1221) of association between this enterotype and obesity. Adding our three subjects to the corresponding cells (19 obese out of 22 in the Prevotella enterotype and 87 of 135 in the other ones), we obtained an OR = 3.47 (P = 0.0498), which begins to be more supportive of a potential link between the Prevotella enterotype and some type of obesity. Further studies are needed to confirm this marginal finding and to assess whether the link is owing to Prevotella itself or whether other genera are involved. Recently, De Filippo et al. (2010) compared the microbiota composition in African and European children and found that the former showed communities rich in Prevotella. They argue that perhaps this genus is more efficient in extracting energy from polysaccharide-rich food. In a low-calorie diet, this may be useful to survive, but otherwise, it may lead to obesity.
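The arithmetic behind these 2 × 2 tables can be sketched as follows. The sketch uses the plain cross-product odds ratio and a two-sided Fisher exact test written with the Python standard library. It is not the epitools computation used in the study, so the values it prints need not match the reported ones exactly; the cross-product estimates come out near 3.06 and 3.49 rather than 3.04 and 3.47, which may reflect a different estimator in epitools. The function names are hypothetical.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value: sum of the probabilities of all
    tables (with the same margins) no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x obese individuals in row 1
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    tail = sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, tail)

# Rows: Prevotella enterotype, other enterotypes; columns: obese, non-obese.
before = (17, 3, 87, 47)   # counts derived from the text: 17/20 and 87/134
after = (19, 3, 87, 48)    # after adding the three subjects: 19/22 and 87/135

or_before, or_after = odds_ratio(*before), odds_ratio(*after)
p_before, p_after = fisher_two_sided(*before), fisher_two_sided(*after)
print(f"before: OR = {or_before:.2f}, Fisher P = {p_before:.4f}")
print(f"after:  OR = {or_after:.2f}, Fisher P = {p_after:.4f}")
```

Adding only three subjects changes the counts in two cells. A result that depends on so few individuals is consistent with the authors' description of the finding as marginal.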

Recent studies showed that, despite the variation in community structure between subjects (Turnbaugh et al., 2009), the normal compositional fluctuations detected over time and the shifts owing to disturbances (Dethlefsen & Relman, 2010), the overall community function seems to be maintained, which is indicative of functional redundancy among individual members and consortia within the gut microbiota. As in the compositional case, fluctuations are likely to be observed when analysing the functionality of the members of the gut microbiota, despite the stability mentioned above. This is because metabolic functions are probably more susceptible to change owing to dynamic factors such as resource availability, external perturbations, stressors or host physiology at the moment of sampling. Longitudinal studies of expression patterns within a subject under different conditions may help to better understand the contribution of gut microbiota to human nutrition and well-being.

Acknowledgements

This work has been partly funded by the grants BFU2009-04501-E, BFU2009-12895-C02-01 and SAF2009-13032-C02-01 from Ministerio de Ciencia e Innovación, Spain, and Prometeo/2009/092 from Generalitat Valenciana to A.M. A.D. and N.J. are recipients of a fellowship from Instituto de Salud Carlos III, Spain. The authors thank all three volunteers for participating in this study.

References

Supporting Information

Filename: fem1368-sup-0001-TableS1-S2.doc
Format: Word document
Size: 137K
Description:
Table S1. Number of sequences, observed number of genera, Chao1 richness estimator (using all sequences in the sample), average values of the Chao1 richness estimator and Shannon's index of biodiversity (and the corresponding standard errors) computed for 1000 subsamples of size 590 (the size of the smallest sample).
Table S2. Number of sequences, observed number of Operational Taxonomic Units (OTU, sequences grouped at 97% identity), Chao1 richness estimator (using all sequences in the sample), average values of the Chao1 richness estimator and Shannon's index of biodiversity (and the corresponding standard errors) computed for 1000 subsamples of size 590 (the size of the smallest sample).

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.