Major faecal microbiota shifts in composition and diversity with age in a geographically restricted cohort of mothers and their children


  • Ekaterina Avershina,

    Corresponding author
    1. Department of Chemistry, Biotechnology and Food Science, University of Life Sciences, Ås, Norway
    • Correspondence: Ekaterina Avershina, Department of Chemistry, Biotechnology and Food Science, University of Life Sciences, Ås, Norway. Tel.: +47 64 96 59 00; fax: +47 64 96 59 01; e-mail:

  • Ola Storrø,

    1. Department of Public Health and General Practice, Norwegian University of Science and Technology, Trondheim, Norway
  • Torbjørn Øien,

    1. Department of Public Health and General Practice, Norwegian University of Science and Technology, Trondheim, Norway
  • Roar Johnsen,

    1. Department of Public Health and General Practice, Norwegian University of Science and Technology, Trondheim, Norway
  • Phil Pope,

    1. Department of Chemistry, Biotechnology and Food Science, University of Life Sciences, Ås, Norway
  • Knut Rudi

    Corresponding author
    1. Department of Chemistry, Biotechnology and Food Science, University of Life Sciences, Ås, Norway


Despite its importance, the diversity of the human infant gut microbiota remains poorly characterized at the regional scale. Here, we investigated the faecal microbiota diversity in a large 16S rRNA gene data set from a healthy cohort of 86 mothers and their children from the Trondheim region in Norway. Samples were collected from mothers during early and late pregnancy, as well as from their children at 3 days, 10 days, 4 months, 1 year and 2 years of age. Using a combination of Sanger sequencing of amplicon mixtures (without cloning), real-time quantitative PCR and deep pyrosequencing, we observed a clear age-related colonization pattern in children that was surprisingly evident between 3- and 10-day samples. In contrast, we did not observe any shifts in microbial composition during pregnancy. We found that alpha-diversity was highest at 2 years and lowest at 4 months, whereas beta-diversity estimates indicated the highest interindividual variation in newborns. Variation significantly decreased by the age of 10 days and was observed to be convergent over time; however, there were still major differences between 2-year-olds and adults, who exhibited the lowest interindividual diversity. Taken together, the major age-affiliated population shift within gut microbiota suggests that there are important mechanisms for transmission and persistence of gut bacteria that remain unknown.


Introduction

While it is widely accepted that the human gut is one of the most densely populated bacterial communities on the Earth (Whitman et al., 1998), the general mechanisms for host–bacterial interactions are not yet completely described (Avershina & Rudi, 2013). Previously, the scientific community unanimously assumed that humans are born sterile (Ley et al., 2006; Marques et al., 2010), although evidence now exists for prenatal colonization (Jimenez et al., 2008; Satokari et al., 2009). Regardless of the time required for initial colonization, it is clear that development of this unique and intricate community takes several years to reach maturity (Marchesi, 2011). Many factors are thought to play a role in the development of gut microbiota: initial inoculation occurs via the mother's birth canal when a child is born vaginally; subsequently, an infant will frequently receive bacteria via breast milk (Martin et al., 2007), and the surrounding environment also exerts a constant influence. Existing reports have addressed various influences on gut microbiota such as age (Palmer et al., 2007; Claesson et al., 2011), geography and diet (De Filippo et al., 2010; Yatsunenko et al., 2012). There are also recent suggestions of immunological modulations of the microbiota during pregnancy (Koren et al., 2012). However, much less is known about transmission and persistence of gut bacteria in a population during the host's first years of life. We have previously described transmission of some particular gut bacteria from mother to child (Bjerke et al., 2011; de Muinck et al., 2011; Avershina et al., 2013), but we have not yet addressed general patterns of bacterial persistence and diversity in a healthy randomly selected population of children and their mothers.

The aim of this study was therefore to address longitudinal faecal microbiota shifts in composition and diversity in children and their mothers in a geographically restricted cohort. We analysed stool samples from 86 mother/child pairs, collected two times during the mothers' pregnancy (15.0 ± 4.2 and 37.5 ± 1.8 gestation weeks) and five times from infants (ages 3 days and 10 days, 4 months, 1 year and 2 years). We used a polyphasic analytical approach consisting of direct mixed 16S rRNA gene Sanger sequencing (analysis of electropherograms containing information on all amplicon variants) (Zimonja et al., 2008), real-time quantitative PCR (Ginzinger, 2002) and 454-sequencing (Ronaghi, 2001). We present results suggesting highly age-dependent bacterial persistence and diversity patterns within the population. Furthermore, we also present support for mother-to-child transmission of adult-associated gut bacteria – surprisingly not during the birth process, but at a later stage.

Materials and methods

Study material and sample preparation

Faecal samples were collected from the IMPACT cohort study among small children and mothers in Trondheim, which is a nested cohort within the PACT study (Prevention of Allergy among Children in Trondheim) (Storro et al., 2010). Most of the children were delivered vaginally (90%) and at term (90%). There was a high frequency of breast feeding, and 97% of infants were breast-fed during the first 6 weeks of life. By the age of 4 months, 66.7% of infants were exclusively breast-fed, 23.8% were receiving either formula or solid food (fruits, vegetables, wheat, bread, corn, rice) complementary to breast milk, and 9.5% of infants were receiving only formula and/or solid food. More details about the cohort characteristics are given by Storro et al. (2011).

Faecal specimens were stored in sterile Cary–Blair transport and holding medium (BD Diagnostics Sparks, MD 21152). Each specimen was frozen at −20 °C within 2 h after defecation and transported to the laboratory for further storage at −80 °C within 1 day (for children) or 4 weeks (for pregnant women). Details about the IMPACT faecal material are given by Oien et al. (2006). The data set analysed contained samples from both early (first to second trimester) and late pregnancy (third trimester) from the mothers and 3 days, 10 days, 4 months, 1 year and 2 years from the children.

We purified faecal DNA with paramagnetic beads in accordance with an optimized and automated protocol (Skanseng et al., 2006). Briefly, this protocol involved mechanical lysis with glass beads and DNA purification with silica particles. Mechanical lysis was chosen because the cell wall compositions of gut bacteria are largely unknown.

Direct mixed sequence analysis

The V3–V4 region of 16S rRNA gene was PCR-amplified using the primers targeting universally conserved gene regions (Nadkarni et al., 2002). Subsequently the V4 region (198 bp) was targeted for sequencing using a mixed Sanger approach. The resulting sequence spectra contained information for the 16S rRNA genes representative of all the bacteria in a given sample.

The alpha- and beta-diversity of each spectrum were assessed by means of a modified Simpson's diversity index cmixed (Eqn. 1) and a modified Bray–Curtis dissimilarity index (Eqn. 2), respectively. Calculations were based on the fluorescence intensity fractions of each nucleotide position. The rationale is that these intensity fractions reflect diversity. If there is only one bacterium in a sample, there will be only one nucleotide in every position of the sequence spectrum, and therefore, nucleotide fractions in every position will equal 1 : 0 : 0 : 0. In a mixture of a range of different bacteria, though, the fractions will converge towards 0.25 : 0.25 : 0.25 : 0.25. Based on these fractions, one can estimate diversity in a sample independently of operational taxonomic units (OTUs).
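The idea of measuring diversity from per-position nucleotide fractions can be sketched in a few lines. The code below is only an illustration: it applies the standard inverse Simpson index and the standard Bray–Curtis dissimilarity to the fractions, which is an assumption and not necessarily the exact modified form of Eqns 1 and 2 (those are described in Avershina et al., 2013). The function names and the representation of a spectrum as a list of (A, C, G, T) intensity tuples are hypothetical.

```python
from typing import List, Sequence

# A spectrum is one (A, C, G, T) fluorescence intensity tuple per position.
Spectrum = Sequence[Sequence[float]]


def fractions(intensities: Sequence[float]) -> List[float]:
    """Normalize raw intensities at one position so that they sum to 1."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("position has no signal")
    return [x / total for x in intensities]


def inverse_simpson(spectrum: Spectrum) -> float:
    """Mean over positions of 1 / sum(p_i^2).

    Equals 1 when every position is 1 : 0 : 0 : 0 and 4 when every
    position is 0.25 : 0.25 : 0.25 : 0.25.
    """
    per_position = []
    for intensities in spectrum:
        p = fractions(intensities)
        per_position.append(1.0 / sum(f * f for f in p))
    return sum(per_position) / len(per_position)


def bray_curtis(a: Spectrum, b: Spectrum) -> float:
    """Bray-Curtis dissimilarity over all position-by-nucleotide fractions."""
    if len(a) != len(b):
        raise ValueError("spectra must be aligned to the same length")
    numerator = 0.0
    denominator = 0.0
    for ia, ib in zip(a, b):
        pa, pb = fractions(ia), fractions(ib)
        numerator += sum(abs(x - y) for x, y in zip(pa, pb))
        denominator += sum(pa) + sum(pb)
    return numerator / denominator
```

With this definition, a sample containing a single sequence scores 1 and a fully mixed sample scores 4, which mirrors the 1 : 0 : 0 : 0 versus 0.25 : 0.25 : 0.25 : 0.25 reasoning in the text.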

[Equation 1: modified Simpson's diversity index, cmixed]
[Equation 2: modified Bray–Curtis dissimilarity index, BC]

A detailed description of the diversity index calculations is given in the study by Avershina et al. (2013). Beta-diversity was assessed both between samples belonging to the same age group and between samples belonging to the same mother–child pair, but at different time points. Significant differences between indices at various time points were tested using Friedman's test – a nonparametric version of two-way ANOVA – which takes into account possible correlation between the measurements (MATLAB® documentation, 2010). For samples where no such correlation was expected, the Kruskal–Wallis test was used. The null hypothesis was rejected at the level of 5%.
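To make the repeated-measures structure of Friedman's test concrete, the sketch below computes the Friedman statistic in plain Python. It assumes that each individual is a block and each time point is a treatment, uses average ranks for ties, and omits both the tie correction and the P-value (which would come from a chi-square distribution with k − 1 degrees of freedom). The study itself used MATLAB; the function names here are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to k; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(blocks: Sequence[Sequence[float]]) -> float:
    """Friedman chi-square statistic (no tie correction).

    blocks[i][j] is the measurement for individual i at time point j.
    """
    n = len(blocks)
    k = len(blocks[0])
    if any(len(block) != k for block in blocks):
        raise ValueError("every individual needs one value per time point")
    rank_sums = [0.0] * k
    for block in blocks:
        # Ranking within each individual removes between-individual offsets.
        for j, rank in enumerate(average_ranks(block)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
```

Because ranks are assigned within each individual, a consistent ordering of time points across individuals yields a large statistic even when the individuals differ strongly in their overall level.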

Information on the most dominant bacteria was subsequently resolved using multivariate curve resolution analysis (MCR-ALS). This analysis allows recovery of the common information contained between the samples of interest into so-called components, as well as simultaneous relative quantification of this information in all the samples (Zimonja et al., 2008). The taxonomic level at which components are resolved for nondefined bacterial assemblages depends directly on the diversity represented within a data set (Rudi et al., 2012; Sekelja et al., 2012). If a given phylum is represented by one clearly dominant genus, then the signature sequence for this genus will be resolved as a component. If, instead, several equally distributed genera belong to the same family, then the signature sequence for that family will be recovered. Prior to MCR-ALS, one needs to specify the number of components to be resolved. If the set number is too high, a ‘real’ component will be split, and thus, at least two of the resolved components will contain the same information. This can be detected by biological reasoning because these components will then represent the same taxonomic group. To define the initial number of components (initial estimates i), we used both principal component analysis (PCA) and evolving factor analysis (EFA) as recommended (Tauler et al., 1995). A detailed description of the use of MCR-ALS for mixed sequence resolution can be found in the study by Avershina et al. (2013). Resolved component spectra were manually base-called and classified by the Ribosomal Database Project (RDP) hierarchical classifier (Wang et al., 2007).

To address the longitudinal structure of the MCR-ALS score data, that is, the relative abundance of resolved components, the parallel factor analysis (PARAFAC) method was used. PARAFAC is a multiway generalization of the two-way PCA. However, unlike PCA, it avoids the rotation problem so that pure components can be resolved (Bro, 1997). The core consistency index was used as a criterion for determining the number of components.

Real-time quantitative PCR

We have previously qPCR-amplified the 16S rRNA gene of commonly identified gut bacteria, as well as some pathogenic bacterial species (Storro et al., 2011) for the same study cohort. Among tested species were Bacteroides fragilis, Bifidobacterium longum, Bifidobacterium breve, Bifidobacterium animalis subsp. lactis, genus Bifidobacterium, Clostridium difficile, Clostridium perfringens, Lactobacillus rhamnosus, Lactobacillus reuteri and Helicobacter pylori. For this work, we binarized these data based on whether the given bacterium was detected in a sample. For every age, unweighted Cohen's kappa indices (Sim & Wright, 2005) were calculated to evaluate whether there was an agreement between detection of a given bacterium in mothers and in children. Interpretation of the index was performed using guidelines provided in the MATLAB® script for Cohen's kappa index calculation (Cardillo, 2007). The relative amount of the detected vs. nondetected populations of bacteria is represented in Supporting Information, Fig. S1. ‘Nondetected’ populations were defined as populations that did not show amplification after 40 cycles. Some bacteria (L. rhamnosus and C. difficile) were not detected in any of the mothers, whereas others (e.g. H. pylori) were detected only in two mothers (Table S1). Therefore, to ensure a sufficient amount of information, only bacterial groups that were detected in more than 11 mothers were included in the analysis. The bacterial groups that satisfied this criterion were B. longum, genus Bifidobacterium, B. fragilis and Escherichia coli. We also addressed the persistence patterns of these four bacteria in a population by calculating the fraction of individuals in which the species was detected at time point ‘x’ given that it was detected at time point ‘x−1’.
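The two summary measures described above, unweighted Cohen's kappa on binarized detection data and the conditional persistence fraction, can be sketched as follows. This is a minimal illustration rather than the MATLAB script used in the study; the function names and the encoding of detection as booleans are assumptions.

```python
from typing import Optional, Sequence


def cohens_kappa(mother: Sequence[bool], child: Sequence[bool]) -> float:
    """Unweighted Cohen's kappa for paired detected/not-detected calls."""
    n = len(mother)
    if n == 0 or n != len(child):
        raise ValueError("need the same non-zero number of mothers and children")
    observed = sum(m == c for m, c in zip(mother, child)) / n
    p_mother = sum(mother) / n
    p_child = sum(child) / n
    # Agreement expected by chance if detection were independent.
    expected = p_mother * p_child + (1 - p_mother) * (1 - p_child)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def persistence(previous: Sequence[bool], current: Sequence[bool]) -> Optional[float]:
    """Fraction detected at time x among those detected at time x-1."""
    carriers = [now for before, now in zip(previous, current) if before]
    if not carriers:
        return None  # nobody was positive at the previous time point
    return sum(carriers) / len(carriers)
```

A kappa near 0 means that detection in a child agrees with detection in the mother no more often than chance would predict, and negative values indicate agreement below chance.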

Pyrosequencing analysis

A subset of seven random mother and child pairs was selected for deep 454-sequencing from the pairs with the most complete temporal series in the main study cohort. DNA isolation, amplicon and PCR conditions were the same as for the direct sequencing approach. The only difference was the modification of the PCR primers targeting the V3–V4 region of the 16S rRNA gene, to adapt them to the GS-FLX Titanium Chemistry (454 Life Sciences). Sequencing was performed according to the manufacturer's recommendations at the Norwegian High-Throughput Sequencing Centre (Oslo, Norway). Pyrosequencing data were analysed using the QIIME pipeline (Caporaso et al., 2010). Error correction, chimera removal and operational taxonomic unit (OTU) clustering were performed using usearch quality filtering with QIIME, which incorporates UCHIME (Edgar et al., 2011) and a 97% sequence identity threshold. The RDP classifier (Wang et al., 2007) was used to assign taxonomic identity to the resulting OTUs. For a phylogeny-based diversity assessment, we used weighted UniFrac hierarchical clustering (Lozupone & Knight, 2005) based on 10 rarefactions with 1600 randomly selected sequences per sample for each rarefaction.
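The rarefaction step, drawing a fixed number of sequences per sample so that samples of different depth become comparable, can be illustrated with a short sketch. It is not QIIME's implementation; it assumes sampling without replacement from a sample's OTU counts, and the `rarefy` function and the example counts are hypothetical.

```python
import random
from collections import Counter
from typing import Dict


def rarefy(otu_counts: Dict[str, int], depth: int, rng: random.Random) -> Dict[str, int]:
    """Subsample `depth` sequences without replacement from one sample."""
    total = sum(otu_counts.values())
    if depth > total:
        raise ValueError("sample has fewer sequences than the requested depth")
    # Expand counts into one label per sequence, then draw without replacement.
    pool = [otu for otu, count in otu_counts.items() for _ in range(count)]
    return dict(Counter(rng.sample(pool, depth)))


# Ten independent rarefactions of one hypothetical sample at 1600 sequences.
sample = {"otu1": 1500, "otu2": 400, "otu3": 100}
rng = random.Random(0)
rarefactions = [rarefy(sample, 1600, rng) for _ in range(10)]
```

Repeating the draw several times, as in the 10 rarefactions used here, gives a sense of how much a downstream distance or clustering depends on which sequences happened to be sampled.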

To investigate what shapes gut microbiota in both infancy and adulthood, we fitted observed species distributions to commonly used distributions using the Species Diversity and Richness, version 4.1.2 (PISCES Conservation Ltd., UK), software. Hubbell's model of neutrality, often used as a null model of community structure (Magurran, 2004), assumes that when an individual dies in a saturated community, the probability of its replacement by an offspring of a rare species is the same as by an offspring of a more abundant species. Jabot & Chave (2011) have developed a generalization of this model by introducing a parameter δ. This parameter estimates the non-neutrality of the system based on the deviation of the observed species evenness from that expected under the neutral model. When δ is positive, dominant species have a higher chance of taking the place of the dead individual, whereas negative values indicate that the chances of rare species increase. Based on 1000 randomly selected sequences per sample from the chimera- and noise-free pyrosequencing data set, we calculated the non-neutrality parameter δ using Parthy, version 1.0, software (Jabot & Chave, 2011).
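The replacement rule of the neutral null model can be illustrated with a toy simulation. The sketch below assumes the simplest zero-sum drift, in which every surviving individual is equally likely to be the parent of the replacement, with no immigration or speciation. It does not implement the δ parameter of Jabot & Chave (2011) or anything from the Parthy software; `neutral_step` and the example community are hypothetical.

```python
import random
from typing import List


def neutral_step(community: List[str], rng: random.Random) -> None:
    """One death and replacement event in a saturated, fixed-size community."""
    if len(community) < 2:
        raise ValueError("need at least two individuals")
    dead = rng.randrange(len(community))
    # Choose the parent uniformly among the other individuals, so every
    # individual has the same per-capita chance regardless of its species.
    parent = rng.randrange(len(community) - 1)
    if parent >= dead:
        parent += 1
    community[dead] = community[parent]


# Hypothetical community: 90 individuals of species A and 10 of species B.
community = ["A"] * 90 + ["B"] * 10
rng = random.Random(1)
for _ in range(1000):
    neutral_step(community, rng)
```

Under this rule the community size never changes and species abundances only drift; a non-neutral model would instead weight the choice of parent by species abundance or identity.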


Results

Mixed sequence analysis

Nucleotide alpha-diversity (Simpson's diversity index) of mixed spectra ranged from 1.77 ± 0.10 [mean ± standard deviation] in 4-month-old infants to 1.91 ± 0.09 in 2-year-olds (Fig. 1a). Generally, the diversity of adults' stool samples was higher than that of newborns (P = 0.0001) and 4-month-old infants (P = 2.26 × 10−9). At 1 year of age, the diversity increased compared with 4-month-olds (P = 0.0028) and then further increased by 2 years of age (P = 0.0054).

Figure 1.

Nucleotide diversity measurements. The significance of the difference between diversity indices at two subsequent time points was calculated with Friedman's (a and b) and Kruskal–Wallis (c) tests. *P < 0.05, **P < 0.01 and ***P < 0.001. Early pr and Late pr: early (8–20 weeks) and late (30–40 weeks) pregnancy periods, respectively. (a) The modified Simpson's index of nucleotide spectra diversity cmixed at various ages. (b) The modified Bray–Curtis index of nucleotide dissimilarity (BC) between individuals at various ages. (c) The modified Bray–Curtis index of nucleotide dissimilarity (BC) between the subsequent time points. E–L pr: the period between early (8–20 weeks) and late (30–40 weeks) pregnancy periods; L pr–3 day: comparison between 3-day-old newborns and their mothers during the late pregnancy stage; 3–10 day: between 3 and 10 days of age; 10 day–4 month: between 10 days and 4 months of age; 4 month–1 year: between 4 months and 1 year of age; 1–2 year: between 1 and 2 years of age. The error bars represent standard error of the mean.

Newborns exhibited highest beta-diversity between individuals (modified Bray–Curtis index BC = 0.20 ± 0.02 and 0.18 ± 0.03 for 3- and 10-day-old infants, respectively; Fig. 1b). By the age of 4 months, the variation within the population had significantly decreased (P = 7.51 × 10−13) and remained the same up to 1 year. Although the beta-diversity between stool samples from 2-year-olds was significantly lower than that of 1-year-olds (P = 1.54 × 10−5), it was still significantly higher than the beta-diversity between adult stool samples (P = 4.38 × 10−6). In addition to interindividual comparisons, beta-diversity estimations were used to analyse intraindividual variation that developed within an individual from one time point to another (Fig. 1c). The highest variation (highest beta-diversity) was observed between the spectra of mothers at their late pregnancy stage and 3-day-old infants (BC = 0.21 ± 0.04), as well as between 4-month-old and 1-year-old children (BC = 0.20 ± 0.04), whereas the least variation (lowest beta-diversity) was observed between stool samples collected from mothers at two pregnancy trimesters (BC = 0.08 ± 0.03) and also between 1- and 2-year-olds (BC = 0.12 ± 0.02).

Both PCA and EFA suggested six components to be resolved by MCR-ALS. When six components were used, the information on the Bacteroidetes group was entirely absent. Therefore, MCR-ALS was repeated, gradually increasing the number of components to be resolved until a duplication event occurred. In total, eight components accounting for 70% of the variation in the system were resolved by MCR-ALS and classified by the RDP classifier (Table S2).

Taxonomically, stool samples analysed from mothers were rich in Lachnospiraceae- and Faecalibacterium-affiliated components (Fig. 2). At 3 days, all eight components seemed to be evenly represented, but by the age of 10 days, there was a significant decrease in the level of Lactobacillales (P = 0.0191). By the age of 4 months, bifidobacteria constituted 57.6% of total gut microbiota, whereas Lactobacillales- and Streptococcus-affiliated components were diminished (P = 0.0135 and P = 0.0001, respectively). At 1 and 2 years of age, average composition resembled that of pregnant women, although there were several pronounced differences. For example, the Bifidobacterium-affiliated (P = 0.0042 and P = 0.0021 for 1 and 2 years, respectively) and other Actinobacteria- (P = 0.0016 and P = 2.3 × 10−5 for 1 and 2 years, respectively) components were higher in children than in their mothers, whereas Faecalibacterium- (P = 4.3 × 10−6 and P = 5.9 × 10−7 for 1 and 2 years, respectively) and Bacteroides-affiliated (P = 1.4 × 10−5 and P = 5.6 × 10−8 for 1 and 2 years, respectively) components were lower.

Figure 2.

Bacterial species composition in stool samples of infants (from 3 days to 2 years of age) and their mothers during pregnancy as revealed by MCR-ALS. Early pr and Late pr: early (8–20 weeks) and late (30–40 weeks) pregnancy periods, respectively.

Because the majority of infants were born vaginally, were at term and were breast-fed during the first days of life, we could not investigate the effect of birth mode and diet. However, we could test whether the introduction of solid food (wheat, rice, corn) at 4 months affected faecal microbial composition. These analyses showed no significant difference in relative composition of gut microbiota.

To investigate longitudinal structure in the data (i.e. individual sharing of bacteria for more than one time point), a 3-component PARAFAC model was derived based on a core consistency index of more than 99%. The loadings for the MCR-ALS components dimension indicate that Escherichia-, Bifidobacterium- and Lachnospiraceae-affiliated components influenced the longitudinal structure of the data (Fig. 3a). In particular, the Escherichia-affiliated component was associated with 3 and 10 days, the Bifidobacterium-affiliated component was associated with 3 days, 10 days and 4 months, while the Lachnospiraceae-affiliated component was associated with early and late pregnancy, in addition to 1 and 2 years (Fig. 3b).

Figure 3.

Summary of PARAFAC on relative abundances of MCR-ALS resolved bacterial groups. C1, C2, C3 – PARAFAC components. Early pr and Late pr: early (8–20 weeks) and late (30–40 weeks) pregnancy periods, respectively. (a) PARAFAC-suggested components C1, C2 and C3 represent Bifidobacterium, Lachnospiraceae and Escherichia components, respectively. (b) At early days of life, C1 and C3 determined the variation in the system, whereas at pregnancy, 1 and 2 years of life, C2 became more important.

Real-time quantitative PCR analysis of prevalence

Figure 4 illustrates qPCR prevalence data calculated for selected bacterial groups both for the whole study cohort and for a subpopulation of children whose mothers tested positive for the target bacterium (mother–child positive subpopulation). At 10 days, E. coli was more frequently detected in those children whose mothers also tested positive for this bacterium (P = 0.002). Interestingly, the difference between detection frequencies of this bacterium in the mother–child positive subpopulation and the total children population was larger at 10 days than at 3 days. This may indicate either postnatal or very low at-birth transmission of this bacterial species. B. longum appeared to be one of the most persistent colonizers among the four bacterial groups tested. Already by the age of 10 days, it was detected in nearly all infants who tested positive at 3 days after birth (Fig. 4). Even by the age of 2 years, this species persisted in the majority of infants who previously tested positive. In contrast, E. coli detection was observed to be stable during the first year (80–85% of population). However, by 2 years, detection had decreased to 45% of the children who had previously tested positive.

Figure 4.

Prevalence of bacterial species in a population of children at various ages. Blue line indicates prevalence of bacteria in a subpopulation of children in whose mothers it was also detected; red line – in a total population of children of a given age. Black line depicts the percentage of individuals in whom bacteria were detected in both a given and a previous time point compared with a total number of individuals where it was detected in a previous time point. Late pr: late (30–40 weeks) pregnancy period. **one-sided binomial test P-value < 0.01.

Cohen's kappa index was used to indicate the magnitude of agreement between the detection of a given bacterium in an individual mother and her child (in the whole cohort). In our data set, the index ranged from −0.05 (poor agreement) to 0.30 (fair agreement) and was observed to decrease with age, indicating that the detection of a given bacterium in 1- to 2-year-old children was less dependent on their mother testing positive (Table 1). In concurrence with qPCR prevalence data (Fig. 4), Cohen's kappa indices indicated slight to fair agreement for both E. coli and B. fragilis. The ranking is based on the guidelines to the MATLAB® script for the index calculation (Cardillo, 2007). The index for Bifidobacterium was negative at 4 months, indicating poor agreement in mother–child detection patterns. High P-values (> 0.05) also support low correspondence between detection of a given bacterium in mothers and in children.

Table 1. Cohen's kappa index – estimate of agreement in detection of a given bacterium in mothers and in their infants

Age        B. fragilis   B. longum   Bifidobacterium   E. coli
3 days     0.
10 days    0.24          0           0.04              0.3
4 months   0.27          −0.03       −0.05             0.02
1 year     0.1           −0.02       −0.05             0.01
2 years    0.1           0           −0.04             −0.07

  1. Calculations are based on detection of a given bacterium by RT-PCR.

Pyrosequencing data analysis

Eight samples, mostly belonging to one mother–child pair, were removed from the analysis due to a low number of recovered sequences (less than 2000 sequences per sample). Therefore, the analysis was performed on a total of 39 samples from 6 children and 5 mothers. After quality filtering, chimera removal and normalization, 370 207 sequences were used for subsequent analysis with a mean of 9492 sequences per sample (ranging from 2146 to 21 317 sequences per sample). Apart from one sample, stool samples from mothers and from 1- and 2-year-old children clustered separately from stool samples of newborns and 4-month-olds based on weighted UniFrac distances (1600 sequences per sample; bootstrap values are based on 10 rarefactions; Fig. S3A). To examine how similar the faecal microbiota from different age groups were, we used the Jaccard distance index calculated for detected OTUs (Fig. S3B). Overall, there was higher variation in microbiota from children when compared to mothers (P = 0.0011 and P = 0.0001 at 3 days and 2 years of age, respectively), although the microbiota of newborn children were more similar to each other than to their related (P = 0.0010, P = 0.0011 and P = 0.0034 for 3 days, 10 days and 4 months, respectively) and unrelated mothers (P = 0.0011, P = 0.0006 and P = 0.0024 for 3 days, 10 days and 4 months, respectively). By the age of 1 year, their microbiota was as similar to adults as it was to other children from the same age group.

We compared how many OTUs were shared between five children at various time points and their mothers (both related and unrelated). In total, 30 samples were used for these comparisons. From birth to 4 months of age, only one child had more OTUs shared with his own mother than with any other unrelated mother. However, by the age of 2 years, the number of children who shared more OTUs with their mothers than with other unrelated mothers increased to 3 of 5 (Table S3). We also examined which OTUs were underrepresented in children at various ages compared with their mothers (Tables S4–S8). In the immediate period after birth (days 1–3), 1230 OTUs were absent in all infant samples, of which 44% were affiliated to the family of Lachnospiraceae. At the age of 1–2 years, 500 OTUs were absent, composed of c. 30% that were affiliated to the Lachnospiraceae. Overall, Lachnospiraceae–affiliated OTUs that had representatives in all children at a given age were first detected at 1 year, although in one child, OTUs affiliated to this clostridial family were detected right after birth. In contrast, within the first days after birth, only OTUs affiliated to the Bifidobacteriaceae, Streptococcaceae and Staphylococcaceae were shared among all infants, and by 4 months, only Bifidobacteriaceae-affiliated OTUs were shared. By the age of 1 year, the majority of OTUs were affiliated to the Clostridiales, whereas at 2 years, shared Bacteroidales-affiliated OTUs also appeared.
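The presence/absence comparisons used in this section, the Jaccard distance between samples and the count of OTUs a child shares with its own versus unrelated mothers, reduce to simple set operations. The sketch below illustrates them under the assumption that each sample is represented as the set of OTU identifiers detected in it; the function names and example identifiers are hypothetical and do not reproduce the QIIME workflow.

```python
from typing import Iterable, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |A and B| / |A or B|, using only presence/absence of OTUs."""
    union = a | b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(a & b) / len(union)


def shares_most_with_own_mother(
    child: Set[str], own_mother: Set[str], unrelated: Iterable[Set[str]]
) -> bool:
    """True if the child shares more OTUs with its own mother than with
    every unrelated mother."""
    own_shared = len(child & own_mother)
    return all(own_shared > len(child & other) for other in unrelated)
```

Because these measures ignore relative abundance, a rare OTU counts as much as a dominant one, which is why they can disagree with abundance-weighted distances such as weighted UniFrac.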

Depending on the ecological forces that structure communities, species within these communities may follow different distributions that can be described mathematically (Magurran, 2004). We therefore fitted OTU distributions to these common distribution curves (Table S9). The majority of samples fitted well to a truncated log normal distribution, whereas two samples, belonging to one child at 3 and 10 days of age, fitted a log series distribution. The geometric and broken stick distributions did not fit the data. We also tested whether distributions fitted a neutral model and how much they deviated from it. All these samples showed higher dominance than would be expected under neutrality (Fig. S2), although there was a significant difference in deviation between mothers and 3-day-olds (P = 0.0091). Moreover, when combined, in infancy as well as at 4 months, the dominance was significantly higher than in adults and in 1- and 2-year-olds (P = 0.0001).

Data consistency

To address whether MCR-ALS and pyrosequencing predictions of faecal microbiota correspond to each other, we selected all OTUs belonging to taxonomical groups predicted by MCR-ALS from a pyrosequencing data set. We then grouped those OTUs in correspondence with MCR-ALS components and calculated their relative amounts based on the total number of OTUs. Pearson's correlation analysis revealed high correlation between MCR-ALS predictions and pyrosequencing results (correlation coefficient = 0.7463, P = 4.47 × 10−51).
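The consistency check above amounts to a Pearson correlation between paired relative abundances from the two methods. A minimal sketch of that coefficient is shown below; the function name is hypothetical, the P-value calculation is omitted, and the inputs would be the paired MCR-ALS and pyrosequencing relative amounts per component and sample.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of paired values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences of at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```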


Discussion

Interestingly, there was a significant drop in interindividual beta-diversity in a short period of time after birth (3–10 days), as assessed by mixed sequencing. For practical reasons, many temporal research studies of faecal microbiota face a trade-off between sampling frequency and the number of individuals included in the study. To our knowledge, all temporal faecal microbiota studies to date that have extensive sampling during the first weeks of life (Favier et al., 2003; Palmer et al., 2007; Koenig et al., 2011) have few individuals analysed, whereas studies with high sample numbers often have fewer or more infrequent time points (Yatsunenko et al., 2012). However, our results illustrate that significant differences in average bacterial composition and beta-diversity occur between 3 and 10 days of age. These data therefore suggest that to better understand the development of gut microbiota, gaps between sampling periods should be reduced, particularly for those studies that compare different populations (Yatsunenko et al., 2012).

Pyrosequencing and mixed sequence analysis both demonstrated individualized clustering of the faecal microbiota during early and late pregnancy in our cohort, with little or no evidence for population-based changes during pregnancy. We were therefore not able to reproduce the results of a major change in the faecal microbiota between early and late pregnancy, as recently reported by Koren et al. (2012). Because our sampling times match those of Koren et al. to within ± 3 weeks, we believe that sampling time cannot explain the differences in microbiota detected between the two studies. The most likely explanation would therefore be that there are true differences in the gut microbiota composition among pregnant women in the two cohorts.

qPCR analysis suggested a relatively low direct transmission of gut bacteria from mother to child; at 10 days of age, there was better overall agreement between detection of bacteria in mother–child pairs than at 3 days (Table 1). Even early colonizers such as E. coli were not likely to be directly transmitted at birth, but rather during first days of life (Fig. 4). The difference in detection of this species in mother–child positive subpopulation and the total population was higher at 10 days than at 3 days. Based on differences between weighted UniFrac (takes into account relative amounts) and Jaccard (takes into account only presence/absence data) distances, it may be suggested that by 1–2 years of age, adult characteristic OTUs already appeared in the gut, although they were still rare. Interestingly, many OTUs affiliated to Lachnospiraceae were shared between mothers and 1- to 2-year-old children, suggesting that these species possibly originate from the mother. PARAFAC data based on mixed sequencing also supported sharing of this component between mothers and infants. Even though detection of bifidobacteria seemed to be independent of the mother, frequency of B. longum was higher in a mother–child positive subpopulation, which is in line with a recent model suggesting transmittance of B. longum subsp. longum from mother to child (Makino et al., 2011).

At 3 days of age, there was a relatively high abundance of Lactobacillales in stool samples (Fig. 2). Lactobacilli are often isolated from human breast milk (Martin et al., 2003, 2007), and it was noted that the majority of infants (98%) in our cohort were exclusively breast-fed during the first 6 weeks of life. Interestingly, by the age of 10 days, the level of this bacterial group was observed to decline despite no changes in diet with respect to breast milk intake. As such, we hypothesize that lactobacilli detected in this study were possibly acquired via the vaginal microbiota of the mother during the infant's passage through the birth canal.

If we assume that neutral processes (i.e. random replacement of a dead individual in a community by an offspring of other species regardless of relative abundance of this species) are not involved in shaping gut microbiota, one would expect low individual alpha-diversity coinciding with high interindividual beta-diversity. In contrast, we observed steady decreases in beta-diversity over time (lowest among adult women), suggesting that overall microbiota development is ultimately directed towards a more stable community. Furthermore, delta values, which characterize a deviation from neutrality, were significantly lower in adulthood than in infancy.

In contrast to our findings, it has recently been argued that niche selection is the main force shaping the distal gut community (Jeraldo et al., 2012). This conclusion was based on the fact that microbial OTUs in the gut were more closely related to each other than would be expected in a neutrally shaped community. The discrepancy, however, could be explained by the fact that niche selection will always limit the phylotypes allowed in a given environment (Magurran, 2004) and that the distal gut represents a highly selective environment (Marchesi, 2011), whereas among the allowed phylotypes, neutral processes could be important. Because we did not take phylogenetic distances into account, this is probably why we identified neutral processes as a potential contributor. This explanation is coherent with our recently proposed interface model for bacterial–host interactions, suggesting host selection independent of the actual services provided (Avershina & Rudi, 2013).

In conclusion, our analyses of a large longitudinal cohort of mothers and their children have revealed new knowledge about the ecology of human gut bacteria, suggesting that there are still important mechanisms that remain unknown.


Acknowledgements

Funding for the IMPACT study was obtained from GlaxoSmithKline AS, Norway. The PACT study was funded by the Norwegian Department of Health and Social Affairs from 1997 to 2003. A university scholarship from NTNU funded the research fellows. The mixed sequencing analyses were funded by a research levy on certain agricultural products from the Norwegian Government. P.P. is funded by Norwegian Research Council project 214042. The authors have no conflict of interest to declare.