Correspondence: Martin I. Bahl, National Food Institute, Technical University of Denmark, Mørkhøj Bygade 19, 2860 Søborg, Denmark. Tel.: +45 35 88 70 36; fax: +45 35 88 70 01; e-mail: firstname.lastname@example.org
Alterations in the human gut microbiota caused, for example, by diet, functional foods, antibiotics, or occurring as a function of age are now known to be of relevance for host health. Therefore, there is a strong need for methods to detect such alterations in a rapid and comprehensive manner. In the present study, we developed and validated a high-throughput real-time quantitative PCR-based analysis platform, termed ‘GUt Low-Density Array’ (GULDA). The platform was designed for simultaneous analysis of the change in the abundance of 31 different microbial 16S rRNA gene targets in fecal samples obtained from individuals at various points in time. The target genes represent important phyla, genera, species, or other taxonomic groups within the five predominant bacterial phyla of the gut (Firmicutes, Bacteroidetes, Actinobacteria, Proteobacteria, and Verrucomicrobia) and also within Euryarchaeota. To demonstrate the applicability of GULDA, fecal samples obtained from six healthy infants at both 9 and 18 months of age were analyzed. The analysis showed a significant increase over time in the relative abundance of bacteria belonging to Clostridial cluster IV (Clostridium leptum group) and Bifidobacterium bifidum, a concurrent decrease in the abundance of Clostridium butyricum, and a tendency toward a decrease in Enterobacteriaceae over the 9-month period.
The term ‘microbial metagenomics’ denotes any culture-independent study of the collective set of genomes of mixed microbial communities present in a given environment such as the human intestinal tract (Petrosino et al., 2009). Within the last couple of decades, the microbial composition of the human gut microbiota has been extensively explored both quantitatively and qualitatively using a variety of molecular technologies including denaturing gradient gel electrophoresis of PCR-amplified 16S rRNA genes, terminal restriction fragment length polymorphism, quantitative PCR (qPCR), microarray gene chips, and fluorescent in situ hybridization (McCartney, 2002; Zoetendal et al., 2006). In later years, the development of high-throughput metagenome sequencing platforms has provided a remarkable acceleration of data generation and consequently much new insight into this complex ecosystem (Eckburg et al., 2005; Ley et al., 2005; Turnbaugh et al., 2009; Larsen et al., 2011). Although all the above techniques have proved highly useful, they have various inherent limitations including dynamic range, discriminatory power (Lock et al., 2010), sensitivity to low-abundant taxa (Wagner et al., 2007), and PCR bias. Additionally, cost and speed vary considerably between the different methodologies. The choice of method or combination of methods should consequently reflect a careful consideration of the study hypothesis and what kind of data would be most suited to address this.
Studies of the gut microbial composition may in general be divided into two main categories, namely (1) static studies that focus on determining the abundances of specific genetic components, such as 16S rRNA genes, within a study group at a defined point in time; and (2) dynamic studies that focus on determining the effect of a defined and controlled intervention, for example, a dietary intervention, on the genetic composition of the microbiota in terms of changes in abundance of specific phylogenetic taxa or functional genes. Microbiological data obtained from both these types of studies may be correlated with other parameters and end points, such as clinical observations and biological risk markers, in order to associate host physiology with the observed differences and changes in the microbiota. To achieve sufficient power in low-impact dietary intervention studies, it is often required to recruit and sample a fairly large number of participants resulting in a large number of intestinal samples to be analyzed. This coupled to the fact that certain taxa within the gut environment, which are known to be relevant for host health, such as the Enterobacteriaceae, normally represent < 1% of the total bacterial community (Zoetendal et al., 1998), calls for methods with both high sensitivity toward low-abundant taxa and a high dynamic range. Real-time qPCR analysis has these properties and is a relatively simple and affordable technique, which offers a high degree of reproducibility and specificity and has thus found extensive use in the quantitative analysis of gut bacterial populations (Huijsdens et al., 2002; Bartosch et al., 2004; Gueimonde et al., 2004; Matsuki et al., 2004; Haarman & Knol, 2006; Larsen et al., 2010; Petersen et al., 2010; Combes et al., 2011; Vigsnæs et al., 2011). 
Analysis of multiple bacterial targets using qPCR is, however, to some extent technically limited by the fact that each primer set is associated with specific temperature cycling conditions, necessitating separate qPCR setup for each target. Furthermore, because of varying PCR efficiency, amplicon lengths, and GC-content (Colborn et al., 2008), the generation of standard curves for each target gene is normally required in order to provide absolute quantification. In many metagenomic studies however, such absolute quantifications may not be essential, as focus may primarily be placed on identifying specific changes occurring in microbial communities of the individuals undergoing a certain intervention (e.g. the individuals in a group given probiotics as compared to a control group), rather than obtaining information on absolute quantities or abundances of different bacterial taxonomical groups (Larsen et al., 2011).
In the present study, we develop and validate a real-time qPCR platform, GUt Low-Density Array (GULDA), that allows the simultaneous relative quantification of 31 relevant microbial 16S rRNA gene targets in two extracted community DNA samples, with four technical replicates all performed on one 384-well plate using a universal thermocycling program. Relative quantities of specific rRNA genes are calculated using a universal bacterial 16S rRNA gene as reference, and fold-changes for each microbial group are determined without the use of standard curves.
Materials and methods
Genomic DNA from pure culture strains and human fecal samples
Genomic DNA from a total of 27 bacterial strains and one archaeal strain was either obtained directly from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH, Germany (DSM), as extracted DNA or extracted from pure cultures originating from either DSM or The American Type Culture Collection, USA (ATCC; Fig. 1). Fecal samples obtained from six randomly selected infants from the SKOT cohort (Madsen et al., 2010) at both 9 and 18 months of age were used, and community DNA was extracted on the Maxwell® 16 system using the Maxwell® 16 DNA Tissue DNA purification kit (Promega Biotech AB, Sweden). In all cases, the DNA concentrations were determined fluorometrically (Qubit® dsDNA HS assay; Invitrogen) and adjusted to 1 ng μL−1 prior to use as template in qPCR.
Initial primer selection for real-time qPCR
A total of 118 primer sets were either selected from the literature or designed de novo, to represent bacterial species, genera, phyla, or other taxonomical groups of relevance, that is, either being abundant in the human microbiota or being particularly interesting in the context of gut ecosystem function or impact on human health. Almost all primer sets target regions within the 16S rRNA gene with a few exceptions targeting the 16S–23S rRNA gene intergenic spacer region and/or the 23S rRNA gene. For simplicity, only the term ‘16S’ is used in the following. The specificity of all primer sets was initially evaluated in silico using nucleotide BLAST (Altschul et al., 1990) and the Ribosomal Database Project (RDP; Cole et al., 2009). One hundred and ten primer sets found to be suitable after this screening process were synthesized commercially by Eurofins MWG Operon GmbH (Ebersberg, Germany).
Real-time qPCR conditions
Quantitative real-time PCR was performed on an ABI Prism 7900HT from Applied Biosystems (Nærum, Denmark). All amplification reactions were carried out in transparent 384-well MicroAmp® Optical reaction plates (Applied Biosystems) sealed with MicroAmp® Optical Adhesive Film, in a total volume of 11 μL containing 5.5 μL 2× SYBR Green PCR Master Mix (Applied Biosystems), 0.4 μL of each primer (10 μM), 2 μL template DNA (2 ng), and 2.7 μL nuclease-free water (Qiagen GmbH, Hilden, Germany). Liquid handling was performed with an epMotion 5075 (Eppendorf, Hørsholm, Denmark). The amplification program was identical for all amplifications and consisted of one cycle of 50 °C for 2 min; one cycle of 95 °C for 10 min; 40 cycles of 95 °C for 15 s and 60 °C for 1 min; and finally dissociation curve analysis for assessing amplicon specificity (95 °C for 15 s, 60 °C for 15 s, then increasing to 95 °C at 2% ramp rate). Initial qPCR screening on extracted mixed human fecal DNA from healthy volunteers was used to identify and remove primer sets that did not amplify the expected target from this matrix. Fecal DNA was obtained from the control group of a previously conducted study and was extracted using the QIAamp DNA Stool Mini Kit (Qiagen) preceded by a bead-beater step as previously described (Leser et al., 2000; Licht et al., 2006). A subset of 58 primer sets (of the 110), selected based on their ability to generate amplification products from the complex fecal DNA template material, was used for further evaluation of target specificity on pure culture DNA.
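The reaction composition given above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the published protocol: the function names and the 10% pipetting overage are hypothetical choices made for illustration, while the per-reaction volumes are those listed in the text.

```python
# Per-reaction volumes in microlitres, as listed in the text above.
PER_REACTION_UL = {
    "2x SYBR Green PCR Master Mix": 5.5,
    "forward primer (10 uM)": 0.4,
    "reverse primer (10 uM)": 0.4,
    "nuclease-free water": 2.7,
    "template DNA": 2.0,
}


def total_volume(per_reaction=PER_REACTION_UL):
    """Sum of all components; should equal the stated 11 uL reaction volume."""
    return round(sum(per_reaction.values()), 2)


def primer_final_conc_um(stock_um=10.0, primer_ul=0.4, reaction_ul=11.0):
    """Final concentration (uM) of each primer after dilution into the reaction."""
    return stock_um * primer_ul / reaction_ul


def premix_volumes(n_reactions, overage=0.10):
    """Volumes (uL) of a template-free premix for n_reactions wells.

    The overage fraction is a hypothetical allowance for pipetting loss,
    not a value taken from the study.
    """
    scale = n_reactions * (1 + overage)
    return {
        name: round(volume * scale, 1)
        for name, volume in PER_REACTION_UL.items()
        if name != "template DNA"
    }


if __name__ == "__main__":
    print("total volume (uL):", total_volume())
    print("primer final concentration (uM):", round(primer_final_conc_um(), 3))
    print("premix for 8 wells:", premix_volumes(8))
```

With 0.4 μL of a 10 μM stock in 11 μL, each primer ends up at roughly 0.36 μM in the reaction.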
qPCR primer validation on pure culture DNA
The 58 primer sets were tested against extracted DNA from 27 bacterial strains and one archaeal strain, using the PCR conditions listed above. Reactions were performed in duplicate using 2 ng of DNA as template and always including the universal bacterial primers (reference gene) on the same plate. The generated PCR products were assessed by dissociation curve analysis and 2% agarose gel electrophoresis, stained with SYBR Green, to determine the homogeneity and length of the amplification product, respectively. For each qPCR on pure culture DNA, the mean Ct-value was determined based on a set threshold value of 0.2 and using the automatic baseline correction setting in the qPCR software (SDS 2.2; Applied Biosystems, CA). To assess primer specificity, the difference in Ct-value between the universal primer set and every other primer set on the array was calculated for each strain. A maximum Ct-value of 35 was used for these calculations. A total of 31 specific primer sets as well as one universal bacterial reference primer set were selected for the GULDA based on their specificity toward the target microbial groups (Fig. 1). The RDP ProbeMatch tool was used to assess the binding potential of the universal primer set within each of the five predominant bacterial phyla of the gut separately. Visualization of amplification products by agarose gel electrophoresis following amplification on fecal DNA template showed that all 31 primer sets generated single and distinct bands of the expected length (data not shown).
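The Ct-difference used to judge primer specificity can be sketched as follows. This is an illustrative Python example under stated assumptions: the function names are hypothetical, the Ct-values in the usage example are invented, and treating a reaction with no amplification as having the maximum Ct of 35 is an assumption about how the cap was applied.

```python
MAX_CT = 35.0  # maximum Ct-value used for the calculations, as stated in the text


def capped(ct):
    """Cap a Ct-value at MAX_CT.

    Assumption: a reaction with no detectable amplification (None)
    is treated as if it had reached MAX_CT.
    """
    if ct is None:
        return MAX_CT
    return min(ct, MAX_CT)


def delta_ct(ct_specific, ct_universal):
    """Ct(specific primer set) minus Ct(universal primer set) for one strain.

    A value near zero suggests the specific primer set amplifies the strain
    about as well as the universal set; a large positive value suggests
    little or no amplification of that strain.
    """
    return capped(ct_specific) - capped(ct_universal)


def specificity_profile(specific_ct_by_strain, universal_ct_by_strain):
    """Delta-Ct for every strain that has a universal reference Ct."""
    return {
        strain: delta_ct(specific_ct_by_strain.get(strain), universal_ct)
        for strain, universal_ct in universal_ct_by_strain.items()
    }


if __name__ == "__main__":
    # Invented Ct-values for illustration only.
    universal = {"target strain": 15.0, "non-target strain": 15.5}
    specific = {"target strain": 16.0, "non-target strain": None}
    print(specificity_profile(specific, universal))
```

Capping at 35 keeps late, unreliable signals from producing arbitrarily large differences, so every non-amplifying strain is compared on the same scale.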
GULDA on DNA from human fecal samples
Extracted DNA from 12 human fecal samples, representing six infants each sampled at 9 and 18 months of age, was used as template for GULDA using the 31 validated primer sets with four technical replicates of each amplification. Following the thermocycling program, the raw fluorescence data recorded by the SDS software were exported to the LinRegPCR program (Ramakers et al., 2003; Ruijter et al., 2009). The LinRegPCR software was used to perform baseline correction and calculate the mean PCR efficiency per amplicon group. This was used to calculate the initial quantities N0 (arbitrary fluorescence units) for each amplicon by the formula N0 = threshold/(Effmean^Ct), that is, the threshold divided by Effmean raised to the power Ct, where Effmean denotes the mean PCR efficiency per amplicon, threshold is the optimal ‘cutoff’ in the exponential region, and Ct is the cycle number at which each sample exceeds this threshold. The relative abundance of the 31 specific amplicon groups was obtained by normalization to the N0-value obtained for the universal bacterial amplicon group determined in the same array. A detection limit of 10−5 (N0,specific/N0,universal) was applied to the normalized N0-values due to qPCR analysis limitations, and the normalized N0-value was set to this value for specific amplicon groups below this detection limit to allow further analysis.
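The calculation chain described above (efficiency from the log-linear phase, N0 as the threshold divided by the mean efficiency raised to the power Ct, normalization to the universal amplicon, and the 10−5 detection floor) can be sketched in Python. This is a simplified illustration and not the LinRegPCR implementation: the function names are hypothetical, the window of linearity is assumed to be supplied by the caller, and efficiency is expressed as fold increase per cycle (2.0 corresponds to perfect doubling).

```python
import math

DETECTION_LIMIT = 1e-5  # N0,specific / N0,universal, as stated in the text


def efficiency_from_window(cycles, fluorescence):
    """Estimate PCR efficiency from baseline-corrected data in the log-linear window.

    Fits log10(fluorescence) against cycle number by least squares;
    the efficiency (fold increase per cycle) is 10 ** slope.
    """
    logs = [math.log10(f) for f in fluorescence]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, logs))
    return 10 ** (sxy / sxx)


def initial_quantity(threshold, eff_mean, ct):
    """N0 = threshold / eff_mean ** ct, in arbitrary fluorescence units."""
    return threshold / eff_mean ** ct


def normalized_abundance(n0_specific, n0_universal):
    """Ratio to the universal amplicon, with values below the limit set to the limit."""
    return max(n0_specific / n0_universal, DETECTION_LIMIT)


if __name__ == "__main__":
    # Synthetic curve with perfect doubling, for illustration only.
    cycles = [18, 19, 20, 21, 22]
    signal = [0.01 * 2 ** (c - 18) for c in cycles]
    eff = efficiency_from_window(cycles, signal)
    n0_specific = initial_quantity(threshold=0.2, eff_mean=eff, ct=24.0)
    n0_universal = initial_quantity(threshold=0.2, eff_mean=eff, ct=16.0)
    print(normalized_abundance(n0_specific, n0_universal))
```

Because both N0-values come from the same plate and are expressed in the same arbitrary units, their ratio is unitless, which is what allows fold-changes to be computed without standard curves.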
Principal component analysis, fold-change calculations, and statistical analysis
The normalized N0-values (log10-transformed) obtained from each bacterial amplicon group were used as input for multivariate principal component analysis (PCA) using LatentiX version 2.11. Lines between the same individuals (at 9 and 18 months) were included in the PCA score plot. Fold-changes for specific amplicon groups were calculated as the log2 ratio of normalized abundances at 18 and 9 months. Statistical analysis was performed using the GraphPad Prism software (version 5.03; GraphPad Software Inc., La Jolla, CA). Indicated P-values refer to significance in Wilcoxon's signed rank test.
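The fold-change and paired test described above can be illustrated with a short, dependency-free Python sketch. It is not claimed to reproduce the GraphPad Prism output: the function names are hypothetical, the signed rank test is implemented here as an exact two-sided test by enumerating all sign assignments, zero differences are dropped, and tied magnitudes receive average ranks.

```python
import math
from itertools import product


def log2_fold_change(abundance_18m, abundance_9m):
    """log2 of the ratio of normalized abundances (18 months over 9 months)."""
    return math.log2(abundance_18m / abundance_9m)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank_exact(differences):
    """Exact two-sided p-value of the Wilcoxon signed rank test.

    Enumerates every assignment of signs to the ranks, so it is only
    practical for small sample sizes such as the six pairs used here.
    """
    d = [x for x in differences if x != 0]
    if not d:
        return 1.0
    ranks = average_ranks([abs(x) for x in d])
    total = sum(ranks)
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    observed = abs(w_plus - total / 2)
    extreme = 0
    for signs in product((0, 1), repeat=len(d)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - total / 2) >= observed - 1e-12:
            extreme += 1
    return extreme / 2 ** len(d)


if __name__ == "__main__":
    # Invented normalized abundances for six infants, for illustration only.
    at_9m = [0.010, 0.020, 0.005, 0.015, 0.008, 0.012]
    at_18m = [0.040, 0.050, 0.030, 0.045, 0.020, 0.060]
    changes = [log2_fold_change(b, a) for a, b in zip(at_9m, at_18m)]
    print(changes, wilcoxon_signed_rank_exact(changes))
```

With six pairs, the smallest two-sided p-value an exact signed rank test can return is 2/64 ≈ 0.031, reached only when all six individuals change in the same direction.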
Results and discussion
The identification, selection, and validation of specific qPCR primers that target relevant groups of common gut bacteria and that perform well during a universal thermocycling program was the primary focus of this study. Numerous primer sets targeting different bacterial taxonomical groups including species, genera, and phyla of the gut microbiota have been published during the last decades; however, far from all have been evaluated in depth for their specificity to the taxonomical group that they were designed to amplify. Primer validation may be performed either in silico with reference to, for example, the RDP or by laboratory tests against a panel of DNA extracted from related bacteria. In the present study, extracted DNA from a total of 28 microbial species was used for the specificity validation of 58 qPCR primer sets all targeting the 16S rRNA gene of gut bacteria. One universal primer set was included, designed to target the V3 variable region (positions 339–539 in the Escherichia coli gene) of all known bacteria (Walter et al., 2000; Chakravorty et al., 2007). This primer set was shown in silico to match on average 99.1% ± 0.88% of a total of 931 412 good-quality (> 1200 bp) 16S rRNA gene sequences representing Firmicutes, Bacteroidetes, Actinobacteria, Proteobacteria, and Verrucomicrobia, respectively, found in RDP, with allowance for two mismatches. In some cases, unspecific amplification or lack of amplification was observed, which may to some extent be caused by the requirement for primers to perform in the applied universal two-step qPCR, with both annealing and elongation at 60 °C. Following the final screening, a total of 32 primer sets were selected for the GULDA (Table 1): 31 specific primer sets collectively representing the five dominating bacterial phyla of the gut microbiota as well as the Euryarchaeota (Methanobrevibacter smithii), and one universal bacterial primer set.
The specificity of these primers was overall consistent with the expected target groups, and amplification efficiencies were comparable to those observed for the universal bacterial primer set, as determined by differences in Ct-values following amplification on pure culture DNA (Fig. 1). It was recently shown that it is possible to optimize qPCR assay efficiency by primer modification, in order to run 16S rRNA gene primers displaying optimal specificities at different annealing temperatures on the same PCR plate under the same experimental conditions (Bacchetti De Gregoris et al., 2011). In the present study, the PCR efficiency for each amplicon group was calculated separately from the slope of the amplification curve by linear regression within the window of linearity (logarithmic scale) by the use of the LinRegPCR software. The mean calculated efficiencies for each amplicon group were then used to determine the initial concentration, N0, of the DNA target, that is, specific 16S rRNA gene, in arbitrary fluorescence units (Ramakers et al., 2003; Ruijter et al., 2009). All N0-values for specific bacterial taxa were normalized to the calculated universal bacterial N0-value amplified on the same plate. Normalized N0-values obtained from the same amplicon group, for example phylum Bacteroidetes, are directly comparable to each other and may thus be used to determine the fold-change of specific groups of bacteria between two or more samples. Direct comparisons between different amplicon groups may, however, be biased due to, for example, differences in amplicon length and GC-content causing nonequal binding of SYBR Green. Analysis of the fecal samples obtained from six infants at both 9 and 18 months was performed by PCA of normalized N0-values calculated from GULDA (Fig. 2). The score plot shows that all six individuals migrated from left to right along PC#1 with generally little vertical movement along PC#2.
The initial grouping appeared more confined for the individuals at 9 months as compared to 18 months, which is consistent with the development of a more complex and individual gut microbiota. The loading plot indicates which bacterial taxa drive this migration, which was further studied by calculating the fold-changes for specific amplicon groups (Fig. 3). The latter shows a significant (P < 0.05) increase in the relative abundance of Clostridial cluster IV (Clostridium leptum group) and Bifidobacterium bifidum, a concurrent significant decrease in the abundance of Clostridium butyricum (Clostridial cluster I), and a tendency toward a decrease in Enterobacteriaceae and E. coli (P < 0.10) from 9 to 18 months of age. These findings are consistent with some of the results of a previous study using both qPCR and Northern blotting to characterize intestinal bacteria in infant stools (Hopkins et al., 2005), which showed a significant decrease in Enterobacteria and increase in Faecalibacterium prausnitzii (Clostridial cluster IV) rRNA after 6 months of age. No significant changes were observed within the remaining 26 amplicon groups in the present study. Although the dataset used in this study was fairly small (n = 6 infants), the methodological protocol of calculating the changes in a microbial community over a time period by the use of relative qPCR determinations performed under universal temperature cycling conditions was successfully demonstrated. In the present study, multiple bifidobacterial species were included in the array as this genus is known to be highly represented during infancy (Roger et al., 2010). Modification of GULDA for other purposes by incorporating other primer sets is, however, easily achieved and provides a high degree of flexibility to the qPCR array.
We expect GULDA to be a very useful tool to study induced changes in the composition of the gut microbiota and consequently further elucidate the causal relationships between the vast numbers of bacterial species present in the human gut microbiota.
Table 1. List of primer sets used in GULDA
Primer sequence (5′-3′)
Approximate amplicon size (bp)
For each primer set, the first row shows the forward primer and the second row the reverse primer. Y = C or T; R = A or G; M = A or C; S = C or G; B = G, C or T.
Abbreviations for primer ID are as follows: U, Universal; F, Firmicutes; B, Bacteroidetes; A, Actinobacteria; P, Proteobacteria; V, Verrucomicrobia; E, Euryarchaeota.
Primer sets F1b and A1b were associated with misamplification of Bifidobacterium spp. and slight misamplification within the Firmicutes phylum, respectively. Alternatives F1 and A1 are presented in the table and validated (Fig. 1).
Primer set B7 targets the 16S–23S rRNA gene intergenic spacer region/23S rRNA gene and primer set B8 targets the 16S–23S rRNA gene intergenic spacer region.
The authors would like to thank The Danish Agency for Science, Technology and Innovation for financial support. We thank laboratory technicians Bodil Madsen and Vivian Julia Anker for excellent technical assistance, Thomas Skov for advice on PCA analysis and Sara Dyhrberg and Louise Nissen-Schmidt for valuable assistance in the development of the GULDA qPCR setup.