The phylogeography of Adelie penguin faecal flora

Authors


*E-mail j.banks@waikato.ac.nz; Tel. (+64) 7838 4760; Fax (+64) 7838 4324.

Summary

The gut of animals changes quickly from a totally sterile environment before birth to a numerous and highly diverse microbial community shortly after birth. However, few studies have examined the source of the bacteria colonizing the gut. We used a genetic-based approach to investigate the distribution and acquisition of faecal microbial communities by Adelie penguins, Pygoscelis adeliae, breeding in the Ross Sea region of Antarctica by cloning a portion of the 16S rRNA gene and by automated ribosomal intergenic spacer analysis (ARISA). We hypothesized that bacteria were either acquired from the penguins' neighbours or inherited from their ancestors. Samples were collected from Adelie penguin breeding colonies at the north-western edge of the Ross Sea coast through to the southernmost Adelie penguin colonies on Ross Island. Most of the bacterial sequences we obtained were only distantly homologous with previously published sequences. Bacterial taxa appear to have a restricted distribution, as the majority of 16S rRNA clones were isolated from only one or two hosts. Faecal bacterial community similarity was strongly negatively correlated with the genetic distance between hosts, suggesting that bacterial communities are inherited. There was little support for a correlation between distance between collection sites and community similarity.

Introduction

The constituents of bacterial communities, their diversity and biogeography are poorly understood and yet microbial ecology drives Earth's ecology (Curtis et al., 2002). The predominant theory in microbial diversity has been that ‘everything is everywhere, the environment selects’ (Cho and Tiedje, 2000). However, studies of bacteria in symbiotic relationships with eukaryotes and in soils are revealing that endemism exists within the microbial world and bacterial diversification is ongoing (Fulthorpe et al., 1998; Cho and Tiedje, 2000; Funk et al., 2000). Genetic data are increasingly showing that in the microbial world everything is not everywhere, suggesting that the distribution of bacteria can be restricted by factors such as distance between sites (Cho and Tiedje, 2000).

The gut of animals changes quickly from a totally sterile environment before birth to a numerous and highly diverse microbial community that is maintained throughout life (Ley et al., 2006). However, the acquisition and diversity of commensals has not been extensively studied, in part because past approaches have used culture-based techniques to identify gut communities which limits the identifications to culturable commensals (Pace, 1997). Methods for identifying microbes based on the amplification of DNA have been applied recently to the gastrointestinal flora of animals, especially humans (for example Ley et al., 2006; 2008; Palmer et al., 2007; Li et al., 2008) and are providing insights into the vast diversity and the source of these microbial commensal communities.

It is generally thought that commensals are either inherited from parents during the parental care stage or they are acquired later in life from close contacts such as mates (Brooks and McLennan, 1991), or via a combination of both routes. Thus commensal organisms can be thought of as inherited as ‘heirlooms’ or acquired as ‘souvenirs’ (Kliks, 1990). The two methods of acquiring commensals give rise to two different patterns of relationships between hosts and commensals. If commensals are perpetually inherited as heirlooms, host and commensal phylogenies will be congruent and commensal community similarity will be negatively correlated with host genetic distance; if commensals are acquired as souvenirs, host and commensal phylogenies will almost certainly be incongruent (Brooks and McLennan, 1991; Paterson and Banks, 2001) and commensal community similarity will not be correlated with host genetic similarity. If host and commensal phylogenies are incongruent, it is likely that factors other than inheritance, for example spatial proximity, explain the acquisition of commensals. Understanding the extent of transfer of faecal bacteria may also provide insight into the transmission routes potential pathogens may take.

We examined whether the faecal commensals of Adelie penguins, Pygoscelis adeliae, are souvenirs or heirlooms. Adelie penguins breed on ice-free areas around the margin of the Antarctic continent and islands south of about latitude 60°S and then disperse only as far north as the limits of the pack ice during the non-breeding season (Marchant and Higgins, 1990). Thus Adelie penguins are an ideal group in which to study the source of faecal flora as human disturbance has been very recent and limited. We cloned and sequenced a portion of the 16S rRNA gene from DNA extracted from bacterial communities obtained from faecal swabs of Adelie penguins breeding at six sites in Antarctica to identify the faecal bacteria and to examine hypotheses regarding the distribution of bacteria. We also used a DNA fingerprinting tool (automated ribosomal intergenic spacer analysis, ARISA) to characterize and compare the faecal communities of breeding birds. ARISA utilizes length variation in the intergenic spacer region between the 16S rRNA and the 23S rRNA genes, a hypervariable region that varies among species and among strains of bacteria, to characterize bacterial communities (Fisher and Triplett, 1999). ARISA is a fast and relatively inexpensive method of characterizing microbial communities and thus allows more samples to be processed compared with traditional methods (Brown et al., 2005).

We found that faecal microbial communities from individual birds were remarkably diverse and most bacterial taxa were present in only a small proportion of the birds we examined, suggesting that each bird has a unique faecal community. A few bacterial taxa were distributed over large geographic distances. We found a significant negative correlation between faecal bacterial community similarity and host genetic distance in both the clone libraries and the ARISA data. There was no support for a correlation between bacterial community similarity and distance between collection sites.

Results

Faecal flora composition

We obtained sequences from 183 clones from the six faecal samples analysed. DOTUR (distance-based OTU and Richness, Schloss and Handelsman, 2005) assigned these sequences to 52 operational taxonomic units (OTUs) using the criterion of 99% or more similarity. More than 48% of the OTUs were unique to their host and none of the clones was isolated from all six Adelie penguins. The majority of clones (76%) were found in one or two of the six clone libraries (Fig. 1).

Figure 1.

Frequency distribution of the occurrence of each clone in the six clone libraries (52 clones differing > 1% for 16S rRNA gene from six Adelie penguins).

The Ribosomal Database Project (RDP) (Cole et al., 2007; Wang et al., 2007) classified the clones in four bacterial phyla. Most of the clones were from the Firmicutes (41%) and Actinobacteria (35%) phyla, with 12.5% of the clones from Proteobacteria and 5% from Bacteroidetes. Actinobacteria and Firmicutes clones were found in all six colonies; Bacteroidetes were not sequenced from the Capes Cotter and Wheatstone samples and Proteobacteria were not sequenced from the Capes Royds and Wheatstone samples (Fig. 2). Each penguin had an average of 14.5 clones (SD = 3.8). Operational taxonomic units were not shared extensively among hosts – on average an OTU was isolated from 1.77 (SD = 1.06) of the six penguins for which communities were cloned.
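The occupancy statistics in this paragraph (how many of the six hosts each OTU was isolated from) can be illustrated with a minimal sketch. The OTU labels and host sets below are hypothetical and are not the study's data; the use of the sample standard deviation is also an assumption, as the text does not say which form was reported.

```python
from collections import Counter
from statistics import mean, stdev

def occupancy_summary(otu_hosts):
    """Summarise how many hosts each OTU was found in.

    otu_hosts maps an OTU label to the set of hosts it was isolated from.
    Returns (frequency distribution, mean occupancy, sample SD).
    """
    occupancy = [len(hosts) for hosts in otu_hosts.values()]
    distribution = Counter(occupancy)  # number of hosts -> number of OTUs
    return distribution, mean(occupancy), stdev(occupancy)

# Hypothetical presence data for five OTUs across six hosts (illustrative only).
example = {
    "otu_a": {"Adare"},
    "otu_b": {"Adare", "Hallett"},
    "otu_c": {"Royds"},
    "otu_d": {"Bird", "Crozier", "Royds"},
    "otu_e": {"Crozier"},
}

distribution, avg, sd = occupancy_summary(example)
for n_hosts in sorted(distribution):
    print(n_hosts, "host(s):", distribution[n_hosts], "OTU(s)")
print("mean occupancy:", round(avg, 2), "SD:", round(sd, 2))
```

The frequency distribution returned here has the same shape as the histogram described in the Fig. 1 caption: the number of OTUs found in one, two, three or more libraries.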

Figure 2.

Relative abundances of the bacterial phyla sequenced from each clone library.

Shannon indices from each clone library were approximately similar ranging from 2.52 to 2.77 (the value for Cotter Cliffs was 2.02 but we selected fewer clones for sequencing from this library). Chao-1 estimated that there were 66.6 OTUs (at 99% similarity, 95% confidence interval 57–94.7) for the six clone libraries combined. Chao-1 estimated species richness for individual penguins ranged from 15.3 to 33 (with the Cape Cotter individual having an estimated 12.3 species as we sequenced fewer clones from this individual). Rarefaction analyses (Fig. S1) suggested that we did not reach a diversity plateau for individual birds.
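As a rough illustration of the two diversity summaries used here, the sketch below computes a Shannon index and a Chao-1 richness estimate from per-OTU clone counts. The counts are hypothetical, not the study's data. The Chao-1 form shown is the classic estimator, S_obs + F1²/(2·F2), with the bias-corrected form used only when there are no doubletons; the text does not state which variant was applied, so this choice is an assumption.

```python
import math
from collections import Counter

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def chao1(counts):
    """Classic Chao-1 richness estimate from singleton (F1) and doubleton (F2) OTUs."""
    observed = [c for c in counts if c > 0]
    freq = Counter(observed)
    f1, f2 = freq[1], freq[2]
    if f2 == 0:
        # Bias-corrected form, used here to avoid dividing by zero doubletons.
        return len(observed) + f1 * (f1 - 1) / 2.0
    return len(observed) + f1 * f1 / (2.0 * f2)

# Hypothetical clone counts per OTU for a single library (illustrative only).
example_counts = [5, 3, 2, 2, 1, 1, 1]
print("Shannon H':", round(shannon_index(example_counts), 3))
print("Chao-1 estimate:", chao1(example_counts))
```

Because Chao-1 depends on the numbers of singletons and doubletons, a library with fewer sequenced clones will tend to give a lower estimate, which is consistent with the caveat given above for the library with fewer clones.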

The average similarity between our OTUs and sequences in GenBank (http://www.ncbi.nlm.nih.gov/) was 91% (Table 1), indicating that many of the penguin faecal bacteria have not been sequenced before. Operational taxonomic units that are more than 97% similar for 16S are considered to be analogous to metazoan species (Hagström et al., 2000), even though OTUs that are 99% similar for 16S may exhibit considerable phenotypic differences (Ward, 1998). Only four of our OTUs were more than 97% homologous with sequences in GenBank. The closest matches (all more than 99% similar) were to Corynebacterium sphenisci and C. spheniscorum bacteria isolated from the cloaca of Magellanic penguins Spheniscus magellanicus (Goyache et al., 2003a,b), and Mycoplasma sphenisci isolated from the nasal cavity of a captive African penguin, Spheniscus demersus (Frasca et al., 2005). Other close matches to our clones included Actinobacillus succinogenes (97.5%) collected from a bovine rumen (Guettler et al., 1999); Corynebacterium argentoratense, C. freneyi and C. xerosis (97.5%) collected from the human throat (Riegel et al., 1995), pus (Renaud et al., 2001) and ‘clinical samples’ (Drancourt et al., 2000). Other clones had close matches with Mycoplasma hyopharyngitis, a bacterium of the respiratory system of pigs (Blank et al., 1996); Dietzia cinnamea, a bacterium isolated from the perianal region of a human (Yassin et al., 2006) and several species of Tsukamurella, associated with environments such as human blood and sputum, and sewage sludge foam (Nam et al., 2003).

Table 1.  Closest GenBank and Ribosomal Database Project (RDP) matches to 16S rRNA gene sequences obtained from Adelie penguin, Pygoscelis adeliae, faecal swabs.

| Bacterial OTU | GenBank accession no. of clone | No. of clones | Closest GenBank match (name, sample origin, phylum) | GenBank accession no. | Similarity (%) | Closest RDP match (name, sample origin, phylum) | Similarity (%) | GenBank accession no. |
|---|---|---|---|---|---|---|---|---|
| 1218Z | FJ393495 | 9 | Corynebacterium spheniscorum, Magellanic penguin, Actinobacteria | AJ429234 | 99 | Corynebacterium spheniscorum | 99 | AJ429234 |
| 1218H1 | FJ393487 | 3 | Mycoplasma sphenisci, African penguin choana, Firmicutes | AY756171 | 98 | Mycoplasma sphenisci | 98 | AY756171 |
| 1216W | FJ393471 | 3 | Corynebacterium sphenisci, Magellanic penguin, Actinobacteria | AJ440964 | 97 | Corynebacterium sphenisci | 99 | AJ440964 |
| 1219H | FJ393498 | 3 | Bisgaard Taxon 14, pheasant, Proteobacteria | AY172725 | 97 | Bisgaard Taxon 14 | 97 | AY172725 |
| 1217S1 | FJ393479 | 9 | Uncultured actinobacterium clone, sea squirt Microcosmus sp., Actinobacteria | AY770699 | 96 | Dietzia sp., swine effluent pit, Actinobacteria | 94 | DQ337507 |
| 1218E | FJ393486 | 3 | Mycoplasma indiense, NS, Firmicutes | AF125593 | 95 | Mycoplasma orale, NS, Firmicutes | 96 | AY796060 |
| 1220B1 | FJ393505 | 2 | Uncultured bacterium clone T4292, raw cows’ milk, Actinobacteria | EU029493 | 95 | Myceligenerans xiligouense, NS, Actinobacteria | 91 | AY354285 |
| 1220M | FJ393509 | 1 | Actinomyces nasicola, human nose, Actinobacteria | AJ508455 | 95 | Actinomyces nasicola | 95 | AJ508455 |
| 1218D | FJ393485 | 5 | Mycoplasma caviae, NS, Firmicutes | AF221111 | 94 | Mycoplasma sp., human urine, Firmicutes | 95 | AJ000494 |
| 1222N | FJ393512 | 2 | Mobiluncus mulieris, NS, Actinobacteria | AJ576081 | 94 | Mobiluncus mulieris | 94 | AJ576081 |
| 1218J1 | FJ393489 | 2 | Actinomyces europaeus, anus abscess, Actinobacteria | AF355191 | 93 | Actinomyces europaeus, NS, Actinobacteria | 94 | AM084230 |
| 1218M | FJ393492 | 1 | Uncultured bacterium clone, tamar wallaby cloaca, NS | EU289099 | 93 | Allisonella histaminiformans, NS, Firmicutes | 90 | AF548373 |
| 1219A1 | FJ393496 | 2 | Actinomyces nasicola, human nose, Actinobacteria | AJ508455 | 93 | Georgenia ruanii, forest soil, Actinobacteria | 92 | DQ203185 |
| 1219J1 | FJ393499 | 3 | Eubacteriaceae bacterium, contaminated groundwater, Firmicutes | DQ196635 | 93 | Eubacteriaceae bacterium | 94 | DQ196635 |
| 1220K1 | FJ393508 | 1 | Propionibacteriaceae bacterium, cattle farm waste, Actinobacteria | AB377178 | 93 | Actinobacteria VeSm15, rice paddy soil, Actinobacteria | 93 | AJ229243 |
| 1216E | FJ393468 | 2 | Kocuria sp., marine sediment, Actinobacteria | DQ448709 | 92 | Georgenia ruanii, forest soil, Actinobacteria | 92 | DQ203185 |
| 1218L | FJ393491 | 3 | Peptostreptococcus sp., NS, Firmicutes | X90471 | 92 | Peptostreptococcus sp., rumen, Firmicutes | 92 | AF044946 |
| 1219A | FJ393497 | 5 | Uncultured organism clone, hypersaline microbial mat, NS | EU246061 | 92 | Eubacterium angustum, NS, Firmicutes | 93 | L34612 |
| 1219P | FJ403329 | 5 | Actinomyces urogenitalis, NS, Actinobacteria | AJ243893 | 92 | Actinomyces urogenitalis | 93 | AJ243893 |
| 1222V | FJ393514 | 6 | Actinomyces urogenitalis, NS, Actinobacteria | AJ243893 | 92 | Actinomyces hordeovulneris, NS, Actinobacteria | 93 | X82448 |
| 1216S | FJ393470 | 9 | Uncultured Firmicutes bacterium clone, human colonic mucosal biopsy, Firmicutes | EF071399 | 91 | Sporanaerobacter acetigenes, NS, Firmicutes | 88 | AF358114 |
| 1217F | FJ393473 | 1 | Clostridiales oral clone, periodontal, Firmicutes | AF538854 | 91 | Clostridium sp., UASB reactor, Firmicutes | 87 | AY949857 |
| 1217V | FJ393480 | 9 | Peptostreptococcus sp., endodontic infection, Firmicutes | AF481225 | 91 | Peptostreptococcus sp. | 92 | AF481225 |
| 1222B | FJ393510 | 2 | Actinomyces europaeus, anus abscess, Actinobacteria | AF355191 | 91 | Actinomyces europaeus, NS, Actinobacteria | 91 | AM084230 |
| 1216F | FJ393469 | 1 | Uncultured bacterium clone, tamar wallaby cloaca, NS | EU289099 | 90 | Propionispora vibrioides, NS, Firmicutes | 84 | AJ279802 |
| 1217J | FJ393474 | 12 | Uncultured bacterium clone, human gut, NS | AY985585 | 90 | Erysipelothrix rhusiopathiae, NS, Firmicutes | 87 | AB055909 |
| 1218R1 | FJ393494 | 1 | Uncultured bacterium clone, mouse caecum, NS | EF603657 | 90 | Clostridiales bacterium, NS, Firmicutes | 85 | DQ168652 |
| 1219Q | FJ393500 | 2 | Uncultured Porphyromonas sp., human subgingival plaque, Bacteroidetes | AM420091 | 90 | Porphyromonas gingivalis, NS, Bacteroidetes | 89 | AB035455 |
| 1220A | FJ393504 | 1 | Uncultured bacterium, feedlot manure, NS | AF317379 | 90 | Guggenheimella bovis, bovine lesions, Firmicutes | 89 | AY272039 |
| 1220C | FJ393506 | 2 | Actinomyces canis, dog, Actinobacteria | AJ243892 | 90 | Actinomyces nasicola, human nose, Actinobacteria | 90 | AJ508455 |
| 1216B | FJ393467 | 2 | Actinomyces europaeus, human subcutaneous fistulae, Actinobacteria | EU138866 | 89 | Actinomyces coleocanis, dog vagina, Actinobacteria | 89 | AJ249326 |
| 1217L1 | FJ393475 | 1 | Actinomyces canis, dog, Actinobacteria | AJ243892 | 89 | Actinomyces nasicola, human nose, Actinobacteria | 90 | AJ508455 |
| 1217P | FJ393476 | 11 | Cardiobacterium sp., dental plaque dog, Proteobacteria | AY827877 | 89 | Cardiobacterium sp., NS, Proteobacteria | 89 | AF144696 |
| 1217R1 | FJ393477 | 14 | Uncultured bacterium clone, human faeces, NS | DQ795023 | 89 | Clostridium populeti, NS, Firmicutes | 88 | X71853 |
| 1218C | FJ393484 | 7 | Actinomyces canis, dog, Actinobacteria | AJ243892 | 89 | Actinomyces nasicola, human nose, Actinobacteria | 89 | AJ508455 |
| 1218I1 | FJ393488 | 6 | Uncultured Filifactor sp., human subgingival plaque, Firmicutes | AM419956 | 89 | Filifactor villosus, oral cavity, Firmicutes | 89 | AF537211 |
| 1218K | FJ393490 | 5 | Uncultured actinomycete, termite gut, Actinobacteria | AB192230 | 89 | Atopobium minutum, NS, Actinobacteria | 88 | X67148 |
| 1222H | FJ393511 | 1 | Filifactor villosus, NS, Firmicutes | X73452 | 89 | Filifactor villosus, oral cavity, Firmicutes | 90 | AF537211 |
| 1217D1 | FJ393472 | 1 | Uncultured bacterium clone, dairy cow rumen, NS | EF445243 | 88 | Bacillus cereus, grassland soil China, Firmicutes | 80 | AY176771 |
| 1217O1 | FJ403328 | 2 | Guggenheimella bovis, cow, Firmicutes | AY272039 | 88 | Guggenheimella bovis | 86 | AY272039 |
| 1217R | FJ393478 | 1 | Guggenheimella bovis, cow, Firmicutes | AY272039 | 88 | Guggenheimella bovis | 90 | AY272039 |
| 1218P1 | FJ393493 | 2 | Uncultured bacterium clone, mouse caecum, NS | EF603657 | 88 | Clostridiales bacterium, NS, Firmicutes | 85 | DQ168652 |
| 1220F | FJ393507 | 4 | Guggenheimella bovis, cow, Firmicutes | AY272039 | 88 | Guggenheimella bovis | 89 | AY272039 |
| 1222O1 | FJ393513 | 2 | Uncultured bacterium clone, mouse caecum, NS | EF603657 | 88 | Clostridiales bacterium, NS, Firmicutes | 84 | DQ168652 |
| 1219X | FJ393502 | 1 | Uncultured bacterium, human faeces, NS | DQ805601 | 87 | Clostridium sp., UASB reactor, Firmicutes | 85 | AY949857 |
| 1219R | FJ393501 | 6 | Neisseria sp., NS, Proteobacteria | DQ006842 | 86 | Neisseria sp. | 87 | DQ006842 |
| 1219Z | FJ393503 | 13 | Porphyromonas endodontalis, oral cavity, Bacteroidetes | AY253728 | 86 | Parabacteroides goldsteinii, human intestine, Bacteroidetes | 87 | EU136697 |
| 1217Y | FJ393481 | 15 | Uncultured bacterium, aerobic activated sludge, NS | EF648056 | 83 | Cytophaga sp., associated with deep sea polychaete, Bacteroidetes | 81 | AJ431253 |

NS, not stated in GenBank.

Other notable matches, although at slightly less than 97% similarity, included Mycoplasma iners (96.8% homology), a commensal that has been isolated from the upper respiratory tract of several bird species and has been associated with the development of caseous exudates in the joint space, liver and heart when experimentally inoculated into chicken embryos (Wakenell et al., 1995). Another of our clones was 96.6% homologous with Nocardia seriolae (AF254420), which causes abscesses in the epidermis and tubercles in the gills, liver and spleen of fish (Kono et al., 2002). Several of the clones had 96.6% homology with several Nocardia spp. that are known to cause nocardiosis (pulmonary infection, cellulitis and/or brain abscess) in humans (Centers for Disease Control and Prevention, 2007: http://www.cdc.gov/nczved/dfbmd/disease_listing/nocardiosis_ti.html).

Forty of the GenBank matches provided enough information to identify the collection site of the GenBank sample. Twenty-five of the 40 closest matches to our clones were collected from the gastrointestinal tract of various animals (Table 1) and five were likely associated with the gut (for example, collected from feed lot manure).

Adelie penguin phylogeny

We sequenced 651 nucleotides of the Adelie penguin mitochondrial control region from each host. Nested clade analysis (Templeton, 1998; 2004) of the Adelie penguin sequence data did not reject the null hypothesis that there had been no impediments to gene flow between the breeding colonies.

Souvenirs or heirlooms

There was a significant negative correlation between the UniFrac (Lozupone and Knight, 2005) community similarity values and host genetic distance in comparison with 2000 randomized permutations of the genetic distance matrix (partial Mantel test, r = −0.77, P = 0.01). There was a non-significant correlation between UniFrac community similarity and geographic separation (partial Mantel test, r = −0.44, P > 0.1) (Table 2). The fstat analysis (Goudet, 2002) of the correlation between UniFrac community similarity and host genetic distance and geographic distance between colonies found that the two explanatory variables explained 62% of the variance in the model. Genetic distance explained 59.6% of the variance in the model when the effect of geographical distance was controlled for, significantly more than in the randomized comparisons (partial Mantel test, P < 0.01). Geographic distance explained 20.0% of the variance when the effect of genetic distance was controlled for, which was not significantly more than found from the random comparisons (partial Mantel test, P = 0.6).
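The logic of a partial Mantel test can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in the software cited above: it uses the first-order partial correlation formula on the upper triangles of the matrices, permutes the rows and columns of the genetic distance matrix (as the text describes), and reports a two-sided permutation P-value. The 4 × 4 matrices `sim`, `gen` and `geo` are hypothetical values chosen only for the example.

```python
import math
import random

def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix into a list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y with the linear effect of z removed."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def permute(matrix, order):
    """Reorder rows and columns of a square matrix with the same permutation."""
    return [[matrix[i][j] for j in order] for i in order]

def partial_mantel(sim, gen, geo, permutations=2000, seed=0):
    """Partial Mantel test of sim vs gen, controlling for geo.

    Returns (observed partial r, two-sided permutation P-value).
    """
    rng = random.Random(seed)
    x, z = upper_triangle(sim), upper_triangle(geo)
    observed = partial_r(x, upper_triangle(gen), z)
    hits = 0
    order = list(range(len(gen)))
    for _ in range(permutations):
        rng.shuffle(order)
        r = partial_r(x, upper_triangle(permute(gen, order)), z)
        if abs(r) >= abs(observed):
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical matrices for four hosts (illustrative only).
sim = [[1.0, 0.9, 0.5, 0.45], [0.9, 1.0, 0.7, 0.3],
       [0.5, 0.7, 1.0, 0.8], [0.45, 0.3, 0.8, 1.0]]
gen = [[0.0, 0.1, 0.4, 0.5], [0.1, 0.0, 0.3, 0.6],
       [0.4, 0.3, 0.0, 0.2], [0.5, 0.6, 0.2, 0.0]]
geo = [[0, 10, 200, 650], [10, 0, 150, 600],
       [200, 150, 0, 520], [650, 600, 520, 0]]

r, p = partial_mantel(sim, gen, geo, permutations=999)
print("partial r:", round(r, 3), "P:", round(p, 3))
```

Whole rows and columns are permuted together, rather than individual cells, because the pairwise values in a distance matrix are not independent of one another. With only four hosts there are just 24 distinct permutations, so the P-value in this toy example is coarse.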

Table 2.  Results of partial Mantel tests on the correlation between bacterial community similarities from clone libraries (UniFrac) or ARISA data and host genetic distance or geographic separation.

| Data | Mantel test | r | β | Variance explained (%) |
|---|---|---|---|---|
| Clones | (UniFrac × GENE).GEOG | −0.77 | 3.029* | 59.6** |
| Clones | (UniFrac × GEOG).GEN | −0.44 | 0.000024 NS | 20 NS |
| Clones | GEOG × GENE | 0.42 | 0.000016 NS | 17.5 NS |
| ARISA | (BC × GENE).GEOG | −0.34 | −254.0** | 11.5** |
| ARISA | (BC × GEOG).GEN | 0.04 | 0.001 NS | < 1 NS |
| ARISA | GEOG × GENE | 0.03 | 91149.67 NS | < 1 NS |

  1. The symbol ‘*’ denotes 0.05 > P > 0.01, the symbol ‘**’ denotes P < 0.01, and NS is P > 0.05, so for example P = 0.01 means only one in 100 of the randomizations gave values as large as or larger than the observed value.

  2. ‘(UniFrac × GEN).GEOG’ tests the partial correlation between UniFrac similarity metrics (i.e. pairwise similarities between faecal bacterial communities) and host genetic distances in which the effect of geographic distance between hosts was controlled. Likewise, ‘(UniFrac × GEOG).GEN’ controls for the effect of host genetic distance on the correlation between community similarity and geographic separation. Likewise, ‘(BC × GEN).GEOG’ tests the partial correlation between Bray–Curtis similarity coefficients and host genetic distance in which the effect of geographic distance between hosts was controlled. The null hypotheses are that the objects in matrix Y are not (linearly) correlated with the objects in X, and that the amount of variance explained by the variables was less than that explained in 95% of the randomizations.

Bray–Curtis similarity coefficients generated from the ARISA data were also significantly negatively correlated with host genetic distances (partial Mantel test, r = −0.34, P < 0.01). Bray–Curtis similarity coefficients were not significantly correlated with geographic separation of hosts (partial Mantel test, r = 0.04, P > 0.1) (Table 2). The partial Mantel tests between the Bray–Curtis coefficients and host genetic distances and geographic distances between colonies found that the two variables explained 11.8% of the variance in the model. Host genetic distance explained 11.5% of the variance in the model when the effect of geographical distance was controlled for, significantly more than in the randomized comparisons (partial Mantel test, P < 0.01). Geographic separation explained 0.17% of the variance when the effect of genetic distance was controlled for, which was not significantly more than found from the random comparisons (partial Mantel test, P = 0.67).
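For readers unfamiliar with the Bray–Curtis coefficient, the following minimal sketch shows how a similarity value can be computed from two ARISA fingerprints. Each profile is represented as a mapping from fragment length (bp) to peak intensity; this representation and the example values are assumptions made for illustration and are not taken from the study.

```python
def bray_curtis_similarity(profile_a, profile_b):
    """Bray-Curtis similarity = 1 - sum|a_i - b_i| / sum(a_i + b_i).

    Each profile maps a fragment length (bp) to a non-negative peak intensity;
    a fragment missing from a profile is treated as intensity 0.
    """
    lengths = set(profile_a) | set(profile_b)
    diff = sum(abs(profile_a.get(k, 0.0) - profile_b.get(k, 0.0)) for k in lengths)
    total = sum(profile_a.get(k, 0.0) + profile_b.get(k, 0.0) for k in lengths)
    if total == 0:
        raise ValueError("both profiles are empty")
    return 1.0 - diff / total

# Hypothetical ARISA profiles for two birds (illustrative only).
bird_1 = {412: 0.30, 455: 0.50, 610: 0.20}
bird_2 = {412: 0.25, 520: 0.35, 610: 0.40}
print(round(bray_curtis_similarity(bird_1, bird_2), 3))
```

A value of 1 means the two fingerprints are identical and 0 means they share no fragment lengths. Computing the coefficient for every pair of birds gives the similarity matrix that is compared with the host genetic and geographic distance matrices.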

We also analysed the geographically ‘near’ (i.e. < 151 km between colonies) and geographically ‘far’ (521–704 km between colonies) comparisons separately, as there was a marked change in the strength of the correlations between the two distance categories (Fig. 3). Geographic distance and genetic distance explained 91% of the variance in the UniFrac similarity data for the ‘near’ comparisons. Both host genetic distance (variance explained = 55%, P < 0.01) and geographic distance (variance explained = 86%, P < 0.01) explained a significant amount of the variance in the model when the effect of the other variable was controlled for. However, neither variable was significantly correlated with the UniFrac similarity when the effect of the other variable was controlled for (partial Mantel tests, r for genetic distance = −0.74, P = 0.65; r for geographical distance = −0.92, P = 0.17). Genetic distance was significantly positively correlated with geographical separation (Mantel test, r = 0.61, P = 0.05) in the near comparisons.

Figure 3.

Comparison of the pairwise distances between clone community similarity and host geographic separation from the six clone libraries.

Host genetic distance and geographic separation explained 88% of the variation in the UniFrac data for the ‘far’ comparisons but only host genetic distance was significantly negatively correlated with UniFrac similarity (partial Mantel test, r = −0.91, P < 0.01). There was a non-significant correlation between UniFrac similarity and geographic separation (Mantel test, r = 0.32, P = 0.56) for the far comparisons. Only host genetic distance explained significantly more variance than did the randomizations (partial Mantel test, P < 0.01). There was no significant correlation between host genetic distance and geographic separation (Mantel test, r = −0.11, P = 0.72) for the far comparisons.

Host genetic distance and geographical separation explained 16.4% of the variance in the Bray–Curtis similarity coefficients calculated from the ARISA data for the ‘near’ comparisons. Only genetic distance explained significantly more variance than the randomizations (partial Mantel test, P = 0.05) and there was a trend towards significance for the negative correlation between host genetic distance and the Bray–Curtis similarity coefficients (partial Mantel test, r = −0.37, P = 0.06). Bray–Curtis similarity coefficients were not significantly correlated with geographic separation (partial Mantel test, r = −0.19, P = 0.35), and geographic separation did not explain a significant amount of the variance in Bray–Curtis values (partial Mantel test, P = 0.29). There was no significant correlation between host genetic distance and geographic separation (partial Mantel test, r = 0.06, P = 0.74) in the ARISA data.

Host genetic distance and geographic separation explained 11.0% of the variance in Bray–Curtis similarity coefficients for the ‘far’ comparisons. There were trends towards significance for the negative correlation between host genetic distance and Bray–Curtis similarity coefficients (partial Mantel test, r = −0.32, P = 0.06), and for the amount of variance in the Bray–Curtis similarities explained by host genetic distance (partial Mantel test, P = 0.06), in the far comparisons. There was no significant correlation between Bray–Curtis similarities of the far comparisons and geographical separation (partial Mantel test, r = 0.10, P = 0.66). Geographical separation also did not explain a significant amount of the variance in the Bray–Curtis similarity coefficients (partial Mantel test, P = 0.56).

Discussion

Faecal flora: composition and distribution

The composition of the Adelie penguin faecal flora contained a high proportion of Bacteroidetes and Firmicutes, similar to the gut flora of humans and mice. However, unlike humans and mice, Actinobacteria were relatively abundant members of the penguin faecal flora (30%). This finding differs from the few published studies of faecal flora from other metazoan species. Eckburg and colleagues (2005) and Ley and colleagues (2006) found that the human gut flora consists almost entirely of Bacteroidetes and Firmicutes (92–99% of clones), and Actinobacteria make up only 0.2% of the human flora (Eckburg et al., 2005) and 0.1% of the mouse flora (Ley et al., 2005). The higher proportion of Actinobacteria in our results may be due to host differences; for example, the vent of Adelie penguins is very close to the soil and approximately 15% of clones identified from Antarctic soils are Actinobacteria (Yergeau et al., 2007; Niederberger et al., 2008).

The absence of Bacteroidetes from the Cotter Cliffs and Cape Wheatstone samples, and the absence of Proteobacteria from the Capes Royds and Wheatstone samples, are noteworthy. These three colonies are the smallest colonies we sampled: 2800 pairs at Cape Wheatstone, 4000 pairs at Cape Royds and 27 700 pairs at Cotter Cliffs in 1990. In comparison, the large colonies range from 43 900 pairs at Cape Hallett up to 169 000 pairs at Cape Adare in 1990 (Woehler and Croxall, 1997).

Most of the bacterial taxa we identified have not been sequenced previously. One notable match was a close match with two bacterial species collected from African and Magellanic penguins. The ranges of these three penguins do not normally overlap although Adelie and Magellanic penguins have both been recorded infrequently on the South Shetland Islands (Marchant and Higgins, 1990) and there is a record of a Magellanic penguin mixing with Adelie penguins at Marguerite Bay, Antarctic Peninsula (Barbosa et al., 2007).

Although some of the bacterial taxa had large geographical ranges, 32% of the clones were restricted to a single host. Ley and colleagues (2006) found that human gut microbes also had restricted distributions, with over 70% of their 16S rRNA phylotypes being identified from only a single host subject. The restricted range of many gut microbes suggests that there are limited opportunities for some gut microbes to transfer and/or establish in new hosts.

The Shannon indices (H′) of diversity ranged from 2.69 to 2.77 for faecal samples collected from penguins at each colony. The Shannon index was 3.54 for all six colonies combined, which is broadly comparable to a value obtained from the human rectum of 3.08 (Wang et al., 2005) and values of H′ for preadolescent turkeys of various ages that ranged from 1.49 to 3.99 (Scupham, 2007). The estimated overall species richness of 66.6 OTUs from the penguin cloacae compares with an estimate of 85 OTUs for the human rectum (Wang et al., 2005).

Souvenirs or heirlooms?

Given the restricted range of many of the bacteria in our samples, we asked, ‘How are bacterial communities acquired?’ We found a strong negative correlation between bacterial community similarity and host genetic distance, and a non-significant correlation between community similarity and geographic separation from our sequence data. The more extensive ARISA data supported the results from the sequences with a significant negative correlation between Bray–Curtis similarity coefficients and host genetic distance, but not between Bray–Curtis coefficients and geographic separation. Our results suggest that bacterial communities can be inherited.

Other studies have also found a relationship between host relatedness and bacterial community similarity on shorter time scales. Closely related mice (i.e. mothers and offspring) were found to share gut (caecal) bacterial community members (Ley et al., 2005). Lucas and Heeb (2005) found that the bacterial communities of sibling chicks of great tits Parus major and blue tits P. caeruleus raised in the same nest were significantly similar. They also showed that the chicks of one species fostered into nests of the other species acquired the bacterial communities of their foster siblings. Although these studies cannot distinguish between the effect of geographical proximity and inheritance, Ley and colleagues (2008) found strong support for codiversification on evolutionary time scales between mammalian hosts and their gut bacteria. It appears from our results that the faecal bacteria are inherited on evolutionary time scales at the population level as well as at the host species level.

A comparison of UniFrac community similarities derived from the clone libraries and geographic separation suggested that the UniFrac similarity values may have been more strongly correlated with shorter distances between colonies compared with the correlation over larger distances. However, neither variable was significantly correlated with UniFrac similarity for the near comparisons despite the two variables explaining 91% of the variance in the model, possibly due to the reduced number of comparisons in the partitioned data sets. The larger ARISA data set (in comparison to the clone libraries) also did not find a significant correlation between community similarity and geographic separation but did find a significant negative correlation between community similarity and genetic distance for the near comparisons. The correlation between host genetic distance and bacterial community similarity suggests that, even for the near comparisons, host genetic distance explains a significant proportion of the faecal bacterial community similarity.

Limited dispersal by Adelie penguins may account for lack of a correlation between community similarity and geographic separation. Ainley and DeMaster (1980) found that more than 98.8% of over 51 000 Adelie penguins banded as chicks at Cape Crozier returned to their natal rookeries to breed and almost all of the birds that did not return to their natal colony moved to colonies within 10 km. Despite these apparently low rates of migration to new breeding colonies, Roeder and colleagues (2001) found there were low levels of genetic differentiation among Adelie penguin colonies around Antarctica and suggested that environmental fluctuations may cause periods of increased migration. Periodic migrations may be the reason that bacterial community similarity is not correlated with geographical distance between colonies. Host interactions during the juvenile dispersal phase and/or the non-breeding season may also explain the wide distribution of some faecal bacteria.

It is likely that differences in the dispersal abilities of bacterial taxa may obscure explanations of bacterial community structure, as the distribution of a bacterium with high dispersal capabilities is unlikely to reflect host genetic structure or geographical separation. Gordon and Lee (1999) studied several enteric bacteria from a variety of Australian mammals. They found that the presence or absence of Citrobacter freundii was explained by geographic separation of its hosts and not the genetic relation between hosts. In contrast, host genetic distance rather than geographic separation of hosts explained the distribution of Hafnia alvei. Many of the bacterial taxa we sequenced were found in a single host and none of the bacteria was in every host, which suggests that few of the Adelie penguins' bacteria were good dispersers and may explain why there is a strong negative correlation between bacterial community similarity and host genetic distances.

The ARISA data explained less of the variance between bacterial community similarity and host genetic distance and geographic separation than did the data from the clone sequences. This may be because ARISA can either underestimate or overestimate diversity. Underestimation occurs because ARISA does not detect sequence differences and will pool OTUs of the same length even though the sequences themselves may be substantially different. Overestimation can occur because the 16S-ITS-23S operon can be replicated several times in a genome (for example, Escherichia coli has seven operons, Cardinale et al., 2004) with the chance that the intergenic transcribed spacer (ITS) length may vary among operons. Sequencing clones is less likely to over- or underestimate diversity as we used the actual sequence to assign identity and thus could distinguish between fragments of the same length. However, we found that ARISA detected similar trends to the analysis of clone sequences and is therefore still a useful method of characterizing bacterial communities.

There are two sympatric clades of Adelie penguins, the ‘Antarctic’ and the ‘Ross Sea’ clades, present in the Ross Sea with different phylogeographic histories (Lambert et al., 2002; Ritchie et al., 2004). The clone libraries presented here are from the Antarctic clade and it will be interesting to examine the relationship between hosts and faecal bacterial communities within the other clade and to compare the bacterial communities from the two clades. Distinguishing the contribution of host genetic relatedness from host geographic separation to the acquisition of gut bacteria is difficult, but there is support for the inheritance of bacteria even within a host species.

Experimental procedures

Faecal samples were collected from Adelie penguins breeding at Capes Adare (71.3106°S, 170.2078°E), Bird (77.2140°S, 166.4402°E), Crozier (77.4622°S, 169.2818°E), Hallett (72.3185°S, 170.2221°E), Royds (77.5539°S, 166.1618°E) and Wheatstone (72.6218°S, 170.2016°E) and Cotter Cliffs (72.4041°S, 170.3110°E) on the Victoria Land coast of the Ross Sea (Fig. 4) by inserting sterile rayon swabs (LP Italiana Spa, Milan, Italy) into their cloacae. Swabs were frozen immediately in the field in liquid nitrogen and then placed at −80°C on return to the laboratory. Feather samples were also collected from individual birds and placed in tubes containing 100% ethanol and then stored at −80°C in the laboratory.

Figure 4.

Ross Sea collection sites of faecal samples. Rectangle on the inset map shows area of Antarctica covered by the larger map. The grey shading represents land and ‘permanent’ ice shelf.

DNA extraction and cloning

Genomic DNA was extracted from the swabs using the PowerSoil DNA isolation kit (Mo Bio, West Carlsbad, California, USA) following the manufacturer's protocol. Penguin DNA was extracted from feather samples collected from three birds from each of the seven breeding colonies using a DNeasy kit (Qiagen, Hilden, Germany) following the manufacturer's protocols. DNA was re-suspended in 30 μl of elution buffer. We sequenced 651 nucleotides of the hypervariable region of the mitochondrial control region using the protocol of Ritchie and colleagues (2004), although we increased the annealing temperature to 56°C. Samples were analysed on an ABI 3130XL DNA sequencer. Adelie control region sequences were aligned in Clustal X and uncorrected distance matrices were generated in PAUP*4.10b (Swofford, 2003). Based on the genetic sequences of the birds, individuals were then assigned to either the Antarctic or Ross Sea clades.
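The uncorrected distance between two aligned control region sequences is simply the proportion of compared sites at which they differ. The following is a minimal sketch of that calculation, not the PAUP* implementation. The sequences and bird labels are hypothetical, and skipping sites with gaps or ambiguous bases is an assumption made here for illustration.

```python
from itertools import combinations

# Characters treated as missing data (an assumption for this sketch).
MISSING = set("-N?")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected (p) distance: differing sites / compared sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue  # skip sites that cannot be compared
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(seqs: dict) -> dict:
    """Pairwise p-distances keyed by (name_1, name_2)."""
    return {
        (n1, n2): p_distance(seqs[n1], seqs[n2])
        for n1, n2 in combinations(sorted(seqs), 2)
    }


# Hypothetical aligned fragments, far shorter than the 651 nucleotides sequenced.
example_seqs = {
    "bird_1": "ACGTACGT",
    "bird_2": "ACGTACGA",
    "bird_3": "AC-TACGA",
}
example_matrix = distance_matrix(example_seqs)
```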

Because the Antarctic and Ross Sea Adelie penguin clades have different phylogeographic histories (Ritchie et al., 2004), we selected faecal samples from six Adelie penguins of the Antarctic clade for the generation of clone libraries. We amplified the 5′ end of the 16S rRNA gene through the ITS region and into the 5′ end of the 23S rRNA gene using the primers EubB (Lane, 1991) and ITSReub (Cardinale et al., 2004) from one randomly selected bird of the Antarctic clade from each of the breeding colonies at Capes Adare, Crozier, Hallett, Royds and Wheatstone, and Cotter Cliffs. Polymerase chain reaction conditions were 94°C for 10 min, 25 cycles of 94°C for 40 s, 55°C for 40 s and 72°C for 3 min, with a final step at 72°C for 7 min. Polymerase chain reaction products were gel-purified using gel extraction kits (Qiagen) and cloned using TOPO TA (Invitrogen, Carlsbad) cloning kits. Clones were cultured on media containing kanamycin to identify those containing a plasmid with an insert, and then the plasmid was extracted using an alkaline lysis method (Kotchoni et al., 2003). Plasmid inserts were sequenced from the 16S rRNA end of the insert using the primer M13F supplied with the cloning kit. Operational taxonomic units were assigned using DOTUR (Schloss and Handelsman, 2005), with a minimum of 99% sequence similarity required to classify clones to a phylotype. Clones were assigned to taxonomic groups using the sequence match function in the RDP (Cole et al., 2007; Wang et al., 2007) and blastn searches in GenBank (http://www.ncbi.nlm.nih.gov/). Sequences from the clones were assembled in BioEdit 7.0.5.3 (Hall, 1999) and aligned in Arb (Ludwig et al., 2004). DOTUR was also used to calculate Shannon indices (H′) (Magurran, 1988) to assess species diversity, and the Chao-1 richness estimate (Chao, 1987) to estimate the richness of the faecal bacterial community.
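Both diversity measures can be computed directly from the number of clones assigned to each phylotype. The sketch below is illustrative only and is not DOTUR's code. It assumes the natural logarithm for H′ and the bias-corrected form of the Chao-1 estimator, which remains defined when no phylotype is represented by exactly two clones. The clone counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over observed phylotypes."""
    observed = [c for c in counts if c > 0]
    total = sum(observed)
    if total == 0:
        raise ValueError("no clones counted")
    return -sum((c / total) * math.log(c / total) for c in observed)


def chao1(counts):
    """Bias-corrected Chao-1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of phylotypes seen exactly once and twice.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical clone counts per phylotype in one library.
example_counts = [5, 3, 1, 1, 1, 2]
example_h = shannon_index(example_counts)
example_chao1 = chao1(example_counts)
```

A library dominated by singletons yields a Chao-1 estimate well above the observed number of phylotypes, which is the pattern expected when most taxa are recovered from only one clone.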

Automated ribosomal intergenic spacer analysis (ARISA)

We also used ARISA (Fisher and Triplett, 1999) to characterize faecal bacterial communities of 12 birds of the Antarctic clade from (numbers in brackets after a location indicate number of faecal communities analysed) Capes Adare (1), Bird (1), Crozier (3), Hallett (2), Royds (2) and Wheatstone (1), and Cotter Cliffs (2). Polymerase chain reactions were conducted on the DNA extracted from the faecal swab using the primers ITSF and hex-labelled ITSReub (Cardinale et al., 2004). Each reaction consisted of 7.5 μl of 10× buffer with MgCl2 (15 mmol l−1) and dNTP (8 mmol l−1), 1.875 μl of each primer (10 µmol l−1), 4.5 units of Taq (Roche), 47.25 μl of water and 20–177 ng of genomic DNA. Polymerase chain reaction products were purified with Quick Clean purification kits (GenScript Corporation). Amplicon lengths were resolved on a Megabace 500 series capillary sequencer (Amersham Pharmacia, Sunnyvale). Peaks that were less than three standard deviations above the baseline ‘noise’ in the output from the sequencer were removed from the analysis using the software T-RFLP Stats (Abdo et al., 2006).
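The three standard deviation rule can be illustrated with a simplified iterative filter. This is a sketch under stated assumptions and not the T-RFLP Stats code: it assumes that noise has a mean of zero, so the standard deviation is computed about zero, and that peaks exceeding the threshold are set aside as signal before the noise level is re-estimated from the remainder. The function name and peak areas are hypothetical.

```python
import math


def split_signal_noise(areas, n_sd=3.0):
    """Return (signal_indices, noise_indices) for a list of peak areas."""
    noise = list(range(len(areas)))
    signal = []
    while noise:
        # Standard deviation of the current noise set about an assumed mean of zero.
        sd = math.sqrt(sum(areas[i] ** 2 for i in noise) / len(noise))
        threshold = n_sd * sd
        above = [i for i in noise if areas[i] > threshold]
        if not above:
            break  # nothing left that stands out from the baseline
        signal.extend(above)
        noise = [i for i in noise if areas[i] <= threshold]
    return sorted(signal), noise


# Hypothetical profile: fifteen small baseline peaks and two large peaks.
example_areas = [2, 1, 3, 2, 1, 2, 3, 1, 2, 2, 1, 3, 2, 1, 2, 200, 150]
example_signal, example_noise = split_signal_noise(example_areas)
```

In this example the second large peak is only recognised on the second pass, once the largest peak no longer inflates the estimate of baseline variation.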

Analyses

We tested for host panmixia using nested clade analysis (Templeton, 1998; 2004) as implemented by the software ANeCA (Clement et al., 2000; Posada et al., 2000; Panchal, 2007), which automates the application of the inference key of Templeton and colleagues (1995). The locations of unsampled Adelie penguin breeding colonies in the Ross Sea region for ANeCA were obtained from Marchant and Higgins (1990).

Bacterial DNA sequences from the six clone libraries were analysed with UniFrac (Lozupone and Knight, 2005) to test whether each of the clone libraries was significantly different from the overall UniFrac metric and whether pairs of clone libraries were significantly different from each other. UniFrac generates a metric based on the difference between two environments in terms of the branch length that is unique to one environment; for example, if every sequence is found in only one of the two environments the UniFrac metric = 1; if all the sequences are found in both environments the metric = 0 (Lozupone and Knight, 2005). A phylogenetic tree for calculating the UniFrac metric was estimated from the clone sequences in PAUP*4.10b (Swofford, 2003) using the neighbour-joining algorithm (Saitou and Nei, 1987).
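The two limiting cases of the metric can be reproduced with a small worked example. The sketch below computes the unweighted form of UniFrac as the fraction of branch length leading to sequences from only one of the two environments. It is not the UniFrac software itself. The tree is hypothetical and is supplied as a list of branches, each with a length and the set of tips descending from it.

```python
def unweighted_unifrac(branches, tip_envs, env_a, env_b):
    """Unique branch length / total branch length for two environments.

    branches: iterable of (length, set_of_descendant_tips)
    tip_envs: dict mapping tip name -> set of environments containing it
    """
    pair = {env_a, env_b}
    unique = 0.0
    total = 0.0
    for length, tips in branches:
        envs = set()
        for tip in tips:
            envs |= tip_envs[tip] & pair
        if not envs:
            continue  # branch leads to neither environment
        total += length
        if len(envs) == 1:
            unique += length  # branch leads to only one environment
    if total == 0:
        raise ValueError("no branch length shared with either environment")
    return unique / total


# Hypothetical tree ((s1:1,s2:1):1,(s3:1,s4:1):1) written out branch by branch.
example_branches = [
    (1.0, {"s1"}), (1.0, {"s2"}), (1.0, {"s1", "s2"}),
    (1.0, {"s3"}), (1.0, {"s4"}), (1.0, {"s3", "s4"}),
]
separate = {"s1": {"A"}, "s2": {"A"}, "s3": {"B"}, "s4": {"B"}}
shared = {tip: {"A", "B"} for tip in ("s1", "s2", "s3", "s4")}
mixed = {"s1": {"A"}, "s2": {"B"}, "s3": {"A"}, "s4": {"B"}}
```

With these inputs, the `separate` assignment gives a metric of 1, the `shared` assignment gives 0, and the `mixed` assignment falls between the two because only the terminal branches are unique to one environment.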

We used the cluster environments function in UniFrac to generate the distance between each pair of clone libraries using the weighted UniFrac metric so that differences in relative abundances of the clones in each library contributed to community differences. Conceptually, it was easier to use UniFrac similarity values, which we generated by subtracting the UniFrac distance from 1. We then used the partial Mantel test implemented in the software fstat (Goudet, 2002) with 2000 randomizations to calculate partial correlations between the UniFrac environmental similarity values and geographic distance and the uncorrected host genetic distances. We also used fstat (Goudet, 2002) to calculate the amount of the variance in the correlation explained by host genetic distance and geographic separation.
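A partial Mantel test can be sketched as follows. This is one common formulation, in which the partial correlation between two matrices is calculated while controlling for a third and the null distribution is built by permuting the rows and columns of the similarity matrix. The implementation in fstat may differ in detail. All matrices below are hypothetical and the helper names are introduced here for illustration.

```python
import math
import random
from itertools import combinations


def square_from_upper(values, n):
    """Build a symmetric n x n matrix from its upper-triangle values."""
    m = [[0.0] * n for _ in range(n)]
    for (i, j), v in zip(combinations(range(n), 2), values):
        m[i][j] = m[j][i] = float(v)
    return m


def upper(m, order=None):
    """Upper triangle as a flat list, optionally after relabelling rows/columns."""
    idx = order if order is not None else list(range(len(m)))
    return [m[idx[i]][idx[j]] for i, j in combinations(range(len(m)), 2)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(a, b, c):
    """Correlation of a and b after removing the linear effect of c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def partial_mantel(sim, x, control, n_perm=2000, seed=0):
    """Return (observed partial r, two-sided permutation p-value)."""
    rng = random.Random(seed)
    b, c = upper(x), upper(control)
    r_obs = partial_r(upper(sim), b, c)
    hits = 0
    for _ in range(n_perm):
        order = list(range(len(sim)))
        rng.shuffle(order)  # relabel hosts in the similarity matrix only
        if abs(partial_r(upper(sim, order), b, c)) >= abs(r_obs) - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical matrices for five hosts: genetic distance and distance in km.
genetic = square_from_upper(
    [0.010, 0.025, 0.030, 0.018, 0.022, 0.035, 0.012, 0.008, 0.027, 0.031], 5)
geographic = square_from_upper(
    [57, 12, 640, 655, 48, 600, 610, 650, 660, 70], 5)
```

Calling `partial_mantel(similarity, genetic, geographic)` returns the partial correlation between community similarity and host genetic distance with geographic separation held constant; swapping the last two arguments tests geographic separation instead.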

ARISA profiles were compared in a pairwise manner using Bray–Curtis similarity coefficients calculated in the software Primer 6 (Clarke and Warwick, 2001) from the square root transformations of each peak's area. Bray–Curtis coefficients were then compared with the genetic distance between pairs of hosts and their geographic separation using partial Mantel tests in fstat (Goudet, 2002) with 2000 randomizations to generate the null distribution for significance calculations. We also used fstat to calculate the amount of the variance in the correlation explained by host genetic distance and geographic separation.
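The Bray–Curtis comparison of two ARISA profiles can be illustrated as below. This is a minimal sketch rather than the Primer 6 implementation. Profiles are represented as mappings from fragment length to peak area, the values are hypothetical, and the coefficient is returned on a 0 to 1 scale rather than as a percentage.

```python
import math


def bray_curtis_similarity(profile_a, profile_b):
    """Bray-Curtis similarity on square-root-transformed peak areas."""
    lengths = set(profile_a) | set(profile_b)
    diff = 0.0
    total = 0.0
    for length in lengths:
        # A fragment length absent from a profile contributes an area of zero.
        a = math.sqrt(profile_a.get(length, 0.0))
        b = math.sqrt(profile_b.get(length, 0.0))
        diff += abs(a - b)
        total += a + b
    if total == 0:
        raise ValueError("both profiles are empty")
    return 1.0 - diff / total


# Hypothetical profiles: fragment length (bp) -> peak area.
example_a = {400: 4.0, 500: 9.0}
example_b = {400: 16.0, 600: 1.0}
example_similarity = bray_curtis_similarity(example_a, example_b)
```

The square root transformation reduces the influence of the largest peaks, so that less abundant fragments still contribute to the comparison.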

Geographic distances were computed from the Global Positioning System coordinates of the collection sites using the Haversine formula implemented on the Movable Types website (Veness, 2007). We assumed that penguins could move by the shortest distance between two points as there are few barriers between the colonies at least later in the summer when the sea ice has broken up. The exception to this assumption is the shortest distance between Capes Royds and Crozier, which is blocked by the ‘permanent’ Ross Ice Shelf. For this distance we used the distance from Cape Royds to Cape Bird to Cape Crozier, the shortest open water distance between Royds and Crozier.
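The distance calculation, including the detour around the Ross Ice Shelf, can be sketched as follows. The coordinates are those given for the collection sites, with southern latitudes entered as negative values. The mean Earth radius of 6371 km is an assumption of this sketch and the function name is introduced here for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Collection site coordinates (latitude, longitude).
ROYDS = (-77.5539, 166.1618)
BIRD = (-77.2140, 166.4402)
CROZIER = (-77.4622, 169.2818)

# Direct line, which crosses the 'permanent' ice shelf.
direct_km = haversine_km(*ROYDS, *CROZIER)

# Open water route used instead: Royds to Bird, then Bird to Crozier.
open_water_km = haversine_km(*ROYDS, *BIRD) + haversine_km(*BIRD, *CROZIER)
```

The open water route is necessarily longer than the direct line, so using it increases the geographic separation assigned to the Royds and Crozier pair.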

Acknowledgements

Thanks to Charles Lee, Ian McDonald, Andreas Rueckert and Susie Wood for advice and comments. Logistic support was provided through Antarctica New Zealand. Financial support was provided by the New Zealand Foundation for Research Science and Technology through a postdoctoral fellowship for JCB (UOWX0504) and research contract UOWX0505, and from the University of Waikato Vice Chancellor's research fund. This work contributes to the Latitudinal Gradient Project (LGP) and the Evolution and Biodiversity in the Antarctic (EBA) programme.
