Comparative bacterial diversity in recent Hawaiian volcanic deposits of different ages


  • Editor: Christophe Tebbe

Correspondence: Klaus Nüsslein, Department of Microbiology, University of Massachusetts, Amherst, MA 01003, USA. Tel.: +1 413 545 1356; fax: +1 413 545 1578; e-mail:


Volcanic activity creates new landforms that can change dramatically over time as a consequence of biotic succession. Nonetheless, volcanic deposits present severe constraints for microbial colonization and activity. We have characterized bacterial diversity on four recent deposits at Kilauea volcano, Hawaii (KVD). Much of the diversity was either closely related to uncultured organisms or distinct from any reported 16S rRNA gene sequences. Diversity indices suggested that diversity was highest in a moderately vegetated 210-year-old ash deposit (1790-KVD), and lowest for a 79-year-old lava flow (1921-KVD). Diversity for a 41-year-old tephra deposit (1959-KVD) and a 300-year-old rainforest (1700-KVD) reached intermediate values. The 1959-KVD and 1790-KVD communities were dominated by Acidobacteria, Alpha- and Gammaproteobacteria, Actinobacteria, Cyanobacteria, and many unclassified phylotypes. The 1921-KVD, an unvegetated low pH deposit, was dominated by unclassified phylotypes. In contrast, 1700-KVD was primarily populated by Alphaproteobacteria with very few unclassified phylotypes. Similar diversity indices and levels of trace gas flux were found for 1959-KVD and 1790-KVD; however, statistical analyses indicated significantly different communities. This study not only showed that microorganisms colonize recent volcanic deposits and are able to establish diverse communities, but also that their composition is governed by variations in local deposit parameters.


The Hawaiian Islands are the most isolated archipelago on Earth. Their location and volcanism have strongly shaped their biological history, and have led to the formation of model systems for ecological studies (Vitousek, 2004). Kilauea volcano in Hawaii Volcanoes National Park has created a natural laboratory, which includes chronosequences of contemporary and ancient lava flows and related deposits (Fig. 1) (Neal & Lockwood, 2003) that provide numerous possibilities for documenting and understanding patterns of microbial colonization and succession.

Figure 1.

 A simplified geologic map of Kilauea caldera shows the age and distribution of recent lava flows (Neal & Lockwood, 2003). Dates represent time of deposition of parent material at each site. Kilauea volcano is located on the Island of Hawaii.

The main island of Hawaii has been described as a biodiversity hotspot, with known endemic eukaryotes as well as bacteria (Myers et al., 2000; Donachie et al., 2004). Several studies of ecosystem dynamics have used the Hawaii Island system as an ecological model, providing background data for our investigation (Crews et al., 1995; Vitousek et al., 1995; Chadwick et al., 1999; Schuur et al., 2001). Recently, an NSF-Microbial Observatory was established in the Hawaiian archipelago to study the microbial biodiversity of lakes (Donachie et al., 2004). Terrestrial microbiologists have also made use of this ‘hotspot’, although not to the same extent as researchers in other fields. The study of microbial communities in Hawaiian soils has been limited (Nüsslein & Tiedje, 1998, 1999), and only recently were the first microbial studies of recent volcanic deposits (<300-years-old) at the Kilauea volcano described (King, 2003).

The absence of organic matter and fixed nitrogen in young volcanic materials imposes severe constraints on microbial colonization and activity (King, 2003). Furthermore, solidified lava is not permeable: openings in its surfaces represent former gas bubbles that are not interconnected, which leaves lava unable to absorb or retain water. Thus, early colonization depends on precipitation and an exogenous source of nutrients, such as atmospheric trace gases (e.g., CO, H2, CH4). Recent observations of young Hawaiian volcanic deposits indicate that CO and H2 provide a source of carbon and energy for early microbial colonization, accounting for 2–4% and 15–20% of total respiratory reducing equivalent flow, respectively (King, 2003).

Recently, two studies successfully analyzed the diversity of CO-oxidizing (Dunfield & King, 2004) and facultatively lithotrophic (Nanba et al., 2004) microbial communities on several Hawaiian volcanic deposits using newly introduced statistical methods that facilitate comparisons of clone libraries (Bohannan & Hughes, 2003; Chao & Shen, 2005; Schloss & Handelsman, 2005). However, in both cases the bacterial diversity examined was restricted to a single functional molecular marker: either rbcL, the gene coding for the large subunit of the form I ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) (Nanba et al., 2004), or a portion of coxL, the gene coding for the large subunit of carbon monoxide dehydrogenase (CODH) (Dunfield & King, 2004). Lithotrophs are united by their ability to incorporate CO2 for cell carbon through the activity of RubisCO, whereas carboxydotrophic bacteria utilize CO as a sole carbon and energy source through the activity of CODH. Results of such comparisons indicated that statistically different communities for each functional group occurred at geographically close but distinct sites that varied in deposit age and vegetation (Dunfield & King, 2004; Nanba et al., 2004). Furthermore, analysis of clone libraries based on the PCR amplification of coxL revealed differences among sites in the relative abundance of various proteobacterial and nonproteobacterial taxa (Dunfield & King, 2004).

We describe here results from an analysis of 16S rRNA gene sequences amplified from genomic extracts of previously described Hawaiian volcanic deposits (King, 2003). Our objectives were to expand our knowledge of microbial community composition in recent volcanic deposits, to determine the extent to which community structure varied spatially within the Kilauea caldera, to identify environmental factors that correlate with structure, and to assess the extent to which patterns observed for trace gas uptake and contribution were reflected in the composition of microbial communities as a whole. This study was designed to compare bacterial diversity among three young volcanic deposits and an established young forest (300-year-old) by analyzing 16S rRNA gene libraries with a strategic progression of statistical methods.

Materials and methods

Field sites

The Kilauea volcano is located on the Island of Hawaii (N19°425′, W155°292′) 1277 m above sea level and occupies an area of 1430 km2, or 13.7% of the island. Deposition dates and distribution of recent volcanic deposits from the Kilauea volcano were obtained from Neal & Lockwood (2003). Four representative sites have been established within the caldera of Kilauea volcano and on its flank (Fig. 1). Ecosystem development and soil parameters at each of the sites have already been reported in King (2003). King (2003) identified the 1959, 1921, 1790 and 1700 sites as II-B, I-C, I-E, and II-C, respectively. For the purpose of this study the Kilauea volcano deposit (KVD) sites were identified according to the time of deposition of parent material, namely 1959-KVD (41-years-old, relative to sampling time), 1921-KVD (79-years-old), 1790-KVD (210-years-old), and 1700-KVD (300-years-old).

Environmental variables

Soil parameters and data used for trace gas profiles have already been reported in King (2003). Only results corresponding to volcanic deposits from 1959, 1921, 1790, and 1700 were used for this study. Soil parameters varied significantly among volcanic deposits. Water content ranged between 14.0% and 82.5%, with the highest value for the 1700-KVD (forest). The pH of volcanic deposit samples ranged between 3.9 and 5.7, with the second youngest, the 1921-KVD deposit, being the most acidic. With the exception of the forest site, total C and N values were lower than 1.0% and 0.03%, respectively. No significant difference was found for total C, N and water content between sites 1959-KVD and 1790-KVD (P>0.05), but there was a significant difference in pH (P<0.001). Pairwise comparison indicated a significant difference in C (P<0.05), N (P<0.05), and pH (P<0.05) between the forest and the other sites. Additionally, both C and N values of site 1921-KVD were different from the rest of the sites (P<0.05). The increase of C and N concentrations did not correlate with deposit age (r2=0.30, P>0.05; r2=0.06, P>0.05).

Uptake rates of atmospheric CO and H2 during core-based assays account for important levels of reducing equivalents associated with CO2 production for the 1921-KVD, 1959-KVD, and 1790-KVD deposits, but contribute negligibly to the forest site (previously reported in King, 2003). After sorting trace gas flux data into a matrix for principal components analysis (PCA), three groups were clearly separated (data not shown). The statistical analyses revealed significantly distinct levels of trace gas uptake and relative contribution of CO and H2 by either the 1700-KVD or the 1921-KVD sites, but 1790-KVD and 1959-KVD sites showed no significant difference in levels of trace gas flux (P>0.05). The forest site was most dissimilar from the other fields, largely due to significant decreases in CO uptake, oxidation of H2 and CO and an increased production of CO2 in comparison with all other sites (P<0.007). Furthermore, the 1921-KVD site was clearly separated from 1790-KVD and 1959-KVD sites due to a significant increase in CO oxidation (P<0.001).

Sampling and extraction of total DNA from volcanic deposits

Triplicate samples (triangular pattern, each corner located 10–20 m apart) from each site were collected aseptically from the same location and on the same day as used by King (2003). At each site (Fig. 1), litter and the first 1 cm (A horizon) of soil were removed before soil collection; the samples were sealed in sterile plastic bags, placed immediately on ice, and finally stored at −20°C. Isolation of DNA from 1 g of soil (3 ×) was accomplished with a MoBio Soil Extraction Kit (MoBio Laboratories Inc., Carlsbad, CA), according to the manufacturer's protocol. Following the extraction, the DNA products were cleaned and concentrated (DNA Clean & Concentrator™-25 Kit, ZYMO Research, Orange, CA), according to the manufacturer's protocol. Soil organisms can exhibit patchy distributions at the scale of centimeters to meters, even where topography and soil texture are relatively uniform (Ettema & Wardle, 2002). Because of the possible heterogeneity in our soils, and to maximize the representation of indigenous members, the sampling approach had to consider the homogenization of a number of individual samples to obtain one soil sample representative of the average soil community (Nicol et al., 2003). Consequently, triplicate samples of purified DNA product were pooled. Preliminary characterization of within-site differences was determined by quantifying fatty acid methyl esters (FAMEs) of triplicate subsamples using a modified Bligh and Dyer method (Petsch et al., 2003). Statistical analysis was used to determine whether significant differences in the FAME patterns extracted from the soil samples existed across the four different sites and within samples from the same site. Our results indicated that significant differences existed between deposits (P<0.01), but there were no significant differences between samples at each of the sites (data not shown). Although these results were important to support this study, we did not include them in the text because these samples were taken at a different time point.

PCR amplification of 16S rRNA genes

Amplification of the 16S rRNA gene was performed using primers 8f and 1492r (Lane et al., 1985) in a total volume of 50 μL. Each reaction mixture contained: 20 ng of purified DNA, each primer at a concentration of 0.3 μM, 150 μM of each dNTP (Sigma-Aldrich Co., St Louis, MO), 200 ng of bovine serum albumin (Fisher Scientific Co., Fair Lawn, NJ), 1.5 mM of MgCl2 (Sigma-Aldrich Co.), and 3 U of Taq DNA polymerase (Promega Corp., Madison, WI), in 1 × PCR Buffer (Promega Corp.). The PCR program used for amplification was as follows: 95°C for 5 min, followed by 30 cycles of 94°C for 30 s, 56°C for 30 s, 72°C for 45 s, and a final extension step at 72°C for 5 min. PCR amplifications were performed using a PTC-200 Peltier Thermal Cycler (MJ Research, Waltham, MA).

Cloning, sequencing, and phylogenetic analysis of 16S rRNA genes

Construction of 16S rRNA gene clone libraries is described elsewhere (Nüsslein & Tiedje, 1998, 1999). Before cloning, PCR products were purified using the QIAquick PCR Purification Kit (Qiagen Inc., Valencia, CA). Purified fragments were cloned with the pGEM-T Easy vector and transformed into Escherichia coli JM109, following the manufacturer's recommendations (Promega). After PCR amplification of the 16S rRNA gene fragments from E. coli with the vector primers M13f and M13r (the same PCR conditions as those described above), 100 clones with the correct size insert (≈1500 bp) were randomly selected to assess genetic diversity at each site. Amplified 16S rRNA gene fragments were sequenced with a model 3730xl DNA Analyzer (Applied Biosystems Inc., Foster City, CA). Sequencing reactions were carried out with primer 8f (Lane et al., 1985) and sequences ≥800 bp were used in this analysis. All sequences were manually edited using the software bioedit v7.0.4 (Hall, 1999) and aligned using clustalx v1.83 (Thompson et al., 1997). Following this step, sequences were checked for chimera artifacts using the software mallard v1.02 (Ashelford et al., 2006). The number and relative proportion (in percent) of sequences excluded from subsequent analyses were seven (8.0%), 10 (10.1%), and seven (7.3%) for the 1700-KVD, 1790-KVD, and 1959-KVD sites, respectively. No chimeric sequence was found in the analysis of site 1921-KVD.

A limitation to using traditional statistics to describe and compare diversity is that phylotypes or operational taxonomic units (OTUs) are defined inconsistently (Martin, 2002). Typically, sequences with greater than 97% identity are assigned to the same species, those with >95% identity are assigned to the same genus, and those with >80% identity are assigned to the same phylum (Schloss & Handelsman, 2005). Although microbiologists have not reached a consensus concerning the classification of prokaryotes at the species level based on phylogenetic data, most published libraries are based on the narrow range of 97–99% similarity (Kempton, 2002; Martin, 2002; Bohannan & Hughes, 2003; Hill et al., 2003; Forney et al., 2004; Kemp & Aller, 2004; Schloss & Handelsman, 2005). We restricted our investigation to a relatively conservative cutoff value of ≥97%, as this value was proposed previously as a discriminator of species vs. strain differences (Stackebrandt & Goebel, 1994).

Sequences that showed ≥97% sequence identity were considered similar strains of the same phylotype and grouped together (i.e. as one phylotype) using the software dotur v1.53 (Schloss & Handelsman, 2005). A phylogenetic tree was constructed from the alignments using the Minimum Evolution method with distances calculated under the Tamura-Nei model (Tamura & Nei, 1993), using the software mega v3.1 (Kumar et al., 2004). Bootstrap confidence values were obtained with 1000 replicates. To test the robustness of our phylogenetic analyses we also employed different algorithms (e.g. neighbor-joining, maximum parsimony); the resulting phylogenetic trees consistently supported the major clusters indicated in our study. The tools Classifier and Sequence Match in the Ribosomal Database Project II release 9.37 (Cole et al., 2003) and blastn (Altschul et al., 1997) were used to classify the clones and identify the nearest neighbors of the >800 bp sequenced 16S rRNA genes in the GenBank database.
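The grouping of sequences into phylotypes at ≥97% identity can be illustrated with a small sketch. The code below is not dotur itself; it is a minimal furthest-neighbor (complete-linkage) clustering written for illustration, and it assumes pre-aligned sequences of equal length and an uncorrected p-distance, whereas the study may have used corrected distances. The function names `pairwise_distance` and `furthest_neighbor_otus` are hypothetical.

```python
from itertools import combinations


def pairwise_distance(a, b):
    """Uncorrected p-distance between two aligned sequences (gap columns skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("sequences share no aligned positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def furthest_neighbor_otus(seqs, cutoff=0.03):
    """Group sequence indices into OTUs by complete-linkage clustering.

    Two clusters are merged only if the LARGEST distance between any of
    their members is within the cutoff (0.03 corresponds to 97% identity).
    """
    dist = {
        frozenset((i, j)): pairwise_distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }

    def linkage(c1, c2):
        return max(dist[frozenset((i, j))] for i in c1 for j in c2)

    clusters = [[i] for i in range(len(seqs))]
    while True:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = linkage(clusters[a], clusters[b])
            if d <= cutoff and (best is None or d < best[0]):
                best = (d, a, b)
        if best is None:
            return clusters
        _, a, b = best  # a < b, so deleting b leaves index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]


# Toy alignment: sequences 0 and 1 differ at 2 of 100 positions (98% identity).
toy = ["A" * 100, "A" * 98 + "CC", "C" * 100]
otus = furthest_neighbor_otus(toy)
```

With the toy alignment, the first two sequences fall into one phylotype and the third forms its own. Complete linkage is the conservative choice here because every member of an OTU must be within the cutoff of every other member.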

Nucleotide sequence accession numbers

The sequences obtained from the Hawaiian volcanic deposit (KVD) clone libraries have been deposited in the GenBank database under accession numbers AY425761–AY425792 and AY917279–AY917401.

Community analysis

A population census of species using the 16S rRNA gene or any other molecular marker provides only limited information on the presence or absence of species in the environment being sampled. Thus, the most robust approach is the implementation of statistical analyses that account both for richness of species (OTUs) and phylogeny, as each community analysis has particular strengths and limitations, as well as different requirements for input data (Bohannan & Hughes, 2003). We relied on the use of species richness and nonparametric approaches to compare the diversity at different sites (Bending et al., 2002). Standard diversity measures (e.g. Shannon–Wiener, Simpson's, SACE, ChaoI) do not take into account the distance or relatedness among sequences beyond the defined OTUs. Thus, communities with very different populations can have the same species richness (Martin, 2002). To overcome this limitation, phylogenetic approaches (e.g. FST test and ∫-libshuff) were used to provide a means to genetically compare the environmental communities (Martin, 2002).

On the basis of 16S rRNA gene phylotypes, indices of species richness (S), sample coverage (C), Shannon–Wiener (H) and the reciprocal of Simpson's (1/D) diversity were calculated according to methods described elsewhere (Kempton, 2002; Bohannan & Hughes, 2003; Hill et al., 2003) using the software spade v2.1 (Chao & Shen, 2005). We used the reciprocal of Simpson's index, as it has been widely used for microbial ecological studies and has good discriminating ability (Zhou et al., 2002). Higher Shannon–Wiener or Simpson's reciprocal index values describe a community with greater numbers of species and a more even distribution of species.
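As an illustration of how the two indices respond to richness and evenness, the sketch below computes them from a list of clone counts per phylotype. It uses the simple plug-in formulas with the natural logarithm, which is an assumption; the estimators implemented in spade may include bias corrections and will not necessarily give identical values. The function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H = -sum(p_i * ln p_i) from OTU counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def inverse_simpson(counts):
    """Reciprocal Simpson index 1/D, with D = sum(p_i ** 2)."""
    n = sum(counts)
    return 1.0 / sum((c / n) ** 2 for c in counts if c > 0)


# Two hypothetical libraries of 20 clones and four phylotypes each.
even_library = [5, 5, 5, 5]
skewed_library = [17, 1, 1, 1]
h_even, h_skewed = shannon_index(even_library), shannon_index(skewed_library)
d_even, d_skewed = inverse_simpson(even_library), inverse_simpson(skewed_library)
```

For a perfectly even library both indices reach their maximum for the given richness (H = ln S and 1/D = S); the skewed library has the same richness but lower values of both.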

The potential bacterial species richness was calculated for each of the sites by nonparametric estimators SACE and ChaoI using the software spade v2.1 and dotur v1.53, respectively. Nonparametric approaches estimate OTU richness from small sample sizes without assuming a particular species (i.e. OTU) abundance model (Bohannan & Hughes, 2003). For instance, the ChaoI estimator uses the number of singletons (OTUs represented by only one individual) and doubletons (OTUs represented by two individuals) to estimate the diversity of a given environment (Chao & Shen, 2005). The SACE (abundance-based coverage estimator) assumes that the observed species are separated into rare and abundant groups; only the rare group is used to estimate the number of missing species (Chao & Shen, 2005).
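The role of singletons and doubletons in the ChaoI estimate can be sketched as follows. This is a minimal illustration of the classic Chao1 formula, with the bias-corrected form used only when no doubletons are present; it is not the dotur implementation, and the function name `chao1` and the example counts are hypothetical.

```python
def chao1(counts):
    """Chao1 richness estimate from a list of clone counts per OTU.

    Classic form: S_obs + f1**2 / (2 * f2), where f1 and f2 are the numbers
    of singletons and doubletons. When f2 == 0 the bias-corrected form
    S_obs + f1 * (f1 - 1) / 2 is used to avoid division by zero.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    return s_obs + f1 * (f1 - 1) / 2.0


# Hypothetical library: 7 observed OTUs, 4 singletons, 2 doubletons.
estimate = chao1([1, 1, 1, 1, 2, 2, 5])
```

A library with many singletons relative to doubletons yields an estimate well above the observed richness, which is the situation in which further sampling would be expected to reveal additional phylotypes.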

Rarefaction curves were plotted for each library using the software dotur v1.53. Rarefaction analysis averages randomized species-accumulation curves, and the resulting rarefaction curves represent the rate at which new sets (i.e. species, phylotypes, OTUs, etc.) are detected as the sample size increases (Morris et al., 2002). In addition, rarefaction determines whether the sampling effort was sufficient to represent the possible richness of phylotypes at these sites.
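The expected richness at a given subsample size can also be computed analytically, which may help clarify what a rarefaction curve represents. The sketch below uses the hypergeometric (Hurlbert) formula rather than the randomized resampling described above, so it is an alternative calculation under stated assumptions and not the dotur procedure. The function name `rarefied_richness` is hypothetical.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of OTUs in a random subsample of n clones.

    E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), where N is the library
    size and N_i the number of clones in OTU i.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the library size")
    denom = comb(total, n)
    # math.comb returns 0 when n > total - c, i.e. the OTU is certain to be drawn.
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


def rarefaction_curve(counts):
    """Expected richness for every subsample size from 1 to the library size."""
    return [rarefied_richness(counts, n) for n in range(1, sum(counts) + 1)]


curve = rarefaction_curve([3, 2, 1])
```

The curve starts at one OTU for a single clone and ends at the observed richness for the full library; a curve that is still rising steeply at the full library size indicates that sampling has not saturated.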

Total genetic variation (θ) and nucleotide diversity were determined using the software package arlequin v3.1 (Excoffier et al., 2005). Values of θ(π) provide an estimate of the total nucleotide variation observed when each sequence is compared with all other sequences in a sample of sequences (Martin, 2002; Excoffier et al., 2005). Nucleotide diversity is the probability that two randomly chosen homologous nucleotides are different, which is equivalent to the genetic diversity at the nucleotide level.
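The two quantities can be illustrated with a short sketch that counts mismatches over all pairs of sequences. It assumes aligned, gap-free sequences of equal length and simple mismatch counting; arlequin offers several distance models and treatments of missing data, so values from this sketch need not match its output. The function names are hypothetical.

```python
from itertools import combinations


def mean_pairwise_differences(seqs):
    """Average number of mismatched positions over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (sequences assumed equal length)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Three hypothetical aligned sequences of four sites.
toy = ["AAAA", "AAAT", "AATT"]
theta_pi = mean_pairwise_differences(toy)
pi_per_site = nucleotide_diversity(toy)
```

In the toy example the three pairs differ at 1, 2, and 1 positions, so the mean pairwise difference is 4/3 and the per-site diversity is 1/3.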

Statistical differences in the compositions of the four 16S rRNA gene clone libraries were assessed by pairwise comparisons of the homologous and heterologous coverage of each clone library with that of the other libraries (Singleton et al., 2001; Schloss et al., 2004). The software ∫-libshuff (Schloss et al., 2004) was used to generate homologous and heterologous coverage curves. The Bonferroni correction method was used to correct the experiment-wise error for multiple comparisons of libraries, and to determine the P value at which the libraries can be considered statistically different with 95% confidence (P=0.05) (Schloss et al., 2004). For comparison of four libraries, the lower of the two P values generated by ∫-libshuff must be less than or equal to 0.0043 to be considered statistically different. Communities were further analyzed by comparing the genetic diversity within each community with the total genetic diversity of all communities combined (FST) (Martin, 2002). Pairwise dissimilarity indices (i.e. FST) range from 1 (if all of the variation occurs between samples) to 0 (if the variation within samples is equal to the variation between samples) (Martin, 2002; Excoffier et al., 2005). FST values and statistical significance (P<0.05) were evaluated by randomly assigning sequences to populations and calculating the FST for 1000 permutations using the software arlequin v3.1.
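The threshold of 0.0043 for four libraries can be reproduced with a short calculation. The sketch below assumes that each ordered pair of libraries counts as one comparison (x vs. y and y vs. x, giving 12 comparisons for four libraries) and that the correction takes the form 1 − (1 − α)^(1/k); both are assumptions made for illustration. The simpler α/k form gives 0.0042 for the same inputs. The function name is hypothetical.

```python
def libshuff_threshold(n_libraries, alpha=0.05):
    """Per-comparison P threshold for an experiment-wise error rate alpha.

    Assumes k = n * (n - 1) ordered pairwise comparisons and the
    correction 1 - (1 - alpha) ** (1 / k).
    """
    if n_libraries < 2:
        raise ValueError("at least two libraries are required")
    n_comparisons = n_libraries * (n_libraries - 1)
    return 1.0 - (1.0 - alpha) ** (1.0 / n_comparisons)


threshold_four = libshuff_threshold(4)
simple_bonferroni = 0.05 / 12
```

Rounded to four decimal places, `threshold_four` is 0.0043, matching the value used for the four-library comparison.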

Statistical analysis

The statistical significance of differences in the soil properties, trace gas uptake rates and contribution among four volcanic deposits was analyzed using linear regression and t-test using the software statistica v6.0 (StatSoft Inc., 2002). Tests with P<0.05 were considered to indicate statistically significant differences between results. The relationships between microbial community composition, soil characteristics, and trace gas uptake rates and contribution were analyzed by canonical correspondence analysis (CCA) (ter Braak & Prentice, 1988). CCA permits direct analysis of microbial community composition in relation to an entire set of parallel specific environmental variables (e.g. soil characteristics, site, and trace gas rates). The Monte Carlo test was used to calculate a P value associated with the effect of an environmental variable on the microbial community composition of a sample. CCA ordination analyses were performed using the software package pc-ord v4.30 (MjM Software, 1999).


PCR amplification of 16S rRNA gene and richness

Based on purified total DNA as template, an amplicon of c. 1500 bp was detected from DNA in all volcanic deposit samples. Across all clone libraries, a total of 130 phylotypes were identified based on a ‘cut-off’ value ≥97% nucleotide sequence identity. Clone library coverage ranged from 66 to 83% of estimated richness (Table 1). Based on the richness estimator values (SACE and ChaoI), a more exhaustive sampling of our sites would be necessary to obtain complete coverage (Table 1). In addition, rarefaction curves did not level off, suggesting that further sampling would have revealed more phylotypes; however, the slope of the curve of the 1921-KVD site began to level off, indicating that the most predominant bacterial groups were likely identified (Fig. 2).

Table 1.   Statistical estimation of relative clone abundance and diversity in 16S rRNA gene libraries of the domain Bacteria representing four Hawaiian volcanic deposits as identified by partial 16S rRNA gene sequencing

                        Site (volcanic deposit)
                        1959-KVD          1921-KVD          1790-KVD          1700-KVD
Diversity estimates
 SACE                   76.5 ± 16.8       72.0 ± 37.5       113.9 ± 31.3      92.2 ± 25.8
 ChaoI                  66.3 ± 16.2       66.5 ± 33.4       152.8 ± 70.8      69.6 ± 20.4
 Shannon (H)            3.78 ± 0.19       2.62 ± 0.20       3.87 ± 0.17       3.70 ± 0.22
 Simpson (1/D)          20.41 ± 2.42      7.61 ± 1.16       23.93 ± 1.98      17.41 ± 1.80
 Nucleotide diversity   0.21 ± 0.10       0.16 ± 0.08       0.18 ± 0.09       0.14 ± 0.07
 θ(π)                   97.67 ± 42.42     77.08 ± 33.52     86.74 ± 37.70     63.91 ± 27.87
Phylogenetic group§

  • Data are given as arithmetic means with standard error. Dates represent time of deposition of parent material at each site.

  • * Richness is the total number of phylotypes in the community (i.e. the total number of independent sequences within the clone library; defined as aligned sequences with >97% similarity).

  • Coverage value is the percentage of estimated phylotypes sampled at each site.

  • Significant differences are indicated (P<0.05).

  • § The classification of 16S rRNA gene sequences was obtained using the tool Classifier in the RDP-II (Cole et al., 2003).

  • Sequences not related to known taxa or only closely related to uncultured organisms.

  • ND, not detected.

Figure 2.

 Rarefaction curves of observed bacterial phylotype richness (≥97% similarity) in clone libraries from samples collected at four volcanic deposits.

DNA sequencing and phylogenetic analysis

Analysis of 16S rRNA gene sequences from all sites indicated the presence of Acidobacteria, Actinobacteria, Alphaproteobacteria, Betaproteobacteria, Cyanobacteria, Gammaproteobacteria, Planctomycetacia, and Thermomicrobia (Table 1). Only the Acidobacteria and Actinobacteria were represented in all sites. Additionally, 56% of all sequences were identified as ‘unclassified’ [i.e. they had no bacterial phylum assigned according to the RDP II (Cole et al., 2003)], and few of these clone sequences exhibited some degree of similarity with sequences found in the database. The dominant group in the forest (1700-KVD) site was the Alphaproteobacteria, constituting 78%, followed by the Betaproteobacteria with 9%, whereas only 6% were identified as ‘unclassified’. At the genus level within the Alphaproteobacteria, the most predominant groups were Rhizobium, Bradyrhizobium, and Amaricoccus. In contrast, each of the other three libraries was dominated by 60–80% of clones identified as ‘unclassified’. Phylogenetic analysis arranged the unclassified clones into four clusters, namely I, II, III, and IV (Fig. 3b). Clusters I and II branched off within the lineage of the phylum Chloroflexi [Green Nonsulfur (GNS) Bacteria] and encompassed 15 (40 clones), seven (70), and 11 (30) phylotypes from the 1790-, 1921-, and 1959-KVD sites, respectively. It is interesting, however, that such GNS-like sequences were not detected in the forest. Most of our clones cluster with the nearest sequences available from the GenBank database, which are derived from uncultured bacteria previously detected in soil, sediment, and marine habitats as well as from acid mine drainage systems (Fig. 3b). Furthermore, Cluster III is composed of three phylotypes from 1790- and 1959-KVD sites and branches off within the lineage of the phylum Firmicutes, while the two phylotypes (1790- and 1921-KVD) in Cluster IV are affiliated with the phylum Cyanobacteria but branch outside the lineage of the class Cyanobacteria (Fig. 3a). Sequences from Cluster III were found to be similar to those from lichen-dominated Antarctic communities. Finally, sequences affiliated with Cluster IV (Cyanobacteria) were previously detected in geothermal and ground waters. Nevertheless, some clone sequences from the 1790-KVD site were identified as members of the classes Acidobacteria, Cyanobacteria, and Actinobacteria, constituting 11, 8, and 5%, respectively. Planctomycetacia (6%), Actinobacteria (6%), Acidobacteria (1%), and Gammaproteobacteria (1%) classes prevailed in the 1921-KVD site. The 1959-KVD volcanic deposit, the youngest site (41 years old), was represented by the Alphaproteobacteria (14%), Cyanobacteria (10%), Acidobacteria (9%), and Gammaproteobacteria (6%). One phylogenetic tree corresponding to each site was constructed using an 800-bp region of the 16S rRNA gene (Fig. 3).

Figure 3.

 Phylogenetic relationship of community members on four volcanic deposits. The comparative analysis was inferred by Minimum Evolution analysis of 16S rRNA gene sequences in concert with public nucleotide databases. Sequences obtained from clone libraries are designated by the year of deposition: 1959 (○), 1921 (▵), 1790 (•), and 1700 (▴), followed by the number of clones in the respective library. The two collapsed groups in the main tree (a) are detailed in two subtrees for the phylum Chloroflexi [Green Nonsulfur Bacteria] (b) and the class Alphaproteobacteria (c). The scale-bars represent 5% estimated sequence divergence. Bootstrap values are shown for nodes that had ≥50% support in an analysis of 1000 replicates. Halobacterium halobium was used as outgroup. Phylogenetic classification: phylum [*], class [**], and family [***] (Cole et al., 2003).

Community diversity

The Shannon–Wiener (H) and reciprocal Simpson's (1/D) indices based on 16S rRNA gene clone libraries were calculated as a measure of diversity and show the highest diversity for the 1790-KVD site, followed by the youngest 1959-KVD and the oldest 1700-KVD deposits, and ultimately the 1921-KVD site (Table 1). These results were consistent with our comparisons of richness based on calculations using rarefaction and a richness estimator (Fig. 2). The calculated total genetic variation [θ(π)] was highest for the 1959-KVD, followed by the 1790-KVD site, and the values for 1921-KVD and 1700-KVD were the lowest (Table 1). The latter sites displayed a low diversity index at the genetic level, indicating an excess of closely related lineages (Martin, 2002).

Community composition based on 16S rRNA gene clone libraries (cut-off value ≥97% nucleotide sequence identity) revealed no phylotype was represented in all four sites, but one phylotype identified as a member of the genus Conexibacter (Actinobacteria class) (Fig. 3) was found in the three young sites 1790-KVD, 1959-KVD, and 1921-KVD (Fig. 4). The 1921-KVD library shared two and three phylotypes with 1959-KVD and 1790-KVD, which represents 4–17% of their total populations, respectively (Fig. 4). These sequences were classified as members of the family (class) Acidobacteriaceae (Acidobacteria), Rubrobacteraceae (Actinobacteria), Caulobacteraceae (Alphaproteobacteria), and numerous unclassified phylotypes not related to known taxa or only closely related to uncultured organisms. Only one shared phylotype was found between 1700-KVD and 1959-KVD sites (Fig. 4). In contrast, sites 1959-KVD and 1790-KVD share 10 unique phylotypes, or 38% of their total populations (Fig. 4). The main groups of shared phylotypes (Cluster I and II) clustered in proximity to the Chloroflexi and Thermomicrobia class, both of which belong to the Chloroflexi phylum (Fig. 3a, b). No other close relationship could be established, so we added the phylum Chloroflexi (also called GNS bacteria) for a more detailed phylogeny, even though we had not found any clones belonging to the Chloroflexi themselves. The GNS bacteria, including the lineage of GNS-like sequences, have long been recognized as an evolutionarily and environmentally significant group of bacteria (Boomer et al., 2002) and are widely distributed throughout various environments. All members are Gram-negative filamentous gliding bacteria and contain the phototrophic class Chloroflexi (Garrity & Holt, 2001). In addition, the GNS lineage also contains several members that include chemotrophic genera, such as Herpetosiphon and Thermomicrobium (Oyaizu et al., 1987). 
It is possible that GNS bacteria and GNS-like sequences may be key organisms in the process of carbon dioxide fixation as well as in photosynthetic systems (Garrity & Holt, 2001). In general, the current relative paucity of comparable molecular diversity studies about GNS-like organisms and sequences in natural environments hinders a more complete phylogenetic analysis.

Figure 4.

 Occurrence of homologous phylotypes [in brackets] in 16S rRNA gene clone libraries from samples collected at four volcanic deposits. The size of the circles is proportional to the number of phylotypes (in parentheses) in each clone library. The representation of shared sequences (≥97% similarity) between clone libraries is presented as percentage (overlapping areas are not to scale). ∫-libshuff comparisons of: (a) 1790-KVD with 1921-KVD and (b) 1959-KVD, (c) 1921-KVD with 1959-KVD, and (d) 1700-KVD with 1959-KVD. Solid and open circles indicate the coverage value of CX and CXY for samples at each value of D (distance), respectively. Broken line indicates the evolutionary distance at 0.03 (97% similarity).

The significant representation of shared phylotypes suggests a high degree of similarity between the two sites 1959-KVD and 1790-KVD. Nevertheless, analyses of homologous and heterologous coverage curves with the ∫-libshuff software indicated that all clone libraries have unique community structures (P<0.001, Table 2). Interestingly, sequences identified as ‘unclassified’ from library 1700-KVD, the oldest site, were not significantly different from those in the other three libraries (Table 2). Therefore, ‘unclassified’ clones present in site 1700-KVD are a subset of those present in the rest of the sites. In contrast, clones assigned to a bacterial phylum present in sites 1921-KVD and 1790-KVD are a subset of those present in the youngest and oldest sites investigated, 1959-KVD and 1700-KVD (P<0.001, Table 2).
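The homologous and heterologous coverage curves (CX and CXY in Fig. 4) can be sketched as follows. This is a simplified illustration in the spirit of the libshuff approach, not the ∫-libshuff implementation: it assumes precomputed pairwise distances, treats a sequence as "covered" when another sequence lies at distance ≤ D, and sums squared differences over a grid of D values instead of using the exact integral. The Monte Carlo randomization that yields the P values is not reproduced. All names and numbers are hypothetical.

```python
def homologous_coverage(dist_xx, d):
    """C_X(D) = 1 - N_X(D)/n_X, where N_X counts sequences in library X
    with no other X sequence within distance d."""
    n = len(dist_xx)
    singletons = sum(
        1
        for i in range(n)
        if all(dist_xx[i][j] > d for j in range(n) if j != i)
    )
    return 1.0 - singletons / n


def heterologous_coverage(dist_xy, d):
    """C_XY(D) = 1 - N_XY(D)/n_X, where N_XY counts sequences in library X
    with no library Y sequence within distance d."""
    unmatched = sum(1 for row in dist_xy if all(v > d for v in row))
    return 1.0 - unmatched / len(dist_xy)


def delta_c(dist_xx, dist_xy, distances):
    """Sum of squared differences between the two coverage curves."""
    return sum(
        (homologous_coverage(dist_xx, d) - heterologous_coverage(dist_xy, d)) ** 2
        for d in distances
    )


# Hypothetical toy distances: 3 sequences in X, 2 sequences in Y.
dist_xx = [[0.0, 0.02, 0.2], [0.02, 0.0, 0.2], [0.2, 0.2, 0.0]]
dist_xy = [[0.01, 0.3], [0.25, 0.3], [0.3, 0.3]]

c_x = homologous_coverage(dist_xx, 0.03)
c_xy = heterologous_coverage(dist_xy, 0.03)
print(c_x, c_xy, delta_c(dist_xx, dist_xy, [0.03]))
```

A large gap between the two curves at small D, such as the 0.03 distance marked in Fig. 4, means that library X contains closely related sequences that have no close counterpart in library Y.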

Table 2.   Comparison of 16S rRNA gene clone libraries representing four Hawaiian volcanic deposits (x vs. y in lower-diagonal and y vs. x in upper-diagonal halves of matrices)

Library (x); P value of ΔCxy for heterologous library (y)
Matrices: Complete clone library; Classified clones*; Unclassified clones*,†

  • Dates represent time of deposition of parent material at each site.

  • * According to the Ribosomal Database Project II (Cole et al., 2003).

  • † Sequences not related to known taxa, therefore grouped as ‘unclassified’.

  • No significant differences are indicated.


Genetic differentiation among the four volcanic communities was assessed using the FST test, which showed that communities 1700-KVD and 1921-KVD were different from the other communities: 1700-KVD vs. 1790-KVD (FST=0.302; P<0.001), 1700-KVD vs. 1921-KVD (FST=0.419; P<0.001), 1700-KVD vs. 1959-KVD (FST=0.238; P<0.001), 1921-KVD vs. 1790-KVD (FST=0.153; P<0.001), and 1921-KVD vs. 1959-KVD (FST=0.150; P<0.001). Although a statistical difference was seen between the 1790-KVD and 1959-KVD sites (FST=0.023; P<0.005), the results of the FST test suggest that less genetic diversity is present in the 1700-KVD and 1921-KVD sites; thus, sites 1959-KVD and 1790-KVD harbor more distinct phylogenetic lineages (Martin, 2002).
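As a rough illustration of the FST values above, the statistic can be written as FST = (θT − θW)/θT, where θT is the genetic diversity of all sequences combined and θW is the within-community diversity averaged over the communities compared. The sketch below assumes that diversity is measured as the mean pairwise distance between sequences; the function names and the toy distance matrix are hypothetical, and the permutation-based significance testing is not reproduced.

```python
from itertools import combinations


def mean_pairwise(indices, dist):
    """Mean pairwise distance among the sequences listed in indices."""
    pairs = list(combinations(indices, 2))
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def fst(groups, dist):
    """F_ST = (theta_T - theta_W) / theta_T.

    groups: one list of sequence indices per community.
    dist:   symmetric matrix of pairwise sequence distances.
    """
    all_indices = [i for group in groups for i in group]
    theta_t = mean_pairwise(all_indices, dist)
    theta_w = sum(mean_pairwise(g, dist) for g in groups) / len(groups)
    return (theta_t - theta_w) / theta_t


# Hypothetical toy data: two communities of two sequences each.
# Within-community distance is 0.1, between-community distance is 0.3.
dist = [
    [0.0, 0.1, 0.3, 0.3],
    [0.1, 0.0, 0.3, 0.3],
    [0.3, 0.3, 0.0, 0.1],
    [0.3, 0.3, 0.1, 0.0],
]
result = fst([[0, 1], [2, 3]], dist)
print(round(result, 3))
```

Values near zero, such as the 0.023 reported for 1790-KVD vs. 1959-KVD, indicate that most of the diversity is found within each community rather than between them.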

Community diversity, gas flux, and soil parameters

CCA was used to relate the soil properties (pH, carbon, nitrogen, and water content) and trace gas uptake, production, and contribution to our community composition data. Once more, our results indicated that the 1959-KVD community was more closely correlated with that from the 1790-KVD site than with the 1921-KVD and 1700-KVD communities (Fig. 5). Differences in microbial community composition had the strongest correlation with carbon, nitrogen and water content, as well as with gas flux factors such as the production of CO2 and the oxidation of H2 and CO. Two factors were retained from the analysis. Factor 1 accounted for 51.4% and Factor 2 for 32.9% of total variance, for a cumulative total of 84.3%. Soil parameters and gas fluxes were found to be significant explanatory environmental variables (P=0.016), as determined by the Monte Carlo permutation test.
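The logic of a Monte Carlo permutation test can be sketched with a much simpler statistic than the one used in CCA. The example below is a single-variable analogue and not the ordination itself: it shuffles one variable, recomputes the absolute Pearson correlation, and reports how often a permuted value is at least as large as the observed one. The function names, the number of permutations, and the toy data are all assumptions made for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Fraction of permutations with |r| at least as large as observed.

    The +1 terms count the observed arrangement as one permutation.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: a perfectly linear relationship.
x = list(range(8))
y = [2 * v + 1 for v in x]
p = permutation_p_value(x, y)
print(p)
```

In the ordination setting the shuffled quantity is the assignment of environmental measurements to samples, and the statistic is derived from the constrained axes rather than from a single correlation.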

Figure 5.

 Canonical correspondence analysis (CCA) ordination plot of Bacteria communities of four volcanic deposits (circles and triangles). The plot was generated by CCA of selected biological indices estimated from 16S rRNA gene community analysis (biomass, S, H, 1/D, θ(π), and proportion of phylogenetic groups). Open squares represent gas fluxes that are significant as determined by the Monte Carlo test (P<0.05). Vectors and stars represent the environmental variables and abundance of vegetation, respectively. Values on the axes indicate the percentage of total variation explained by each axis. *No significant difference in levels of trace gas uptake and relative contribution of CO and H2.


Newly exposed volcanic deposits can serve as habitats for microorganisms, even though some sites are characterized by the absence of organic matter, a lack of fixed nitrogen and vegetation, low pH, and an inability to absorb or retain water. These factors are commonly listed as major environmental stresses on bacteria living in soils (Morita, 1997b; Jones et al., 1998; Wall & Virginia, 1999) and are each likely to present a selective force for colonization and succession in any environment (Morita, 1997a; Ohtonen & Väre, 1998; Zhang & Zak, 1998; de Nobili et al., 2001; Bending et al., 2002; Smith et al., 2002; Smolander & Kitunen, 2002; Ilstedt et al., 2003). As revealed by previous observation of these sites, neither soil properties (Gomez-Alvarez, King, and Nüsslein, Abstr. 104th Gen. Meet. Am. Soc. Microbiol.) nor diversity (Dunfield & King, 2004; Nanba et al., 2004) increased as a function of deposit age, whereas a positive correlation between biomass and the carbon and nitrogen content was found (King, 2003).

With regard to the microbial community, the bacterial diversity in the present study was extensive and a significant fraction of the community was not related to known taxa or only closely related to as yet uncultured organisms (Cole et al., 2003). In addition to the unique structure and composition, patterns for phylogenetic and genetic diversity were in most cases as expected (Ives, 1995; Wall & Virginia, 1999; Curtis & Sloan, 2004). For instance, the lowest values for both parameters were obtained for the unvegetated and arid 1921-KVD site, which also had the most limited phylogenetic distances among clones, whereas the vegetated and relatively moist sites harbored the most-divergent phylotypes and had the highest genetic diversity. However, it is also evident from the results that similar diversity values (Table 1) can be obtained from libraries with substantially different distributions of vegetation and times of deposition (e.g. forest and 1959-KVD). This is true even for sites with similar levels of trace gas uptake and contribution, e.g. the 1790-KVD (210-year-old) and the 1959-KVD (41-year-old) sites (Fig. 5). However, statistical comparisons of libraries using ∫-libshuff and FST indicate that each library differs significantly from the others, in spite of sharing a significant portion of their populations (up to 38%). This result may reflect the differences in successional state and various environmental factors (King, 2003), but also the significance of colonization by CO- and H2-oxidizing bacteria that distinguish these soils (Dunfield & King, 2004; Nanba et al., 2004). Again, the 1959-KVD occurs within a well-vegetated, relatively moist area of the caldera, whereas the 1790-KVD occurs in a region with reduced water availability and substantially less plant growth. 
A change in vegetation, such as the replacement of patches of fern with woody plants and mature trees, certainly constitutes a major shift in the soil environment, especially in the carbon sources available to the microorganisms, and brings accompanying changes in soil properties (Nüsslein & Tiedje, 1999). It should be noted that even in a highly heterogeneous habitat such as these soils, some populations of bacteria are represented in every soil (Fig. 3). These microorganisms are probably either present as active microbial populations or distributed throughout this region, subsisting at low levels or perhaps as spores. At present, we cannot define the origin of the microbial communities in these soils.

Despite the relatively young geological age of these volcanic deposits (<300-years-old) and their unique environmental characteristics, our study revealed that many common representatives of widely distributed soil bacteria, which included members of the classes Acidobacteria, Actinobacteria and Alphaproteobacteria (Andrews & Harris, 2000; Kent & Triplett, 2002; Garbeva et al., 2004), were found at these volcanic deposits. Furthermore, these representatives were found to be part of the most abundant groups of bacteria in soil (see Floyd et al., 2005 for review). It is relevant to consider that some divisions may be underrepresented, as our libraries were only a fraction of the total community. Janssen (2006) suggested that Firmicutes-related organisms might be underrepresented in libraries because the cells or spores may be difficult to lyse and so are not detected in PCR-based analyses that rely on DNA extraction from soil. In conclusion, colonization of recent volcanic deposits by numerous bacteria may contribute significantly to the development of complex microbial communities, as documented in our young forest (Dunfield & King, 2004; Nanba et al., 2004) and in a rainforest near the Kilauea volcano (Nüsslein & Tiedje, 1998).

Our sample set does not allow a comparison of changes over time, as the geological chronosequence spans more than one ecosystem regime. Consequently, any level of correlation between these parameters may reflect the limitations of the data set rather than the complex interactions between diversity and soil parameters (Crawford et al., 2005). Nonetheless, numerous studies of the biota in soil can provide valuable insights into the structure and dynamics of microbial communities (Ives, 1995; Wall & Virginia, 1999; Reynolds, 2002; Curtis & Sloan, 2004). Although we do not have conclusive evidence for their role, the distribution of vegetation, pH, and trace gas flux likely contribute to community composition. For instance, members of the Alphaproteobacteria (e.g. Bradyrhizobium, Rhizobium, Rhodobacter), which are commonly found in symbiosis with plants (Andrews & Harris, 2000), were found in deposits with a significant presence of vegetation, including Metrosideros polymorpha and Myrica faya (Vitousek et al., 1995). Furthermore, many legume symbionts (e.g. Bradyrhizobium, Mesorhizobium, and Sinorhizobium) have been shown to use either CO or H2 for lithotrophic growth (Hanus et al., 1979), which has important implications for colonization of volcanic substrates by pioneer plant species (Dunfield & King, 2004; Nanba et al., 2004). This was the case for the 1700-KVD, a young nutrient-rich forest, and to a lesser extent the 1959-KVD, which is surrounded by patches of M. polymorpha and M. faya and supports limited growth of plants (King, 2003). The relative significance of the vegetation at these sites is uncertain, but the presence of plant biomass and root material is usually accompanied by an accumulation of organic matter and an increase in microbial diversity (Wardle, 1992). 
Although the community composition in our forest site was neither exhaustive nor rigorously quantitative, our data sets illustrated the resemblance between the diversity captured at this site and an undisturbed rainforest near the Kilauea volcano (Nüsslein & Tiedje, 1998). Furthermore, their findings support the idea that young volcanic forests accommodate numerous bacterial species, and hence a great diversity.

The much younger site (1921-KVD) is a nutrient-deficient soil with a clear absence of vegetation (King, 2003). Due to the absence of vegetation, the microbial community at this site may experience prolonged exposure to arid conditions and low nutrient availability (Wardle, 1992). Furthermore, this site is unique because SO2 plumes from vents around the Halemaʻumaʻu crater may adversely affect it. The 1921-KVD is located less than 100 m downwind of the vents where the plume originates. Consequently, this site is characterized by low pH levels, which leach basic cations, affect nutrient solubility, and influence the incorporation of organic matter into microbial cells. As a result of these conditions, the microbial community diversity is low and mostly dominated by unclassified taxa. These results are consistent with the notion that low pH and unvegetated environments support only small numbers of bacterial species (Fierer & Jackson, 2006), which are adapted to extreme environments (Johnson, 1998; Edwards et al., 1999). Although Wardle (1992) suggested that pH is a substantially less important factor than carbon and nitrogen in influencing microbial communities, the diversity and richness of this site could largely be explained by soil pH.

To conclude, the phylogenetic analysis supported the occurrence of many phylotypes that were either closely related to uncultured organisms from other studies or distinct from any reported 16S rRNA gene sequences. The 1700-KVD library (forest) generated relatively low numbers of clones that clustered with unknown lineages (6%). However, the majority of clones from 1790-KVD (74%), 1921-KVD (81%), and 1959-KVD (59%) could not be assigned to a bacterial phylum (Cole et al., 2003). Similarly, Dunfield & King (2004) found that the majority of CO oxidizers derived from clone libraries at these sites could not be assigned to a bacterial phylum. In addition, phylogenetic analysis of facultative lithotrophs at these sites showed a similar trend (Nanba et al., 2004). Our approach was to correlate selected soil parameters with the diversity of the total microbial community. However, such analysis may lead to oversimplification by assuming an equal response to each environmental factor. Rather, this study offers a glimpse of the microbial structure in recent volcanic deposits and its relationship to their soil parameters without assuming specific effects of these parameters on individuals. Of course, key additional analyses, such as isolation or metagenomics, will be necessary to assess the diversity of bacterial populations and could provide an important understanding of the physiology, taxonomy, and ecological role of these bacteria in soil (Crawford et al., 2005). This study demonstrates that Hawaiian microbial communities in recent volcanic deposits (<300-years-old) are significantly diverse in spite of the extreme environmental conditions. Our findings support the idea that bacteria are able to colonize and become established on recent volcanic deposits, and that these sites harbor numerous bacterial species, most of which are as yet unknown.


The authors are grateful to F.A. Trusdell (USGS-HAVO) for access to recent geological data of Hawaii. This project was funded by National Science Foundation grant DEB-0085495 under the LExEn Program.