Correspondence: Kazunobu Matsushita, Department of Biological Chemistry, Faculty of Agriculture, Yamaguchi University, Yamaguchi 753-8315, Japan. Tel./fax: +81 83 933 5857; e-mail: firstname.lastname@example.org
Phylogenetic relationships among three genera, Gluconobacter, Acetobacter, and Gluconacetobacter, of acetic acid bacteria (AAB) are still unclear, although phylogenetic analysis using 16S rRNA gene sequence has shown that Gluconacetobacter diverged first from the ancestor of these three genera. Therefore, the relationships among these three genera were investigated by genome-wide phylogenetic analysis of AAB. Contrary to the results of 16S rRNA gene analysis, phylogenetic analysis of 293 enzymes involved in metabolism clearly showed that Gluconobacter separated first from its common ancestor with Acetobacter and Gluconacetobacter. In addition, we defined 753 unique orthologous proteins among five known complete genomes of AAB, and phylogenetic analysis was carried out using concatenated gene sequences of these 753 proteins. The result also showed that Gluconobacter separated first from its common ancestor with Acetobacter and Gluconacetobacter. Our results strongly suggest that Gluconobacter was the first to diverge from the common ancestor of Gluconobacter, Acetobacter, and Gluconacetobacter, a relationship that is in good agreement with the physiologies and habitats of these genera.
Acetic acid bacteria (AAB) are gram-negative strictly aerobic bacteria, which are classified into 10 genera, of which the major ones are Acetobacter, Gluconobacter, and Gluconacetobacter (Prust et al., 2005; Azuma et al., 2009; Bertalan et al., 2009). These three genera are well-distinguished in their physiological characteristics. In particular, Acetobacter and Gluconacetobacter are the most prominent acetic acid producers and show relatively high acetic acid resistance ability (Sievers & Teuber, 1995). Highest tolerance to acetic acid has so far been reported for Gluconacetobacter europaeus, Gluconacetobacter intermedius, Gluconacetobacter oboediens, and Gluconacetobacter entanii (Sievers & Teuber, 1995; Boesch et al., 1998; Sokollek et al., 1998; Schüller et al., 2000). All these species are from the genus Gluconacetobacter, and were isolated from submerged industrial bioreactors with extremely high acetic acid concentrations (>10%, v/v). Two other species, Acetobacter aceti and Acetobacter pasteurianus, also involved in vinegar production and from the genus Acetobacter, are mainly used in traditional processes for vinegar production where the concentration of acetic acid does not exceed 6% (v/v).
These AAB involved in acetic acid fermentation exhibit two different acetic acid resistance phases (Matsushita et al., 2005). The first is the ethanol oxidation phase, characterized by oxidation of ethanol to acetic acid, in which acetic acid resistance occurs without acetate assimilation. The second is the overoxidation phase, characterized by oxidation of acetic acid to water and carbon dioxide, in which the cells overcome acetic acid by assimilating it. Overoxidation occurs in Acetobacter and Gluconacetobacter, but not in Gluconobacter, which exhibits relatively weak acetic acid resistance (Sievers & Swings, 2005; Kersters et al., 2006). Members of the genus Gluconobacter can oxidize a broad range of sugars, sugar alcohols, and sugar acids, and accumulate large amounts of the corresponding oxidized products in the culture medium (Prust et al., 2005). Thus, the physiologies and habitats of the two groups, one consisting of the genera Acetobacter and Gluconacetobacter and the other of the genus Gluconobacter, are quite different.
In the present study, because complete genome sequences of five Acetobacteraceae bacteria, A. pasteurianus IFO3283-01, Gluconacetobacter diazotrophicus PAl 5, Gluconobacter oxydans 621H, Granulibacter bethesdensis CGDNIH1, and Acidiphilium cryptum JF-5, are available, genome-wide phylogenetic analysis was performed using these five sequences to investigate the genome-level phylogenetic relationships among three AAB genera: Acetobacter, Gluconacetobacter, and Gluconobacter (Prust et al., 2005; Azuma et al., 2009; Bertalan et al., 2009).
Materials and methods
Sequence retrieval and phylogenetic analysis of 16S rRNA gene of Acetobacteraceae
Thirty-seven nearly complete 16S rRNA gene sequences of Acetobacteraceae, G. oxydans 621H (NC_006677), Gluconobacter frateurii (AB470921), Gluconobacter japonicus (AB470922), Gluconobacter cerinus (AB024492), Gluconobacter thailandicus (AB128050), A. pasteurianus (AB470918), Acetobacter orientalis (AB470917), Acetobacter tropicalis (AB470916), Acetobacter ghanensis (AB470920), Acetobacter syzygii (AB470919), Acetobacter indonesiensis (AB052715), Acetobacter estunensis (AJ419839), Acetobacter cibinongensis (AB052711), Acetobacter pomorum (AJ419835), Acetobacter peroxydans (AJ419836), Acetobacter lovaniensis (AJ419837), A. aceti (X74066), G. entanii (AJ251110), G. intermedius (AJ012699), Gluconacetobacter hansenii (X75620), G. diazotrophicus (X75618), G. europaeus (X85406), G. oboediens (AJ001631), Gluconacetobacter xylinus (AJ007698), Gluconacetobacter liquefaciens (X75617), Gluconacetobacter azotocaptans (AF192761), Gluconacetobacter johannae (NR_024959), Gluconacetobacter sacchari (AF127404), Asaia siamensis (AB292239), Asaia bogorensis (AB025928), Kozakia baliensis (AB056318), Acidomonas methanolica (AB110715), G. bethesdensis (AY788950), A. cryptum (D30773), Swaminathania salitolerans (AB445099), Saccharibacter floricola (NR_024819), and Neoasaia chiangmaiensis (AB208549), were obtained from the NCBI website at http://www.ncbi.nlm.nih.gov/. To construct the phylogenetic tree of AAB, these 37 sequences were collected and nucleotide sequence alignment was carried out using clustalw (Larkin et al., 2007). We used the mega version 4.0 package to generate phylogenetic trees to study the phylogenetic relationship based on 16S rRNA gene with the neighbor-joining (NJ) approach and 1000 bootstrap replicates (Tamura et al., 2007).
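The bootstrap procedure behind the 1000 replicates can be illustrated with a minimal sketch. It assumes an existing alignment and uses the simple p-distance (proportion of differing sites); the distance model actually selected in mega is not stated in the text, and the three short sequences and taxon names below are invented for illustration only.

```python
import random

def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate).

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

# Hypothetical toy alignment (not real 16S rRNA gene data).
alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGAAC",
    "taxon_C": "ACG-ACTTAC",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(1000)]
dist_ab = p_distance(alignment["taxon_A"], alignment["taxon_B"])
dist_ac = p_distance(alignment["taxon_A"], alignment["taxon_C"])
```

Each replicate would then be turned into a distance matrix and a tree, and the bootstrap value of a branch is the percentage of replicate trees that contain it.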
Sequence retrieval and data construction for phylogenetic analysis of metabolic enzymes
Three hundred and ninety-one unique complete microbial genome sequences (one genome per genus) were obtained from the NCBI FTP website at ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/. Only amino acid-coding sequences on the chromosomes were used for comparative analysis. For the homology search, a dataset of all proteins was constructed from all amino acid sequences of the 391 complete microbial genomes. Four hundred and forty-three proteins on the KEGG metabolic map of G. oxydans were used as queries for the blastp homology search against this dataset (Altschul et al., 1997; Kanehisa, 1997; Ogata et al., 1999; Kanehisa & Goto, 2000; Kanehisa et al., 2002, 2004, 2006, 2008, 2010). Of the 443 proteins, 293 were selected for further analysis because the corresponding ORFs exist in all three genera, Gluconobacter, Gluconacetobacter, and Acetobacter. Each homolog was identified by a blastp homology search of amino acid sequences, filtering at an expectation value (e-value) ≤10^-10 and a sequence overlap ≥70% (Altschul et al., 1997). For each query, the top 50 hits were collected and multifasta files were created for phylogenetic analysis using in-house Ruby scripts.
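The hit-filtering step can be sketched as follows. This is an illustration in Python rather than the authors' Ruby scripts, and it rests on two assumptions not stated in the text: that hits are available in the standard 12-column BLAST tabular format, and that "sequence overlap" means the aligned span of the query divided by the query length. The function name, the example rows, and the query length are hypothetical.

```python
from collections import defaultdict

E_VALUE_MAX = 1e-10  # e-value threshold stated in the text
MIN_OVERLAP = 0.70   # overlap threshold stated in the text
TOP_N = 50           # hits kept per query, as stated in the text

def filter_hits(rows, query_lengths, e_max=E_VALUE_MAX,
                min_overlap=MIN_OVERLAP, top_n=TOP_N):
    """Filter BLAST tabular lines and keep the best `top_n` hits per query.

    rows: iterable of tab-separated lines with the 12 standard columns
          (qseqid sseqid pident length mismatch gapopen qstart qend
           sstart send evalue bitscore).
    query_lengths: dict mapping query ID -> query length in residues.
    """
    kept = defaultdict(list)
    for line in rows:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        qid, sid = f[0], f[1]
        qstart, qend = int(f[6]), int(f[7])
        evalue, bitscore = float(f[10]), float(f[11])
        # Assumed definition: aligned query span relative to query length.
        overlap = (abs(qend - qstart) + 1) / query_lengths[qid]
        if evalue <= e_max and overlap >= min_overlap:
            kept[qid].append((sid, evalue, bitscore))
    # Rank by bit score (descending), then e-value (ascending).
    return {q: sorted(h, key=lambda x: (-x[2], x[1]))[:top_n]
            for q, h in kept.items()}

# Hypothetical hits for one 100-residue query.
example_rows = [
    "q1\ts1\t90.0\t95\t5\t0\t1\t95\t1\t95\t1e-50\t200",
    "q1\ts2\t40.0\t50\t30\t0\t1\t50\t1\t50\t1e-20\t80",    # overlap too short
    "q1\ts3\t35.0\t90\t50\t0\t5\t94\t1\t90\t1e-5\t40",     # e-value too large
    "q1\ts4\t60.0\t80\t30\t0\t10\t89\t1\t80\t1e-30\t120",
]
filtered = filter_hits(example_rows, {"q1": 100})
```

Here only the hits to s1 and s4 pass both thresholds; in the procedure described above, the retained subject sequences for each query would then be written to a multifasta file.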
Identification of orthologous genes
The previously published complete genome sequences of Acetobacteraceae, G. oxydans, G. diazotrophicus, A. pasteurianus, G. bethesdensis, and A. cryptum, were obtained from the NCBI FTP website at ftp://ftp.ncbi.nih.gov/genomes/Bacteria/ (Prust et al., 2005; Greenberg et al., 2007; Azuma et al., 2009; Bertalan et al., 2009). Only protein-coding genes on the chromosomes were used for the identification of orthologous groups. Each orthologous gene was identified by a blastp homology search of amino acid sequences, filtering at an expectation value (e-value) ≤10^-10 and a sequence overlap ≥70% (Altschul et al., 1997). All ORFs of each species were searched against every other species, and reciprocal best hits were regarded as orthologous genes. If a gene had orthologs in all of the species, the group was included in the unique orthologous dataset.
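The reciprocal-best-hit logic can be sketched as below. This is one possible reading of the procedure, not the authors' script: it assumes that a group is kept only when every pair of its members is a reciprocal best hit, and that hits have already passed the e-value and overlap filters. Genome labels, gene IDs, and bit scores are invented.

```python
from itertools import combinations

def best_hits(hits):
    """hits: iterable of (query, subject, bitscore). Returns query -> best subject."""
    best = {}
    for q, s, score in hits:
        if q not in best or score > best[q][1]:
            best[q] = (s, score)
    return {q: s for q, (s, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (gene_a, gene_b) that are each other's best hit."""
    fwd, rev = best_hits(a_vs_b), best_hits(b_vs_a)
    return {(a, b) for a, b in fwd.items() if rev.get(b) == a}

def ortholog_groups(genomes, rbh):
    """Groups with one gene per genome in which all pairs are reciprocal best hits.

    genomes: ordered list of genome labels; genomes[0] acts as the reference.
    rbh: dict mapping (X, Y), with X listed before Y in `genomes`, to a set of
         (gene_x, gene_y) reciprocal-best-hit pairs.
    """
    ref, others = genomes[0], genomes[1:]
    partner = {g: dict(rbh[(ref, g)]) for g in others}
    shared = sorted(set.intersection(*(set(p) for p in partner.values())))
    groups = []
    for gene in shared:
        group = {ref: gene}
        group.update({g: partner[g][gene] for g in others})
        if all((group[x], group[y]) in rbh[(x, y)]
               for x, y in combinations(genomes, 2)):
            groups.append(group)
    return groups

# Hypothetical three-genome example.
genomes = ["Gox", "Gdi", "Apa"]
rbh = {
    ("Gox", "Gdi"): reciprocal_best_hits(
        [("gox1", "gdi1", 300), ("gox1", "gdi2", 100), ("gox2", "gdi2", 250)],
        [("gdi1", "gox1", 290), ("gdi2", "gox2", 240)]),
    ("Gox", "Apa"): reciprocal_best_hits(
        [("gox1", "apa1", 280), ("gox2", "apa2", 200)],
        [("apa1", "gox1", 270), ("apa2", "gox3", 260)]),  # apa2 prefers gox3
    ("Gdi", "Apa"): reciprocal_best_hits(
        [("gdi1", "apa1", 310), ("gdi2", "apa2", 220)],
        [("apa1", "gdi1", 305), ("apa2", "gdi2", 210)]),
}
groups = ortholog_groups(genomes, rbh)
```

In this toy case only the gox1/gdi1/apa1 group survives, because gox2 and apa2 are not each other's best hits.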
Next, the amino acid sequences of the orthologous dataset were concatenated for each species, and a phylogenetic dataset was constructed using in-house Ruby scripts.
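A minimal sketch of the concatenation step is shown below, written in Python for illustration rather than in the Ruby used by the authors. It assumes that the orthologous groups are held as one gene ID per genome and that the same group order is used for every species; all names and sequences are hypothetical.

```python
def concatenate_orthologs(groups, proteomes):
    """Join the ortholog sequences of each genome in a fixed group order.

    groups: list of dicts mapping genome label -> gene ID.
    proteomes: dict mapping genome label -> {gene ID: amino acid sequence}.
    """
    return {g: "".join(proteomes[g][grp[g]] for grp in groups)
            for g in sorted(proteomes)}

def to_fasta(records, width=60):
    """Format name -> sequence records as multifasta text."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

# Hypothetical two-genome, two-group example.
groups = [{"Gox": "gox1", "Gdi": "gdi1"}, {"Gox": "gox2", "Gdi": "gdi2"}]
proteomes = {
    "Gox": {"gox1": "MKT", "gox2": "LLA"},
    "Gdi": {"gdi1": "MRT", "gdi2": "LIA"},
}
concatenated = concatenate_orthologs(groups, proteomes)
fasta_text = to_fasta(concatenated)
```

Keeping the group order identical across species is what makes each position of the concatenated sequences comparable in the subsequent alignment.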
Phylogenetic analysis of protein sequences
To construct the phylogenetic tree of AAB, amino acid sequence alignment was carried out using clustalw (Larkin et al., 2007). We used the mega version 4.0 package to generate the phylogenetic tree to study the phylogenetic relationships of AAB with the NJ approach and 1000 bootstrap replicates (Tamura et al., 2007).
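The NJ clustering step can be illustrated with a short sketch. It is not the mega implementation: it returns only the unrooted topology as nested tuples, omits branch lengths and bootstrapping, and breaks ties by taking the first minimal pair. The four-taxon distance matrix is invented and corresponds to a tree in which A and B, and C and D, are neighbors.

```python
def neighbor_joining(names, dist):
    """Neighbor-joining topology as nested tuples (unrooted, three-way top level).

    names: list of taxon labels.
    dist: dict of dicts with symmetric pairwise distances.
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}
    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) d(i, j) - r(i) - r(j).
        i, j = min(
            ((a, b) for x, a in enumerate(nodes) for b in nodes[x + 1:]),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        u = (i, j)  # the new internal node is represented by its two children
        d[u] = {}
        for k in nodes:
            if k not in (i, j):
                d[u][k] = d[k][u] = 0.5 * (d[i][k] + d[j][k] - d[i][j])
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    return tuple(nodes)

# Hypothetical additive distances (leaf branches 1, 2, 3, 4; internal branch 5).
toy_dist = {
    "A": {"B": 3, "C": 9, "D": 10},
    "B": {"A": 3, "C": 10, "D": 11},
    "C": {"A": 9, "B": 10, "D": 7},
    "D": {"A": 10, "B": 11, "C": 7},
}
tree = neighbor_joining(["A", "B", "C", "D"], toy_dist)
```

Repeating this on distance matrices computed from bootstrap-resampled alignments and counting how often a given grouping recurs yields the bootstrap values reported for each branch.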
Results and discussion
Phylogenetic analysis of 16S rRNA gene of Acetobacteraceae
In order to investigate the phylogenetic relationship among three genera, Acetobacter, Gluconacetobacter, and Gluconobacter, a phylogenetic tree of Acetobacteraceae was constructed using 16S rRNA gene sequences. As shown in Fig. 1, the 16S rRNA gene phylogenetic tree constructed by the NJ method suggested that Gluconacetobacter was the first to diverge from the common ancestor of these three genera. These results are in good agreement with many previous works (Lisdiyanti et al., 2000, 2001; Cleenwerck et al., 2007, 2008).
Phylogenetic analysis of metabolic enzymes
To investigate the phylogenetic relationship among the three AAB genera, Acetobacter, Gluconacetobacter, and Gluconobacter, phylogenetic analyses of metabolic proteins conserved in A. pasteurianus, G. diazotrophicus, and G. oxydans were performed. Four hundred and forty-three proteins on the KEGG metabolic map of G. oxydans were used as queries for the blastp homology search against the dataset of all proteins. Of the 443 proteins, 293 were selected for further analysis because the corresponding ORFs exist in all three genera, Gluconobacter, Gluconacetobacter, and Acetobacter. Each homolog was identified by a blastp homology search of amino acid sequences (Altschul et al., 1997). The top 50 hits of each query were collected and multifasta files were created for phylogenetic analysis. The 293 proteins showed three different phylogenetic patterns for the relationship among the three genera (Supporting Information, Table S1 and Fig. 2). As shown in Fig. 2d, pattern B was observed with 200 proteins, whereas pattern A, the pattern obtained with the 16S rRNA gene, was observed with only 31 proteins. Therefore, phylogenetic analysis of the 293 metabolic proteins suggested that Gluconobacter was the first to diverge from its common ancestor with Acetobacter and Gluconacetobacter. This result is clearly different from that of the phylogenetic analysis of 16S rRNA gene sequences.
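How per-protein trees might be sorted into the three patterns can be sketched as follows. The sketch assumes that each protein tree has already been reduced to a rooted three-genus topology, written as a leaf plus a pair of leaves. Patterns A and B follow the text; labeling the remaining topology (Acetobacter diverging first) as pattern C is an assumption, because the text does not define it. The example trees are invented and do not reproduce the reported counts.

```python
from collections import Counter

PATTERN_BY_FIRST = {
    "Gluconacetobacter": "A",  # same topology as the 16S rRNA gene tree (per the text)
    "Gluconobacter": "B",      # Gluconobacter diverges first (per the text)
    "Acetobacter": "C",        # assumption: third pattern is not defined in the text
}

def first_diverging(tree):
    """Return the genus that branches off first in a rooted three-genus tree.

    tree: 2-tuple holding one leaf name (str) and one 2-tuple of leaf names,
          e.g. ("Gluconobacter", ("Acetobacter", "Gluconacetobacter")).
    """
    leaves = [x for x in tree if isinstance(x, str)]
    clades = [x for x in tree if isinstance(x, tuple)]
    if len(tree) != 2 or len(leaves) != 1 or len(clades) != 1:
        raise ValueError("expected a tree of the form (leaf, (leaf, leaf))")
    return leaves[0]

def tally_patterns(trees):
    """Count how many trees fall into each pattern."""
    return Counter(PATTERN_BY_FIRST[first_diverging(t)] for t in trees)

# Hypothetical per-protein topologies.
toy_trees = [
    ("Gluconobacter", ("Acetobacter", "Gluconacetobacter")),
    (("Gluconacetobacter", "Acetobacter"), "Gluconobacter"),
    ("Gluconacetobacter", ("Acetobacter", "Gluconobacter")),
    ("Acetobacter", ("Gluconobacter", "Gluconacetobacter")),
]
counts = tally_patterns(toy_trees)
```

The pattern with the largest count is then taken as the relationship supported by the majority of proteins.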
Concatenated phylogenetic analysis of AAB
Because concatenated multigene analysis is an accepted technique for improving the accuracy of phylogenetic inference (Gontcharov et al., 2003; Rokas et al., 2003), we determined the core set of orthologous genes shared by the five complete AAB genomes described above using all-against-all blastp analysis. As a result, 753 groups of orthologous genes were detected on the basis of reciprocal best hits, which include 233 groups used in Fig. 2 (Table S2). Because 748 of the G. oxydans proteins in the 753 orthologous groups were assigned Clusters of Orthologous Groups of proteins IDs (data not shown), we considered that a unique, well-characterized gene dataset had been obtained. Thus, the 753 orthologous gene groups were used as a unique orthologous gene dataset to investigate the genetic relationship among AAB at the whole-genome level. Amino acid sequences of the unique orthologous dataset were concatenated into a pseudo-single-sequence, and an NJ phylogenetic tree was constructed from multiple amino acid alignments of the concatenated sequences (Fig. 3a). The phylogenetic tree showed that Gluconobacter was the first to diverge from its common ancestor with Acetobacter and Gluconacetobacter. This result is in agreement with that of the phylogenetic analysis of 293 metabolic proteins. In addition, two branches of the concatenated-protein tree showed high statistical confidence (NJ bootstrap value, 100%), suggesting that the phylogeny of the protein-coding regions of AAB is different from that of the 16S rRNA gene. Moreover, some classic markers, DNA gyrase subunit B (GyrB), DNA gyrase subunit A (GyrA), and DNA-directed RNA polymerase subunit β (RpoB), showed the same phylogenetic pattern as the concatenated phylogenetic tree (data not shown). These genes might be useful, in place of concatenated proteins, for determining phylogenetic relationships in species for which complete genome sequences are not available.
Comparison of modified tricarboxylic acid (TCA) cycle in AAB
It has been reported that A. aceti strain 1023 lacks malate dehydrogenase (Mdh) and succinyl-CoA synthetase (SCS) genes, but can assimilate acetate by a modified TCA cycle, in which Mdh and SCS are functionally replaced by malate : quinone oxidoreductase (Mqo) and succinyl-CoA : acetate CoA transferase (AarC), respectively (Mullins et al., 2008). Thus, it has been thought that these gene replacements, together with citrate synthase (AarA), play a key role in acetate oxidation, which makes the cells resistant to acetic acid. Therefore, we investigated the distribution of these four genes in the five AAB genomes. Table 1 shows the distribution of Mqo and AarC, as well as Mdh and SCS, in the five AAB genomes. Only G. diazotrophicus and A. pasteurianus have AarC, which is consistent with the similar habitats of the two genera as described in the Introduction. In addition, Mqo of AAB was phylogenetically divided into two groups: one (type GGr) comprising Mqo of G. oxydans and G. bethesdensis, and the other (type GaA) comprising Mqo of G. diazotrophicus and A. pasteurianus (data not shown). Thus, it is possible to speculate that the ability to overoxidize acetic acid to water and carbon dioxide was acquired by obtaining the aarC and mqo (type GaA) genes after divergence from Gluconobacter. In contrast, Gluconobacter lacks a complete TCA cycle. These results are also in good agreement with the concatenated multigene analysis, suggesting that Gluconobacter was the first to diverge from the ancestor of the three genera, Gluconobacter, Gluconacetobacter, and Acetobacter.
Table 1. Comparison of genes involved in the modified TCA cycle in Acetobacteraceae
Rows: malate : quinone oxidoreductase (two entries), malate dehydrogenase, succinyl-CoA synthetase α chain, succinyl-CoA synthetase β chain.
In the present study, we reconstructed the phylogenetic relationships among AAB using complete genome sequences; the results suggest that Gluconobacter was the first to diverge from the common ancestor of the three genera. This new concept derived from genome-wide phylogenetic analysis fits well with the physiological differences among the three genera, Gluconobacter, Gluconacetobacter, and Acetobacter, the latter two of which are found in similar habitats. Indeed, these two genera were previously classified as a single genus, Acetobacter. Yamada et al. (1997) separated the genus into Gluconacetobacter and Acetobacter on the basis of partial sequences of the 16S rRNA gene. In contrast to the 16S rRNA gene-based phylogenetic tree, our results fit well with the fact that Gluconacetobacter and Acetobacter have similar physiologies and habitats. The present results indicate that analysis of large concatenated multiprotein datasets is a very useful technique for improving the accuracy of phylogenetic inference. Although whole-genome sequences are needed, the technique should be useful for the analysis of phylogenetic relationships at the genome level.
This work was supported by the Program for Promoting Basic Research Activities for Innovative Biosciences (PROBRAIN).