DNA sequence-based analysis of the Pseudomonas species

Authors

  • Magdalena Mulet,

    1. Microbiologia, Departament de Biologia, Edifici Guillem Colom, Universitat de les Illes Balears, Campus UIB, 07122 Palma de Mallorca, Spain.
  • Jorge Lalucat,

    1. Microbiologia, Departament de Biologia, Edifici Guillem Colom, Universitat de les Illes Balears, Campus UIB, 07122 Palma de Mallorca, Spain.
    2. Institut Mediterrani d'Estudis Avançats (IMEDEA, CSIC-UIB), Campus UIB, 07122 Palma de Mallorca, Spain.
  • Elena García-Valdés

    Corresponding author
    1. Microbiologia, Departament de Biologia, Edifici Guillem Colom, Universitat de les Illes Balears, Campus UIB, 07122 Palma de Mallorca, Spain.
    2. Institut Mediterrani d'Estudis Avançats (IMEDEA, CSIC-UIB), Campus UIB, 07122 Palma de Mallorca, Spain.
      E-mail elena.garciavaldes@uib.es; Tel. (+34) 971 173141/657698319; Fax (+34) 971 173184.

Summary

Partial sequences of four core ‘housekeeping’ genes (16S rRNA, gyrB, rpoB and rpoD) of the type strains of 107 Pseudomonas species were analysed in order to obtain a comprehensive view of the phylogenetic relationships within the Pseudomonas genus. Gene trees allowed the discrimination of two lineages or intrageneric groups (IG), called IG P. aeruginosa and IG P. fluorescens. The first, IG P. aeruginosa, was divided into three main groups, represented by the species P. aeruginosa, P. stutzeri and P. oleovorans. The second IG was divided into six groups, represented by the species P. fluorescens, P. syringae, P. lutea, P. putida, P. anguilliseptica and P. straminea. The P. fluorescens group was the most complex and included nine subgroups, represented by the species P. fluorescens, P. gessardi, P. fragi, P. mandelii, P. jessenii, P. koreensis, P. corrugata, P. chlororaphis and P. asplenii. Pseudomonas rhizosphaerae was affiliated with the P. fluorescens IG in the phylogenetic analysis but was independent of any group. Some species were located on phylogenetic branches that were distant from defined clusters, such as those represented by the P. oryzihabitans group and the type strains P. pachastrellae, P. pertucinogena and P. luteola. Additionally, 17 strains of P. aeruginosa, ‘P. entomophila’, P. fluorescens, P. putida, P. syringae and P. stutzeri, for which genome sequences have been determined, were included to compare the results obtained in the analysis of four housekeeping genes with those obtained from whole genome analyses.

Introduction

The genus Pseudomonas is one of the more diverse genera, and its taxonomy has undergone many changes since earlier descriptions (Palleroni, 1984). Sequencing of the 16S rDNA gene (rrs) has redistributed some of the former Pseudomonas species into other genera, in particular, into the alpha, beta or gamma subclasses of Proteobacteria (Kersters et al., 1996). The members of the genus Pseudomonas (sensu stricto) belong to Palleroni's RNA group I, in the Gammaproteobacteria. The number of species in the genus Pseudomonas is increasing every year: there were 102 in 2006, 109 in 2007, 114 in 2008 and 118 in 2009 at the time of this writing (as stated in Euzéby's List of Prokaryotic Names and in the DSMZ; web pages: http://www.bacterio.cict.fr/ and http://www.dsmz.de/). A study of all of the type strains is essential for a comprehensive view of the current state of the Pseudomonas genus.

Previous work in our laboratory with members of the species P. stutzeri (García-Valdés et al., 2003; Cladera et al., 2004; 2006a,b; Mulet et al., 2008) has permitted us to generate the tools needed to extend our study to other Pseudomonas species, such as the appropriate selection of genes (Cladera et al., 2004; Mulet et al., 2008), improvements in PCR protocols and primers used (Mulet et al., 2009), and the creation of a specific database, PseudoMLSA, which is now available, to compile all of these gene sequences for the characterization and taxonomical identification of Pseudomonas strains (http://www.uib.es/microbiologiaBD/Welcome.php) (A. Bennasar, pers. comm.).

In this work, several genes were used to delineate the phylogenetic status of species in the genus Pseudomonas. The 16S rDNA was included because, as a universal marker, it permits the ascription of a strain to the genus and allows comparisons between very divergent bacteria (Santos and Ochman, 2004). However, the resolution of 16S rRNA gene sequences at the intrageneric level is low (Moore et al., 1996; Anzai et al., 2000; Yamamoto et al., 2000). Thus, the gene sequences for ‘housekeeping’ proteins, which provide better resolution, have been studied. The gyrB gene encodes the beta-subunit of gyrase, which is responsible for negative supercoiling of DNA during replication. The rpoD gene encodes the sigma 70 subunit of RNA polymerase. Both genes have been used by Yamamoto and collaborators, initially for the phylogenetic characterization of P. putida strains (Yamamoto and Harayama, 1998) and later for analysis of 31 species of the Pseudomonas genus (Yamamoto et al., 2000), allowing for the establishment of different intra-generic complexes. The rpoB gene, encoding the beta-subunit of RNA polymerase, has been postulated to be a good candidate for phylogenetic analysis and identification of bacteria for clinical microbiologists (Adékambi et al., 2009). This gene has been used by Tayeb and colleagues (2005) for analyses of species of the Pseudomonas genus and has also been used for other genera, such as Brevundimonas, Ralstonia, Comamonas and Burkholderia (Tayeb et al., 2008), many of which are former members of the genus Pseudomonas (sensu lato).

Moore and colleagues (1996) and Anzai and colleagues (2000) published studies on the phylogeny of Pseudomonas based only on the analysis of 16S rDNA. Later, Yamamoto and colleagues (2000) incorporated the gyrB and rpoD genes for the phylogenetic description of 23 Pseudomonas taxa. Since the work of Hilario and colleagues (2004), in which the atpD, carA and recA genes were incorporated into an analysis of the type strains of 13 species of Pseudomonas (together with other reference strains), and the publication of the rpoB sequences by Tayeb and colleagues (2005), including 75 type strains, there has been no review of the status of the Pseudomonas genus or of the phylogenetic relationships among its species based on DNA sequencing of representative genes, and only a few studies have considered the combined phylogenetic analysis of several genes in various species (Kiewitz and Tümmler, 2000; Frapolli et al., 2007).

In this article, partial sequences of the 16S rRNA, gyrB, rpoB and rpoD genes of 107 Pseudomonas type strains, including recently described Pseudomonas species, are analysed to provide a necessary update regarding the phylogeny of the genus. Additionally, data from 17 completely sequenced Pseudomonas genomes of strains of the species P. aeruginosa, ‘P. entomophila’, P. fluorescens, P. putida, P. syringae and P. stutzeri have been included in this study to compare the results obtained from the analysis of four housekeeping genes with those obtained from whole genome analysis.

Results

Analysis, comparison and selection of genes

Four housekeeping genes were selected for a multi-locus sequence analysis (MLSA) of the phylogenetic relationships of 107 type strains of the Pseudomonas genus: 16S rRNA gene; gyrB; rpoB; and rpoD. One hundred and eighty new sequences that were obtained in the present study were deposited into the NCBI and PseudoMLSA databases and were analysed together with 248 sequences that were previously deposited into public databases. The four genes were compared in order to determine the most discriminating gene and were used in a combined analysis to infer the phylogeny of the genus.

The genetic diversity of the three selected protein-coding genes in the 107 Pseudomonas species was studied (Table 1). The rpoD gene exhibited the highest percentage of polymorphic sites (70.39%), followed by gyrB (51.82%) and rpoB (44.41%). The dN/dS ratio was calculated as the number of non-synonymous substitutions per non-synonymous site (dN; substitutions that result in an amino acid replacement) divided by the number of synonymous substitutions per synonymous site (dS; substitutions that do not result in an amino acid change). The value was 0.203 for the rpoD gene, 0.068 for gyrB and 0.047 for rpoB. A dN/dS value of less than 1 indicates purifying selection (selection against deleterious non-synonymous substitutions), as was the case for all three genes. rpoD is neither too highly conserved nor too highly variable in Pseudomonas and is therefore a good candidate for phylogenetic and taxonomical analysis. The rpoD gene was also compared with the carA, atpD and recA genes from the previously published analysis of 13 type strains of Pseudomonas (Hilario et al., 2004). In these 13 type strains, the rpoD gene again exhibited the highest percentage of polymorphic sites (48.79%) and the highest nucleotide diversity (0.211), followed by carA (46% and 0.183).
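
As an illustration of two of the measures reported in Table 1, the following minimal sketch computes the percentage of polymorphic sites and the nucleotide diversity (the average proportion of differing sites over all pairs of sequences) for a toy alignment. The function names and the toy sequences are hypothetical, gaps and ambiguous bases are not handled, and the software actually used for Table 1 is not identified in this section, so the sketch should not be read as the procedure followed in the study.

```python
from itertools import combinations


def polymorphic_site_percentage(alignment):
    """Percentage of alignment columns that contain more than one nucleotide."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    polymorphic = sum(1 for column in zip(*alignment) if len(set(column)) > 1)
    return 100.0 * polymorphic / length


def nucleotide_diversity(alignment):
    """Mean proportion of differing sites over all pairs of sequences."""
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    total = sum(
        sum(a != b for a, b in zip(s1, s2)) / length for s1, s2 in pairs
    )
    return total / len(pairs)


# Hypothetical toy alignment (not data from the study).
toy_alignment = ["ATGCATGC", "ATGCATGA", "ATGTATGA"]
print(polymorphic_site_percentage(toy_alignment))
print(nucleotide_diversity(toy_alignment))
```

The percentage of monomorphic sites is simply 100 minus the percentage of polymorphic sites, which is consistent with the paired columns of Table 1.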

Table 1.  Nucleotide diversity of the gyrB, rpoB and rpoD genes for the Pseudomonas type strains and groups analysed in this study.
In each column, the three values are given in the order gyrB / rpoD / rpoB.
Groups, subgroups or type strains | % Monomorphic sites | % Polymorphic sites | Nucleotide diversity | dN/dS ratio | G+C
All strains | 48.18 / 29.61 / 55.59 | 51.82 / 70.39 / 44.41 | 0.15448 / 0.21567 / 0.10568 | 0.068 / 0.203 / 0.047 | 0.565 / 0.616 / 0.603
P. fluorescens SG | 70.30 / 73.84 / 82.40 | 29.70 / 26.16 / 17.60 | 0.08748 / 0.07744 / 0.04288 | 0.019 / 0.067 / 0.026 | 0.548 / 0.609 / 0.596
P. gessardi SG | 88.85 / 88.89 / 91.91 | 11.15 / 11.11 / 8.09 | 0.06036 / 0.05077 / 0.03760 | 0.019 / 0.053 / 0.014 | 0.557 / 0.613 / 0.595
P. fragi SG | 83.46 / 88.05 / 86.99 | 16.54 / 11.95 / 13.01 | 0.09461 / 0.06845 / 0.07177 | 0.026 / 0.070 / 0.016 | 0.552 / 0.592 / 0.572
P. mandelii SG | 86.97 / 91.56 / 94.97 | 13.03 / 8.44 / 5.03 | 0.07477 / 0.04735 / 0.02605 | 0.024 / 0.089 / 0.033 | 0.560 / 0.598 / 0.586
P. jessenii SG | 86.72 / 83.12 / 91.91 | 13.28 / 16.88 / 8.09 | 0.06165 / 0.07370 / 0.03556 | 0.006 / 0.076 / 0.017 | 0.560 / 0.598 / 0.594
P. koreensis SG | 95.24 / 93.53 / 95.96 | 4.76 / 6.47 / 4.04 | 0.04762 / 0.06470 / 0.04044 | 0.000 / 0.043 / 0.037 | 0.577 / 0.611 / 0.598
P. corrugata SG | 84.46 / 86.64 / 92.79 | 15.54 / 13.36 / 7.21 | 0.06959 / 0.06850 / 0.03727 | 0.029 / 0.067 / 0.026 | 0.581 / 0.623 / 0.610
P. chlororaphis SG | 96.12 / 97.33 / 96.83 | 3.88 / 2.67 / 3.17 | 0.02590 / 0.01782 / 0.02113 | 0.021 / 0.074 / 0.000 | 0.572 / 0.622 / 0.602
P. asplenii SG | 98.50 / 99.58 / 98.80 | 1.50 / 0.42 / 1.20 | 0.01504 / 0.00422 / 0.01202 | 0.000 / 0.000 / 0.032 | 0.583 / 0.621 / 0.603
P. agarici [a] | 80.58 / 70.64 / 88.74 | 19.42 / 29.36 / 11.26 | 0.19424 / 0.29362 / 0.11257 | 0.116 / 0.310 / 0.082 | 0.539 / 0.619 / 0.621
P. syringae G | 74.94 / 76.65 / 83.61 | 25.06 / 23.35 / 16.39 | 0.09374 / 0.07083 / 0.05554 | 0.014 / 0.063 / 0.008 | 0.538 / 0.586 / 0.589
P. lutea G | 78.95 / 82.55 / 85.79 | 21.05 / 17.45 / 14.21 | 0.14871 / 0.12199 / 0.09945 | 0.008 / 0.063 / 0.024 | 0.533 / 0.606 / 0.582
P. rhizosphaerae [a] | 79.70 / 71.45 / 89.40 | 20.30 / 28.55 / 10.60 | 0.20301 / 0.28551 / 0.10601 | 0.091 / 0.341 / 0.062 | 0.548 / 0.644 / 0.609
P. putida G | 75.56 / 69.90 / 89.40 | 24.44 / 30.10 / 10.60 | 0.11672 / 0.13616 / 0.04913 | 0.017 / 0.130 / 0.023 | 0.568 / 0.626 / 0.614
P. anguilliseptica G | 81.13 / 64.68 / 80.44 | 18.87 / 35.32 / 19.56 | 0.13784 / 0.18681 / 0.09661 | 0.056 / 0.116 / 0.015 | 0.563 / 0.600 / 0.594
P. straminea G | 84.21 / 84.18 / 91.80 | 15.79 / 15.82 / 8.20 | 0.10944 / 0.10970 / 0.05574 | 0.024 / 0.069 / 0.048 | 0.587 / 0.647 / 0.628
P. aeruginosa G | 66.54 / 50.43 / 75.08 | 33.46 / 49.57 / 24.92 | 0.13452 / 0.19927 / 0.09570 | 0.068 / 0.354 / 0.060 | 0.604 / 0.649 / 0.634
P. oleovorans G | 86.59 / 92.26 / 89.84 | 13.41 / 7.74 / 10.16 | 0.07937 / 0.04759 / 0.05847 | 0.028 / 0.158 / 0.036 | 0.581 / 0.652 / 0.624
P. stutzeri G | 72.18 / 69.74 / 80.26 | 27.82 / 30.26 / 19.74 | 0.16312 / 0.18201 / 0.11751 | 0.040 / 0.128 / 0.025 | 0.592 / 0.640 / 0.636
P. oryzihabitans G | 98.63 / 98.74 / 87.10 | 1.37 / 1.26 / 12.90 | 0.01373 / 0.01255 / 0.12896 | 0.031 / 0.000 / 0.078 | 0.636 / 0.668 / 0.622
P. luteola [a] | 79.07 / 71.69 / 82.40 | 20.93 / 28.31 / 17.60 | 0.20927 / 0.28310 / 0.17596 | 0.051 / 0.221 / 0.026 | 0.531 / 0.574 / 0.550
P. pachastrellae [a] | 79.32 / 69.34 / 86.73 | 20.68 / 30.66 / 13.27 | 0.20677 / 0.30661 / 0.13268 | 0.169 / 0.456 / 0.134 | 0.564 / 0.602 / 0.603
P. pertucinogena [a] | 77.19 / 68.63 / 84.76 | 22.81 / 31.37 / 15.24 | 0.22807 / 0.31373 / 0.15241 | 0.154 / 0.417 / 0.143 | 0.572 / 0.641 / 0.623

  • a. Pair-wise comparisons referred to P. aeruginosaT; only the highest values are shown.

For each single gene, a matrix of the phylogenetic distances between the 107 type strains was constructed (mean values for each gene are indicated in Table S1), and the distances for all pairs of strains (5671 values) were plotted. When compared with rpoD, the distances of the other three genes in the pair-wise comparisons showed correlation coefficients (R²) of 0.64, 0.75 and 0.69 for the 16S rRNA gene, gyrB and rpoB, respectively (Fig. 1). The discriminatory power of each gene was calculated as the ratio between the rpoD slope and the slopes of the other genes (Fig. 1): rpoD/16S rRNA (eight times); rpoD/rpoB (three times); and rpoD/gyrB (two times). The most discriminating gene analysed was rpoD, followed by gyrB, rpoB and the 16S rRNA gene. Similarly, three other genes (atpD, carA, recA) from 13 type strains (Hilario et al., 2004) were included in the comparison with rpoD: rpoD/16S rRNA (seven times); rpoD/rpoB (four times); rpoD/gyrB (two times); rpoD/carA (two times); rpoD/recA (five times); rpoD/atpD (three times). In all cases, rpoD was the most discriminating gene. The range and average distances for each gene are shown in Table S1.
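
The pair count and the slope ratio described above can be sketched as follows. The number of pairs among n strains is n(n-1)/2, which gives 5671 for 107 strains. The slope calculation assumes a least-squares line forced through the origin (fits of the form y = ax are reported elsewhere in this section); the common reference axis, the function names and the toy distances are hypothetical and serve only to show how a ratio of slopes is obtained.

```python
def pair_count(n_strains):
    """Number of distinct strain pairs in a pair-wise comparison."""
    return n_strains * (n_strains - 1) // 2


def slope_through_origin(x, y):
    """Least-squares slope of y = a * x (no intercept term)."""
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / sxx


def discriminatory_ratio(reference, gene_a, gene_b):
    """Ratio of the slopes of two genes fitted against the same reference axis."""
    return slope_through_origin(reference, gene_a) / slope_through_origin(
        reference, gene_b
    )


print(pair_count(107))  # 5671, the number of pair-wise values quoted in the text

# Hypothetical toy distances (not data from the study).
reference_axis = [0.1, 0.2, 0.3]
fast_gene = [0.2, 0.4, 0.6]
slow_gene = [0.025, 0.05, 0.075]
print(discriminatory_ratio(reference_axis, fast_gene, slow_gene))
```

In this toy case the faster-evolving gene accumulates distance about eight times as quickly as the slower one, which is the kind of statement summarized by the "times" values in the text.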

Figure 1.

Least-squares trend lines obtained for the comparisons of phylogenetic distances between 107 Pseudomonas type strains. The slope is indicated in each case. The lines have been vertically shifted for the sake of clarity. The correlation coefficient R² is 0.6401 for 16S rRNA, 0.7501 for gyrB and 0.686 for rpoB.

In a similar way, the matrices constructed for the concatenated sequences of three genes (16S rRNA gene, gyrB, rpoD; 2870 nt) and four genes (16S rRNA gene, gyrB, rpoB and rpoD; 3726 nt) were compared in a pair-wise manner to assess the correlation between them and the relative discriminatory power of both sets of genes. They were well correlated and almost equally discriminating (three genes versus four genes: y = 1.0252x, R² = 0.987). The analysis was extended to the 13 type strains studied by Hilario and colleagues (2004), which included three additional genes. The seven genes (16S rRNA gene, gyrB, rpoB, rpoD, carA, atpD and recA; 5790 nt) were also concatenated and compared in a pair-wise manner with the other sets of genes. Both sets were well correlated with the seven-gene set and almost equally discriminating (three genes versus seven genes: y = 0.8962x, R² = 0.9743; four genes versus seven genes: y = 0.9811x, R² = 0.9877). rpoD was well correlated with all of them (three genes: y = 0.3741x, R² = 0.9077; four genes: y = 0.3586x, R² = 0.8881) and yielded the best resolution.

Phylogenetic trees

The criteria established for clustering the type strains of the Pseudomonas species were the analysis of the alignments and the topology of the trees, based on the presence of well-defined nodes and independent branches with bootstrap values over 500 (from 1000 replicates). Individual phylogenetic trees of the 16S rRNA gene, gyrB, rpoD and rpoB were calculated with several methods (Jukes-Cantor, maximum likelihood, maximum parsimony and minimum evolution) to compare the phylogenetic relationships between the Pseudomonas species. No significant differences were found between trees, regardless of the clustering method used. All of the trees were congruent, and the main lineages were well represented in all of them. Two main branches or lineages were clearly defined in the topologies of individual trees. The 16S rRNA gene tree was the exception, as the lineages were not distinguishable in any of the trees generated for this gene.
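
To make the bootstrap criterion concrete, the sketch below shows how one bootstrap replicate of an alignment is usually drawn (columns are resampled with replacement) and how a node count is compared with the threshold of 500 out of 1000 replicates. The function names and the toy alignment are hypothetical, and tree building on each replicate is deliberately left out; the sketch is not a description of the software used in this study.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping rows aligned."""
    length = len(alignment[0])
    picks = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def meets_support_threshold(times_recovered, replicates=1000, threshold=500):
    """True if a node is recovered in more than `threshold` per 1000 replicates."""
    return times_recovered * 1000 / replicates > threshold


# Hypothetical toy alignment (not data from the study).
toy_alignment = ["ACGT", "ACGA", "TCGA"]
replicate = bootstrap_replicate(toy_alignment, random.Random(0))
print(replicate)
print(meets_support_threshold(623))
```

Under this criterion a node recovered in 623 of 1000 replicates is retained, whereas a node recovered in exactly 500 replicates is not.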

The concatenated sequences of four genes, applying the Jukes-Cantor method with neighbour-joining representation, were selected as the reference of our comparisons. The sequences of the four genes selected for the 107 type strains were aligned in the following order: 16S rRNA gene; gyrB; rpoD; and rpoB; comprising 3726 nucleotides. The rooted tree shows the established lineages, groups (G) and subgroups (SG) (Fig. 2), as did the trees for individual genes. The unrooted tree (Figs 3 and 4) showed the interrelationships between the 18 groups more clearly. The name of the first species described in a group or subgroup was chosen to designate it. The resulting groupings were supported by high bootstrap values and are shown in Fig. 5. No differences were found in the tree topologies when rpoB was excluded, with the only exception being the P. alcaligenes type strain, which was located in the P. oleovorans group.
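
A minimal sketch of the two steps named above, concatenation of the aligned genes in a fixed order and calculation of a Jukes-Cantor distance, is given below. The Jukes-Cantor correction is the standard formula d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. The function names, the handling of gaps (sites are skipped unless both sequences carry A, C, G or T) and the toy data are assumptions made for illustration; neighbour-joining itself is not shown.

```python
import math


def concatenate(gene_alignments, gene_order):
    """Join the aligned sequences of each strain in the given gene order."""
    strains = gene_alignments[gene_order[0]].keys()
    return {
        strain: "".join(gene_alignments[gene][strain] for gene in gene_order)
        for strain in strains
    }


def jukes_cantor_distance(seq1, seq2):
    """Jukes-Cantor corrected distance between two aligned sequences."""
    sites = [(a, b) for a, b in zip(seq1, seq2) if a in "ACGT" and b in "ACGT"]
    p = sum(a != b for a, b in sites) / len(sites)
    if p >= 0.75:
        raise ValueError("distance undefined: proportion of differences >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy alignments for two genes and two strains.
genes = {
    "16S": {"strain1": "AC", "strain2": "AT"},
    "gyrB": {"strain1": "GG", "strain2": "GA"},
}
joined = concatenate(genes, ["16S", "gyrB"])
print(joined)
print(jukes_cantor_distance(joined["strain1"], joined["strain2"]))
```

Fixing the gene order matters only for reproducibility of the concatenated alignment; the Jukes-Cantor distance itself does not depend on the order of the sites.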

Figure 2.

Schematic phylogenetic tree of 107 Pseudomonas type strains based on the concatenated analysis of 16S rRNA, gyrB, rpoB and rpoD genes. Distance matrix was calculated by the Jukes-Cantor method. Dendrogram was generated by neighbour-joining. Cellvibrio japonicus Ueda107 was used as outgroup. The bar indicates sequence divergence.

Figure 3.

Phylogenetic tree (unrooted) of 107 Pseudomonas type strains based on phylogenetic analysis of partial sequences of the 16S rRNA, gyrB, rpoB and rpoD genes. The bar indicates sequence divergence. Distance matrix was calculated by the Jukes-Cantor method. Dendrogram was generated by neighbour-joining. Cellvibrio japonicus Ueda107 was used as outgroup. (1) P. antarcticaT, (2) P. azotoformansT, (3) P. cedrinaT, (4) P. costantiniiT, (5) P. extremorientalisT, (6) P. fluorescensT, (7) P. grimontiiT, (8) P. libanensisT, (9) P. marginalisT, (10) P. orientalisT, (11) P. palleronianaT, (12) P. panacisT, (13) P. poaeT, (14) P. salomoniiT, (15) P. synxanthaT, (16) P. tolaasiiT, (17) P. trivialisT, (18) P. veroniiT, (19) P. rhodesiaeT, (20) P. simiaeT, (21) P. brenneriT, (22) P. gessardiT, (23) P. meridianaT, (24) P. mucidolensT, (25) P. proteolyticaT, (26) P. brassicacearumT, (27) P. corrugataT, (28) P. kilonensisT, (29) P. mediterraneaT, (30) P. thivervalensisT, (31) P. agariciT, (32) P. aspleniiT, (33) P. fuscovaginaeT, (34) P. aurantiacaT, (35) P. aureofaciensT, (36) P. chlororaphisT, (37) P. koreensisT, (38) P. moraviensisT, (39) P. jesseniiT, (40) P. vancouverensisT, (41) P. umsongensisT, (42) P. mohniiT, (43) P. mooreiT, (44) P. reinekeiT, (45) P. frederiksbergensisT, (46) P. mandeliiT, (47) P. liniT, (48) P. migulaeT, (49) P. fragiT, (50) P. lundensisT, (51) P. psychrophilaT, (52) P. taetrolensT, (53) P. amygdaliT, (54) P. avellanaeT, (55) P. cannabinaT, (56) P. caricapapayaeT, (57) P. cichoriiT, (58) P. congelansT, (59) P. ficuserectaeT, (60) P. meliaeT, (61) P. savastanoiT, (62) P. syringaeT, (63) P. tremaeT, (64) P. viridiflavaT, (65) P. abietaniphilaT, (66) P. graminisT, (67) P. luteaT, (69) P. cremoricolorataT, (70) P. fulvaT, (71) P. mosseliiT, (72) P. monteiliiT, (73) P. parafulvaT, (74) P. plecoglossicidaT, (75) P. putidaT, (76) P. aeruginosaT, (77) P. citronellolisT, (78) P. jinjuensisT, (79) P. nitroreducensT, (80) P. panipatensisT, (81) P. knackmussiiT, (82) P. resinovoransT, (83) P. otitidisT, (84) P. indicaT, (85) P. thermotoleransT, (86) P. alcaligenesT, (87) P. oryzihabitansT, (88) P. psychrotoleransT, (89) P. alcaliphilaT, (90) P. mendocinaT, (91) P. oleovoransT, (92) P. pseudoalcaligenesT, (93) P. argentinensisT, (94) P. flavescensT, (95) P. stramineaT, (96) P. anguillisepticaT, (97) P. peliT, (98) P. guineaeT, (99) P. marincolaT, (100) P. borboriT, (101) P. azotifigensT, (102) P. balearicaT, (103) P. stutzeriT, (104) P. xanthomarinaT. Corresponding numbers of culture collections are shown in Table 2.

Figure 4.

Phylogenetic tree (unrooted) of the Pseudomonas fluorescens group based on the phylogenetic analysis of partial sequences of the 16S rRNA, gyrB, rpoB and rpoD concatenated genes. The bar indicates sequence divergence. Distance matrix was calculated by the Jukes-Cantor method. Dendrogram was generated by neighbour-joining. Pseudomonas aeruginosaT was used as outgroup. (1) P. antarcticaT, (2) P. azotoformansT, (3) P. cedrinaT, (4) P. costantiniiT, (5) P. extremorientalisT, (6) P. fluorescensT, (7) P. grimontiiT, (8) P. libanensisT, (9) P. marginalisT, (10) P. orientalisT, (11) P. palleronianaT, (12) P. panacisT, (13) P. poaeT, (14) P. salomoniiT, (15) P. synxanthaT, (16) P. tolaasiiT, (17) P. trivialisT, (18) P. veroniiT, (19) P. rhodesiaeT, (20) P. simiaeT, (21) P. brenneriT, (22) P. gessardiT, (23) P. meridianaT, (24) P. mucidolensT, (25) P. proteolyticaT, (26) P. brassicacearumT, (27) P. corrugataT, (28) P. kilonensisT, (29) P. mediterraneaT, (30) P. thivervalensisT, (32) P. aspleniiT, (33) P. fuscovaginaeT, (34) P. aurantiacaT, (35) P. aureofaciensT, (36) P. chlororaphisT, (37) P. koreensisT, (38) P. moraviensisT, (39) P. jesseniiT, (40) P. vancouverensisT, (41) P. umsongensisT, (42) P. mohniiT, (43) P. mooreiT, (44) P. reinekeiT, (45) P. frederiksbergensisT, (46) P. mandeliiT, (47) P. liniT, (48) P. migulaeT, (49) P. fragiT, (50) P. lundensisT, (51) P. psychrophilaT, (52) P. taetrolensT. Corresponding numbers of culture collections are shown in Table 2.

Figure 5.

(A and B) Phylogenetic tree of 107 type strains of Pseudomonas based on the phylogenetic analysis of four concatenated genes (16S rRNA, gyrB, rpoB and rpoD genes). Distance matrices were calculated by the Jukes-Cantor method. Dendrograms were generated by neighbour-joining. (A) The P. fluorescens phylogenetic lineage and (B) the P. aeruginosa phylogenetic lineage. Cellvibrio japonicus Ueda107 was used as an outgroup. The bar indicates sequence divergence. Bootstrap values of more than 500 (from 1000 replicates) are indicated at the nodes.

Groups established

One lineage was the P. aeruginosa lineage and the other was the ‘fluorescent pseudomonas’ intrageneric group, the P. fluorescens lineage. One group and three additional strains were not included in these lineages: the P. oryzihabitans group (two species) and the type strains of P. luteola, P. pachastrellae and P. pertucinogena, which were the most phylogenetically distant from all other Pseudomonas (Fig. 5).

The P. aeruginosa lineage presented three groups: P. aeruginosa (bootstrap 623); P. oleovorans (bootstrap 623); and P. stutzeri (bootstrap 535). Six groups could be distinguished within the P. fluorescens lineage: the groups of P. fluorescens (bootstrap 776); P. syringae (bootstrap 598); P. lutea (bootstrap 921); P. putida (bootstrap 1000); P. anguilliseptica (bootstrap 696); and P. straminea (bootstrap 696).

Some groups were composed of several subgroups (Fig. 5). The P. fluorescens group was made up of the following nine SG: P. fluorescens (20 species); P. gessardi (five species); P. fragi (four species); P. mandelii (four species); P. jessenii (six species); P. koreensis (two species); P. corrugata (five species); P. chlororaphis (three species); and P. asplenii (two species). The remaining groups in the P. fluorescens lineage do not have subgroups. The type strain P. agarici is topologically located in the P. fluorescens group but does not specifically belong to any subgroup. There were 12 species in the P. syringae group, seven in the P. putida group, five in the P. anguilliseptica group and three in the P. straminea group (Fig. 5A). Pseudomonas rhizosphaerae was located in one branch that was distant from the other type strains. All type strains are listed in Table 2, and the number of type strains of each group and subgroup is listed in Table 3.

Table 2.  Bacterial strains used in this study.
Species | Strain [a] | Reference [b] | Species | Strain [a] | Reference [b]
P. abietaniphila [c] | ATCC 700689T | Mohn et al. (1999) | P. marincola [c] | JCM 14761T | Romanenko et al. (2008)
P. aeruginosa | ATCC 10145T | Schroeter (1872) | P. mediterranea | CFBP 5447T | Catara et al. (2002)
P. agarici | ATCC 25941T | Young (1970) | P. meliae | CCUG 51503T | Ogimi (1981)
P. alcaligenes | ATCC 14909T | Monias (1928) | P. mendocina | ATCC 25411T | Palleroni (1970)
P. alcaliphila | LMG 23134T | Yumoto et al. (2001) | P. meridiana | CIP 108465T | Reddy et al. (2004)
P. amygdali | LMG 1384T | Psallidas and Panagopoulos (1975) | P. migulae | CCUG 43165T | Verhille et al. (1999)
P. anguilliseptica | LMG 21629T | Wakabayashi and Egusa (1972) | P. mohnii | CCUG 53115T | Cámara et al. (2007)
P. antarctica | LMG 22709T | Reddy et al. (2004) | P. monteilii | DSM 14164T | Elomari et al. (1997)
P. argentinensis | LMG 22563T | Peix et al. (2005) | P. moorei | CCUG 53114T | Cámara et al. (2007)
P. asplenii | ATCC 23835T | Ark and Tompkins (1946) | P. moraviensis [c] | DSM 16007T | Tvrzová et al. (2006)
P. aurantiaca | ATCC 33663T | Nakhimovskaya (1948) | P. mosselii | ATCC BAA-99T | Dabboussi et al. (2002)
P. aureofaciens | LMG 1245T | Kluyver (1956) | P. mucidolens | LMG 2223T | Levine and Anderson (1932)
P. avellanae [c] | CIP 105176T | Janse et al. (1997) | P. nitroreducens | ATCC 33634T | Iizuka and Komagata (1964)
P. azotifigens [d] | DSM 17556T | Hatayama et al. (2005) | P. oleovorans | LMG 2229T | Lee and Chandler (1941)
P. azotoformans | LMG 21611T | Iizuka and Komagata (1963) | P. orientalis | DSM 17489T | Dabboussi et al. (2002)
P. balearica | DSM 6083T | Bennasar et al. (1996) | P. oryzihabitans [c, e1] | LMG 7040T | Kodama et al. (1985)
P. borbori [c, d] | LMG 23199 | Vanparys et al. (2006) | P. otitidis [c] | DSM 17224T | Clark et al. (2006)
P. brassicacearum | CFBP 11706T | Achouak et al. (2000) | P. pachastrellae | CCUG 46540T | Romanenko et al. (2005a)
P. brenneri | DSM 15294T | Baïda et al. (2002) | P. palleroniana | LMG 23076T | Gardan et al. (2002)
P. cannabina | LMG 5096T | Gardan et al. (1999) | P. panacis | CIP 108524T | Park et al. (2005)
P. caricapapayae | LMG 2152T | Robbs (1956) | P. panipatensis [c] | CCM 7469T | Gupta et al. (2008)
P. cedrina | DSM 17516T | corrig. Dabboussi et al. (2002) | P. parafulva | DSM 117004T | Uchino et al. (2002)
P. chlororaphis | ATCC 9446T | Guignard and Sauvageau (1894) | P. peli [c] | LMG 23201T | Vanparys et al. (2006)
P. cichorii [e1] | ATCC 10857T | Swingle (1925) | P. pertucinogena [c] | LMG 1874T | Kawai and Yabuuchi (1975)
P. citronellolis | LMG 18378T | Seubert (1960) | P. plecoglossicida | CIP 106493T | Nishimori et al. (2000)
P. congelans [c] | LMG 21466T | Behrendt et al. (2003) | P. poae | LMG 21465T | Behrendt et al. (2003)
P. corrugata | ATCC 29736T | Scarlett et al. (1978) | P. proteolytica | CIP 108464T | Reddy et al. (2004)
P. costantinii | LMG 22119T | Munsch et al. (2002) | P. pseudoalcaligenes | ATCC 17440T | Stanier (1966)
P. cremoricolorata | DSM 17059T | Uchino et al. (2001) | P. psychrophila | DSM 17535T | Yumoto et al. (2002)
P. extremorientalis [d] | LMG 19695T | Ivanova et al. (2002) | P. psychrotolerans | LMG 21977T | Hauser et al. (2004)
P. ficuserectae | CCUG 32779T | Goto (1983) | P. putida | ATCC 12633T | Trevisan (1889)
P. flavescens | LMG 18387T | Hildebrand et al. (1994) | P. reinekei | CCUG 53116T | Cámara et al. (2007)
P. fluorescens | ATCC 13525T | Migula (1895) | P. resinovorans | LMG 2774T | Delaporte et al. (1961)
P. fragi | ATCC 4973T | Eichholz (1902) | P. rhizosphaerae | LMG 21640 | Peix et al. (2003)
P. frederiksbergensis | LMG 19851T | Andersen et al. (2000) | P. rhodesiae [c, e1, e2] | LMG 17764T | Coroler et al. (1997)
P. fulva | LMG 11722T | Iizuka and Komagata (1963) | P. salomonii | LMG 22120T | Gardan et al. (2002)
P. fuscovaginae | LMG 2158T | Miyajima et al. (1983) | P. savastanoi | LMG 2209T | Janse (1982)
P. gessardi | CIP 105469T | Verhille et al. (1999) | P. simiae [c] | CCUG 50988T | Vela et al. (2006)
P. graminis | LMG 21611T | Behrendt et al. (1999) | P. straminea | LMG 21615T | corrig. Iizuka and Komagata (1963)
P. grimontii | CIP 106645T | Baïda et al. (2002) | P. stutzeri | ATCC 17588T | Lehmann and Neumann (1896)
P. guineae [c] | LMG 24016T | Bozal et al. (2007) | P. synxantha | LMG 2335T | Ehrenberg (1840)
P. indica | LMG 23066T | Pandey et al. (2002) | P. syringae | ATCC 19310T | Van Hall (1902)
P. jessenii | CIP 105274T | Verhille et al. (1999) | P. taetrolens | LMG 2336T | Haynes (1957)
P. jinjuensis [f] | LMG 21316T | Kwon et al. (2003) | P. thermotolerans | CIP 107795T | Manaia and Moore (2002)
P. kilonensis | DSM 13647T | Sikorski et al. (2001) | P. thivervalensis [d] | CFBP 11261T | Achouak et al. (2000)
P. knackmussii [d] | LMG 23759T | Stolz et al. (2007) | P. tolaasii | ATCC 33618T | Paine (1919)
P. koreensis | LMG 21318T | Kwon et al. (2003) | P. tremae | LMG 22121T | Gardan et al. (1999)
P. libanensis | CIP 105460T | Dabboussi et al. (1999) | P. trivialis | LMG 21464T | Behrendt et al. (2003)
P. lini | CIP 107460T | Delorme et al. (2002) | P. umsongensis | LMG 21317T | Kwon et al. (2003)
P. lundensis | LMG 13517T | Molin et al. (1986) | P. vancouverensis | ATCC 700688T | Mohn et al. (1999)
P. lutea | LMG 21974T | Peix et al. (2004) | P. veronii | LMG 17761T | Elomari et al. (1996)
P. luteola | LMG 21607T | Kodama et al. (1985) | P. viridiflava | ATCC 13223T | Burkholder (1930)
P. mandelii | LMG 2210T | Verhille et al. (1999) | P. xanthomarina | CCUG 46543T | Romanenko et al. (2005b)
P. marginalis | ATCC 10844T | Brown (1918) | | |

  • a. The strain collection numbers indicated in bold refer to strains sequenced in this study.
  • b. These references are cited in http://www.bacterio.cict.fr (Euzéby, 1997).
  • c. The gyrB gene was amplified with the primers gyrBBAUP2/APrU.
  • d. The rpoD gene was amplified with the primers V4/LAPS27.
  • e. Specific amplified band was cloned for the gyrB gene (e1) or the rpoD gene (e2).
  • f. The gyrB gene was amplified with the primers gBMM1F/gBMM725R, specifically designed in this study.

Table 3.  Intra-group and inter-group minimum similarities (in percentage) between Pseudomonas type strains.
The four "% Min." columns give the minimum similarity (%) within the G/SG. The intra-group mean value and the closest group or subgroup refer to the four concatenated genes.
Group or subgroup (G/SG) | Number of species | % Min., four concatenated genes | % Min., three concatenated genes | % Min., four consensus genes | % Min., three consensus genes | Intra-group mean value (%) ± standard deviation | Closest group or subgroup
P. fluorescens SG | 20 | 92.90 | 93.01 | 91.83 | 92.91 | 95.48 ± 1.70 | P. mandelii SG; P. gessardi SG
P. gessardi SG | 5 | 94.74 | 94.56 | 94.03 | 94.75 | 97.56 ± 2.00 | P. fluorescens SG
P. fragi SG | 4 | 93.35 | 94.51 | 92.42 | 94.42 | 96.59 ± 3.07 | P. fluorescens SG; P. mandelii SG
P. mandelii SG | 4 | 95.86 | 95.45 | 95.24 | 95.22 | 97.93 ± 1.82 | P. jessenii SG
P. jessenii SG | 6 | 95.46 | 95.45 | 94.58 | 94.57 | 97.28 ± 1.82 | P. mandelii SG
P. koreensis SG | 2 | 96.54 | 96.77 | 95.96 | 95.96 | 98.27 ± 2.44 | P. jessenii SG
P. corrugata SG | 5 | 95.06 | 95.14 | 94.08 | 94.08 | 97.47 ± 2.06 | P. jessenii SG; P. chlororaphis SG
P. chlororaphis SG | 3 | 98.05 | 98.25 | 97.89 | 98.76 | 99.17 ± 0.96 | P. corrugata SG
P. asplenii SG | 2 | 99.27 | 99.42 | 99.19 | 99.19 | 99.63 ± 0.52 | P. chlororaphis SG
P. syringae G | 12 | 91.00 | 90.86 | 89.74 | 89.73 | 95.48 ± 2.80 | P. chlororaphis SG; P. mandelii SG; P. fluorescens SG
P. lutea G | 3 | 90.06 | 90.51 | 88.36 | 88.36 | 95.51 ± 4.97 | P. fluorescens SG; P. mandelii SG
P. putida G | 7 | 91.19 | 90.01 | 89.20 | 89.23 | 94.66 ± 3.32 | P. asplenii SG
P. anguilliseptica G | 5 | 87.27 | 86.97 | 84.82 | 84.5 | 92.98 ± 5.48 | P. oleovorans G; P. straminea G
P. straminea G | 3 | 92.38 | 91.80 | 91.06 | 91.05 | 96.72 ± 3.72 | P. oleovorans G
P. aeruginosa G | 11 | 86.42 | 85.54 | 84.10 | 89.21 | 90.63 ± 4.52 | P. oleovorans G
P. oleovorans G | 4 | 93.95 | 94.47 | 93.12 | 93.1 | 97.30 ± 2.83 | P. straminea G; P. aeruginosa G
P. stutzeri G | 4 | 88.22 | 86.28 | 83.77 | 83.69 | 92.83 ± 6.30 | P. oleovorans G; P. aeruginosa G
P. oryzihabitans G | 2 | 94.55 | 97.19 | 94.65 | 94.63 | 97.28 ± 3.85 | P. aeruginosa G; P. oleovorans G

Distances between Pseudomonas type strains

In most cases, the groups were well defined not only in tree topology and bootstrap values but also in the similarities between strains within each group. Minimal intra-group similarities, calculated with the concatenated sequences of three and four genes, are represented in Fig. 6 for the 18 Pseudomonas groups. In none of the groups were relevant differences observed between the two sets of genes (average difference was 0.47). Phylogenetic similarities (in percentage) between species within groups ranged from 99.3 to 85.4 (four concatenated genes) or from 99.4 to 85.5 (three concatenated genes), depending on the number of species and the diversity of the group. Table 3 shows the minimal similarity between species in each group. Pseudomonas pachastrellae, P. pertucinogena and P. luteola displayed the lowest similarities of the entire genus (77–80%). The maximum similarity between two type strains was 99.8%. The minimum similarity among all members of the Pseudomonas genus was 77%, and the closest related genus was Cellvibrio. Cellvibrio japonicus had a similarity of 69%.
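
The minimal intra-group similarity can be illustrated with the short sketch below. It assumes that a percentage similarity is obtained from a pair-wise distance as 100 × (1 − distance), which is a common convention but is not stated in this section; the function names, strain labels and distances are likewise hypothetical.

```python
from itertools import combinations


def similarity_percent(distance):
    """Convert a pair-wise distance into a percentage similarity (assumed convention)."""
    return 100.0 * (1.0 - distance)


def minimum_intra_group_similarity(distances, members):
    """Lowest similarity among all pairs of strains belonging to one group."""
    return min(
        similarity_percent(distances[frozenset(pair)])
        for pair in combinations(members, 2)
    )


# Hypothetical toy distances between three strains of one group.
toy_distances = {
    frozenset(("A", "B")): 0.01,
    frozenset(("A", "C")): 0.04,
    frozenset(("B", "C")): 0.03,
}
print(minimum_intra_group_similarity(toy_distances, ["A", "B", "C"]))
```

Keying the distances by unordered pairs avoids storing both halves of a symmetric matrix; the most divergent pair in the group determines the reported minimum.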

Figure 6.

Minimal similarity in percentage within the subgroups of Pseudomonas type strains in the analysis of three concatenated genes and four concatenated genes.

Nucleotide diversity

The nucleotide variability of the genes in the complete set of strains and in each of the established groups was described by the percentage of monomorphic and polymorphic sites, the nucleotide diversity and the dN/dS values. Comparisons were performed with all strains studied (107), by group or subgroup, and pair-wise (Table 1).

The Pseudomonas groups that exhibited the highest values of nucleotide diversity were the P. stutzeri, P. aeruginosa and P. anguilliseptica groups, mainly in all three genes (gyrB, rpoB and rpoD), and the P. oryzihabitans group in the rpoB gene. These data are in accordance with the concatenated tree of four genes.

Nucleotide diversities were calculated for the 107 strains compared with the type strain of P. aeruginosa, the type species of the genus. For the rpoB gene, the highest value was with the type strain P. luteola; for rpoD and gyrB, P. pertucinogena was the type strain with the highest nucleotide diversity.

Intra-species genomic analysis

Seventeen complete genomes of Pseudomonas non-type strains (four P. aeruginosa, three P. fluorescens, four P. putida, three P. syringae, one P. mendocina, one ‘P. entomophila’ and one P. stutzeri) were retrieved from the NCBI genome databases, and the intra-species similarities were calculated for the four selected genes and compared with the respective type strain of the species. The four strains of P. aeruginosa (PAO1, LESB58, UCBPP-PA14 and PA7) were clearly affiliated with the P. aeruginosa type strain (99.4–99.9%). The three P. syringae strains were affiliated with the P. syringae SG. Pseudomonas syringae B728a (pv. syringae) was affiliated with the P. syringae type strain (98.7%). Pseudomonas syringae DC3000 (pv. tomato) and P. syringae 1448A (pv. phaseolicola) were affiliated with the P. syringae group, although in a branch close to the P. avellanae (98.7%) and P. ficuserectae (96.1%) type strains (Fig. S1).

Pseudomonas fluorescens Pf-5 clustered in the SG P. chlororaphis (94.4% P. aureofaciens), and Pf0-1 clustered in the SG P. koreensis (P. koreensis 96.5%). Pseudomonas mendocina ymp and P. stutzeri A1501 were close to the corresponding type strains, and ‘P. entomophila’ L48 was close to P. mosselii (95.7%). The four P. putida strains were included in the P. putida branch, but one of them (W619) was closely affiliated with P. plecoglossicida (94.7%), in a branch different and distant from that of the P. putida type strain, and the other three strains (F1, GB1, KT2440) were closely affiliated with P. monteilii (96.7–96.8%).

Genomic analysis

The blast analyses of the complete Pseudomonas genomes are shown in Table S2. The species P. aeruginosa, P. syringae and P. putida formed three well-defined groupings, with at least 70% DNA conservation. Pseudomonas fluorescens strains were more diverse. The highest level of DNA conservation between strains of two different species was 47%.

Average nucleotide identity based on blast (ANIb) values showed a clear cut between 79% and 84% (Table S2). Comparisons between strains assigned to different species were always below 79%, and comparisons between strains of the same species were higher than 84%, with one exception, the pair P. fluorescens Pf-5 and Pf0-1. However, the strains were located in different branches of the phylogenetic tree and their identifications may need to be revised. A second clear cut was detected between 63% and 73% in ANIb. When the Pseudomonas strains were compared with C. japonicus, all of the ANIb values were below 67%. The lowest value for Pseudomonas comparisons was 73%, clearly separating both genera. Azotobacter vinelandii was located in the lower borderline of the Pseudomonas species (73–77%), but the close taxonomic position between both genera has been pointed out recently (Rediers et al., 2004).

Average nucleotide identity based on MUMmer (ANIm) values were very similar to ANIb for comparisons between strains assigned to the same species but did not clearly distinguish species of the genus Pseudomonas from Azotobacter or Cellvibrio, which are equally distant from each other and from Pseudomonas species (Table S3).

Calculation of tetranucleotide frequencies (TETRA analysis) demonstrated the inclusion of A. vinelandii within the Pseudomonas species (Table S3). Comparisons between strains of the same species were higher than 99%, with the only exception, again, being the pair P. fluorescens Pf-5 and Pf0-1. The highest value observed between strains of different species was 97%.

Phylogenetic distances in the analysis of the four concatenated genes were compared with all indices calculated in the whole genome analyses. The gap between species that was detected in the concatenated analysis (90–93%) corresponded to approximately 50–60% in the blast analysis of whole genomes and to 79–84% in the ANIb analysis. Figure 7 shows, for example, the correlation between ANIb and the phylogenetic distances measured with the four concatenated genes. Except for the strains Pf-5 and Pf0-1, the comparisons correlated very well. Three squares indicate the levels of similarity in the comparisons. Square A, delimited by the coordinates ANIb (93–100%) and phylogenetic similarity (97–100%), includes comparisons of strains of the same species that share at least 83% conserved DNA. Square B (ANIb 84–93%; similarity 93–97%) includes comparisons between strains of pathovars of P. syringae and comparisons between strains assigned to P. putida that involve strains GB1 or W619, both located in a branch separate from the P. putida type strain in the phylogenetic tree. The range of conserved DNA in square B was 50–62%. Rectangle C (ANIb 73–79%; similarity 82–90%) includes comparisons between strains of different Pseudomonas species, sharing less than 47% conserved DNA. A clear cut can be seen between ANIb values of 79–84% and similarities of 90–93%, with the only exception being the comparisons between strains Pf-5 and Pf0-1.
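The three regions of Fig. 7 can be written down as a small lookup, shown here as a sketch. The boundaries are the ones quoted in the text; the function name, the handling of values that fall exactly on a shared boundary and the example values are assumptions made for illustration.

```python
# Region boundaries as quoted in the text: (ANIb range, four-gene similarity range).
REGIONS = {
    "A": ((93.0, 100.0), (97.0, 100.0)),  # same species
    "B": ((84.0, 93.0), (93.0, 97.0)),    # pathovars / separate P. putida branches
    "C": ((73.0, 79.0), (82.0, 90.0)),    # different Pseudomonas species
}


def classify_comparison(anib, similarity):
    """Return 'A', 'B' or 'C' for a pair-wise comparison, or None if it
    falls in the gaps between regions (for example ANIb 79-84%).

    Assumption: a value lying exactly on a shared boundary (for example
    ANIb = 93) is assigned to the first matching region, here 'A'.
    """
    for name, ((a_lo, a_hi), (s_lo, s_hi)) in REGIONS.items():
        if a_lo <= anib <= a_hi and s_lo <= similarity <= s_hi:
            return name
    return None


# Hypothetical values, not taken from Table S2.
print(classify_comparison(98.5, 99.0))
print(classify_comparison(81.0, 91.5))
```

A comparison that returns None lies in one of the gaps, which is where the text places the pair Pf-5 and Pf0-1.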

Figure 7.

Plotted values of ANIb versus phylogenetic distance calculated for four genes in the 16 Pseudomonas strains for which the complete genome is known.

Discussion

Previous work on Pseudomonas intra-generic phylogenetic relationships (Moore et al., 1996; Anzai et al., 2000) considered only the 16S rRNA gene. Although this is a powerful tool for genus assignments, it does not discriminate sufficiently at the inter-species level (Yamamoto et al., 2000). The high degree of conservation of the rrs gene (an advantage for its universality) leads to a small number of informative sites in its sequence. Its utility has been questioned because of its intra-species heterogeneity, and it often fails to reveal precise and statistically supported phylogeny at the species level (Tayeb et al., 2005). Many arguments support the utility of other housekeeping core genes (Konstantinidis et al., 2006) for phylogenetic studies: they are a class of highly expressed, highly conserved, protein-encoding genes that exhibit a high degree of codon bias. These genes evolve more slowly than typical protein-coding genes but more rapidly than rRNA genes (Tayeb et al., 2005). It has been proposed that an MLSA of housekeeping core genes provides an accurate approach for the phylogenetic analysis of the genus Pseudomonas. It is an adequate methodology that is easy to handle, quick and highly reproducible. It has even been considered as a possible substitute for DNA–DNA hybridization procedures (Maiden et al., 1998; Gevers and Coenye, 2007; Richter and Rosselló-Móra, 2009). The main purpose of this article is to use a multi-genic approach to give a current view of the phylogenetic interrelationships of such a complex genus. This need becomes apparent because of the high number of new Pseudomonas species described each year. The first step is to know the organization of the Pseudomonas type strains; this can be used as a backbone to incorporate new isolates within the framework of the type strains. For many studies, an updated analysis covering all species in the genus is needed to facilitate the phylogenetic ascription of new isolates to previously identified species or the description of novel species.

The selection of the genes to be studied can be approached from two points of view. One possibility is to look for universally distributed genes that are present in all bacteria; another is to select a set of genes that can be used within all strains of a particular species, genus or family. The reasoning behind the second option is that genes that are informative for a given genus or family may not be useful or even present in other taxa (Konstantinidis and Tiedje, 2005a; Gevers and Coenye, 2007). Konstantinidis and colleagues (2006) indicate the lack of a robust and highly accurate method of measurement that can be used as a reference standard with which to compare the informativeness of every gene in the genome and hence, to identify the best-performing genes for phylogenetic purposes. Therefore, the selection of genes for the present study was based empirically on successful studies previously performed with Pseudomonas (Yamamoto and Harayama, 1998; Yamamoto et al., 2000; Hilario et al., 2004; Tayeb et al., 2008) or with species of this genus, such as P. stutzeri (Cladera et al., 2004), in which new genomovars have been predicted and described (genomovar 10 and genomovar 19) (García-Valdés et al., 2003; Mulet et al., 2008). The genes have also been tested for members of the other genomovars (gv11–gv18) (Mulet et al., 2008) and have been useful for the reclassification of P. chloritidismutans as a junior name of P. stutzeri (Cladera et al., 2006a) and for the phylogenetic affiliation of previously misclassified strains (Cladera et al., 2006b). The usefulness of another gene, oprF, has been tested in nine Pseudomonas species by Bodilis and Barray (2006).

Aside from the 16S rRNA genes, the theoretical selection of the other three genes (gyrB, rpoD and rpoB) is supported by studies of Pseudomonas and other genera. When several proteins were studied in 175 completely sequenced genomes, RpoB and GyrB proteins, together with FusA (GTP-binding protein chain elongation factor EF-G), tRNA synthetases (IleS) and RecA (recombination protein), were the markers that showed considerable robustness (Konstantinidis and Tiedje, 2005a). Therefore, rpoB has been postulated as a discriminating gene that could be used for routine identification of Pseudomonas laboratory isolates (Tayeb et al., 2005) or clinical isolates (Adékambi et al., 2009), as well as for other genera, such as Microbacterium or Aureobacterium (Richert et al., 2007).

Additionally, rpoD fulfils the requirements for an MLSA marker: it is a housekeeping, protein-coding gene; it is present in one copy in Pseudomonas; it is not transferred laterally; it has a high discriminative power (eight times higher than rpoB); it is widely distributed; and it is long enough to contain significant information but short enough to allow convenient sequencing. rpoD contains an appropriate amount of phylogenetic information (resolution) and is neither too conserved nor too variable. In this article, we compared pairs of similarity values between strains (4000 pairs) of rpoD with gyrB and rpoB with the 16S rRNA gene; the ratio of discriminatory power was 2:4:16. The rpoD gene shows the highest range of similarities (46–99%) throughout the genus Pseudomonas.

Our results clearly demonstrate that the analysis of three concatenated genes (16S rRNA, gyrB and rpoD) is sufficient for a reliable phylogenetic analysis of the genus. The inclusion of rpoB may be necessary in some cases, but it does not improve resolution when discriminating type strains. Konstantinidis and colleagues (2006) pointed out that three genes could also be adequate for the analyses of Escherichia coli, Salmonella, Burkholderia and Shewanella groups when compared with their whole genomes. We have also demonstrated for P. stutzeri a good correlation between similarity indices in the comparisons of six genes (16S rRNA gene, rpoD, gyrB, nosZ, nirS and nahH) (Cladera et al., 2004) and three genes (16S rRNA gene, rpoD, gyrB) (Mulet et al., 2008).

A test was performed to determine whether the addition of more genes would improve resolution. Phylogenetic trees were constructed with the available genes of 13–15 Pseudomonas type strains, including the genes carA, atpD and recA, which were previously published by Hilario and colleagues (2004). These genes were added consecutively to the concatenated sequences, and three (2870 nt), four (3788 nt), five (4414 nt), six (5190 nt) and seven genes (5790 nt) were compared (data not shown). The topology of the trees was the same in all cases and followed the guidelines of the concatenated tree of three genes, with the same sublineages and distribution. The bootstrap values are comparable (greater than 700).

On the other hand, some studies indicate that the analysis of even one carefully selected gene may be equal to or even surpass the power of DNA–DNA hybridization in assigning related bacterial isolates to a species (Zeigler, 2003); Adékambi and collaborators support this argument for the rpoB gene (Adékambi et al., 2009).

Coherent phylogenetic groups and subgroups have been defined by comparing the individual genes selected, the consensus and concatenated phylogenetic gene trees, and the bootstrap values in each branch. Distances were calculated with the Jukes-Cantor method, and the neighbour-joining method was used for tree building. These methods are routinely used in Pseudomonas taxonomic studies, and their accuracy has been widely demonstrated (Yamamoto et al., 2000; Hilario et al., 2004; Sarkar and Guttman, 2004; Frapolli et al., 2007; Mulet et al., 2008).
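For readers unfamiliar with the Jukes-Cantor correction, the following sketch shows the standard formula d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The function names and the gap handling are illustrative assumptions and do not reproduce the software used in the study.

```python
import math


def p_distance(a, b):
    """Observed proportion of differing sites between two aligned sequences.

    Assumption: columns containing a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in columns) / len(columns)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p.

    The correction is undefined for p >= 0.75 (saturation).
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# One difference in ten aligned sites: p = 0.1, corrected distance about 0.107.
p = p_distance("ACGTACGTAC", "ACGTACGTAA")
print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as p because it accounts for multiple substitutions at the same site.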

Groups and subgroups were considered to be stable when their members were maintained independently of the gene or clustering method used. Under these considerations, the P. fluorescens subgroup is composed of 20 species (almost 20% of the Pseudomonas type strains analysed), although it forms a clear, stable complex with monophyletic branches. The intra-group minimal similarity value was 92.9% between the 20 members, while in other groups, such as the P. straminea group with only three members, the minimal similarity value of 92.4% was similar. This demonstrated that diversity within groups can vary enormously. Minimal similarity in the P. stutzeri group, which has four species, was 88.2%. This value was maintained when 45 P. stutzeri strains were included in the analysis (Mulet et al., 2008). The addition of new P. stutzeri strains did not modify the topology of the Pseudomonas tree. The genetic diversity of each one of the groups was studied, and the results showed that the P. aeruginosa lineage included groups with high values of nucleotide diversity. The P. anguilliseptica group is the only one in the P. fluorescens lineage with high nucleotide diversity.

As stated by Spiers and colleagues (2000), the extraordinary phenotypic and genetic diversity within Pseudomonas showed no definite pattern of distribution that could precisely define any of the lineages. In fact, we attempted to find specific phenotypic traits that could be characteristic of and differentiate between groups, but our attempt was not successful. For this reason, a meaningful account of the evolutionary history of the genus Pseudomonas needs to be well supported by robust inferred phylogenies based on molecular data (Hilario et al., 2004).

The ability to sequence complete genomes represents a revolution in bacterial phylogenetic and evolutionary studies and, consequently, in bacterial taxonomy. As of September 2009, 17 strains of species in the genus Pseudomonas have been sequenced and 38 are in progress or are in the annotation process. This new tool will allow greater knowledge of the genus and will most likely offer higher accuracy compared with traditional methods. For many routine microbiological studies, however, simplified methods are needed; therefore, at least for some time, both whole genome sequencing and MLSA approaches will coexist. Although each Pseudomonas genome typically contains approximately 5000 genes, only about half (c. 2000 gene families) are conserved across all Pseudomonas genomes that have been sequenced to date (Jensen et al., 2004). In the present study, we compared two types of sequence-based approaches, a multi-genic sequence analysis and several complete genome comparisons, to assess the congruency of both methods. An important issue to keep in mind is that the strains for which whole genomes have been sequenced are not type strains, and their taxonomic identifications have been performed mainly with traditional methods; when new species were discovered, the taxonomy of the sequenced strains was not revised. It will be worthwhile in the near future to sequence the type strains of the most important species in the genus, or at least the representatives of each phylogenetic group, because the type strains are the taxonomic representatives of their species. Konstantinidis and Tiedje (2005a) point out the problem of representative strains that have been assigned to a species without a comparison to the type strain of the species; hence, their species designation may be questionable. None of the 17 completely sequenced Pseudomonas strains is a type strain. When analysed phylogenetically, all were included in the corresponding group, although in several cases (eight strains of P. fluorescens, P. syringae and P. putida), they did not affiliate with the corresponding type strain of the species in the phylogenetic tree. In another case, ‘P. entomophila’ has not been described formally as a new species, i.e. the species name is not validly published and, thus, has no standing in nomenclature. The taxonomy of these species will most likely need to be revised in the near future (Bossis et al., 2000). Silby and colleagues (2009) concluded from their whole genome comparisons that the genomic heterogeneity detected among the three strains of P. fluorescens is reminiscent of a species complex rather than of a single species. In fact, Goris and colleagues (2007) demonstrated that strains Pf-5, Pf0-1 and SBW25 do not belong to the same species.

This is not the situation for P. aeruginosa strains. Pseudomonas aeruginosa is a clear, compact, defined species with no biovar, pathovar, serovar or genomovar, although recently, new marine ecotypes of P. aeruginosa have been found and analysed with an MLST approach (Khan et al., 2008). Pseudomonas aeruginosa is the ideal type species for the genus because of its relative phenotypic and genotypic homogeneity. The similarity between P. aeruginosa strains ranged from 99.4% to 99.8%.

There have been many attempts to establish clear species boundaries, or ‘cut-offs’, between bacterial species, based on bioinformatics analysis of comparisons of whole genomes or phylogenetic similarities between strains. The recognized 70% genomic DNA–DNA similarity by direct hybridization is the experimental level that is accepted for species circumscriptions in bacterial taxonomy. Conserved DNA measured with blast in the 16 sequenced genomes determines a species boundary of approximately 80%. This corresponds to a phylogenetic distance of 97% in the multi-genic analysis. ANIb is very well correlated with the phylogenetic distance calculated for the selected housekeeping genes, especially when strains of the same species are compared (R2 = 0.953), indicating that the four genes represent the whole genome very well. A 94–95% similarity in the ANIb has been proposed as the minimal value between strains of the same species (Konstantinidis and Tiedje, 2005b; Richter and Rosselló-Móra, 2009). The limit is 93% in the ANIb for P. aeruginosa and P. putida strains, which corresponds to 97% in the MLSA analysis for the genus Pseudomonas. In our study, a second clear cut was found at 84% ANIb, corresponding to a phylogenetic distance of 93%, above which only comparisons of strains of the same phylogenetic group were included. They were clearly distinguished from members of other phylogenetic groups, as described in the phylogenetic trees, confirming again the consistency of the established groups. ANIm analysis also seems to be useful in Pseudomonas for species discrimination. The strains that were compared in a pair-wise manner and located in square B of Fig. 7 might be considered by some taxonomists to be members of the same species but not by others, because species boundaries in bacteriology are not fixed and should be adapted to the groups of organisms under consideration (e.g. pathovars of P. syringae or genomovars of P. stutzeri). The taxonomic status of the P. syringae strains has been discussed by several authors who propose consideration of the pathovars as different genospecies (Gardan et al., 1999); they are on the borderline of the ANIm species discrimination. The TETRA values are very homogeneous between species in the genus, also including A. vinelandii, which has been proposed to be included in the Pseudomonas genus (Rediers et al., 2004). However, C. japonicus is clearly separated. Therefore, TETRA may be a good candidate for genus circumscription.

As a concluding remark, we consider the MLSA approach to be a good tool for studying the phylogeny of the genus Pseudomonas, as well as for ascribing novel strains to known species. Our proposal is a gradual approximation to infer whether a new isolate belongs phylogenetically to the genus Pseudomonas and to a precise species within the genus. The first step is the sequencing of the 16S rRNA gene with universal primers, which allows determination of the location of the strain under study in the genus. A second step would be the sequencing of the rpoD and gyrB genes, with the combined distance matrix clarifying in which group or subgroup the strain should be included. For some groups, such as the P. oryzihabitans, P. oleovorans, P. asplenii, P. koreensis and P. fragi groups, analysis of the rpoB sequence may be needed to reach a higher resolution. We will further increase and continuously update our PseudoMLSA database, which currently contains more than 1928 entries, to contribute to an accurate phylogeny of the genus and the location of novel strains under study.

Experimental procedures

Bacterial strains

One hundred and seven Pseudomonas type strains described until 2008 were used in this study and are compiled into a taxonomic database (http://www.bacterio.cict.fr) (Euzéby, 1997). These strains are listed in Table 2. Strains were obtained from culture collections or from the authors who described the species. Strains were grown in appropriate media and temperatures according to the recommendations of the culture collections.

DNA extraction, PCR amplification and DNA sequencing conditions

DNA extraction procedures were previously described by Cladera and colleagues (2004). PCR amplification and DNA sequencing conditions for 16S rRNA and gyrB gene primers were described previously in Cladera and colleagues (2004), with some modifications: for the gyrB primers UP-1E/APrU, an initial hot-start step at 96°C for 15 min was added (92 strains). A new set of primers was used when the UP-1E/APrU primers did not amplify the DNA samples: BAUP2/APrU (14 strains) and gBMM1F/gBMM725R (one strain). The rpoB primers LAPS5/LAPS27, described in Tayeb and colleagues (2005), were used routinely for 102 strains. In the remaining five strains, a new set of primers (VIC4/LAPS27) was used. The PCR conditions used for this combination of primers for the gyrB and rpoB genes were the same as for the rpoD primers PsEG30F/PsEG790R (Mulet et al., 2009). The rpoD primer set (PsEG30F/PsEG790R) amplified the 107 Pseudomonas strains tested. PCR conditions have been described in Mulet and colleagues (2009). Table 2 shows the strains in which different combinations of primers were used, and Table S4 lists the primers used for amplification.

The same primers were used for both amplification and sequencing reactions of 16S rRNA, gyrB, rpoD and rpoB genes. Only the gyrB amplicon required a specific pair of sequencing primers [M13(-21)/M13R] (Table S4).

Clone libraries and screening of the clones

A cloning procedure was used when more than one amplicon was obtained (four cases, Table 2). The pGEM®-T Easy Vector System (Promega, USA) was used to generate the clone library according to the manufacturer's instructions. The PCR product was ligated into the pGEM®-T vector and transformed into E. coli JM109 high-efficiency competent cells. Library screening has been described previously in Mulet and colleagues (2009).

Sequence analysis

The sequence analysis procedures have been previously described by Mulet and colleagues (2009).

Individual, consensus and concatenated trees

A series of individual trees from the 16S rRNA, gyrB, rpoB and rpoD partial gene alignments was generated. A consensus analysis of three (16S rRNA, gyrB, rpoD) and four (16S rRNA, gyrB, rpoD and rpoB) genes was performed (Mulet et al., 2008). Concatenated gene trees were constructed with individual alignments in the following order: 16S rRNA, gyrB, rpoD and rpoB. The lengths and nucleotide positions refer to the P. aeruginosa type strain: positions 1 to 1295 for the 16S rRNA gene (X06684), 43 to 843 for the gyrB gene (AB039386), 42 to 759 for the rpoD gene (AB039607) and 40 to 954 for the rpoB gene (AJ717442). Concatenated sequences of three genes have a length of 2870 nucleotides, and the sequence of four genes is 3726 nucleotides long.
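The concatenation step can be sketched as follows. The gene order is the one stated above; the data layout (one dictionary of aligned sequences per gene), the function name and the toy sequences are assumptions for illustration only.

```python
GENE_ORDER = ("16S rRNA", "gyrB", "rpoD", "rpoB")


def concatenate_alignments(alignments, order=GENE_ORDER):
    """Join per-gene aligned sequences strain by strain, in a fixed gene order.

    `alignments` maps gene name -> {strain name -> aligned sequence}.
    Only strains present in every requested gene are kept (an assumption).
    """
    for gene in order:
        lengths = {len(seq) for seq in alignments[gene].values()}
        if len(lengths) > 1:
            raise ValueError(f"sequences for {gene} are not aligned to equal length")
    shared = set.intersection(*(set(alignments[gene]) for gene in order))
    return {
        strain: "".join(alignments[gene][strain] for gene in order)
        for strain in sorted(shared)
    }


# Hypothetical toy alignments, far shorter than the real fragments.
toy = {
    "16S rRNA": {"s1": "ACGT", "s2": "ACGA"},
    "gyrB": {"s1": "GG", "s2": "GC"},
    "rpoD": {"s1": "T-", "s2": "TT"},
    "rpoB": {"s1": "A", "s2": "C"},
}
three_genes = concatenate_alignments(toy, order=("16S rRNA", "gyrB", "rpoD"))
four_genes = concatenate_alignments(toy)
print(three_genes["s1"], four_genes["s1"])
```

Keeping the order fixed means that a given position in the concatenated sequence always refers to the same gene in every strain.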

Nucleotide diversity and statistical analysis

Nucleotide diversity, number of monomorphic and polymorphic sites, dN/dS ratio, and G+C content from the sequenced fragments of the different genes were calculated with the DnaSP program (DNA sequence polymorphism), version 4.50.3 (http://www.ub.es/dnasp) (Rozas et al., 2003). These data were calculated for each of the type strains, referenced to the P. aeruginosa type strain, to each of the groups or subgroups established and to the complete set of 107 Pseudomonas type strains.
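As an illustration of two of the quantities listed above, the sketch below computes nucleotide diversity as the average number of pair-wise differences per site, together with the number of polymorphic sites. It assumes gap-free aligned sequences and is not a reimplementation of DnaSP; the function names and the toy data are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average proportion of differing sites over all pairs of sequences.

    Assumption: sequences are aligned, of equal length and gap-free.
    """
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return differences / (len(pairs) * length)


def polymorphic_sites(seqs):
    """Number of alignment columns with more than one nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


# Hypothetical toy alignment: pair-wise differences are 1, 2 and 1 over 4 sites.
toy = ["AAAA", "AAAT", "AATT"]
print(nucleotide_diversity(toy), polymorphic_sites(toy))
```

Monomorphic sites follow directly as the alignment length minus the number of polymorphic sites.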

Whole genome comparisons

Conserved DNA.  Seventeen complete Pseudomonas genomes were retrieved from the NCBI database: P. aeruginosa LESB58 (NC_011770); P. aeruginosa PAO1 (NC_002516); P. aeruginosa PA7 (NC_009656); P. aeruginosa UCBPP-PA14 (NC_008463); P. entomophila L48 (NC_008027); P. fluorescens Pf0-1 (NC_007492); P. fluorescens SBW25 (NC_012660); P. fluorescens Pf-5 (NC_004129); P. mendocina ymp (NC_009439); P. putida F1 (NC_009512); P. putida KT2440 (NC_002947); P. putida W619 (NC_010501); P. putida GB-1 (NC_010322); P. stutzeri A1501 (NC_009434); P. syringae pv. phaseolicola 1448A (NC_005773); P. syringae pv. syringae B728a (NC_007005) and P. syringae pv. tomato DC3000 (NC_004578).

For each strain, a blastn (megablast) analysis was performed using the NCBI server against the whole nucleotide collection. The query coverage, which is the percentage of the query length included in the aligned segments, was considered to be a measure of conserved DNA or percentage of homologous fragments between both genomes.
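Query coverage as defined above can be sketched as the merged length of the aligned query intervals divided by the query length. The code assumes 1-based inclusive coordinates and counts overlapping segments once; the function name and the example intervals are hypothetical, and the NCBI server may compute its figure differently in detail.

```python
def query_coverage(hits, query_length):
    """Percentage of the query covered by at least one aligned segment.

    `hits` is a list of (start, end) positions on the query, 1-based and
    inclusive (an assumption). Overlapping or adjacent segments are merged
    so that no position is counted twice.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hits):
        if cur_end is None:
            cur_start, cur_end = start, end
        elif start <= cur_end + 1:
            cur_end = max(cur_end, end)      # extend the current merged block
        else:
            covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end  # start a new block
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return 100.0 * covered / query_length


# Hypothetical segments: 1-150 after merging (150 nt) plus 301-400 (100 nt).
print(query_coverage([(1, 100), (50, 150), (301, 400)], 1000))
```

Merging matters for whole-genome queries, where repeated regions can otherwise be counted more than once.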

ANIb, ANIm and TETRA.  ANIb values were calculated using blast, as described by Goris and colleagues (2007). ANIm was calculated using the MUMmer software as indicated in Richter and Rosselló-Móra (2009). Tetranucleotide frequencies, correlation coefficients used and algorithm were described in Teeling and colleagues (2004). All calculations were implemented in the work package JSpecies (http://www.imedea.uib.es/jspecies).
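The idea behind the tetranucleotide comparison can be illustrated with a deliberately simplified sketch: it counts the 256 tetranucleotides on one strand and correlates the raw relative frequencies of two sequences. This is an assumption-laden stand-in and not the procedure of Teeling and colleagues (2004) or its JSpecies implementation, which this sketch does not attempt to reproduce; all names and sequences are hypothetical.

```python
import math
from itertools import product

TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_frequencies(seq):
    """Relative frequency of each of the 256 tetranucleotides (one strand only).

    Words containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence too short or without valid tetranucleotides")
    return [counts[t] / total for t in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / math.sqrt(vx * vy)


# Hypothetical toy sequences; real comparisons use whole genomes.
seq_1 = "ACGTTGCAAGCTTAGGCTAACGT" * 5
seq_2 = "AAAAAAAAAAAACCCC"
print(pearson(tetra_frequencies(seq_1), tetra_frequencies(seq_1)))
print(pearson(tetra_frequencies(seq_1), tetra_frequencies(seq_2)))
```

A sequence compared with itself gives a coefficient of 1, and compositionally different sequences give lower values, which is the behaviour the TETRA index exploits.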

Nucleotide sequence accession numbers

The nucleotide sequences determined in this study have been deposited into the PseudoMLSA database (http://www.uib.es/microbiologiaBD/Welcome.php) and EMBL database under the following accession numbers: gyrB gene, FN554166 to FN554233; rpoB, FN554726 to FN554765; rpoD, FN554447 to FN554518. All sequences used in this article are shown in Table S5.

Acknowledgements

The type strains of P. guineaeT and P. simiaeT were generous gifts from Dr Ma J. Montes and F. Fernández-Garayzábal. We are indebted to P. De Vos, who helped with the first steps of this study. We thank A. Bennasar for his collaboration and advice regarding phylogenetic analysis. We are grateful to M. Richter and R. Rosselló-Móra for their useful comments and for their help with ANI and TETRA calculations. This work was supported by projects CGL 2006-09719, CGL 2008-03242/BOS from the CICYT (Spain) and FEDER funding. M.M. was the recipient of a predoctoral fellowship from the Plà Balear de Recerca i Desenvolupament Tecnològic de les Illes Balears (PRIB).
