Nonagricultural reservoirs contribute to emergence and evolution of Pseudomonas syringae crop pathogens



  • While the existence of environmental reservoirs of human pathogens is well established, less is known about the role of nonagricultural environments in emergence, evolution, and spread of crop pathogens.
  • Here, we analyzed phylogeny, virulence genes, host range, and aggressiveness of Pseudomonas syringae strains closely related to the tomato pathogen P. syringae pv. tomato (Pto), including strains isolated from snowpack and streams.
  • The population of Pto relatives in nonagricultural environments was estimated to be large and its diversity to be higher than that of the population of Pto and its relatives on crops. Ancestors of environmental strains, Pto, and other genetically monomorphic crop pathogens were inferred to have frequently recombined, suggesting an epidemic population structure for P. syringae. Some environmental strains have repertoires of type III-secreted effectors very similar to Pto, are almost as aggressive on tomato as Pto, but have a wider host range than typical Pto strains.
  • We conclude that crop pathogens may have evolved through a small number of evolutionary events from a population of less aggressive ancestors with a wider host range present in nonagricultural environments.


When epidemic clones of bacterial crop pathogens emerge, they often spread around the world very quickly, causing significant economic damage. Recent examples include a clone of Pseudomonas syringae pv. aesculi, which causes the devastating bleeding canker disease of horse chestnut in Europe (Green et al., 2010), P. syringae pv. actinidiae, which causes a dramatic kiwifruit canker epidemic in Europe and New Zealand (Mazzaglia et al., 2012), and Xanthomonas arboricola strains that cause new diseases on various tree species in Europe (Hajri et al., 2012). The processes by which the ancestors of these pathogens turned into highly aggressive crop pathogens are still poorly understood and the routes and modes of their international spread are elusive.

For several human pathogenic bacteria, environmental reservoirs have well-defined roles in evolution and disease epidemiology. For example, the causal agent of Legionnaire's disease, Legionella pneumophila, is ubiquitous in freshwater, has coevolved with amoebae, and the mechanisms for its pathogenicity to humans appear to be derived from those to amoebae (Albert-Weissenberger et al., 2007). As another example, the type III secretion systems of Pseudomonas aeruginosa and Vibrio parahaemolyticus, once assumed to be only important for pathogenesis in humans, are likely to have important roles for survival in soil and sea water in the face of predation from protists (Matz et al., 2008, 2011). Also Vibrio cholerae, Escherichia coli and Salmonella enterica are well known for having environmental reservoirs associated with disease emergence and epidemiology, in which the genetic and phenotypic diversity exceeds by far that found in the human hosts (Winfield & Groisman, 2003; Vezzulli et al., 2010). In some pathogens, like V. cholerae, virulence genes that enable human pathogenesis are only present in certain epidemic clones but are absent from closely related environmental isolates (Faruque & Mekalanos, 2012). By contrast, in other pathogens such as S. enterica, a core repertoire of pathogenicity genes is present in the majority of strains (Jacobsen et al., 2011). Moreover, for some human pathogens, the relationship between epidemic clones that cause severe disease outbreaks and overall species diversity is well understood. For example, Neisseria meningitidis has a so-called epidemic population structure (Maynard Smith et al., 1993): genetically monomorphic epidemic clones emerge from a frequently recombining diverse population.

Surprisingly, the study of evolution and epidemiology of bacterial plant pathogens has so far focused almost exclusively on strains isolated from crops in agricultural environments (Morris et al., 2009). Only recently, characterization of bacteria belonging to the P. syringae species complex isolated from headwaters of rivers in North America, Europe, and New Zealand (Morris et al., 2010) and from precipitation, snowpack, and leaf litter (Morris et al., 2010; Monteil et al., 2012) has revealed an immense genetic diversity of P. syringae outside of agricultural habitats. These observations clearly illustrate that the P. syringae strains known as crop pathogens represent only a small fraction of the global P. syringae metapopulation and that P. syringae life history and evolution are inextricably connected to the freshwater cycle (Morris et al., 2008). Based on multilocus sequence typing (MLST) (Maiden et al., 1998), some of the environmental isolates were found to have the same haplotypes as strains isolated from crops (Morris et al., 2008). This supports earlier observations that precipitation and surface waters may be direct sources of inoculum for crop diseases (Riffaud & Morris, 2002).

However, the role of environmental diversity in the emergence and evolution of P. syringae crop pathogens has not yet been taken into consideration (Sarkar & Guttman, 2004; Yan et al., 2008; Cai et al., 2011a,b). Hence, it is not surprising that the population structure of P. syringae appeared endemic and clonal when analyzing only strains from diseased crops (Sarkar & Guttman, 2004). Our hypothesis is that highly aggressive host-specialized P. syringae crop pathogens may have evolved from less aggressive ancestors with broader host range adapted to diverse plant communities typical of preagricultural times (and even today's nonagricultural environments). This scenario is similar to what has been proposed for fungal and oomycete pathogens (Stukenbrock & McDonald, 2008). However, no evidence for such a scenario was found for P. syringae when analyzing crop pathogens alone (Cai et al., 2011b).

Here we compare P. syringae pv. tomato (Pto), an intensively studied crop pathogen (Buell et al., 2003), with closely related P. syringae strains isolated from snowpack and pristine creek headwaters in terms of evolutionary relationships, virulence gene repertoires, host range, and aggressiveness both to test the hypothesis that host-specific P. syringae crop pathogens emerged from a diverse environmental population, and to investigate the evolutionary mechanisms at the basis of pathogen emergence and crop adaptation. In support of our hypothesis, we find that the close environmental relatives of Pto constitute a large and genetically diverse population. While some of these environmental strains are indistinguishable from some Pto strains, others share alleles with different crop pathogens at different loci, indicating that recent ancestors of current crop pathogens and environmental strains recombined. Moreover, environmental strains are equipped with repertoires of virulence genes necessary for plant pathogenesis that are very similar to those of crop pathogens. Nonetheless, these strains are less aggressive on tomato than the most aggressive Pto strains. Based on these results, we propose not only that compartments of the water cycle and other nonagricultural environments constitute a reservoir of current crop pathogens but also that today's highly aggressive and host-specific crop pathogens may have emerged from a frequently recombining environmental population through a small number of evolutionary events.

Materials and Methods

Bacterial strains

Environmental strains were isolated from samples described previously (Morris et al., 2010), except for CSZ0223, which was isolated in a separate study. All samples are described in Supporting Information Table S1 and the strains chosen for in-depth characterization are listed in Table 1. All but one of the crop strains, CFBP 5524, were also described previously (Cai et al., 2011b). Strain CFBP 5524 is the P. syringae pv. spinaceae pathotype strain and was obtained from the French Collection of Plant Associated Bacteria (CFBP) (Angers, France).

Table 1. Pseudomonas syringae strains used in this study were collected between 2007 and 2010 in the indicated geographic locations and from the indicated substrates and showed high identity to Pseudomonas syringae pv. tomato strains T1 (PtoT1), DC3000 (PtoDC3000), the P. syringae pv. spinaceae pathotype strain or the P. syringae pv. apii pathotype strain based on the gyrB and cts genes
Region | Site | Latitude, longitude | Altitude | Substrate | P. syringae population size^a | Reference | Strain | % DNA identity in gyrB to strain | MLST | Bulk genome sequencing
France, Alpes-de-Haute-Provence County | Col-de-Vars meadow | 44°32′12″N, 06°42′07″E | 2100 m | Snow | 2.75 | Monteil et al. (2012) | CCV0611 | 100% PtoT1 | Yes | No
 | Sauze creek, east branch | 44°20′40″N, 6°40′54″E | 2100 m | Water | 4.32 | Morris et al. (2010) | SZ003 | 99.8% spinaceae PT^b | Yes | Yes
 | | | | | | | SZ014 | 100% PtoT1 | Yes | Yes
 | | | | | | | SZ015 | 99.8% spinaceae PT | No | Yes
 | Sauze creek, source | 44°20′11″N, 6°40′15″E | 2500 m | Water | 1.85 | Morris et al. (2010) | SZ135 | 99.8% apii and PtoT1 | Yes | Yes
 | Soudane creek, downstream | 44°21′14″N, 06°42′05″E | 1600 m | Water | 3.80 | This study | CSZ0223 | 99.8% apii and PtoT1 | Yes | Yes
 | Super Sauze meadow | 44°20′40″N, 06°41′57″E | 2000 m | Snow | 5.32 | Monteil et al. (2012) | CSZ0292 | 100% apii PT | Yes | Yes
 | | | | | | | CSZ0295 | 100% apii PT | No | Yes
 | | | | Snow | 2.01 | | CSZ0326 | 100% apii PT | No | Yes
 | | | | Snow | 5.30 | | CSZ0914 | 100% PtoT1 | Yes | Yes
New Zealand, Central Otago, South Island | Schoolhouse Creek | 45°12′05.14″S, 168°59′09″E | 677 m | Water | 3.45 | Morris et al. (2010) | AI001 | 100% PtoDC3000 | Yes | Yes
 | | | | | | | AI056 | 100% PtoDC3000 | No | Yes
 | | | | | | | AI088 | 100% PtoDC3000 | No | Yes
 | | | | | | | AI103 | 100% PtoDC3000 | Yes | No

a Population size expressed in log(CFU l⁻¹) of water or snowmelt.
b PT, pathotype.

Primers, PCR and sequencing of PCR products

Most primers used for PCR have been described previously (Yan et al., 2008; Cai et al., 2011b). New primers were designed for the cheA1 locus (cheA1-F, GAAGCCCGTGAGCTGTTG; cheA1-R, CAAATCCGTCAGCGAGAAG), the cheA2 locus (cheA2-F, TTAGGGAGCACCCCATGA; cheA2-R, GCAACCCCGGATTCAAATA), and the fliC locus (fliC-F, GCCGGAAGCCACGTAGTA; fliC-R, TCATATCCATGACCATCACCTC). DNA extraction, PCR, and sequencing were performed as previously described (Yan et al., 2008; Cai et al., 2011b).

Phylogenetic and population genetic analyses

Sequences were processed and uploaded as previously described (Almeida et al., 2010). Evolutionary models for each individual gene fragment and for concatenated sequences were determined in jModelTest (Posada, 2008), applying the likelihood ratio (LR) test, and are listed in Table S4. Trees were constructed in MrBayes 3.1.2 (Huelsenbeck & Ronquist, 2001; Ronquist & Huelsenbeck, 2003). Trees based on concatenated gene sequences were constructed using the evolutionary model determined for the entire concatenated sequence (we call these trees ‘concatenated trees’). Consensus trees based on all sequenced genes or a subset of genes were constructed using partition-specific evolutionary models after creating partitions corresponding to each individual gene fragment. A total of 10⁸ Markov Chain Monte Carlo (MCMC) iterations were used for the concatenated sequences (with a sampling frequency of one in every 5000 iterations) and 10⁷ iterations were used for individual gene fragments (with a sampling frequency of one in every 2000 iterations). The burn-in was set to 25%. Convergence was evaluated by drawing a trace plot of recorded swap acceptance rates from the MCMC run and by visually determining that swap acceptance rates stabilized. Consensus trees were obtained in MrBayes for concatenated trees and for individual gene trees. The consensus trees based on individual gene trees were constructed in PAUP. To test phylogenetic congruence between trees, the Shimodaira–Hasegawa (SH) test (Shimodaira & Hasegawa, 1999) was performed in PAUP 4.0. Neighbor-joining trees were also constructed in PAUP 4.0.
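As a rough check on the sampling scheme, the number of sampled trees retained after burn-in can be worked out directly. The sketch below reads the iteration counts as 10^8 and 10^7 (the superscripts appear to have been lost in the text) and assumes that the burn-in is discarded as a fraction of the sampled trees; the function name is illustrative and is not part of MrBayes.

```python
def retained_samples(iterations: int, sample_every: int, burn_in_fraction: float) -> int:
    """Count the sampled states kept after discarding the burn-in fraction."""
    total_samples = iterations // sample_every       # states written to the sample file
    discarded = int(total_samples * burn_in_fraction)  # states dropped as burn-in
    return total_samples - discarded


# Concatenated data set: 10^8 iterations, one sample every 5000 iterations.
concatenated_kept = retained_samples(10**8, 5000, 0.25)

# Individual gene fragments: 10^7 iterations, one sample every 2000 iterations.
single_gene_kept = retained_samples(10**7, 2000, 0.25)

print(concatenated_kept, single_gene_kept)
```

Under these assumptions, the consensus for the concatenated data set would rest on 15 000 sampled trees and each individual gene tree on 3750.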

Split decomposition analysis was performed using the Neighbornet method in SplitsTree 4 (Huson & Kloepper, 2005). The same evolutionary models used for phylogenetic tree construction and 10 000 bootstrapping replicates were applied.

MEGA 4.0 was used to calculate average pairwise genetic distances using the method of Jukes–Cantor (Tamura et al., 2007). The ratio of nonsynonymous to synonymous substitutions (dN/dS) was estimated in PAML (Yang, 2007). LDhat 2.1 (McVean et al., 2002; Auton & McVean, 2007) was used to estimate the population-scale mutation rate (θ), the population-scale recombination rate (ρ), and Tajima's D. DnaSP v5.10.1 (Librado & Rozas, 2009) was used to determine the confidence limits of Tajima's D.
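For readers unfamiliar with the Jukes–Cantor correction, the distance between two aligned sequences is d = −(3/4) ln(1 − 4p/3), where p is the proportion of sites that differ. The following is a minimal sketch of that standard formula and not a reproduction of the MEGA implementation; the function names are illustrative, and gaps or ambiguous bases are not handled.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance d = -(3/4) * ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("the correction is only defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy example: one difference in four sites.
d = jukes_cantor(p_distance("ACGT", "ACGA"))
```

For closely related sequences such as those analyzed here, p is small and the corrected distance is only marginally larger than p itself.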

Plant infections

Plants were grown and inoculated as previously described (Yan et al., 2008; Cai et al., 2011b). Inoculations were repeated at least three times for each strain/plant combination.

Bulk genome sequencing and identification of sequences similar to type III effectors

Illumina sequencing was performed at the University of Exeter, Exeter, UK. Equal amounts of genomic DNA from each of the environmental isolates were pooled together and sequenced in a single lane on the Illumina HiSeq2000 using the manufacturer's standard protocols. A total of 48 148 804 paired-end 100 bp reads were generated and assembled using Velvet 1.1.04 (Zerbino & Birney, 2008). As the resulting assembly was a composite of 12 different genomes, sequence reads in the assembly could not be traced back to individual genomes, making it impossible to reliably detect the presence of truncated alleles. Therefore, all effector sequences found in the assembly were simply compared with the repertoire of effector sequences present in the two tomato pathogens T1 and DC3000 without considering whether an effector sequence represents a functional full-length gene or not.
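The expected sequencing depth per pooled genome can be estimated from the read count given above. This sketch assumes that the reported figure counts individual 100 bp reads, that the 12 genomes are equally represented in the pool, and that each genome is about 6 Mb; the genome size is an order-of-magnitude placeholder and not a value reported in this study.

```python
READS = 48_148_804             # reads reported for the pooled lane
READ_LENGTH_BP = 100           # read length reported in the text
POOLED_GENOMES = 12            # strains pooled in equal amounts
ASSUMED_GENOME_BP = 6_000_000  # hypothetical genome size, not from this study

total_bases = READS * READ_LENGTH_BP
coverage_per_genome = total_bases / (POOLED_GENOMES * ASSUMED_GENOME_BP)

print(f"{total_bases} bp in total, about {coverage_per_genome:.0f}x per genome")
```

Because the strains are closely related, shared regions will collapse into common contigs during assembly, so the effective depth of those contigs would be higher than this per-genome figure.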


Results

P. syringae strains closely related to Pto are present in snowpack and pristine river headwaters

Pseudomonas syringae strains were isolated from precipitation, snowpack, leaf litter, and headwaters of creeks in France and New Zealand upstream of any agricultural activity (Table S1). Among these strains, 238 were randomly selected for sequencing of the cts locus (Sarkar & Guttman, 2004). As can be seen in the phylogenetic tree in Fig. S1, 70 of these strains (43 previously described by Morris et al. (2010) and 27 strains sequenced here for the first time) belong to the same phylogenetic group as the intensively studied and well-characterized Arabidopsis and tomato pathogen PtoDC3000 (Buell et al., 2003), the highly virulent host-specific Pto strain T1 (Cai et al., 2011a) and other closely related crop pathogens (Yan et al., 2008; Cai et al., 2011b). Among the 70 strains, 26 shared > 99% nucleotide sequence identity at a second locus, gyrB (Hwang et al., 2005), with either PtoDC3000 (Buell et al., 2003) or PtoT1 (Cai et al., 2011a) or strains of their close relatives P. syringae pv. spinaceae, a spinach pathogen (Bull et al., 2011; Gironde & Manceau, 2012), and P. syringae pv. apii, a celery (Apium graveolens) pathogen (Yan et al., 2008; Cai et al., 2011b).

Environmental Pto relatives share alleles with P. syringae crop pathogens at several core genome loci

Nine strains (Table 1) representative of the diversity of the 26 environmental strains most similar to either T1 or DC3000 were characterized by MLST for 13 loci to infer the evolutionary relationship between crop pathogens and environmental strains. Besides the core genome loci that we previously used (Yan et al., 2008; Cai et al., 2011b), three additional loci were sequenced: fliC coding for the flagellum subunit flagellin, and cheA1 and cheA2 coding for signaling components of two hypothetical chemotaxis pathways in P. syringae (C.R. Clarke et al., unpublished). These three loci were chosen because of the important role of FliC and of chemotaxis in plant–microbe interactions (Felix et al., 1999; Cai et al., 2011a; C.R. Clarke et al., unpublished). The majority of the 13 analyzed gene fragments contain a small number of nonsynonymous mutations compared with synonymous mutations, which is reflected in dN/dS ratios much lower than 1 (Table 2) and indicates purifying selection. However, compared with all other genes, fliC has the highest dN/dS ratio and a Tajima's D value > 2, indicating that the evolution of fliC significantly departed from neutrality, which suggests that this locus is, in fact, under selection to avoid recognition by plant immune receptors, as observed previously (Cai et al., 2011a).
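Tajima's D compares the mean number of pairwise differences (π) with the number of segregating sites (S) scaled by the sample size. The sketch below implements the standard formula of Tajima (1989) for illustration only; it is not the LDhat or DnaSP code used for Table 2, the function names are illustrative, and it does not compute significance.

```python
import math
from itertools import combinations
from typing import List, Tuple


def summarize_alignment(seqs: List[str]) -> Tuple[int, float]:
    """Return (segregating sites S, mean pairwise differences pi) for aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return seg_sites, total_diffs / len(pairs)


def tajimas_d(n: int, seg_sites: int, pi: float) -> float:
    """Tajima's D from sample size n, segregating sites S, and pairwise diversity pi."""
    if n < 4 or seg_sites == 0:
        raise ValueError("D is undefined for fewer than 4 sequences or no segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy alignment (hypothetical sequences, not data from this study).
s, pi = summarize_alignment(["AAAA", "AAAT", "AATT", "AATT"])
d_toy = tajimas_d(4, s, pi)
```

A positive D, as reported for fliC, means that π exceeds the value expected from S under neutrality, that is, alleles at intermediate frequency are over-represented.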

Table 2. Sequenced Pseudomonas syringae loci, their length, number and percentage of segregating sites, number of alleles, average pairwise genetic distance calculated with the Jukes–Cantor correction, Tajima's D, and ratio of nonsynonymous (dN) to synonymous (dS) mutations
Locus name | Length | Number of segregating sites | Ratio segregating sites | No. of alleles | Average pairwise genetic distance | Tajima's D^a | dN/dS
acnB | 555 | 17 | 0.03 | 13 | 0.007 (0.002) | −0.201 ns | 0.0001
cheA1 | 597 | 40 | 0.07 | 13 | 0.018 (0.003) | 0.157 ns | 0.1780
cheA2 | 588 | 9 | 0.02 | 9 | 0.003 (0.001) | −0.8 ns | 0.1137
fliC | 846 | 47 | 0.06 | 7 | 0.023 (0.004) | 2.269 * | 0.2208
gap1 | 600 | 19 | 0.03 | 10 | 0.003 (0.001) | −2.02 * | 0.0569
gltA | 504 | 7 | 0.01 | 9 | 0.005 (0.002) | 0.789 ns | 0.0001
gyrB | 696 | 18 | 0.03 | 14 | 0.007 (0.002) | 0.034 ns | 0.0164
kup | 1059 | 55 | 0.05 | 21 | 0.009 (0.002) | −1.09 ns | 0.0246
pgi | 564 | 14 | 0.02 | 9 | 0.005 (0.001) | −0.752 ns | 0.0207
PSPTOT1_0038 | 597 | 17 | 0.03 | 10 | 0.008 (0.002) | 0.466 ns | 0.0142
PSPTOT1_2359 | 498 | 19 | 0.04 | 15 | 0.010 (0.003) | 0.028 ns | 0.0864
PSPTOT1_1665 | 435 | 15 | 0.03 | 12 | 0.006 (0.002) | −0.996 ns | 0.0392
rpoD | 636 | 24 | 0.04 | 17 | 0.013 (0.003) | 1.124 ns | 0.0237

a ns, not significant. *P < 0.05.
All values were calculated based on the strains listed in Fig. 1. Full sequences of each allele can be found and downloaded from

Allelic distribution at the 13 analyzed loci (Fig. 1) revealed that two environmental strains, AI001 and AI103, isolated from a creek in New Zealand upstream of any agricultural activity, are 100% identical to Pto strain DC3000 at all loci. The other environmental strains show varying degrees of allelic overlap with crop pathogens, sharing alleles at two to nine loci. Importantly, environmental strains share alleles at different loci with different crop strains, strongly suggesting recombination. For example, environmental strain SZ014 has the same allele at the fliC locus as crop strains of pv. persicae, but shares an allele with strains of pvs maculicola, tomato, and antirrhini (but not with pv. persicae) at the cheA2 locus. Conversely, crop strain T1 has the same allele at the fliC locus as environmental strains CSZ0292, CSZ0223, SZ135, and CSZ0914, while at the gyrB locus T1 has the same allele as the environmental strain SZ003, a strain with which it does not share alleles at any other locus. At the pgi locus, environmental and crop strains show yet another pattern: six of seven environmental strains from France are identical to each other and the seventh strain shows only one single-nucleotide variation, making the pgi locus the most conserved locus among environmental strains from France. Interestingly, this locus is of average allelic diversity among crop pathogens and, as for all other loci, alleles are shared between environmental strains and crop pathogens.

Figure 1.

Allele composition at all sequenced loci for all analyzed crop strains and environmental strains. Environmental strains are highlighted in blue. The total number of alleles for each locus is reported at the bottom of each column. Loci are organized by increasing number of alleles, and strains are sorted based on allele numbers, starting with the locus with the fewest numbers of alleles (i.e. fliC). Alleles shared between strains PtoT1 or PtoDC3000 and environmental strains are in bold and underlined. Pma, Pseudomonas syringae pv. maculicola; Pto, Pseudomonas syringae pv. tomato; Ppe, Pseudomonas syringae pv. persicae; Pan, Pseudomonas syringae pv. antirrhini; Pbe, Pseudomonas syringae pv. berberidis; Pap, Pseudomonas syringae pv. apii; Pla, Pseudomonas syringae pv. lachrymans; Psp, Pseudomonas syringae pv. spinaceae.

In summary, environmental strains and crop pathogens share alleles at all analyzed loci, while the allelic distribution pattern changes from locus to locus. This strongly suggests that ancestors of environmental strains and crop strains frequently recombined. This is further substantiated by the population genetic recombination analysis discussed in the following section.
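The reasoning behind this inference can be made concrete with a small sketch: for a focal strain, list the strains that carry the same allele at each locus, then check whether those sets of partners disagree between loci. The allele numbers below are hypothetical placeholders chosen to mirror the SZ014 example in the text; they are not the values shown in Fig. 1, and the function names are illustrative.

```python
from typing import Dict, Set

# strain -> locus -> allele number (hypothetical placeholders, not Fig. 1 data)
Profiles = Dict[str, Dict[str, int]]


def sharing_partners(profiles: Profiles, focal: str) -> Dict[str, Set[str]]:
    """For each locus, return the strains that carry the same allele as the focal strain."""
    partners: Dict[str, Set[str]] = {}
    for locus, allele in profiles[focal].items():
        partners[locus] = {
            strain
            for strain, alleles in profiles.items()
            if strain != focal and alleles.get(locus) == allele
        }
    return partners


def is_discordant(partners: Dict[str, Set[str]]) -> bool:
    """True if two loci link the focal strain to non-overlapping sets of strains."""
    non_empty = [p for p in partners.values() if p]
    return any(
        a.isdisjoint(b)
        for i, a in enumerate(non_empty)
        for b in non_empty[i + 1:]
    )


toy_profiles: Profiles = {
    "SZ014": {"fliC": 1, "cheA2": 1},
    "Ppe": {"fliC": 1, "cheA2": 2},
    "Pma": {"fliC": 2, "cheA2": 1},
    "Pto": {"fliC": 3, "cheA2": 1},
}

partners = sharing_partners(toy_profiles, "SZ014")
discordant = is_discordant(partners)
```

Discordance of this kind is consistent with recombination but does not prove it on its own, which is why the allele patterns are followed up with the formal tests described in the next section.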

Evidence for recombination between environmental strains and crop pathogens

Phylogenetic relationships among environmental strains and crop strains were investigated by Bayesian inference (Huelsenbeck & Ronquist, 2001). Individual gene trees and trees based on either the concatenated data set or the consensus of all individual loci were built. Trees were then compared with each other using the SH test (Shimodaira & Hasegawa, 1999). As expected from the discordant allele distribution at the different loci (Fig. 1), most trees built on individual loci are incongruent with the sequences at all other loci (Table S2). However, the trees based on the combined sequences were congruent with the sequences at all individual loci, with the exception of two loci: gyrB and cheA1. A final tree (shown in Fig. 2) was thus built on the concatenated sequences excluding gyrB and cheA1 (because of the strength of the conflicting data at these two loci) and excluding fliC and gap1 (because these two loci had been found not to evolve neutrally; Table 2).

Figure 2.

Bayesian consensus trees based on the concatenated set of nine gene fragments listed in Table 2 (excluding cheA1, fliC, gyrB and gap1). Strains isolated from the environment are indicated in bold, while previously described crop pathogens are not. Pseudomonas syringae pv. syringae (Psy) B728a was chosen as outgroup since it is a well-characterized P. syringae strain outside of the PtoDC3000 clade (Sarkar & Guttman, 2004). Bayesian posterior probability values are given above branches. Pma, Pseudomonas syringae pv. maculicola; Pto, Pseudomonas syringae pv. tomato; Ppe, Pseudomonas syringae pv. persicae; Pan, Pseudomonas syringae pv. antirrhini; Pbe, Pseudomonas syringae pv. berberidis; Pap, Pseudomonas syringae pv. apii; Pla, Pseudomonas syringae pv. lachrymans; Psp, Pseudomonas syringae pv. spinaceae. *The environmental strains AI001 and AI103 are identical to PtoDC3000 in all sequenced loci.

The tree in Fig. 2 shows that the seven environmental strains from France are embedded within clades of related crop pathogens, while the environmental strains AI001 and AI103 from New Zealand are identical to strain DC3000. However, we cannot be confident in the tree's topology considering that many clades have posterior probabilities far below 0.95. The majority of clades also have very low support in a tree built by neighbor-joining using the same concatenated sequences and in a consensus Bayesian tree built from trees of individual loci (Fig. S2a,b). We conclude that it was the presence of too many conflicting signals that prevented us from obtaining trees with significant statistical support, strongly suggesting recombination (Didelot & Maiden, 2010). In fact, a phylogenetic network, which can accommodate conflicting signals as reticulation (Huson & Bryant, 2006), was also constructed and showed a high degree of reticulation (Fig. 3). The reticulation at the base of the branches connecting the crop pathogens Pbe CFBP1727 and ATCC13454 with the environmental strains CSZ0223, SZ135, CSZ0914, and SZ003 is just one example indicative of gene exchange between ancestors of crop pathogens and of environmental relatives.

Figure 3.

Split decomposition analysis of the concatenated set of nine gene fragments listed in Table 2 (excluding cheA1, fliC, gyrB and gap1), that is, the same gene fragments used for construction of the phylogenetic tree shown in Fig. 2. Only bootstrap values higher than 70 are shown. Pma, Pseudomonas syringae pv. maculicola; Pto, Pseudomonas syringae pv. tomato; Ppe, Pseudomonas syringae pv. persicae; Pan, Pseudomonas syringae pv. antirrhini; Pbe, Pseudomonas syringae pv. berberidis; Pap, Pseudomonas syringae pv. apii; Pla, Pseudomonas syringae pv. lachrymans; Psp, Pseudomonas syringae pv. spinaceae.

Recombination can occur not only between loci but also within loci. Therefore, the relative contribution of recombination and mutation to the sequence diversity at each individual locus was quantified with LDhat (McVean et al., 2002). LDhat uses a population genetic approach to estimate population-level recombination and mutation rates and previously revealed that recombination contributed more than mutation to the diversity among 23 out of the 24 crop strains used here (Cai et al., 2011b). Repeating the same test on the crop pathogens alone, the combined group of crop pathogens and environmental strains, or the environmental strains alone revealed that the ratio of recombination rates to mutation rates for many genes is higher in the environmental strains than in the crop strains and higher when considering crop and environmental strains combined than in the crop strains alone (Fig. 4).

Figure 4.

The ratios of population-scale recombination rate (ρ) to population-scale mutation rate (θ) for all crop and environmental strains (black bars), environmental strains (gray bars), and crop strains (white bars) were estimated in LDhat 2.1 (McVean et al., 2002; Auton & McVean, 2007). All numerical values used in Fig. 4 are listed in Table S3. Zero values are the result of recombination rates equal to zero.

P. syringae strains from crops and environmental strains have partially overlapping repertoires of genes coding for type III secreted effectors

To determine whether environmental strains contain genes coding for type III-secreted effectors, a pool of 12 environmental strains (listed in Table 1) was sequenced and assembled. The resulting assembly (called ‘environmental assembly’ from here on) was searched for all currently confirmed P. syringae type III-secreted effectors using TBLASTN (Table 3). Three interesting results were observed. First, sequences corresponding to all T3SS effector genes known to be present in Pto strain DC3000 were found in the environmental assembly. Secondly, sequences of four effectors present in strain T1 but not in strain DC3000 were also present in the assembly (avrRps4, hopAB3, hopAE1, hopW1) while four other effectors present in strain T1 but not in strain DC3000 (avrA1, avrD1, avrRpt2, and hopAW1) were not. Finally, three effector sequences found in the environmental assembly were only present in more distantly related P. syringae strains but neither in strain T1 nor DC3000: hopAZ1, hopBD2, and hopX2.

Table 3. Comparison of repertoires of DNA sequences with homology to Pseudomonas syringae type III-secreted effectors in 12 environmental strains sequenced in bulk (referred to as ‘environmental assembly’), strain DC3000 (PtoDC3000), and strain T1 (PtoT1), including additional genomes of the same genetic lineage (Cai et al., 2011a)
StrainsEffector sequences
Environmental assembly, PtoDC3000, and PtoT1 lineage avrE1 avrPto1 hopA1 hopB1 hopC1 hopD1 hopF2 hopH1 hopI1 hopM1 hopO1-1 hopQ1-1 hopR1 hopS1 hopS2 hopT1 hopT2 hopY1 hopAA1 hopAF1 hopAG1 hopAH1 hopAI1 hopAS1
Environmental assembly and PtoDC3000 (absent from PtoT1) hopE1 hopG1 hopK1 hopN1 hopU1 hopV1 hopX1 hopAB2 hopAD1 hopAM1 hopAO1 hopAT1
Environmental assembly and PtoT1 lineage (absent from PtoDC3000) avrRps4 hopAB3 hopAE1 hopW1
Environmental assembly (absent from PtoT1 and PtoDC3000) avrRpm2 hopAB1 hopAV1 hopAZ1 hopBB1 hopBD2 hopX2 hopZ1
PtoT1 lineage (absent from environmental assembly and PtoDC3000) avrA1 avrD1 avrRpt2 hopAW1
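The repertoire comparison in Table 3 amounts to a set of unions and differences, which can be sketched as follows. The effector names are transcribed from the rows of Table 3; the variable names and the Jaccard similarity are illustrative additions and not part of the original analysis.

```python
# Effector names transcribed from the rows of Table 3.
in_all_three = set(
    "avrE1 avrPto1 hopA1 hopB1 hopC1 hopD1 hopF2 hopH1 hopI1 hopM1 hopO1-1 "
    "hopQ1-1 hopR1 hopS1 hopS2 hopT1 hopT2 hopY1 hopAA1 hopAF1 hopAG1 hopAH1 "
    "hopAI1 hopAS1".split()
)
env_and_dc3000 = set(
    "hopE1 hopG1 hopK1 hopN1 hopU1 hopV1 hopX1 hopAB2 hopAD1 hopAM1 hopAO1 hopAT1".split()
)
env_and_t1 = set("avrRps4 hopAB3 hopAE1 hopW1".split())
env_only = set("avrRpm2 hopAB1 hopAV1 hopAZ1 hopBB1 hopBD2 hopX2 hopZ1".split())
t1_only = set("avrA1 avrD1 avrRpt2 hopAW1".split())

# Rebuild each repertoire as the union of the rows it appears in.
environmental = in_all_three | env_and_dc3000 | env_and_t1 | env_only
dc3000 = in_all_three | env_and_dc3000
t1_lineage = in_all_three | env_and_t1 | t1_only


def jaccard(a: set, b: set) -> float:
    """Shared effectors divided by all effectors found in either repertoire."""
    return len(a & b) / len(a | b)


dc3000_covered = dc3000 <= environmental        # every DC3000 effector is in the assembly
t1_missing_from_env = t1_lineage - environmental  # T1 effectors absent from the assembly
```

As noted in the Materials and Methods, the pooled assembly cannot distinguish full-length from truncated alleles, so membership here reflects sequence similarity only and not gene function.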

The effector gene hopM1 plays a particularly important role in P. syringae–plant interactions (Nomura et al., 2006) but is truncated in PtoT1 (Cai et al., 2011a) and in several other sequenced P. syringae crop strains (Baltrus et al., 2011), possibly because of selection to avoid recognition by the plant immune system (Baltrus et al., 2011). In light of its importance in pathogenicity and because sequences corresponding to hopM1 are present in the environmental assembly, the hopM1 gene was sequenced in all individual crop and environmental strains. The gene was found to be of full length in most strains but truncated in the barberry pathogens ATCC13454 and CFBP1727 and in one of the environmental strains, CSZ0914. This suggests that environmental strains are also under selection to avoid effector recognition.

Pto and environmental Pto relatives have similar aggressiveness on tomato

Because of the similarity in effector repertoires between Pto and environmental strains and because of their phylogenetic relatedness, we wondered whether the environmental strains can also cause disease on tomato. As expected, the environmental strains from New Zealand that are genetically identical to DC3000 were found to be as aggressive on tomato as strain DC3000 (data not shown). The other environmental strains had different degrees of aggressiveness. Fig. 5(a) shows that 4 d after spray inoculation of leaves of the tomato cv ‘Rio Grande’, all environmental strains grew to a population density within two orders of magnitude of the highly aggressive Pto strain K40 (which belongs to the same genetic lineage as strain T1 (Cai et al., 2011a)) and DC3000. Moreover, all environmental strains attained population densities greater than that of strain M6 (nonpathogenic on tomato). In particular, strain CCV0611 consistently reached a population density almost as large as that of strain K40 and strain DC3000 and induced disease symptoms that were almost as severe (Fig. 5b). Similar results were obtained for a second tomato cultivar (data not shown). We therefore conclude that environmental strains such as CCV0611 could represent progenitors of strains that may emerge as new tomato pathogens in the future.

Figure 5.

Bacterial growth (a) and disease symptoms (b) on tomato (cv ‘Rio Grande’). (a) Bacterial population densities were determined 4 d postinoculation. Population densities are indicated as colony-forming units (CFU) cm⁻² on a log scale. Pseudomonas syringae pv. tomato (Pto) strains DC3000 and K40 are examples of aggressive strains and Pseudomonas syringae pv. maculicola M6 is a strain that does not cause disease on tomato. Means associated with the same letter are not significantly different (pairwise Student's t-test, P < 0.05). Error bars represent ± SE. (b) Pictures of disease symptoms were taken 4 d postinfection.

Environmental strains have a wider host range than the PtoT1 lineage and other closely related crop strains

To determine whether the environmental strains analyzed here are part of the population of hypothetical wide-host-range ancestors from which host-specialized crop pathogens could have emerged, their host range was tested on the same plant species for which host range of Pto and related crop pathogens had been tested previously (Cai et al., 2011b): Arabidopsis thaliana, cauliflower (Brassica oleracea), celery (Apium graveolens), and snapdragon (Antirrhinum majus). As predicted, on celery and cauliflower most environmental strains were more aggressive than Pto strain K40 but less aggressive than the crop pathogens isolated from these crops (Table 4). Also, disease symptoms caused by the environmental strains were more severe than those caused by strain K40 but less severe than those caused by the respective crop strains (Fig. S3). However, on Arabidopsis (ecotype ‘Columbia’) and snapdragon, the environmental strains did not grow significantly better than K40 and did not cause any symptoms. These results show that environmental strains closely related to Pto have a wider host range than some host-specific crop strains like T1 but are not generalists.

Table 4. Fitness of Pseudomonas syringae strains isolated from snowpack and water compared with closely related crop strains on four plant species


Discussion

The processes underlying the evolution and emergence of highly aggressive and host-specific bacterial crop pathogens and their spread around the world are poorly understood. The reason may be that the study of bacterial crop pathogens has so far focused on strains isolated from diseased plants (e.g. Sarkar & Guttman, 2004), neglecting their relatives that exist in nonagricultural environments (Morris et al., 2010). This lack of inclusion of environmental strains is also true for our previous studies with regard to the evolution of the model pathogen strains PtoT1 and PtoDC3000 (Yan et al., 2008; Cai et al., 2011a,b).

Here, c. 10% of P. syringae isolated from precipitation and pristine surface water were closely related to Pto. If we consider the population size of P. syringae in aquatic habitats to be 10^20, as estimated by Morris et al. (2010), the environmental population size of Pto relatives is c. 10^19 CFU. Moreover, based on the sequenced gene fragments, each environmental sample contained different strains. Therefore, the relatively small number of environmental strains that were chosen here for further comparison with crop pathogens clearly represents only the ‘tip of the iceberg’ of the much wider genetic diversity of strains closely related to Pto that exist in the environment. Importantly, the inferred large population size and high genetic diversity of Pto relatives in the environment strongly suggest that the small number of genetically monomorphic crop pathogens like Pto emerged from this environmental metapopulation.
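The arithmetic behind this estimate can be written out as a minimal sketch. It assumes, as the text does, that the proportion of Pto relatives among the isolates sampled here can be extrapolated to the whole aquatic population; the function and variable names are illustrative only.

```python
# Back-of-the-envelope estimate using the two figures quoted in the text.
total_aquatic_population = 1e20  # P. syringae in aquatic habitats (Morris et al., 2010)
fraction_pto_relatives = 0.10    # c. 10% of isolates were closely related to Pto


def estimate_relatives(total_cfu: float, fraction: float) -> float:
    """Return the estimated size of a subpopulation given its sampled fraction."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return total_cfu * fraction


pto_relatives = estimate_relatives(total_aquatic_population, fraction_pto_relatives)
print(f"Estimated Pto relatives: {pto_relatives:.0e} CFU")
```

The result is an order-of-magnitude figure only, because both inputs are themselves rough estimates.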

Our results also shed light on the evolutionary mechanisms by which Pto may have emerged. Based on the discordant allele distribution at 13 loci, conflicting phylogenetic signals, and higher contribution of recombination than mutation to diversity among strains, it is very likely that recent ancestors of Pto and closely related crop pathogens frequently recombined with environmental strains. This is in stark contrast to the vision of Pto evolution that was conceived based on strains collected from diseased cultivated tomato around the world, which belong to a single clonally expanded genetic lineage typical of genetically monomorphic pathogens (Cai et al., 2011a). These earlier results on Pto were in agreement with a clonal population structure proposed for P. syringae crop pathogens based on MLST (Sarkar & Guttman, 2004). However, by integrating results from the combined analysis of environmental strains and crop pathogens with earlier results from the study of crop pathogens alone (Sarkar & Guttman, 2004; Cai et al., 2011a), our work reveals that P. syringae may, in fact, have an epidemic population structure similar to that of N. meningitidis (Maynard Smith et al., 1993) or P. aeruginosa (Pirnay et al., 2009): (1) genetic lineages of P. syringae recombine frequently in the environment; (2) epidemic clones like Pto occasionally emerge out of recombining populations after reaching a crop in an agricultural field; and (3) then expand on that crop over large areas (possibly worldwide), acquiring the apparent highly clonal endemic population structure previously proposed (Sarkar & Guttman, 2004). Stochastic processes are likely to be important for initial contact with a crop plant under environmental conditions that lead to successful colonization and disease. Subsequently, crop production practices (e.g. monoculture, pruning, dissemination of plant propagation materials, and the vegetative propagation of certain crop plants) contribute to the rapid and expansive spread of the successful clonal lines.

Interestingly, not all loci showed the same degree of recombination, suggesting that some of them are under stronger selection than others. One of these loci is fliC. This locus stands out because the number of alleles is lower than at any other sequenced locus; the mean pairwise genetic distance between alleles is the highest of all loci; and, most importantly, fliC is the locus at which the highest number of environmental and crop strains share the same alleles. This is a clear indication of extensive and recent horizontal gene transfer of a few, but very different, fliC alleles, which confer a strong selective advantage to recipient strains. As groups of environmental strains and crop strains share identical fliC alleles, we conclude that these groups of strains are exposed to the same selection pressure. An intriguing question concerns the nature of the selection pressure. The gene fliC codes for flagellin, the structural subunit of the bacterial flagellum, known for its plant immunity-triggering activity (Felix et al., 1999; Sun et al., 2006; Cai et al., 2011a). Therefore, the question arises: do the different alleles trigger different strengths of immune responses in different plant species and are they thus the result of adaptation to different hosts? Our data reveal that all fliC alleles share identical sequences in the plant immunity-triggering flg22 epitope (Felix et al., 1999) and in the key residues of the plant immunity-triggering flgII-28 epitope (Cai et al., 2011a). Most differences between alleles are concentrated in a region C-terminal to flgII-28 (approximately between amino acid positions 150 and 200), which has not yet been thoroughly investigated for its possible role in triggering plant immunity. However, this region may trigger immunity in other organisms with which P. syringae might interact in the environment, such as algae or amoebae. Alternatively, the different alleles may represent adaptations for optimized swimming motility in different environments.
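The three per-locus quantities used above to single out fliC (number of alleles, mean pairwise genetic distance between alleles, and number of alleles shared between environmental and crop strains) can be illustrated with a minimal sketch. The sequences and strain names below are hypothetical toy data, not data from this study, and the uncorrected p-distance is used only as one simple choice of distance measure; it is not necessarily the measure applied in the analyses reported here.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_distance(alleles: list[str]) -> float:
    """Average p-distance over all unordered pairs of distinct alleles."""
    pairs = list(combinations(alleles, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: strain name -> allele sequence at a single locus.
env_strains = {"env1": "ACGTACGT", "env2": "ACGTACGT", "env3": "TTGTACAA"}
crop_strains = {"crop1": "ACGTACGT", "crop2": "TTGTACAA"}

env_alleles = set(env_strains.values())
crop_alleles = set(crop_strains.values())

alleles = sorted(env_alleles | crop_alleles)   # distinct alleles at the locus
n_alleles = len(alleles)
mean_dist = mean_pairwise_distance(alleles)
n_shared = len(env_alleles & crop_alleles)     # alleles found in both groups

print(n_alleles, mean_dist, n_shared)
```

A locus with few alleles, a large mean distance between them, and many alleles shared across groups would match the pattern described for fliC.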

Another locus that showed unusual genetic variation is pgi. This locus stands out for its extreme conservation among environmental strains from France. We hypothesize that either the pgi alleles themselves or alleles at genetically linked loci acquired by these strains confer a selective advantage in the ecological niche occupied by these strains and thus swept through the population, similar to what was recently described for loci in populations of ocean bacteria during ecological differentiation (Shapiro et al., 2012).

Results from the bulk sequencing of type III effector repertoires and comparison of environmental strains with crop pathogens with regard to host range and aggressiveness also illustrate the relevance of our findings to mechanisms underlying the emergence of P. syringae as a crop pathogen. In fact, type III effector repertoires of environmental strains largely overlap with the effector repertoires of Pto strains T1 and DC3000, and some of these strains are almost as aggressive on tomato as Pto. This reveals that P. syringae crop pathogens do not need to acquire essential pathogenicity genes before they emerge, as, for example, is the case for pandemic clones of V. cholerae (Faruque & Mekalanos, 2012). The presence of large type III effector repertoires in environmental strains also suggests that a small number of evolutionary events may be sufficient to turn a pre-existing environmental P. syringae strain into a pandemic clone. In the case of Pto, the two key differences between the analyzed environmental strains and the typical Pto strain T1 appear to be that the environmental strains carry a full-length hopM1 gene (except strain CSZ0914, in which it appears truncated) and lack the effector avrRpt2, a known virulence factor on tomato (Lim & Kunkel, 2005). Therefore, acquisition of avrRpt2 and loss of hopM1 could be sufficient to turn the analyzed environmental strains into a highly successful tomato pathogen like T1. In fact, since high concentrations and diversity of P. syringae strains were previously found in plant litter of grasslands in the southern French Alps (Monteil et al., 2012), we hypothesize that plant litter may be one of the ‘breeding grounds’ for P. syringae, where strains like a hypothetical T1 ancestor may have acquired avrRpt2 in the series of events that led to its emergence as a crop pathogen. Moreover, isolation of strains of the same haplotype as DC3000 (based on 13 loci) in a creek upstream of any agricultural activity suggests that even P. syringae crop pathogens may transit through the compartments of the water cycle. Therefore, precipitation and surface water may very well be significant inoculum sources for plant diseases, as previously suggested by Riffaud & Morris (2002).

If we assume that the analyzed environmental strains are, in fact, members of the population from which crop pathogens emerge, then the finding that environmental strains closely related to strain T1 are almost as aggressive on tomato as Pto strains, but have a somewhat wider host range, strengthens our hypothesis that highly aggressive crop pathogens evolved from less specialized strains with a broader host range (Cai et al., 2011b). Surprisingly though, neither T1 nor the closely related environmental strains can grow significantly on, or cause any disease on, A. thaliana or snapdragon. We thus conclude that crop pathogen ancestors were already adapted to some plant species/families but not to others. In particular, it appears that crop pathogens might have evolved from their ancestors in two different ways: by restricting their host range when adapting to a single crop, like PtoT1; or by switching hosts completely, like P. syringae pv. antirrhini (Pan). In fact, unlike most of the closely related crop-pathogenic strains or environmental strains, Pan causes disease on the ornamental snapdragon. This suggests that the Pan ancestor was not a snapdragon pathogen but completed a host jump made possible by an as yet unknown evolutionary event. Undoubtedly, whole-genome sequencing in combination with population genomic analyses will provide the means to identify this and many other evolutionary events at the basis of crop pathogen evolution, similar to the dramatic progress made in our understanding of the evolution and epidemiology of human pathogenic bacteria (Morelli et al., 2010; Didelot et al., 2011; Shapiro et al., 2012).

In conclusion, our data suggest that the current emphasis on coevolution of crop pathogens with crops in a resistance gene-driven fashion needs to be expanded to a more holistic paradigm of pathogen evolution that includes interaction with multiple plant species, other organisms, and the environment at large, as proposed previously (Morris et al., 2009). This complex process of evolutionary history of pathogens is widely accepted for numerous human pathogens with environmental reservoirs and could serve as an important framework for transforming the paradigms of plant pathogen evolutionary history.


We thank Dr Konrad Paszkiewicz and Dr Karen Moore for technical assistance with the Illumina DNA sequencing. Research in the Vinatzer laboratory was funded by the NSF, award number IOS 0746501.