Adaptation of bacteria to glyphosate: a microevolutionary perspective of the enzyme 5-enolpyruvylshikimate-3-phosphate synthase.

Glyphosate is the leading herbicide worldwide, but it also affects prokaryotes because it targets the central enzyme (5-enolpyruvylshikimate-3-phosphate synthase, EPSPS) of the shikimate pathway in the synthesis of the three essential aromatic amino acids in bacteria, fungi and plants. Our results reveal that bacteria may easily become resistant to glyphosate through changes in the EPSPS active site. This indicates the importance of examining how glyphosate affects microbe-mediated ecosystem functions and human microbiomes.


Background
Evolutionarily conserved 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) is the key enzyme of the shikimate pathway in the synthesis of three essential aromatic amino acids (phenylalanine, tyrosine and tryptophan) across taxa, including bacteria, fungi and plants (Bentley, 1990; Richards et al., 2006). The EPSPS enzyme is also involved in the production of para-aminobenzoate (McConkey, 1999). Thus, glyphosate may affect the synthesis of folate even in environments rich in phenylalanine, tyrosine and tryptophan. The EPSPS enzyme is also known as AroA (EC:2.5.1.19) and catalyses the reaction of shikimate-3-phosphate with phosphoenolpyruvate (PEP) to form EPSP, a step on the route from shikimate-3-phosphate to chorismate. Because the shikimate pathway does not occur in vertebrates, the use of glyphosate, which targets the EPSPS enzyme by competing with its second substrate, PEP, for the binding site (Schönbrunn et al., 2001), is regarded as safe for use in food production.
Due to its affordable price, effectiveness and broad-spectrum ability to kill weeds, glyphosate has become the most commonly used herbicide worldwide (Woodburn, 2000; Helander et al., 2012; Myers et al., 2016). Although the antibiotic properties of glyphosate are known (Kurenbach et al., 2015; Kurenbach et al., 2018), its possible effects on microbiomes (Funke et al., 2006; Motta et al., 2018; Gómez-Gallego et al., 2020; Ruuskanen et al., 2020) were largely neglected until recently (Motta et al., 2018; Mesnage et al., 2019). As microbes have driven eco-evolutionary processes since the origin of life, we note the importance of thoroughly understanding the possible undesirable effects of glyphosate on ecosystem structures, functions and services (Bote et al., 2019). Here, we propose that the wide use of glyphosate may affect microbial communities by (i) eradicating susceptible microbes and (ii) leading to the evolution of resistance.

Results and discussion
We analysed 663 microbial genomes distributed across 32 alignable tight genomic clusters (ATGCs) (Novichkov et al., 2009; Puigbò et al., 2014) with the EPSPSClass server (http://ppuigbo.me/programs/EPSPSClass) (Leino et al., 2020) to identify changes in sensitivity to glyphosate. The algorithm classifies EPSPS proteins based on the presence and absence of amino acid markers in the active site that determine whether the enzyme is putatively sensitive (class I) or resistant (class II, III and IV) to glyphosate (see methods). Currently, the majority of prokaryotic species belong to class I and are thus putatively affected by the herbicide (Leino et al., 2020). We analysed 32 ATGCs that amply represented bacteria and one archaeal group. The EPSPS sequence was not found in ATGC033 (Mycoplasma), small-genome single-membrane bacteria (Meseguer et al., 2003); ATGC046 (Rickettsia), obligate intracellular parasites (Zomorodipour and Andersson, 1999); and ATGC056 (Lactobacillus). Even though the Lactobacillus strains in the ATGC dataset do not have the EPSPS sequence, other strains have the enzyme and show differential sensitivity to glyphosate. Previous studies have shown that some Lactobacillus species are susceptible to glyphosate (Shehata et al., 2013).
A microevolutionary analysis of the EPSPS enzyme shows that phylogenetics best explains bacterial sensitivity to glyphosate (Fig. 1, Table 1 and Supplementary Table 1). In general, Firmicutes are significantly more (putatively) resistant to glyphosate than Proteobacteria, whereas Actinobacteria is the most sensitive group. Furthermore, bacterial lifestyle is also associated with sensitivity, i.e. facultative host-associated (FHA) bacteria are more sensitive to glyphosate than free-living (FL) bacteria (Supplementary Tables 2 and 3; Supplementary Fig. 1). However, an analysis with a greater number of samples would be necessary to test whether this association with lifestyle is independent of phylogeny. Intracellular parasites seem to be the most putatively sensitive group; however, due to the limited amount of data, we cannot determine statistical significance for this group's sensitivity. We hypothesize that sensitivity mirrors the strength of the selective force of glyphosate on bacteria. FL bacteria are directly exposed to glyphosate spray at high concentrations, whereas host-associated microbes can be exposed to lower concentrations when the glyphosate is moving through the plant from leaves to roots, and intracellular parasites might be protected by the cell membrane. Genomes with %GC below 40% tend to be more resistant to glyphosate (Fig. 2A). Spearman correlation coefficients between sensitivity and genome parameters from the original ATGC dataset (Puigbò et al., 2014) were not statistically significant (see methods and Supplementary Table 4). To statistically decouple the contributions of %GC, ATGC clusters (closely related phylogenetic species), number of genes, taxonomic groups (e.g., Actinobacteria, Proteobacteria, Firmicutes) and lifestyle (FL, FHA and intracellular parasites) to the bacterial sensitivity to glyphosate (sensitive, resistant, unknown), we carried out a multiple correspondence analysis (MCA) and built a random forest (RF) classification model. The MCA revealed that Firmicutes and FL bacteria are mostly resistant, whereas Actinobacteria, Proteobacteria and FHA bacteria appeared to be sensitive to glyphosate (Fig. 2B and C). The RF suggests that ATGC clusters (i.e. phylogenetics) are the most important determinant of sensitivity to glyphosate (Fig. 2D).
Fig. 1. Distribution of the predicted sensitivity to glyphosate across the species tree. Pie charts indicate the proportion of species that are putatively sensitive (green) or resistant (red) to glyphosate, and unclassified (black). See full description in Table 1.
EPSPS is the central enzyme in the metabolic pathway for the synthesis of three essential amino acids in bacteria (Steinrücken and Amrhein, 1980); thus, its protein sequence is presumably evolutionarily conserved (Novick and Doolittle, 2020). Indeed, the EPSPS sequence is highly conserved within 17 ATGCs that preserve sensitivity status across species. However, 12 ATGCs contained species with different sensitivities to glyphosate (Fig. 3, Table 1 and Supplementary Figs 2-13). Although the active site of the EPSPS is highly conserved in prokaryotes, in some species the sensitivity to glyphosate may be altered by a few changes in the active site (Fig. 3). In Enterobacteria (ATGC001), only Enterobacter R4 368 (out of 109 species) is putatively resistant to glyphosate, having recently acquired a second copy of the EPSPS via horizontal gene transfer (HGT) from a resistant strain of Klebsiella pneumoniae (Fig. 4). Other Proteobacteria clusters have species whose sensitivity status differs after a few amino acid changes in the active site (Fig. 3). This microevolutionary analysis shows that some bacteria may become putatively resistant to glyphosate through small changes, analogous to the antibiotic resistance mechanism (Kurenbach et al., 2018; Bote et al., 2019). Thus, the wide use of glyphosate may have a very large impact on the species diversity and composition of microbial communities, not only because of a potential purifying selection effect against sensitive bacteria but also because (i) some bacterial groups may adapt rapidly to become resistant to glyphosate and (ii) glyphosate-based herbicides may enhance multidrug resistance in bacteria (Liu et al., 2020). Previous studies have shown that glyphosate can affect the microbiome composition of insects, including bees (Motta et al., 2018) and potato beetles (Gómez-Gallego et al., 2020), and of birds.
Moreover, preliminary results suggest a putative effect on the human (Leino et al., 2020) and rat (Mesnage et al., 2019) gut microbiomes. In addition, glyphosate may affect ecosystem functions, including nutrient cycling and carbon balance, plant growth-promoting microbes (e.g. mycorrhizae and nitrogen-fixing bacteria), microbially mediated plant-herbivore and pathogen interactions, and pollination biology, depending on the related microbial community composition and its sensitivity to glyphosate (Helander et al., 2018; Gómez-Gallego et al., 2020). An analysis of variability in the amino acid landscape (see methods) shows that the amino acids in the active site of the EPSPS that bind PEP are 23% more conserved than the rest of the amino acid residues (Supplementary Fig. 14), in agreement with results from previous studies (Schönbrunn et al., 2001). Moreover, amino acids flanking the binding sites (±5 residues) also show higher-than-average levels of conservation, possibly due to selection preserving the function of the EPSPS. Thus, additional residues within the active site may determine sensitivity to the herbicide. Bacteria potentially more resistant to glyphosate tend to have fewer amino acid substitutions than sensitive ones, e.g. the binding amino acids of actinobacteria and FL bacteria (more resistant to glyphosate) tend to be more conserved (Fig. 5 and Supplementary Tables 5 and 6). Although some clusters are homogeneous and others heterogeneous with respect to glyphosate sensitivity, the active site is equally conserved in both (Supplementary Table 7). Further analyses of amino acid variabilities in EPSPS (Supplementary Table 8) should help elucidate the effects of glyphosate on microbial communities. Moreover, predictions based on amino acid markers may only partially explain resistance to glyphosate, because resistance may also arise through alternative mechanisms, such as efflux pumps or elevated expression of the EPSPS (Staub et al., 2012).

Conclusions
In conclusion, this study shows that phylogenetic groups and bacterial lifestyles are key factors determining sensitivity to glyphosate. Our results suggest that the sensitivity of bacterial species to glyphosate varies and can change within the short evolutionary time spanned by an ATGC. Moreover, microbes more exposed to the herbicide (FL bacteria) are putatively more resistant. Nevertheless, because single mutations in the EPSPS active site and rapid HGT events may change the sensitivity status, heavy use of glyphosate may impact microbial biodiversity independently of taxonomy and lifestyle. The central unanswered questions are what effects glyphosate-modulated microbiomes have (i) on ecosystem functions and services and (ii) on human wellbeing.

Protein sequences
Protein data were obtained from the ATGCs database, which contains data on >4.5 million proteins encoded in >1500 genomes of prokaryotes (approximately 60% of proteins and 62% of genomes from RefSeq as of June 2013) that met the same criteria as the original ATGCs (Novichkov et al., 2009), as described in Puigbò et al. (2014). The ATGC database provides an ideal environment for a microevolutionary analysis, as it provides a widespread representation of genes from a representative set of prokaryotes. However, other databases (e.g. the Clusters of Orthologous Groups (Galperin et al., 2019)) would be more suitable for a comprehensive macroevolutionary analysis of the EPSPS. EPSPS protein sequences were annotated through BLAST mapping onto the Clusters of Orthologous Genes (COG) database (Galperin et al., 2015). EPSPS belongs to COG0128 (category E, amino acid transport and metabolism). We analysed 32 ATGCs that amply represented bacteria and one archaeal group. Each ATGC contains at least 10 species by the definition in the original study (Puigbò et al., 2014), and up to 109 species are present in the enterobacterial cluster (ATGC001). The final ATGC dataset used in this study includes 332 FHA species (13 ATGCs), 312 FL species (17 ATGCs) and [...]. The EPSPS sequence was not found in ATGC033 (Mycoplasma), ATGC046 (Rickettsia) and ATGC056 (Lactobacillus). Even though Lactobacillus strains in the ATGC dataset do not have the EPSPS sequence, the enzyme is present in other strains (Supplementary Table 9).
Fig. 3. Examples of the differential predicted sensitivity to glyphosate in three Proteobacteria ATGCs. Phylogenetic species trees of three ATGCs. Filled boxes show bacteria putatively sensitive (green) and resistant (red) to glyphosate, and unclassified (black). Each box from left to right corresponds to Class I alpha, Class I beta, Class II, Class III, Class IV and unclassified. The amino acids in the active site are shown on the right side of the figure. A. Change from sensitive to resistant in Enterobacter R4 368 through putative horizontal gene transfer from Klebsiella pneumoniae (Fig. 4). B. Two independent changes from sensitive to unclassified in Propionibacterium with single amino acid changes. C. Change from resistant to unclassified in Campylobacter with several amino acid changes.

Classification of the EPSPS
Glyphosate inhibits the shikimate pathway by inhibiting the enzyme EPSPS (Schönbrunn et al., 2001; Abraham, 2020). The EPSPS enzymes were classified with the EPSPSClass server (http://ppuigbo.me/programs/EPSPSClass), which assesses the type of potential sensitivity of the EPSPS enzyme to the herbicide glyphosate (Leino et al., 2020). The amino acid markers used to determine potential sensitivity to glyphosate (Supplementary Table 10) are also available in Leino et al. (2020) and on the web server. EPSPS enzymes can be classified as class I (alpha or beta) (Light et al., 2016; Firdous et al., 2018), class II (Barry et al., 1997; Priestman et al., 2005), class III (Carozzi et al., 2006) or class IV (Lira et al., 2013) based on the presence of known amino acid markers (class I, class II and class IV) and motifs (class III) (Barry et al., 1997; Priestman et al., 2005; Carozzi et al., 2006; Lira et al., 2013; Light et al., 2016; Firdous et al., 2018). In addition, a large portion of microbial species are as yet unclassified, and their potential sensitivities to glyphosate are unknown (Leino et al., 2020). The four classes are distinguished by different sets of amino acid markers, defined relative to the reference EPSPS enzymes from Vibrio cholerae serotype O1 (vcEPSPS, class I), Coxiella burnetii (cbEPSPS, class II), Brevundimonas vesicularis (bvEPSPS, class III) and Streptomyces davawensis (sdEPSPS, class IV).
Fig. 4. A. Multiple sequence alignment of proteins. The sequence WP_065810684 from K. pneumoniae was obtained as the best BLAST hit (identity 100%). B. Classification of the EPSPS enzyme into sensitive (class I) and resistant (class II) to glyphosate. C. Identity values of the four proteins to the reference EPSPS sequences in the EPSPSClass server (https://ppuigbo.me/programs/EPSPSClass/).
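The marker-based classification described in this section can be illustrated with a minimal Python sketch. It is not the EPSPSClass implementation: the marker positions and residues in `HYPOTHETICAL_MARKERS`, the function names and the toy sequences are all assumptions made for illustration only (the real markers are those listed in Supplementary Table 10 and Leino et al. (2020)). The only parts taken from the text are the idea of checking for marker residues at reference positions and the mapping of class I to putatively sensitive and classes II to IV to putatively resistant.

```python
# Placeholder markers: reference position (1-based) -> required residue.
# These are NOT the real EPSPSClass markers; they only illustrate the logic.
HYPOTHETICAL_MARKERS = {
    "class I": {3: "G", 6: "R"},
    "class II": {3: "A", 6: "K"},
}


def residues_at_reference_positions(aligned_ref, aligned_query):
    """Map each reference position (1-based, gaps skipped) to the query residue."""
    mapping = {}
    ref_pos = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_pos += 1
            mapping[ref_pos] = query_char
    return mapping


def classify(residues, markers=HYPOTHETICAL_MARKERS):
    """Return the first class whose markers are all present, else 'unclassified'."""
    for epsps_class, required in markers.items():
        if all(residues.get(pos) == aa for pos, aa in required.items()):
            return epsps_class
    return "unclassified"


def predicted_sensitivity(epsps_class):
    """Class I is putatively sensitive; classes II, III and IV putatively resistant."""
    if epsps_class == "class I":
        return "sensitive"
    if epsps_class in {"class II", "class III", "class IV"}:
        return "resistant"
    return "unknown"


# Toy pairwise alignments against a made-up reference (gap = "-").
reference = "MKG-LTRA"
for query in ("MKGALTRA", "MKAALTKA", "MKSALTRA"):
    residues = residues_at_reference_positions(reference, query)
    epsps_class = classify(residues)
    print(query, epsps_class, predicted_sensitivity(epsps_class))
```

Keeping the markers in a plain dictionary keyed by reference position means that a query is classified only through its alignment to the reference enzyme, which mirrors the use of reference EPSPS sequences described above.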

Analysis of the amino acid substitution landscape
We performed a protein sequence alignment of the EPSPS sequences from the ATGCs and the reference EPSPS of Vibrio cholerae serotype O1 (vcEPSPS; Class I (Light et al., 2016; Firdous et al., 2018); Supplementary Table 5) with the program Muscle (Edgar, 2004). The vcEPSPS sequence was used to determine the location of residues in the active site that interact directly with PEP (Schönbrunn et al., 2001; Healy-Fried et al., 2007). Glyphosate inhibits EPSPS by competing with PEP (Schönbrunn et al., 2001). We counted the number of alternative amino acids at each position of the alignment to determine the degree of amino acid substitution in the active site and in the rest of the residues. We also defined an active site region, i.e. a wider window (±5 amino acids) around the active site residues that interact with PEP.
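The counting step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the toy alignment, the function names and the choice of active-site column are hypothetical, and "alternative amino acids" is interpreted here as the number of distinct residues in a column minus one, with gaps ignored.

```python
from statistics import mean


def alternative_residue_counts(alignment):
    """Per column: number of distinct residues minus one (gaps ignored)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    counts = []
    for col in range(length):
        residues = {seq[col] for seq in alignment if seq[col] != "-"}
        counts.append(max(len(residues) - 1, 0))
    return counts


def flanked_columns(site_columns, alignment_length, flank=5):
    """Expand active-site columns (0-based) by +/- flank, clipped to the alignment."""
    region = set()
    for col in site_columns:
        start = max(0, col - flank)
        stop = min(alignment_length, col + flank + 1)
        region.update(range(start, stop))
    return region


def mean_alternatives(counts, columns):
    """Average number of alternative residues over the given columns."""
    return mean(counts[col] for col in sorted(columns))


# Toy alignment (hypothetical sequences); column 2 plays the role of an active-site residue.
toy_alignment = ["MKGAL", "MKGSL", "MRGTL"]
counts = alternative_residue_counts(toy_alignment)
site = {2}
rest = set(range(len(counts))) - site
print("per-column counts:", counts)
print("active site mean:", mean_alternatives(counts, site))
print("rest mean:", mean_alternatives(counts, rest))
```

In a real analysis the active-site columns would first have to be located by mapping the vcEPSPS residue numbers onto alignment columns, a step omitted here for brevity.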

Statistical analysis
The tested Spearman correlation coefficients between sensitivity and total genome dynamics (as the sum of gains, losses, expansions and reductions; GLER), genome number (genome no.), genome size, protein-coding genes, gene families, median gene cluster content, median guanine+cytosine (median GC), median dN/dS, synteny distance (dY) and percentage of aromatic amino acids were not statistically significant (p-value >0.05) (Supplementary Table 4). We performed a chi-squared test to determine associations between sensitivity to glyphosate and lifestyle (FL vs FHA) and taxonomic group (Actinobacteria, Firmicutes and Proteobacteria). A Kruskal-Wallis test was used to analyse conservation of the amino acids in the EPSPS. The importance of each variable (ATGC, taxonomic groups, %GC, genome size and lifestyle) in determining potential sensitivity to glyphosate was assessed with an MCA and an RF classification model. Input variables are available in Supplementary Table 11.
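As an illustration of the chi-squared test of association mentioned above, the following Python sketch computes the Pearson statistic and p-value for a 2x2 table (lifestyle by predicted sensitivity). It rests on several assumptions: the counts are invented for the example and are not data from this study, only two sensitivity categories are used, no continuity correction is applied, and the function name is hypothetical. For one degree of freedom, the p-value can be obtained from the complementary error function.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table (no continuity correction).

    Returns (statistic, p_value); the p-value uses the chi-squared distribution
    with 1 degree of freedom, whose survival function is erfc(sqrt(x / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("expected count of zero; test is undefined")
            statistic += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Invented counts for illustration only (rows: FL, FHA; columns: resistant, sensitive).
hypothetical_counts = [[30, 10], [20, 40]]
statistic, p_value = chi_squared_2x2(hypothetical_counts)
print(f"chi-squared = {statistic:.3f}, p = {p_value:.2e}")
```

A comparison across the three taxonomic groups would need a larger table and more degrees of freedom, for which a statistics library would be the more practical choice.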