Widespread bacterial utilization of guanidine as nitrogen source

Guanidine is sensed by at least four different classes of riboswitches that are widespread in bacteria. However, only very few insights into physiological roles of guanidine exist. Genes predominantly regulated by guanidine riboswitches are Gdx transporters exporting the compound from the bacterial cell. In addition, urea/guanidine carboxylases and associated hydrolases and ABC transporters are often found combined in guanidine‐inducible operons. We noted that the associated ABC transporters are configured to function as importers, challenging the current view that riboswitches solely control the detoxification of guanidine in bacteria. We demonstrate that the carboxylase pathway enables utilization of guanidine as sole nitrogen source. We isolated three enterobacteria (Raoultella terrigena, Klebsiella michiganensis, and Erwinia rhapontici) that utilize guanidine efficiently as N‐source. Proteome analyses show that the expression of a carboxylase, associated hydrolases and transport genes is strongly induced by guanidine. Finding two urea/guanidine carboxylase enzymes in E. rhapontici, we demonstrate that the riboswitch‐controlled carboxylase displays specificity toward guanidine, whereas the other enzyme prefers urea. We characterize the distribution of riboswitch‐associated carboxylases and Gdx exporters in bacterial habitats by analyzing available metagenome data. The findings represent a paradigm shift from riboswitch‐controlled detoxification of guanidine to the uptake and assimilation of this enigmatic nitrogen‐rich compound.

arginine and creatinine to guanine and large secondary metabolites such as streptomycin. Although all these guanidine compounds comprise a potential source of guanidine, the known catabolic routes proceed mostly by hydrolytic attack of the guanidine carbon atom, giving rise to urea. Despite some early reports about the biological formation of guanidine and its utilization (Kihara et al., 1955; Natelson & Sherwin, 1979), only a few biotic reactions that produce or break down guanidine have been characterized to date. The best-studied enzyme that catalyzes the production of guanidine is the ethylene-forming enzyme (EFE). Here, guanidine is formed via the δ-hydroxylation of arginine and subsequent loss of guanidine (Fukuda et al., 1992; Hausinger, 2004).
The idea that guanidine might have a prominent but so-far overlooked role in natural physiology or metabolism was sparked by a series of publications from Breaker and coworkers, who described three classes of riboswitches that respond to guanidine. The recent discovery of a fourth class of guanidine riboswitches further supports this idea (Lenkeit et al., 2020; Salvail et al., 2020). The ykkC motif RNA was published as a riboswitch candidate in 2004 (Barrick et al., 2004; Battaglia et al., 2017; Reiss et al., 2017) and is common in various bacterial clades, where it is associated with genes encoding transporters such as multidrug efflux pumps, urea carboxylases, and enzymes of purine biosynthesis and amino acid metabolism. Two additional riboswitch candidates, the mini-ykkC (Huang et al., 2017a; Weinberg et al., 2007) and the ykkC-III RNA motif (Huang et al., 2017b; Weinberg et al., 2017), were subsequently found upstream of several of these genes. Although the three motifs do not share sequence or structural characteristics in the ligand-binding domain, it was hypothesized that they sense the same ligand, based on the extensive overlap of the genetic context (Meyer et al., 2011; Sherlock & Breaker, 2020). The wide variety of associated genes hampered the search for the ligand based on the genetic context, but also indicated that the ligand participates in widespread metabolic reactions (Barrick et al., 2004; Meyer et al., 2011). All three ykkC motifs, now renamed guanidine-I, -II, and -III riboswitches, and the recently published guanidine-IV riboswitch (Lenkeit et al., 2020; Salvail et al., 2020) were verified to respond selectively to guanidine.
A gene commonly controlled by guanidine riboswitches encodes certain representatives of "small multidrug resistance" (SMR) transporters such as YkkCD, EmrE, and SugE. Subsequent to the discovery of guanidine riboswitches, this specific family of SMR proteins has been demonstrated to function as selective guanidine exporters, termed Gdx (for guanidine exporters) (Kermani et al., 2018), see Figure 1a. When SugE-type genes occur under guanidine riboswitch control, we will refer to these as Gdx throughout the manuscript. Another gene product that is frequently controlled by guanidine riboswitches is annotated as urea carboxylase (Uca), see Figure 1a. It was already demonstrated during the initial discovery of guanidine-I riboswitches that these carboxylases can use both urea and guanidine as substrates. The studied riboswitch-controlled carboxylase showed a 40-fold lower K_M for guanidine compared to urea but a comparable k_cat. Thus, the riboswitch-associated carboxylase annotated as urea carboxylase prefers guanidine as a substrate. We will refer to carboxylase enzymes that occur under guanidine riboswitch control as guanidine carboxylases (Gca). It was speculated that the carboxylation reaction initiates the degradation of guanidine for the purpose of detoxification. When urea is carboxylated, the resulting product allophanate is further hydrolyzed by the enzyme allophanate hydrolase (Kanamori et al., 2004). Indeed, many guanidine riboswitch-regulated operons contain genes annotated as allophanate hydrolases. However, guanidine carboxylation results in carboxyguanidine, which might require different hydrolysis activities compared to allophanate. Two further genes (annotated as urea carboxylase-associated genes 1 and 2; ucaa 1 and 2) are often associated with guanidine carboxylase enzymes.
A recent report clarified that these two proteins comprise the subunits of a heterodimeric carboxyguanidine deiminase (CgdAB), enabling the hydrolysis of carboxyguanidine to allophanate and ammonia (Schneider et al., 2020). The work also addresses the question of substrate specificity of the associated carboxylases and speculates about their involvement in the use of guanidine as nitrogen source.
FIGURE 1 Association of guanidine-I, -II, -III, and -IV riboswitch classes with specific gene functions. The largest circle represents all 753 organisms with complete genomes that contain at least one predicted guanidine riboswitch. The sets indicate which organisms have guanidine riboswitches that regulate guanidine exporters Gdx (602 organisms) or guanidine carboxylase Gca genes (152 organisms), and also consider the regulation of ABC transporter genes by guanidine riboswitches. A total of 530 organisms only regulate Gdx by guanidine, 80 organisms only regulate Gca in this way, and 72 organisms do both. Of 113 riboswitch-regulated operons encoding ABC transporters, 112 contain a substrate-binding domain

Taken together, guanidine riboswitches predominantly induce Gdx transporters in order to export this compound from bacterial cells before it reaches problematic concentrations. However, other gene functions controlled by guanidine riboswitches enable the carboxylation and subsequent degradation. We show that this carboxylase pathway enables the utilization of guanidine as sole nitrogen source. We describe the isolation and characterization of guanidine-assimilating bacteria and demonstrate that carboxylase enzymes have evolved that display selectivity for either urea or guanidine.
In addition, we analyze metagenomics data in order to characterize habitats that are enriched for carboxylase pathway enzymes for the utilization of guanidine as nutrient.

| Gene functions under control of guanidine riboswitches
Since the discovery of widespread riboswitches that induce gene expression in response to the presence of guanidine, the physiology of guanidine has remained a mystery. In order to shed more light on the physiology of guanidine in bacteria, we sought to investigate genes that are commonly associated with the four known riboswitch classes for guanidine. For a comprehensive analysis, we combined the information on the genetic context for all four known riboswitch classes in the available genomic data. Since certain gene functions under guanidine riboswitch control seem to occur in conserved operons that contain groups of highly associated genes, we grouped the gene categories into different pathways. The gene class most frequently controlled by guanidine riboswitches comprises Gdx exporters, which have been demonstrated experimentally to export guanidine, see Figure 1 and Supporting Information File 1. They occur as individual genes or tandem arrangements (reminiscent of their homo- or heterodimeric topology) and appear only sometimes associated with other genes. The second-largest group of genes associated with guanidine riboswitches comprises a pathway (Gca P ) that consists in most cases of ATP-binding cassette-type (ABC) transporters, accompanied by genes that encode a urea carboxylase, two different urea carboxylase-associated genes, and allophanate hydrolases.
As has been demonstrated for a representative before and detailed in the introduction, it is likely that most of the riboswitch-controlled carboxylase enzymes are in fact guanidine carboxylases. In addition, it could be speculated that the associated ABC transporters are specific for guanidine transport.

| Detoxification versus utilization
Subsequent to the discovery of the widespread occurrence of guanidine riboswitches in bacteria, it was believed that the controlled genes encode activities that facilitate either the export or the degradation of guanidine by carboxylation and subsequent hydrolysis for the purpose of detoxification (Battaglia & Ke, 2018; Nelson et al., 2017). However, we noted that organisms that regulate urea/guanidine carboxylase-containing pathways by guanidine riboswitches often (in 113 of 152 organisms, 74%) also regulate ABC-type transporters in this way (Figure 1). This suggests that the ABC-type transporters, which are not structurally related to Gdx exporters, are functionally connected to the carboxylase pathway genes. If the associated ABC-type transporters were exporters, this would mean that the guanidine-controlled carboxylase operons contain two different means to detoxify this compound: export as well as modification via carboxylation and subsequent hydrolysis.
Moreover, in 52.2% of the organisms with carboxylase pathway-associated ABC-type transporters, a Gdx-type exporter is also found in the same genome (59 of 113 occurrences, see Figure 1). Hence, these organisms would encode two different transporters for the export of guanidine. Such a high frequency of redundancy would be surprising. Instead, the ABC transporter could work together with the carboxylase and import its substrate guanidine.
In order to clarify the role of the ABC transporters associated with the carboxylase pathway, we analyzed the encoded genes more carefully. ABC transport systems can either import or export compounds (Locher, 2016; Wilkens, 2015). In bacteria, these two activities can be distinguished easily by the composition of the subunits: Importers are characterized by the presence of a periplasmic substrate-binding protein (in Gram-negative bacteria) or a lipid-anchored external protein (in Gram-positive bacteria) that delivers the transported substrate to the outer side of the ABC-type channel (Berntsson et al., 2010; Maqbool et al., 2015). In the case of exporters, this domain is not necessary and is lacking. We further analyzed the guanidine riboswitch-controlled carboxylase operon-associated ABC transporters and found periplasmic substrate-binding domains in 112 of the 113 cases, see Figure 1. This finding implies that these carboxylase pathway-associated ABC transporters function as importers rather than detoxifying guanidine via export. The prospect of active, ATP hydrolysis-driven import into bacteria opens up the possibility that guanidine is utilized as a nutrient.
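The proportions quoted in this section follow directly from the counts given in the text and in Figure 1. The following minimal sketch reproduces that arithmetic; the counts are copied from the text, whereas the variable and function names are chosen here for illustration only.

```python
# Counts taken from the text and Figure 1; all names are illustrative.
GCA_ORGANISMS = 152   # organisms regulating Gca genes by guanidine riboswitches
ABC_CASES = 113       # of these, cases that also regulate ABC transporters
ABC_WITH_GDX = 59     # ABC-transporter cases with a Gdx exporter in the same genome
ABC_WITH_SBD = 112    # ABC-transporter operons with a substrate-binding domain


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


abc_share = percent(ABC_CASES, GCA_ORGANISMS)      # carboxylase pathways with ABC transporter
gdx_share = percent(ABC_WITH_GDX, ABC_CASES)       # additionally encoding a Gdx exporter
importer_share = percent(ABC_WITH_SBD, ABC_CASES)  # configured as importers

print(abc_share, gdx_share, importer_share)
```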

| Enrichment of guanidine utilizers
In order to investigate whether guanidine can serve as nutrient for bacterial growth, we sought to enrich microorganisms that are able to utilize this N-rich compound as nitrogen source. We collected water samples from the lake shore surface sediment of Lake Constance (Bodensee) and plated filtered and diluted samples on minimal media that contained glycerol as carbon source and guanidine as sole nitrogen source. With undiluted lake water samples, densely overgrown plates were observed at 25°C over 24 hr. In diluted samples, many colonies could be identified. From these colonies, three morphologically different ones were chosen for enrichment and subsequent characterization. Individual colonies were passaged several times on selective media in order to obtain homogeneous strains. By 16S rRNA sequencing, the three isolated bacteria were identified as strains of Raoultella terrigena (Schicklberger et al., 2015), Erwinia rhapontici (Huang et al., 2003), and Klebsiella michiganensis (Saha et al., 2013).
In order to characterize the utilization of guanidine as N-source, the three isolated strains were cultivated in liquid culture and growth was monitored. For comparison, the bacteria were grown on minimal medium with glycerol as C-source, supplemented with guanidinium chloride (5 mM), urea (10 mM), or ammonium chloride (15 mM) as sole nitrogen source. All strains were able to grow efficiently on guanidine (Figure 2). The bacteria grew slightly slower on guanidine compared to ammonia. Nevertheless, they reached the same maximal optical density, indicating that all nitrogen atoms of guanidine can be assimilated.
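The conclusion that all nitrogen atoms of guanidine are assimilated rests on a simple stoichiometric comparison: 5 mM guanidinium chloride supplies the same amount of nitrogen atoms as 15 mM ammonium chloride. A minimal sketch of this bookkeeping follows. The nitrogen counts come from the molecular formulas, the concentrations from the text, and the function name is illustrative. The comparison assumes that nitrogen is the growth-limiting nutrient in these cultures.

```python
# Nitrogen atoms per molecule, from the molecular formulas:
# guanidinium chloride CH6ClN3 -> 3, urea CH4N2O -> 2, ammonium chloride NH4Cl -> 1.
N_ATOMS = {"guanidinium chloride": 3, "urea": 2, "ammonium chloride": 1}

# Concentrations used in the growth experiments (mM), as stated in the text.
CONC_MM = {"guanidinium chloride": 5.0, "urea": 10.0, "ammonium chloride": 15.0}


def nitrogen_equivalents(conc_mm, n_atoms):
    """Return the concentration of nitrogen atoms (mM) supplied by each source."""
    return {name: conc_mm[name] * n_atoms[name] for name in conc_mm}


n_eq = nitrogen_equivalents(CONC_MM, N_ATOMS)
for source, value in n_eq.items():
    print(f"{source}: {value:.0f} mM nitrogen atoms")
```

If only one or two of the three nitrogen atoms of guanidine were accessible, the guanidine cultures would be expected to reach a lower final density than the ammonium cultures.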
Interestingly, only E. rhapontici was able to grow on urea as N-source.
This result is somewhat unexpected as the isolated R. terrigena and K. michiganensis strains both encode urease genes, see below. E. rhapontici, which is able to utilize urea efficiently, does not encode a urease, but rather two copies of annotated urea carboxylases, see below for a detailed characterization of the two enzymes. We speculated that the difference in growth on urea could be related to the difference between the two means of urea degradation. Urease is a Ni-dependent metallohydrolase (Boer et al., 2014). When we repeated the growth experiments with supplemented Ni2+, both R. terrigena and K. michiganensis grew to the same optical density (Figure S1). The inability of R. terrigena and K. michiganensis to utilize urea in the absence of supplemented Ni demonstrates that the encoded riboswitch-controlled guanidine carboxylases (see next paragraph) do not also allow the utilization of urea but seem to be specific for guanidine.
In order to investigate the molecular basis for the observed utilization of guanidine, the genomes of the three strains were sequenced. The analysis of the resulting genome sequences of R. terrigena and K. michiganensis JH7 (JABANY000000000.1) using the Type (Strain) Genome Server (Meier-Kolthoff & Göker, 2019) confirmed the taxonomic classification based on the 16S rRNA analysis. We next analyzed the occurrence of guanidine riboswitches and associated genes in the obtained genomes, see Table 1. All three strains contain a guanidine carboxylase operon under the control of a guanidine riboswitch. The organization of the genes is very similar among the bacterial strains, with the ABC transporter genes and the carboxyguanidine deiminase genes (cgdAB) under control of a guanidine-I riboswitch, followed by guanidine carboxylase (gca) and allophanate hydrolase genes (atzF) under the control of a guanidine-II riboswitch (see Figure 3). In E. rhapontici str. JH02, atzF is missing downstream of the riboswitch-associated gca. However, atzF is found at a different locus in conjunction with a second urea/guanidine carboxylase gene cluster. Both R. terrigena and K. michiganensis genomes contain a Gdx-type exporter under guanidine riboswitch control, whereas this activity is lacking in E. rhapontici.

| Analysis of guanidine-dependent gene expression
In order to investigate whether the observed utilization of guanidine as N-source is based on the activity of the riboswitch-controlled gene products, we analyzed the proteome of the three isolated strains in response to guanidine. Each of the isolated bacteria was grown in minimal medium that contained 1% glycerol as carbon source and either 5 mM guanidinium chloride or 15 mM ammonium chloride as nitrogen source. Bacteria were harvested in late exponential phase and the whole proteome was determined by mass spectrometry. In all three bacteria, the genes under the control of guanidine riboswitches (Table 1) were highly upregulated, see Figure 3 and Supporting Information File 2. We noticed that the expression of nitrogen metabolism-related genes is generally enhanced when bacteria utilize guanidine, see Supporting Information File 2. We speculate that this is due to the lack of a preferred nitrogen source.

| Characterization of carboxylase enzymes
When we analyzed the genome and proteome data of E. rhapontici, we noticed that this organism encodes two different urea/guanidine carboxylase genes. The first carboxylase (WP_171149239.1) is not riboswitch-controlled, whereas the second (WP_171148480.1) is under control of a guanidine riboswitch, see Table 1. As both were upregulated upon growth on guanidine we were interested to understand their roles in guanidine utilization. Both urea/guanidine carboxylases were overexpressed as His-tagged versions in E. coli and purified via Ni-NTA in order to determine their substrate preferences.
The carboxylation reaction consumes ATP stoichiometrically with regard to substrate turnover (Fan et al., 2012). We monitored the reaction as described before (Kanamori et al., 2004) by coupling the ATP-consuming carboxylation reaction to the reactions of pyruvate kinase and lactate dehydrogenase, which result in the oxidation of NADH. The decrease of NADH can be monitored spectrophotometrically. By plotting the initial velocity over the substrate concentration, the kinetic parameters K_M and k_cat could be obtained. The data for both carboxylase enzymes fitted well to Michaelis-Menten kinetics (Figure 4).
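The evaluation of such a coupled assay can be sketched as follows. Each ATP consumed by the carboxylase leads to the oxidation of one NADH, so the absorbance decrease at 340 nm translates directly into a reaction rate. The sketch assumes the commonly used molar extinction coefficient of NADH at 340 nm (6,220 M^-1 cm^-1) and a 1 cm path length. For brevity it estimates K_M and V_max with a Hanes-Woolf linearization rather than a nonlinear fit, which is a simplification chosen here and not necessarily the procedure used for Table 2. All function names and the synthetic data are illustrative.

```python
NADH_EPSILON = 6220.0  # M^-1 cm^-1 at 340 nm (commonly used literature value)


def rate_from_absorbance(delta_a340_per_min, path_cm=1.0):
    """Convert an NADH absorbance decrease (per min) into a rate in µM/min."""
    return abs(delta_a340_per_min) / (NADH_EPSILON * path_cm) * 1e6


def hanes_woolf_fit(substrate, velocity):
    """Estimate (Km, Vmax) by linear regression of [S]/v against [S].

    Hanes-Woolf form: [S]/v = [S]/Vmax + Km/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # equals 1 / Vmax
    intercept = mean_y - slope * mean_x   # equals Km / Vmax
    vmax = 1.0 / slope
    return intercept * vmax, vmax


def kcat_per_second(vmax_um_per_min, enzyme_um):
    """Turnover number in s^-1 from Vmax (µM/min) and enzyme concentration (µM)."""
    return vmax_um_per_min / enzyme_um / 60.0


# Synthetic, noise-free example data (not measured values).
true_km, true_vmax = 2.0, 10.0            # mM and µM/min
conc = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]    # substrate concentrations in mM
rates = [true_vmax * s / (true_km + s) for s in conc]
km, vmax = hanes_woolf_fit(conc, rates)
print(round(km, 3), round(vmax, 3))
```

With real, noisy data a weighted nonlinear least-squares fit of the Michaelis-Menten equation is usually preferable, because linearizations distort the error structure.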
The calculated parameters are shown in Table 2. The riboswitch- Thus, the substrate preference of the enzymes is interchanged. The urea carboxylase has a 375-fold higher specificity constant (k_cat/K_M) for urea compared to a 90-fold higher specificity constant of guanidine carboxylase for guanidine. As saturation was not reached for the respective poorer substrate, kinetic parameters for those substrates should be taken with care. However, saturation was almost reached as visualized by plotting the data on a linear scale (Figure S2). Since E. rhapontici encodes no urease enzyme, the non-riboswitch-associated urea carboxylase (WP_171149239.1) is likely responsible for the observed Ni-independent utilization of urea as N-source (Figure 2).
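The fold preference of an enzyme for one substrate over another is the ratio of the two specificity constants. A minimal sketch of that calculation is given below. The numerical values are placeholders chosen for illustration and are not the values of Table 2. The function names are likewise hypothetical.

```python
def specificity_constant(kcat_per_s, km_mm):
    """Specificity constant k_cat/K_M in s^-1 mM^-1."""
    return kcat_per_s / km_mm


def fold_preference(kcat_a, km_a, kcat_b, km_b):
    """How many times higher k_cat/K_M is for substrate A than for substrate B."""
    return specificity_constant(kcat_a, km_a) / specificity_constant(kcat_b, km_b)


# Placeholder values: equal k_cat, 40-fold lower K_M for substrate A.
fold = fold_preference(kcat_a=10.0, km_a=0.5, kcat_b=10.0, km_b=20.0)
print(fold)
```

When k_cat is comparable for both substrates, the fold preference reduces to the ratio of the K_M values.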

| Distribution of urea and guanidine carboxylase enzymes
Previously, it has been noticed that in the substrate-binding pocket of the then described urea carboxylase of O. sagarensis an aspartic acid for the substrate specificity of urea carboxylases. We performed a homology search starting from K. lactis urea carboxylase followed by a multiple sequence alignment using the Consurf platform (Landau et al., 2005). We subsequently aligned the urea and guanidine carboxylase sequences from our strains and performed a neighbor-joining analysis using BLOSUM62 (Henikoff & Henikoff, 1992). The homolog sequences clustered in five major and three minor clades (see

Note: Kinetic parameters were obtained from a Michaelis-Menten fit to data in Figure 4.

specific enzyme clades: urea carboxylases and guanidine carboxylases. Interestingly, the asparagine residue is only found in the binding pocket of the urea carboxylase (green) clade, whereas all other clades in Figure 4c comprise aspartate at that position. If the occurrence of the aspartate residue in the binding pocket is indeed indicative of guanidine specificity, it seems that guanidine carboxylation is the much more widespread activity of this class of enzymes compared to urea carboxylation. However, more representatives from other clades need to be tested in order to support such a conclusion.

| Distribution of guanidine-utilizing carboxylases in metagenomes/habitats
So far, we have demonstrated that the guanidine-controlled operons encoding ABC-type transporters and carboxylases, carboxyguanidine deiminases, and allophanate hydrolases enable the uptake and assimilation of guanidine. This result is contrasted by widespread occurrence of riboswitch-controlled Gdx-type exporters of guanidine.
In order to shed more light on the physiology of guanidine utilization in nature, we aimed at investigating the occurrence of both pathways in certain bacterial habitats. When the occurrence of guanidine riboswitches in known organisms is taken into account, one notes that the switches are widely distributed in many phyla of bacteria. An analysis is complicated by the fact that many bacteria are ubiquitously distributed and it is not always possible to determine whether a given bacterium has a predominantly water-, soil-, plant-, or animal-associated lifestyle. In order to nevertheless extract information about the habitats where guanidine might play a pronounced role, we envisioned that metagenome data could help to connect the occurrence of a riboswitch-controlled activity to a given environment. Since these data sets result from a more or less specific sampling, the isolated sequences are likely to be typical of the given habitat. We therefore analyzed riboswitches that occur in metagenome data and correlated the frequency of occurrence of Gdx-type exporters and guanidine carboxylase-type utilization pathways (Gca P ) to the annotated habitat, see Figure 5 and Supporting Information File 3.
We selected some representative habitats such as human skin and gut metagenome data as well as environmental samples such as soil and aqueous habitats. We then identified Gdx-type exporters and Gca P (related) genes under control of guanidine riboswitches.
Next, we calculated the frequency of occurrence of Gca P in the metagenome data in relation to the sum of Gca P and Gdx P . By doing so, we wanted to gain insight into whether riboswitch-controlled genes in a given habitat prefer guanidine utilization via Gca P or Gdx P -mediated export of guanidine, in comparison to other environments. In Figure 5, the X-axis value is zero if all riboswitches in the environment regulate Gdx-type exporter genes, and one if all riboswitches regulate Gca P -related genes. Environments are sorted from those that relatively favor guanidine carboxylase-mediated utilization (freshwater) to those that relatively favor Gdx P -mediated export (human skin). Interestingly, it seems that the occurrence of the guanidine-utilizing carboxylase pathway negatively correlates with the nutrient- and nitrogen-richness of the respective habitat.
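The X-axis value described here is the ratio Gca P / (Gca P + Gdx P). A minimal sketch of this calculation and of the sorting of environments follows. The habitat labels and counts are hypothetical placeholders, not values from Figure 5, and the function name is illustrative.

```python
def gca_fraction(gca_count, gdx_count):
    """Fraction of riboswitch-regulated genes assigned to the Gca pathway.

    Returns 0.0 if only Gdx exporter genes are regulated and 1.0 if only
    Gca pathway genes are regulated.
    """
    total = gca_count + gdx_count
    if total == 0:
        raise ValueError("no riboswitch-regulated Gca or Gdx genes in this data set")
    return gca_count / total


# Hypothetical (Gca P, Gdx P) counts per environment, for illustration only.
counts = {"habitat A": (300, 100), "habitat B": (50, 450), "habitat C": (200, 200)}

# Sort from utilization-favoring to export-favoring environments.
ranked = sorted(counts, key=lambda name: gca_fraction(*counts[name]), reverse=True)
for name in ranked:
    print(name, round(gca_fraction(*counts[name]), 2))
```

Because the value is a ratio, it is only meaningful when the underlying counts are reasonably large, which is presumably why a minimum total count per environment is applied in Figure 5.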
Animal-based microbiomes thrive under nutrient-rich conditions, whereas aquatic habitats are often nitrogen-scarce environments (Young et al., 2016). We identified riboswitch-controlled Gca P activity to be more prevalent in N-scarce environments such as fresh water, marine, and soil samples, whereas in N-rich habitats such as the human gut guanidine carboxylase pathway genes are found less frequently in comparison to Gdx exporters.

| DISCUSSION
Here, we show that guanidine is utilized by bacteria using riboswitch-controlled carboxylases and hydrolases. ABC-type importers are also often encoded under control of guanidine riboswitches, likely facilitating the efficient uptake of the N-rich compound. Given that the guanidine carboxylases are found widespread in bacteria, it seems likely that the utilization of guanidine as N-source is a common activity in many organisms. It seems that a single amino acid in the active site is responsible for determining the substrate specificity of the carboxylase reaction. This finding has also been reported recently when for the first time the role of the associated ucaa genes was clarified as carboxyguanidine deiminases (Schneider et al., 2020).

FIGURE 5 Occurrence of Gdx P and Gca P activities regulated by guanidine riboswitches in different environments. Gca P : the number of occurrences of genes assigned to the guanidine carboxylase pathway that are regulated by guanidine riboswitches in data sets from the given environment. Gdx P : the number of gdx genes controlled by guanidine riboswitches. For the selected environments, Gca P + Gdx P is at least 285
The ATP-dependent carboxylation and subsequent hydrolysis of urea for its utilization was first described in yeasts and algae in 1968 (Roon & Levenberg, 1968). However, it took until 2004 for the same activity to be described in bacteria (Kanamori et al., 2004). It has been noted before that the urease-mediated and the urea carboxylase/allophanate hydrolase-mediated reactions are two apparently redundant means of degrading urea (Hausinger, 2004). The co-occurrence of bacterial urease and urea carboxylase in one organism led Hausinger to speculate that one of the enzymes might catalyze an alternative reaction. As we have demonstrated, it seems that the majority of organisms encode enzymes that should show greater specificity toward guanidine than urea carboxylation; nevertheless, carboxylases with higher specificity toward urea do exist, such as the examples in E. rhapontici (this study) or in S. cerevisiae and C. albicans (Schneider et al., 2020). Conversely, considering that the two additional bacteria isolated in this work (K. michiganensis and R. terrigena) are not able to grow efficiently on urea without supplemented Ni for urease activation, it seems that the guanidine carboxylases that these two organisms encode are so specific that they are not able to carboxylate urea in sufficient amounts to sustain growth in the absence of urease activity.
With regard to a possible redundancy of urea degradation in organisms that contain both urease and urea carboxylase, it might be advantageous for certain bacteria specialized in the utilization of such compounds to invest in the maintenance of genes encoding both systems. Although the urease-dependent direct hydrolysis of urea is more straightforward and seems more energy-economic than the ATP-dependent detour via the carboxylated intermediate, it requires Ni as a cofactor, which might not always be available in sufficient amounts. Such a scenario was observed in our experiments when the two strains R. terrigena and K. michiganensis did not grow in minimal media with urea as sole N-source unless Ni was supplemented. In addition, several further proteins responsible for the modification of the active site, Ni2+ loading, and Ni homeostasis are necessary for the activation of urease (Farrugia et al., 2013).
Here, we have shown that guanidine utilization is widespread and predominantly found in bacterial organisms living in nutrient-scarce environments. The three isolated bacteria seem to have specialized in the utilization of alternative nitrogen sources since all of them possess activities for the assimilation of guanidine and urea. Additionally, two of the isolates (R. terrigena and K. michiganensis) possess genes necessary for N2 fixation, a rather rare feature among enterobacteria.
Guanidine utilization is carried out via carboxylation and subsequent hydrolysis. The carboxylase enzymes appear to be specific for guanidine, although urea-specific homologs also exist, sometimes even in the same organism, as we have found for E. rhapontici. Interestingly, R. terrigena and K. michiganensis also contain Gdx-type guanidine exporters. Similar to the Gca pathway enzymes, Gdx is also under control of a guanidine-dependent on-riboswitch. We speculate that at low nitrogen concentrations guanidine is utilized via the Gca activities, whereas at high nitrogen and guanidine concentrations the Gdx exporter removes excess guanidine. Such a scenario could be facilitated by placing the expression of the Gca pathway under control of a nitrogen limitation-responsive control mechanism. We see some evidence for this in the proteome data, where nitrogen limitation activities are upregulated in general when guanidine is offered as sole nitrogen source. We have further presented a bioinformatics method to assign a given biochemical activity to certain habitats by surveying the occurrence and frequency of certain genes in metagenome data from specific habitats. However, the presented findings again pose the intriguing question of the source of guanidine in nature. Given the widespread occurrence of guanidine-sensing riboswitches as well as guanidine-transporting and metabolizing activities, it seems very likely that so-far overlooked, widespread guanidine-producing abiotic or biotic reactions exist in nature.

| Enrichment of guanidine-utilizing bacteria
An environmental sample was taken in late September 2019 from the lake shore sediment of Lake Constance (47°41′44.2′′N 9°11′35.1′′E). Sediment was rinsed with lake water and filtered.

| Growth analysis
Isolated strains were grown in minimal medium with 5 mM guanidine, 10 mM urea, or 15 mM NH4Cl as the sole nitrogen source, respectively.
Growth was monitored in a 96-well plate in biological triplicates. The medium was inoculated to OD600 = 0.0005. The growth medium was covered with M20 silicone oil to allow gas exchange but avoid evaporation. Plates were shaken at 200 rpm at 30°C in a TECAN reader and OD600 was measured every 10 min until stationary phase was reached.

| Proteome data
Cells were grown in minimal media with the respective nitrogen source at 30°C. Equal amounts of cells were harvested after approximately 10 hr. Cells were lysed by sonication with a Branson Sonifier in 1x PBS. The total protein amount was determined with the BCA Kit from Thermo Scientific according to the manufacturer's protocol. A total of 50 µg protein was sent for proteome analysis.

| Construction and expression of urea and guanidine carboxylase
The full length gene of urea (uca1, WP_171149239.1) and guanidine

| Enzymatic assay
Carboxylation activity (ATP cleavage) was measured by monitoring the coupled activity of pyruvate kinase and lactate dehydrogenase, as described before (Kanamori et al., 2004

| Phylogenetic analysis of urea and guanidine carboxylases
Homology search and multiple sequence alignment were performed with the Consurf platform (Landau et al., 2005) based on K.
Jalview (Waterhouse et al., 2009) was used for the alignment of the sequences of the guanidine carboxylases and urea carboxylase from our strains. Subsequently, a phylogenetic tree was generated based on neighbor joining with BLOSUM62 (Henikoff & Henikoff, 1992). The phylogenetic tree was illustrated with iTOL (Letunic & Bork, 2019).

| Analysis of guanidine operons in genomes and metagenomes
All complete bacterial genomes in version 87 of the RefSeq nucleotide database (NCBI Resource Coordinators, 2015) were analyzed.
Complete genomes were defined as those whose accession begins with "NC_." The guanidine-I, -II, and -III riboswitches were searched using the standard procedure in Rfam [cite https://pubmed.ncbi.nlm.nih.gov/33211869]. For guanidine-I, -II, and -III, we used Rfam entries RF00442, RF01068, and RF01763, respectively. Guanidine-IV riboswitch locations were taken from our recent publication (Lenkeit et al., 2020). Guanidine-I riboswitches are highly similar to riboswitches with other ligand specificities (Sherlock & Breaker, 2020). To extract only guanidine-I riboswitches, we looked at the two nucleotides immediately following the conserved CAC sequence. We accepted only sequences in which these nucleotides were GG. To reduce false positive riboswitch predictions, we also eliminated guanidine-II sequences unless they had the tetramer ACGR in both hairpins. We also enumerated guanidine-III nucleotides that were at least 97% conserved and not predicted to form a Watson-Crick base pair, and eliminated sequences that deviated from these conserved nucleotides.
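The sequence filters for guanidine-I and guanidine-II candidates can be sketched as follows. The sketch assumes that the position of the conserved CAC in each guanidine-I hit and the two hairpin sequences of each guanidine-II hit have already been extracted from the covariance-model alignment. That extraction step, the function names, and the example sequences are hypothetical and not part of the described procedure. R denotes a purine (A or G).

```python
import re

# The tetramer ACGR, where R is a purine (A or G).
ACGR = re.compile(r"ACG[AG]")


def is_guanidine_i(seq, cac_start):
    """Accept a guanidine-I candidate only if GG directly follows the conserved CAC.

    cac_start is the 0-based index of the conserved CAC within seq
    (assumed to be known from the alignment).
    """
    seq = seq.upper().replace("T", "U")
    if seq[cac_start:cac_start + 3] != "CAC":
        raise ValueError("cac_start does not point at a CAC trinucleotide")
    return seq[cac_start + 3:cac_start + 5] == "GG"


def is_guanidine_ii(hairpin_1, hairpin_2):
    """Accept a guanidine-II candidate only if both hairpins contain ACGR."""
    return all(ACGR.search(h.upper()) is not None for h in (hairpin_1, hairpin_2))


# Made-up example sequences for illustration.
print(is_guanidine_i("AAACACGGUU", 3))          # GG follows CAC
print(is_guanidine_i("AAACACGAUU", 3))          # GA follows CAC
print(is_guanidine_ii("GGACGACC", "CCACGGGG"))  # ACGA and ACGG present
```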
If the distance between a riboswitch and the first downstream gene was at most 700 nucleotides, and the gene was encoded in the same direction, we assumed that the gene was regulated by the riboswitch. Subsequent genes were presumed to be co-transcribed if they were also encoded on this strand and were located no more than 500 nucleotides from the previous gene. These maximum distances are conservatively high to ensure all relevant genes would be found.
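The distance rules above can be sketched as a simple walk along the genome. The sketch below is one possible reading of the rules, in which the walk stops at the first downstream gene that lies on the opposite strand or exceeds the allowed gap. The `Feature` data structure, the coordinate convention (start < end on the forward axis), and the function name are assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_RIBOSWITCH_GAP = 700  # riboswitch to first downstream gene (nt)
MAX_GENE_GAP = 500        # gene to next co-transcribed gene (nt)


@dataclass(frozen=True)
class Feature:
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def regulated_genes(riboswitch, genes):
    """Return names of genes presumed to be regulated by the riboswitch."""
    if riboswitch.strand == "+":
        downstream = sorted((g for g in genes if g.start > riboswitch.end),
                            key=lambda g: g.start)

        def gap(upstream, gene):
            return gene.start - upstream.end
    else:
        downstream = sorted((g for g in genes if g.end < riboswitch.start),
                            key=lambda g: g.end, reverse=True)

        def gap(upstream, gene):
            return upstream.start - gene.end

    operon = []
    previous, limit = riboswitch, MAX_RIBOSWITCH_GAP
    for gene in downstream:
        # Stop at the first gene on the other strand or beyond the allowed gap.
        if gene.strand != riboswitch.strand or gap(previous, gene) > limit:
            break
        operon.append(gene.name)
        previous, limit = gene, MAX_GENE_GAP
    return operon


# Made-up coordinates for illustration.
rs = Feature("riboswitch", 100, 200, "+")
genes = [Feature("a", 300, 1200, "+"), Feature("b", 1300, 2000, "+"),
         Feature("c", 2700, 3500, "+")]
print(regulated_genes(rs, genes))
```

In this example, gene c is excluded because it lies 700 nucleotides downstream of gene b, which exceeds the 500-nucleotide limit between consecutive genes.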
Genes were functionally classified based on conserved protein domains in version 32.0 of the Pfam database (Mistry et al., 2021). In addition to assigning Gdx-type exporters and Gca-type carboxylase pathways, a third activity (termed Agmat for Agmatinase-like proteins) was included in the analysis since it is also often controlled by guanidine riboswitches, see Supporting Information File 1 and Table S2. However, since its function is unclear and it is not connected to the Gdx and Gca activities, we have not further pursued the occurrence and function of the Agmat pathway. We manually defined a mapping between guanidine-associated gene functions and Pfam entries (Table S2: Functions-and-PFAM). We also defined a mapping between gene functions and pathways. Riboswitches were deemed to control a pathway when they appeared to regulate at least one gene function that is unambiguously associated with that pathway, that is, a function other than a component of ABC transporters. Metagenomes were downloaded from various sources, predominantly IMG/M (Chen et al., 2019) and GenBank (NCBI Resource Coordinators, 2015), and we classified them into environmental categories based on available metadata. Due to inconsistencies in metadata, environmental categories were largely created manually. The locations of genes were predicted by MetaProdigal (Hyatt et al., 2012). In Figure 5, we counted the number of pathway-specific and riboswitch-regulated genes to arrive at numbers for the two pathways (Gdx exporter and Gca carboxylase). Otherwise, our annotation of riboswitches and genes in metagenomes was the same as with RefSeq.
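The two-step mapping from Pfam domains to gene functions and from gene functions to pathways can be sketched as two lookup tables. The domain labels below are placeholders and not actual Pfam accessions. The real mapping is given in Table S2. ABC transporter components are deliberately left out of the function-to-pathway table so that they never assign a pathway by themselves.

```python
# Placeholder domain labels (NOT real Pfam accessions; see Table S2 for the real mapping).
PFAM_TO_FUNCTION = {
    "PFAM_GDX": "Gdx exporter",
    "PFAM_CARBOXYLASE": "carboxylase",
    "PFAM_CGD": "carboxyguanidine deiminase",
    "PFAM_AH": "allophanate hydrolase",
    "PFAM_ABC": "ABC transporter component",
}

# Only functions unambiguously associated with a pathway are listed here.
FUNCTION_TO_PATHWAY = {
    "Gdx exporter": "Gdx",
    "carboxylase": "Gca",
    "carboxyguanidine deiminase": "Gca",
    "allophanate hydrolase": "Gca",
}


def controlled_pathways(operon_domains):
    """Return the set of pathways controlled by a riboswitch.

    operon_domains is a list with one entry per regulated gene, each entry
    being the list of domain labels found in that gene.
    """
    pathways = set()
    for gene_domains in operon_domains:
        for domain in gene_domains:
            function = PFAM_TO_FUNCTION.get(domain)
            pathway = FUNCTION_TO_PATHWAY.get(function)
            if pathway is not None:
                pathways.add(pathway)
    return pathways


print(controlled_pathways([["PFAM_ABC"], ["PFAM_CARBOXYLASE"]]))
```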

ACKNOWLEDGMENT
JSH acknowledges funding from the ERC CoG "RiboDisc." ZW acknowledges funding from the DFG (WE6322/1-1). We thank Stephanie Gurres and Astrid Joachimi for excellent technical assistance. The authors declare no conflict of interest.