MarR homologs with urate-binding signature


  • Inoka C. Perera,

    1. Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803
    Current affiliation:
    1. Inoka C. Perera's current address is Department of Zoology, University of Colombo, Colombo, Sri Lanka
  • Anne Grove

    Corresponding author
    1. Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803


Bacteria that associate with living hosts require intricate mechanisms to detect and respond to host defenses. Part of the early host defense against invading bacteria is the production of reactive oxygen species, and xanthine oxidase is one of the main producers of such agents. The end-product of the enzymatic activity of xanthine oxidase, urate, was previously shown to be the natural ligand for Deinococcus radiodurans-encoded HucR and it was shown to attenuate DNA binding by Agrobacterium tumefaciens-encoded PecS and Burkholderia thailandensis-encoded MftR, all members of the multiple antibiotic resistance regulator (MarR) family. We here show that residues involved in binding urate and eliciting the DNA binding antagonism are conserved in a specific subset of MarR homologs. Although HucR controls endogenous urate levels by regulating a uricase gene, almost all other homologs are predicted to respond to exogenous urate levels and to regulate a transmembrane transport protein belonging to either the drug metabolite transporter (DMT) or the major facilitator superfamily (MFS), as further evidenced by the presence of conserved binding sites for the cognate transcription factor within the respective promoter regions. These data suggest the use of orthologous genes for different regulatory purposes. We propose the designation UrtR (urate responsive transcriptional regulator) for this distinct subfamily of MarR homologs based on their common mechanism of urate-mediated attenuation of DNA binding.

Members of the multiple antibiotic resistance regulator (MarR) family of transcriptional regulators control gene expression in response to ligand binding. We have identified a subset of MarR homologs that are predicted to bind urate as their ligand. These homologs are encoded by bacterial species that associate with a living host, suggesting that they detect the urate produced by the host in the course of its defense against the bacterial invasion.


Invading bacteria trigger an array of defense mechanisms in their hosts. One of the early defenses is to produce oxidative radicals, and a number of enzymatic pathways are involved in production of such oxidative free radicals, with NADPH-oxidase and xanthine oxidase (or xanthine dehydrogenase) activity among the best characterized.1–3 Xanthine dehydrogenase, which accepts both NAD+ and molecular oxygen as electron acceptors to yield NADH and superoxide, respectively, can in some species be converted into xanthine oxidase, which has little reactivity towards NAD+. Xanthine dehydrogenase participates in purine degradation where it catalyzes the conversion of hypoxanthine to xanthine and xanthine to urate, both reactions associated with the production of reactive oxygen species. Although the byproducts, oxidative radicals, may be used for defense purposes, the main product, urate, is a potent antioxidant that may protect the surrounding cells from harmful effects of oxidative free radicals.

In Deinococcus radiodurans, which is particularly adept at resisting environmental stress,4 including that produced by reactive oxygen species, urate may indeed serve a primary role as an antioxidant. Previous work has shown that cellular concentrations of urate are intricately regulated by a transcriptional regulator, HucR, which belongs to the multiple antibiotic resistance regulator (MarR) family.5 MarR homologs are winged helix transcriptional regulators that often function as repressors; transcriptional regulation may be achieved by binding a specific ligand, which causes the transcription factor to dissociate from its cognate sites, relieving the repression.6 Typically, MarR homologs are auto-regulatory in that a cognate site is frequently found in the intergenic region between genes encoding the MarR homolog and a divergently oriented gene. For D. radiodurans HucR, its cognate site is in the shared promoter region between genes encoding HucR and a uricase.5, 7 The natural ligand for HucR is urate, the substrate for uricase, and binding of urate attenuates HucR binding to the promoter region, allowing the transcription of both uricase and hucR genes. A locus containing a uricase gene that is encoded divergently from a urate-responsive MarR homolog is unique to D. radiodurans, lending credence to the inference that endogenous urate may contribute to the unique oxidative stress resistance of this organism.

In silico docking suggested that urate, which binds HucR with low μM affinity and negative cooperativity, binds two symmetrically disposed sites in the HucR dimer and predicted the residues required for urate binding (Fig. 1).7–9 These data suggested a model for attenuation of DNA binding upon the interaction with urate, in which the N3-deprotonated urate binds by association with W20 and R80 while repelling D73.9 As R106 is critical for proper orientation of the DNA recognition helix, in part through a salt bridge between D73 and R106, urate-mediated displacement of D73 causes a conformational change in the recognition helix, resulting in attenuated DNA binding. This model was supported by site-directed mutagenesis and by the observation that other purines fail to elicit a comparable effect on DNA binding, further suggesting that urate is the specific ligand.9

Figure 1.

Structure of urate-docked HucR. The two monomers are in green and blue and the DNA binding helices are in cartoon representation. The residues identified to be important for ligand binding and attenuated binding to DNA are colored red. Urate is in a mesh rendering.

Considering the structural homology between MarR homologs, conservation of the four residues shown to be essential for urate-binding to HucR and for communicating occupancy of the ligand-binding pocket to the DNA recognition helices predicts that urate is a ligand for that particular homolog. This prediction was borne out by the demonstration that Agrobacterium tumefaciens (Rhizobium radiobacter) encodes a MarR homolog, annotated as PecS, which responds to urate by attenuated DNA binding in vitro and transcriptional regulation in vivo and by the observation that the predicted Burkholderia thailandensis-encoded urate-responsive MarR homolog also responds to urate.10, 11 For both D. radiodurans HucR and A. tumefaciens PecS, extensive site-directed mutagenesis was performed to support the identification of amino acids involved in ligand binding, and all three transcription factors respond comparably to a series of structurally distinct purines, consistent with a shared mechanism for ligand-mediated attenuation of DNA binding.9–11 In A. tumefaciens and other species that encode a MarR homolog annotated as PecS, the transcription factor is not predicted to regulate intracellular levels of urate, but to regulate expression of an efflux pump of the drug metabolite (DMT) family as well as other genes involved in host infection; in Burkholderia sp., the urate-responsive MarR homolog is predicted to regulate an efflux pump of the major facilitator superfamily (MFS) that has similarity to the multidrug exporter EmrD from Escherichia coli.11 Here, we report evidence for an evolutionary relationship within a subset of MarR homologs that conserve a urate-binding signature; while HucR has been shown to regulate endogenous urate levels, other homologs are predicted to respond to exogenous urate and to regulate a membrane transporter, as evidenced by conserved cis-regulatory elements in the respective promoter regions. 
On the basis of the predicted conservation of function, we propose the designation UrtR (urate responsive transcriptional regulator) as a distinct subfamily of MarR homologs.

Results and Discussion

To examine the frequency of occurrence of MarR homologs that conserve all four residues shown to bind urate and to confer attenuated DNA binding, we used the Dickeya dadantii (Erwinia chrysanthemi) PecS protein sequence to search for homologs that conserve these residues. Physiological and cellular functions of D. dadantii-encoded PecS have been particularly well characterized; for example PecS was shown to function as a repressor of the efflux pump PecM, encoded divergently from the gene encoding PecS, and of the indigoidine biosynthesis operon, IndABC.12–15 Production of the antioxidant indigoidine and its export through PecM was shown to be important for host infection.16, 17 Recently, a genome-wide description of the PecS regulon was achieved by a comparative microarray analysis of WT and ΔpecS strains of D. dadantii, further establishing the key role of PecS in regulating genes associated with infectivity.18 Further, we have documented that urate elicits transcriptional de-repression in vivo for A. tumefaciens, as evidenced by upregulation of PecS and PecM transcripts.10

Sequence alignment predicts existence of MarR sub-family with urate-binding signature

Among the ∼1000 sequences retrieved from a BLAST search of the NCBI database, 340 homologs conserve these four residues. The sequences were aligned to identify significant features that characterize an UrtR at the amino acid level. As observed in the sequence alignment (Fig. 2 and a more comprehensive list in Fig. S1), a salient feature of UrtR homologs is the presence of an N-terminal extension, which harbors one of the critical amino acids of the urate binding pocket, tryptophan. This extension is absent from E. coli MarR and its homologs. Further, helix-3 has high sequence identity among UrtR homologs. Helix-3 forms the bottom of the ligand-binding pocket; in HucR, in addition to D73 and R80, T77 was predicted to hydrogen bond with urate through a backbone nitrogen atom. Amino acids in the DNA recognition helix (α5) also show sequence similarity, particularly in the central region of the helix. The arginine that is critical for anchoring the binding helix in proper orientation (R106 in HucR) is likewise conserved in all UrtR homologs, whereas the degeneracy at the ends of the binding helix and in the wing region may contribute to the individual sequence preference of UrtR homologs. Additional conserved residues, such as residues connecting helical segments, are likely conserved for structural reasons. What is particularly significant is that MarR homologs known to bind ligands other than urate or to respond to oxidative stress fail to conserve the residues lining the urate-binding pocket (Figs. 2 and S1). Consistent with the high sequence identity, the UrtR homologs cluster together in a phylogenetic tree, suggesting that they constitute an evolutionarily conserved group of proteins (Fig. 3).
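The signature-based filtering described above can be illustrated with a minimal Python sketch. It assumes a gapped pairwise alignment against a reference sequence and checks whether a candidate carries the same residues as the reference at chosen reference positions. In HucR the relevant positions would be W20, D73, R80, and R106; the short sequences and positions used below are hypothetical toy values chosen only to keep the example readable, and the function names are not taken from any tool used in this study.

```python
def column_of(aligned_ref: str, residue_number: int, gap: str = "-") -> int:
    """Map a 1-based residue number of the ungapped reference to a 0-based alignment column."""
    seen = 0
    for col, ch in enumerate(aligned_ref):
        if ch != gap:
            seen += 1
            if seen == residue_number:
                return col
    raise ValueError("residue number exceeds reference length")


def conserves_signature(aligned_ref: str, aligned_seq: str, positions) -> bool:
    """True if aligned_seq matches the reference residue at every signature position."""
    if len(aligned_ref) != len(aligned_seq):
        raise ValueError("aligned sequences must have equal length")
    for number in positions:
        col = column_of(aligned_ref, number)
        if aligned_seq[col] != aligned_ref[col]:
            return False
    return True


# Hypothetical toy alignment (not real protein sequences).
toy_ref = "MW-ADKRLR"        # ungapped: M1 W2 A3 D4 K5 R6 L7 R8
toy_signature = [2, 4, 6, 8]  # W, D, R, R in the toy reference
toy_hit = "MWGSDQRIR"        # keeps W, D, R, R at the signature columns
toy_miss = "M--ADKKLR"       # lacks the tryptophan and one arginine

keeps = conserves_signature(toy_ref, toy_hit, toy_signature)
loses = conserves_signature(toy_ref, toy_miss, toy_signature)
```

Requiring exact identity at every position is the strictest reading of "conserve these four residues"; a real analysis might also need to handle alignment ambiguity near the N-terminal extension.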

Figure 2.

Economically important pathogenic or symbiotic bacteria that encode an UrtR. Residues involved in urate binding and eliciting the conformational changes are highlighted in red and by asterisk above. E. coli MarR and S. typhimurium SlyA do not conserve these residues and do not contain α1, which harbors the vital tryptophan residue. Secondary structure elements are with reference to HucR (2fbk). Sequences are truncated at the C-termini to generate figure.

Figure 3.

Evolutionary relationships of UrtR and other MarR homologs. All UrtR homologs cluster under a single sub-tree, whereas the blue sub-tree represents the selected MarR homologs. The tree was rooted using E. coli GntR as an outgroup. Uniprot identification numbers of the proteins are followed by a short description at the termini of each branch. Prototypical UrtR homologs, D. radiodurans HucR (Q9RV71 DEIRA) and A. tumefaciens PecS (Q7D1T4 AGRT5), are highlighted in red.

UrtR homologs prevalent among species associating with a living host

Notably, a majority of the bacteria predicted to encode an UrtR homolog are known to associate with a living host. The identified homologs were found in a number of symbionts or pathogens of plants and animals, including a few human pathogens. UrtRs are found in plant-associated bacteria that may be either pathogenic, for example Erwinia sp. that cause soft-rot diseases, or symbiotic, for example Rhizobium sp. that generate nitrogen-fixing root nodules. Among plant-associated bacteria, Burkholderia sp. are also featured prominently; this genus includes both species that induce disease (e.g., B. cepacia that causes onion rot) and species with the potential to be beneficial (e.g., nitrogen fixing B. ambifaria). The human pathogens Vibrio parahaemolyticus and V. vulnificus are predicted to encode an UrtR that conserves all residues that constitute the urate-binding signature, whereas V. cholerae is not predicted to encode one. This may correlate with the ability of V. parahaemolyticus and V. vulnificus to colonize marine bivalves that respond to the invading microorganisms by a coordinated response that includes a respiratory burst.19–21 In contrast, the natural habitat of V. cholerae is the aquatic environment. Bacteria identified to be pathogenic to humans are medically important; many are opportunistic pathogens and several are capable of causing fatal infection. Xanthine oxidase activity has been reported to contribute to the generation of oxidative free radicals in macrophages in response to infection.22, 23 Bordetella pertussis, the causative agent of whooping cough, encodes an UrtR homolog. Further, Burkholderia mallei that causes glanders and the etiological agent of melioidosis, B. pseudomallei, both of which are classified as Category B Priority agents and as potential bioterrorism agents, also encode UrtR homologs.24, 25 Some bacteria encode more than one UrtR homolog, which may have resulted from a gene duplication event, and there are a few homologs that lack the N-terminal extension, which would otherwise be characteristic of UrtRs. The latter might be due to deletions that occurred during evolution or to misannotations in the databases; for example, B. ambifaria AMDD contains a characteristic UrtR while B. ambifaria MC40-6 homolog BamMC406_1599 has an N-terminal truncation that resulted from an insertion of an OrfB family transposase in the operator region.

Divergent genes under UrtR regulation

The vast majority of UrtR homologs were found to be part of a genomic locus that includes a divergently oriented gene. This mode of regulation is not unique to UrtR homologs and has been previously found to characterize a number of other MarR homologs. Binding of UrtR to the intergenic region would be expected to repress transcription of the divergent gene while simultaneously auto-regulating its own expression, with urate-binding to UrtR predicted to elicit de-repression. This mode of regulation was experimentally confirmed for both D. radiodurans HucR and A. tumefaciens PecS.5, 10 The vast majority of genes divergent from UrtR are annotated in sequence databases as membrane transporters related to PecM, permeases of the drug/metabolite transporter (DMT) superfamily (to which PecM belongs), DUF6 (domain of unknown function 6), a domain found twice in proteins annotated as PecM, or as major facilitator superfamily MFS_1 homologs. As described in D. dadantii, PecM is an efflux pump, which mediates the secretion of antioxidant species such as indigoidine to protect the pathogen from host defenses. PecM contains 10 membrane-spanning segments and may have arisen from an internal gene duplication event.26 In contrast, the MFS homologs, which may have also resulted from a gene duplication event, have 12 predicted transmembrane segments. Although most genes predicted to be under UrtR control belong in the PecM/DMT/DUF6 category and share homology to known PecM homologs, Burkholderia sp. are predicted to use their UrtR to regulate an MFS homolog, which bears resemblance to the multidrug efflux pump EmrD from E. coli.11 Notable exceptions include D. radiodurans HucR, which regulates a uricase, as discussed above. No other UrtR homologs were found in an equivalent genomic locus (including the closely related D. geothermalis) suggesting that regulation of urate homeostasis is unique to D. radiodurans. 
Another exception is found in Streptomyces sp., some of which use an UrtR homolog to regulate a divergent gene encoding a putative trans-aconitate methyltransferase (e.g., S. coelicolor SCO3133). Trans-aconitate inhibits aconitase, an inhibition prevented by its methylation by trans-aconitate methyltransferase.27 Aconitase is a bifunctional enzyme that in addition to its interconversion of citrate to isocitrate via cis-aconitate in the citric acid and glyoxylate cycles binds a number of mRNAs to regulate their translation. The latter occurs on disassembly of its [4Fe-4S] cluster under conditions of Fe starvation or oxidative stress. As apo-aconitase regulates translation of proteins involved in oxidative stress responses, it has been proposed to function as an oxidative stress sensor.28, 29 For example, E. coli AcnA activates synthesis of superoxide dismutase, and acn mutants are sensitive to oxidative stress. As the presence of exogenous urate would correlate with an oxidative burst, we speculate that the ability to upregulate trans-aconitate methyltransferase would remove an aconitase inhibitor and ensure full activity of aconitase and hence a more robust defense against reactive oxygen species.

Consensus UrtR binding sites in intergenic regions

A selection of sequences upstream of UrtR homologs were analyzed for the presence of binding sites similar to those identified for D. radiodurans HucR and A. tumefaciens PecS. It was found that many contain two or three dyadic sequences in the intergenic region to which the UrtR may preferentially bind. When three are present, a single binding site was found towards the urtR gene while closely spaced tandem (e.g., S. meliloti, S. medicae, and E. carotovora) or overlapping sequences (e.g., A. marina, P. mendocina, A. tumefaciens, and R. etli) were found towards the divergently encoded gene. In some instances, the sites extend into the coding region of the gene under regulation (e.g., in A. tumefaciens and B. ambifaria AMDD). Occupancy of both isolated and overlapping sites was experimentally observed by DNase I footprinting for A. tumefaciens PecS.10 Overlapping sites typically share 3 bp; this arrangement of cognate sites would place the centers of each palindrome 15 bp apart, predicting that two UrtR homologs bind on opposite faces of the DNA duplex (Figs. 4 and 5). For tandem, non-overlapping sites, the separation is variable, but typically not less than 9 bp.
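The geometric argument for overlapping sites can be checked with a short calculation. The Python sketch below takes the 18 bp palindrome length and the 3 bp overlap from the text; the helical repeat of 10.5 bp per turn for B-form DNA is an assumption that is not stated in this paper, and the function names are illustrative only.

```python
# Assumed helical repeat of B-form DNA (not stated in the source).
BP_PER_TURN = 10.5


def center_spacing(site_len: int, shared_bp: int) -> int:
    """Distance in bp between the centers of two equal-length overlapping sites."""
    return site_len - shared_bp


def angular_offset(spacing_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Rotation in degrees (0 to 360) between two points spacing_bp apart along the helix."""
    return (spacing_bp / bp_per_turn * 360.0) % 360.0


spacing = center_spacing(18, 3)    # 18 bp palindromes sharing 3 bp
offset = angular_offset(spacing)   # rotation between the two site centers
```

Under these assumptions the centers are 15 bp, or roughly 1.4 helical turns, apart, which places them about 154° around the duplex from each other. That is closer to opposite faces (180°) than to the same face (0°), in line with the prediction above.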

Figure 4.

Consensus UrtR binding site generated with selected intergenic sequences from UrtR regulatory regions. Top sequence was generated with 60 putative binding sequences whereas bottom panel represents the consensus sequence generated after removing overlapping sequences. WebLogo represents the relative frequency of base pairs at each position.

Figure 5.

Many UrtR homologs have multiple binding sites in the shared operator region. Overlapping sites generally share 3 bp, resulting in two cognate sites on opposite faces of the duplex.

The sequences were aligned and the 18 bp dyadic sequences were used to generate a frequency matrix (Tables S1 and S2). Aligned sequences were also used to generate a consensus sequence using WebLogo, the height of each base representing the relative frequency at each position (Fig. 4). Two separate consensus frequencies were calculated, one based on all sequences and the other in which the more divergent binding site for HucR and overlapping sequences were excluded. Considering only isolated, non-overlapping palindromes reveals that the central bases of each half-site are highly conserved but that the sequence is asymmetrical, with greater conservation within the upstream half-site (towards the urtR gene). The conservation of the central bases (C4T5T6 and A13A14G15) is consistent with the high sequence identity in the DNA recognition helix discussed above. The asymmetric nature of the binding site is intriguing, considering that the transcription factor is a homodimer. However, asymmetric conformational changes on DNA binding have been inferred for some MarR homologs, and the negative cooperativity of ligand binding characteristic of several homologs (including HucR and PecS) also implies non-equivalence of the two lobes.7, 9, 10, 30 Non-equivalence of each DNA half-site may contribute differently to DNA- or ligand-induced conformational changes and may potentially contribute to activator functions of the protein by promoting favorable interactions with RNA polymerase. Inclusion of overlapping sites in calculating a consensus sequence primarily results in reduced sequence conservation within the three upstream bases, which are shared with the adjacent palindrome, suggesting that simultaneous occupancy by two protomers imposes additional sequence preferences.5
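As an illustration of how a frequency matrix of this kind can be tabulated, the following minimal Python sketch counts bases at each position of a set of aligned, equal-length sites and reports the most frequent base per column. The three 18 bp sequences are hypothetical toy sites constructed to carry C4T5T6 and A13A14G15; they are not the binding sites analyzed here, and the code is not the procedure used to produce Tables S1 and S2.

```python
from collections import Counter

BASES = "ACGT"


def frequency_matrix(sites):
    """Return one dict of base counts per alignment column."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        matrix.append({base: counts.get(base, 0) for base in BASES})
    return matrix


def consensus(matrix):
    """Most frequent base per column; ties resolve in A, C, G, T order."""
    return "".join(max(BASES, key=lambda base: col[base]) for col in matrix)


# Hypothetical toy sites (18 bp each), for illustration only.
toy_sites = [
    "TATCTTGATATCAAGATA",
    "GATCTTGCATGCAAGATC",
    "TTACTTAATATTAAGTAA",
]

toy_matrix = frequency_matrix(toy_sites)
toy_consensus = consensus(toy_matrix)
```

Dividing each count by the number of sites gives the relative frequencies that a sequence logo displays as letter heights.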

The UrtR regulon

Using the above described frequency matrix, a weight matrix that takes into account the genomic G+C-content was generated for D. dadantii and A. tumefaciens using genome scale Patser.31 The search resulted in 74 potential binding sites in D. dadantii and 81 in A. tumefaciens. As expected, both position weight matrices identified the binding sites in the intergenic region between the urtR (pecS) and pecM genes with the highest score. For D. dadantii, the search also identified cellulase (YP_002987500.1), which contributes to the soft-rot phenotype on D. dadantii infection of the host. Although sites resembling the consensus sequence were not found upstream of pectate lyase genes, two genes that may be useful in utilizing the byproducts of host tissue maceration were identified as potentially regulated by UrtR. Oligogalacturonate-specific porin (YP_002987550.1) imports pectin catabolites to be utilized as a carbon source and was previously reported to be regulated by PecS in D. dadantii, while Mandelate racemase/muconate lactonizing protein (YP_002988077.1) was identified as an enzyme involved in catabolism of aromatic acids, which can be derived from the breakdown of lignin.32, 33

Potential binding sites were also predicted upstream of a number of virulence genes, of which most are membrane-associated proteins such as YP_002987993.1 and YP_002989586.1, the latter annotated as Curli production assembly/transport component CsgG. Curli production has been reported in several bacterial pathogens, where curli aid in association with a number of host cell surface proteins and in biofilm formation.34, 35 Flagellar associated proteins are required in pathogenic phases for the motility of the bacterium towards the host and for their proper orientation. The Patser search identified Flagellin chaperone FliS, which is vital for the proper assembly of the flagellum, and flagellar transcriptional activator flhC as having potential UrtR binding sites in their operator regions.36, 37 Similarly, a search of the A. tumefaciens genome finds two flagellar associated proteins, NP_531246.1 and flaB.38 Indigoidine biosynthesis genes were not identified under the lower threshold estimation of 9, but further lowering of the threshold estimation did identify these genes, suggesting the UrtR binding sequences can be highly degenerate, as previously reported for Erwinia PecS.39 In both organisms, a large number of transcriptional regulators were predicted to contain UrtR binding sites. This result suggests that the control of gene expression through UrtR involves an even higher level of complexity.


Both in vivo gene expression studies and in vitro analyses of DNA binding have documented that urate is a ligand for D. radiodurans-encoded HucR, A. tumefaciens PecS, and B. thailandensis MftR (major facilitator transport regulator).5, 10, 11 What these proteins have in common is the conservation of four residues shown to contribute to ligand binding or to communicating occupancy of the ligand-binding pocket to the DNA recognition helices. Our data indicate that a subset of MarR homologs exists that conserves these amino acids, whereas homologs that bind other ligands or respond to oxidative stress fail to conserve these residues. Although HucR is an outlier in terms of the gene under its regulation (and in responding to and regulating endogenous urate levels),5 A. tumefaciens PecS represents the vast majority of the identified transcription factors in regulating a membrane transporter of the PecM family.10 Notably, we have documented experimentally that urate elicits transcriptional de-repression in vivo, as evidenced by increased levels of PecS and PecM transcripts.10 Other UrtR homologs predicted to respond to exogenous urate levels include Burkholderia MftR that regulates a member of the MFS family of membrane transporters11 and a few homologs in Streptomyces sp. predicted to regulate a trans-aconitate methyltransferase, also potentially implicated in oxidative stress responses. Interestingly, UrtR homologs were found in species associating with living hosts of which many are medically and economically important as causative agents of diseases in crops, livestock, and humans. This is consistent with the notion that UrtR responds to urate produced by a host organism as part of an oxidative burst designed to produce ROS as a defense against the invading bacteria.

UrtR homologs are generally divergently encoded from a membrane transporter gene. Most of these are homologs of E. chrysanthemi PecM, which is an efflux pump that secretes the antioxidant indigoidine, important for protecting the pathogen from host defenses. Many operator sequences between urtR and the divergent gene have multiple UrtR binding sites with the closely spaced tandem or overlapping sites invariably near the gene encoding the transporter, which may result in differential gene expression in response to urate. Taken together, these findings suggest the existence of a distinct subfamily of MarR homologs with a characteristic urate-binding signature.

Materials and Methods

A BLAST search was performed using Dickeya dadantii (Erwinia chrysanthemi) encoded PecS (Uniprot ID: P42195) as the search query. Non-redundant protein sequences above 40% sequence identity were retrieved from Uniprot and NCBI servers and manually filtered to remove genes from multiple variants of a single species and from species not identified beyond their genera. Retrieved sequences were aligned using the MUSCLE sequence alignment server.40 The D. radiodurans encoded HucR sequence was included in the alignment to trace secondary structure elements. A number of E. coli MarR homologs were also included to illustrate the absence of identical residues and the N-terminal extension seen in other UrtR homologs. Residues were shaded according to their identity and similarity using BOXSHADE v3.21.

The evolutionary history was inferred by the Neighbor-Joining method using pre-aligned sequences with MEGA4, where the evolutionary distances were computed using the Poisson correction method and are in the units of the number of amino acid substitutions per site.41–43 The bootstrap consensus tree was inferred from 500 replicates and the tree is drawn to scale.44
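For readers unfamiliar with the Poisson correction, the distance between two aligned protein sequences is commonly written as d = -ln(1 - p), where p is the proportion of aligned sites that differ. The minimal Python sketch below computes it for a pair of sequences; skipping columns in which either sequence has a gap (pairwise deletion) is an assumption made here for illustration and does not necessarily reflect the MEGA4 settings used in this study. The toy sequences are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(p: float) -> float:
    """Poisson-corrected number of amino acid substitutions per site."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)


# Hypothetical toy alignment: one difference in four comparable columns.
toy_p = p_distance("MKTA", "MKSA")
toy_d = poisson_distance(toy_p)
```

The correction matters most for divergent pairs: as p grows, d increasingly exceeds p because multiple substitutions at the same site are accounted for.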

On the basis of previous findings, 60 random UrtR operator regions were analyzed in order to find dyad sequences to which UrtR may bind. The search identified 18 bp palindromic sequences, which were used to generate the sequence logo in which the frequency of each nucleotide at a particular position is represented by the height of the corresponding symbol.45 Calculated nucleotide frequencies at each position were used to create a weight matrix to find probable alternate binding sites in several genomes using genomic-scale Patser.31 The lower threshold estimation was raised to 9 and the genomic GC content (D. dadantii = 55% and A. tumefaciens = 59%) was considered in calculating the weight matrix.
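The general idea of a GC-corrected weight matrix search can be sketched in Python as follows. This is a generic log-odds scheme with a pseudocount, written under stated assumptions; it is not Patser's implementation and will not reproduce its scores, so the threshold of 9 used above does not carry over to it. The 4-position count matrix, the scanned sequence, and the threshold of 4.0 are hypothetical toy values, and only the given strand is scanned.

```python
import math

BASES = "ACGT"


def background(gc_fraction: float) -> dict:
    """Background base frequencies derived from genomic G+C content."""
    at = (1.0 - gc_fraction) / 2.0
    gc = gc_fraction / 2.0
    return {"A": at, "C": gc, "G": gc, "T": at}


def weight_matrix(counts, gc_fraction: float, pseudocount: float = 1.0):
    """Natural-log odds of each base at each position relative to background."""
    bg = background(gc_fraction)
    weights = []
    for col in counts:
        total = sum(col[b] for b in BASES) + pseudocount
        weights.append(
            {b: math.log((col[b] + pseudocount * bg[b]) / total / bg[b]) for b in BASES}
        )
    return weights


def scan(sequence: str, weights, threshold: float):
    """Return (start, window, score) for every window scoring at or above threshold."""
    width = len(weights)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows with ambiguous bases
        score = sum(weights[i][ch] for i, ch in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical toy counts for a 4 bp motif observed in 10 sites.
toy_counts = [
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
]
toy_weights = weight_matrix(toy_counts, gc_fraction=0.55)  # 55% G+C, as for D. dadantii
toy_hits = scan("AACTTGAA", toy_weights, threshold=4.0)
```

Because the background depends on G+C content, the same count matrix yields different weights for D. dadantii (55%) and A. tumefaciens (59%): a match to an A or T is slightly more informative in the more GC-rich genome.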


Support from the National Science Foundation (MCB-0744240 to A.G.) is gratefully acknowledged.