Substrate specificity characterization for eight putative Nudix hydrolases. Evaluation of criteria for substrate identification within the Nudix family

ABSTRACT
The nearly 50,000 known Nudix proteins have a diverse array of functions, of which the most extensively studied is the catalyzed hydrolysis of aberrant nucleotide triphosphates. The functions of 171 Nudix proteins have been characterized to some degree, although physiological relevance of the assayed activities has not always been conclusively demonstrated. We investigated substrate specificity for eight structurally characterized Nudix proteins whose functions were unknown. These proteins were screened for hydrolase activity against a 74-compound library of known Nudix enzyme substrates. We found substrates for four enzymes with kcat/Km values >10,000 M⁻¹ s⁻¹: Q92EH0_LISIN of Listeria innocua serovar 6a against ADP-ribose, Q5LBB1_BACFN of Bacteroides fragilis against 5-Me-CTP, and Q0TTC5_CLOP1 and Q0TS82_CLOP1 of Clostridium perfringens against 8-oxo-dATP and 3'-dGTP, respectively. To ascertain whether these identified substrates were physiologically relevant, we surveyed all reported Nudix hydrolytic activities against NTPs. Twenty-two Nudix enzymes are reported to have activity against canonical NTPs. With a single exception, we find that the reported kcat/Km values exhibited against these canonical substrates are well under 10⁵ M⁻¹ s⁻¹. By contrast, several Nudix enzymes show much larger kcat/Km values (in the range of 10⁵ to >10⁷ M⁻¹ s⁻¹) against noncanonical NTPs. We therefore conclude that hydrolytic activities exhibited by these enzymes against canonical NTPs are not likely their physiological function, but rather the result of unavoidable collateral damage occasioned by the enzymes' inability to distinguish completely between similar substrate structures. Proteins 2016; 84:1810-1822. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.


INTRODUCTION
The Nudix protein superfamily is vast and diverse. 1,2 It comprises about 50,000 members in the Nudix clan (Pfam ID: CL0261) of the Pfam database (version 27.0), 3,4 and all members share a characteristic 130-amino-acid beta-grasp domain architecture 5 classified as the Nudix fold (SCOPe v2.03 SUNID 55810, SCCSID d.113; 6,7). Many Nudix proteins are pyrophosphohydrolases that catalyze the hydrolysis of nucleoside diphosphates linked to some other moiety, X. 1 These Nudix hydrolases are typically characterized by a sequence of 23 amino acids, the Nudix box (Gx5Ex7REUxEExGU), where U can be Ile, Leu, or Val, and x represents any amino acid. The common structural theme amongst Nudix hydrolases is that the active site contains a magnesium binding site that serves to recognize the pyrophosphate linkage common to all Nudix substrates. 4 In the Pfam database, non-hydrolase proteins are also classified under the Nudix clan. For these proteins, one or a few conserved residues of the 23-amino-acid Nudix box vary, but they still share the Nudix fold described above.
Although Nudix proteins were originally proposed to catalyze reactions that "sanitize" the nucleotide pool, and thus act as "housecleaning" enzymes, 1 they are now known to exhibit a wide range of activities apparently delimited only by the common recognition of a pyrophosphate bond of the substrate. As of July 2013, 171 Nudix proteins had been experimentally characterized (J. R. Srouji et al. Submitted). These serve a variety of cellular functions, including messenger RNA decapping, 8 alternative mRNA polyadenylation, 9 3'→5' RNA exonuclease activity, 10 isopentenyl pyrophosphate isomerization, 11 ADP-ribose-responding calcium channel gating, 12 ADP-ribose-responding transcriptional regulation, 13 SIRT1 or related deacetylase regulation, 14,15 and hydrolysis of a large group of nucleoside diphosphate derivatives. 16 Specifically, a large subset of enzymes in this family hydrolyzes potentially mutagenic NTPs, such as 8-oxo-dGTP, 17 2-OH-dATP, 18 and 5-methyl-UTP. 19 Although sequences for about 50,000 Nudix family genes are available, X-ray or NMR structures of only 78 Nudix proteins are found in the PDB (as of February 1, 2013). 20 Some experimental characterization data are available for two-thirds of these 78 proteins, but the identities of the true physiological substrates for many of them are uncertain. The eight proteins selected for functional characterization in this study were chosen because: (1) X-ray structures were available; (2) none had been characterized experimentally; and (3) they fall into well-separated clades of the Nudix family tree, based on sequence and structure analysis (J. R. Srouji et al. Submitted). Thus, substrate identification for these enzymes would be expected to provide a resource that would enhance computational functional annotation of additional Nudix genes.
A screening library of 74 chemicals was assembled. Initially, 63 of these were divided into 11 groups that were screened as mixtures, and the remaining 11 chemicals were screened individually. Chemicals from the most active groups were then selected and screened individually, and the most reactive compounds were assayed in detail to determine the kinetic parameters. The medium-throughput assays were carried out with a phosphate sensor fluorescence assay. 21 Finally, many of the previously characterized enzymes, as well as some of those investigated here, were found to exhibit catalytic activity on a variety of substrates with disparate values of kinetic constants. Thus, identifying the true physiological substrates can often be a challenge. In the Discussion section, we suggest a series of criteria to facilitate this process.

MATERIALS AND METHODS
Materials
One of the Nudix proteins characterized in this study, B9WTJ0_STRSU, was encoded in a plasmid constructed by the Joint Center for Structural Genomics. The other seven were encoded in plasmids supplied by the New York SGX Research Center for Structural Genomics. All eight were purchased from the PSI:Biology-Materials Repository using the Clone IDs provided in Table I  (Pittsburgh, PA), and MP Biomedicals (Santa Ana, CA). MDCC, common biochemicals and enzymes were from Sigma-Aldrich (St. Louis, MO). Fast Protein Liquid Chromatography (FPLC) was performed on a BioLogic DuoFlow 10 workstation (from Bio-Rad, Hercules, CA). The HisTrap HP liquid chromatography column was supplied by GE Healthcare (Piscataway, NJ). Pi-sensor assays were performed on either a FluoroMax-4 spectrofluorometer (from HORIBA Jobin Yvon, Edison, NJ) or a GENios microplate reader (from Tecan, Switzerland).

Nudix protein purification
The eight enzymes investigated here were purified with a His-Tag protein purification protocol. The vector harboring Q8PYE2_METMA fused with the C-terminal 6-His tag was extracted from the storage strain of the PSI:Biology-Materials Repository (kanamycin resistant, grown in LB medium) using the standard protocol of the Gene Jet Plasmid Miniprep Kit. The plasmid was transformed into BL21(DE3) cells.
Q8PYE2_METMA was expressed and purified as described by Harris et al. 23 with the following changes: The BL21(DE3) cells bearing the pSGX3-Q8PYE2_METMA plasmid were grown overnight in LB at 37 °C, diluted with LB (1:50) to 4 L, and grown at 37 °C for about 2.5 h until A600 was between 0.5 and 1.0. IPTG was added to a final concentration of 0.5 mM to induce protein production for 2 h. The cells were harvested by centrifugation at 4500 × g, at 4 °C for 20 min. The cell pellets were stored at −80 °C. Frozen cell pellets corresponding to 0.5 L of cell culture were lysed with BugBuster reagents as described in the kit protocol. Cell debris was removed by centrifugation at 5000 × g, at 4 °C for 20 min.
The cell lysate was filtered through a 0.22 μm PES syringe membrane, and the filtrate applied to a HisTrap HP column equilibrated with 50 mM Tris-HCl buffer, pH 7.6, containing 10 mM imidazole. Q8PYE2_METMA was eluted with a 0-100% gradient of buffer containing 500 mM NaCl and 500 mM imidazole. Q8PYE2_METMA fractions were combined and concentrated to <500 μL by Amicon filtration. The final preparation of Q8PYE2_METMA was >95% pure as judged from SDS-PAGE. These fractions were combined, concentrated to 100 μM, divided into 20-μL aliquots, and stored at −80 °C.

Enzyme assays
The Pi-sensor kinetic assays were performed on a GENios microplate reader (reaction volume = 100 μL) for initial screening and with a FluoroMax-4 spectrofluorometer (reaction volume = 500 μL) for accurate determination of the kinetic parameters. The standard reaction mixture contained: 10 mM Tris-HCl, pH 7.6, 1 mM MgCl2, 5-10 μM PBP-MDCC (depending on the concentration of background phosphate introduced by substrate impurities), and either 0.05 U/mL of yeast pyrophosphatase (PPase), where pyrophosphate was one of the products, or 1 U/mL of alkaline phosphatase (APase), where a nucleoside monophosphate was a Nudix enzyme product. Experiments were done in all cases to verify that sufficient coupling enzyme was present to ensure that the rates of reaction were linearly dependent on the concentration of the Nudix hydrolase. Nudix enzyme concentrations ranged from 1 to 100 nM. The mixtures were incubated at 37 °C and monitored continuously for 30 min on the microplate reader or for 5 min on the spectrofluorometer (GENios: λex 425 nm, λem 465 nm, gain 50, 100 cycles, 37 °C; FluoroMax-4: λex 430 nm, slit width 1-2.5 nm, λem 465 nm, slit width 1-5 nm, 37 °C).
A screening library of 74 commercially available putative substrates was assembled primarily from compounds that had been shown to be active with one or more Nudix hydrolases (J. R. Srouji et al. Submitted). Chemicals were grouped initially by structural similarity and by considerations of the necessity and choice of either alkaline phosphatase or inorganic pyrophosphatase as a coupling enzyme in the Pi-sensor assay. 21 Typically the 74 substrates were screened in about 22 wells of the microplate. Sixty-three of the substrates were assembled into 11 groups; 11 others were assayed individually because of their high free phosphate background (Fig. 1, left brackets and numbers).
Figure 1. Substrate specificity screening of eight putative Nudix hydrolases against a 74-compound library by Pi-sensor assay. Approximate kcat/Km values (M⁻¹ s⁻¹) are reported with error bars. Each reaction was carried out at pH 7.6 and 37 °C with each substrate at 5 μM. Enzyme concentrations varied from 1 to 100 nM. Sixty-three of the potential substrates were mixed into 11 groups and screened as indicated by the numbers in the left brackets. The eleven substrates shown in the "ungrouped" bracket were initially screened individually in one plate. The kinetic values for compounds that were assayed only in the specified group are reported with the grouped activity, which thus represents the upper limits for each component substrate (white bars). Substrates that were assayed individually are reported with mean (gray bars) and standard errors if tested more than three times. The X axes are scaled linearly and are different for each enzyme.
Each substrate concentration was 5 μM in both grouped and individual screenings. The compounds from individual screening that showed significant activity over background (600 RFU above background) were assayed from 0 to 20 μM, with enzyme concentrations varied from 1 to 100 nM. The kinetics were typically evaluated for each substrate concentration three or more times, with enzyme concentrations varying from 0.26 to 52 nM, to determine the values of the Michaelis-Menten parameters.

Data processing
The regression calculations below were performed with scripts written in R. 24 Pi-sensor standard curves for normalizing fluorescence readings were obtained by titrating inorganic phosphate under the same conditions as were used in the experimental reactions. Fluorescence readings were plotted against reaction time, and the slopes of the linear regions of the plots (RFU/s) yielded initial velocities, vi (μM Pi/s). Plots of vi/[E0] against substrate concentration were fit to the Michaelis-Menten equation:
$$\frac{v_i}{[E_0]} = \frac{k_{cat}\,[S]}{K_m + [S]}$$
by nonlinear regression to yield values of kcat, Km and kcat/Km. For some reactions, nonlinear regression failed to converge, or the standard errors for kcat and Km were comparable to the fitted values themselves; in these cases, only the kcat/Km ratio was obtained by linear regression with a fixed intercept of zero:
$$\frac{v_i}{[E_0]} = \frac{k_{cat}}{K_m}\,[S]$$
Two equivalents of inorganic phosphate are ultimately produced for those reactions that initially yield pyrophosphate in the presence of inorganic pyrophosphatase. The values of vi were corrected for this factor. Data from multiple trials were fit jointly to the same equation to yield weighted average values of the kinetic parameters.
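The data-processing steps described above can be illustrated with a short sketch. The authors' scripts were written in R and are not reproduced here; the Python below is only an assumed stand-in. All function and parameter names are hypothetical, and the Michaelis-Menten fit uses a golden-section search over Km (with the best kcat computed in closed form for each trial Km) rather than whatever nonlinear regression routine was actually used. Substrate concentrations are taken to be in μM and rates to be vi/[E0] in s⁻¹.

```python
import math


def initial_rate(rfu_per_s, rfu_per_uM_pi, pi_per_turnover=1):
    """Convert a fluorescence slope (RFU/s) into substrate turnover (uM/s).

    rfu_per_uM_pi is the slope of the phosphate standard curve. Use
    pi_per_turnover=2 when pyrophosphatase converts each PPi into two Pi.
    """
    return rfu_per_s / rfu_per_uM_pi / pi_per_turnover


def fit_kcat_over_km(s_uM, rate_per_s):
    """Zero-intercept least squares for rate = (kcat/Km)*[S]; M^-1 s^-1."""
    slope = sum(s * r for s, r in zip(s_uM, rate_per_s)) / sum(s * s for s in s_uM)
    return slope * 1e6  # uM^-1 s^-1 -> M^-1 s^-1


def fit_michaelis_menten(s_uM, rate_per_s, km_bounds=(1e-3, 1e3)):
    """Least-squares fit of rate = kcat*[S]/(Km+[S]).

    Returns (kcat in s^-1, Km in uM, kcat/Km in M^-1 s^-1).
    km_bounds is an assumed search window for Km in uM.
    """
    def best_kcat(km):
        # For a fixed Km the model is linear in kcat.
        f = [s / (km + s) for s in s_uM]
        return sum(r * fi for r, fi in zip(rate_per_s, f)) / sum(fi * fi for fi in f)

    def sse(log_km):
        km = math.exp(log_km)
        kcat = best_kcat(km)
        return sum((r - kcat * s / (km + s)) ** 2 for s, r in zip(s_uM, rate_per_s))

    # Golden-section search for the Km that minimizes the squared error.
    lo, hi = math.log(km_bounds[0]), math.log(km_bounds[1])
    ratio = (math.sqrt(5) - 1) / 2
    for _ in range(200):
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if sse(left) < sse(right):
            hi = right
        else:
            lo = left
    km = math.exp((lo + hi) / 2)
    kcat = best_kcat(km)
    return kcat, km, kcat / km * 1e6


# Synthetic, noise-free example (not data from the article).
substrate = [1, 2, 5, 10, 20]                      # uM
rates = [10 * s / (5 + s) for s in substrate]      # s^-1, kcat = 10, Km = 5 uM
kcat, km, efficiency = fit_michaelis_menten(substrate, rates)
print(kcat, km, efficiency)
```

The zero-intercept fit is the fallback for data that stay in the linear regime ([S] well below Km), where kcat and Km cannot be determined separately but their ratio can.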

RESULTS
Functionally characterized Nudix proteins
Enzymes Q0TTC5_CLOP1 and Q0TS82_CLOP1 are from Clostridium perfringens (strain ATCC 13124/NCTC 8237/Type A), a Gram-positive, spore-forming, obligate anaerobic bacterium. Bacterial alpha toxin produced by C. perfringens is responsible for histotoxic infections, such as gas gangrene. There are 13 putative Nudix proteins in C. perfringens strain ATCC 13124, as annotated by UniProt (release 2013_12), 25 none of which had been functionally characterized previously. Nudix proteins have been shown to facilitate pathogenicity in the host 26 as well as to enhance virulence of the pathogen. 27 Enzyme Q92EH0_LISIN is from Listeria innocua (strain CLIP 11262), a Gram-positive, non-spore-forming bacillus, which is a facultative anaerobe. L. innocua is ubiquitous because it can survive in extreme pH and temperature. 28 It is important because it is very similar to the food-borne pathogen Listeria monocytogenes, but is non-pathogenic. In UniProt release 2013_12, 25 none of the nine putative Nudix proteins in L. innocua (strain CLIP 11262) is reported as having been functionally characterized.
Enzyme Q5LBB1_BACFN is from Bacteroides fragilis (strain ATCC 25285/NCTC 9343). Bacteroides species are Gram-negative obligate gut anaerobes. B. fragilis is the most frequent isolate from clinical specimens, and is regarded as the most virulent Bacteroides species. 29 Eight genes from B. fragilis strain ATCC 25285 are annotated as coding for putative Nudix proteins by UniProt release 2013_12. 25 There are no experimental functional characterization data for any of them.
Enzyme A0ZZM4_BIFAA is from Bifidobacterium adolescentis, a Gram-positive organism that is non-motile and often observed in a Y-shaped form. The bacteria colonize human and animal intestinal tracts. 30 Enzyme B9WTJ0_STRSU is from Streptococcus suis, a Gram-positive bacterium. It is a pathogen of pigs and is also a causative agent for zoonotic disease. 31 Enzyme Q9K704_BACHD is from Bacillus halodurans, a rod-shaped Gram-positive, spore-forming soil bacterium that can survive in alkaline environments. 32,33 Enzyme Q8PYE2_METMA is from Methanosarcina mazei, a methane-producing archaeon. M. mazei is a freshwater organism that can adapt to grow at elevated salinities. 34

Initial substrate screening
Figure 1 shows the results from substrate screening of 74 compounds for eight potential Nudix hydrolases, in the presence of the appropriate secondary enzyme, namely PPase or APase. Approximate kcat/Km values (M⁻¹ s⁻¹) are reported with error bars when applicable. The substrate screening results presented here emerged from a two-step strategy. First, the substrates were grouped by structural similarity and screened in groups. Second, those substrates from the most reactive group(s) were separated and screened individually. The substrate concentrations for each compound, both in the grouped mixtures and in the individual screenings, were 5 μM. Sixty-three compounds were initially divided into 11 groups, and screened in the mixtures. The mean kcat/Km value of each group is represented by a white bar. Compounds that passed the initial screening in groups were assayed individually. Those activities are shown in grey bars. Compounds were not assayed individually in cases where the grouped activities were low (for example, Q0TS82_CLOP1 and group 11). Eleven compounds could not be grouped because of their high phosphate background, and were screened individually. Those activities are also depicted by grey bars.
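The two-step strategy just described (pooled screening followed by individual assays of the members of active groups) can be sketched as follows. This is an illustration only: the function name, the callables standing in for the plate-reader assays, the follow-up cutoff and the toy activity values are all assumptions introduced here, and the text does not state a numerical rule for deciding which groups were deconvoluted.

```python
def two_step_screen(groups, assay_mixture, assay_single, follow_up_cutoff):
    """Screen pooled groups first, then re-assay members of active groups.

    groups: dict mapping a group label to a list of compound names.
    assay_mixture / assay_single: callables returning an apparent kcat/Km.
    follow_up_cutoff: hypothetical threshold for deconvoluting a group.
    Returns {compound: (kind, value)}, kind being "measured" or "upper_limit".
    """
    report = {}
    for label, compounds in groups.items():
        pooled = assay_mixture(compounds)
        for compound in compounds:
            if pooled >= follow_up_cutoff:
                report[compound] = ("measured", assay_single(compound))
            else:
                # Only the pooled value exists; it is reported as an upper
                # limit for every member of the group.
                report[compound] = ("upper_limit", pooled)
    return report


# Toy activities in M^-1 s^-1 (invented for illustration, not measurements).
toy = {"Ap5U": 2500.0, "Gp5G": 300.0, "dGTP": 40.0, "GTP": 10.0}
groups = {"group A": ["Ap5U", "Gp5G"], "group B": ["dGTP", "GTP"]}
report = two_step_screen(
    groups,
    assay_mixture=lambda members: sum(toy[c] for c in members),
    assay_single=lambda c: toy[c],
    follow_up_cutoff=1000.0,
)
for compound, (kind, value) in report.items():
    print(compound, kind, value)
```

Pooling trades resolution for throughput: an inactive group is dismissed with one well, at the cost of knowing only an upper bound for each of its members.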
High values for the specificity constant (kcat/Km > 10,000 M⁻¹ s⁻¹) were found for Q0TTC5_CLOP1, Q0TS82_CLOP1, Q92EH0_LISIN and Q5LBB1_BACFN with at least one substrate, and these reactions were characterized extensively (see below).
Q8PYE2_METMA exhibits moderate activity toward dinucleotide polyphosphates, where at least one of the bases is adenine (Fig. 1, Q8PYE2_METMA, groups 3 and 4), with kcat/Km values of ca. 5,000 M⁻¹ s⁻¹; however, Q8PYE2_METMA shows no preference with respect to the structure of one of the two bases. This is consistent with our previous report, 21 where Q8PYE2_METMA was shown to have moderate activities toward Ap3A, Ap4A, and Ap5A, but none with 8-oxo-dGTP.
B9WTJ0_STRSU catalyzes the hydrolysis of a variety of dinucleotide polyphosphates such as Ap5U and Gp5G. The kcat/Km value for Ap5U is 10-fold greater than that found for the other tested substrates of this group; however, all of their kcat/Km values are <3000 M⁻¹ s⁻¹.
No significant activity was found for A0ZZM4_BIFAA or Q9K704_BACHD against any substrate tested, as all kcat/Km values are <1000 M⁻¹ s⁻¹.
Theoretically, the apparent kcat/Km value of grouped activity should be equal to the sum of the kcat/Km values from individual screening of the same group. This is, however, generally not true for the data presented in Figure 1. Part of the inconsistency can be explained by the large errors that are inherent in compound library screening exercises. Specific examples include the reactions of Q0TTC5_CLOP1 with 8-oxo-dATP, and of Q8PYE2_METMA with Ap5U. Further, the microplate reader used here has lower precision than does the cuvette spectrofluorometer. Finally, some compounds in a grouped mixture might act as nonhydrolyzable substrate analogs and thus behave as competitive inhibitors for an otherwise active substrate.
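The additivity expectation, and the way a competitive inhibitor breaks it, follow from standard competing-substrate kinetics. The sketch below assumes that every compound i in a group obeys Michaelis-Menten kinetics at the same active site and that all substrates are present at the same concentration [S] (5 μM here), well below their Km values. The symbols n, I and K_I are introduced here for illustration and do not appear in the article.

```latex
\begin{align}
\frac{v_{\mathrm{group}}}{[E_0]}
  &= \frac{\sum_{i=1}^{n} \left(k_{cat}/K_m\right)_i [S_i]}
          {1 + \sum_{j=1}^{n} [S_j]/K_{m,j}} \\
\left(\frac{k_{cat}}{K_m}\right)_{\mathrm{app}}
  \equiv \frac{v_{\mathrm{group}}}{[E_0][S]}
  &\approx \sum_{i=1}^{n} \left(\frac{k_{cat}}{K_m}\right)_i
  \qquad ([S_j] \ll K_{m,j}) \\
\left(\frac{k_{cat}}{K_m}\right)_{\mathrm{app}}
  &\approx \frac{\sum_{i=1}^{n} \left(k_{cat}/K_m\right)_i}{1 + [I]/K_I}
  \qquad \text{(competitive inhibitor } I \text{ present in the mixture)}
\end{align}
```

The last line shows how a tightly bound, nonhydrolyzable analog in the mixture can pull the grouped value below the sum of the individually measured values.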

Michaelis-Menten parameters for the most active substrates
The enzyme-substrate pairs with approximate kcat/Km > 10,000 M⁻¹ s⁻¹ as identified from screening were further characterized to determine the kinetic parameters more precisely. Figure 2 shows the results for the reactions of Q0TTC5_CLOP1, Q0TS82_CLOP1, Q92EH0_LISIN, and Q5LBB1_BACFN with their most reactive substrates. The kinetic parameters and their standard errors are given in Table II.

Q0TTC5_CLOP1
The most reactive substrate tested for Q0TTC5_CLOP1 is 8-oxo-dATP with kcat/Km = (2.8 ± 0.7) × 10⁶ M⁻¹ s⁻¹. This value approaches the diffusion-controlled limit (see Discussion). The next most reactive substrates are, in order: 8-oxo-GTP, which is about one-third as reactive, followed by 8-oxo-dGTP, dGTP, dATP, and GTP. Based on considerations elaborated in the Discussion section, it is tentatively concluded that 8-oxo-dATP and 8-oxo-GTP are the target substrates for this housekeeping enzyme. Q0TTC5_CLOP1 discriminates variously between the ribose and deoxyribose moieties of the substrate; for example, the kcat/Km ratio for 8-oxo-GTP versus 8-oxo-dGTP is 3, but it is 100 for dGTP versus GTP. Comparison of the vi/[E0] values in the absence of PPase (data not shown) indicated that Q0TTC5_CLOP1 cleaves the substrates mainly at the α-β pyrophosphate bond.

Q92EH0_LISIN
Q92EH0_LISIN is most reactive toward ADP-ribose in the presence of APase, with a kcat/Km value of (1.85 ± 0.08) × 10⁶ M⁻¹ s⁻¹; ADP-glucose and cADP-ribose are 30% and about 13% as reactive, respectively.

Q5LBB1_BACFN
Q5LBB1_BACFN efficiently catalyzes the hydrolysis of 5-substituted cytidine nucleotide triphosphates, that is, 5-Me-dCTP, 5-MeOH-dCTP, and 5-OH-dCTP (Fig. 2, Q5LBB1_BACFN). This enzyme does not discriminate between the 5-Me (kcat/Km = (4.
Some of the kinetic parameter determinations presented have large standard errors, especially those for Q5LBB1_BACFN (36% for 5-Me-CTP). We performed control experiments to maximize the reproducibility of the assay, including diluting the enzyme stock with different protocols, repeating the experiment with different batches of enzymes, washing the cuvette extensively with nitric acid, stirring the reaction solution with magnetic bars, and so forth. In total, we repeated the measurement of 5-Me-CTP activity on 6 different days and that of 5-Me-dCTP on 14 different days. However, large standard errors were found on each of the different days, and when using each of the different approaches above. Therefore, the large error bars (Fig. 2, Q5LBB1_BACFN), which represent the entirety of the experiments, were not due to any of the factors that were considered.
In summary, a total of eight putative Nudix hydrolases were screened for activity against our library of 74 demonstrated substrates for this group of enzymes. Three of the enzymes were found to exhibit kcat/Km values of >10⁶ M⁻¹ s⁻¹ for either ADP-ribose or a noncanonical NTP, and, by the criterion presented in the Discussion, can be reasonably assigned the designated physiological function. The highest activity for the fourth enzyme is hydrolysis of the noncanonical 3'-dGTP, but the kcat/Km value of 1.6 × 10⁴ M⁻¹ s⁻¹ is too low to allow a confident assignment of this activity to this enzyme.
To explore whether protein structure might help the assignment of function for the eight newly assayed enzymes, we studied the structures of the proteins considered herein and their similarities to other structurally characterized Nudix proteins. The results were unrevealing.

DISCUSSION
Principles to identify the physiological substrates for Nudix enzymes
Historically, enzymes were usually identified by pursuing a predefined assay to the point of highest specific activity as the enzyme was purified in stages. The homogeneity of the purified protein was usually ascertained by the available technology, and a limited range of alternate substrates was sometimes investigated. It is now recognized that many enzymes have more than a single activity. 35,36 The results presented in this article report an exploration of possible substrates for eight putative Nudix hydrolases. However, the mere observation of significant catalytic activity with a given substrate does not necessarily serve to define it as the physiological target. Here we propose a set of criteria to help to achieve such target identification for Nudix enzymes. We argue that a diffusion-controlled kcat/Km value (ca. >10⁶ M⁻¹ s⁻¹) is usually definitive. When kcat/Km is much less than this figure, genetic evidence and, to a lesser extent, genomic methods (see below), may provide conclusive evidence for an assignment.
Figure 2. Kinetic characterization of four Nudix hydrolases against their most reactive substrates. The highly active substrates were identified in the initial plate reader screen. The reaction rates were monitored spectrofluorometrically in the Pi-sensor assay. The substrates specified in the legends are sorted by decreasing values of kcat/Km. Error bars are shown where replicate determinations were carried out. All assays were performed at pH 7.6 and 37 °C, with enzyme concentrations varying from 0.26 to 52 nM. The coupling enzymes employed are listed in Table II. The fitted curves were calculated by either linear or nonlinear regression, as specified in the methods section.
Although many Nudix hydrolases are reported to have activities for canonical nucleoside triphosphates (Table III), only one of these activities exhibits a kcat/Km value >10⁵ M⁻¹ s⁻¹, and almost all the others have values of <10⁴ M⁻¹ s⁻¹ (Table III). Furthermore, many enzymes capable of hydrolyzing canonical NTPs show higher activities to noncanonical NTPs with similar structures. We therefore conclude that the apparent activities against canonical NTPs likely represent collateral damage. This criterion could potentially be applied generally when assigning physiological functions to other proteins of this family. We used that gauge to assign probable physiological activities for the three putative Nudix hydrolases (Q0TTC5_CLOP1, Q92EH0_LISIN, and Q5LBB1_BACFN) functionally characterized in this article (Table II).
A nearly diffusion-controlled value of kcat/Km may serve as a sufficient condition to identify a likely physiological substrate, because virtually every enzyme-substrate encounter is catalytically productive, assuming that the enzyme has physical access to that substrate. 37,38 Enzymes that catalyze such reactions have been called "perfect" because they cannot be improved by further evolution. 39 The observation of a significantly smaller kcat/Km value means either that the investigated compound may not be the physiological substrate for the enzyme, or that such low activity is acceptable for the enzyme's cellular role. In cases where the substrate is poorly hydrolyzed, the observed activity might provide some hint regarding the structure of the physiological substrate, which might be similar to that of the less active substrate.
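As a rough illustration of how the kinetic criterion might be applied, the sketch below labels enzyme-substrate pairs using kcat/Km values quoted in this article. The function name, the returned labels and the use of 10⁶ M⁻¹ s⁻¹ as a hard cutoff are assumptions made for the example; the text describes the threshold as approximate ("ca.") and the criterion as "usually", not always, definitive.

```python
# Threshold taken from the text ("ca. > 10^6 M^-1 s^-1"); treating it as a
# hard cutoff is a simplification made for this sketch.
NEAR_DIFFUSION_LIMIT = 1e6  # M^-1 s^-1


def assess_substrate(kcat_over_km, genetic_support=False):
    """Return a tentative label for one enzyme-substrate pair."""
    if kcat_over_km >= NEAR_DIFFUSION_LIMIT:
        return "likely physiological (near diffusion-controlled)"
    if genetic_support:
        return "likely physiological (supported by genetic evidence)"
    return "unassigned (kinetics alone not sufficient)"


# kcat/Km values quoted in this article (M^-1 s^-1).
examples = {
    ("Q0TTC5_CLOP1", "8-oxo-dATP"): 2.8e6,
    ("Q92EH0_LISIN", "ADP-ribose"): 1.85e6,
    ("Q0TS82_CLOP1", "3'-dGTP"): 1.6e4,
}
for (enzyme, substrate), value in examples.items():
    print(enzyme, substrate, assess_substrate(value))
```

The `genetic_support` flag reflects the point that a sub-threshold kcat/Km does not rule a substrate out; it only means that orthogonal evidence is needed before an assignment is made.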
It is also possible that the physiological substrate for a given Nudix hydrolase may exhibit a low value of kcat/Km if, for example, the enzyme were allosterically regulated. There are few reports of regulation of Nudix activity. The only biochemical evidence of allosteric regulation of Nudix enzymes is for the ADP-ribose pyrophosphatase of E. coli (UniProt Entry Name: ADPP_ECOLI). Although this enzyme's kcat/Km of 1.75 × 10⁶ M⁻¹ s⁻¹ for ADP-ribose 40 is not low, this parameter is increased by 8-fold in the presence of glucose 1,6-diphosphate. 41
Functional assignments based on lower than diffusion-controlled values of kcat/Km may be ambiguous, due to either catalytic promiscuity of the enzyme or significant structural relationships among many substrates. The results of genetic probes (for example, gene knockouts and complementation tests), alongside enzymatic assays, are often definitive, as they provide orthogonal information regarding the physiological role of the enzyme in the cellular environment. For example, prior to genetic analysis, the highest kcat/Km value for any examined substrate reacting with E. coli RNA pyrophosphohydrolase (gene name: rppH; UniProt Entry Name: RPPH_ECOLI) was 2800 M⁻¹ s⁻¹ for Ap5A. 42 The criteria introduced above would cast doubt on the assignment of this as the primary activity of this enzyme. Indeed, subsequent experiments showed that this enzyme cleaves the pyrophosphate entity from the 5'-triphosphate end of RNA to yield a pyrophosphate ion (note that this is different from mRNA decapping activity in eukaryotes, as the eukaryotic mRNA has an m7G cap at the 5' end of RNA), and in vivo accelerates the degradation of transcripts 43; these data support the contention that RNA is the physiological substrate of RppH. In addition to experimental characterization, functional assignment of enzyme activity is currently facilitated by genomic methods including operon and protein family evolution analyses. 44-47
An illustrative example is gmm from E. coli (UniProt Entry: GMM_ECOLI), which has been designated as a GDP-mannose mannosyl hydrolase. 48 Both the low kcat/Km value of 1600 M⁻¹ s⁻¹ for GDP-mannose and biosynthetic considerations call the assignment into question. gmm is part of an operon responsible for the synthesis of colanic acid. The two genes immediately upstream and downstream of gmm encode a GDP-fucose synthase (fcl) and a predicted colanic acid biosynthesis glycosyl transferase (wcaI), respectively. 49 Moreover, GDP-mannose is synthesized by an enzyme coded by cpsB in the same operon, immediately downstream of wcaI. It would be biologically wasteful for the same operon to also code for an enzyme that hydrolyzes GDP-mannose, thus completing a futile cycle, although it cannot be ruled out as a regulatory mechanism.

Activities observed for canonical NTPs may be the result of collateral damage
Twenty-two Nudix enzymes are reported to have activity against canonical NTPs (Table III). However, none of the kcat/Km values is >10⁵ M⁻¹ s⁻¹, with the exception of one enzyme (Q9A8K7_CAUCR), which exhibits a kcat/Km value of 1.3 × 10⁵ M⁻¹ s⁻¹ for UTP. 19 It is uncertain whether UTP is the physiological substrate. Notably, three of these 22 Nudix hydrolases do have kcat/Km values >10⁶ M⁻¹ s⁻¹ for noncanonical NTPs (MUTT_ECOLI, 8ODP_HUMAN, and NUDG_ECOLI), providing evidence that their true function may be to eliminate noncanonical NTPs, supporting Bessman's earlier conclusion. 1 The activity against canonical NTPs is therefore likely collateral damage. Furthermore, functional assignments for 12 of these 22 enzymes have been secured from genetic experiments, as discussed below. Notably, not one of the physiological substrates for these 12 enzymes is a canonical NTP, despite the enzymes' having some activity against them. There is a good chance that the physiological substrates for the remaining putative Nudix hydrolases have yet to be discovered.
An illustrative example of an enzyme function secured with strong genetic evidence is nudB (UniProt Entry: NUDB_ECOLI), which was originally characterized to preferentially hydrolyze dATP with kcat/Km of 6,600 M⁻¹ s⁻¹. 50 It was later found to hydrolyze DHNTP, the substrate of the committed step in folic acid synthesis, but the kcat/Km value is not high enough (4.3 × 10⁴ M⁻¹ s⁻¹) for definitive function assignment based on the kinetic constant alone. 51 However, the deletion of nudB led to a reduction in folate biosynthesis, which was completely restored by a plasmid carrying the same gene. 51 Genetic evidence also establishes that YJ9J_YEAST and its homolog, NUD20_ARATH, catalyze the hydrolysis of oxo- and oxy-thiamine diphosphate in vivo. 52 Specifically, deletion of YJ9J_YEAST decreases the oxy-thiamine resistance of its host Saccharomyces cerevisiae, and overexpression of NUD20_ARATH restores the oxy-thiamine resistance in YJ9J_YEAST-deleted S. cerevisiae. However, the kcat/Km values are only on the order of 1000 M⁻¹ s⁻¹ for oxo- and oxy-thiamine diphosphate, while both enzymes show higher activity for the canonical substrate GDP. 52 One explanation is that the low hydrolase activity for oxo- and oxy-thiamine diphosphate suffices because thiamine diphosphate is not produced in the cell in high quantities. Therefore, neither high catalytic activity for the designated substrate nor the perfection of substrate selectivity is evolutionarily mandated.
It is likely that the activities observed for the hydrolysis of the canonical NTPs may simply be the result of unavoidable collateral damage. For example, the wellcharacterized MutT from E. coli effects the hydrolysis of the potentially mutagenic, 8-oxo-dGTP with a k cat / K m 5 6.1 3 10 7 M 21 s 21 , but also catalyzes the hydrolysis of dGTP with a k cat /K m 5 1.0 3 10 4 M 21 s 21 . 53 A factor of 10 3 corresponds to 4.2 kcal/mol, which is about the degree of specificity that might be gained by a single optimally placed charged hydrogen bond. 54 The crystal structure of the MutT complex with 8-oxo-dGMP shows that MutT strictly recognizes the overall conformation of the 8-oxo-guanine base through multiple hydrogen bonds. 55 Greater distinctions between similar molecules by the hydrolase might be achievable, but evolutionary considerations argue that substrate optimization would progress only to the point where further differentiation provides little or no added biological advantage. DNA, RNA, and protein biosynthesis all require far more robust selection of the component monomers, but this is achieved subsequent to the committed steps by repair or editing mechanisms, respectively at the ultimate expense of further energy consumption. 56 It does take considerable energy to make NTP and other naturally biosynthesized Nudix substrates, such as coenzyme A, NADH and FAD. Why would they be made only to be subsequently discarded in a futile cycle? While low values (<10 5 M -1 s 21 ) of k cat /K m observed with physiological substrates (for example, canonical NTPs, CoA and its derivatives, and NAD 1 ) might indicate that the observed activity is not the major function of the enzyme, it has to be recognized that these molecules in addition to their major roles in central metabolism, have additional regulatory functions. 
For example, hydrolysis of NAD⁺ may be a means for regulating the size of the peroxisomal pool of nicotinamide coenzymes independently of those in other subcellular compartments in response to a change in available carbon source [57]. Thus, their concentration levels must be carefully monitored and controlled.
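The energetic estimate quoted above for MutT can be checked with the standard relation between a ratio of specificity constants and a free-energy difference, ΔΔG = RT ln(ratio). The following is a minimal sketch, not part of the original analysis: the helper name `discrimination_energy` and the default temperature of 37 °C are assumptions made here for illustration.

```python
import math

# Molar gas constant expressed in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def discrimination_energy(ratio, temp_k=310.15):
    """Free-energy difference (kcal/mol) equivalent to a kcat/Km ratio.

    temp_k defaults to 37 C; this temperature is an assumption,
    not a value stated in the text.
    """
    return R_KCAL * temp_k * math.log(ratio)


# kcat/Km values quoted in the text for E. coli MutT (M^-1 s^-1).
mutt_8_oxo_dgtp = 6.1e7
mutt_dgtp = 1.0e4

# Round factor of 10^3 at 25 C and at 37 C.
print(round(discrimination_energy(1e3, temp_k=298.15), 1))
print(round(discrimination_energy(1e3), 1))

# Full MutT ratio (6.1 x 10^3) at 37 C.
print(round(discrimination_energy(mutt_8_oxo_dgtp / mutt_dgtp), 1))
```

Under these assumptions a factor of 10³ works out to roughly 4.1 kcal/mol at 25 °C and 4.3 kcal/mol at 37 °C, which brackets the 4.2 kcal/mol quoted in the text. The full MutT ratio of 6.1 × 10³ corresponds to somewhat more, roughly 5.4 kcal/mol at 37 °C.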
While the data collected in Table III and elsewhere in this article support the contention that unregulated canonical NTP hydrolase activity is not a defining characteristic of any enzymatically characterized Nudix hydrolase, they do not allow the conclusion that it is never purposeful to drive the hydrolysis of canonical NTPs. For example, the enzyme SAMHD1, which is induced by HIV infection, is allosterically activated by dGTP and converts dNTPs to the corresponding nucleosides and inorganic triphosphate, presumably to reduce the rate of viral replication [58]. SAMHD1 is unrelated to the Nudix family.

Newly functionally characterized Nudix hydrolases
Prior to this investigation, 20 Nudix enzymes had been shown to have kcat/Km values >10⁶ M⁻¹ s⁻¹ for at least one substrate. We screened eight new ones, whose structures are known, against our library, and found three (Q0TTC5_CLOP1, Q92EH0_LISIN, and Q5LBB1_BACFN) to have kcat/Km >10⁶ M⁻¹ s⁻¹ for at least one substrate, expanding by 15% the group of characterized Nudix enzymes with high kcat/Km values. These three Nudix enzymes are also the first reported characterized examples from their respective host organisms. Five of the eight do not show significant activity against the compounds in the library; their true activity remains to be discovered.
Two of the previously functionally uncharacterized Nudix hydrolases, Q0TTC5_CLOP1 and Q5LBB1_BACFN, demonstrate higher kcat/Km values toward mutagenic nucleoside triphosphates than toward the closest canonical NTPs. Q0TTC5_CLOP1 distinguishes the mutagenic NTPs (for example, 8-oxo-dATP) from the canonical NTPs (for example, dGTP) with a 10-fold difference in kcat/Km. Q5LBB1_BACFN is most active against a mutagenic nucleotide (5-substituted (d)CTP), with kcat/Km values >10⁶ M⁻¹ s⁻¹. Q5LBB1_BACFN has much lower activity against the canonical NTPs, as shown in the screening result (Fig. 1). The third enzyme, Q92EH0_LISIN, hydrolyzes ADP-ribose, also with a kcat/Km value >10⁶ M⁻¹ s⁻¹. Therefore, based on these nearly diffusion-controlled specificity constants, it is likely that the physiological substrates for these three hydrolases are now identified.
Q0TS82_CLOP1 shows significant activity only for nucleoside triphosphates containing a guanine base (Fig. 1). Detailed kinetic analysis for four of these showed that the best substrate assayed is 3'-dGTP (kcat/Km of 2 × 10⁴ M⁻¹ s⁻¹). There are no reports showing that this is a naturally occurring molecule, although it is possible that it may be formed following an aberrant reduction by ribonucleotide reductase [59][60][61]. The approximate kcat/Km values obtained for Q8PYE2_METMA, B9WTJ0_STRSU, A0ZZM4_BIFAA, and Q9K704_BACHD from the screening experiments are fairly low (<5,000 M⁻¹ s⁻¹); thus it is unlikely that the appropriate substrates have been identified.
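The order-of-magnitude reasoning used in this discussion, in which each specificity constant is compared with a cutoff of about 10⁵ M⁻¹ s⁻¹, can be sketched as follows. This is only an illustration built from values quoted in the text. The function name `triage` and its two labels are hypothetical, and, as the nudB example shows, a kinetic constant alone does not establish or exclude physiological function.

```python
# kcat/Km values (M^-1 s^-1) quoted in the text, keyed by (enzyme, substrate).
QUOTED = {
    ("MutT", "8-oxo-dGTP"): 6.1e7,
    ("MutT", "dGTP"): 1.0e4,
    ("nudB", "DHNTP"): 4.3e4,
    ("nudB", "dATP"): 6.6e3,
    ("Q0TS82_CLOP1", "3'-dGTP"): 2e4,
}

# Values below this are treated in the text as possibly collateral activity.
LOW_CUTOFF = 1e5


def triage(kcat_km, cutoff=LOW_CUTOFF):
    """Return a coarse, hypothetical label for one specificity constant."""
    if kcat_km >= cutoff:
        return "candidate physiological substrate"
    return "needs further evidence"


for (enzyme, substrate), value in QUOTED.items():
    print(f"{enzyme:14s} {substrate:12s} {value:9.1e}  {triage(value)}")
```

With this cutoff, DHNTP for nudB falls in the "needs further evidence" group, which matches the point made earlier that its assignment rests on genetic rather than kinetic data.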
Our characterization of these enzymes will aid the prediction of the functions of others in the Nudix protein family. However, the huge size of the superfamily and its functional plasticity mean that common approaches often yield misleading results and even the most sophisticated protein function prediction methods must be used with care and deftness in this superfamily [88].