Structural and functional diversity of asparaginases: Overview and recommendations for a revised nomenclature

Asparaginases (ASNases) are a large and structurally diverse group of enzymes ubiquitous amongst archaea, bacteria and eukaryotes, that catalyze hydrolysis of asparagine to aspartate and ammonia. Bacterial ASNases are important biopharmaceuticals for the treatment of acute lymphoblastic leukemia, although some patients experience adverse allergic side effects during treatment with these protein therapeutics. ASNases are currently divided into three families: plant‐type ASNases, Rhizobium etli‐type ASNases and bacterial‐type ASNases. This system is outdated as both bacterial‐type and plant‐type families also include archaeal, bacterial and eukaryotic enzymes, each with their own distinct characteristics. Herein, phylogenetic studies allied to tertiary structural analyses are described with the aim of proposing a revised and more robust classification system that considers the biochemical diversity of ASNases. Accordingly, based on distinct peptide domains, phylogenetic data, structural analysis and functional characteristics, we recommend that ASNases now be divided into three new distinct classes containing subgroups according to structural and functional aspects. Using this new classification scheme, 25 ASNases were identified as candidates for future new lead discovery.

The current classification scheme for ASNases is based on protein sequence, biochemical properties and crystallographic data, 3 although only limited primary sequences and crystallographic structures were available at the time this classification was proposed. A later study expanded the phylogenetic analyses, but still with only a limited number of ASNase sequences and without any crystallographic data. 5 The crystal structures of more ASNases have now been solved and these structures, together with the primary amino acid sequences of the enzymes, are publicly available. This affords the possibility of expanded sequence sampling to rebuild the existing ASNase phylogeny and to propose a more robust nomenclature for these clinically significant enzymes.

Source of sequences
A dataset of 108 ASNase amino acid sequences from archaea, bacteria and eukaryotes (fungi, metazoans, and plants) was selected from the literature [3][4][5] and downloaded from the NCBI (https://www.ncbi.nlm.nih.gov/protein) and UniProt (https://www.uniprot.org/) protein databases (Figs. S1-S3). The dataset was manually expanded to include an additional 229 ASNase sequences from: (1) nonredundant sequences subsequently deposited in UniProt (sequences with 100% identity were deleted); (2) sequences from model organisms (e.g., Escherichia coli, Danio rerio, Drosophila melanogaster, Mus musculus); and (3) sequences from a large diversity of taxa not previously sampled (e.g., cnidarians). A multiple sequence alignment of the 337 sequences was performed with default settings using the Multiple Alignment using Fast Fourier Transform (MAFFT) v.7.402 plugin 6 of the Geneious bioinformatics tool, 7 and InterProScan was used to identify protein domains. To deepen the analyses of clades that contained enzymes with high sequence homology but with distinct affinity for l-Asn (different K_M values), a separate dataset was constructed using 133 sequences from ASNases with more than 90% sequence similarity to human and guinea pig ASNases. Additionally, in order to investigate enzymes with comparable affinity for l-Asn (similar K_M values) but with distinct structural characteristics, a third dataset of 56 sequences with more than 60% sequence similarity to E. coli and Erwinia chrysanthemi ASNases was compiled.
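
The redundancy filter described above (deleting sequences with 100% identity) can be sketched as follows. This is a minimal illustration, not the procedure actually used for the dataset: it assumes that "100% identity" means exact equality of the full-length sequences, and the FASTA records and the function names `parse_fasta` and `drop_identical` are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_identical(records):
    """Keep the first record of each distinct sequence (exact string match)."""
    seen = set()
    unique = []
    for header, seq in records:
        if seq not in seen:
            seen.add(seq)
            unique.append((header, seq))
    return unique


# Toy, made-up records (not real ASNase sequences).
toy = """>seq_a hypothetical
MKTGGT
>seq_b hypothetical duplicate of seq_a
mktggt
>seq_c hypothetical
MKHGTDT
"""
unique_records = drop_identical(parse_fasta(toy))
kept_ids = [header.split()[0] for header, _ in unique_records]
print(kept_ids)
```

Sequences that are identical only over part of their length (for example, a fragment of a longer entry) are not removed by this exact-match filter; handling those would need a pairwise comparison step.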

Highlights
• Asparaginases are a large and diverse family of enzymes, commercially significant in biotechnology and medicine.
• Currently, the family is divided into bacterial-type, plant-type and Rhizobium etli-type enzymes; but this nomenclature is outdated and has conflicts related to taxonomic origins and biochemical properties.
• A modernized analysis of available sequences and structures is presented, and a new updated and robust nomenclature is proposed. Application of this classification is demonstrated with the identification of asparaginases that may provide useful therapeutic agents.

ASNase phylogeny reconstruction
The ProtTest 3 program 8 was used to select the best model for amino acid replacement, and this protein model (LG+I+G) was applied to all phylogenetic analyses. Maximum likelihood (ML) phylogenetic analyses were carried out using Randomized Axelerated Maximum Likelihood (RAxML). 9 Ten initial randomized maximum parsimony starting trees were constructed and the "best tree" (best log likelihood score) was selected. One thousand bootstrap replications were subsequently applied to the tree. Bayesian analysis was carried out with the Geneious plugin of MrBayes v3.2.6 10,11 using default settings. Finally, the trees were edited using FigTree v1.4.3 software and Inkscape (https://inkscape.org/). Tree topologies from both phylogenetic methods (ML and Bayesian) were compared and used to classify the ASNase sequences.
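
To illustrate what a bootstrap replication involves, the sketch below resamples the columns of an alignment with replacement, which is the standard nonparametric bootstrap for phylogenies. It is only a conceptual illustration: in the analysis above the resampling and tree inference were handled by the phylogenetics software, the tree-building step for each replicate is not shown, and the toy alignment and the function name `bootstrap_alignment` are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement.

    `alignment` maps taxon name -> aligned sequence (all of equal length).
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    # Draw `length` column indices with replacement; the same columns are
    # used for every taxon so that homologous positions stay together.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Toy, made-up alignment (not real data).
toy_alignment = {
    "taxon_1": "TGGT-HGTDT",
    "taxon_2": "TGGTAHGTDT",
    "taxon_3": "SGGT-HGSDT",
}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), "replicates generated")
```

A tree would then be inferred from each replicate, and the support for a clade is the fraction of replicate trees in which that clade appears.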

Primary and tertiary/quaternary structural analysis of ASNases
A literature search was performed to identify the cellular location, enzymatic activity and structural information for each sequence, where available. For analysis using available crystallographic structures, the coordinates were obtained from the PDB 12 and the figures were generated using PyMOL software (http://www.pymol.org). The MEME motif-based sequence analysis tools (v5.0.2) (http://meme-suite.org/tools/meme) were used to identify motifs present in sequence clusters. After compiling the data, a final division of enzymes into classes was performed using a combination of primary and tertiary structure, functional characteristics and inferred phylogenetic relationships.

3.1.1 ASNase Class 1 proteins are highly diverse and include enzymes used as biopharmaceuticals

According to the UniProt database, proteins in Class 1 (InterPro ID: IPR027474) display distinct biological properties, such as cellular location and affinity for the substrate, and can be further divided into four functionally distinct groups (Fig. 1(A)), which share a highly conserved protein fold (Fig. 1(B)).

Group 1: this group contains 38 ASNase homologues from the Dikarya sub-kingdom and from bacteria, including the enzymes from E. coli (EcAII; Fig. 1(C)) and E. chrysanthemi (ErAII). Both bacterial sequences contain an ∼20 amino acid N-terminal signal peptide for export to the periplasm and, in the mature form, are ∼330 amino acids in size (35 kDa) (Fig. S4). 14 The EcAII and ErAII enzymes display Michaelis-Menten behavior (following the equation $V = V_{max}[S]/(K_M + [S])$, where V is the observed initial rate, [S] is the substrate concentration, V_max is the limiting value of V at substrate saturation, and K_M is the substrate concentration when V = V_max/2) and have K_M values of 15 and 50 μM, respectively. 15 Group 1 proteins can also be cytosolic enzymes, for example, the ASNase of Helicobacter pylori, which shows allosteric behavior with S_0.5 = 290 μM. 16 […] amino acids. 17

Fig. 1 (caption, partial). The elements of the secondary structure are colored as follows: α helix = light blue, β ribbons = yellow and loops/random coils = pink. The figures were generated using PyMOL.

Group 2: this group comprises 40 ASNase sequences from the Dikarya sub-kingdom and includes proteins that contain ∼50 additional N-terminal amino acid residues of unknown function, resulting in an overall molecular mass of ∼40 kDa (Fig. S5). Bioinformatics analyses revealed that these N-terminal sequences do not contain any recognized signal peptide motifs. The only enzyme in Group 2 that has been characterized to date is Asp1 from S. cerevisiae, which shows allosteric behavior with S_0.5 = 75 μM. 18 Group 2 does not include any enzymes for which a crystal structure has been solved.

Group 3: this group consists of 34 archaeal enzymes named Glutamyl-tRNA amidotransferase D (GatD), which have an N-terminal extension of ∼90 amino acid residues, giving an overall primary sequence of ∼450 residues (e.g., GatD of Pyrococcus abyssi; Fig. 1(D)) and a molecular mass of ∼50 kDa (Fig. S6). The N-terminal extensions form a heterodimer with Glutamyl-tRNA amidotransferase E (GatE). 19 These enzymes are amidotransferases that use acylated Glu-tRNA as substrate to produce Gln-tRNA. However, l-Asn can also be used as an amide donor in addition to Gln. 19

Group 4: this group is composed of 86 enzymes, 41 of which contain a 33 amino acid ankyrin domain sequence in the C-terminal region that mediates protein-protein interactions. These proteins can be cytosolic or secreted (Figs. S7 and S8). [20][21][22] The group includes the 338 amino acid, 37 kDa E. coli type I cytosolic ASNase (Fig. 1(E)), which does not contain any recognized ankyrin domains. This ASNase exhibits allosteric behavior, with low substrate affinity (S_0.5 in the mM range). 23 In the ASNase of Cavia porcellus, taken as an example (Fig. 1(F)), the polypeptide chain consists of ∼560 residues and has a molecular mass of 60 kDa. 24 In addition to ASNase activity (K_M in the μM-mM range), the enzymes of Group 4 can also exhibit lysophospholipase, transacetylase and acetylhydrolase activities. [24][25][26]

Using the MEME motif discovery software, it was possible to detect regions of highly conserved amino acids containing the following previously known motifs: 3 motif 1: T-G-G-T and motif 2: H-G-T-D-T (Figs. S4-S7). Motifs 1 and 2 were found in all 198 Class 1 ASNases and are of catalytic importance. For example, in E. coli ASNase 2, motif 1 (9-TGGT-12) contains the Thr12 residue, which stabilizes the substrate at the active site, and motif 2 (87-HGTDT-91) contains the Thr91 residue, which promotes the protonation of the leaving amino group and activates a nucleophilic water molecule. 27 A third, previously undescribed motif (equivalent to 297-L-N-P-X-K-X-R-303 in E. coli ASNase 2) was found in Group 1 and Group 2 ASNases and is located in the C-terminal region of the proteins. It has been reported that residues Gln300, Arg303 and Tyr281, when substituted by Ala, affect the stability of the intimate dimers. 28 In this context, motif 3 may contain residues necessary for the formation of intimate dimers (Fig. S9).
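
The practical meaning of the two K_M values quoted for EcAII and ErAII can be seen by evaluating the Michaelis-Menten equation at a fixed substrate concentration. The sketch below is illustrative only: the 50 μM l-Asn concentration is an arbitrary value chosen for the example, V_max is normalized to 1 so that the output is the fraction of the maximal rate, and the function name `michaelis_menten` is hypothetical.

```python
def michaelis_menten(s, v_max, k_m):
    """Initial rate V = V_max * [S] / (K_M + [S]); s and k_m must share units."""
    if s < 0 or k_m <= 0:
        raise ValueError("s must be >= 0 and k_m must be > 0")
    return v_max * s / (k_m + s)


# K_M values (micromolar) as quoted in the text for the two bacterial enzymes.
K_M_UM = {"EcAII": 15.0, "ErAII": 50.0}

substrate_um = 50.0  # assumption: arbitrary illustrative L-Asn concentration
for name, k_m in K_M_UM.items():
    fraction = michaelis_menten(substrate_um, 1.0, k_m)
    print(f"{name}: V/V_max = {fraction:.2f} at {substrate_um:.0f} uM L-Asn")
```

At a substrate concentration equal to K_M the function returns exactly half of V_max, which matches the definition of K_M given in the text.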

3.1.2 Class 2 ASNases: functional diversity and structural conservation

Previous work classified enzymes from this class as plant-type ASNases; 4 they contain an "asparaginase 2" domain (InterPro ID: IPR000246). However, phylogenetic analyses revealed that, in addition to plants, this group of enzymes can also be found in bacteria (for example, EcAIII from E. coli), in archaea and in eukaryotes including metazoans (Fig. 2(A) and Fig. S1). Based on the revised classification proposed herein, this group should now be designated as Class 2 ASNases; it comprises 107 enzymes with very diverse substrate specificities. Although all the enzymes display ASNase activity, l-Asn may not be their major biological substrate. 29 For example, some hydrolyze β-aspartyl dipeptides, function in the metabolism of oligosaccharides or are Thr endopeptidases. 30,31 In this context, it is important to note that all these enzymes have a K_M for l-Asn in the mM range. [31][32][33] Class 2 ASNases also retain highly conserved secondary structure and folding (Fig. 2(B)), and the class can be further divided into three distinct functional groups.

Group 1: 60 isoaspartyl peptidases, β-aspartyl peptidases and ASNases, including EcAIII (Fig. 2(C)), a protein of 321 amino acid residues with a molecular weight of 35 kDa. The substrates for this group are l-Asn and l-isoAsp; some representatives can also hydrolyze products of spontaneous protein damage and cyanophycin (a polymer commonly found in cyanobacteria). 29,[34][35][36] In plants (Fig. 2(D) and 2(E)), this group of ASNases releases ammonia from Asn, which is the main nitrogen source for transport and storage. 29,37 In metazoans, the human hASRGL1 (Fig. 2(F)) is a 308 amino acid protein with a molecular weight of 32 kDa, which also displays β-aspartyl peptidase activity. 31

Group 2: this group comprises 14 endopeptidases, which cleave substrates after Asp residues and are localized within the nucleus of eukaryotic cells (Fig. 2(A)). 30,33 Human threonine aspartase 1 (Taspase1; Fig. 2(G)) is composed of 420 amino acids and has a molecular mass of 45 kDa. Taspase1 cleaves KMT2A (histone-lysine N-methyltransferase), which is involved in the regulation of gene expression during early stages of hematopoiesis. Its dysregulation is involved in hematological malignancies, including ALL. 38,39

Group 3: this group contains 33 enzymes with amidohydrolase activity involved in the catabolism of oligosaccharides (e.g., aspartylglucosaminidases) (Fig. 2(A)). The aspartylglucosaminidase (AGA; Fig. 2(H)) of Homo sapiens is composed of 346 amino acids; however, the first 23 amino acids correspond to a signal peptide, giving the mature protein a mass of 36 kDa. 31,36 In eukaryotes, Group 3 ASNases are localized in lysosomes and are implicated in the decomposition of glycosylated proteins by cleavage of the glycosidic bond between the carbohydrate and an Asn residue. 40,41 This group of enzymes is well studied and is found in bacteria and eukaryotes. [42][43][44]

Using MEME, two highly conserved amino acid motifs were identified in Class 2 enzymes. Motif 1 is represented by the sequence T/Q-V/I/L-G-X-V/I/L-X(2)-D/H-X(2)-G/N. Thr168 of the human isoaspartyl peptidase (ASGL1), where cleavage between the α and β chains occurs, is located within this motif, making the motif essential for the function of the enzyme. The second motif, T/S-T/S-X-G-X(3)-K/R-X(2)-G-R-V/I-G, contains Thr186 (in ASGL1), which participates in catalysis 45 (Fig. S10).
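
The motif notation used above (alternatives separated by "/", X for any residue, X(n) for n consecutive residues of any type) maps directly onto regular expressions, which is convenient for scanning additional sequences for the same motifs. The sketch below is a hypothetical helper, not part of MEME; the function name `motif_to_regex` and the toy sequence are assumptions made for illustration.

```python
import re


def motif_to_regex(motif):
    """Convert a motif such as 'T/Q-V/I/L-G-X-X(2)' into a regex string."""
    parts = []
    for token in motif.replace(" ", "").split("-"):
        any_residue = re.fullmatch(r"X(?:\((\d+)\))?", token)
        if any_residue:
            # X -> any one residue; X(n) -> any n residues.
            count = any_residue.group(1)
            parts.append("." if count is None else ".{%s}" % count)
        elif "/" in token:
            # Alternatives at one position become a character class.
            parts.append("[" + token.replace("/", "") + "]")
        else:
            parts.append(token)
    return "".join(parts)


# Class 2 motif 1 as written in the text.
class2_motif1 = "T/Q-V/I/L-G-X-V/I/L-X(2)-D/H-X(2)-G/N"
pattern = motif_to_regex(class2_motif1)
print(pattern)

# Synthetic toy sequence constructed to contain the motif (not a real protein).
toy_sequence = "AAATVGAVAADAAGAAA"
match = re.search(pattern, toy_sequence)
print(match.start(), match.group())
```

The same helper handles the simpler Class 1 motifs (for example, "T-G-G-T" becomes "TGGT"). Reported positions are 0-based string indices, so they need to be offset by one to compare with the residue numbering used in the text.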

Class 3 ASNases
Proteins belonging to Class 3 were traditionally classified as R. etli-type ASNases (R. etli is a symbiotic bacterium of leguminous plants) 46 and contain "asparaginase II" domains (InterPro ID: IPR010349) (not to be confused with the "asparaginase 2" domains found within Class 2 proteins). Only 32 nonredundant proteins could be grouped into this class (Fig. 3(A)). Class 3 ASNases include representatives from fungi, bacteria (including R. etli ASNases) and cyanobacteria. Using the enzyme of R. etli as an archetype to describe Class 3 enzymes, this ASNase is composed of 370 amino acids and has a mass of ∼40 kDa. The K_M of this enzyme is 8.9 mM and the catalytic efficiency, measured as k_cat/K_M, is 1 × 10^4 M^-1 s^-1 for the l-Asn substrate. No glutaminase activity has been detected. 46 Class 3 ASNases are also found in cyanobacteria, actinobacteria and Firmicutes bacteria (Fig. 3). To date, few studies have characterized these enzymes and, amongst the proteins used in this analysis, only the ASNase from Streptomyces griseus has been studied; it was found to be an extracellular enzyme with ASNase but no glutaminase activity. 47 It was possible to identify two conserved amino acid motifs in enzymes grouped into this class: motif 1, R-S-X(2)-K-P/A-X-Q-A-L, and motif 2, N/G-C-S-G/A-K-H-X-G/A-M/F (Fig. S11). Although the active site of Class 3 enzymes has not been characterized, in other ASNases catalysis involves at least two Thr residues and, in some cases, Ser. 48,49 Ser residues, whose γ-hydroxyl group could conceivably function in catalysis, are found in both motifs. To investigate this further, a theoretical model was constructed for the ASNase of R. etli. The model was generated using the coordinates of the E. coli glutaminase (YBAS) (PDB code: 1U60), 13 which has 21% identity and 29% similarity with the ASNase of R. etli (Fig. 3(B)). The model covered residues 22 to 316 (of a total of 367 residues, ∼80%), giving a qualitative model energy analysis (QMEAN) score of −6.35 and a global model quality estimate between the R. etli ASNase model and the E. coli crystallographic structure of 0.35, suggesting the model was of good quality. Glutaminase enzymes contain an S-X-X-K motif (66-S-X-X-K-69 in YBAS), which includes the catalytic nucleophile Ser66 and is very similar to motif 1 identified for Class 3 ASNases. 13 In addition, the theoretical model showed that motifs 1 and 2 are spatially close and that the amino acids of these motifs could interact, thereby, in principle, forming an interaction network.
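
As a consistency check on the kinetic constants quoted above for the R. etli enzyme, the turnover number implied by the two reported values can be worked out. The resulting k_cat is derived here rather than taken from the text, and it assumes that the K_M and the k_cat/K_M values refer to the same assay conditions.

```latex
\[
k_{cat} = \left(\frac{k_{cat}}{K_M}\right) K_M
        = \left(1 \times 10^{4}\ \mathrm{M^{-1}\,s^{-1}}\right)\left(8.9 \times 10^{-3}\ \mathrm{M}\right)
        = 89\ \mathrm{s^{-1}}
\]
```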

Phylogenetic considerations supporting the revised classification
Phylogeny of Class 1 ASNases revealed low conservation of primary sequence (<30%), in agreement with previous work. 3 However, conservation of distinct sequence motifs allowed the class to be further divided into four groups. Groups 1 and 2 contained enzymes from the fungal sub-kingdom Dikarya and from bacteria; the primary sequence of Group 2 enzymes includes an exclusive ∼50 amino acid N-terminal domain. Group 3 is composed of ASNases of archaeal origin in which the N-terminal region of the primary sequence is composed of ∼90 amino acids; these enzymes function in tRNA repair by transamidation of misacylated Glu-tRNA(Gln) to produce the correct Gln-tRNA(Gln). 19 Group 4 is the most diverse, containing ASNases and lysophospholipases from protozoa, bacteria, fungi and metazoans. Additionally, the primary sequences of ASNases of metazoan and fungal origin contain ankyrin domain repeats, suggesting the formation of multi-domain proteins common to metazoans. 50 Repetitive ankyrin domain sequences may also provide a regulatory role for these enzymes. 19 Enzymes of Group 4 can also exhibit lysophospholipase, transacetylase and acetylhydrolase activities, [24][25][26] in addition to the ASNase function (K_M in the μM-mM range). Class 2 ASNases belong to the Ntn-hydrolase family, whose members have an N-terminal Thr, Ser or Cys residue that acts as a nucleophile during catalysis. 4 Phylogenetic analyses provided further sub-division of Class 2 ASNases: Group 1 is composed of isoaspartyl peptidases, β-aspartyl peptidases and ASNases derived from multiple taxa; Group 2 is composed of endopeptidases of plant, fungal and metazoan origin; and Group 3 contains aspartylglucosaminidases from bacteria, plants and metazoans. Only 32 nonredundant proteins could be grouped by phylogeny into a revised Class 3; these were traditionally classified as R. etli-type ASNases. 46 Class 3 ASNases included fungal, cyanobacterial and other bacterial representatives. 
The conserved motifs suggested that the catalytic sites are quite distinct from Class 1 and Class 2 ASNases, with Ser replacing Thr. Future studies to expand, define and further refine Class 3 ASNases are now especially warranted.

Structural considerations supporting the revised classification
From a structural perspective, Class 1, 2 and 3 enzymes are very distinct, suggesting evolutionary divergence within the ASNases. Concerning the crystallographic structures determined for members of Class 1 ASNases, there is remarkable conservation of the 3D structures of the monomers of enzymes in each of the different sub-groups. Although the enzymes have low sequence identity (<30%), the three-dimensional structures of the monomers show high spatial similarity. In all cases, ASNases have two α/β globular domains, each with a β sheet surrounded by α helices (Fig. 1(B)). When EcAI (Fig. 1(C)) and EcAII (Fig. 1(D)) are compared, the structures are very similar, giving a root mean square deviation (r.m.s.d.) of the Cα atoms of the polypeptide chain of 2.5 Å. The enzymes also shared high similarity with guinea pig GpASNase (r.m.s.d. ∼1.2 Å) (Fig. 1(B)-1(G)). However, it is important to note that the graphical representations of GpASNase do not include the C-terminal ankyrin repeats, as these were not present in the crystal structures (Fig. 1(E)). When EcAII was compared with a Group 3 ASNase (GatD from P. abyssi), a high level of structural similarity (r.m.s.d. ∼1.5 Å) was also observed. In general, these results indicated that, even with low amino acid identity, the tertiary structures of Class 1 ASNases are highly conserved, suggestive of a common evolutionary ancestry. Class 2 enzymes show a more complex spatial organization than the ASNases of Class 1 and resemble N-terminal nucleophilic hydrolases. 4,57 Class 2 ASNases are translated as a single inactive polypeptide precursor, which undergoes an autocatalytic intramolecular cleavage that divides the polypeptide chain into two active subunits, the α subunit and the β subunit, and exposes the N-terminal nucleophile amino acid (Thr, Ser, or Cys) of the β subunit. 51 Therefore, a single polypeptide chain can form an α/β heterodimer. 4 Despite members of this class displaying quite distinct biological functions and exhibiting only moderate sequence similarity (25-40%), analysis of the tertiary folding of the polypeptide chain reveals a very high structural similarity (r.m.s.d. ∼0.8-1.0 Å), even when comparing proteins from different organisms such as bacteria, plants or humans (Fig. 2(B)-(I)). The representatives of this class form α/β folds, having two β sheets superimposed within the molecule and surrounded by an α helix (Fig. 2(B)). This folding is quite different from that of Class 1 enzymes (Fig. 1(B)-1(G)). Concerning Class 3 enzymes, no crystallographic structures are available in the Protein Data Bank. However, our theoretical model generated using the coordinates of the E. coli glutaminase (YBAS) 13 revealed a tertiary structure distinct from those of Class 1 and 2 enzymes.
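
The r.m.s.d. values quoted above summarize the average distance between equivalent Cα atoms of two structures. A minimal sketch of the calculation is given below; it assumes the two coordinate sets have already been optimally superposed and that atoms are paired one-to-one, which in practice requires a structural alignment step that is not shown. The coordinates are made-up numbers and the function name `rmsd` is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) points, in input units.

    Assumes the two coordinate sets are already superposed and paired.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up C-alpha positions (angstroms); the second set is shifted by 1 A in z.
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(f"r.m.s.d. = {rmsd(structure_a, structure_b):.2f} A")
```

Because the squared distances are averaged before the square root is taken, the result has units of length (Å).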

Application of the revised nomenclature
The organization of structural and functional features of the different ASNase classes and groups provided by the revised nomenclature could facilitate bioprospecting for new ASNases. The most interesting enzymes from a therapeutic point of view can be found in Class 1, with examples of enzymes that show K_M values for l-Asn in the micromolar range, for example, EcAII and ErAII in Group 1, Asp1 from S. cerevisiae in Group 2 and GpASNase1 from C. porcellus in Group 4. The most studied ASNases are EcAII (Fig. 1, blue arrow) and ErAII (Fig. 1, black arrow) because these enzymes have both been developed into biopharmaceuticals for the treatment of ALL, and they provide excellent prototypes to explore the discovery of new ASNases with therapeutic potential. 1,52,53 Briefly, both enzymes have high sequence similarity (63%) and the crystallographic structures display remarkable conservation of the amino acids involved in catalysis. The reaction mechanism involves two catalytic triads (Thr12-Tyr25-Glu283 and Thr89-Asp90-Lys162 in EcAII, and Thr15-Tyr29-Glu63 and Thr95-Asp96-Lys168 in ErAII) and the movement of a loop containing residues 11-31 in EcAII and 14-35 in ErAII, including the catalytic Thr residue of motif 1 (residue 12 in E. coli and residue 15 in E. chrysanthemi). 27,53,54 However, in E. coli ASNase, the Glu283 residue of the adjacent monomer (denoted *Glu283; the * symbol marks a residue from the adjacent monomer) is responsible for the interaction with the amino group of the substrate main chain and for increasing the nucleophilicity of Thr12. In E. chrysanthemi, Glu63 of the same monomer exerts this function, but is spatially located in a distinct region (Figs. S12A and S12B). 55,56 A multiple alignment of sequence homologs revealed three regions that differed between the two ASNases (Fig. 4(A)): region I, where the Glu63 residue of ErAII is located; region II, where a Ser residue is located in the representatives of the group containing ErAII, or an Asn residue in the majority of the representatives of the group containing EcAII, that forms a network of interactions with the residues of the active site; and region III, where the Glu283 residue of EcAII is located (Fig. 4(A)). Using the phylogenetic analyses (Fig. S12C, the bacterial tree), it was possible to observe that sequences maintaining these patterns clustered into two sub-groups: Gln59-Asn250-Glu283 in the EcAII sub-group and Glu63-Ser254-Δ (Δ, deletion) in the ErAII sub-group. When analyzing the active site of the crystallographic structure of EcAII, it was possible to observe that the *Glu283 residue occupies a loop close to the catalytic site where there is a deletion in ErAII (Fig. 4(B) and 4(C)). The -COOH group is positioned toward the N-terminus of the substrate main chain, increasing the nucleophilicity of the active site and decreasing the volume of the active site in comparison with ErAII, indicating a strong influence on substrate choice. [57][58][59] However, the positioning of *Glu283 depends on interactions with the NH2 groups from the side chains of the Gln59 (region I) and *Asn250 (region II) residues, which would be lost upon substitution with Glu and Ser, respectively. 15,58 In addition, the volume of the active site of ErAII is larger than that of EcAII, which may explain its higher affinity for Gln (K_M = 360 μM). Analyses of site-directed mutants and of the crystallographic structures of EcAII and ErAII revealed that the main modulator of glutaminase activity is the presence or absence of Glu283, located in region III (Fig. 4(B-III)), and that a natural deletion in the ErAII sequence group probably resulted in natural substitutions of key residues (Gln by Glu in region I, and Asn by Ser in region II) that would allow catalysis of both substrates, even in the absence of *Glu283.
Treatment using an ASNase lacking glutaminase activity may provide beneficial effects in most cases of ALL, in which the cancerous cells do not produce Asn synthetase (ASNS−), thereby offering an approach to decrease possible adverse side effects. Some authors point out that glutaminase activity may also be important for cytotoxicity, especially in tumor cells where the Asn synthetase gene remains activated, even after neoplastic transformation (ASNS+ tumor types). 53 Nevertheless, utilizing the analysis of the two sub-groups obtained in the phylogenetic tree of Class 1 ASNases (Fig. 1(A) […] oxytoca, Klebsiella grimontii, K. pneumoniae, Klebsiella aerogenes and K. pneumoniae).

Fig. 5 (caption). Sequence analyses of the Class 1 ASNases from vertebrates. Alignment of the amino acid sequences of Class 1 ASNases of primates, rodents and a lagomorph. Conserved motifs 1 and 2 and regions 1 and 2 are related to the low ASNase K_M from C. porcellus. Identical residues are highlighted in dark gray and similar residues in light gray. The red boxes denote motifs 1 and 2, blue boxes denote residues that are different between species, and black boxes denote residues that differ only in GpASNase1. The alignments were performed using ClustalW.

The same analyses can be performed with GpASNase1 from C. porcellus (domestic guinea pig). GpASNase1 has a K_M for Asn in the μM range and exhibits Michaelis-Menten behavior. Despite sharing high sequence similarity with GpASNase1, the human protein (HsASNase1) has an S_0.5 value between 2.9 and 11.5 mM toward Asn and displays allosteric behavior, as demonstrated by a Hill coefficient between 2.5 and 3.9. 20,32 The reasons for the differences between GpASNase1 and HsASNase1 remain elusive. Recent studies using a DNA shuffling technique that generated chimeric enzymes containing segments of HsASNase1 and GpASNase1 revealed two regions in the GpASNase1 enzyme (region 1: between amino acids 1 and 38; region 2: between amino acids 305 and 357) (Fig. 5(A)) that are essential for maintaining the low K_M of the enzyme. 24 To investigate this point, we analyzed a multiple sequence alignment of ASNase sequences from primates and rodents and observed that regions 1 and 2 are highly conserved in primates but not in rodents, suggesting that these amino acid substitutions may be related to the observed differences in K_M between HsASNase1 and GpASNase1. Thus, rodent ASNases may be more interesting from a biotechnological point of view, as they show a higher diversity in the two regions related to the K_M of the enzymes (Fig. 5). Structural analysis using the crystallographic structure of GpASNase1 revealed a tetrameric enzyme formed by a dimer of intimate dimers (Fig. S13A), in which regions 1 and 2 are located at the active site (Fig. S13B). Both regions are related to the open/close dynamics of the loop containing Tyr308 (region 2) of the adjacent monomer, which is responsible for correct positioning of Thr19 (region 1) and stabilization of the substrate, acting as a lid to enclose the substrate at the active site. 20 It is also possible to observe a β-sheet twist in region 1, which allows for correct positioning of Tyr308 (Fig. S13B). However, in GpASNase1, the loop that facilitates correct positioning of the Tyr residue has two deletions and four substitutions (Ala for Thr, Ala for Ser, Met for Leu, and Gly for Asn), which may cause faster lid closure over the substrate at the active site (Fig. S13). 24 In addition, homologs of GpASNase1 in rodents, for example, Ictidomys tridecemlineatus (13-lined ground squirrel) and Dipodomys ordii (Ord's kangaroo rat), and in the lagomorph Oryctolagus cuniculus (European rabbit) showed some substitutions in regions 1 and 2, which could also affect loop dynamics compared with the Class 1 enzymes from primates (Fig. S14), although only GpASNase1 has been well characterized. 20,60 Thus, structural aspects can be related to differences in the K_M of eukaryotic ASNases and, together with the phylogenetic analyses, have revealed that Class 1 ASNases other than those in Group 1 (e.g., ASNases of Group 4) could be interesting enzymes with favorable features that can be exploited for future clinical or other biotechnological applications.
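
The contrast between the allosteric behavior reported for the human enzyme and hyperbolic Michaelis-Menten kinetics can be made concrete with the Hill equation, v = V_max[S]^n / (S_0.5^n + [S]^n). The sketch below is illustrative only: the text gives ranges for S_0.5 (2.9 to 11.5 mM) and for the Hill coefficient (2.5 to 3.9) without stating which values belong together, so the pairings used here are assumptions, as are the substrate concentrations and the function name `hill_rate`.

```python
def hill_rate(s, v_max, s_half, n):
    """Hill equation: v = V_max * s^n / (S_0.5^n + s^n); s and s_half share units."""
    if s < 0 or s_half <= 0 or n <= 0:
        raise ValueError("s must be >= 0; s_half and n must be > 0")
    return v_max * s ** n / (s_half ** n + s ** n)


# Assumed pairings of the range end points quoted in the text (S_0.5 in mM, n).
parameter_sets = [(2.9, 2.5), (11.5, 3.9)]

for s_half, n in parameter_sets:
    # Assumption: 0.05 mM is an arbitrary low concentration for illustration.
    for s in (0.05, s_half, 10 * s_half):
        fraction = hill_rate(s, 1.0, s_half, n)
        print(f"S_0.5 = {s_half} mM, n = {n}: V/V_max = {fraction:.3g} at {s:.3g} mM")
```

With n = 1 the expression reduces to the Michaelis-Menten equation, and for n > 1 the rate falls off much more steeply at substrate concentrations below S_0.5, which is why a millimolar S_0.5 combined with cooperativity implies very little activity at micromolar substrate levels.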

CONCLUSIONS
The existing scheme used to classify ASNases is contradictory and confusing because the current divisions of bacterial-type, plant-type and R. etli-type enzymes are not homogeneous but, instead, contain representatives from a wide range of other taxa. 4 The current classification separates ASNases according to biochemical properties; for example, bacterial-type II enzymes are localized within the periplasmic space, have high affinity for l-Asn and exhibit Michaelis-Menten behavior. [3][4][5] However, some enzymes currently classified as bacterial-type II enzymes have opposing biochemical properties; for example, the Asp3 ASNase from the yeast S. cerevisiae is located at the cell wall and has a low affinity for l-Asn (K_M = 270 μM). 61 Using phylogenetic tree constructions that adopted enhanced taxon sampling supported by structural and biochemical data, it was possible to propose a revised ASNase nomenclature at two levels: (1) ASNases divided into three classes based upon distinct domains (InterProScan) and phylogenetic and structural features, and (2) further division of the classes into groups with shared functional, mechanistic and structural features. The revised nomenclature is underpinned by recognition of conserved sequence motifs that are distinctive for each of the three classes and that could be related to enzyme structure and catalytic activity, strongly implying discrete evolutionary origins for each class (Figs. 1, 2, and 3). In summary, a revision of the current ASNase classification was made using primary protein sequence phylogeny, tertiary protein structure analysis and characteristics of enzyme catalysis. Using this updated classification scheme, it was possible to select ASNases with desirable properties suitable for further assessment as alternative biopharmaceuticals to treat ALL. Additionally, expanded taxon sampling for the phylogenetic analysis identified previously poorly characterized ASNases with high primary sequence diversity, mainly in regions related to substrate affinity, suggesting that enzymes with interesting catalytic properties have still to be discovered.