Plastics degradation by hydrolytic enzymes: the Plastics-Active Enzymes Database - PAZy

Petroleum-based plastics are durable and accumulate in all ecological niches. Knowledge on enzymatic degradation is sparse. Today, fewer than 50 verified plastics-active enzymes are known. First examples of enzymes acting on the polymers polyethylene terephthalate (PET) and polyurethane (PUR) have been reported together with a detailed biochemical and structural description. Further, very few polyamide (PA) oligomer-active enzymes are known. In this paper, the currently known enzymes acting on the synthetic polymers PET and PUR are briefly summarized, and their published activity data were collected and integrated into a comprehensive open-access database. The Plastics-Active Enzymes Database (PAZy) represents an inventory of known and experimentally verified plastics-active enzymes. Almost 3000 homologues of PET-active


Introduction
Today, we face the global challenge of plastics pollution in nearly all environments. The pollution has meanwhile reached levels that will ultimately have an impact on our food chain and well-being within the next decades. A recent study implied that about 399000 tons of plastics are present in the oceans alone, of which 69000 tons are microplastics [1]. Thus, urgent actions need to be implemented to remove plastics from the environment and to reduce the steady input into the environment. Whereas it is perhaps more likely that large pieces can be removed mechanically from ocean surfaces or terrestrial sites, smaller particles (microplastics) will remain there unless microbial or chemical degradation (i.e., weathering) occurs [2][3][4]. Plastic waste is a valuable raw material; therefore, recycling is a promising alternative to incineration, either as a basis for the synthesis of polymers or as a carbon source for fermentation [5].
Petroleum-based plastics are in general extremely stable and durable; hence it is widely accepted that plastics do not degrade well in nature [6], nor can they be used directly in fermentation. The degradation processes described so far are slow, and it was shown that a PET bottle remains up to 48 years in the ocean until it is decomposed by microbial degradation [7]. Within this setting, it is reasonable to speculate that prior to microbial and enzymatic degradation, mechanical treatment (waves, wind, friction) and photodegradation by UV light (especially for aromatic ring-containing polymers such as PET and PS) break down the debris into microplastics, thereby increasing the surface area, which mediates microbial degradation. For more details on microplastics-associated bacteria and fungi, we refer to excellent reviews of the field of plastics ecology [8][9][10]. However, the colonization of microplastics does not necessarily indicate that the polymer is degraded, because additives are in general more bioavailable than the polymers. Therefore, measuring weight loss as an indicator for degradation might result in a false interpretation of the data [11], and in the conclusion that we already have many plastics-active enzymes from different microbial sources. A detailed search in the PubMed database revealed that today roughly 1500 publications address the topic of plastics degradation. However, fewer than 50 described the isolation and biochemical characterization of plastics-active enzymes (Tables 1, S1-S3). Nevertheless, while this obvious challenge can be met by better analytical techniques, the by far greater risk for misinterpretation of data comes from the unfiltered and uncritical use of plastic-degrading microorganisms and consortia predicted by unverified bioinformatic tools and pipelines.
For instance, one recent study developed hidden Markov models for some plastic-degrading enzymes and predicted a global distribution even though no such enzymes have been biochemically characterized [12]. Others have developed phylogenetic trees and global distribution patterns by simply using automated literature searches without critical analyses of the data [13]. These very recent studies in high-ranking journals are perhaps only the tip of the iceberg, but clearly demonstrate that there is an urgent need for standardized and verified enzyme databases in this rapidly developing field. The uncritical and unfiltered use of many of the potential plastic-degrading gene sequences ultimately leads to incorrect conclusions on the availability of plastic-degrading enzymes and their role in nature. These studies not only mislead researchers, they furthermore suggest to environmentalists, policy and law makers, and even to the broader public that we have solutions for the global plastics problem, which we do not. Within this framework, the proposed PAZy database will be a reliable and useful tool giving an overview of truly functional enzymes.
Notably, a rather small number of degrading enzymes are known today, and only for polyethylene terephthalate (PET), polyurethane (PUR), and polyamide (PA); none are known for other major polymers such as PVC, PE, PP, and PS, nor for most of the PUR-based polymers. The known plastics-degrading enzymes are hydrolases, often annotated as lipases, esterases, cutinases, amidases, or proteases (E.C. 3.1.x). However, we still have a limited understanding of the mechanism of enzymatic degradation. It is not clear to what extent bacteria have evolved specific enzymes that bind to the polymers and cleave the bonds, similar to the processes that occur when cellulose or other biopolymers are degraded. It is supposed that plastics-degrading enzymes are exoenzymes, and it can be speculated that plastics-binding domains or proteins might contribute to degradation, similar to the role of cellulose binding domains or expansins in the degradation of cellulosic materials.
To advance the research field, we have collected information on the currently known and verified plastics-active enzymes in the Plastics-Active Enzymes Database (PAZy). It will serve as a comprehensive resource for the identification of further novel plastics-active enzymes, pathways, or microorganisms for plastics removal in industry and the environment. It will further help to advance the circular use of the different plastic types. Finally, PAZy will in general be a valuable repository and tool in this emerging field of plastics research.

Data selection
Protein sequences with available Uniprot identifiers and known activity against PET or PUR were downloaded from the Plastics Microbial Biodegradation Database [14] (PMBD, http://pmbd.genomemining.cn/home/) and NCBI GenBank. Based on available biochemical and/or structural data, in total 44 protein sequences were selected from the PMBD, with 34 and 10 Uniprot identifiers for PET and PUR activity, respectively (as of November 2021).

PETase homologues
We are aware that there are controversial discussions about the term PETase, but we prefer to define all PET-active enzymes as PETases. Sixteen protein sequences for enzymes with known activity against PET were clustered using CD-HIT (version 4.6.8-1) at a threshold of 90% sequence identity and a word length of 5 to derive a reduced set of twelve centroid sequences [15,16]. These protein sequences were aligned in a structure-guided multiple sequence alignment by T-COFFEE (version 11.00.8cbe486-1) [17]. A profile hidden Markov model (HMM) was derived from this multiple sequence alignment by HMMER (version 3.1b2, http://hmmer.org). The profile HMM was trimmed by selecting alignment columns that corresponded to the region between amino acid positions 32 and 274 in the PETase from Ideonella sakaiensis (Is PETase, Uniprot identifier A0A0K8P6T7) to avoid ambiguities at the N- and C-termini (Table S4 and Figure S1). The profile HMM and the underlying multiple sequence alignment can be downloaded from https://doi.org/10.18419/darus-2055. This PETase-profile HMM was used to search both the NCBI non-redundant (nr) protein database and the Protein Data Bank (PDB) for an update of the Lipase Engineering Database (LED, https://led.biocatnet.de), which was previously established as a collection of protein sequences from α/β-hydrolases [18][19][20]. Hits for the PETase-profile HMM were selected from the HMMER results with a minimal score of 100, a minimal profile coverage of 95%, and a maximum bias/score ratio of 10%.
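The three selection criteria for profile HMM hits can be sketched as a simple filter. This is a minimal illustration and not the pipeline used for the LED update: the `HmmHit` record and its field names are hypothetical, profile coverage is assumed to be the matched fraction of profile positions, and the profile length of 243 is assumed from the trimmed region 32 to 274.

```python
from dataclasses import dataclass


@dataclass
class HmmHit:
    """One profile-HMM hit; field names are illustrative, not HMMER's own."""
    target: str
    score: float    # bit score
    bias: float     # bias composition correction
    hmm_from: int   # first matched profile position (1-based)
    hmm_to: int     # last matched profile position (inclusive)


def passes_filter(hit, profile_length, min_score=100.0,
                  min_coverage=0.95, max_bias_ratio=0.10):
    """Apply the three thresholds stated in the text to a single hit."""
    if hit.score < min_score:
        return False
    # Assumption: coverage is the fraction of profile positions in the match.
    coverage = (hit.hmm_to - hit.hmm_from + 1) / profile_length
    if coverage < min_coverage:
        return False
    return hit.bias / hit.score <= max_bias_ratio


# Assumed profile length: one position per residue of the region 32-274.
PROFILE_LENGTH = 274 - 32 + 1

# Hypothetical hits, invented for illustration only.
hits = [
    HmmHit("seq_A", score=310.2, bias=1.5, hmm_from=1, hmm_to=243),
    HmmHit("seq_B", score=85.0, bias=0.2, hmm_from=1, hmm_to=243),    # score too low
    HmmHit("seq_C", score=150.0, bias=0.5, hmm_from=40, hmm_to=243),  # coverage too low
    HmmHit("seq_D", score=120.0, bias=30.0, hmm_from=1, hmm_to=240),  # bias ratio too high
]
selected = [h.target for h in hits if passes_filter(h, PROFILE_LENGTH)]
print(selected)
```

Only the first hypothetical hit satisfies all three thresholds at once; each of the others fails exactly one criterion.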
HMMER was also used to identify the C-terminal region for the Type IX secretion system sorting domain, using the profile HMM TIGR04183, which was derived from a multiple sequence alignment of 889 protein sequences in the TIGRFAM database (http://tigrfams.jcvi.org/cgi-bin/index.cgi), with an E-value cut-off below 1.

PURase homologues
Four protein sequences for enzymes with known activity against PUR served as queries for BLAST (blastp, version 2.10.0+) against the NCBI non-redundant (nr) protein database and the PDB [21]. BLAST performance was improved by multithreading with GNU Parallel (version 20170622-1) [22]. The BLAST results were filtered by an E-value threshold of 10^-10 and a minimal coverage of 50% to further update the LED.
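The filtering step can be illustrated with a short sketch. The text does not specify whether coverage refers to the query or the subject, nor whether the E-value bound is inclusive; the sketch below assumes query coverage and an inclusive bound. The row layout is a hypothetical simplification of tabular BLAST output.

```python
def filter_blast_hits(rows, max_evalue=1e-10, min_coverage=0.50):
    """Keep hits with E-value <= max_evalue and query coverage >= min_coverage.

    Each row is (subject_id, evalue, q_start, q_end, query_length); this
    layout is a hypothetical simplification of tabular BLAST output.
    """
    kept = []
    for subject_id, evalue, q_start, q_end, query_length in rows:
        # Assumption: coverage is the aligned fraction of the query sequence.
        coverage = (q_end - q_start + 1) / query_length
        if evalue <= max_evalue and coverage >= min_coverage:
            kept.append(subject_id)
    return kept


# Hypothetical hits, invented for illustration only.
rows = [
    ("hit_1", 1e-50, 1, 400, 420),  # passes both thresholds
    ("hit_2", 1e-5, 1, 400, 420),   # E-value too high
    ("hit_3", 1e-30, 1, 150, 420),  # coverage too low (about 36%)
]
print(filter_blast_hits(rows))
```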

Conservation analyses
The PETase-profile HMM was used to establish a standard numbering scheme by aligning the 2930 sequences of PETase homologues from the LED against the respective profile HMM and subsequently assigning the position numbers from the Is PETase reference sequence as standard position numbers. For conservation analysis of PETase homologues, the frequency of amino acid residues or gaps was counted at each standard position.
For the conservation analysis of PURase homologues from LED superfamilies 11 and 13, two multiple sequence alignments were generated using Clustal Omega (version 1.2.4) [23], and the frequency of amino acid residues or gaps was counted at selected positions.
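The idea of a standard numbering scheme combined with per-position counting can be sketched as follows. This is a simplified illustration that assumes all sequences are already rows of one alignment that includes the reference; the function names and the toy alignment are hypothetical, and the actual procedure aligned sequences against the profile HMM.

```python
from collections import Counter


def standard_positions(aligned_reference, first_position=1):
    """Map alignment columns to reference position numbers (None at gaps)."""
    numbering, pos = [], first_position
    for symbol in aligned_reference:
        if symbol == "-":
            numbering.append(None)
        else:
            numbering.append(pos)
            pos += 1
    return numbering


def conservation(aligned_seqs, aligned_reference, first_position=1):
    """Return {standard position: Counter of residues or gaps}."""
    numbering = standard_positions(aligned_reference, first_position)
    table = {}
    for col, std_pos in enumerate(numbering):
        if std_pos is None:
            continue  # column has no counterpart in the reference
        table[std_pos] = Counter(seq[col] for seq in aligned_seqs)
    return table


# Toy alignment with arbitrary position numbers, for illustration only.
reference = "YQ-SM"
homologues = ["YQASM", "FQ-SM", "YE-SL"]
table = conservation(homologues, reference, first_position=87)
for position, counts in table.items():
    print(position, dict(counts))
```

Columns where the reference carries a gap receive no standard position number, so insertions relative to the reference are skipped rather than renumbered.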

Protein sequence networks
Sets of representative protein sequences were formed by clustering with CD-HIT to reduce the sample size and thus the computational effort for pairwise sequence alignments. Values of pairwise sequence identity or similarity were calculated by the Needleman-Wunsch algorithm available in EMBOSS (version 6.6.0) with default gap opening and gap extension penalties of 10 and 0.5, respectively, and the substitution matrix BLOSUM62 [24,25].
Collections of protein sequences were represented as protein sequence networks that depicted sequences as nodes connected by edges (lines). The edges in a protein sequence network were weighted by values of pairwise sequence identity or similarity. A threshold of the respective edge weights was chosen to select a subset of edges for the network. Protein sequence networks were visualized in Cytoscape (version 3.8.2) with the prefuse force-directed layout algorithm, taking the edge weights into account [26]: nodes connected by edges of higher sequence identity or similarity were preferably placed in closer vicinity to each other. The Python NetworkX package (version 1.11) was used to store the metadata of protein sequence networks in GraphML format, available for download at https://doi.org/10.18419/darus-2054 [27].
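How an edge-weight threshold shapes a protein sequence network can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the workflow used here: identity is assumed to be computed over the full alignment length including gap columns, the pairwise weights are invented, and a small union-find stands in for the NetworkX and Cytoscape tooling.

```python
def percent_identity(aln_a, aln_b):
    """Identity over the full alignment length, gaps included (an assumption)."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for a, b in zip(aln_a, aln_b) if a == b and a != "-")
    return 100.0 * matches / len(aln_a)


def select_edges(weights, threshold):
    """Keep edges whose weight (in percent) is at least `threshold`."""
    return {pair: w for pair, w in weights.items() if w >= threshold}


def connected_components(nodes, edges):
    """Group nodes into clusters linked by the retained edges (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical pairwise similarities in percent, for illustration only.
weights = {("A", "B"): 72.0, ("A", "C"): 58.0, ("B", "C"): 61.0,
           ("A", "D"): 35.0, ("B", "D"): 30.0, ("C", "D"): 40.0}
nodes = ["A", "B", "C", "D"]
for threshold in (60, 70):
    edges = select_edges(weights, threshold)
    print(threshold, connected_components(nodes, edges))
```

Raising the threshold removes edges and can split a cluster into separate groups, which is the effect described for the network analyses in this paper.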

Update of the Lipase Engineering Database
We focus only on validated enzymes acting on the synthetic polymers PET and PUR. Detailed biochemical data of catalytically active enzymes (Tables 1, S1-S3), analysed sequences, and structures of homologous proteins are compiled in the Plastics-Active Enzymes Database and accessible at https://www.pazy.eu. Within the PAZy infrastructure, the Lipase Engineering Database (LED, https://led.biocatnet.de) serves as the database for protein sequences and structures from different superfamilies of α/β-hydrolases and their sequence annotations, since all currently known enzyme activities towards PET or PUR were reported for α/β-hydrolases.
The updated LED contains 283,672 sequence entries and 1590 PDB entries (an increase of 3034 and 33 entries, respectively, compared to the previously published LED version from June 2019). For the update of the LED, sequences that shared at least 50% similarity were assigned to the same superfamily (Table S5). Sequences that shared at least 60% similarity were assigned to a homologous family; otherwise, they were assigned to a separate group containing all "singleton" sequences. A new superfamily was introduced for PURase homologues of PudA from Delftia acidovorans, as outlined in more detail below.

Biochemical properties
Our literature searches identified a total of 34 enzymes that have been shown to catalyse the partial degradation of PET to oligomers or even to monomers, originating from four different bacterial phyla and one eukaryotic lineage (Table 1). No archaeal PETases have been functionally verified to date.
Many of the currently known PETases are thermostable enzymes, because the catalytic activity increases at temperatures close to the glass transition temperature (65 °C) of PET due to the formation of flexible and thus enzyme-accessible amorphous domains [28]. Notably, a few enzymes are active at lower temperatures, implying that they may play a role in cold-adapted PET degradation [29]. However, all known native PETases have rather low catalytic activity toward PET.

Enzyme structures
All known PETases (EC number 3.1.1.101) are homologues of cutinases and are therefore annotated as dienelactone hydrolases, lipases, or esterases. In solution, PETases are supposed to be active as monomers [30]. A total of twelve structures affiliated with different organisms are available in the PDB. For the best characterized examples, LCC and Is PETase, multiple entries of variants have been made. All PETases consist of a single domain, the α/β-hydrolase fold, which is formed by a central twisted β-sheet flanked by two layers of α-helices [31], and thus belong to a class of small α/β-hydrolases that consist only of the core domain without a mobile lid [20]. For a few PETase homologues from Bacteroidetes, an additional C-terminal sorting domain for the type IX secretion system has been annotated and was verified in the single structure published (Figure S5, Table 1). The type IX secretion system comprises several protein components [32], and the corresponding C-terminal domain was also found in other polymer-active enzymes such as cellulases and endo-1,4-β-glucanases [33][34][35][36]. PETases share a conserved catalytic triad of serine, histidine, and aspartic acid, and a GX-type oxyanion hole, which stabilizes the reaction intermediate [37]. In the PETase homologues, the first oxyanion hole residue X is mostly a conserved aromatic residue such as tyrosine or phenylalanine. The second oxyanion-stabilizing residue is a conserved methionine following the serine of the catalytic triad. For the PETase from I. sakaiensis, several residues were suggested for substrate binding, such as an aromatic clamp formed by the first residue of the oxyanion hole and a second aromatic residue [38]. In addition to this subsite I, a second subsite II was proposed from the interaction observed in a modelled complex with a PET monomer [39]. Variants with increased catalytic activity were designed and tested. The most active enzymes are LCC variants, where the addition of disulfide bonds increased thermostability [40]. The two LCC quadruple variants F243[WI]-D238C-S283C-Y127G, which include the additional disulfide bond, are among the most active PETases described to date.

Sequence network
For the comparison of PETase sequences, the profile HMM for PETases was used to identify the PETase core domain, and the sequences of all core domains were aligned without considering additional regions at the N- or C-termini (signal peptides or transport domains, respectively). In superfamily 1 of the LED, 31,560 sequences were annotated as GX-type, but only 2930 sequences were identified as PETase homologues by the profile HMM. At a threshold of 55% sequence similarity, the bacterial PETase core domains formed a large cluster, mainly originating from Actinobacteria or Proteobacteria (Figure 1). Most of the sequences from the PMBD were found in this cluster (Figure S3). In addition, a connected subgroup of PETase core domains from other bacterial phyla emerged, such as the PETase proteins from Bacteroidetes or Planctomycetes. Some homologues of PETase core domains also occurred in enzymes from extremophiles (Figure S4). The fungal PETase core domains, such as the PETase homologues from Fusarium, were separated from the bacterial PETase core domains. At a higher threshold of 60% sequence similarity, the sequences for PETase core domains from Bacteroidetes or Planctomycetes emerged as a separate cluster (Figure S5).

Sequence motifs
The PETase-profile HMM was applied to analyse the conservation of amino acid residues in the 2930 PETase core domains annotated in the LED (Table S6) in comparison to the equivalent positions in the PETase from Ideonella sakaiensis (Is PETase, Uniprot identifier A0A0K8P6T7) and LCC (Uniprot identifier G9BY57). The catalytic triad, the previously suggested PET binding subsite I, which includes an aromatic clamp for possible substrate interaction, and PET binding subsite II from [39] were found to be highly conserved (Table 2). The extension of the second α-helix and the extended loop region, which were described previously as functionally relevant in Is PETase, were also found in several PETase homologues in the LED.
Using the position numbers from Is PETase, we suggest a typical PETase sequence motif written as follows (with X indicating an arbitrary amino acid): [YF]87, Q119, X3 139-141, S160, M161, W185, D206, H237, X6 242-247, followed by one of the previously published amino acid substitutions from [40]. Interestingly, two sequences, from an uncultured bacterium (NCBI: ACC95208.1) and from Alkalilimnicola ehrlichii (NCBI: WP_116302080.1), were found to comprise the PETase sequence motif and W238, which was mentioned as an amino acid substitution for improved activity and substrate binding. Four additional sequences (from Caldimonas manganoxidans, NCBI: WP_019560450.1; from C. taiwanensis, NCBI: WP_062195544.1; from Rhizobacter gummiphilus, NCBI: WP_085749610.1; and from Aquabacterium sp., NCBI: MBI3384080.1) were found to comprise the PETase sequence motif and M241, which was mentioned as an amino acid substitution for improved thermostability. These six different and novel protein sequences, each selected by a sequence motif of seventeen amino acid positions in total, are proposed for upcoming studies on PETase activity.
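A check of a candidate sequence against the suggested motif can be sketched as follows. This is one possible reading of the motif, not the selection procedure used for the candidates above: the input is assumed to be a hypothetical mapping from standard position numbers to one-letter residues, the X positions are assumed to require any residue rather than a gap, and only the two substitutions named in the text (W238, M241) are considered.

```python
# Fixed positions of the suggested motif, numbered as in Is PETase.
FIXED = {87: {"Y", "F"}, 119: {"Q"}, 160: {"S"}, 161: {"M"},
         185: {"W"}, 206: {"D"}, 237: {"H"}}
# Positions that may hold any amino acid (X3 139-141 and X6 242-247).
ARBITRARY = list(range(139, 142)) + list(range(242, 248))
# Substitutions named in the text; a candidate needs at least one of them.
SUBSTITUTIONS = {238: "W", 241: "M"}


def matches_petase_motif(residues):
    """True if `residues` (standard position -> residue) fits the motif.

    Assumption: an X position counts as matched when it is occupied by
    any residue, i.e. it is neither missing nor a gap character.
    """
    fixed_ok = all(residues.get(p) in allowed for p, allowed in FIXED.items())
    arbitrary_ok = all(residues.get(p, "-") != "-" for p in ARBITRARY)
    substitution_ok = any(residues.get(p) == aa
                          for p, aa in SUBSTITUTIONS.items())
    return fixed_ok and arbitrary_ok and substitution_ok


# Hypothetical candidate, invented for illustration only.
candidate = {87: "F", 119: "Q", 160: "S", 161: "M",
             185: "W", 206: "D", 237: "H", 238: "W"}
candidate.update({p: "A" for p in ARBITRARY})
broken = {**candidate, 161: "L"}  # methionine at position 161 replaced

print(matches_petase_motif(candidate))
print(matches_petase_motif(broken))
```

Under this reading, the motif spans seven fixed positions, nine arbitrary positions, and one substitution, which agrees with the seventeen positions stated in the text.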

Biochemical properties
Polyurethanes comprise numerous possible polymers of diverse composition, such as combinations of different isocyanates with different polyethers, polyesters, or polycarbonates [41]. The best studied PURases are α/β-hydrolases, as reviewed in more detail in [41]. Only ten characterized PUR-degrading enzymes (PURases) have been reported so far. Four recently identified enzymes (LCC, TfCut-2, Tcur 1278 T, and Tcur0390) are cutinases from Actinomyces, which are also active on PET and have a broad substrate profile [42]. Further bacterial lipases from Gammaproteobacteria have been identified and characterized, such as PueA and PueB from Pseudomonas chlororaphis. Whereas all the above-mentioned studies identified lipases or esterases, earlier studies reported that commercially available peptidases and proteases might also degrade thin PUR films [43].

Enzyme structure
The ten known PURases belong to two groups (Table S1): four belong to the cutinases (LCC, TfCut-2, Tcur 1278 T, and Tcur0390) and are similar to PETases. No crystal structures are available for the PURases PueA and PueB from Pseudomonas chlororaphis, but structures of homologues indicate that they belong to superfamily 11 of the LED, whose members have, in addition to the core domain, a mobile N-terminal lid, which might mediate binding to the substrate interface and substrate access, and an additional C-terminal β-sandwich domain. The four PETases and PueA and PueB from P. chlororaphis are GX types [37]. Recently, a modelling study on the PURases from Pseudomonas chlororaphis predicted putative substrate binding sites for PUR-like substrates [44]. However, a rearrangement of the substrate was observed upon the molecular simulation of the complex, which is an additional challenge for the identification of the substrate binding site [45]. Interestingly, most of the substrate binding residues predicted for PueA [44] are conserved (Table S7).
The PURase from Delftia acidovorans (Uniprot identifier Q9WX47) does not belong to the GX-type hydrolases, but has sequence similarity to carboxylesterases of superfamily 13 and to the family PF00135 in Pfam, and thus belongs to the GGGX-type hydrolases [37]. Other carboxylesterases in the LED are members of superfamily 4, which have a mobile lid between β-strand -4 and -3 of the core domain. Because the PURase from Delftia acidovorans shared less than 50% sequence similarity to the sequences in the LED and due to the lack of experimental structure information, it was assigned to a new superfamily.

Sequence network
A threshold of 60% similarity was used to construct protein sequence networks for LED superfamilies 11 and 13, whose members originated from Proteobacteria only (Figure 2). Two disconnected sequence networks emerged: a network of 127 representative sequences of superfamily 11, which contains the GX-type PURases PueA and PueB from Pseudomonas chlororaphis, and a network of fifteen representative sequences of superfamily 13, which contains the GGGX-type PURase PudA from Delftia acidovorans. In both superfamilies 11 and 13, the sequences originated mainly from Gammaproteobacteria, with the genus Pseudomonas being most frequently annotated in superfamily 11.

Sequence motifs
Two sequence motifs were reported previously in [46] for the PURase from Pseudomonas chlororaphis, PueA (Table 3). The occurrence of the serine hydrolase motif (GXSXG) and the secretion signal sequence motif (GGXGXDXXX) was confirmed for the vast majority of all 2054 sequence entries in a multiple sequence alignment for superfamily 11 of the LED. Most PURase homologues from superfamily 11 have a GHSLG motif flanking the catalytic serine and the secretion motif GGKGNDYLE. For sequences from superfamily 11 in the LED, the catalytic triad is formed by serine, aspartate, and histidine. In addition, most sequences in superfamily 11 (2039 out of 2054) matched the profile HMM for an RTX calcium-binding nonapeptide repeat (Pfam PF00353), which supports the previous suggestion of a Type I secretion system for protein translocation [47].
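The two motifs can be expressed as regular expressions, with X read as any single residue. The sketch below searches an unaligned sequence directly, whereas the analysis above was performed on a multiple sequence alignment; the toy sequence is invented and merely embeds the two motif instances quoted in the text.

```python
import re

# X stands for any amino acid; lookaheads also report overlapping matches.
SERINE_HYDROLASE = re.compile(r"(?=(G.S.G))")      # GXSXG
SECRETION_SIGNAL = re.compile(r"(?=(GG.G.D...))")  # GGXGXDXXX


def find_motif(pattern, sequence):
    """Return (1-based start, matched segment) for every occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Invented toy sequence embedding the motif instances GHSLG and GGKGNDYLE.
toy = "MKT" + "GHSLG" + "AAAA" + "GGKGNDYLE" + "PP"
print(find_motif(SERINE_HYDROLASE, toy))
print(find_motif(SECRETION_SIGNAL, toy))
```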
Prominent amino acid positions in superfamily 13, which comprises homologues of the PURase PudA from Delftia acidovorans, include the GXSXG serine hydrolase motif, a catalytic triad of serine, aspartate, and histidine, and a putative PUR binding region at PudA positions 347 to 395 [48]. Many of these positions were found to be conserved within the sequences of superfamily 13 (Table 4). Most PURase homologues from superfamily 13 have the motif GESAG flanking the catalytic serine, the motif VPX3G[ST]X2DE at the catalytic glutamate, and the motif AXHX3[LI]XY flanking the catalytic histidine. In addition, several positions in the putative PUR binding region were found to be conserved, including mostly hydrophobic amino acids (Table S8).

Sequence annotations of plastics-active enzymes
Enzymes of different EC classes have been proposed to contribute to the degradation of PET or PUR. α/β-Hydrolases annotated as cutinases, esterases, or lipases were described to catalyze the hydrolysis of ester bonds and were collected in the PMBD [14], and peroxidases and laccases have been reported to enhance degradation of PUR [49]. For most types of plastics, however, knowledge on enzymatic degradation is missing, although materials such as polypropylene (PP) are produced on a large scale and contribute extensively to global plastics pollution. In this paper, we focus on PET- and PUR-degrading α/β-hydrolases due to the availability of sequence information and detailed experimental data from literature and public databases [14].
Many PETase homologues were found in the NCBI non-redundant protein database, due to the sequencing efforts on cutinase-expressing bacterial phyla such as Actinobacteria and Proteobacteria. In contrast, sequences of fungal origin are underrepresented. The usage of metagenomics is expected to further broaden the scope of currently known PETases [50]. The PETase sequence motif suggested herein (Table 2) is based on current knowledge from literature, such as the occurrence of an additional flexible loop region and amino acids that seem relevant for interaction with PET. The seven suggested PETase candidates can be used as starting points for wet-lab experiments, such as protein design experiments for improved PETase activity or thermostability.
Although PETases and PURases share the α/β-hydrolase fold as catalytic domain, their structure and their oxyanion hole types differ. All 2930 PETase homologues belong to a large family (36936 sequences) of GX-types [37], which consist of the core domain only (superfamily 1 in the Lipase Engineering Database [20]), whereas PURases belong to one of three superfamilies: superfamily 1; superfamily 11, which consists of GX-types with two additional domains, an N-terminal mobile lid and a C-terminal β-sandwich domain (2054 sequences); or superfamily 13, which comprises carboxylesterases of the GGGX-type (44 sequences). Ample structural information is available for PET-degrading α/β-hydrolases, especially for LCC and Is PETase, whereas the structures of PUR-degrading α/β-hydrolases have not been resolved yet. The structural information on PETases has also inspired the design of improved PETase variants, as recently demonstrated for variants based on LCC [40]. Previously, conserved subsite I, subsite II, and an extended loop region were identified in Is PETase and used for a systematic comparison with its homologues [39], which was confirmed by our conservation analysis of 2930 protein sequences. The profile HMM for PETases and the derived standard numbering scheme are available at PAZy. They will help in the identification and comparison of amino acid positions reported in literature and will facilitate the design of new PETase variants. For PURases, homology models predict substrate binding regions [44,45]. Recently, a putative PURase was identified in the Proteobacterium Serratia liquefaciens [51]. Similarly, most putative PURases in PAZy are from Proteobacteria (mainly from Pseudomonas chlororaphis and Delftia acidovorans).

Major challenges in searches for plastics-active enzymes
One of the major challenges will be the identification of novel enzymes for polymers for which none are currently known [11,52]. Thus, we urgently need enzymes acting on PE, PP, and PVC, but also on polymeric PA and PUR ether bonds. Further, we realized the lack of a common PET or PUR model substrate, which would allow the direct comparison of the kinetic parameters of different plastics-active enzymes. Instead, kinetic analysis is generally performed with typical esterase substrates such as pNP-caproate. These data, however, do not allow a reliable prediction of the actual plastics activity. The few enzymes that have been characterized using polymers were tested on different polymer types, and pre-treatment was used to enable better degradation (see Table 1 for references). In addition, all kinetic data were recorded using single-point measurements; thus, the hydrolysis of the polymer could not be separated from the attachment of the enzyme to the polymer surface or from the hydrolysis of the resulting oligomers. Within this framework, a characterization of surface area and a control of surface properties such as crystallinity would be favourable to obtain better and more reliable kinetic data on the actual polymer.
The accumulation of verified plastics-active enzymes in databases with a reliable structure-function analysis will allow more predictive searches to rapidly and reliably identify novel and more active enzymes. Thereby, it will foster the search for and development of novel pathways to create designer bugs to solve the plastics problem.

How can databases contribute to a solution of the plastics waste problem?
For an efficient enzymatic degradation of plastics, we see four challenges. First, enzyme families other than α/β-hydrolases should be considered. For instance, laccases or peroxidases can also act on PUR [41]. First reports have been published but fell short in the identification of proteins and genes. Enzymatic or non-enzymatic degradation of other plastics components, such as dyes or additives from commercial sources, might need further investigation, too. Second, there is an increasing need for comparable data on plastics degradation. The comparability and reproducibility of data on enzymatic plastics degradation is impeded by the variety of possible substrates, e.g., in the case of plastics built from several types of monomers. Furthermore, physical properties of plastic materials can differ remarkably among different commercial suppliers, e.g., thickness of plastic foils, number of additives, or crystallinity. Finally, incorrect annotation of genome and metagenome datasets has resulted in the accumulation of many false-positive plastic-degrading enzymes in various publications but also in the PMBD. Removing these from the databases is a major task.
Within this setting, the PAZy database provides information on several well-studied enzymes supplemented by the protein sequences and structures of homologous sequences, which enables the search for novel candidates and the design of enzyme variants. Most protein sequences and functional data are available for PETases, for which several positions were already assessed experimentally for substrate binding or thermostability in earlier studies from literature (Table 2). The standard numbering scheme of PETase homologues facilitates the comparison of different amino acid positions from literature and the identification of sequence motifs for PETases, as shown for the comparison between Is PETase, LCC, and other PETase homologues. In upcoming studies, the PAZy database will be updated to cover sequences from protein family databases other than the LED, depending on the availability of structures and biochemical data. Finally, the underlying web platform of the PAZy database makes it easier to share knowledge on various plastics-degrading enzymes in a more comparable way. Thus, it can be expected that the suggested sequence motifs for PETases or PURases will be refined as more experimental data on residues for substrate binding or thermostability are made available in the future.

FIGURE 2. Network representation for 142 complete protein sequences similar to PURases linked by 6419 edges. The protein sequences depicted here were selected by clustering at a threshold of 90% sequence identity. Edges (links) were selected at a threshold of 60% global sequence similarity, without defining a core domain region. Nodes are coloured according to their annotated source organisms, with Proteobacteria in blue and unknown bacteria in white. The network on the left represents sequences with an N-terminal lid and a C-terminal β-sandwich domain and contains 127 nodes connected by 6314 edges. Diamonds represent sequences originating from the genus Pseudomonas (from the class Gammaproteobacteria). The network on the right represents sequences similar to carboxylesterases and contains 15 nodes connected by 105 edges. Squares represent sequences originating from the class of Betaproteobacteria. See Methods section for more details on the network layout.