Substrate profiling of the metalloproteinase ovastacin uncovers specific enzyme–substrate interactions and discloses fertilization‐relevant substrates

The metalloproteinase ovastacin is released by the mammalian egg upon fertilization and cleaves a distinct peptide bond in zona pellucida protein 2 (ZP2), a component of the enveloping extracellular matrix. This limited proteolysis causes zona pellucida hardening, abolishes sperm binding, and thereby regulates fertility. Accordingly, this process is tightly controlled by the plasma protein fetuin‐B, an endogenous competitive inhibitor. At present, little is known about how the cleavage characteristics of ovastacin differ from closely related proteases. Physiological implications of ovastacin beyond ZP2 cleavage are still obscure. In this study, we employed N‐terminal amine isotopic labeling of substrates (N‐TAILS) contained in the secretome of mouse embryonic fibroblasts to elucidate the substrate specificity and the precise cleavage site specificity. Furthermore, we were able to unravel the physicochemical properties governing ovastacin–substrate interactions as well as the individual characteristics that distinguish ovastacin from similar proteases, such as meprins and tolloid. Eventually, we identified several substrates whose cleavage could affect mammalian fertilization. Consequently, these substrates indicate newly identified functions of ovastacin in mammalian fertilization beyond zona pellucida hardening.

Ovastacin (encoded by the gene ASTL), a member of the astacin family of metalloproteinases, is one of the enzymes that regulate fertilization at the level of the zona pellucida (ZP), the egg-enveloping extracellular matrix. As a typical astacin-like proteinase, ovastacin comprises a catalytic domain of 198 amino acid residues containing the extended zinc-binding motif (H182ExxHxxGxxH192) and the strictly conserved 1,4-β-turn, called the Met-turn (M236), typical for metzincin proteinases [13]. Additionally, it contains a unique cortical granule localization motif (D52KDIPAIN59) within the N-terminal propeptide, upstream of the conserved zinc-binding aspartate ensuring latency, and a C-terminal region (CTR) of 150 residues in mice with unknown function [14,15].
Upon plasmogamy and oocyte activation, ovastacin is released into the perivitelline space during the cortical reaction and cleaves a distinct peptide bond (SRLA↓D168ENQ) within zona pellucida protein 2 (ZP2), thereby converting it into ZP2f [1]. Thus, sperm reception is disrupted and the ZP "hardens," which prevents sperm binding to the ZP and definitively blocks its penetration by further sperm [1,16]. Even though ZP2 has so far been the only proven physiological substrate of ovastacin, cleavage of additional substrates in the complex secretome of fertilization appears likely.
Phenotypically, ovastacin-deficient mice (ASTL−/−) are inconspicuous except for a significant reduction in fecundity [17]. Here, the absence of ZP hardening results in a soft ZP, which fails to mechanically protect the embryo until implantation. During in vitro fertilization (IVF), ASTL−/− oocytes and embryos are covered by a vast number of bound sperm, with many sperm in the perivitelline space [17,18]. However, this does not increase the rate of polyspermy, which further supports the concept that the decisive block against polyspermy is located at the oolemma level [18-20]. Similarly, loss-of-function mutations in ovastacin also appear to affect human reproduction [21].
Absence of the plasma protein fetuin-B, an endogenous ovastacin inhibitor, leads to female infertility [22]. Here, small quantities of prematurely released ovastacin are sufficient to prevent fertilization. Hence, strict regulation of this proteinase is essential to maintain fertility. Conversely, supplementation of fetuin-B during IVF significantly improves the success of fertilization by effectively blocking prematurely released, intracellularly activated ovastacin [17,22,23].
Recently, we were able to elucidate the precise mechanism of this inhibition [24-26]. In mammals, only three members of the astacin family of proteinases, namely ovastacin and the closely related meprins α and β, are inhibited by fetuin-B [24,27]. The underlying molecular mechanism differs greatly from the classical cystatin/papain interaction [28]. This inhibition by the so-called "raised elephant trunk mechanism" is not due to the interaction of one or both cystatin domains of fetuin-B with the active site of ovastacin, but is rather based on the warhead-containing linker (C154PDC157) between the two cystatin-like domains and a hairpin loop in the second cystatin-like domain.
In the present study, we analyzed the cleavage specificity of ovastacin by applying N-terminal amine isotopic labeling of substrates (N-TAILS) [29] to the secretome of mouse embryonic fibroblasts (MEFs). We found unique characteristics of ovastacin that differ from those of other astacins. Furthermore, we identified a variety of cleavage sites in novel potential physiological substrates. These cleavages might extend the functions of ovastacin during fertilization.

N-TAILS analysis
We used the secretome proteins of mouse embryonic fibroblasts (MEFs) as a versatile resource for mass spectrometric N-TAILS analysis of the cleavage specificity of heterologously expressed murine ovastacin. We favored this model because ovastacin and its regulation by fetuin-B have been best studied for the mouse ortholog [1,17,22,24,30-33]. Since the physiological function, that is, ZP2 cleavage, and its regulation by fetuin-B are highly conserved [24,25,30], we assume this model is valid for most therians and thus also for humans as well as domestic animals. MEFs were chosen rather than germ cells to overcome the particularly low quantity of germ cell secretome and to avoid very high animal consumption. Based on this complex mixture of natively folded mouse proteins, we unraveled ovastacin's specific cleavage preference. In addition, we detected a series of novel, potentially physiologically relevant ovastacin substrates. Altogether, 855 unique cleavage sites were identified in 489 proteins. After restriction to the physiological context, that is, the extracellular space, we obtained a dataset of 353 unique cleavage sites in 163 proteins (Table S1).

Ovastacin adds individual characteristics to the typical acidic prime side specificity of most astacin metalloproteinases
Results are presented according to the concept of Schechter and Berger [34], indicating amino acid residues located N- and C-terminally of the cleavage site as "non-prime" (P) and "prime" (P′), respectively. The corresponding subsites of the protease, harboring the substrate's side chains, are termed S and S′, accordingly. The general cleavage profile of the 855 unique cleavage sites is shown in Fig. 1 and highlighted by the graphic iceLogo representation (Fig. 1D). The cleavage site in mouse ZP2, so far the only known physiological ovastacin substrate, is SRLA↓D168ENQ [1]. Figure 1E shows a frequency plot of the corresponding ovastacin cleavage sites in mammalian ZP2 proteins (PGMA↓DENA; for alignment see Fig. S1), which deviates slightly from the mouse sequence. The key preference of ovastacin is aspartate in position P1′. At this crucial position, aspartate even outnumbers glutamate at a striking ratio of 15 : 1. In general, aspartate is clearly favored over glutamate in all positions of the prime side (P1′ to P5′). Ovastacin significantly prefers negatively charged residues in positions P1′ to P6′, especially aspartate in positions P1′ (+70.1%, relative to natural occurrence) and P2′ (+16.6%) (Fig. 1C,D). There is considerable variability in position P1, where aspartate (+7.6%), phenylalanine (+4.8%), alanine (+4.6%), tryptophan (+4.0%), asparagine (+3.7%), and histidine (+3.5%) are found predominantly. In positions P2 to P4, hydrophobic residues are preferred, such as leucine (P2, +19.8%), phenylalanine (P2, +8.2%), valine (P3, +5.0%), and isoleucine (P4, +5.3%). In summary, on the non-prime side, bulky aromatic side chains are observed at higher frequency in positions P3 to P1, whereas negatively charged residues, especially aspartate, predominate on the prime side. This is highlighted by the graphic iceLogo representation (Fig. 1D). As reflected by our N-TAILS analysis, the most crucial side chains are P1′ and P2′. Adjacent side chains in P4, P3, and P2 on the non-prime side and P3′ and P4′ on the prime side are more variable, which is probably also reflected by the conservation of the cleavage site in the mammalian ZP2 proteins (Fig. 1E). This probably contributes to species-specific cleavage.
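The percentages quoted above can be read as differences between the observed frequency of a residue at a given position and its natural occurrence. The following is a minimal sketch of that calculation, assuming cleavage sites are supplied as eight-residue windows spanning P4 to P4′. The function name `positional_enrichment`, the flat 5% background, and the two-window toy input are illustrative assumptions, not the pipeline or data used in the study.

```python
from collections import Counter

# Substrate positions covered by each window, from P4 to P4'.
POSITIONS = ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]


def positional_enrichment(windows, background):
    """Return {position: {residue: observed % minus background %}}.

    windows    -- strings of 8 residues spanning P4..P4'
    background -- {residue: natural occurrence in percent} (assumed input)
    """
    if not windows:
        raise ValueError("no cleavage-site windows given")
    if any(len(window) != len(POSITIONS) for window in windows):
        raise ValueError("every window must span P4 to P4' (8 residues)")
    enrichment = {}
    for index, position in enumerate(POSITIONS):
        counts = Counter(window[index] for window in windows)
        enrichment[position] = {
            residue: 100.0 * count / len(windows) - background.get(residue, 0.0)
            for residue, count in counts.items()
        }
    return enrichment


# Toy input: the mouse ZP2 site (SRLA|DENQ) and the mammalian consensus
# (PGMA|DENA) quoted in the text. The flat 5 % background is an assumption
# made only for this illustration.
windows = ["SRLADENQ", "PGMADENA"]
background = {residue: 5.0 for residue in "ACDEFGHIKLMNPQRSTVWY"}
enrichment = positional_enrichment(windows, background)
print(enrichment["P1'"]["D"])  # 95.0
```

With a real dataset, the background would come from the residue frequencies of the reference proteome rather than a flat value.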

Physicochemical properties of the ovastacin substrate binding cleft compared to other astacins
The cleavage preference determined for ovastacin (Fig. 1D) shows high agreement with the cleavage characteristics of other astacins (BMP1, meprin α, meprin β, and LAST MAM), partly due to the preference for the negatively charged aspartate residue in position P1′ [35]. To identify ovastacin-specific characteristics, we analyzed in detail the physicochemical properties of cleavage sites, that is, electrostatics, hydrophilicity/hydrophobicity, volume, flexibility, and degree of conservation of distinct residues flanking the cleavage site (Fig. 2). All signatures of the individual astacin members (Fig. 2, Fig. S2) feature the highest degree of conservation in position P1′, and to a lesser extent in the positions P2 to P2′ for meprin α only. The hydrophobicity of the residues is low in P1 and P1′, and high in P3 and P2. The side chains with the largest volumes tend to be in P3 to P2, those with the smallest in the range P1 to P2′. The flexibility of the residues is lowest between P3 and P2 and highest in the positions between P1′ and P2′. Therefore, the typical substrate of the astacins may be characterized as follows: small, flexible, hydrophilic residues in P1′ and, to a slightly lesser extent, in the flanking positions P1 and P2′, and large, rigid, hydrophobic residues in positions P3 to P2. All investigated astacins exhibit a common pattern, despite significant differences in the individual substrate sequences (Fig. 2), and can be combined in a typical astacin signature ("astacin footprint") (Fig. 2A). What does the individual "ovastacin footprint" look like in comparison? Ovastacin (Fig. 2B,C) shares the family-specific, strong preference for aspartate in P1′, but also prefers aspartate in P2′ to P4′, even over glutamate. The "ovastacin footprint" also includes pronounced hydrophobicity, rigidity, and bulkiness of residues in positions P4 to P2 (especially in P2) (Fig. 2C).
Among astacins, the physicochemical cleavage signature of ovastacin shows the closest similarity to that of meprin β (Fig. 2E). However, in direct comparison, meprin β equally favors glutamate and aspartate in position P1′.
The aspartate residues of the prime side are, depending on the exact orientation of the side chains, close enough to interact with the side chain of Arg264. Lys175 and Arg174 appear too distant for a direct interaction, at least for P1′ and P2′. However, they are likely involved in stabilizing negative charges in positions P3′ and P4′. Generally, in astacin-like metalloproteases, the side chain in P1′ points directly toward the bottom of the active site cleft [37]. This is the reason for its size limitation. Other residues are in contact with the upper and lower rim of the cleft. As seen in Fig. 3C,D, the potential to bind negatively charged side chains is located on the right-hand side of the cleft.
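The position-wise physicochemical profiling described in this section can be illustrated for one property, hydrophobicity. The sketch below averages Kyte-Doolittle hydropathy values per position over a set of aligned cleavage-site windows. The Kyte-Doolittle scale is the one cited in the paper for the surface models, and the scale underlying Fig. 2 may differ. The function name and the two-window toy input are assumptions for illustration only.

```python
# Kyte-Doolittle hydropathy index (higher values are more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy_by_position(windows):
    """Average hydropathy at each position of equal-length windows."""
    if not windows:
        raise ValueError("no cleavage-site windows given")
    length = len(windows[0])
    if any(len(window) != length for window in windows):
        raise ValueError("all windows must have the same length")
    return [
        sum(KYTE_DOOLITTLE[window[i]] for window in windows) / len(windows)
        for i in range(length)
    ]


# Toy input spanning P4..P4': mouse ZP2 (SRLA|DENQ) and the mammalian
# consensus (PGMA|DENA) quoted in the text.
profile = mean_hydropathy_by_position(["SRLADENQ", "PGMADENA"])
print(profile[4])  # P1' (aspartate in both windows): -3.5
```

In this toy example, P2 (leucine and methionine) comes out hydrophobic and P1′ strongly hydrophilic, which mirrors the trend described for the astacin footprint. Volume and flexibility profiles would follow the same pattern with a different lookup table.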

High sequence identity suggests functional conservation of ovastacin within the theria
To assess the potential translatability of the results generated here with murine ovastacin, we analyzed the conservation and phylogeny for all major mammalian taxa, especially primates, artiodactyls, and carnivores, including relevant domestic animals. Here, we primarily focused on the catalytic domain since, in the mouse model, the catalytic domain is released without the CTR as a consequence of fertilization [17]. Accordingly, we considered that the high variability or even a partial loss of the CTR would probably not affect the cleavage specificity, vide supra. Ovastacins of all species analyzed share the typical characteristics required for catalysis (Fig. S3). The sequence identity analysis confirmed very high conservation of ovastacins within all analyzed eutherian species (Fig. 4A). The identity is at least 74% to the mouse and 81% to the human ortholog. Within primates, the identity is even > 95% (outlined in green); within the cetartiodactyls, it is ≥ 88% (red); and within the carnivores, it is ≥ 89% (blue). Even with the Metatheria, here represented by the koala, the mean sequence identity is only slightly reduced, to about 70%. Fish alveolin (from medaka, Oryzias latipes) and frog XHE1 (Xenopus hatching enzyme; also termed UVS.2 [38]) were chosen for comparison because these enzymes belong to the hatching enzyme (HE) subfamily of astacin proteinases. They are functional homologs of ovastacins with respect to their cleavage of the egg envelope (for review see [39]). However, they are clearly distinct from the ovastacin subfamily and may serve as an outgroup, with sequence identities to ovastacin of 35-40%. Even protostome hatching enzymes, such as Astacus embryonic astacin (AEA), reveal this conservation level [40,41]. If the CTR is included in the analysis, it becomes evident that this region is only slightly conserved in comparison. Only within primates does this region exhibit significant conservation. Phylogenetic analysis also reflects the strong conservation of the catalytic domain, with a high level of confidence between the different mammalian taxa (Fig. 4B). Only for the branching of certain Laurasiatheria and, to a lesser extent, Euarchontoglires and Afrotheria, was the phylogeny less well resolved.

Fig. 2 (caption, continued). Depicted in each case is a sequence logo (upper panel) displaying the difference to the natural abundance of residues in the cleavage site positions P4-P4′, the degree of conservation of positions according to the BLOSUM62 substitution matrix [72] (upper graph), the hydrophobicity of the residues [73] (second graph from the top), the volume of the residues [74] (second graph from the bottom), and the flexibility of the residues [75] (bottom graph). The significance level is 95%. The error range is displayed in gray/light blue. The diagrams allow a semiquantitative evaluation, since iceLogo does not permit scale normalization.

Finally, we modeled the structures of the catalytic domains of several ovastacins from different mammalian taxa, according to the phylogenetic tree (Fig. 4B), to allow conclusions regarding the conservation of function and substrate interaction among different species. Here, the ovastacins of all species reveal an almost identical astacin-like fold of the catalytic domain (Fig. 4C). Likewise, XHE1, with only 40% identity, shows an almost identical fold. To draw comparative conclusions about the substrate interaction, we also considered the Coulomb potential of the surface (Fig. 3). Here, we observed a charge pattern in the other mammalian species that is very similar to that of mouse ovastacin. The bottom of the catalytic cleft is always uncharged (Fig. 4D). The positively charged residues are mainly located on the right half of the catalytic cleft (standard orientation). The residues Lys175 and Arg264, responsible for substrate recognition on the prime side, are conserved in ovastacins of all species, whereas Arg174 is present only in rodents. Apart from the preference for acidic residues in P1′, the anuran XHE1 exhibits a very different physicochemical surface pattern. These features suggest a cleavage specificity that is distinct from that of the mammalian ovastacins. The strong conservation of ovastacin, including its surface properties in the catalytic cleft, as well as the completely conserved di-acidic cleavage site in ZP2 proteins (Fig. S1), indicates the overall conservation of its function within the Eutheria.
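The pairwise sequence identities reported in this section can be derived from a multiple sequence alignment. The sketch below shows one way to do this, assuming identity is counted over alignment columns in which neither sequence has a gap. Alignment tools differ in this denominator convention, so the study's exact values may have been computed differently. The sequence names and the short toy sequences are hypothetical and are not taken from the ovastacin alignment.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences, ignoring gapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # assumption: gapped columns are excluded
        compared += 1
        if residue_a == residue_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(alignment):
    """Return {(name_a, name_b): percent identity} for all sequence pairs."""
    names = list(alignment)
    return {
        (a, b): percent_identity(alignment[a], alignment[b])
        for a in names
        for b in names
    }


# Hypothetical toy alignment (not real ovastacin sequences).
toy_alignment = {
    "species_1": "HELMHALGFYH",
    "species_2": "HEIGHALGFWH",
    "species_3": "HE-MHALGFYH",
}
matrix = identity_matrix(toy_alignment)
print(round(matrix[("species_1", "species_2")], 1))  # 72.7
```

Restricting the input to the catalytic domain columns, or extending it to include the CTR, would reproduce the two kinds of comparison described for Fig. 4A.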

Physiologically relevant substrates identified via TAILS analysis
Cleavage of ZP2, the only confirmed physiological substrate of ovastacin so far, is a key regulatory event in fertilization since it destroys the sperm receptor and mediates ZP hardening. By screening the 489 substrates cleaved by ovastacin in our N-TAILS analysis for proteins involved in fertilization, we identified 29 cleavage sites (Tables 1 and 2) within 12 potentially fertilization-relevant proteins. Several examples deserve explicit mention. β-1,4-galactosyltransferase (B4GALT1) is localized on the sperm membrane and has been shown to bind sugar side chains of ZP3, thereby mediating additional attachment to the ZP [42]. Calreticulin (CALR3) is localized on the acrosomal surface of sperm [43] and also on the oolemma [44]. There is evidence for its involvement in oocyte signal transduction [45] and in the establishment of the polyspermy block after its exocytosis from the oocyte [46]. Chaperonin containing TCP1 (Cct2) also localizes to the sperm surface and is directly involved in the binding of sperm to oocytes [47,48]. Matrix metalloproteinase 2 (MMP2) is associated with the acrosome and potentially contributes to the penetration of the ZP [49]. Taken together, detection of cleavages in these proteins strongly indicates a more complex function of ovastacin within the proteolytic web of egg-sperm interaction beyond regulation of ZP hardening.

Discussion
Our N-TAILS analysis identified hundreds of ovastacin substrates and revealed the substrate specificity of ovastacin. In fact, ovastacin's cleavage specificity resembles the substrate specificity of meprin β. Accordingly, some well-known meprin β substrates, such as the amyloid precursor protein (APP), were cleaved by ovastacin as well (Table S1). However, due to the spatiotemporal restriction of ovastacin to oocytes and the functional sphere of the egg-sperm interaction, these are presumably of low physiological relevance. The restriction of substrates to the extracellular space or the context of fertilization reduces the number of physiologically relevant ovastacin substrates in the present study from 489 to 163 and 12, respectively. Given that ovastacin expression is limited to oocytes, this rigorous restriction may be appropriate [50]. Although our data suggest physiological processing, further steps are required to provide the physiological evidence. At the least, a confirmation of the data in a cellular context, for example, egg or sperm secretome, is required to verify the physiological aspects of a cleavage by ovastacin. The germ cell-specific expression pattern might affect the activity or substrate selection of ovastacin, for example, by potential pre-processing or interactions of substrates. This also explains why we were not able to identify the only physiologically proven substrate: ZP2 is not expressed by MEF cells [51]. Since the limited proteolysis results in only a few peptides with an ovastacin-specific neo-N-terminus (< 10% of the secretome), which therefore need to be enriched, the application of N-TAILS to germ cells is currently not feasible due to the limited quantity of available secretome. Optimization and establishment of the N-TAILS approach on germ cell secretome would remedy these problems.

Fig. 3. Structural analysis of substrate interaction. Surface models of the catalytic domain of murine ovastacin (AAH64729.2; position 92-282) created with ALPHAFOLD2 using COLABFOLD [76]. (A1) Coulomb potential of the surface, highlighting areas with positive (blue, 10 kcal·(mol·e)−1) and negative (red, −10 kcal·(mol·e)−1) electrostatic potential. (A2) Surface areas of hydrophilic (cyan) and hydrophobic (yellow) residues according to the Kyte-Doolittle scale [86]. (B) Upper model with all accessible positively charged residues (Arg, Lys) and lower model with all accessible hydrophobic residues (Leu, Ile, Val, Pro, Phe, Trp, Met, and Tyr). (C) Synthetic peptide (Coulomb potential of the surface, vide supra) according to the most preferred residues in positions P3 to P3′ (Fig. 1) and inserted into the catalytic cleft via CHIMERA; highlighted residues of ovastacin are discussed. (D) Hydrophobic (red) and hydrophilic (cyan) regions in the catalytic cleft.
Most of the fertilization-relevant substrates listed in Table 2 have been found to positively influence sperm viability or sperm-oocyte binding [47-49,52,53]. If cleavage of these substrates results in their loss of function, ovastacin could be the initiator of a multifactorial regulation of sperm interaction. For instance, B4GALT1 mediates the binding of the sperm head to oligosaccharides of ZP3 until exocytosis of the acrosome [54]. Although this interaction is not strictly essential, unlike sperm binding to ZP2, its function is poorly understood. Proteolysis of B4GALT1 at S248 or D249, respectively, in a loop region between two central β-sheets, is likely to alter the conformation and binding properties to ZP3. Similarly, the ovastacin cleavage sites in MMP2 (D208, D209, and D210, Table 2) are located in a β-strand above the zinc-binding active site helix. From there, the chain enters the first fibronectin-like domain, which is an important exosite of MMP2 for binding protein substrates. Hence, cleavage within this region would most likely disrupt (protein) substrate binding to MMP2 and abrogate its function, which might impair the penetration capacity of sperm. Reduction of sperm viability and hindrance of sperm binding to the oocyte would extend the effect of the already known ZP hardening by ovastacin. Recently, the regulation of ovastacin activity has been shown to be clinically relevant in humans [21,23,55], which underpins the importance of these potential substrates.
The cleavage preferences identified in this study displayed a high preference for aspartate in position P1′ and, to a lesser extent, for aspartate followed by glutamate in the P2′ to P4′ region. In addition, there is a preference for leucine in position P2 of the non-prime side. A similar preference for aspartate in position P1′ has also been described for the astacin proteases BMP1 (bone morphogenetic protein 1), meprin β, and LAST MAM (MAM-domain containing astacin from the horseshoe crab Limulus polyphemus) (Fig. 2E,F, Fig. S2). Substrate binding, analyzed by modeling, might be mediated by interactions with Arg264 and Lys175, and by the properties of hydrophobic (aromatic as well as aliphatic) residues of the non-prime side. In standard orientation, Phe154 is located at the upper left edge of the catalytic cleft, while Val157 and Trp191 form the bottom section, and Ile216, Ile219, Phe214, and Phe243 are present at the lower left edge. Together, these residues wrap around the peptide chain of the substrate in the left half of the catalytic cleft in a cuff-like fashion.
Both the very high conservation level of ovastacin, including its surface properties relevant for substrate interaction, and the presence of the typical di-acidic cleavage motif in all mammalian ZP2 suggest functional conservation of ZP2 cleavage and ZP hardening, respectively, in all mammalian species. This hypothesis is supported by strong conservation of the structural elements required for the inhibition of ovastacin in fetuin-B [24,30]. Perhaps for this reason, the highly variable residues on the prime side (starting with P3′) in ZP2 (Fig. 1E) could be of particular importance for substrate recognition and potentially for species-specific cleavage. The results presented here may thereby contribute to understanding a perplexing observation, published two decades ago by Dean and coworkers [56], who observed that mice carrying human ZP2 instead of endogenous mouse ZP2 were fertile in vitro and in vivo. However, only mouse, but not human, sperm bound to these humanized mouse eggs, and human ZP2 remained uncleaved by mouse ovastacin. As a possible reason for this observation, the authors considered that humanized ZP on mouse oocytes would be glycosylated in a mouse-specific manner and would thereby trigger a binding mismatch for human, but not for mouse sperm. This would explain the binding failure of human sperm, but not the inability of mouse ovastacin to cleave human ZP2. However, the latter might be due to species-specific differences of the cleavage sites in mouse and human ZP2. Human ZP2 contains two positively charged lysyl side chains in P4′ and P7′ (SGLA↓DDSKGTK), which are absent in mouse ZP2 (SRLA↓DENQNVS). Mouse ovastacin, on the other hand, carries an additional arginyl residue (Arg174, Fig. 3C) in the far S′ side, where human ovastacin features an uncharged glutamine side chain. This additional positive charge would presumably cause repulsion of the positive charge in human ZP2 and thereby prevent cleavage.

Distantly related astacins, such as crayfish astacin and LAST, unlike ovastacin, do not prefer negatively charged residues in position P1′ (Fig. 2G, Fig. S2). However, we found great similarities between them in terms of the distribution of physicochemical properties of substrates across the different positions relative to the cleavage site. All analyzed astacins share similar physicochemical properties in their substrate binding clefts, independent of their differing cleavage specificity. Thus, based on these findings, the different astacin proteases can be designated as variants of a common prototype. Accordingly, this prototypical signature is independent of the substrate sequence preferred by the respective enzyme. This analysis was performed for seven astacins (astacin, BMP1, LAST, LAST MAM, meprin α, meprin β, and ovastacin), which include four of the six mammalian astacins (Fig. 2). Therefore, analysis of physicochemical properties allows for a deeper understanding of cleavage preference and substrate selectivity of this group of proteases in general. This could contribute to higher precision when predicting potential protease substrates. Knowledge of the biochemical properties that distinguish the cleavage of substrates by ovastacin from that of the closely related meprins appears to be of utmost importance for the development of more specific and selective inhibitors. Such inhibitors would significantly reduce the number of oocytes required for IVF [23,33] and would additionally overcome the problems of proteinogenic inhibitors such as fetuin-B (e.g., half-life, application restrictions, production).

Fig. 4. Phylogenetic and interspecific analysis of ovastacin. (A) Identity matrix (% of identity) of 30 mammalian ovastacin orthologs (catalytic domain/catalytic domain and C-terminal region as specified) representing all major therian taxa. In addition, astacin proteases from Xenopus laevis (hatching enzyme) and Oryzias latipes (alveolin) with a homologous cleavage of the egg envelope are displayed. Alignment was performed with CLUSTALO (Fig. S3). Data are grayed out for species without CTR. (B) Consensus tree of the catalytic domain of ovastacins and homologous astacins presented in A, computed using MR BAYES v3.2.7_0 via the NGPHYLOGENY tool [83] (100 000 generations, burn-in fraction 0.25) and displayed via ITOL [84] with specification of the credibility scores. (C, D) Models of the catalytic domain of ovastacins were created with ALPHAFOLD2 using COLABFOLD v1.5.2 [76]. (C) Superposition of the catalytic domains of ovastacins from defined species with CHIMERA. (D) Individual representation of the models from C with Coulomb potential of the surface, highlighting areas with positive (blue, 10 kcal·(mol·e)−1) and negative (red, −10 kcal·(mol·e)−1) electrostatic potential.

Table 1. Search terms for identification of substrates associated with fertilization. Search terms in the DAVID bioinformatics database [69,70], including their category, the P-value, the number of substrates in the dataset, and their fraction of the total dataset.

Table 2. Substrates of ovastacin with proposed function in fertilization. Substrates identified in the N-TAILS analysis whose annotations in the DAVID Bioinformatics database suggest a potential physiological impact on fertilization. Listed are the substrates, the UniProt accession number, the search term, the total number of amino acids, the cleavage site position, and the corresponding residues on the non-prime and prime side.

Conclusion
In summary, using N-TAILS, we were able to resolve the cleavage preference of ovastacin towards native substrate proteins within a complex secretome. This resulted in a globally applicable cleavage site signature ("ovastacin footprint") based on physicochemical properties of ovastacin. This footprint might serve as a useful tool to elucidate protease-substrate interactions in general. Additionally, this study allowed us to identify new ovastacin substrates with potential physiological relevance and deepens insight into the biochemical complexity of mammalian fertilization.

Cell culture and secretome preparation
We cultured mouse embryonic fibroblasts (MEF), which originated from [57] and were kindly provided by O. Schilling (University Medical Center, Freiburg, Germany), in DMEM (Dulbecco's Modified Eagle Medium; Sigma-Aldrich, Taufkirchen, Germany) with 10% FBS and 100 U·mL−1 penicillin/streptomycin (37 °C, 5% CO2) in adherent culture and passaged them three times per week (detachment of adherent cells with 0.05% trypsin/EDTA solution). Cells were regularly tested to be free of mycoplasma.

Activation of pro-ovastacin
Heterologously expressed murine ovastacin was purified and activated with human plasmin (Haematologic Technologies, Essex Junction, VT, USA) as described [8,27]. Fig. S4 illustrates an example of ovastacin purification and activation. Because of the short half-life of the ovastacin activity, the plasmin used for activation was not removed but inhibited using cOmplete™ EDTA-free Protease Inhibitor Cocktail or Pefabloc® SC. The inhibition was checked via a fluorogenic assay (as exemplified in Fig. S5).

Sample preparation for terminal isotopic labeling of substrates (TAILS) analysis
The secretome of MEF cells was incubated twice with active ovastacin at a mass ratio of 100 : 1 for 2 h at 37 °C, or with an equal volume of activation solution without ovastacin, respectively. Subsequently, samples were reduced in 5 mM dithiothreitol at room temperature (RT). After incubation (70 °C, 60 min), the samples were alkylated with iodoacetamide (final concentration 15 mM, 30 min, RT, light-protected). Thereafter, proteins were precipitated with trichloroacetic acid [58] and resuspended (200 mM HEPES, pH 7.5) to a final concentration of approx. 2 mg·mL−1. N-termini were labeled with light and heavy formaldehyde (12CH2O in H2O, and 13CD2O in D2O (252549 and 596388; Sigma-Aldrich)), respectively. Sodium cyanoborohydride was added to a final concentration of 40 mM, and the sample was adjusted to pH 7, followed by overnight (ON) incubation at 37 °C [59]. Excess formaldehyde was captured by adjusting the sample to 50 mM Tris/HCl pH 7 and incubation for 4 h (37 °C). Proteins were precipitated with 9× volume acetone and 1× volume methanol (each −20 °C) ON at −80 °C and centrifuged (4500 g for 60 min at 4 °C). The pellet was washed four times with 1 mL methanol, each followed by centrifugation (4500 g for 15 min at 4 °C), and dissolved in 200 mM HEPES buffer (pH 8). Proteolysis into peptides was accomplished using 0.5% (w/w) trypsin (sequencing grade; Worthington Biochemical, Lakewood, NJ, USA) (37 °C, overnight). After desalting via C18 Sep-Pak columns (Waters Corporation, Milford, MA, USA), we enriched the labeled peptide N-termini by negative selection using a polymer (hyperbranched polyglycerol-aldehyde, HPG-ALD (University of British Columbia, Vancouver, Canada)). Peptides were adjusted to 200 mM HEPES (pH 7) and incubated with a ratio of 1 : 5 (w/w) of polymer and 100 mM sodium cyanoborohydride (ON at 37 °C, 40 r.p.m.). Free aldehyde groups were saturated using 100 mM glycine pH 7 for 60 min at RT.
Subsequently, the N-terminally blocked peptides were isolated using centrifugal filtration (Vivaspin 500, 10 kDa MWCO) and two washing steps (1× 300 µL 100 mM ammonium bicarbonate, 1× 300 µL 2 M guanidine hydrochloride). The samples were desalted via C18 columns and fractionated via strong cation exchange chromatography (Stagetips) [60]. Afterward, the samples were desalted again. Peptide concentrations for mass spectrometric measurement were determined with a NanoDrop ND-1000 spectrophotometer (PeqLab, Erlangen, Germany). Protein or peptide concentrations in the course of the preparation were checked by Bradford or bicinchoninic acid assays [61,62].
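The light and heavy formaldehyde labels described above give each labeled amine a characteristic mass offset. The following minimal sketch computes that offset from standard monoisotopic masses. It assumes that the cyanoborohydride is unlabeled (no deuterated reagent is mentioned in the protocol) and that solvent exchange does not alter the label. The function names are illustrative and not part of the published workflow.

```python
# Monoisotopic masses in Da (standard values).
MASS = {"12C": 12.0, "13C": 13.0033548, "H": 1.00782503, "D": 2.01410178}


def dimethyl_shift(carbon: str, formaldehyde_hydrogen: str) -> float:
    """Mass added to one primary amine by reductive dimethylation.

    Each of the two methyl groups takes one carbon and two hydrogens from
    formaldehyde plus one hydrogen from the reducing agent (assumed to be
    unlabeled here), and replaces one hydrogen of the amine.
    """
    methyl = MASS[carbon] + 2 * MASS[formaldehyde_hydrogen] + MASS["H"]
    return 2 * (methyl - MASS["H"])


LIGHT = dimethyl_shift("12C", "H")   # 12CH2O label
HEAVY = dimethyl_shift("13C", "D")   # 13CD2O label
DELTA = HEAVY - LIGHT                # offset per labeled amine


def pair_spacing(n_labeled_amines: int, charge: int) -> float:
    """Expected m/z spacing between the light and heavy form of a peptide."""
    return n_labeled_amines * DELTA / charge


if __name__ == "__main__":
    print(f"light {LIGHT:.4f} Da, heavy {HEAVY:.4f} Da, delta {DELTA:.4f} Da")
    print(f"doubly charged, N-terminus only: {pair_spacing(1, 2):.4f} m/z")
```

Under these assumptions the offset is about 6.03 Da per labeled amine, consistent with the 6 Da difference used to recognize light/heavy pairs. A peptide that also carries a dimethylated lysine would show a multiple of this value.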

TAILS analysis by mass spectrometry
Mass spectrometric analysis was performed using LC-MS (liquid chromatography/mass spectrometry) on a Synapt G2-S HDMS (Waters Corporation) in positive ion mode. Chromatographic separation of peptides was conducted on a nanoAcquity UPLC system (Waters Corporation). Peptides were separated using a C18 reversed-phase analytical separation column (HSS-T3, 75 µm × 250 mm, 1.8 µm particle size) (Waters Corporation). Samples were loaded directly onto the column (2.5 µL injection volume). Peptides were separated using a gradient of mobile phases A (water containing 0.1% (v/v) formic acid and 3% (v/v) DMSO) and B (acetonitrile containing 0.1% (v/v) formic acid and 3% (v/v) DMSO). The proportion of phase B increased from 5% (v/v) to 40% (v/v) over 90 min (300 nL·min⁻¹). Afterward, the column was re-equilibrated with phase B (90% (v/v), 600 nL·min⁻¹, 5 min). Total measurement time was 120 min. Data were acquired using HD-DDA [63]. [Glu¹]fibrinopeptide was used as lock mass at 100 fmol·µL⁻¹ and sampled every 30 s into the mass spectrometer via the reference sprayer of the NanoLockSpray source. Processing of raw data and database searching were performed using PEAKS (version 8.5; https://www.bioinfor.com/) [64][65][66]. We reviewed liquid chromatography plots individually to exclude false positive results. Single signals (FDR = 0.01) identified by PEAKS were checked for an accompanying second signal at the 6 Da mass difference caused by dimethylation; such a double signal was considered a duplicate and excluded from the data set. Peptides with naturally acetylated N-termini, peptides with unlabeled N-termini, and peptides double labeled with light and heavy formaldehyde were also excluded. All peptides identified by PEAKS that had N-terminal labeling by heavy formaldehyde and no analog with N-terminal labeling by light formaldehyde were manually selected. Samples were prepared as two independent biological replicates, each analyzed in technical triplicates. The final data set only included peptides that could be assigned to a single UNIPROT accession number from Mus musculus. Amino acid residues N-terminal of the cleaved peptide bond (i.e., the "non-prime side") were determined by bioinformatic alignment of the sequences with the UNIPROT database. Cleavage sites < 10 residues distant from the N- or C-terminus were not considered.
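The selection rules above can be summarized in a short sketch. It is not the procedure used in the study, where peptides were selected manually from the PEAKS output. The `Peptide` record, the label names, and the way the distance to the termini is counted are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Peptide:
    sequence: str
    nterm_label: str              # "heavy", "light", "acetyl", "none" or "both"
    accessions: Tuple[str, ...]   # protein accessions the peptide maps to


def candidate_neo_ntermini(peptides: List[Peptide]) -> List[Peptide]:
    """Keep heavy-labeled peptides without a light analog and a unique protein."""
    light = {p.sequence for p in peptides if p.nterm_label == "light"}
    return [
        p for p in peptides
        if p.nterm_label == "heavy"       # drops acetylated, unlabeled, double labeled
        and p.sequence not in light       # no light-labeled counterpart
        and len(p.accessions) == 1        # single accession only
    ]


def cleavage_window(protein: str, peptide: str,
                    flank: int = 6, min_dist: int = 10) -> Optional[Tuple[str, str]]:
    """Return (non-prime, prime) residues around the bond preceding the peptide.

    Assumption: distance is counted as the number of residues on either side
    of the cleaved bond. Only the first occurrence in the protein is used.
    """
    start = protein.find(peptide)
    if start < 0:
        return None
    if start < min_dist or len(protein) - start < min_dist:
        return None
    non_prime = protein[max(0, start - flank):start]   # ... P2, P1
    prime = protein[start:start + flank]               # P1', P2', ...
    return non_prime, prime
```

For example, with a 40-residue toy protein `"ACDEFGHIKLMNPQRSTVWY" * 2`, the peptide `"MNPQR"` starts at index 10, so `cleavage_window` returns `("FGHIKL", "MNPQRS")`, whereas a peptide starting at index 1 is rejected as too close to the N-terminus.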
Comparing all identified cleavage sites with aspartate in position P1′ (cleavage preference of astacin proteases) against the full data set revealed a negative correlation with lysine and arginine in position P1 (tryptic cleavage specificity). Therefore, to exclude peptides generated, for example, by incompletely inactivated plasmin carried over from ovastacin activation, the data sets from both biological replicates were further subjected to cluster analysis (GibbsCluster 2.0) [67,68]. This is exemplified for one of the replicates in Figs S6 and S7. The identified clusters from both biological replicates, with the typical acidic residues on the prime side [35], were combined for further analyses. The data set was limited to proteins within the physiological context of the extracellular space by analysis using the DAVID Bioinformatics database [69,70].
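The text does not state which statistic was used to assess the relationship between aspartate in P1′ and lysine or arginine in P1. As one possible illustration, the following sketch computes the phi coefficient of the corresponding 2 × 2 table. A negative value indicates that the two features tend to exclude each other. The function name and the input format are hypothetical.

```python
from math import sqrt
from typing import Iterable, Tuple


def phi_p1_vs_p1prime(sites: Iterable[Tuple[str, str]]) -> float:
    """Phi coefficient between 'K/R in P1' and 'D in P1''.

    `sites` yields (P1, P1') residue pairs, one pair per cleavage site.
    Returns 0.0 if the coefficient is undefined (an empty row or column).
    """
    a = b = c = d = 0
    for p1, p1_prime in sites:
        tryptic = p1 in ("K", "R")
        acidic = p1_prime == "D"
        if acidic and tryptic:
            a += 1      # D in P1' and K/R in P1
        elif acidic:
            b += 1      # D in P1' only
        elif tryptic:
            c += 1      # K/R in P1 only
        else:
            d += 1      # neither feature
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0
```

In a toy set where every site has either aspartate in P1′ or a basic residue in P1, but never both, the coefficient is −1. Mixed populations of astacin-like and tryptic cleavages would give values between −1 and 0.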

Identification of the cleavage specificity and physicochemical properties of ovastacin
Cleavage preferences were derived from the substrate cleavage sites, represented as sequence logos, and tested for statistical variations in residue abundances using iceLogo [71]. To compare cleavage preferences of other proteases based on the cleavage sites annotated in the MEROPS database [10], only cleavage sites with all positions from P4 to P4′ identified were used. The "subLogo" function was used to check for cooperativity between residues in different positions. To calculate the degree of conservation of residues in each position, the "Conservation Line" function with a BLOSUM62 substitution matrix [72] was used. To determine the parameters needed for evaluating physicochemical properties, the "Aa Parameter" function was used. The underlying parameters are listed in the AAindex1 database (http://www.genome.jp/aaindex/). The "Transfer Energy, organic solvent/water" template was used to determine hydrophobicity [73], the "Residue Volume" template to determine residue volume [74], and the "Average flexibility indices" template to determine residue flexibility [75].
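The idea behind comparing residue abundances at each cleavage site position with their natural occurrence can be sketched as follows. This is not iceLogo's implementation. It is a simplified illustration that reports the difference in percentage points and a z-score based on a normal approximation of the binomial distribution. The function name and the background frequencies used in the example are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def position_enrichment(
    windows: Sequence[str], background: Dict[str, float]
) -> List[Dict[str, Tuple[float, float]]]:
    """Per position: residue -> (difference in percentage points, z-score).

    `windows` are aligned cleavage site sequences of equal length (for example
    P4 to P4'); `background` maps each residue to its natural fraction (0..1).
    """
    n = len(windows)
    width = len(windows[0])
    result = []
    for i in range(width):
        counts = Counter(w[i] for w in windows)
        column = {}
        for aa, p in background.items():
            observed = counts.get(aa, 0) / n
            sd = sqrt(p * (1 - p) / n)          # binomial standard error
            z = (observed - p) / sd if sd > 0 else 0.0
            column[aa] = (100 * (observed - p), z)
        result.append(column)
    return result


if __name__ == "__main__":
    # Toy data: two-residue windows and a made-up background.
    toy = ["AD", "AD", "AE", "GD"]
    bg = {"A": 0.5, "D": 0.25, "E": 0.125, "G": 0.125}
    for pos, column in enumerate(position_enrichment(toy, bg)):
        print(pos, {aa: (round(d, 1), round(z, 2)) for aa, (d, z) in column.items()})
```

With a threshold of |z| ≥ 1.96, this approximation corresponds to a two-sided significance level of 0.05. Real data would use the residue frequencies of the reference proteome as background.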

Phylogenetic analysis
Selected ovastacin or ZP2 orthologs deposited in the National Library of Medicine were aligned with CLUSTALO [82]. The sequence matrix was generated using BIOEDIT. From the alignment, a consensus tree was calculated using MR BAYES via the NGPHYLOGENY tool (https://ngphylogeny.fr/tools/tool/281/form) [83] (100 000 generations, burn-in fraction 0.25) and displayed via ITOL [84]. The frequency plot of ZP2 was generated using WEBLOGO [85] based on the alignment generated with CLUSTALO.

Fig. 1. Heatmaps including iceLogo of cleavage sites (n = 855) identified via N-TAILS and conservation of the ZP2 cleavage sites in mammals. Residues are normalized to their natural occurrence in the mouse proteome (Mus musculus). Distribution of residues in positions P6-P1 (non-prime side) and P1′-P6′ (prime side). Displayed for each position are the total number of residues (A), their relative abundance in percent (B), and the relative abundance corrected by natural abundance in percentage points as heatmap (C) and as iceLogo (D). The deviations were determined with iceLogo [71]. Positions without significant changes (P > 0.05) are depicted in gray for the heatmaps; the iceLogo displays only significant changes (P ≤ 0.05). Amino acids are abbreviated as single letters. (E) Frequency plot of the primary ovastacin cleavage site in mammalian ZP2 (same species as in Fig. 4; alignment in Fig. S1).

Fig. 2. Physicochemical properties of astacin proteases and ovastacin. Overview of the common cleavage preferences and physicochemical properties of all studied astacin proteases combined (A), ovastacin only (B) and comparatively of all astacin proteases combined versus ovastacin (C), meprin α only (D), meprin β only (E), BMP1 only (F), and astacin only (G). The analysis was performed on the basis of the cleavage sites deposited in the MEROPS database (astacin (P07584), n = 199; BMP1 (P13497), n = 25; LAST_MAM (B4F320), n = 415; LAST (B4F319), n = 76; meprin α (Q16819), n = 700; meprin β (Q16820), n = 879) and ovastacin (Q6HA09), n = 855 (this study). Depicted in each case is a sequence logo (upper panel) displaying the difference to the natural abundance of residues in the cleavage site positions P4-P4′, the degree of conservation of positions according to the BLOSUM62 substitution matrix [72] (upper graph), the hydrophobicity of the residues [73] (second graph from the top), the volume of the residues [74] (second graph from the bottom), and the flexibility of the residues [75] (bottom graph). The significance level is 95%. The error range is displayed in gray/light blue. The diagrams allow a semiquantitative evaluation, since iceLogo does not permit scale normalization.