A Peptide‐Induced Self‐Cleavage Reaction Initiates the Activation of Tyrosinase

Abstract The conversion of inactive pro‐polyphenol oxidases (pro‐PPOs) into the active enzyme results from the proteolytic cleavage of its C‐terminal domain. Herein, a peptide‐mediated cleavage process that activates pro‐MdPPO1 (Malus domestica) is reported. Mass spectrometry, mutagenesis studies, and X‐ray crystal‐structure analysis of pro‐MdPPO1 (1.35 Å) and two separated C‐terminal domains, one obtained upon self‐cleavage of pro‐MdPPO1 and the other one produced independently, were applied to study the observed self‐cleavage. The sequence Lys 355–Val 370 located in the linker between the active and the C‐terminal domain is indispensable for the self‐cleavage. Partial introduction (Lys 352–Ala 360) of this peptide into the sequence of two other PPOs, MdPPO2 and aurone synthase (CgAUS1), triggered self‐cleavage in the resulting mutants. This is the first experimental proof of a self‐cleavage‐inducing peptide in PPOs, unveiling a new mode of activation for this enzyme class that is independent of any external protease.

close to 3 ppm with external calibration. Prior to MS, the MdPPO2(+) solution was ultrafiltered by centrifugation, the buffer was exchanged to 5 mM ammonium acetate (pH 7.0), and the protein solution was diluted 100-fold in a mixture of 80 % (v/v) acetonitrile and 0.1 % (v/v) formic acid.
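The mass accuracy quoted above is a relative deviation between measured and calculated mass. The following minimal sketch shows how such a deviation could be expressed both in Da and in ppm; the function name and the example masses are hypothetical and are not taken from the study.

```python
def mass_error_ppm(measured_da: float, calculated_da: float) -> float:
    """Relative mass deviation in parts per million (ppm)."""
    if calculated_da <= 0:
        raise ValueError("calculated mass must be positive")
    return (measured_da - calculated_da) / calculated_da * 1e6


# Hypothetical example masses (illustration only, not data from the study).
calculated = 15000.000
measured = 15000.045

delta_da = measured - calculated
print(f"deviation: {delta_da:+.3f} Da, "
      f"{mass_error_ppm(measured, calculated):+.1f} ppm")
```

Reporting both the absolute deviation (Da) and the relative deviation (ppm) makes fragments of different size directly comparable.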
Self-cleaving of MdPPO1 at different temperatures. Pro-MdPPO1 was investigated at different temperatures in order to examine the influence of temperature on self-cleaving. The enzyme was incubated at the following temperatures: 4, 20, 25, 37 and 65 °C. Samples were tested after different incubation times via SDS-PAGE (Figure S6).
Self-cleaving of MdPPO1 at different pH values. Pro-MdPPO1 was investigated at different pH values in order to examine the influence of pH on self-cleaving. 50 mM sodium citrate buffer was used to analyze the range between pH 2.0 and 6.0 in steps of 1 pH unit, whereas 50 mM Tris-HCl buffer was used to investigate the process in the range between pH 7.0 and 9.0 in steps of 1 pH unit. Samples were stored at 4 °C and tested after different incubation times via SDS-PAGE (Figure S7).

Crystallization. Crystallization of pro-MdPPO1 was performed as reported elsewhere. [5] To obtain crystals of the C_cleaved-domain, a solution containing the pro-enzyme was prepared and stored for more than 30 days at 4 °C to let the self-cleaving process take place. The self-cleaved sample was used for initial screening, which was performed manually by applying the hanging-drop vapor-diffusion technique using 15-well EasyXtal plates (Qiagen, Hilden, Germany). The screening procedure yielded some promising hits (small crystals). The initial hits were optimized, and high-quality crystals of the C_cleaved-domain were finally grown at 20 °C by mixing 1 µl protein solution (15 mg ml⁻¹) with 1 µl of the reservoir solution (100 mM Tris-HCl pH 8.25, 13 % PEG 8000). Crystals usually appeared after 5-10 d. The same conditions were used to crystallize the C_sole-domain.
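Mixing equal volumes of protein and reservoir solution halves each component's concentration in the freshly set drop. The sketch below works through this dilution for the stated conditions. It assumes, for illustration, that components of the protein buffer itself can be neglected; the helper name is hypothetical.

```python
def drop_concentration(stock: float, stock_volume_ul: float,
                       total_volume_ul: float) -> float:
    """Concentration of one component right after mixing (simple dilution)."""
    return stock * stock_volume_ul / total_volume_ul


protein_ul, reservoir_ul = 1.0, 1.0          # volumes stated in the text
total_ul = protein_ul + reservoir_ul

# Assumption: the protein buffer contributes no Tris-HCl or PEG to the drop.
initial = {
    "protein (mg/ml)": drop_concentration(15.0, protein_ul, total_ul),
    "Tris-HCl (mM)": drop_concentration(100.0, reservoir_ul, total_ul),
    "PEG 8000 (%)": drop_concentration(13.0, reservoir_ul, total_ul),
}

for component, value in initial.items():
    print(f"{component}: {value}")
```

During vapor diffusion the drop then equilibrates against the reservoir, so these values describe only the starting point of the experiment.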

Influence of protease inhibitors on
Data collection, structure solution and refinement. Crystal harvesting and data collection of pro-MdPPO1 were performed as described elsewhere. [5] The crystals of the C_cleaved- and C_sole-domain were quickly plunged into a cryo-protectant solution consisting of 100 mM Tris-HCl at pH 8.25, 20 % PEG 8000 and 25 % PEG 1500, then mounted in nylon loops and immediately flash-cooled in liquid nitrogen. Data collection of the C_cleaved-domain was carried out at 100 K on beamline ID-30A-1 at the ESRF, Grenoble, France, whereas data of the C_sole-domain were collected on beamline ID-23-1 at the same synchrotron facility. Data collection statistics are summarized in Table S3. C_cleaved-domain crystals diffracted X-rays to a maximum resolution of 1.35 Å, whereas those of the C_sole-domain reached resolutions up to 1.05 Å. All crystals belonged to space group P2₁2₁2₁ with the following cell parameters: a = 45.23 Å, b = 50.14 Å, c = 50.72 Å with α = β = γ = 90° for the C_cleaved-domain and a = 45.13 Å, b = 50.10 Å, c = 50.54 Å with α = β = γ = 90° for the C_sole-domain. Data sets of the C-terminal domain were processed with XDS. [6] The structure of pro-MdPPO1 (including the N-terminal domain) was solved by molecular replacement (MR). [5] To find an appropriate model for the MR step, a BLAST search [7] was performed using the sequence of pro-MdPPO1 as the query. The BLAST results provided several promising MR models: the model with the highest coverage (0.97 according to BLAST) was the structure of aurone synthase from Coreopsis grandiflora (PDB entry 4Z11, sequence identity of 43.0 %), [8,9] and the model with the highest sequence identity was the structure of tyrosinase from Juglans regia (PDB entry 5CE9, sequence identity of 66.6 %). [10,11] Initial MR runs with both potential models using phenix.phaser from the PHENIX suite [12] revealed that the tyrosinase structure was more suitable as MR model, as it gave higher MR scores than the structure of aurone synthase.
The final MR model was obtained by modifying the tyrosinase structure with the sequence of pro-MdPPO1 using the modelling software MODELLER. [13] After initial phases were derived, AutoBuild [14] was used to build the initial MdPPO1 model, which was then refined until convergence using phenix.refine. [12] The quality of the final model was evaluated with the MolProbity server, and the model was deposited in the PDB under entry 6ELS. The structures of the C-terminal domains were solved as described above, using the C-terminal domain of the solved pro-MdPPO1 structure as MR model. The final structures of the C_cleaved- and C_sole-domain were deposited in the PDB and may be retrieved from entries 6ELT and 6ELV, respectively.
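The two C-terminal domain crystals share the same orthorhombic space group, with all cell angles at 90°, so the unit-cell volume is the product of the three axis lengths. As a minimal sketch, the cell parameters reported above can be compared as follows; the function and dictionary names are illustrative only.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Unit-cell volume in cubic angstroms; valid only for alpha = beta = gamma = 90 deg."""
    return a * b * c


# Cell axes in angstroms, as reported for the two C-terminal domain crystals.
cells = {
    "C_cleaved": (45.23, 50.14, 50.72),
    "C_sole": (45.13, 50.10, 50.54),
}

volumes = {name: orthorhombic_cell_volume(*axes) for name, axes in cells.items()}
relative_difference = (
    abs(volumes["C_cleaved"] - volumes["C_sole"]) / volumes["C_cleaved"]
)

for name, volume in volumes.items():
    print(f"{name}: {volume:.0f} cubic angstroms")
print(f"relative difference: {relative_difference:.2%}")
```

A small relative difference in cell volume is in line with the two domains crystallizing in essentially the same packing arrangement.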

Supplementary Notes
The overall structure of pro-MdPPO1. The active site region of MdPPO1 consists of a binuclear copper center (CuA and CuB), where each copper ion is coordinated by three histidine residues (CuA by His86, His107 and His116 and CuB by His238, His242 and His272), and is wrapped by a four-α-helical bundle. The structure lacks two highly conserved disulfide bonds (Cys11-Cys26 and Cys25-Cys87), most likely due to MdPPO1 expression in E. coli, an organism which lacks the appropriate machinery for cystine formation. Moreover, despite the high resolution, the structure of pro-MdPPO1 suffers from some structural imperfections due to missing electron density, namely at the N-terminus (the sequence of the structure starts with Lys32) and at three gaps within loop regions (His86-Gln105, Ala349-Val359 and His453-Lys457). The largest gap, from His86 to Gln105, is also the most critical one, as it affects a loop connecting two α-helices within the α-helical bundle that forms the active site. However, some electron density for the CuA-coordinating His86 was found, enabling the structural completion of the active site. In addition, the sulfur atom of the adjacent Cys90 was detected, indicating that the thioether bridge, which is conserved among structurally known plant PPOs, is at least partially formed within the structure. The lack of the disulfide bonds (Cys11-Cys26 and Cys25-Cys87) could also explain the missing initial part in the structure, as they contribute strongly to the structural stability of the N-terminus by anchoring the N-terminal tail to the main core of the enzyme. Thus, the structure reported here contains a very loose N-terminus, which does not produce detectable electron density for the first 31 residues due to its high flexibility. In other plant PPO structures, the N-terminus embraces one of the four α-helices of the tetrahelical bundle, representing a specific fold. As the N-terminus of MdPPO1 is highly flexible owing to the lack of the two disulfide bonds, this structural region is partially disordered in the MdPPO1 structure (Figure S17).
Mutagenesis studies on MdPPO1 and the relocation of the cleavage position. In order to gain additional insights into the cleavage reaction, the cleavage site Ser366-Ser367-Ser368-Lys369-Val370 was mutated to Ile367-Asp368-Gly369-Arg370 (mutant-1), which corresponds to the recognition sequence of the serine protease Factor Xa. Thus, mutant-1 was designed not only to test its influence on the self-cleaving process but also to enable cleavage by the external protease Factor Xa. Mutant-1 was successfully cleaved by Factor Xa, yielding a pre-active state similar to that of MdPPO1 undergoing self-cleaving. ESI-MS analysis of the fragments obtained upon Factor Xa cleavage revealed two new cleavage sites for the mutated enzyme, which are shifted by 11 amino acids towards the N-terminus in comparison to the wild type and lie within the sequence Lys355-Lys356-Leu357 (Figure S10). Moreover, in a second mutant (mutation-2), the two proteolytic sites (Ser366-Ser367-Ser368-Lys369-Val370) and (Lys355-Lys356-Leu357) were both mutated (Table S1). However, a third proteolytic region (His361-Ala362-Ala363) appeared in between the two previous ones (Figure S11). Mass spectrometry results (Figures S3, S10 and S11) show that the cleavage site of MdPPO1 is not characterized by one single peptide bond but rather by a cleavage sequence consisting of four contiguous peptide bonds (Ser366-Ser367-Ser368-Lys369-Val370) for the wild type, two contiguous peptide bonds (Lys355-Lys356-Leu357) for mutation-1 and two contiguous peptide bonds (His361-Ala362-Ala363) for mutation-2. Cleavage is highly specific, as evidenced by mass spectrometry and by the sequence of the C_cleaved-domain crystal structure. Moreover, several further mutants of MdPPO1 were designed in order to stop the self-cleaving reaction, however without success; the mutant that did stop the self-cleaving reaction was the deletion of the loop Lys355-Val371, as described in Table S1.
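The relation between a cleavage sequence and its number of cleavable peptide bonds (n residues give n - 1 contiguous bonds) can be sketched as follows. The helper names are hypothetical; the residue windows are those given in the text.

```python
import re


def peptide_bonds(window: str) -> list[tuple[str, str]]:
    """List the contiguous peptide bonds in a window such as 'Lys355-Lys356-Leu357'."""
    residues = window.split("-")
    return list(zip(residues, residues[1:]))


def residue_number(label: str) -> int:
    """Extract the sequence position from a label such as 'Ser366'."""
    return int(re.search(r"\d+", label).group())


windows = {
    "wild type": "Ser366-Ser367-Ser368-Lys369-Val370",
    "mutation-1": "Lys355-Lys356-Leu357",
    "mutation-2": "His361-Ala362-Ala363",
}

for variant, window in windows.items():
    bonds = peptide_bonds(window)
    listing = ", ".join(f"{left}|{right}" for left, right in bonds)
    print(f"{variant}: {len(bonds)} bonds ({listing})")

# Shift of the cleavage window start between wild type and mutation-1.
shift = residue_number("Ser366") - residue_number("Lys355")
print(f"shift towards the N-terminus: {shift} residues")
```

Each listed bond corresponds to one possible C-terminal fragment species in the mass spectra, which is why four species are expected for the wild type and two for each mutant.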
Pro-MdPPO1 activation requires two steps. Despite undergoing self-cleaving, activity tests on the cleaved pro-MdPPO1 revealed that the enzyme retained its latency. This observation implies that the proteolytic cleavage within the sequence Ser366-Ser367-Ser368-Lys369-Val370 alone is not sufficient to activate the pro-enzyme. Visual inspection and analysis of the interface between the C-terminal and main domain by PISA [15] revealed strong interactions between the two domains. PISA detected 33 H-bonds and 13 salt bridges, and it is therefore suggested that even after the proteolytic reaction the domains remain attached to each other due to strong electrostatic interface interactions, thereby preserving the enzyme's latency. Thus, self-cleaving converts the pro-enzyme into a stable, pre-active stage, which requires further action to become fully enzymatically active. The self-cleaved enzyme was activated in vitro by applying various salts (1 M NaCl, 0.5 M KCl, 0.15 M MgCl₂ or 0.15 M CaCl₂), which putatively disrupt the strong electrostatic inter-domain interactions, leading to the spatial separation of the two domains. The same effect was also achieved using the activator SDS (3 mM). This in vitro activation process might not reflect the in vivo maturation process, but it is suggested that the enzyme is transformed to the pre-active stage observed here at the same point during the in vivo maturation. The C-terminal domain of MdPPO1 could be membrane-associated, as has been reported for other PPOs. [16-18] Thus, the spatial separation of the domains might be triggered in vivo by the C-terminal domain remaining attached to a membrane, which facilitates the departure of the main domain.
Experiments to elucidate the mode of the self-cleaving reaction. A series of experiments was conducted to reveal the mechanism of the self-cleaving reaction. The initial consideration was that the C-terminal domain harbors the cleaving moiety. This was supported by the fact that incubation of pro-MdPPO1 together with different amounts of separated C-terminal domain (C_sole-domain) accelerated the self-cleaving reaction. Structural inspection of the C-terminal domain did not, however, reveal an obvious cleaving element. Thus, the structure of the C-terminal domain (C_cleaved, C_sole and the C-terminal domain of the pro-form) was submitted to the Dali Server, a network service for 3D structural protein comparison, to find similar structures or structural elements in other enzymes that might have proteolytic activity. However, the best hits (besides other PPOs) were the nephrin-binding domains of several nephrin receptors. [19] No proteolytic enzyme was among the top hits.
In addition, several protease inhibitors were incubated with pro-MdPPO1 to inhibit the self-cleaving process. Since the serine protease inhibitors phenylmethylsulfonyl fluoride (2 mM) and benzamidine hydrochloride (2 mM) slightly attenuated the self-cleaving process, a serine protease-like mechanism was assumed. Analysis of the sequence in the proximity of the cleavage site revealed the presence of a number of serine residues that could act as the decisive nucleophile. However, when the environment of each serine was examined, no acid and base were found that could form a catalytic triad to carry out the reaction. Only the positioning of Ser435 resembled to some extent that found in the catalytic moiety of thrombin. Furthermore, there are reports on serine proteases that are able to perform the cleavage reaction in a "serine-only" configuration, that is, only the serine residue is required for the self-cleaving reaction of some proteins. [20,21] Therefore, Ser385 and Ser435 were mutated to Gly in different mutants to inhibit the putative serine protease-like mechanism. However, self-cleaving in the resulting mutants was only delayed but not inhibited completely, which rules out a serine protease-like reaction as the mode of action for the self-cleaving observed here (Table S1).
Since EDTA also attenuated the self-cleaving process, a metalloprotease-like reaction was briefly considered as well. This assumption was further strengthened by the fact that the separated C-terminal domain possesses a (strong) metal binding site and might thus be the decisive element. Incubation of pro-MdPPO1 with different metal ions (Mg²⁺, Mn²⁺, Co²⁺, Ni²⁺, Zn²⁺, Cu²⁺ and Ca²⁺, 1 mM each) did not show metal-dependent differences or any effect at all. In addition, the calcium-binding residues Asp429, Asp431 and Asp479 were mutated to Gly or deleted completely in order to block the metal binding site. However, the resulting mutations did not inhibit the self-cleaving reaction but had the opposite effect, as self-cleaving was accelerated (Table S1). Thus, a metalloprotease-like or, more generally, metal-based reaction was also excluded.

Structural comparison between the C-terminal domains (C-terminal domain attached in pro-MdPPO1, C_cleaved- and C_sole-domain).
Structural comparison between the C-terminal domain of the pro-enzyme and the C_cleaved- and C_sole-domains revealed that the two free C-terminal domains (C_cleaved and C_sole) exhibit the exact same structure (Cα-RMSD = 0.051 Å, 872 matched atoms), whereas the structure of the C-terminal domain of the pro-enzyme deviates slightly from the free C-terminal domains (Cα-RMSD of 0.494 Å, 562 matched atoms). All C-terminal domains exhibit the typical β-sandwich 'jelly roll'-like fold consisting of seven β-strands in total (Figure S18); however, the position of one large loop (Asn428-Glu441) differs significantly. In the structures of the separate C_cleaved- or C_sole-domain, this loop moved towards another loop (Glu473-Asp480) in order to form a calcium binding site that stabilizes the C-terminal domain (Figure S19). The bound metal was identified as Ca²⁺ based on the composition of the expression media and buffers used, the interacting amino acids, the binding geometry and the presence of the anomalous signal (Figure S20). The Ca²⁺ ion is coordinated by eight O-donor atoms: five carboxylate oxygens from three aspartate residues (Asp429, Asp431 and Asp479), of which Asp429 and Asp479 act as bidentate ligands, and three oxygens arising most likely from water molecules (Figure S21). The calcium binding site is missing in the structure of pro-MdPPO1, as the two loops forming this site are kept apart by the linker region, which connects the active and C-terminal domain.

Figure S1). However, the C_cleaved-domain was ionized best and led to the high-resolution spectra presented here. The inset shows that four different C-terminal species (A-D) were found, and the second inset of the charge state [11+] shows that isotopic resolution was achieved. The four different species indicate self-cleaving at four different sites within the sequence (Ser366-Ser367-Ser368-Lys369-Val370). Species A corresponds to the fragment obtained upon cleavage between Lys369-Val370, species B to cleavage between Ser368-Lys369, species C to cleavage between Ser367-Ser368 and species D to cleavage between Ser366-Ser367.

Positive mode ESI-QTOF mass spectra of the self-cleaved C_cleaved-domain from mutation-1. During and after self-cleaving, the sample solution represents a mixture of pro- and active MdPPO1 as well as the free C-terminal domain. However, the C-terminal domain was ionized best and led to the high-resolution spectra presented here. The inset shows that two different species (A-B) were found, and the second inset of the charge state [15+] shows that isotopic resolution was achieved. The two different species indicate self-cleaving at two different sites within the sequence (Lys355-Lys356-Leu357). Species A corresponds to the fragment obtained upon cleavage between Lys356-Leu357, species B to cleavage between Lys355-Lys356.

Figure S11. Positive mode ESI-QTOF mass spectra of the self-cleaved C_cleaved-domain from mutation-2. During and after self-cleaving, the sample solution represents a mixture of pro- and active MdPPO1 as well as the free C-terminal domain. However, the C-terminal domain was ionized best and led to the high-resolution spectra presented here. The inset shows that two different species (A-B) were found, and the second inset of the charge state [16+] shows that isotopic resolution was achieved. The two different species indicate self-cleavage at two different proteolytic sites within the sequence (His361-Ala362-Ala363). Species A corresponds to the fragment obtained upon cleavage between Ala362-Ala363, species B to cleavage between His361-Ala362.

shows that isotopic resolution was achieved. The four different species indicate self-cleavage at four different sites within the sequence (Ala362-Ala363-Val364-Ser365-Ser366). Species A corresponds to the fragment obtained upon cleavage between -Ser366, species B to cleavage between Val364-Ser365, species C to the cleavage between Ala363-Val364 and species D to the cleavage between Ala362-Ala363.

Figure S21. Geometry of the Ca²⁺ binding in the C_cleaved- and C_sole-domain. The Ca²⁺ ion coordinates to eight O-donor ligands, three aspartate residues (Asp429, Asp431 and Asp479) and three water molecules, whereby two of these aspartates (Asp429 and Asp479) act as bidentate ligands.

# CC1/2 is defined as the correlation coefficient between two random half data sets.
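As a minimal sketch of the CC1/2 definition in the table footnote, the Pearson correlation coefficient between the intensities of two half data sets can be computed as shown below. The intensity values are hypothetical, and the sketch assumes that both lists hold mean intensities for the same unique reflections in the same order; data-processing programs typically report this statistic per resolution shell.

```python
import math


def pearson_cc(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long value lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean intensities of the same reflections in two random halves.
half_1 = [120.0, 85.5, 300.2, 42.1, 190.7]
half_2 = [118.3, 90.1, 295.8, 39.9, 201.4]

print(f"CC1/2 = {pearson_cc(half_1, half_2):.3f}")
```

Values close to 1 indicate that the two halves agree well, whereas values near 0 indicate that the measured intensities are dominated by noise.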

Species | M (calculated) / Da | M (measured) / Da | Δ / Da
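* $R_{\mathrm{work}} = \sum \bigl|\,|F_{\mathrm{calcd}}| - |F_{\mathrm{obs}}|\,\bigr| \,/\, \sum |F_{\mathrm{obs}}|$, where $F_{\mathrm{calcd}}$ and $F_{\mathrm{obs}}$ are the calculated and observed structure factor amplitudes, respectively.
** $R_{\mathrm{free}}$ is calculated for a randomly chosen 5 % of the reflections within each dataset.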
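The two footnote definitions can be illustrated with a short sketch: the R factor is the summed absolute amplitude difference divided by the summed observed amplitudes, and the free set is a random 5 % of the reflections. All names and amplitude values below are hypothetical; in practice the free set is flagged before refinement and kept out of it, and refinement programs handle scaling between observed and calculated amplitudes.

```python
import random


def r_factor(f_obs: list[float], f_calc: list[float]) -> float:
    """R = sum(| |F_calc| - |F_obs| |) / sum(|F_obs|) over the given reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(fc) - abs(fo)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def choose_free_set(n_reflections: int, fraction: float = 0.05,
                    seed: int = 0) -> set[int]:
    """Randomly pick the indices of the reflections reserved for R_free."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    return set(rng.sample(range(n_reflections), n_free))


# Hypothetical amplitudes for 100 reflections (illustration only).
rng = random.Random(1)
f_obs = [rng.uniform(10.0, 500.0) for _ in range(100)]
f_calc = [fo * rng.uniform(0.9, 1.1) for fo in f_obs]

free = choose_free_set(len(f_obs))
work = [i for i in range(len(f_obs)) if i not in free]

r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in sorted(free)],
                  [f_calc[i] for i in sorted(free)])
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the free reflections are never used to fit the model, a large gap between the two values is commonly read as a sign of overfitting.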