A top‐down proteomic approach reveals a salivary protein profile able to classify Parkinson's disease with respect to Alzheimer's disease patients and to healthy controls

Parkinson's disease (PD) is a complex neurodegenerative disease with motor and non‐motor symptoms. Diagnosis is complicated by the lack of reliable biomarkers. To identify peptides and/or proteins with diagnostic potential for early diagnosis, severity and discrimination from similar pathologies, the salivary proteome of 36 PD patients was investigated in comparison with that of 36 healthy controls (HC) and 35 Alzheimer's disease (AD) patients. A top‐down platform based on HPLC‐ESI‐IT‐MS allowed the characterization and quantification of intact peptides, small proteins and their PTMs (51 overall). The three groups showed significantly different protein profiles: PD patients showed the highest levels of cystatin SA and antileukoproteinase and the lowest levels of cystatin SN and some statherin proteoforms. HC exhibited the lowest abundance of thymosin β4, short S100A9, cystatin A, and dimeric cystatin B. AD patients showed the highest abundance of α‐defensins and short oxidized S100A9. Moreover, different proteoforms of the same protein, such as S‐cysteinylated and S‐glutathionylated cystatin B, showed opposite trends in the two pathological groups. Statherin and cystatins SA and SN accurately classified PD patients with respect to HC and AD subjects. α‐Defensins, histatin 1, oxidized S100A9, and P‐B fragments were the best classifying factors between PD and AD patients. Interestingly, statherin and thymosin β4 correlated with defective olfactory function in PD patients. All these outcomes highlight the involvement of specific proteoforms in the innate‐immune response and in the regulation of inflammation at the oral and systemic level, and suggest a possible panel of molecular and clinical markers suitable for recognizing subjects affected by PD.


INTRODUCTION
Parkinson's disease (PD) is the second most common neurodegenerative disease (ND), characterized by motor and non-motor symptoms (including cognitive impairment, olfactory disturbance, sleep disorders, etc.) mainly related to the progressive loss of dopaminergic neurons across the brain [1]. Accepted mechanisms involve the misfolding and oligomerization of α-synuclein in Lewy bodies, which are disseminated in the brain, cerebrospinal fluid (CSF), submandibular glands, skin, and colonic and nasal mucosa [2-4]. The causes are unknown in most cases, although familial forms and environmental factors have been identified as risk factors [5]. To date, reliable and specific diagnostic and prognostic biomarkers of PD are still missing, and clinical signs and symptoms, together with post-mortem examination, are the criteria for the clinical diagnosis of PD [1,4,5]. Current criteria define PD as the presence of bradykinesia combined with either rest tremor, rigidity, or both [6]. However, the clinical presentation is often multifaceted and can include several non-motor symptoms [1]. Generally, when PD is diagnosed in patients with classical motor symptoms, they already exhibit an 80% loss of dopamine in the striatum [7]. Some diagnostic tools, such as DAT scan, transcranial ultrasonography of the substantia nigra, or olfactory tests, may be of support, but they are expensive and of restricted accessibility, and thus not suitable for routine clinical screening. Moreover, PD has a prolonged prodromal phase [8], similar to other neurodegenerative conditions such as Alzheimer's disease (AD) [9]. Therefore, specific and sensitive biomarkers, either central or peripheral, are strongly needed, especially for the early diagnosis and severity of PD, for differential diagnosis from similar pathologies, and for monitoring curative therapies. Omics studies, such as metabolomics and proteomics, promise advances in the investigation of the molecular mechanisms of NDs as well as in biomarker discovery [4, 10-16]. Although fewer in number than proteomic studies of AD, extensive proteomic studies of PD are in progress, performed on CSF, plasma/serum, urine, tears, saliva and tissue samples from brain banks [4, 10-12].
Brain tissue and CSF are the samples of choice in proteomic studies of NDs because they are appropriate for identifying central biomarkers; however, brain tissues are available only post-mortem, and CSF collection is very invasive and not well accepted by patients. Peripheral biomarkers overcome these limitations, providing non-invasive diagnostic solutions. Blood, eye tissues, skin, urine, saliva, and olfactory and colonic mucosa have been shown to be indicators of cognitive and biological changes of the brain and are supposed to differentiate parkinsonian from normal conditions [17].
Recently, a novel combined panel of salivary protein biomarkers was proposed, including oligomeric α-synuclein, tau protein, microtubule-associated protein light chain 3 beta and tumour necrosis factor α [22]. The high level of these candidate biomarkers measured in saliva supports the idea that the neurodegenerative process in PD is generalized and may be reflected in saliva composition [3]. Indeed, in PD patients the function of the major salivary glands, and their secretion, can be altered [23]. Saliva is a very advantageous biofluid for proteomic investigations because its collection is minimally invasive, feasible without healthcare personnel, and well tolerated by donors [5,24]. Furthermore, the dynamic range of protein concentration in saliva makes the detection of low-abundance proteins less challenging than in plasma. The protein content of human saliva spans a wide range of molecular weights (MW), from peptides with MW under 10 kDa to small and large proteins (all those with MW over 10 kDa). These components are of both glandular and non-glandular origin; the latter are released by leucocytes present in the gingival crevicular fluid and by epithelial cells of the mucosa, or derive from blood and the CNS [25-27]. In consideration of these insights, we investigated a panel of salivary peptides and proteins, and their PTM-derived proteoforms, detectable by a top-down proteomic platform based on HPLC-ESI-IT-MS. Our approach was standardized in previous studies for the detection and quantification of hundreds of salivary peptides and proteins [28-31] and was successfully applied to salivary proteomics in AD [32,33]. This approach made it possible to obtain a profile of the naturally occurring salivary proteome, including isoforms and PTMs, and to quantify the proteins by a label-free method. The salivary proteome of PD patients was compared with that of a healthy control (HC) group and with that of a pathological control group, composed of subjects affected by AD, to highlight qualitative and quantitative variations. Moreover, Random Forest (RF) and Multidimensional Scaling (MDS) analyses, applied to the proteomic data, allowed the identification of salivary biomarkers useful for accurately classifying the subjects into the three groups.

Demographic and clinical features of subjects included into the study
PD patients were recruited at the Movement Disorders Centre of the University of Cagliari during regular out-patient follow-up visits.
Thirty-six PD patients were included in the study (72 ± 7 years old, mean age ± SD, 11 females, 15 males). PD was diagnosed according to the Movement Disorder Society Clinical Diagnostic Criteria for PD [6]. The following exclusion criteria were defined: identifiable cause of secondary parkinsonism or signs of atypical parkinsonian disorders, dementia, psychiatric conditions interfering with study participation, chronic/acute rhinosinusitis, and any systemic disease associated with smell disorders, such as chronic renal failure or thyroid disorders. All demographic and clinical data of PD patients are reported in supplementary Table S1, including disease duration, the Modified Hoehn and Yahr (H&Y) Stage [34] and the Unified PD Rating Scale (UPDRS) part III [35]  cognitive impairment [36,37]. Olfactory impairment was evaluated using the "Sniffin' Sticks" test as described in previous studies [38,39].
Thirty-six healthy volunteers constituted the HC group (78 ± 6 years old, 18 females, 18 males); they suffered from common age-related illnesses, such as hypertension, and were treated with standard drugs. No control subject used antidepressants or anticholinergic drugs. For statistical comparisons with the pathological control group constituted by AD patients, we used the proteomic data of 35 AD subjects (80 ± 6 years old, 23 females, 12 males) reported in our previous study [32]. The diagnosis of probable AD, made according to standardized criteria [41], classified 13 patients as moderate AD and the remaining 22 as mild AD. Among the subjects with or without NDs, 50% wore a dental prosthesis. The included subjects were not affected by any major oral disease (periodontitis, caries, or dry mouth); moreover, they had no history of radiotherapy or chemotherapy. The study was approved by the Ethics Committee of the State Cagliari University (Prot. PG/2018/10157 and PG/2018/8798), and the informed consent process for sample collection complied with the latest stipulations established by the Declaration of Helsinki. Participants received an explanatory statement and gave their written informed consent to participate in the study, but eight PD patients did not consent to the use of their clinical assessments, which are therefore not reported in Table S1.

Sample collection and treatment
Non-stimulated whole saliva was collected between 9:00 a.m. and 12:00 noon. Donors, in fasting conditions, were invited to sit in a relaxed position and to swallow. Whole saliva was collected as it flowed into the anterior floor of the mouth with a soft plastic aspirator for less than 1 min and transferred to a plastic tube cooled on ice. Salivary samples were immediately diluted in a 1:1 v/v ratio with a 0.2% aqueous TFA solution containing 50 μM leu-enkephalin as internal standard. After centrifugation (20,000 g for 15 min at 4 °C), the separated supernatant was either immediately analyzed by LC-MS or stored at −80 °C for up to 2 weeks until analysis. The total protein concentration (TPC) was determined for each sample in duplicate with a bicinchoninic acid (BCA) assay kit (Sigma-Aldrich/Merck), following the provided instructions.
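The 1:1 v/v dilution step implies simple final concentrations for the acid and the internal standard. The following minimal sketch works through that arithmetic; it assumes that volumes are additive and that saliva itself contributes neither TFA nor leu-enkephalin, and the function name `dilute` is purely illustrative.

```python
def dilute(stock_conc, stock_parts=1.0, sample_parts=1.0):
    """Final concentration of a stock component after mixing stock and sample.

    Assumes additive volumes and that the sample contains none of the component.
    """
    return stock_conc * stock_parts / (stock_parts + sample_parts)


# 0.2% aqueous TFA mixed 1:1 (v/v) with saliva
tfa_final_percent = dilute(0.2)

# 50 uM leu-enkephalin internal standard mixed 1:1 (v/v) with saliva
standard_final_um = dilute(50.0)

# Salivary components are diluted two-fold by the same step
saliva_dilution_factor = (1.0 + 1.0) / 1.0

print(tfa_final_percent, standard_final_um, saliva_dilution_factor)
```

Under these assumptions the analysed extract contains 0.1% TFA and 25 μM internal standard, and every salivary component is at half its native concentration.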

RP-HPLC-low resolution (LR)-ESI-MS analysis
All the salivary samples from PD, HC and AD subjects were collected and analysed by RP-HPLC-(LR)-ESI-MS in the same period and under the same experimental and instrumental conditions; thus, they were comparable. Thirty μL of acidic extract was injected into the HPLC-low resolution ESI-MS system, applying procedures and conditions optimized for the top-down analysis of human salivary samples in our previous studies [31,32]. Proteomic data obtained from 35 of the 36 HC samples and from all the AD subjects were published in our previous study [32]; therefore, they could be used in the present study for the comparative assessments. One HC sample and the samples from PD patients were investigated in this study to selectively search for and quantify the peptides and proteins, and their PTM-derived proteoforms, listed in supplementary Table S2, which are commonly detectable under the experimental conditions we use for top-down proteomic studies of human saliva [28,30,32]. Table S2   was accepted when at least 10% of the sequence was covered (particularly for proteins with more than 100 amino acid residues), and/or both b and y series of fragment ions were assigned. If the same PTM could be localized on different amino acid residues of the protein sequence, every possible modified sequence was tested to identify either the unique position of the PTM or multiple proteoforms. An example is the methionine oxidation of mono-oxidized S100A9 (Table S2), which can occur at one of the following positions: 89, 78, 76 or 58.

RP-HPLC-(HR)-ESI-MS/MS analysis
The multiply charged ions that gave the best MS/MS spectra for sequencing and for the identification and localization of PTMs are reported in Table S2. The MS/MS characterizations of intact peptides, proteins and their PTMs have been deposited in the ProteomeXchange Consortium (http://www.ebi.ac.uk/pride) via the PRIDE [45] partner repository with the dataset identifier PXD041787.

Statistical analysis
Statistical analysis considered both the abundance measured for each of the 51 protein targets listed in Table S2 (peptides, proteins and their PTM proteoforms) and in some cases the sum of the abundances of all the proteoforms belonging to the same protein family, for example, in the case of cystatin B. To simplify we call both "components" in an overall FDR less than 10%, were considered significant.Kendall correlations [47] were used to identify components with correlated abundance within the subjects of each group, followed by MDS to obtain a dimensionally reduced diagram of co-expressed proteins.RF analysis [48] was used to provide a classification of subjects from two mixed data sets: one obtained mixing HC and PD patients and a second one mixing AD and PD patients.The classification of HC and AD patients was omitted as it was object of a previous investigation [33].
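As an illustration of the Kendall correlation used to identify components with correlated abundance, the following minimal sketch computes the tau-b coefficient in pure Python. It is not the R code used for the analyses; the function name and the toy abundance values are hypothetical, and the tau-b variant (which corrects for ties) is an assumption, since the text does not specify which variant was applied.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall rank correlation (tau-b variant, corrected for ties)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # pair tied in both variables: excluded from both factors
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denominator = sqrt(
        (concordant + discordant + tied_x_only)
        * (concordant + discordant + tied_y_only)
    )
    if denominator == 0:
        return float("nan")
    return (concordant - discordant) / denominator


# Hypothetical abundances of two components measured in five subjects
component_a = [1.0, 2.0, 3.0, 4.0, 5.0]
component_b = [3.0, 1.0, 2.0, 5.0, 4.0]
tau = kendall_tau_b(component_a, component_b)
print(tau)
```

In this toy example 7 of the 10 subject pairs are concordant and 3 are discordant, so tau is (7 - 3)/10 = 0.4. A matrix of such coefficients between all pairs of components is the kind of input that MDS then reduces to a two-dimensional diagram.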
The Boruta method [49] was used to select a subset of relevant components by comparing their ability to discriminate different groups with that of shadow variables obtained by random permutation of copies of the original variables. Components that were significantly less important than the shadow variables were excluded, while all the others were selected for RF. RF parameters, such as the number of trees to grow and the number of components sampled for each split, were preliminarily tuned to optimize the classification accuracy. Accuracy was calculated as the proportion of correct assessments (both true positives and true negatives) out of the total number of assessments. RF classification was validated using the "out-of-bag" samples. In detail, only about two-thirds of the samples were used to build each decision tree, and the classification obtained with these samples was subsequently tested on the remaining one-third of the samples ("out-of-bag"). This procedure was repeated for each of the 1500 planned trees (hence the term "forest"), each time randomly selecting the samples for classification and those "out-of-bag" for validation. The overall accuracy was ultimately assessed from the average of the "out-of-bag" errors. The importance of the selected components for classification was assessed by the mean decrease Gini index (MDG). Variables with higher MDG have greater importance in the RF model. Dimensionally reduced diagrams of RF classifications were obtained by MDS of the proximity values between each pair of subjects. The proximity between two subjects is evaluated as the normalized frequency of trees that contain the two subjects in the same end node. MDS was computed using the singular value decomposition method, which ensures a numerically accurate matrix factorization even in the presence of a high degree of multicollinearity (i.e., multiple correlations). Nonparametric tests and multivariate analyses were performed using the software R [R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2021. http://www.R-project.org/].
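The "out-of-bag" logic, the accuracy definition and the proximity measure described above can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the R workflow used in the study: it assumes that each tree is grown on a bootstrap sample drawn with replacement (which leaves roughly one-third of the subjects out-of-bag), all function names are hypothetical, and the confusion-matrix counts are invented. The 72 subjects correspond to the 36 PD plus 36 HC data set, and 1500 is the number of trees reported in the text.

```python
import random


def expected_oob_fraction(n_samples):
    """Probability that a given subject is never drawn in a bootstrap sample."""
    return (1 - 1 / n_samples) ** n_samples


def bootstrap_split(n_samples, rng):
    """Indices used to grow one tree (in-bag) and those left out (out-of-bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


def accuracy(tp, tn, fp, fn):
    """Proportion of correct assessments out of all assessments."""
    return (tp + tn) / (tp + tn + fp + fn)


def proximity(leaves_a, leaves_b):
    """Normalized frequency of trees placing two subjects in the same end node.

    leaves_a and leaves_b list, tree by tree, the end node reached by each subject.
    """
    same = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return same / len(leaves_a)


rng = random.Random(0)
n_subjects = 72   # 36 PD + 36 HC
n_trees = 1500    # number of trees reported in the text

# Average out-of-bag fraction over all simulated trees
mean_oob = sum(
    len(bootstrap_split(n_subjects, rng)[1]) for _ in range(n_trees)
) / (n_trees * n_subjects)

print(round(expected_oob_fraction(n_subjects), 3), round(mean_oob, 3))
print(accuracy(tp=30, tn=33, fp=3, fn=6))   # invented counts
```

With bootstrap sampling, the expected out-of-bag fraction for 72 subjects is close to 0.37, consistent with the "about one-third" stated above.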

Immune-blotting analysis
Dot-blotting analysis was performed for the technical verification of some proteomic data, in particular for α-defensins, SLPI, and Tβ4.
Aliquots of 20 μL from acidic salivary samples with a final concentration of 0.3 μg/μL were prepared from 26 PD patients (75 ± 6 years old, mean age ± SD, 8 females, 18 males) and from 26 HC (76 ± 4 years old, mean age ± SD, 13 females, 13 males). Both pools were then concentrated to reach a final TPC of 2 μg/μL, and 2 μL of each was blotted in triplicate onto a nitrocellulose membrane.
The same blotting/detection procedure used previously [32] was applied here. The primary Ab dilutions were 1:1000 for α-defensins and SLPI, and 1:200 for Tβ4, in TBS-T (TBS with 0.05% Tween-20). SLPI signals were normalized with respect to that of α-defensins, which were quantitatively unchanged between PD and HC. The Tβ4 signal was normalized with respect to the signal of 0.25 nmol of the standard peptide.

RESULTS
The peptides, proteins and proteoforms soluble in the acidic solution and analyzed by our MS apparatus belong to the following protein families: acidic proline-rich proteins (aPRPs), statherin, histatins, salivary cystatins (S-type), cystatins A, B, C, and D, α-defensins, Tβ4, SLPI, and S100A8 and S100A9 proteins. Overall, 51 protein targets were investigated in each salivary sample, including proteoforms generated by phosphorylation, proteolysis, N-terminal acetylation, methionine oxidation, and cysteine oxidation (formation of disulfide bridges, glutathionylation, cysteinylation, and nitrosylation) (Table S2). They were identified by (HR)-MS/MS sequencing in the present study on PD and HC samples, previously in samples from HC and AD subjects [32], and in our other previous proteomic studies [28,30,31].
The identification of the 51 protein targets relied on the combination of several elements: the analysis of MS/MS spectra, providing sequence information and PTM localization; the monoisotopic intact mass values and, thus, the accurate Δmass corresponding to a specific PTM; the characteristics of the MS spectra (type and relative intensity of the multiply charged m/z ions); and, finally, the retention times in the chromatographic separation. An example of (HR)-MS/MS characterization is shown in Figure 1

Comparison between groups
Protein/peptide levels measured in the samples from PD patients were compared with those measured in samples from the two control groups, HC subjects and AD patients. Medians, interquartile ranges, fold change (FC), calculated as the log2 ratio between median values, and statistical comparisons between groups by exact Mann-Whitney and Kruskal-Wallis tests are reported in Table 1. The FC is shown only for components that changed significantly in the comparisons, while Table S3, in the supplementary material, reports the FC values determined for all the components. The Venn diagram shown in Figure 2 emphasizes the similarities and peculiarities of the three comparisons (PD versus AD, PD versus HC, and AD versus HC), indicating which components were found with significantly different abundance in PD patients with respect to both control groups (panel "1" in Figure 2), in both pathological groups with respect to the HC (panel "2"), and in AD patients with respect to the HC and PD groups (panel "3"). Panels "4," "5," and "6" include components whose levels varied specifically in PD versus HC, AD versus HC, and PD versus AD, respectively. Finally, panel "7" includes components showing level variations in all three comparisons.
Note: XIC peak areas (median and interquartile range) normalized on total protein concentration, and frequency, expressed as % (F%), of the salivary components in PD, HC and AD patients are reported, as well as the p-values obtained by the two statistical tests. The FC, as the log2 ratio between median values, is reported only for components with significant p-value in the comparisons.
The salivary protein profile of PD patients, when compared to that of HC subjects, was predominantly characterized by significantly higher levels of peptides and proteins not secreted by salivary glands, such as SLPI, Tβ4, cystatin A and its N-terminal acetylated proteoform, monomeric cystatin B-SSC, cysteinylated at C3, but not the derivative glutathionylated (SSG) at the C3 residue, the S-S homodimer of cystatin B, and S100A9s. The S-S heterodimer S100A8-S100A9s was also found to be more abundant in PD than in HC samples (p-value 0.043), but the difference was not considered significant because the FDR value was greater than 10%. It is worth underlining that this component, detected in 28% of PD patients, was never found in the saliva of HC subjects (Table 1). Owing to very low concentrations, below the instrumental detection limits, some components were detected in only a few samples, such as SLPI and Tβ4, detected in only 39% and 50% of the HC subjects, respectively. A lower frequency of detection was also observed in the HC group for S100A8 and its nitrosylated form (SNO) (Table 1). Dot-blotting experiments confirmed the similar abundance of α-defensins in the PD and HC groups (Figure 3A,D) and the significantly different abundances of SLPI (Figure 3B,E) and Tβ4 (Figure 3C,F).
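The fold change reported in Table 1, defined as the log2 ratio between median values, can be illustrated with a short sketch. The abundance values below are invented, the function name is hypothetical, and two choices are assumptions not stated in the text: the direction of the ratio (first group over second) and the handling of a zero median, which can arise for a component not detected in most subjects of a group and is returned here as undefined.

```python
from math import log2
from statistics import median


def log2_fold_change(group_a, group_b):
    """log2 of the ratio between the medians of two groups (group_a / group_b).

    Returns None when either median is zero or negative, because the
    logarithm of the ratio is then undefined.
    """
    median_a, median_b = median(group_a), median(group_b)
    if median_a <= 0 or median_b <= 0:
        return None
    return log2(median_a / median_b)


# Invented normalized XIC peak areas for one component in two groups
pd_levels = [4.0, 8.0, 16.0]   # median 8.0
hc_levels = [1.0, 2.0, 4.0]    # median 2.0

fc = log2_fold_change(pd_levels, hc_levels)
print(fc)   # log2(8.0 / 2.0)

# A component detected in fewer than half of the subjects has median 0
undefined_fc = log2_fold_change([0.0, 0.0, 3.0], hc_levels)
print(undefined_fc)
```

On this scale an FC of 2 corresponds to a four-fold higher median in the first group, and a negative FC to a lower median.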

TABLE 1
Among the proteins secreted by salivary glands, only cystatin SA and the C-terminal fragment desF43 of statherin showed a higher level in PD patients than in HC, in whom the frequency of detection was also lower. Conversely, most of the investigated peptides and proteins originating from salivary glands were found to be less abundant in saliva from PD patients, especially Hst1, di- and mono-phosphorylated statherin (2P, 1P) and its N- and C-terminal fragments (except for fragments desT42-F43 and des1-13), some proteoforms belonging to the aPRP family, and cystatin SN; also in this case, the decreased levels were associated with a very low frequency of detection (Table 1).
FIGURE 2 Venn diagram obtained considering the significant differences of the protein levels reported in Table 1 and determined by Mann-Whitney and Kruskal-Wallis tests. "1," panel of components different in PD patients with respect to both control groups; "2," panel of components different in both the pathological groups with respect to the healthy controls; "3," panel of components different in AD patients with respect to HC and PD groups; "4," "5," and "6," panels including components specifically varied in PD versus HC, AD versus HC and PD versus AD, respectively; "7," components showing level variations in all the three comparisons.
By comparing the PD and AD salivary profiles, we found significantly higher abundances of SLPI, cystatin SA, the fragments des1-5 and des1-12 of P-B peptide, and statherin desF43 in the PD than in the AD patient group. The lower abundance of SLPI in AD patients was accompanied by a lower detection frequency (40%), and, although with similar abundance, the dimer S100A8-S100A9 was detected in only 9% of AD patients against 28% of PD patients (Table 1 than in AD patients (31%). The results obtained from the comparison between the HC and AD groups were in accordance with our previous studies [32,33]: significantly higher abundances were determined in AD salivary samples for statherin 2P and its proteoforms des1-9 and des1-13, Hst1, both phosphorylated and non-phosphorylated, P-C peptide, cystatin A, cystatin B-SSG and the S-S dimer, S100A9 proteoforms, especially S100A9s, α-defensins 1-3, Tβ4, and S100A8-SNO (Table 1).

Random-Forest (RF) classification analysis
RF classification between PD patients and HC subjects, and between PD and AD patients, was applied to a subset of components selected according to the Boruta method, to improve the classification accuracy [49]. Sixteen components were selected for the RF classification of PD and HC subjects (supplementary Figure S1).
According to the MDG scores (Table 2), the most discriminant protein was cystatin SA, followed by statherin 2P and cystatin SN. Several statherin proteoforms and SLPI were also good discriminant components, while PRP1 3P and S100A9s had lower MDG scores. Twenty-one components were selected for the classification of AD and PD patients (Figure S2). Also in this case, MDG scores ( The large majority of the selected components also showed significant changes by the Mann-Whitney test (Table 1). Eight components that did not reach statistical significance are indicated in italics; among these, only one (PB des1-7) had an MDG score > 1.
should be noted that these findings were validated by the "out of bag" samples, which represented about one third of the entire set of data.
A partial, approximate representation of RF classifications is shown by the MDS of the proximities among the samples (Figure 4 2) showed also significant differences by Mann-Whitney tests. However, some components, mostly with low MDG scores, did not show significant changes and, on the other hand, some components with significant changes were not selected for RF classification. This apparent contradiction is due to the essential nature of RF classification and, in general, of methods based on decision trees. Indeed, RF can use different "split" points within the same variable and can discriminate groups even when their means (or average ranks) are equal, an approach completely outside the logic of comparison tests. The possibility of identifying multiple split points allows good or excellent classifications to be obtained, but it is not appropriate for normal diagnostic purposes, which require unique reference thresholds within a scale monotonically related to the severity of a given disease. For this reason, the components selected for RF classification but exhibiting no significant changes by Mann-Whitney tests were considered without diagnostic potential and discussed in the next Discussion Section. These components are indicated in italics in Table 2.
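The point about multiple split points can be made concrete with a toy example. In the sketch below, two invented groups have identical medians and identical mean ranks, so a rank-based comparison would see no difference between them, yet two thresholds on the same variable separate them completely, whereas no single threshold can. All values and function names are hypothetical and serve only to illustrate the reasoning.

```python
from statistics import median

# Invented abundance values, chosen only to illustrate the argument
group_a = [4.0, 5.0, 6.0]        # levels concentrated in the middle of the scale
group_b = [1.0, 2.0, 8.0, 9.0]   # levels at both extremes of the scale
pooled = group_a + group_b


def mean_rank(group, pooled_values):
    """Mean rank of a group within the pooled sample (assumes no ties)."""
    ordered = sorted(pooled_values)
    return sum(ordered.index(x) + 1 for x in group) / len(group)


def best_single_cut_correct(a, b):
    """Largest number of subjects correctly classified by any single threshold."""
    ordered = sorted(a + b)
    cuts = [(low + high) / 2 for low, high in zip(ordered, ordered[1:])]
    total = len(ordered)
    best = 0
    for cut in cuts:
        # Rule "a below the cut, b above"; total - correct covers the reverse rule
        correct = sum(x < cut for x in a) + sum(x >= cut for x in b)
        best = max(best, correct, total - correct)
    return best


def two_cut_correct(a, b, low, high):
    """Subjects correctly classified by the rule: group a if low < x < high."""
    in_band_a = sum(low < x < high for x in a)
    out_band_b = sum(not (low < x < high) for x in b)
    return in_band_a + out_band_b


single = best_single_cut_correct(group_a, group_b)
double = two_cut_correct(group_a, group_b, low=3.0, high=7.0)

print(median(group_a), median(group_b))
print(mean_rank(group_a, pooled), mean_rank(group_b, pooled))
print(single, double, len(pooled))
```

Here the best single threshold classifies 5 of 7 subjects correctly, while the two-threshold rule classifies all 7, even though the groups are indistinguishable by median or mean rank. Such a rule is a valid classifier but not a monotonic diagnostic scale, which is why these components were not considered to have diagnostic potential.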

Kendall correlation analysis within the groups
MDS applied to Kendall correlations (Figure 5), highlighted some clusters generated by components with correlated levels.To facilitate the

Correlation with clinical assessments
The clinical assessments measured for the PD patients (years of disease, UPDRS III, H&Y, MoCA and olfactory function; Table S1) were analyzed by the Spearman correlation test to find significant associations with the abundances of the investigated components (Table 3 and supplementary Figure S3). Levels of α-defensin 3 in PD patients showed a negative correlation with the UPDRS III, being higher in patients with low UPDRS III scores and thus less motor impairment (Figure S3A). The highest levels of Hst3 and statherin (Figure S3B) were significantly associated with low MoCA scores, which measure cognitive impairment. The MoCA scores, determined for 24 patients, showed a median of 22.0 (20.0-25.8), and 18 patients had a score under the normal cut-off (<26.0). The olfactory function was classified as normosmia, hyposmia, or anosmia: thirteen patients (52%) showed functional anosmia, eleven (44%) showed hyposmia, and only one patient showed normosmia.
PD patients with the higher levels of statherin 2P showed anosmia; indeed, statherin was negatively correlated with odour discrimination, odour identification, and the global olfactory function (TDI score) (Table 3, Figure S3C-E), indicating that statherin was more abundant in patients with more defective olfactory function. The highest levels of Tβ4 correlated with less impaired olfactory function: Tβ4 exhibited a positive correlation with the global olfactory function (TDI score) and odour identification (Table 3, Figure S3F).
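The Spearman correlation used for these associations is the Pearson correlation computed on ranks. The following minimal pure-Python sketch shows the calculation on invented data, in which higher statherin levels go with lower TDI scores; the values and function names are hypothetical, and the analyses in the study were run in R, not with this code.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented data: statherin abundance and TDI olfactory score in five patients
statherin_levels = [1.0, 2.5, 3.0, 7.0, 9.5]
tdi_scores = [30.0, 24.0, 20.0, 12.0, 8.0]

rho = spearman_rho(statherin_levels, tdi_scores)
print(rho)
```

Because the invented TDI scores decrease strictly as statherin increases, rho is -1 here; real data give intermediate values, whose sign indicates the direction of the association as reported in Table 3.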

DISCUSSION
A top-down proteomic platform was used in this study to investigate the intact salivary proteome of PD patients and to compare it with those of a healthy and a pathological control group, the latter represented by patients affected by AD. Significant differences in the salivary profiles of the three groups were obtained, and specific components were found to be good markers classifying PD patients with respect to AD patients and HC. The Kendall correlation analysis also identified different associations among the investigated protein families in the three groups, highlighting that the protein clusters were more compact in HC than in the two pathological groups, especially the AD group. The protein correlation analysis suggested a dysregulation, associated with the two pathologies, of the relationships among proteins that are correlated in normal conditions, such as S100A9 proteoforms, aPRP proteoforms and S-type cystatins. Finally, interesting relationships were found between the proteomic data and the clinical assessments measured in PD patients, especially olfactory perception, cognitive ability, and motor impairment. As far as we know, only two proteomic studies in the PD research field have been conducted using saliva [10,52]. Figura and colleagues reported lower levels of proteins associated with inflammation in the salivary proteome of PD patients compared with an HC group [10], among them α-defensin 1, S100A8, S100A9, and SLPI. Masters and colleagues reported conflicting results, indicating an up-regulation of the S100A8 and S100A9 proteins in the PD salivary proteome, although these outcomes must be considered preliminary because they were obtained from a comparison of three PD patients and one HC [52].

TABLE 3 Results of the Spearman correlation analysis between clinical assessments and abundances of the investigated salivary components.
Our results diverged from those of Figura et al. [10], highlighting significantly higher levels of SLPI and the S100A9s proteoform in the salivary profile of PD patients than in HC, but no significantly different levels of α-defensin 1 and S100A8 proteoforms, except for the frequency of detection of S100A8, which was higher in PD patients than in HC subjects for both the unmodified and nitrosylated forms. It is worth underlining that Figura et al. carried out shot-gun proteomics on whole salivary samples, and that both the collection and the treatment of the samples differed from those used in our standardized protocol.
Our top-down proteomic approach has the advantage of allowing the characterization of intact peptides, proteins and their PTMs that are soluble in acidic solution and directly analysable by an HPLC-ESI-MS platform, minimizing sample manipulation and preserving the protein content from degradation [28-33, 44, 50, 51]. Moreover, our approach allowed the characterization and the label-free quantification of different intact proteoforms of the same protein, outcomes not obtainable with a shot-gun approach.

S100A9 and S100A8 proteoforms
The panel of S100A8 and S100A9 proteoforms was found to be more similar between PD and HC groups rather than between AD and HC groups, as demonstrated in the present study and in our previous [32,33], where several S100A8 and S100A9 proteoforms were individuated as components classifying the subjects in AD or HC groups [33].The prevalence of certain S100A8 and S100A9 proteoforms in AD group explained the results of the Kendall correlation analysis that highlighted in AD group the most scattered cluster of components.In PD salivary samples S100A9s was the prevalent proteoform, which was identified as good factor classifying PD patients from HC subjects.Whereas the Met-oxidized S100A9s was individuated as discriminating component in the classification of the patients in PD or AD groups.S100A9 and S100A8 are constitutively expressed in immune cells and their expression and extracellular release are upregulated also in other cell types under inflammatory conditions [53].They may interact with toll-like receptors (TLRs) activating the innate immune system and mediating inflammation through induction of cytokine secretion and influencing monocyte and macrophage behaviour [54].
They can exert both pro-and anti-inflammatory effects with a switching depending on local microenvironments, oxidative modifications, and metal ion-binding [55].The methionine oxidation of S100A9 can terminate its chemo-repulsive effect on peripheral neutrophils [56].
S100A8 exerts anti-inflammatory activity when modified by nitrosylation of its cysteine residue [57], and it has been reported that S100A8 disulfide-linked dimers do not exhibit chemotactic action [58] [60,61]. An upregulation of TLRs was found in the brain and peripheral blood cells of PD patients [60]. In addition, it was proposed that a gut dysfunction altering TLR2 and TLR4 signalling in PD promotes α-synuclein aggregation in enteric and vagal neurons and the subsequent migration of these aggregates to the brain via peripheral nerves, contributing to neuroinflammation and neurodegeneration [62]. TLRs are also involved in AD, since they can affect synaptic plasticity, microglial activity, tau phosphorylation, and inflammatory responses; moreover, several genetic polymorphisms of TLRs have been recognized as protective or risk factors for AD [63].

Inhibitors of proteases/cathepsins
The results of this study suggested a stronger activity of cathepsin and protease inhibitors in PD than in AD patients, with the specific involvement of SLPI and cystatin SA. These two proteins exhibited their highest levels in PD salivary samples: SLPI was identified as a good factor for classifying PD subjects with respect to HC subjects, while cystatin SA was the protein with the highest MDG score in classifying PD subjects with respect to both HC and AD subjects.
An opposite trend was observed for cystatin SN, which belongs, together with cystatin SA, to the S-type cystatins [64]. This different trend accounted for the greater dispersion of S-type cystatins highlighted by Kendall's correlation analysis. S-type cystatins are implicated in the innate immune response, suppressing some viral, bacterial, and fungal infections by inhibiting exogenous cysteine proteinases [65]. Secreted by the submandibular/sublingual glands, they are particularly involved in oral inflammatory processes, where they inhibit the lysosomal cathepsins implicated in the destruction of periodontal tissues [66]. The specific biological role of cystatin SA is largely unknown; however, it should be underlined that SA specifically inhibits cathepsin L [66], while SN can inhibit cathepsins B and C [67]. SLPI is an anti-inflammatory and antimicrobial protein produced by neutrophils and macrophages associated with the respiratory tract mucosa and with the parotid and submandibular glands [68]. It inhibits several serine proteases, such as cathepsin G, elastase, and trypsin released from many cell types, and chymase and tryptase from mast cells [68]. Other protease inhibitors typically detectable in saliva, such as cystatins A and B, showed the highest abundance in PD and AD patients. Furthermore, these two proteins and their derivatives were highly correlated with each other, forming a very tight cluster in all the groups. They are important inhibitors of endogenous and exogenous proteases and are involved in inflammatory processes and innate immunity [69]. Cystatin B inhibits cathepsins B, L, H, and S, and cystatin A inhibits cathepsins B, L, and H [67].
Cathepsins are the most abundant lysosomal proteases, and recent studies have pointed to a possible role of lysosomal activity in neurodegeneration as a modulator of aggregation-prone proteins, such as α-synuclein and β-amyloid [70]. Alterations in lysosomal cathepsins D, B, and L can contribute to the pathogenesis of NDs, such as α-synucleinopathies and AD, since they are implicated in neuronal functions, synaptic plasticity, and the autophagy that removes abnormal protein aggregates in the CNS [70].
Our results suggested that, in both the investigated NDs, the high level of cathepsin/protease inhibitors could be associated with an excessive and uncontrolled anti-inflammatory response, which could result in a deficit in lysosomal autophagic activity, especially in PD patients. A possible protective role of cystatins B and A in neurodegeneration has been underlined [32], and cystatin B has been reported to bind amyloid-β and interrupt amyloid aggregation in cells [71]. Moreover, it is interesting that, when compared with HC, S-cysteinylated cystatin B was prevalent in the PD group, while the S-glutathionylated form was prevalent in the AD group. The disulfide dimer exhibited the highest abundance in both patient groups. These insights suggested that cystatin B may be involved with different roles and mechanisms in PD and AD pathogenesis. Indeed, S-glutathionylation, which results from GSH addition, is a protective PTM acting under oxidative stress conditions, and it may be implicated in signalling cascades, including those associated with proliferation, inflammatory responses, apoptosis, and senescence [72]. S-cysteinylation also occurs under oxidative stress conditions: a disulfide bond converts a cysteine residue into S-cysteinyl-Cys, and the donor can be cysteine or dimeric GSH.

α-defensins 1-4 and thymosin β4
The highest salivary levels of α-defensins 1-4 were found in AD patients, followed, in order of abundance, by PD patients and HC subjects.
α-defensin 2 was one of the components discriminating PD from AD subjects, while α-defensin 3 was one of the factors classifying subjects as AD or HC [32]. These antimicrobial peptides, involved in innate immunity and in the regulation of the inflammatory response, are the major products released by neutrophils in infectious conditions [65]. [74]. Moreover, as previously discussed for the S100A9-S100A8/TLR interaction, these results reinforce the hypothesis of the "microbiota-induced neuronal inflammation" [75], which may hold for both PD and AD pathogenesis through specific but still unknown mechanisms. By altering the permeability of the blood-brain barrier, oral and gut microbiota, or their released endotoxins, facilitate cerebral colonization by opportunistic pathogens and induce microglia activation and the upregulation of proinflammatory cytokines, which lead to neuronal loss and neurodegeneration. The cluster of α-defensins 1-4 showed a strong proximity to Tβ4 in the AD group, which was identified as a component classifying AD from HC subjects [33] but not from PD patients in the present study. However, in both pathological groups Tβ4 showed trend variations similar to those observed for α-defensins. Tβ4 is a moonlighting peptide widely expressed in human tissues [76], where it may down-regulate inflammatory chemokines and cytokines, promote cell migration, blood vessel formation, cell survival, and stem cell maturation, inhibit microbial growth, and exert antiapoptotic effects [76]. Moreover, it plays a neuroprotective and neuro-regenerative role [77], having been found up-regulated in the reactive microglia of patients with AD, where it suppresses pro-inflammatory signalling.

Proteins and peptides secreted by salivary glands
Apart from cystatin SA, the statherin fragment desF43, and the P-B fragments des1-5 and des1-12, which showed their highest abundance in the PD salivary protein profile, PD was associated with the lowest abundance of histatins, statherin and aPRP proteoforms, and cystatin SN. Conversely, AD patients showed the highest levels of components of glandular origin. Di-phosphorylated statherin, which is the main detectable form of the statherin family [29], was identified, together with cystatins SA and SN, as an optimal component for discriminating PD subjects from both control groups, healthy and pathological, and five of its proteoforms showed the same diagnostic potential. In contrast with our results, Figura et al. found a higher level of statherin in PD patients than in the HC group [10], but without characterizing its various proteoforms. Some peptides of glandular origin, such as Hst1, Hst5 and the fragments des1-5 and des1-12 of the P-B peptide, specifically classified patients into the PD or the AD group. Although these results suggested, on the one hand, a down-regulation of glandular secretion in PD patients and an up-regulation in AD patients, on the other hand they indicated a differential expression and/or an alteration of the secretory pathways and/or of the maturation processes of specific secretory peptides and proteins.
Moreover, the turnover of specific components could also differ between the two NDs.
Certainly, an impairment of the salivary glands is associated with PD, as demonstrated by various studies, and the submandibular glands in particular appear to be affected by synucleinopathy in PD, as in dementia with Lewy bodies [78]. However, this phenomenon should have general consequences on the qualitative and quantitative changes of the glandular secretion; instead, we observed different trend variations in PD for proteins of the same family and with common glandular origin and secretory pathways. The S-type cystatins S, SA and SN are expressed principally in submandibular and sublingual saliva and, to a lesser extent, in parotid secretion [29,64]; nevertheless, in the PD salivary samples we observed a strong upregulation of SA, a downregulation of SN, and no changes in cystatin S. It would be interesting in the future to evaluate whether cystatin SA is also correlated with PD at the level of gene expression. The P-B peptide is a typical product of the submandibular secretion [29] whose function is largely unknown; in our samples it did not undergo quantitative variations, and only two of its four N-terminally truncated fragments were significantly higher in PD than in AD patients.
These fragments were generated at different cleavage sequences, after Arg4 for des1-5 and after Pro11 for des1-12, probably by different proteases. Components of the histatin, statherin, and aPRP families are secreted by both the submandibular/sublingual and the parotid glands [29]; in these cases, too, different trend variations were observed for different proteoforms.
The very low levels of histatin 1, cystatin SN, statherin and aPRP proteoforms can be considered a risk factor for oral diseases and infections in patients with PD, which could be linked to the hypothesis of the "microbiota-induced neuronal inflammation" [75]. Indeed, these components are fundamental for maintaining oral homeostasis, being implicated in the formation of the acquired pellicle and in the antimicrobial protection of the oral surfaces [65]. Statherin and aPRPs regulate calcium homeostasis, being potent inhibitors of calcium phosphate precipitation; moreover, they modulate the colonization of the oral surfaces by host bacteria [79]. Histatin 1 indirectly induces wound healing by stimulating epithelial migration [80].

Correlation between proteomic data and clinical assessment in PD patients
Interestingly, the levels of some peptides were negatively correlated with the UPDRS III score and with the MoCA score. The lowest α-defensin 3 levels correlated with the highest UPDRS III scores in PD patients, and thus with increased motor impairment, suggesting deficient protection against infections and a less controlled inflammatory response, since this peptide is both an antimicrobial peptide and an inflammatory modulator. Moreover, the patients with the lowest MoCA scores, and thus with a higher degree of cognitive impairment, showed the highest levels of histatin 3 and statherin, which are implicated in the protection and homeostasis of the oral cavity [65]. The highest levels of statherin 2P also correlated with the lowest scores for two parameters that define olfactory function, namely odour discrimination and identification, and consequently with the TDI score, which indicates global olfactory function. It is interesting to underline that, even though statherin is a typical salivary peptide, it has also been found to be expressed by the epithelium of the nasal mucosa [81]; its possible implication in odour discrimination and identification is therefore intriguing and merits future investigation, and the highest abundance of statherin in PD patients with more impaired olfactory perception could reflect an alteration of the molecular mechanism regulating this physiological process. It is noteworthy that impairment of olfactory function is associated with cognitive decline in PD patients [82], and that both are common non-motor symptoms in PD [83], with olfactory deficits affecting up to 95% of PD patients [84]. The presence of these symptoms may predict the subsequent development of PD dementia [85].
In contrast to statherin, the highest levels of Tβ4 were found in the patients with the highest scores for odour identification and TDI, and thus with minor impairment of olfactory perception. This result may reflect a self-protective response mechanism and could be associated with the important neuroprotective and neuroinflammation-suppressing role of Tβ4, and with its ability to stimulate tissue regeneration, angiogenesis, and cell survival [76,86].
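The rank-based correlations discussed above can be illustrated with a minimal sketch of the Spearman coefficient. The study computed these correlations in GraphPad Prism; the code below is only an assumption-laden illustration in plain Python, the peptide and score values are hypothetical and not taken from the study, and the sketch returns the coefficient only, without the two-tailed p-value.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical values for six patients (illustration only).
peptide_auc = [5.1, 4.2, 3.9, 3.0, 2.2, 1.5]   # peptide XIC peak area, arbitrary units
updrs_iii = [12, 18, 17, 25, 30, 41]           # motor score

rho = spearman_rho(peptide_auc, updrs_iii)
print(f"Spearman rho = {rho:.3f}")
```

A negative coefficient, as in this toy example, corresponds to the pattern described for α-defensin 3, where lower peptide levels accompany higher UPDRS III scores.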
The results of this investigation are novel and original with respect to the methodological approach used and the biological fluid explored in the field of PD research. They demonstrated that it was possible to identify a panel of peptides and proteins detectable in human saliva with high diagnostic potential, useful for distinguishing patients with PD from those with AD and from healthy subjects.
Moreover, it was possible to discriminate specific proteoforms deriving from PTMs of the investigated peptides and proteins, which in several cases were found to be significantly associated with one or the other ND.
Finally, the correlation analysis between proteomic and clinical data demonstrated that it was possible to delineate a set of candidate biomarkers of PD, including salivary proteoforms, cognitive and motor assessments, and above all olfactory perception, which could potentially be used to better identify patients affected by PD.
to measure the motor impairment. The Montreal Cognitive Assessment (MoCA) scale was used to evaluate participants' cognitive function in eight different domains (visuo-constructional skills, attention, memory, language, orientation, concentration, conceptual thinking, and calculation), with scores of 25 or below indicating

Statement of significance of the study
The proteomic investigation presented here was the first study of the salivary protein profile associated with Parkinson's disease obtained by a top-down approach, and it provided original and novel outcomes on this topic. The results made it possible to identify a panel of peptides and proteins with relevant diagnostic potential, distinguishing various proteoforms derived from PTMs. Indeed, the study highlighted statistically significant quantitative variations between the salivary protein profile characterized in PD patients and those of the two control groups, healthy and pathological, the latter consisting of patients affected by AD. Moreover, the study identified salivary proteoforms able to classify with high accuracy subjects affected by PD with respect to those affected by AD and to healthy controls. Finally, the correlation of proteomic data with clinical parameters in PD patients, especially with olfactory function, demonstrated the feasibility of defining a set of clinical and peripheral molecular biomarkers for recognizing patients affected by PD.
the text when we refer to the quantitative data. The number of components examined in this study is 57. The distribution of the XIC peak AUC of every protein/peptide showed a considerable deviation from normality according to the Kolmogorov-Smirnov test and other goodness-of-fit tests (p-values < 0.0001 in almost all tests, data not shown). Thus, data were analyzed using statistical methods that do not depend on the specific distribution of the data. Correlations between the AUC values of the 57 components measured in the 36 PD salivary samples and the clinical data (years of disease, UPDRS III, H&Y, MoCA, and olfactory function) were assessed by the Spearman correlation test, with significance set at a two-tailed p-value < 0.05, using GraphPad Prism 6.0. Mann-Whitney and Kruskal-Wallis tests were used to identify components with different abundance between groups. The FDR of multiple tests was controlled by the Benjamini-Hochberg method [46]. Test p-values < 0.05, with

for the intact cystatin D R26 des1-5, which is the variant of cystatin D with an arginine residue at position 26, carrying an N-terminal pyro-glutamination that occurred after removal of residues 1-5, and two intrachain disulfide bridges. The MS spectrum recorded in the retention time range 39.0-39.4 min, corresponding to the retention time of cystatin D R26 des1-5, is shown in Figure 1A, and the related deconvoluted spectrum reporting the monoisotopic [M+1H]+1 m/z value is shown in Figure 1B. The experimental monoisotopic [M+1H]+1 at 13509.66 ± 0.08 m/z was attributed to cystatin D R26 des1-5, matching well the theoretical value of 13509.65 m/z. Panel C of Figure 1 shows the deconvoluted (HR)-MS/MS spectrum obtained by fragmentation of the [M+11H]+11 ion at 1229.80 m/z of cystatin D R26 des1-5. The analysis of the (HR)-MS/MS fragmentation spectrum by the ProSightLite tool is reported in panels D and E of Figure 1. Panel D of Figure 1 reports the b and y fragment ions assigned by the ProSightLite program, with the related theoretical and experimental [M+1H]+1 m/z values and the mass difference calculated in ppm.

The instrumental detection limit of the (HR)-MS/MS apparatus did not allow the determination of monoisotopic mass values, and thus top-down MS/MS sequencing, for the following two proteoforms: the disulfide hetero-dimer linking C42 of S100A8 and C3 of S100A9s (dimer A8-A9), and the disulfide homo-dimer of cystatin B (S-S dimer) involving the unique C3 residue of the protein. The term S100A9s indicates the N-terminally acetylated short proteoform of S100A9 (108 residues), which is generated from the long form (named here S100A9l) by removal of the first five amino acid residues. Both dimeric proteoforms were instead detected by (LR)-MS analysis, which determined an experimental Mav of 23986 ± 3 Da for the dimer A8-A9 (theoretical mass 23985 Da) and of 22358 ± 2 Da for the cystatin B S-S dimer (theoretical mass 22361 Da). They were detected and identified in our previous proteomic studies by a bottom-up approach based on (HR)-MS/MS analysis [28, 50, 51]; thus, here the attribution was based on the comparison of experimental versus theoretical Mav values, of the relative distribution of the multiply-charged ions in their mass spectra, and of the retention times obtained in this investigation with those determined in our previous studies [28, 50, 51].
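The mass differences quoted above can be expressed in ppm with a one-line formula, (experimental - theoretical) / theoretical × 10^6. The following minimal sketch applies it to the values given in the text; the function name and the dictionary layout are illustrative choices, not part of the original workflow, which relied on ProSightLite for fragment-level ppm values.

```python
def ppm_error(experimental: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (experimental - theoretical) / theoretical * 1e6


# (experimental, theoretical) values as quoted in the text.
measurements = {
    "cystatin D R26 des1-5, monoisotopic [M+1H]+1 (m/z)": (13509.66, 13509.65),
    "dimer A8-A9, Mav (Da)": (23986.0, 23985.0),
    "cystatin B S-S dimer, Mav (Da)": (22358.0, 22361.0),
}

for name, (experimental, theoretical) in measurements.items():
    print(f"{name}: {ppm_error(experimental, theoretical):+.1f} ppm")
```

The high-resolution value differs from theory by less than 1 ppm, whereas the two dimers, measured only at low resolution, differ by tens of ppm or more, which is consistent with their attribution resting additionally on charge-state distribution and retention time.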

F I G U R E 1
Example of top-down (HR)-MS and MS/MS identification of the proteoform des1-5 of cystatin D R26. (A) MS spectrum recorded in the retention time range 39.0-39.4 min. (B) Related deconvoluted MS spectrum reporting monoisotopic [M+1H]+1 m/z values. (C) Deconvoluted (HR)-MS/MS spectrum obtained by fragmentation of the [M+11H]+11 ion at 1229.80 m/z of the protein. (D) b and y fragment ions assigned by ProSightLite, with theoretical and experimental [M+1H]+1 m/z values and the mass difference in ppm. (E) Observed MS/MS fragmentation of the sequence of cystatin D R26 des1-5 as represented by ProSightLite. Modified amino acid residues are in orange: cysteine residues involved in disulfide bridges, and the N-terminal Q, exposed after removal of the N-terminal sequence 1-5 from cystatin D R26, which undergoes pyro-glutamination.

F I G U R E 3
Results of the dot-blotting analysis with monoclonal Abs of total α-defensins (A), SLPI (B), and Tβ4 (C), with the corresponding plots below (panels D, F) showing the distributions of the normalized intensity signals. For α-defensins, normalization was made with respect to the background, for SLPI with respect to α-defensins, and for Tβ4 with respect to its standard. p-values and grade of significance are reported (n.s. = not significant; ** = p < 0.01). Standard Tβ4 (panel C) was blotted on the membrane as 0.25 (St.1), 0.5 (St.2) and 0.75 (St.3) nmol.
panel B and D). MDS plots show only the first two axes, which are the most representative of the multidimensional structure of the relationships between each pair of samples. However, even with this limitation, it can be observed that several PD samples (samples 3, 4, 6, 7, 9, 14, 17, 18, 20, 22, 23, 28, 29, 32 and 36, Figure 4 panel B and D) are always strongly clustered in both the PD-HC and PD-AD plots, whereas others are always distant from the PD cluster (samples 15, 16, 19 and 34). The large majority of components selected by the Boruta algorithm with high MDG scores (Table

F I G U R E 4
RF classification of the 72 mixed PD-HC samples (panel A and B) and of the 71 mixed PD-AD samples (panel C and D). Confusion matrix with sensitivity and specificity values (A, C). MDS diagram of classified samples, obtained by using the proximity between each pair of samples as a measure of distance (B, D). Red dots represent PD samples, blue dots HC samples, and green dots AD samples.
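As a reminder of how the sensitivity, specificity, and classification error reported with a confusion matrix are derived, here is a minimal sketch. The counts are hypothetical: they only respect the group sizes of the PD-HC comparison (36 PD and 36 HC) and are not the values of Figure 4.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity and overall error from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),    # PD samples correctly classified as PD
        "specificity": tn / (tn + fp),    # controls correctly classified as controls
        "error_rate": (fn + fp) / total,  # fraction of misclassified samples
    }


# Hypothetical counts for 36 PD (positives) and 36 HC (negatives); not study data.
metrics = classification_metrics(tp=32, fn=4, tn=33, fp=3)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```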

F I G U R E 5
MDS diagrams of Kendall correlations among component levels in the HC, PD and AD groups. The degree of clustering of points accounts for the degree of component correlation. To facilitate the understanding of the diagrams, the components were numbered and grouped (colour encoded) into different categories, based on their structural/functional similitudes and secretory origin.

To facilitate the understanding of MDS diagrams, the 57 components were subdivided into 12 categories based on their structural/functional analogies and secretory origin. The most compact cluster, in all the groups, was that of cystatins A and B (category 8, from 34 to 40, in red in Figure 5), while less compact clusters were represented by α-defensins 1-4 (category 3, from 3 to 7, in dark green), histatins (category 5, from 13 to 17, in orange), and statherin proteoforms (category 6, from 18 to 25, in black). Overall, the protein clusters were more compact in the HC group than in the pathological groups. With respect to the HC group, the following categories were more scattered: S100A9 proteoforms (category 12, from 52 to 57, in pink) in both the PD and AD groups, with S100A9sox being the most isolated component of this category; S100A8 proteoforms (category 11, components 48 and 50, in brown) and histatins in AD; aPRP (category 7, from 26 to 33, in blue) and cystatin C and S-type proteoforms (category 10, from 41 to 47, in purple) in PD. S100A9 proteoforms formed a more compact cluster in the PD than in the AD group of patients. Moreover, evaluation of the crowding of the different categories highlighted that Tβ4 (category 2, component 2, in light blue) exhibited a good proximity with α-defensins 1-4 in the AD group, and mainly with the S100A9 proteoforms in the HC and PD groups. S100A9 proteoforms, in the PD group, exhibited a stronger proximity with S100A8 proteoforms and α-defensins 1-4.
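The input to such an MDS diagram is a matrix of pairwise dissimilarities between components. The sketch below shows one plausible way to build it from Kendall correlations; the conversion `1 - tau` is an assumption (the text does not state which transformation was used), the tau-a variant shown ignores tie corrections, and the component levels are hypothetical. The MDS projection itself is not reproduced here.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs (no tie correction)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical levels of three components in five samples (illustration only).
levels = {
    "cystatin A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "cystatin B": [1.2, 2.1, 2.9, 4.4, 5.3],
    "S100A9sox": [3.0, 1.0, 5.0, 2.0, 4.0],
}

# Assumed dissimilarity: strongly correlated components end up close together.
names = list(levels)
distance = {
    (a, b): 1.0 - kendall_tau_a(levels[a], levels[b])
    for a in names
    for b in names
}

for (a, b), d in distance.items():
    if a < b:
        print(f"{a} vs {b}: distance = {d:.2f}")
```

In this toy example the two cystatins are perfectly rank-correlated and therefore coincide, mirroring the tight cystatin A and B cluster, while the third component lies far from both.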

[Table column headers: Components; for each group, 25th percentile, median, 75th percentile and F%; p value]
Results of the group comparison by the Mann-Whitney exact test (FDR < 10%), and of the multi-comparison Kruskal-Wallis test performed applying the Benjamini-Hochberg FDR.
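The Benjamini-Hochberg control of the FDR mentioned here can be sketched as follows. This is a generic implementation of the standard step-up procedure in plain Python, not the authors' code; the raw p-values are hypothetical, and the 10% threshold is the one quoted for the Mann-Whitney comparisons.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values from six component-wise tests (not study data).
p_raw = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
q_values = benjamini_hochberg(p_raw)

# Components retained at FDR < 10%.
selected = [i for i, q in enumerate(q_values) if q < 0.10]
print(q_values)
print(selected)
```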

Table 2
Mean decrease of the Gini index (MDG) of the most important components or their sum, generically indicated as components, selected by Boruta algorithm for RF classification.
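To make the MDG score concrete, the sketch below computes the decrease in Gini impurity produced by a single split on one component. In a random forest, this quantity is accumulated over all splits that use the component and averaged over the trees; the single-split function, the threshold, and the values shown are simplifying assumptions for illustration and are not study data.

```python
def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting samples at `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted_children = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted_children


# Hypothetical levels of one component in eight samples (illustration only).
values = [9.0, 8.5, 7.9, 3.5, 3.1, 2.8, 2.0, 7.5]
labels = ["PD", "PD", "PD", "PD", "HC", "HC", "HC", "HC"]

decrease = gini_decrease(values, labels, threshold=5.0)
print(f"Gini decrease for this split: {decrease:.3f}")
```

A component that separates the groups more cleanly yields a larger decrease per split, and hence a higher MDG once averaged across the forest.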
and PRP3 mono-phosphorylated. With MDG < 1 the test identified S100A9sox, α-defensins, especially α-defensin 2, Hst5, and two fragments of statherin. The confusion matrix and the sensitivity/specificity of the classifications are shown in Figure 4. Sensitivity and specificity were high for both RF classifications, with a mean error of 11% and 10% for PD-HC and PD-AD, respectively (Figure 4 panel A and C).

S100A9 and S100A8 can also play a dual role in oxidative stress conditions: on the one hand they contribute to the generation of reactive oxygen/nitrogen species (ROS/RNS) and the consequent exacerbation of the inflammatory status, and on the other hand they can act as ROS/RNS scavengers against oxidative stress [59]. The prevalence of different proteoforms of S100A9 and S100A8 in the salivary profiles associated with the two investigated NDs, when compared with each other and with HC, suggested a probably different role in the pathogenesis of PD and AD.

The cluster of α-defensins 1-4 showed a great proximity, in the correlation analysis, with that of S100A9 in the HC and PD groups, suggesting a functional relationship associated with their common neutrophil origin. Our results were in accord with those obtained by other research groups, indicating that the inflammatory condition associated with NDs is present not only at brain level, but also in other body districts. Watt et al. demonstrated that the levels of α-defensins 1-2 were elevated in both CSF and sera of AD patients [73]. Williams et al. proposed that neuropathological alterations might be associated with abnormal expression and/or regulation of antimicrobial peptides, including defensins