Extracellular vesicles from bodily fluids for the accurate diagnosis of Parkinson's disease and related disorders: A systematic review and diagnostic meta‐analysis

Abstract
Parkinsonian disorders, including Parkinson's disease (PD), multiple system atrophy (MSA), dementia with Lewy bodies (DLB), corticobasal syndrome (CBS) and progressive supranuclear palsy (PSP), are often misdiagnosed due to overlapping symptoms and the absence of precise biomarkers. Furthermore, there are currently no methods to ascertain the progression and conversion of prodromal conditions such as REM behaviour disorder (RBD). Extracellular vesicles (EVs), containing a mixture of biomolecules, have emerged as potential sources for parkinsonian diagnostics. However, inconsistencies in previous studies have left their diagnostic potential unclear. We conducted a meta-analysis, following PRISMA guidelines, to assess the diagnostic accuracy of general EVs isolated from various bodily fluids, including cerebrospinal fluid (CSF), plasma, serum, urine or saliva, in differentiating patients with parkinsonian disorders from healthy controls (HCs). The meta-analysis included 21 studies encompassing 1285 patients with PD, 24 with MSA, 105 with DLB, 99 with PSP, 101 with RBD and 783 HCs. Further analyses were conducted only for patients with PD versus HCs, given the limited number of studies for other comparisons. Using bivariate and hierarchical summary receiver operating characteristic (HSROC) models, the meta-analysis revealed moderate diagnostic accuracy in distinguishing patients with PD from HCs, with substantial heterogeneity and publication bias. The trim-and-fill method revealed at least two missing studies with null or low diagnostic accuracy. CSF-EVs showed better overall diagnostic accuracy, while plasma-EVs had the lowest performance. General EVs demonstrated higher diagnostic accuracy compared to CNS-originating EVs, which are more time-consuming, labour- and cost-intensive to isolate. In conclusion, while holding promise, utilizing biomarkers in general EVs for PD diagnosis remains unfeasible due to existing challenges. The focus should shift toward harmonizing the field through standardization, collaboration, and rigorous validation. Current efforts by the International Society for Extracellular Vesicles (ISEV) aim to enhance the accuracy and reproducibility of EV-related research through rigor and standardization, and to bridge the gap between theory and practical clinical application.


INTRODUCTION
Motor symptoms such as slowness of movement (bradykinesia), stiffness (rigidity), and shaking (tremor) are hallmarks of a collection of neurodegenerative disorders known as parkinsonian disorders. Among these, Parkinson's disease (PD) is the most prevalent (Poewe et al., 2017). Other notable but rarer conditions in this group are multiple system atrophy (MSA), dementia with Lewy bodies (DLB), progressive supranuclear palsy (PSP), and corticobasal syndrome (CBS) (Armstrong & Okun, 2020). Although these diseases are characterized by distinct pathophysiologies with differences in the proteins involved, affected cells, and brain regions, they are often misdiagnosed by neurologists and movement disorder specialists due to symptom overlap and lack of precise biomarkers (Surguchov, 2022), especially in the early stages (Baumann, 2012; Rizzo et al., 2016; Schrag et al., 2002). Moreover, we currently cannot predict the onset of the prodromal conditions known as rapid eye movement (REM) behaviour disorder (RBD) and/or pure autonomic failure (PAF), nor their progression and conversion into synucleinopathies such as PD, MSA and/or DLB (Dauvilliers et al., 2018). Unfortunately, a definitive diagnosis can only be obtained through postmortem neuropathological examination.
Such incorrect diagnoses can carry profound consequences for patients, resulting not only in improper treatment and a deterioration of overall health but also hindering the identification of disease-modifying therapies (Surguchov, 2022). Misdiagnosis can also escalate patients' emotional distress, exacerbating feelings of uncertainty and anxiety regarding their medical situation.
Extracellular vesicles (EVs) are small, lipid-bilayer-bound entities secreted by cells, instrumental in mediating intercellular communication and orchestrating various physiological functions (Dixson et al., 2023; Surguchev et al., 2019). EVs encompass a heterogeneous mixture of biomolecules, including proteins, lipids, and nucleic acids, reflective of the state of the parent cell (Dixson et al., 2023). As such, they have been widely used as a rich source for biomarker discovery (Simeone et al., 2020).
Numerous research groups have analysed biomarkers in general bodily fluid-isolated EVs (Upadhya & Shetty, 2021) or central nervous system (CNS)-originating EVs (Dutta et al., 2023) to differentiate parkinsonian disorders from one another or from healthy controls (HCs). However, independent validations and replications have consistently failed, with outcomes varying even when identical methodologies are applied.
A number of meta-analyses have investigated the utilization of biomarkers in general EVs for differentiating patients with PD from HCs (Nila et al., 2022) or CNS-originating EVs for distinguishing various parkinsonian disorders from one another or from HCs (Taha & Ati, 2023). These studies suggested the potential for elevated concentrations of EV-associated α-synuclein in patients with PD versus HCs (Nila et al., 2022; Taha & Ati, 2023).
Additionally, a recent meta-analysis assessed the diagnostic accuracy of biomarkers in CNS-originating EVs for parkinsonian disorders and identified significant heterogeneity, variance, inconsistencies, and evidence of substantial publication bias (Taha & Bogoniewski, 2023). However, to date, no research has explored the diagnostic accuracy of biomarkers in general EVs for parkinsonian disorders.
Hence, we conducted a comprehensive systematic review and diagnostic meta-analysis, incorporating all studies aimed at distinguishing parkinsonian disorders from one another or from HCs by utilizing biomarkers in general EVs isolated from bodily fluids. The analysis compares results by the fluid from which the EVs were isolated: cerebrospinal fluid (CSF), plasma, serum, urine, or saliva. Lastly, we compared the diagnostic accuracy obtained from general EVs with the accuracy reported earlier for CNS-originating EVs isolated from plasma or serum (Taha & Bogoniewski, 2023).

METHODOLOGY
We conducted a systematic review and meta-analysis in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We restricted our research to the use of anonymized data, without collecting any personal information or involving human subjects, thereby negating the necessity for ethical approval. The protocol for this study was not registered.

Data sources and search strategy
We carried out an exhaustive search of pertinent articles using targeted search terms connected to PD and other parkinsonian disorders. This search was performed within two databases, PUBMED and EMBASE, and included articles published from the inception of these databases up to 5 August 2023. Our search terms comprised combinations such as "Parkinson's disease OR multiple system atrophy OR Lewy body dementia OR corticobasal syndrome OR progressive supranuclear palsy" AND "Extracellular Vesicle OR exosome" AND "Diagnosis". To identify further studies for inclusion, we scrutinized the reference lists of qualifying studies and performed in-depth literature reviews. Any disagreements regarding the selection of articles were settled through discussion. The detailed search strategy is presented in Table S1.

Data synthesis and statistics
In this study, we report estimates for the bivariate and hierarchical summary receiver operating characteristic (HSROC) models (Reitsma et al., 2005). The bivariate model (Nila et al., 2022) is a statistical approach used in meta-analysis for diagnostic accuracy studies. It jointly analyses sensitivity and specificity, allowing the correlation between these two measures to be accounted for across different studies. This enables a more nuanced understanding of how a particular diagnostic test performs in various settings, providing a summary of both sensitivity and specificity. Several metrics are considered in this model. Logit-transformed sensitivity is the log-odds transformation of sensitivity, ln[p/(1 - p)]; its mean and confidence interval indicate the precision of this measure. The variance of logit-transformed sensitivity assesses the variability of this measure across studies, offering insights into the uncertainty of sensitivity estimates. Logit-transformed specificity and its variance provide the same analysis for specificity. The correlation between sensitivity and specificity quantifies the degree of association between these two metrics across studies, shedding light on their interdependence. Lastly, the AUC and partial AUC evaluate the overall diagnostic performance of the test, with the partial AUC restricted to a defined range of false positive rates (FPRs) and therefore describing discriminative ability within that specific FPR range.
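The logit transformation described above can be illustrated for a single study. The following is a minimal sketch only: the function names, the 0.5 continuity correction and the example counts are assumptions made for illustration, and the block does not reproduce the random-effects bivariate model itself.

```python
import math

def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def logit_proportion(events, total, z=1.96):
    """Logit of events/total, its approximate variance, and a back-transformed CI.

    A 0.5 continuity correction (an assumption of this sketch) keeps the
    estimate finite when the observed proportion is 0 or 1.
    """
    a = events + 0.5
    b = (total - events) + 0.5
    est = math.log(a / b)          # logit-transformed proportion
    var = 1.0 / a + 1.0 / b        # approximate variance on the logit scale
    half = z * math.sqrt(var)
    return est, var, (inv_logit(est - half), inv_logit(est + half))

# Hypothetical study: 45 of 50 patients test positive (not data from this review).
est, var, (ci_low, ci_high) = logit_proportion(45, 50)
print(f"logit sensitivity = {est:.3f}, variance = {var:.3f}")
print(f"95% CI on the proportion scale: {ci_low:.3f} to {ci_high:.3f}")
```

Working on the logit scale keeps the back-transformed confidence limits inside the interval from 0 to 1, which is one reason the bivariate model pools logit-transformed values rather than raw proportions.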
On the other hand, the HSROC model (Trikalinos et al., 2012) offers a more comprehensive view of diagnostic accuracy. It considers both the between-study variability and the threshold effect, which refers to variations in sensitivity and specificity due to different cut-off points or thresholds used in various studies. The HSROC model summarizes the overall diagnostic accuracy while accounting for these variations, providing a summary ROC curve that spans the different threshold points. In particular, lambda (Λ) is the accuracy parameter, summarizing on the log diagnostic odds ratio (DOR) scale how well the test separates patients from controls. Theta (Θ) is the threshold parameter, capturing the trade-off between sensitivity and specificity that arises from the different cut-offs used across studies. Beta (β) is the shape parameter, quantifying how the DOR changes with the threshold and hence the asymmetry of the summary ROC curve. The variances of Λ and Θ reflect the between-study variability in accuracy and threshold, respectively. These parameters collectively provide insights into how threshold variations impact diagnostic accuracy and the shape of the receiver operating characteristic curve, enhancing our understanding of a test or biomarker's performance across diverse studies. Overall, the HSROC model represents a more sophisticated approach to combining information from various studies, offering a global summary of the diagnostic ability of a particular test or biomarker.
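For reference, the HSROC model is commonly written in the Rutter and Gatsonis form sketched below. This parameterization is assumed here for illustration, as the text does not state which formulation or software was used. In it, $\pi_{ij}$ is the probability of a positive test in study $i$ for disease group $j$, $X_{ij}$ codes disease status, $\theta_i$ and $\alpha_i$ are the study-level threshold and accuracy, and $\Theta$, $\Lambda$ and $\beta$ are the summary threshold, accuracy and shape parameters.

```latex
\operatorname{logit}(\pi_{ij}) = \left(\theta_i + \alpha_i X_{ij}\right)\exp\!\left(-\beta X_{ij}\right),
\qquad
X_{ij} =
\begin{cases}
+\tfrac{1}{2} & \text{patients} \\
-\tfrac{1}{2} & \text{controls}
\end{cases}

\theta_i \sim \mathcal{N}\!\left(\Theta,\ \sigma_\theta^2\right),
\qquad
\alpha_i \sim \mathcal{N}\!\left(\Lambda,\ \sigma_\alpha^2\right)
```

When $\beta = 0$ the summary curve is symmetric and the log DOR equals $\alpha_i$ at every threshold, so that $\Lambda$ can be read directly as a summary log DOR.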
Begg's rank correlation (Begg & Mazumdar, 1994), Egger's (Egger et al., 1997) and Deeks' (Deeks et al., 2005) regression tests, funnel plots, and the trim-and-fill method (Shi & Lin, 2019) were used to evaluate publication bias (Lin & Chu, 2018). Funnel plots, which plot effect sizes against their precision, serve as a primary visual tool for detecting asymmetry that may hint at publication bias. Begg's rank correlation (Begg & Mazumdar, 1994) assesses bias by determining the rank correlation between the effect sizes and their respective variances; a significant correlation suggests the presence of bias. Egger's regression test (Egger et al., 1997) evaluates funnel plot asymmetry by regressing the standardized effect sizes against their precision, with a non-zero intercept signifying potential small-study effects or biases. Deeks' regression test (Deeks et al., 2005), tailored for diagnostic meta-analyses, regresses the log diagnostic odds ratio against the inverse of the square root of the effective sample size; a significant slope indicates potential publication bias. Lastly, the trim-and-fill method (Shi & Lin, 2019) operates on the assumption of a symmetric funnel plot in the absence of bias. This method 'trims' the asymmetrical studies and 'fills' in estimated missing ones, providing an adjusted effect size.
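As an illustration of the regression-based tests, Deeks' test can be sketched as a weighted least-squares regression of the log DOR on the inverse square root of the effective sample size, weighted by the effective sample size. The function name, the 0.5 continuity correction and the example counts below are hypothetical, and the p-value is left out because it requires a t-distribution routine outside the standard library.

```python
import math

def deeks_test(studies):
    """Deeks' funnel plot asymmetry regression (illustrative sketch).

    studies: list of (tp, fp, fn, tn) counts, one tuple per study.
    Returns (intercept, slope, t_statistic_of_slope, degrees_of_freedom).
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        n_patients, n_controls = tp + fn, fp + tn
        # Effective sample size, used both as predictor (1/sqrt) and as weight.
        ess = 4.0 * n_patients * n_controls / (n_patients + n_controls)
        # 0.5 continuity correction (assumption) so that ln(DOR) is always finite.
        a, b, c, d = (cell + 0.5 for cell in (tp, fp, fn, tn))
        xs.append(1.0 / math.sqrt(ess))
        ys.append(math.log((a * d) / (b * c)))
        ws.append(ess)

    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    df = len(studies) - 2
    rss = sum(w * (y - intercept - slope * x) ** 2 for w, x, y in zip(ws, xs, ys))
    se_slope = math.sqrt(rss / df / sxx)
    return intercept, slope, slope / se_slope, df

# Hypothetical counts in which smaller studies report larger DORs.
example = [(18, 2, 2, 18), (35, 10, 15, 40), (80, 30, 20, 70), (150, 60, 50, 140)]
intercept, slope, t_stat, df = deeks_test(example)
print(f"slope = {slope:.2f}, t = {t_stat:.2f}, df = {df}")
```

A positive slope means that ln(DOR) rises as the effective sample size shrinks, which is the small-study pattern the test is designed to flag.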
Finally, if there were more than two ROC models available within the same study, we selected the one with the highest AUC, indicating the most effective biomarker performance in distinguishing patients with parkinsonian disorders from one another or from HCs, for inclusion in the meta-analysis.
The majority of the studies analysed were found to be of high quality, as outlined in Table S3. Nevertheless, the method of sampling was not transparently reported, which prevented a proper evaluation of the risk of bias in the selection of patients; we therefore deemed this risk unclear. One study was deemed to be at high risk of bias (Wang et al., 2019) due to the exclusion of four participants who had undetectable levels of calbindin. For biomarker measurement in EVs, the risk of bias was deemed low since this objective measure is not influenced by prior information about the patient's clinical condition. Regarding the flow and timing domain, the risk of bias was considered low across all studies, as the interval between clinical diagnosis and biomarker measurement could be estimated. Although the bulk of the articles (76.2%) were judged to have a low risk of bias in the reference standard domain, five studies (Lucien et al., 2022; Manna et al., 2021; Wang et al., 2019; Yan et al., 2022; Zheng et al., 2021) were identified as having a high risk of bias due to the lack of quantification using highly sensitive methods in comparison to a reference standard.
We attempted to conduct a comprehensive meta-analysis, evaluating the diagnostic accuracy of biomarkers in EVs for parkinsonian disorders against one another and against HCs. However, the number of studies in all other scenarios was ≤ 3, which precluded meaningful analyses. As such, we only conducted a meta-analysis for patients with PD versus HCs, but also included relevant descriptive statistics for other scenarios in Table 2.

Diagnosing Parkinson's disease against healthy controls
In this meta-analysis, we employed bivariate and HSROC models to measure the diagnostic test's accuracy for patients with PD versus HCs. Descriptive statistics including sensitivity, specificity, FPR, DOR, positive likelihood ratio (posLR), and negative likelihood ratio (negLR) for each individual study are summarized in Table 2. The pooled summaries of sensitivity (Figure 2a), specificity (Figure 2b), DOR (Figure 2c), posLR (Figure 2d) and negLR (Figure 2e), and the statistics of the bivariate and HSROC models (Figure 2f), are summarized in Table 3.
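The per-study descriptive statistics listed above all follow from a 2x2 table of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN). The sketch below uses hypothetical counts and a hypothetical function name; it applies no continuity correction, so it raises `ZeroDivisionError` if a relevant cell is zero.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Descriptive diagnostic accuracy metrics from one 2x2 table."""
    sensitivity = tp / (tp + fn)            # proportion of patients testing positive
    specificity = tn / (tn + fp)            # proportion of controls testing negative
    fpr = 1.0 - specificity                 # false positive rate
    pos_lr = sensitivity / fpr              # positive likelihood ratio
    neg_lr = (1.0 - sensitivity) / specificity  # negative likelihood ratio
    dor = pos_lr / neg_lr                   # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "FPR": fpr,
        "posLR": pos_lr,
        "negLR": neg_lr,
        "DOR": dor,
    }

# Hypothetical study with 100 patients and 100 controls (not data from this review).
metrics = diagnostic_metrics(tp=85, fp=21, fn=15, tn=79)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the DOR is the ratio of the two likelihood ratios, it equals (TP x TN) / (FP x FN) and combines sensitivity and specificity into a single number.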
The bivariate model revealed moderate diagnostic sensitivity (85.2%, 95% CI: 77%-91%) and specificity (78.9%, 95% CI: 73%-84%) for patients with PD versus HCs. However, the variance in sensitivity (Figure 2a) and specificity (Figure 2b) across the 21 studies indicated substantial heterogeneity, as seen graphically and by the chi-square (χ²) test for equality of sensitivities (χ² = 264.22, p < 0.0001) and specificities (χ² = 150.81, p < 0.0001). This heterogeneity could arise from differences in the isolation and quantification methodologies, the populations included, or the medium used for EV isolation in each study. Furthermore, the correlation between the logit transformations of sensitivity and specificity in the bivariate model was found to be negligible (r = -0.079, 95% CI: -0.50 to 0.37), suggesting that these two measures operate relatively independently. Lastly, the pooled AUC was 0.852 and the partial AUC, focusing on a specific range of FPRs, was 0.672, indicating fair diagnostic accuracy.
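One common way to test the equality of sensitivities across studies is a Pearson chi-square test of equal proportions, with degrees of freedom equal to the number of studies minus one. The sketch below assumes this form of the test, since the text does not state which implementation was used; the function name and counts are hypothetical. For specificities, the same function would take (TN, TN + FP) pairs.

```python
def chi_square_equal_proportions(counts):
    """Pearson chi-square statistic for equality of proportions across studies.

    counts: list of (positives, total) per study, e.g. (TP, TP + FN).
    Returns (statistic, degrees_of_freedom). The pooled proportion must lie
    strictly between 0 and 1, otherwise an expected count is zero.
    """
    pooled = sum(pos for pos, _ in counts) / sum(n for _, n in counts)
    stat = 0.0
    for pos, n in counts:
        expected_pos = n * pooled
        expected_neg = n * (1.0 - pooled)
        stat += (pos - expected_pos) ** 2 / expected_pos
        stat += ((n - pos) - expected_neg) ** 2 / expected_neg
    return stat, len(counts) - 1

# Hypothetical (true positives, number of patients) pairs; not data from this review.
example = [(45, 50), (60, 80), (28, 30), (70, 100)]
stat, df = chi_square_equal_proportions(example)
print(f"chi-square = {stat:.2f} on {df} degrees of freedom")
```

A statistic that is large relative to its degrees of freedom indicates that the study-level sensitivities differ by more than sampling error alone would explain.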
In the second part of our analysis, we utilized the HSROC model. Here, the measure of overall test accuracy, lambda (Λ), was 3.05, which translates to a DOR of approximately 21.2 (95% CI: 12.0-37.5), indicating good accuracy in distinguishing patients with PD from HCs. The theta (Θ) value was approximately -0.14, which suggests that the test performs consistently across different diagnostic thresholds. The asymmetry of the ROC curve, denoted by the beta (β) value of -0.47, shows a slight tendency towards a trade-off between sensitivity and specificity, although this was not significant (p = 0.091). Notably, the substantial variance in accuracy (σ²α = 1.52, 95% CI: 0.75-3.03) and threshold (σ²θ = 0.44, 95% CI: 0.22-0.93) across the studies highlights considerable variability in both the test's accuracy and its performance at different thresholds.
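The link between lambda and the DOR can be checked with a short calculation. The sketch below assumes that lambda is read as a log DOR, which is exact only for a symmetric summary curve (beta equal to zero); the variable names are illustrative.

```python
import math

lam = 3.05                       # reported accuracy parameter (log-DOR scale)
dor = math.exp(lam)              # back-transform to the odds-ratio scale
print(f"exp({lam}) = {dor:.1f}")

# The reported 95% CI for the DOR, mapped back to the log scale for comparison.
ci_low, ci_high = 12.0, 37.5
log_ci = (math.log(ci_low), math.log(ci_high))
print(f"log-scale CI: {log_ci[0]:.2f} to {log_ci[1]:.2f}")
```

The back-transformed value is about 21.1, which matches the reported DOR of 21.2 to within the rounding of lambda to two decimal places.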

TABLE Demographics and characteristics of the included studies for EV biomarkers.

While the diagnostic test shows good performance in diagnosing patients with PD versus HCs, as evidenced by the sensitivity, specificity, and AUC values from the bivariate model and the overall good accuracy from the HSROC model (Figure 2f), the substantial variability and heterogeneity across the studies call for cautious interpretation of these results and emphasize the need for further independent validations in diverse settings and populations.
Notably, among the studies, the presumed best biomarker offering a balance between sensitivity and specificity in a large sample is the one integrating aggregated α-synuclein and total α-synuclein (Hong et al., 2021). This finding aligns with the mechanistic understanding that in PD, when α-synuclein misfolds, cells may try to release more of it into EVs to reduce its intracellular toxic effects (Hill, 2019). However, discrepancies in methodologies and patient profiles (Table 1) pose challenges in identifying a single superior biomarker. The effectiveness of a biomarker can vary based on its application context, disease stage, or patient demographics, among many other factors, which are not standardized across the included studies.
We further assessed publication bias using Begg's correlation test (Figure 3a), Egger's test (Figure 3b), Deeks' test (Figure 3c), a funnel plot (Figure 3d), a bagplot (Figure 3e) and the trim-and-fill method (Figure 3f). All tests revealed substantial publication bias except for Deeks' test. The trim-and-fill method suggested that the unpublished studies would lie on the left side of the funnel plot (white circles in Figure 3f), with null or low DOR. This suggests inflation of the perceived diagnostic efficacy of the test, similar to what is observed with CNS-originating EVs (Taha & Bogoniewski, 2023), but to a much lower degree.

Diagnosing Parkinson's disease against healthy controls by EV isolation medium
As the size, purity, content, and reliability of EVs depend on the medium (e.g., CSF, serum, plasma, etc.) and methodology of isolation (Dhondt et al., 2023; Erdbrugger et al., 2021; Krusic Alic et al., 2022; Taha, 2023), we compared the diagnostic metrics for CSF, plasma, and serum. We did not include urine or saliva because only two studies were available for each. One study used both CSF and serum (Tong et al., 2022) and was included in both analyses. Four studies quantified biomarkers in EVs isolated from CSF, nine from plasma and four from serum for the differential diagnosis of patients with PD from HCs. A comparative summary of the models' statistics is included in Table 4.
CSF is widely used for the discovery of biomarkers for neurodegenerative conditions, including parkinsonian disorders, due to its direct connection with the brain. On the other hand, plasma and serum are often used because their collection is minimally invasive and avoids the need for a lumbar puncture. Therefore, measurement of biomarkers in EVs isolated from CSF, plasma, and serum has appealed to different groups depending on their goal.
Meta-analyses of CSF-, plasma- and serum-EVs using the bivariate and HSROC models supported these findings, with the highest diagnostic accuracy obtained for CSF (Figure 4a) versus plasma (Figure 4b) and serum (Figure 4c). However, the small number of studies using CSF (n = 4) makes drawing definitive conclusions difficult. As mentioned above, given CSF's direct connection with the brain, one would expect biomarkers in the CSF to give a more realistic picture of the brain's biochemistry than those in plasma and serum. Nonetheless, in all cases, heterogeneity was large, indicating that further studies are needed to confirm this perceived effect.
TABLE Comparison of bulk EVs versus CNS-originating EVs for diagnosing Parkinson's disease (PD) from healthy controls using a bivariate and hierarchical summary receiver operating characteristic (HSROC) model. The sensitivity, specificity, pooled area under the curve (AUC) and partial AUC, focusing on a specific range of false positive rates (FPR), are obtained using the bivariate model. The diagnostic odds ratio (DOR) is obtained from the HSROC model. EV-Extracellular vesicles. CNS-Central nervous system. SE-Standard error.


Diagnosing Parkinson's disease against healthy controls: General EVs versus CNS-originating EVs
As EVs are thought to reflect the status of the parent cell by carrying cell-state-specific messages (Dixson et al., 2023), many studies have attempted to measure biomarkers in CNS-originating EVs (Dutta et al., 2023) isolated from the blood. These studies aimed to differentially diagnose patients with PD versus HCs and other parkinsonian disorders, with the intention of gaining insights into the brain's biochemistry.
Two recent meta-analyses evaluated the levels of α-synuclein and biomarker diagnostic accuracy in CNS-originating EVs and found substantial heterogeneity, publication bias and inconsistency in the findings (Taha & Ati, 2023; Taha & Bogoniewski, 2023). As such, we compared the diagnostic accuracy of biomarkers in general EVs versus CNS-originating EVs (Taha & Bogoniewski, 2023).
Comparison of the bivariate and HSROC model statistics revealed that biomarkers in general EVs have a higher diagnostic accuracy than those in CNS-originating EVs (Table 5). Though there was substantial publication bias in both methodologies, the trim-and-fill method estimated only two missing studies out of 21 for biomarkers in general EVs versus 5 out of 16 in CNS-originating EVs (Taha & Bogoniewski, 2023), indicating that the former approach has substantially less publication bias. Moreover, both methodologies suffered from substantial heterogeneity, indicating that more rigor, standardization and independent validations across groups are needed.
We also noted that only one study involving biomarkers in CNS-originating EVs attempted to differentiate patients with PD versus HCs while also employing general EVs for the ROC analysis (Yan et al., 2022). It remains unclear why biomarkers in general EVs were not evaluated for diagnosis before transitioning to CNS-originating EVs. Notably, the isolation of CNS-originating EVs is significantly more intricate, time-consuming and labour-intensive.

DISCUSSION
The absence of clear and exact biomarkers for the definitive antemortem diagnosis of parkinsonian disorders, including PD, MSA, DLB, PSP, and CBS, frequently results in misdiagnoses, negatively affecting patients' access to suitable and prompt treatment (Baumann, 2012; Rizzo et al., 2016; Schrag et al., 2002). This issue is further compounded by the inability to predict the conversion of prodromal disease (e.g., RBD) to one of the three synucleinopathies: PD, MSA and/or DLB (Dauvilliers et al., 2018). This situation is unsettling for patients, who face uncertainty about their health and future, as well as for physicians who aim to deliver the best care.
EVs are tiny vesicles believed to be released by all cells, serving as carriers of cell-state-specific messages to both neighbouring and distant cells. They have gained traction as a popular source for biomarker discovery for parkinsonian disorders. While a few meta-analyses have examined the utility of biomarker concentrations in EVs (Nila et al., 2022) and/or CNS-originating EVs (Taha & Ati, 2023; Taha & Bogoniewski, 2023), no study to date has examined the diagnostic accuracy of biomarkers in general EVs isolated from bodily fluids for parkinsonian disorders.
The current meta-analysis included 21 studies encompassing 1285 patients with PD, 24 with MSA, 105 with DLB, 99 with PSP, 101 with RBD, and 783 HCs. Because the number of studies was low (n ≤ 3) for differentiating parkinsonian disorders from one another or MSA, DLB, PSP, CBS, or RBD from HCs, we only conducted meta-analyses for patients with PD versus HCs. We also noted that no study included a PAF cohort. This glaring omission in the literature signifies an urgent need for researchers to broaden their focus, encompassing not just patients with PD, but also other parkinsonian disorders. Addressing this disparity is vital for clinicians, as it could significantly enhance the accuracy and scope of differential diagnoses.
Both the bivariate and HSROC models revealed moderate diagnostic accuracy of this approach in distinguishing patients with PD from HCs (Figure 2f). However, substantial heterogeneity and variability across the studies, possibly due to differences in isolation and quantification methodologies, populations, and media for EV isolation, call for cautious interpretation of these results and emphasize the need for further validation. Additionally, publication bias (Figures 3a-3f) was substantial. The trim-and-fill method estimated that at least two studies with low or null diagnostic accuracy are missing (Figure 3f), suggesting a potential overestimation of the diagnostic test's efficacy.
In the analysis comparing the diagnostic metrics by CSF, plasma, and serum for differentiating patients with PD from HCs, CSF-EVs demonstrated better overall diagnostic accuracy (Figure 4a), likely due to CSF's direct connection with the brain. Plasma-EVs analysis (Figure 4b) revealed the lowest performance, with possible publication bias, while serum-EVs analysis (Figure 4c) showed moderate diagnostic accuracy without bias. Though the highest diagnostic accuracy was obtained for CSF in the meta-analysis, the small number of studies using CSF (n = 4) makes drawing definitive conclusions difficult. Despite these findings, the significant heterogeneity observed across the models emphasizes the need for further studies. The selection among CSF, plasma, and serum for isolating EVs for biomarker discovery often depends on the goal, considering factors like CSF's connection with the brain and the minimally invasive nature of plasma and serum collection, as well as the isolation and quantification methodologies available to the researchers.
The use of biomarkers in CNS-originating EVs isolated from the blood in differentiating parkinsonian disorders from one another or from HCs has been popular (Dutta et al., 2023). Though isolating CNS-originating EVs is a complicated and labour-intensive process, the hope was that they would provide a more accurate reflection of the brain's biochemistry due to their ability to carry cell-state-specific messages across the blood-brain barrier to the peripheral circulation (Shi et al., 2014).
However, a comparative analysis found that general EVs, which are simpler to isolate, demonstrated higher diagnostic accuracy than CNS-originating EVs (Table 5). Both methods were marked by substantial publication bias and large heterogeneity, but general EVs showed less bias. Specifically, the trim-and-fill method estimated 2 out of 21 versus 6 out of 15 missing studies with low or null diagnostic accuracy in general EVs versus CNS-originating EVs. The lower diagnostic performance of CNS-originating EVs could be due to some of the CNS antibodies, especially those targeting neuronal EV populations (e.g., L1CAM), cross-reacting with the α-synuclein antibody used for quantification (Norman et al., 2021). Other possible explanations include the very small number of CNS-originating EVs in the blood; the use of polymer-based precipitation kits, believed to provide a "dirty" and heterogeneous EV yield (Brennan et al., 2020), in the majority of CNS-originating EV studies in comparison to general EV studies; the lack of CSF usage for CNS-originating EV isolation; and the need for high expertise and precision in performing reproducible immunoprecipitation of CNS-originating EVs.
It is also important to note that the majority of studies have not provided sufficient details about pharmacological treatments, including their type, dosage, and administration duration, which could influence the biomarkers measured in EVs. Also, if a patient passes away, studies should report any discrepancies between the antemortem and postmortem diagnoses. Furthermore, there is a lack of data regarding participants' demographics and comorbid health conditions, important factors that could influence the biomarker concentrations found in EVs.
Lastly, researchers must not overlook the influence of preanalytical factors (Taha, 2023) and should document them thoroughly either in the methods or by utilizing the EV-TRACK platform (Van Deun et al., 2017). These include the patient's fasting state before drawing blood, the specific time of day the blood is collected, how long the collection takes, the gauge of the needle used, the exact procedure and time taken to separate the blood components and the type of container or anticoagulant used. Other critical details include how the sample is transported, whether the collection tube is kept upright or on its side, the method of centrifugation, the number of freeze/thaw cycles, the steps taken to deplete platelets, the storage conditions such as duration and temperature, treatments to remove the coagulation factors (e.g., thrombin), the method used for lysing the EVs and any processes used to freeze EVs or their lysates. All these elements must be rigorously recorded to ensure the integrity and reproducibility of the study. Because users may handle the samples used for EV isolation and subsequent biomarker quantification differently, researchers must indicate the number of persons involved in handling the samples as well as conduct subanalyses by user.
Finally, as opposed to singular studies, meta-analyses like the one presented here often paint a markedly different picture, one that can challenge and even contradict the conclusions of individual studies. By integrating data across a spectrum of studies, a meta-analysis can reveal underlying trends and discrepancies that single studies may not detect, sometimes casting doubt on emerging diagnostic approaches. This comprehensive approach prompts a re-evaluation of the evidence base and serves as a catalyst for more rigorous and standardized methodologies, ensuring that clinical practice is truly informed by the best available evidence, even when it overturns established beliefs.

CONCLUSION
While the diagnostic accuracy of biomarkers in general EVs holds promise and surpasses that of CNS-originating EVs in distinguishing patients with PD from HCs, this approach remains far from feasible for practical translational use. Key points from our meta-analysis include: (1) An emphasis on the urgent need to diversify research beyond just PD, addressing the notable gap in the literature by focusing on other parkinsonian disorders; (2) Moderate diagnostic accuracy of biomarkers within EVs in differentiating PD patients from HCs, yet substantial heterogeneity and potential overestimation due to publication bias; (3) CSF-EVs showing superior diagnostic accuracy compared to plasma and serum, albeit based on a limited number of studies; and (4) The counterintuitive finding that general EVs, simpler to isolate, offer higher diagnostic accuracy than the labour-intensive CNS-originating EVs. However, it is essential to acknowledge that both methods demonstrate substantial bias and heterogeneity. The focus of research groups should shift towards harmonizing the field by striving for reliable and accurate results. This can be achieved through intensive independent validations and standardization of preanalytical factors and methodologies. Collaboration, sharing best practices, and rigorous scientific investigation hold the potential to move this area of research from the realm of theory to practical clinical applications. Current efforts by the International Society for Extracellular Vesicles (ISEV) (Thery et al., 2018) and others (Gomes & Witwer, 2022; Van Deun et al., 2017) aim toward more rigorous reporting and standardization to enhance accuracy and reproducibility of research utilizing EVs. Lastly, authors are encouraged to report their detailed methodologies using the EV-TRACK platform (Van Deun et al., 2017) to allow for better reproducibility and accuracy.
TABLE Descriptive statistics of the diagnostic metrics of studies included in the meta-analysis.

FIGURE Diagnostic accuracy of biomarkers in extracellular vesicles (EVs) for the differential diagnosis of Parkinson's disease (PD) from healthy controls (HCs). (a-e) Univariate forest plots for sensitivity, specificity, diagnostic odds ratio (DOR), and positive (posLR) and negative (negLR) likelihood ratios, respectively. (f) Summary receiver operating characteristic (SROC) curve. The dotted circle shows the mean summary estimate of sensitivities and specificities using a bivariate model. The summary line is obtained from a hierarchical SROC model.

TABLE Meta-analysis of diagnostic accuracy for patients with Parkinson's disease (PD) versus healthy controls (HCs): summary statistics for the bivariate and hierarchical summary receiver operating characteristic (HSROC) models.

FIGURE Publication bias was assessed using (a) Begg's correlation, (b) Egger's regression, (c) Deeks' regression, (d) Deeks' funnel plot, (e) a bagplot and (f) a funnel plot after application of the trim-and-fill method for biomarkers in extracellular vesicles (EVs) for the differential diagnosis of patients with Parkinson's disease (PD) from healthy controls (HCs). Collectively, they suggested a substantial presence of publication bias. The trim-and-fill method estimated two missing studies (shown as white circles) on the left side of the figure with either small or null diagnostic accuracy. The dotted circle shows the mean summary estimate of sensitivities and specificities using a bivariate model. The summary line is obtained from a hierarchical SROC (HSROC) model.
TABLE Comparison of meta-analysis of diagnostic accuracy for patients with Parkinson's disease (PD) versus healthy controls (HCs) by medium of isolation. CSF-Cerebrospinal fluid. SE-Standard error. AUC-Area under the curve. HSROC-Hierarchical summary receiver operating characteristic.