Embracing Monogenic Parkinson's Disease: The MJFF Global Genetic PD Cohort


ABSTRACT: Background: As gene-targeted therapies are increasingly being developed for Parkinson's disease (PD), identifying and characterizing carriers of specific genetic pathogenic variants is imperative. Only a small fraction of the estimated number of subjects with monogenic PD worldwide are currently represented in the literature, and availability of clinical data and clinical trial-ready cohorts is limited.
Objective: The objectives are to (1) establish an international cohort of affected and unaffected individuals with PD-linked variants; (2) provide harmonized and quality-controlled clinical characterization data for each included individual; and (3) further promote collaboration of researchers in the field of monogenic PD.
Methods: We conducted a worldwide, systematic online survey to collect individual-level data on individuals with PD-linked variants in SNCA, LRRK2, VPS35, PRKN, PINK1, DJ-1, as well as selected pathogenic and risk variants in GBA, and corresponding demographic, clinical, and genetic data. All registered cases underwent thorough quality checks, and pathogenicity scoring of the variants and genotype-phenotype relationships were analyzed.
Results: We collected 3888 variant carriers for our analyses, reported by 92 centers (42 countries) worldwide. Of the included individuals, 3185 had a diagnosis of PD (ie, 1306 LRRK2, 115 SNCA, 23 VPS35, 429 PRKN, 75 PINK1, 13 DJ-1, and 1224 GBA) and 703 were unaffected (ie, 328 LRRK2, 32 SNCA, 3 VPS35, 1 PRKN, 1 PINK1, and 338 GBA). In total, we identified 269 different pathogenic variants; 1322 individuals in our cohort (34%) were indicated as not previously published.
Conclusions: Within the MJFF Global Genetic PD Study Group, we (1) established the largest international cohort of affected and unaffected individuals carrying PD-linked variants; (2) provide harmonized and quality-controlled clinical and genetic data for each included individual; (3) promote collaboration in the field of genetic PD with a view toward clinical and genetic stratification of patients for gene-targeted clinical trials.
© 2023 The Authors. Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.
Key Words: Parkinson's disease; monogenic PD

Rapidly advancing sequencing technologies offer new and cost-effective approaches to increasingly define genetic subtypes of common diseases. An illustrative example is Parkinson's disease (PD), which can be genetically stratified into subgroups of patients with well-established, albeit individually rare, genetic forms of PD. These are due to pathogenic variants in LRRK2, SNCA, VPS35, PRKN, PINK1, DJ-1, and GBA, the latter acting as the strongest known genetic risk factor of PD. 1 As a relatively common disease in which genetic subtypes account for only up to 10% of cases, 2 PD is not considered a hereditary disorder per se. As genetic testing is not a common or even standard element of the diagnostic workup due to the absence of gene-specific therapies, it is currently most often performed in a research setting. However, scientific interest in publications on clinical-genetic screening studies of well-established PD genes is continuously declining, whereas the advent of first gene-targeted therapies 3,4 immediately calls for well-characterized clinical trial-ready cohorts of variant carriers.
To address the lack of systematic data resources on monogenic PD, our team established the Movement Disorder Society Genetic Mutation Database (MDSGene, www.mdsgene.org). Although the actual number of PD patients with a genetic cause is estimated at 650,000, ie, 10% of the 6.5 million PD patients worldwide, 2,5 only a small fraction (n = 2120) of monogenic PD patients with individual patient information are contained in the international medical literature published in the English language. Availability of quantitative clinical data is limited, and there is a strong focus on motor symptoms and on select ethnicities in the literature. 6,7 However, frequency, clinical expression, and penetrance of genetic variants may vary considerably across different populations and ethnicities. 8 For example, a landmark study comparing patients with the same pathogenic variant in LRRK2 in a Norwegian and a Tunisian sample revealed a much higher proportion of affected variant carriers among the Tunisians than the Norwegians at a 60-year age cut-off. Knowing, understanding, and considering these population-specific factors facilitates the composition of study samples tailored to specific research questions or clinical trials.
The MDSGene resource served to systematically identify researchers following monogenic PD patients, 98% of whom expressed interest to jointly build up the MJFF Global Monogenic PD Study Group 8 to address the following aims: (1) establish an international cohort of individuals with PD-linked variants; (2) provide harmonized and quality-controlled clinical characterization data for each included individual; (3) further promote the collaboration of researchers in the field of monogenic PD with a view toward demographic, clinical, and genetic stratification of patients for gene-targeted clinical trials.

Data Collection Process
To collect individual-level data on the patients that had been reported to us in the first phase of the MJFF Global Genetic PD Project, 9 we developed an online survey. We focused on carriers of variants in genes associated with monogenic PD (LRRK2, SNCA, VPS35, PRKN, PINK1, DJ-1), but also included variants in the strongest genetic risk factor for PD, ie, GBA. Variants in other PD-linked genes were not included in the analyses (Appendix S1: Table 1). We collected detailed genetic information alongside demographic data, disease status, pedigrees, motor scales, nonmotor scales, risk factors, and medication (31 items, Appendix S1: Table 2). All members of the MJFF Global Genetic PD Study Group were invited to participate, and new members were included upon recommendation or request. The survey was open from October 2018 to March 2019, including two rounds of reminders and additional customized extensions of the deadline upon request by several study centers. After 1 year, from September to October 2020, we reopened the survey for its first annual update and invited members of the study group to update their data and to add newly identified individuals with PD-linked variants. An important part of the data collection process was the communication with study centers to keep them informed about the project, to address any questions regarding the survey, and to ensure a high quality of the collected data.

Nomenclature
The nomenclature of the genes follows the recommendations of the HUGO Gene Nomenclature Committee (www.genenames.org), with the exception of PARK7, which we refer to as DJ-1. Variants are annotated corresponding to the following transcript IDs: LRRK2: NM_198578.

Inclusion and Exclusion Criteria
All registered variant carriers underwent thorough quality checks regarding both clinical and genetic data (Fig. 1). The mandatory minimal data set for eligible samples comprised information on the genetic variant, sex, disease status, and age at onset. In case of any missing or contradictory information, we asked the submitting researcher for clarification. Duplicate submissions and samples with an unresolved clinical or genetic status were excluded from further analyses.
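The quality-check logic described above can be illustrated with a minimal sketch. It is not the pipeline used for the cohort: the record layout, the field names, the duplicate key (center plus local identifier), and the assumption that age at onset is required only for affected individuals are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Submission:
    # Hypothetical record layout; the survey itself comprised 31 items.
    center_id: str
    individual_id: str
    variant: Optional[str]
    sex: Optional[str]
    disease_status: Optional[str]  # "affected" or "unaffected"
    age_at_onset: Optional[int]


def missing_fields(s: Submission) -> list[str]:
    """Return the mandatory minimal-data-set items that are absent."""
    missing = []
    if not s.variant:
        missing.append("variant")
    if not s.sex:
        missing.append("sex")
    if s.disease_status not in ("affected", "unaffected"):
        missing.append("disease_status")
    # Assumption: age at onset can only be required for affected individuals.
    if s.disease_status == "affected" and s.age_at_onset is None:
        missing.append("age_at_onset")
    return missing


def triage(submissions):
    """Split submissions into eligible, to-be-queried, and duplicate records."""
    seen = set()
    eligible, queries, duplicates = [], [], []
    for s in submissions:
        key = (s.center_id, s.individual_id)  # hypothetical duplicate key
        if key in seen:
            duplicates.append(s)
            continue
        seen.add(key)
        gaps = missing_fields(s)
        if gaps:
            queries.append((s, gaps))  # ask the submitting researcher
        else:
            eligible.append(s)
    return eligible, queries, duplicates
```

Records in `queries` correspond to cases for which the submitting researcher would be asked for clarification; only those that remain unresolved would be excluded.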

Pathogenicity Scoring
Variants of eligible individuals underwent pathogenicity scoring. The presumed pathogenicity of a genetic variant was taken from MDSGene for previously scored variants or assessed using MDSGene criteria 6 (https://mdsgene.org/methods). The score is based on four items: segregation, variant frequency in patients and controls, in-silico prediction using the Combined Annotation Dependent Depletion (CADD) score (http://cadd.gs.washington.edu/), and functional evidence extracted from published in-vitro and in-vivo studies. Based on these categories, a pathogenicity score was devised, and variants were classified as definitely pathogenic, probably pathogenic, possibly pathogenic, or benign. 6 The MDSGene pathogenicity scoring was designed for causative variants in monogenic disease and is not applicable to variants with a minor allele frequency (MAF) >1%, such as risk variants in GBA.
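The structure of such a score can be sketched as follows. The four evidence items, the four classes, and the 1% MAF limit come from the text above; the point values per item and the class cut-offs are not given here, so the sketch deliberately takes the cut-offs as a caller-supplied argument rather than asserting any values. The names `Evidence` and `classify` are hypothetical.

```python
from dataclasses import dataclass

MAF_LIMIT = 0.01  # scoring is not applicable to variants with MAF > 1%


@dataclass(frozen=True)
class Evidence:
    # Points per item; the actual point scheme is defined by MDSGene
    # and is NOT reproduced here.
    segregation: int
    frequency: int
    in_silico: int
    functional: int

    def total(self) -> int:
        return self.segregation + self.frequency + self.in_silico + self.functional


def classify(evidence: Evidence, maf: float, cutoffs: tuple) -> str:
    """Map a summed score to a class.

    `cutoffs` is (definite, probable, possible): the minimum totals for each
    class, in descending order. They must be supplied by the caller.
    """
    if maf > MAF_LIMIT:
        return "not applicable"
    definite, probable, possible = cutoffs
    score = evidence.total()
    if score >= definite:
        return "definitely pathogenic"
    if score >= probable:
        return "probably pathogenic"
    if score >= possible:
        return "possibly pathogenic"
    return "benign"
```

Checking the MAF first mirrors the stated restriction: common risk variants never enter the score-based classification.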

The MJFF Global Genetic PD Cohort
To establish the cohort, we contacted researchers from 232 centers all over the world and obtained data from 92 centers in 42 countries. In total, 5571 cases were registered in our database, of whom 3888 were included in our analyses (1683 cases were excluded; for details, see Fig. 1).
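As a simple consistency check, the case counts given here and the per-gene counts reported in the abstract can be reconciled in a few lines. The variable names are illustrative; all numbers are taken from the text.

```python
registered = 5571
excluded = 1683
included = registered - excluded

# Per-gene counts as reported in the abstract.
affected = {"LRRK2": 1306, "SNCA": 115, "VPS35": 23, "PRKN": 429,
            "PINK1": 75, "DJ-1": 13, "GBA": 1224}
unaffected = {"LRRK2": 328, "SNCA": 32, "VPS35": 3, "PRKN": 1,
              "PINK1": 1, "GBA": 338}

n_affected = sum(affected.values())
n_unaffected = sum(unaffected.values())

# The cohort flow and the per-gene breakdown agree.
assert included == 3888
assert n_affected == 3185
assert n_unaffected == 703
assert n_affected + n_unaffected == included

centers_contacted, centers_contributing = 232, 92
center_response = centers_contributing / centers_contacted
exclusion_rate = excluded / registered

print(f"Contributing centers: {center_response:.1%}")  # 39.7%
print(f"Excluded cases:       {exclusion_rate:.1%}")   # 30.2%
```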
The breakdown of centers per continent is as follows: Europe: n = 48; Asia: n = 15; North America: n = 13; South America: n = 8; Australia: n = 3; Africa: n = 2 (Fig. 3A). About 75% of the world population (5.9 billion) inhabits the countries of origin included in our cohort, whereas the countries of 25% of the world population (2 billion individuals) are not yet represented in the MJFF Global Genetic PD Cohort (Fig. 3B).

Data Completeness
Per inclusion criteria, basic data such as age at onset were complete for all participants. Availability of clinical data across the cohort ranged from more basic features (77% for disease duration) to more complex assessment of nonmotor symptoms (41% for cognition). Motor scales were available for 51% (MDS-UPDRS or UPDRS) and 47% (Hoehn and Yahr Stage) of the individuals with PD, respectively. Information on medication was reported for 60% of the sample, and risk factor data (smoking, caffeine) were available for a subset (20%) of individuals.
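Completeness figures of this kind are per-item proportions of non-missing values. A minimal sketch of the calculation, assuming records are dictionaries with hypothetical field names and that a missing value is stored as `None` or omitted:

```python
def completeness(records, fields):
    """Percentage of records with a non-missing value, per field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: 100.0 * sum(1 for r in records if r.get(field) is not None) / total
        for field in fields
    }


# Toy data for illustration only; not cohort data.
toy_records = [
    {"disease_duration": 5, "cognition": None},
    {"disease_duration": 7, "cognition": 27},
    {"disease_duration": None, "cognition": None},
    {"disease_duration": 3},
]
toy_result = completeness(toy_records, ["disease_duration", "cognition"])
print(toy_result)  # {'disease_duration': 75.0, 'cognition': 25.0}
```

Using the full cohort as the denominator for every item, as here, is one possible convention; for motor scales the text instead uses individuals with PD as the denominator.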

Gene-Specific Findings
The median age at onset of PD was younger in individuals with variants in recessively inherited genes than in those with variants in dominantly inherited genes and GBA (Fig. 4 and Appendix S1: Tables 4-6).

Genetic Data and Pathogenicity Scoring
Across all cases (including monoallelic cases for PRKN, PINK1, and DJ-1), we found 266 different variants with 22% classified as definitely pathogenic, 48% as probably pathogenic, 30% as possibly pathogenic, and the four included GBA variants. Missense variants represent the most frequent variant type across all genes (84%) as well as for all genes individually, except for PRKN, in which structural variations were most common (51%). Candidate gene testing was the most frequently reported genetic test (42%), followed by PD gene panel (18%, Fig. 5).

Comparison With Published Data (MDSGene)
A total of 1275 individuals in our cohort (32%) were reported as not previously published. Comparing the numbers of individuals in our cohort with those of already published individuals curated in the MDSGene database, our cohort includes fewer individuals for most genes (79% for SNCA, 34% for VPS35, 65% for PRKN, 89% for PINK1, and 52% for DJ-1), but almost twice as many individuals with pathogenic variants in LRRK2 (181%). MDSGene data are overall comparable to data on sex, age at onset, and variant spectrum from our cohort for the most commonly mutated dominant (LRRK2) and recessive (PRKN) genes (Appendix S1: Figs. 51 and 52).

The MJFF Global Genetic PD Study Group
The MJFF Global Genetic PD Study Group comprises 70 members initially identified through a search of corresponding authors of articles describing patients with monogenic PD included in the MDSGene database, 10 members additionally included from the Genetic Epidemiology of Parkinson's Disease (GEoPD) Consortium, and 90 (self-)referred members. All clinical and genetic information is being stored in a searchable database similar to the MDSGene database (www.mdsgene.org) that will be made available via the website of the Global Parkinson's Genetics Project (www.gp2.org) in the first quarter of 2023 upon completion of the ethical-legal framework for this database. A Steering Committee has been established and oversees the database as well as data use and access. Project suggestions from the study group or from external researchers will be reviewed by the Steering Committee for scientific and ethical content, as well as for potential overlap with ongoing analyses, to avoid duplication of efforts and to promote collaboration among all interested researchers in the best possible way. The network welcomes new members on a rolling basis, and all current members are contacted once a year for an update of potential new variant carriers to be included in the project. Communication is organized mainly via group or personal email by personnel at the coordinating site in Lübeck and has so far included 15 personal emails per data contributor. Due to the SARS-CoV-2 pandemic, in-person meetings at international conferences have so far been possible only in 2018 and 2019 and are expected to resume in 2022.

Discussion
The MJFF Global Genetic PD Cohort is the first large-scale international collection of individuals with PD-linked variants. Although 10% of the global PD population is expected to carry a pathogenic variant in LRRK2, SNCA, VPS35, PRKN, PINK1, DJ-1 or variants in GBA, published clinical data are overall limited and non-systematic, and no well-defined clinical trial-ready cohort is available to date. The lack of a genetic testing routine that continuously identifies patients with genetic forms of a progressive degenerative disorder, as is the case in PD, limits the availability of a clinical trial-ready cohort. For example, the recent antisense oligonucleotide trial in Huntington's disease, 11 which currently affects an estimated 390,000-780,000 patients worldwide, recruited four patients per day. In contrast, the MOVES-PD trial (NCT02906020) in PD patients with pathogenic GBA variants, who comprise 8.5% of all PD patients, ie, an expected 550,000 individuals, was able to include only one patient every 4 days. Unlike for PD, there are well-established networks for Huntington's disease, as well as for other monogenic disorders, such as the European Huntington's Disease Network (EHDN; http://www.ehdn.org). The MJFF Global Genetic PD Cohort and Study Group aims to close this gap for hereditary PD, which represents a considerable fraction of all PD and where several promising therapeutic options targeting specific genes or pathways have been entering the clinical trial stage.
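The difference between the two recruitment rates quoted above can be made explicit with a short calculation. The rates are taken from the text; the target of 100 participants is an arbitrary figure chosen purely for illustration.

```python
hd_per_day = 4.0        # Huntington's disease trial: four patients per day
gba_per_day = 1.0 / 4   # MOVES-PD: one patient every 4 days

rate_ratio = hd_per_day / gba_per_day


def days_to_recruit(target: int, per_day: float) -> float:
    """Days needed to enroll `target` participants at a constant rate."""
    return target / per_day


illustrative_target = 100  # arbitrary, for illustration only
hd_days = days_to_recruit(illustrative_target, hd_per_day)
gba_days = days_to_recruit(illustrative_target, gba_per_day)

print(f"Rate ratio: {rate_ratio:.0f}x")                       # 16x
print(f"Days for {illustrative_target}: {hd_days:.0f} vs {gba_days:.0f}")  # 25 vs 400
```

At these rates, recruitment in the Huntington's disease trial was 16 times faster, although the estimated eligible populations (390,000-780,000 vs 550,000) are of similar size.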
Although the need for clinical trial-ready cohorts is undisputed, the MJFF Global Genetic PD Cohort serves two additional important purposes: First, it provides carefully quality-controlled clinical and genetic data with detailed phenotypic information, including scores for motor and nonmotor assessments. Second, it includes all available variant carriers followed by the contributing centers, which specifically encompasses unpublished ones representing about one third of our cohort, and more detailed individual-level clinical information on those individuals who have already been included in publications. As a special feature of our cohort, we report whether a participant is still available for future research projects; in addition, the majority of researchers are willing to collaborate and to identify study participants for future projects. Our approach thereby counteracts the growing trend toward less reporting of variant carriers in the literature and the related problem of publication bias toward patients with atypical presentations, as genotype-phenotype studies of well-known genetic conditions are increasingly difficult to publish in traditional medical or genetic journals.
Our rigorous quality control, strongly supported by the high degree of responsiveness and support of the contributing centers, resulted in the removal of about a third of all initially submitted variant carriers. Reflecting global mobility and migration, we were able to include individuals originating from 65 countries, although our contributing centers were located in only 42 different countries. We tried to be as inclusive as possible by combining a systematic recruitment approach with "spreading-the-word" efforts and were able to cover a significant proportion of countries across the globe, which harbor about three quarters of the world population. Notably, however, in many particularly populous parts of the world, we could only include a relatively small number of centers, so that our recruitment efforts resulted in overrepresentation of Europe, parts of Asia, and North America, as also reflected by "white" being by far the most common ethnicity (91%) in our data set.
The clinical and genetic findings in our cohort are well compatible with previous descriptions, which is at least partially driven by the fact that about two-thirds of our cohort constitute previously published patients represented in the MDSGene Database, albeit now with much more comprehensive clinical information available and information on availability for follow-up studies (eg, 70% of the participants can be recontacted). As expected for Mendelian forms of PD, women account for about half of all of the described patients in our cohort. Median ages of onset range from 34 years (DJ-1) to 57 years (LRRK2). Interestingly, the largest share (>40%) of variant carriers were identified by candidate gene sequencing, whereas panel sequencing was performed in only 20% of the patients. With the exception of PRKN, where half of the described variants were gene dosage changes, point mutations were by far the most prevalent variant type.
Limitations of the current MJFF Global Genetic PD Cohort are its predominant inclusion of white individuals and its limited outreach to underrepresented populations, including the lack of participants from the African continent, and overrepresentation of certain countries due to a higher frequency of specific pathogenic variants in select populations, resulting in easier and more frequent genetic testing for these variants. Furthermore, the data comprise a relatively small minimal data set with gaps for more detailed clinical information beyond the minimal data set and limited availability of structured information on ethnicity. Notably, additional bias will have been introduced due to a focus on tertiary referral centers and academic settings, as well as variable access to genetic testing resources in different countries. In keeping with the latter notion, there has been heterogeneous assessment of pathogenic variants across sites, ranging from single gene sequencing to panels and exomes, thereby affecting which variants are detectable and, consequently, the frequency and type of pathogenic variants identified. Lack of universally accepted PD genetic testing guidelines and methods promotes this heterogeneity further. Strengths include the large amount of carefully curated clinical and genetic data on nearly 4000 PD variant carriers, build-up of a strong and growing global network of doctors and researchers following PD variant carriers, a sustainable and user-friendly digital infrastructure for regular updates of the cohort, the timeliness of the effort while a number of clinical trials are already actively searching for eligible patients, inclusion of nonmanifesting carriers enabling the study of possible modifying factors of penetrance, and establishment of a cohort for potential future neuroprotective trials.
Regarding future perspectives, we are completing the development of a searchable database that will be made publicly available to facilitate and democratize data access, while all communication with patients and unaffected variant carriers will rest with the local centers in a decentralized fashion to protect patient confidentiality and comply with cultural, ethical, and legal requirements at the respective local centers.Additional future aims and opportunities include (1) in-depth data mining and inclusion of all potentially pathogenic variants (eg, in GBA); (2) further expansion of the study group and cohort to better reflect underrepresented populations; these aims will be achieved in conjunction with GEoPD and the recently established Global Parkinson's Genetics Program (GP2) 12 ; (3) performing regular annual updates to enable a sustainable and current resource; (4) creating a world map of genetic PD centers and facilities ranging from research facilities to information on clinical trial options to take international research and translational collaboration in PD genetics to a new level, which may also serve as a model for other rare disorders.

FIG. 1. Number of cases registered in the online survey and number of cases excluded after quality control and evaluating pathogenicity including reasons for exclusion.

FIG. 3. (A) Countries of origin of individuals in the MJFF Global Genetic PD Cohort. This figure displays numbers for reported individuals with variants in Parkinson's disease (PD)-associated genes (including LRRK2, SNCA, VPS35, PRKN, PINK1, DJ-1, GBA, and also monoallelic carriers of variants in PRKN, PINK1, and DJ-1) with and without a diagnosis of PD (numbers after the slash represent subjects without PD). Missing data for 941 subjects (22%), mixed origin for six subjects (0.1%). Country names are abbreviated using the two-letter codes defined in ISO-3166-1 alpha-2. (B) Countries harboring centers that submitted individuals to be included in the MJFF Global Genetic PD Cohort. This figure highlights countries with centers participating in the MJFF Global Genetic PD Project in blue, and countries shaded in gray are not yet reflected in the cohort (for details, see Appendix S1: Supplement 3).