Deficits in or preservation of basic number processing in Parkinson’s disease? A registered report

Neurodegenerative diseases such as Parkinson's disease (PD) have a huge impact on patients, caregivers, and the health‐care system. To date, the diagnosis of mild cognitive impairments in PD has been established based on domain‐general functions such as executive functions, attention, or working memory. However, specific numerical deficits observed in clinical practice have not yet been systematically investigated. PD‐immanent deterioration of domain‐general functions and domain‐specific numerical areas suggests the mechanisms of both primary and secondary dyscalculia. The current study will systematically investigate basic number processing performance in PD patients for the first time, targeting domain‐specific cognitive representations of numerosity and the influence of domain‐general factors. The overall sample consists of patients with a diagnosis of PD, according to consensus guidelines, and healthy controls. PD patients will be stratified into patients with normal cognition or mild cognitive impairment (level I‐PD‐MCI based on cognitive screening). Basic number processing will be assessed using transcoding, number line estimation, and (non)symbolic number magnitude comparison tasks. Discriminant analysis will be employed to assess whether basic number processing tasks can differentiate between a healthy control group and both PD groups. All participants will be subjected to a comprehensive numerical and a neuropsychological test battery, as well as sociodemographic and clinical measures. Study results will give the first broad insight into the extent of basic numerical deficits in different PD patient groups and will help us to understand the underlying mechanisms of the numerical deficits faced by PD patients in daily life.


| INTRODUCTION
Modern societies currently face the challenges of demographic changes, with age-associated diseases bringing medical, financial, and psychological burdens to patients, caregivers, and the healthcare system. In the elderly population above 60 years of age, the overall prevalence of mild cognitive impairment ranges between 12% and 18% (Petersen, 2016), whereas dementia affects 5% to 7% (Prince et al., 2013). Severe cognitive impairment and dementia are associated with a loss of independent functioning and social and legal competence (Sherod et al., 2009). Therefore, supporting the active participation in society and the functional independence of elderly people (with cognitive deficits) seems to be one of the most crucial challenges.
Parkinson's disease (PD) is characterized by apparent motor deficits like bradykinesia, rigidity, and resting tremor (Postuma et al., 2015). The presence of nonmotor symptoms is common in PD patients and has a substantial impact on their quality of life (Prakash et al., 2016). The development of Parkinson's disease dementia (PDD), especially in more advanced PD disease stages, leads to an increased risk for nursing home placement and mortality (Aarsland et al., 2000;Bjornestad et al., 2017). The presence of mild cognitive impairment in PD patients (PD-MCI) is associated with a shorter window of time before PDD onset and is therefore considered as one of the most important risk factors for dementia (Aarsland et al., 2017).
The onset of cognitive dysfunction varies over the disease course, and deficits can sometimes already be identified at the time of PD diagnosis using sensitive instruments (Kehagia et al., 2010;Martinez-Horta & Kulisevsky, 2019).
Societal burdens caused by PD on a regional to global scale clearly show the need to develop diagnostic instruments for the disease-immanent cognitive deficits to anticipate the burden on patients and caregivers (Dorsey et al., 2018). Considering the most recent knowledge about PD neuropathology and symptoms, different cognitive biomarkers might serve this purpose (Roheger et al., 2018).

| Parkinson's disease-Epidemiology and clinical symptoms
Parkinson's disease is the second most common neurodegenerative disease worldwide with over 6 million affected patients globally (Dorsey et al., 2018). Prevalence differs depending on sex and age with a higher incidence in men and elderly people (Hirsch et al., 2016). In terms of cognitive symptoms, the prevalence of PDD in the United States is predicted to at least double between 2015 and 2060 (Savica et al., 2018). All of these epidemiological estimates show that PD already signifies a huge concern for the public health system and will become even more crucial in the future due to demographic changes (Dorsey et al., 2018; Kowal et al., 2013).
The apparent cardinal symptoms of PD are hypokinesia and bradykinesia alongside at least rigidity and/or resting tremor (Postuma et al., 2015). Nonmotor symptoms can include dysautonomia (e.g., incontinence, sleep disorders) or be psychiatric in nature (e.g., depression;Jankovic, 2008). Furthermore, the cognitive symptoms discovered thus far imply deficits in the domains of executive functions, working memory and attention, memory, language, and visuospatial functions (Aarsland et al., 2017;Litvan et al., 2012).
Further differentiation of cognitive deficits in PD can be pursued on a continuum ranging from normal cognition (PD-NC) over the initial stage of clinically significant cognitive disorders (PD-MCI) to PDD, implying a therapeutic indication. The diagnosis of PD-MCI is based on impairments in global cognition or in at least one of the domains of executive functions, working memory and attention, memory, language, and visuospatial functions (Emre et al., 2007;Litvan et al., 2012). Deficits in PDD additionally affect the activities of daily living (ADL) such as dressing, housekeeping, and financial management (Christ et al., 2013).
Cognitive disorders in PD patients are characterized by marked heterogeneity, with 30% to 60% moving from PD-NC to PD-MCI or PD-MCI to PDD within 3-4 years (Janvin et al., 2006; Lawson et al., 2017). The development of PD-MCI can arise from frontostriatal malfunctioning (triggered by dopaminergic deficits) and posterior cortical impairments (caused by Lewy bodies and nondopaminergic deficits) with the latter being most predictive of PDD (Barker & Williams-Gray, 2014). Whether this posterior cortical profile describes one homogenous etiology is still being investigated (Monchi et al., 2016). The cognition of PD patients is further impacted by factors such as education, aging, comorbidities, and gender (Lin et al., 2018). According to recent findings, one of the most important risk factors for PDD, besides male sex, old age, and severity of motor symptoms, is the occurrence of PD-MCI (Aarsland et al., 2017). Therefore, the development of sensitive instruments for the identification of PD-MCI patients with a high risk of PDD progression is necessary to initiate timely individual and societal compensatory and supportive strategies (Martinez-Horta & Kulisevsky, 2019). Surprisingly, the domain of numerical cognition has been neglected despite the apparent deficits of PD patients observed in clinical practice.
Significance: Numerical activities in daily living are crucial for economic, medical, and legal independence of the elderly participants. As demented patients make errors with money, public transport, or medication early in the disease process, these numerical deficits need to be investigated thoroughly as they negatively impact the quality of life, mortality, and caregiver burden. Parkinson's disease (PD) is an age-related neurodegenerative disease for which little is known about the basic number processing abilities of patients. Therefore, the current study addresses basic research into numerical processing of patients with PD and may also have implications for diagnosing numerical deficits in this population.
At the PD-MCI stage, cognitive deficits are mild enough to be addressed by therapeutic interventions (Leung et al., 2015). The aim of cognitive training is to improve daily functioning and increase self-sufficiency. Within this framework, a very important domain necessary for daily functioning is numerical literacy, including financial capabilities, time management, and household planning. The management of complex numerical and arithmetic processes requires the integration of different basic number processing functions. If these functions are impaired, deficits need to be considered in therapy, remediation, and intervention. However, the lack of research in this field does not allow for any conclusions to be drawn. For this reason, the current study will systematically investigate deficits in the basic number processing abilities of PD patients based on established neurocognitive theories.

| Theories of basic number processing
Possible deficits in the number processing abilities of PD patients can be characterized by looking at the underlying functions of nonpathological numerical cognition. Neurocognitive models of number processing (i.e., the triple code model and its extensions; Dehaene & Cohen, 1995; Dehaene et al., 2003; Klein et al., 2016) consider number magnitude as the core representation of numerosity, which can be further differentiated. The approximate number system (ANS) is supposed to underlie basic numerical operations, whereas magnitude is also represented symbolically for more accurate processes (Cantlon et al., 2009; Dehaene, 2001, 2009). Whether the same number module underlies these two (non)symbolic number magnitude representations is controversial (Fias et al., 2003; Holloway et al., 2010; Lyons et al., 2012; Matejko & Ansari, 2016). The acuity of the ANS can be evaluated using nonsymbolic number magnitude comparisons. Both symbolic and nonsymbolic number representations can be addressed by studying distance effects in (non)symbolic number magnitude comparison tasks.
Furthermore, there are visuospatial magnitude representations (Dehaene et al., 2003;Nuerk et al., 2011). These can explain the so-called spatial-numerical associations, where different numerical attributes are associated with space (Cipora et al., 2018). Number line estimation can be used to assess this visuospatial representation of magnitude.
An additional prerequisite for the processing of multi-digit numbers is place × value integration. For the appropriate magnitude judgment as well as the application of mathematical operations on multi-digit numbers, distinct places and corresponding values need to be identified, manipulated, and/or integrated.
Place × value integration is revealed by the occurrence of specific errors in transcoding tasks as well as the unit-decade compatibility effect in symbolic number magnitude comparisons of two-digit Arabic numbers. To investigate all of these representations of number processing, an effect-based approach is preferred over a task-based approach and will be used in the current study (Nuerk et al., 2011).
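The unit-decade compatibility effect mentioned above rests on a simple classification of two-digit number pairs, which can be sketched as follows. This is a minimal illustration, assuming the common definition that a pair is compatible when the decade comparison and the unit comparison point to the same larger number; the function name and the "neutral" label for pairs with equal decades or equal units are hypothetical and not taken from the study materials.

```python
def unit_decade_compatibility(a: int, b: int) -> str:
    """Classify a pair of two-digit numbers for a magnitude comparison task.

    Assumed definition (not the study's stimulus code):
    - "compatible": decade and unit comparisons agree (e.g., 42 vs. 57)
    - "incompatible": decade and unit comparisons disagree (e.g., 47 vs. 62)
    - "neutral": decades or units are identical, so no conflict can arise
    """
    if not (10 <= a <= 99 and 10 <= b <= 99):
        raise ValueError("both numbers must be two-digit numbers")
    decade_a, unit_a = divmod(a, 10)
    decade_b, unit_b = divmod(b, 10)
    if decade_a == decade_b or unit_a == unit_b:
        return "neutral"
    # Same direction for decades and units means both digits favor one number.
    if (decade_a < decade_b) == (unit_a < unit_b):
        return "compatible"
    return "incompatible"
```

Slower or less accurate responses to incompatible pairs than to compatible pairs would indicate that decades and units are processed separately and then integrated.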
For basic number processing to be performed successfully, both domain-specific numerical and domain-general functions are necessary (covered by the frontoparietal network of number processing; Klein et al., 2016). On the one hand, deficits in basic number processing can be primarily based on domain-specific numerical functions such as number magnitude representation, place-value integration, or knowledge of mathematical procedures. On the other hand, secondary deficits in basic number processing can also result that are triggered by deficits in domain-general functions such as attention, working memory, or executive functions, as these functions are a prerequisite for successful number processing (Knops et al., 2017). Although there is evidence for domain-general deficits in PD patients (Litvan et al., 2012), the effect of PD-characteristic neurodegeneration on domain-specific numerical cognition will be investigated here for the first time.

| Basic number processing and Parkinson's disease
Despite the lack of research on basic number processing of PD patients, hypotheses about lost or preserved functions can be generated based on nonpathological aging effects/deficits in other neurodegenerative diseases such as Alzheimer's disease (AD), and the neuroanatomical and functional overlap between basic number processing and PD neuropathology.
Considering human ontogenesis, number processing changes with age. Domain-general factors such as attention, working memory, processing speed, and executive functions play a role in healthy aging processes concerning number processing (Hinault & Lemaire, 2016;Li et al., 2004;Schretlen et al., 2000). Numerical deficits associated with aging were found in the precision of the ANS (Halberda et al., 2012) and in financial abilities (Finke et al., 2017). This illustrates the implications of domain-specific and domain-general deficits in healthy numerical aging processes and is the reason why the current study uses a healthy elderly control group.
Research on AD has already identified several pathophysiological mechanisms and predictive biomarkers. The diversity of the deficits experienced by AD patients indicates that there may be heterogeneous profiles in the domain of number processing and the overlap with PDD neuropathology is only partial (e.g., the cholinergic system; Bohnen et al., 2003). Therefore, conclusions from these studies are informative, but limited, for PD (Hugo & Ganguli, 2014).
Deficits in the number processing abilities of AD patients are unique, with the occurrence of specific intrusion errors in transcoding. For this reason, transcoding tasks are suggested for use as part of the AD diagnostic procedure (Kalbe, 1999;Rosselli et al., 1998). AD patients also show compensatory brain activation mechanisms shifting to premotor areas due to parietal dysfunction (Ouchi et al., 2004).
This shift in neuronal activation might differ for PD, as neurodegeneration in PD also involves the premotor cortex (Lindenbach & Bishop, 2013).
Numerical activities of daily living (NADL) are crucial for the economic and legal literacy of patients affected by neurodegeneration (Sherod et al., 2009). As the ability of AD patients to complete the associated tasks such as financial management, public transport, or medication management deteriorates early in the disease process (Martini et al., 2003;Sherod et al., 2009), the systematic investigation of deficits in number processing in the context of neurocognitive diagnoses is needed for medicolegal evaluations (Simpson, 2014). In most cases when ADL deficits become apparent in a PD patient, the individual has progressed from a state of PD-MCI to PDD (Emre et al., 2007). Because of their negative impact on the quality of life, depression, mortality, and caregiver burden, ADL deficits in PD patients need to be identified as early as possible (Leroi et al., 2012). In order to fully understand the genesis of NADL deficits, basic number processing needs to be investigated thoroughly as it represents an earlier stage of numerical impairment. Kalbe (1999) showed that PDD patients have impairments in counting, transcoding, and number line estimation. It can only be inferred that some aspects of basic number processing are impaired in PD at the advanced stage of dementia; however, whether deficits are present at earlier disease stages remains unclear. The task-based approach applied by Kalbe (1999) also does not allow for conclusions regarding representations of basic number processing which could be achieved with an effect-based approach. Additionally, Kalbe's study was conducted with small sample sizes, decreasing generalizability to the population level. It also focused on inferences regarding AD, despite the PDD group showing worse performance than the control, AD, and vascular dementia groups in seven out of 14 numerical tests. 
The comparison of the performance of nondemented PD patients with healthy elderly individuals on complex arithmetic tasks showed the presence of several deficits (Liozidou et al., 2012;Martin et al., 2013;Zamarian et al., 2006). These findings indicate the need to systematically investigate which basic numerical representations might be the underlying cause of complex arithmetic deficits and how the course of degeneration affects both PD-NC and PD-MCI patients.
Due to the lack of information on basic number processing in nondemented PD patients, hypotheses to be investigated in the current study can be generated based on the neuroanatomical and functional overlap between basic number processing and the progression of PD neuropathology. The neuropathology of PD is characterized by the degeneration of the basal ganglia, which are responsible for dopamine production and are active in neuronal loops of motor control and number processing (Moustafa et al., 2014). Additionally, degeneration occurs in the cerebellar and additional subcortical and cortical brain structures, depending on the stage of disease progression (categorized using Braak stages ranging from one to six; Braak et al., 2003; Caligiore et al., 2016).
On the molecular level, an imbalance of the dopaminergic, cholinergic, noradrenergic, and serotonergic systems is accompanied by a characteristic abnormal cortical and subcortical accumulation of protein, the so-called Lewy bodies (Braak et al., 2003). The variety of affected neuronal structures and functions explains diagnostic challenges and the complexity of PD symptoms.
In the course of PD progression, brain areas, which are necessary for basic number processing, become affected sequentially. Several of these areas are involved in domain-specific numerical function so their degeneration implies that patients may exhibit the processes of primary dyscalculia. The reduction of acetylcholine production in the nucleus basalis magnocellularis (beginning in Braak stage 3) is associated with a weakening of projections to the neocortex, such as parietal and temporal areas (beginning in Braak stage 5;Jellinger, 2018).
This might affect the responsibility of the intraparietal sulcus for (non)symbolic number magnitude representation and place × value processing as well as superior parietal regions needed for visuospatial representations on a mental number line (Jellinger, 2018;Moeller et al., 2015;Winter et al., 2015). Furthermore, visual number form might be affected and is represented in the temporal cortex (Arsalidou & Taylor, 2011;Koob et al., 2014). Inflammation in the angular gyrus and Lewy body-induced degeneration of the hippocampus and posterior cingulate gyrus might affect retrieval from verbal memory and the verbal representation of numerosity (Jellinger, 2018;López González et al., 2016;Uribe et al., 2018).
Lewy bodies in the frontal, cingulate, and temporal cortex emerging in Braak stage 5 further impact visuospatial functions and number processing (Arsalidou & Taylor, 2011; Collerton et al., 2003; Klein et al., 2016). In Braak stage 6, lesions in first-order sensory association areas of the neocortex and premotor areas occur, occasionally accompanied by mild changes in primary sensory areas and the primary motor field (Braak et al., 2003). The decrease in dopamine levels further impacts intracortical inhibition in the motor cortex, worsening time estimation (Lindenbach & Bishop, 2013). These processes can be linked to deficits in the embodiment of number processing and phonological processing in premotor and supplementary motor areas (Winter et al., 2015). The degeneration of domain-general functions necessary for basic number processing can lead to secondary dyscalculia in PD patients, with degeneration occurring in early disease stages. Dopamine production beginning to decrease in the striatum in Braak stage 3 later results in reduced projections to the frontal lobe and consequently deficits in verbal fluency, processing speed, working memory, memory retrieval, attention, and executive functions (Braak et al., 2003; Dirnberger et al., 2005; Martinez-Horta & Kulisevsky, 2019; Rinne et al., 2000).

| Objectives of the current study
The current evidence regarding numerical deficits in PD is scarce and the differences between PD-NC and PD-MCI have not been investigated at all. Therefore, the aim of the current study is to identify deficits in the basic number processing abilities of PD patients with and without mild cognitive impairment as compared to a healthy control (HC) group. To better understand the underlying mechanisms of behavioral deficits, the influence of both domain-general and domain-specific functions will be addressed. The final aim is to investigate whether performance on basic numerical tasks can be used to differentiate between the different cognitive statuses of HC, PD-NC, and PD-MCI and thus be applied in the diagnostic assessment of PD-immanent cognitive disorders.
As patients with PD-MCI based on cognitive screening (level I-PD-MCI) are situated at a point of considerable cognitive impairment, they are expected to exhibit worse performance than the HC and PD-NC groups. Whether performance of the PD-NC group is closer to the HC or the level I-PD-MCI group will be examined.
3. Can basic number processing performance be used to discriminate between HC, PD-NC, and level I-PD-MCI patients? (H3: Discrimination between the cognitive statuses of PD by the tests of basic numerical cognition)
Basic numerical processes will be investigated with the help of (a) transcoding and (b) number line estimation tasks, and (c) nonsymbolic and (d) symbolic number magnitude comparisons.

| Statistical power analysis and sample size estimation
The sample will be composed of the three groups HC, PD-NC, and level I-PD-MCI. Required sample sizes for each hypothesis were computed based on empirical estimates using G*Power (Faul et al., 2007, 2009). Effect sizes were calculated and transformed using Psychometrica (Lenhard & Lenhard, 2016).
Systematic studies on the impairment of basic number processing in PD, which can be used for effect size calculations, are scarce. Kalbe (1999) reports differences with large effect sizes between a HC and a PDD group in transcoding (d = 1.755), number line estimation (d = 0.868), and magnitude comparison (d = 1.051). The available effect sizes are expected to be overestimates due to publication bias and the comparison with a cognitively more deteriorated PD patient sample than the ones to be tested in the current study. Therefore, the smallest effect size was lowered to the conventional effect size for large effects of d = 0.8 (Cohen, 1992) for separate power analyses estimating the required sample size for the three research questions. Consequently, parameters for the effect size calculations were set to d = 0.8, α = 0.05, and power = 0.9. In the case of participant exclusion, new participants will be recruited for substitution. In case of early ceasing of the testing due to attrition, new participants will be recruited, and data of the dropped out patient will only be included in analyses when the patient gives informed consent and has already been assigned to a cognitive group based on the MoCA classification. As performance of PD patients in cognitive tests can be severely affected by other clinical confounders (e.g., age, gender, education, UPDRS-III and/or Hoehn and Yahr staging, disease duration, depression), a maximum of six covariates will be included in the analyses. For testing all three hypotheses, the same subjects will be used with a total sample of at least n = 117 valid data sets (as required for the most complex analysis for H2). Number of patients per group should be approximately n = 39 participants, but this might vary slightly due to the recruitment process and the outcome of the cognitive diagnosis based on the neuropsychological test data. 
We were not able to estimate effect sizes for comparison between the two PD groups, as there are no data published yet on this topic.
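To make the role of the three parameters (d = 0.8, α = 0.05, power = 0.9) concrete, the following sketch computes the sample size per group for a simple two-group comparison. It is only an illustration under stated assumptions: it uses the normal approximation for a two-sided two-sample test rather than the G*Power procedures used in the study, it ignores covariates, and the function names are hypothetical. It therefore does not reproduce the n = 117 derived for the most complex analysis.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.9) -> int:
    """Approximate n per group for a two-sided two-sample comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    Exact t-based routines give slightly larger values.
    """
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2)


def cohens_f_from_d(d: float) -> float:
    """Convert Cohen's d to Cohen's f for two equally sized groups (f = d / 2)."""
    return d / 2


if __name__ == "__main__":
    print(n_per_group(0.8))      # approximate n per group for d = 0.8
    print(cohens_f_from_d(0.8))  # corresponding effect size f
```

Because the approximation omits the small-sample correction of the t distribution and the additional covariates, the planned group size of approximately n = 39 is more conservative than this simplified estimate.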

| Participants
This study has been approved by the ethics committee of the University of Tübingen's medical faculty (161/2020BO2) and has been registered at the Deutsches Register für Klinische Studien (DRKS-ID: DRKS00021091) and with the World Health Organization (Universal Trial Number: U1111-1257-2901). Patients will be recruited via the PD outpatient clinic and in collaboration with rehabilitation facilities specialized in PD. In addition, previously studied PD patients who gave their consent (Ethical vote: 199/2011BO1) to be contacted in case of further potential study participation will be contacted. Patients' caregivers will be recruited as HCs according to defined inclusion and exclusion criteria. Furthermore, pensioners' initiatives will be contacted for control group recruitment and the study will be advertised using the university mailing system. All participants will receive monetary compensation. In the recruitment process, participants will be matched on the group level according to the following criteria: age (M ± 5 years) and gender (max. 65% male).
This will result in sociodemographic and clinical group means that will be as similar as possible. Disease duration cannot be corrected for in this way because cognitive status is confounded with disease duration in PD-MCI patients, who also have a longer PD history as indicated by PD-specific disease progression (Lin et al., 2018). As PD medication can heavily influence experimental performance, all patients will be tested in their "on-state" and will be allowed to take their regular medication during the session if necessary.
Inclusion and exclusion criteria are listed separately for all three groups in Table 1. Idiopathic PD diagnosis according to the United Kingdom Brain Bank criteria (Hughes et al., 1992) needs to be confirmed by a movement disorder specialist in the outpatient clinic of the University Hospital of Tübingen. Verification of the diagnosis is required before inclusion in the current study and needs to be documented in a patient's record. The assessment of the UPDRS-III motor examination will be conducted based on a training in the PD outpatient clinic. The HC group will be recruited as similarly as possible to the two PD groups in regard to age, education, and gender.
Additionally, a spouse, child, other relative, or friend of legal age of all PD patients who agreed to report on the presence and severity of PD-related ADL problems (see section caregiver assessment) will be asked for an assessment of the patient.
The cognitive diagnosis leading to assignment to the PD-NC group, the level I-PD-MCI group, or exclusion in the case of PDD will be carried out according to the level I MDS Task Force criteria for PD-MCI and PDD (Emre et al., 2007; Litvan et al., 2012). The impairment of ADL is operationalized with an FAQ cutoff score of ≥ 3 (see Becker et al., 2021). For the cognitive diagnosis, patients and caregivers will be asked whether they have noticed progressive deterioration of the patient's cognitive state. The same MoCA cutoff score will be used for the exclusion of individuals with cognitive impairment from the HC group. The MoCA (Nasreddine et al., 2005; Thomann et al., 2018) is a short screening instrument for global cognitive functions, evaluating short-term memory, visuospatial and executive functions, attention, language, and orientation to time and space. It has a maximum score of 30 and uses education status correction for patients with 12 years of education or less.

| Materials
Our materials and programmed experiments will be openly available via the Open Science Framework (https://osf.io/ap6je/). If participants give their consent for study inclusion, inclusion and exclusion criteria will be checked during a clinical visit or via a telephone call using a semi-standardized questionnaire.

| Introductory interview
An introductory interview will be used at the beginning of experimental sessions to acquire information on sociodemographic (age, gender, years of education, handedness, and mother tongue) and clinical variables (diagnosis, age at disease onset, medication for de- will be calculated and included in the analysis. The subtest UPDRS-IV will be carried out to assess dyskinesia, motor fluctuations, and dystonia based on six items with a score ranging from 0 to 24 points.
The Hoehn and Yahr score ranging from 1 to 4 (1 = unilateral involvement only; 2 = bilateral involvement without impairment of balance; 3 = mild to moderate involvement, some postural instability, physical independence, and need for assistance in recovery from pull test; 4 = severe disability, but still able to stand and walk unassisted) will be assessed as an additional measure for PD severity. Along with performance on the UPDRS-III, the items from UPDRS-II are needed to classify patients as being of the tremor or postural instability and gait disorder motor type (Stebbins et al., 2013).

| Neuropsychological test battery
Performance in the cognitive domains of executive functions, working memory and attention, verbal memory, language, and visuospatial functions will be assessed with at least one test per domain. For each test, raw scores will be included as covariate measures.
For executive functions, inhibition will be assessed with the subtest Go/No Go "2 out of 5" in the test battery of attention (TAP = Testbatterie zur Aufmerksamkeitsprüfung; Zimmermann & Fimm, 2017). Participants are required to discriminate five types of stimuli and to react to only two of them. Performance on this test will be measured by the number of errors.
Working memory abilities will be assessed in a letter span forward and backward task (as in Soltanlou et al., 2015). Participants will have to listen to letter strings of increasing length. In the first part, they will have to reproduce them in the same order, and in the second part, the letters have to be reproduced in reverse order. The relevant outcome variable will be the maximum length of reproduced letter strings backwards. Visuospatial working memory will be assessed with the Corsi block-tapping test, which requires participants to mimic a sequence of blocks tapped on by the experimenter in the correct forward and backward order (Corsi, 1972; Kessels et al., 2000). Sequences start at two blocks and increase up to a maximum of nine whenever both items of the same length are correctly reproduced; the outcome measure is the longest Corsi backward span correctly reproduced. Attention will be measured with the TAP subtest Alertness (Zimmermann & Fimm, 2017). The test requires a simple reaction to a visually presented stimulus with or without prior notice via an auditory cue. The median RT in conditions without an auditory cue (as a measure of intrinsic arousal) will be used as the outcome variable. The German version of the WAIS-IV subtest Similarities (Petermann, 2012) measures language operationalized as the conceptual understanding of two words by indicating what they have in common. The total number of correctly solved items will be used as a measure of word knowledge.
The Benton Line Orientation Test (Benton et al., 1978) is a measure of visuospatial function where two lines of a certain angle and position are presented on a sheet of paper. Patients are asked to compare these two lines with 11 lines arranged in a star-shaped fashion around a center. Correctly identifying the two lines among the 11 displayed lines is scored as 1 point, and a maximum sum score of 15 can be achieved.

| Numerical tasks
Basic number processing will be assessed with computerized procedures and a paper-and-pencil test. The tasks are typical tasks used to assess basic number processing and the related numerical representations. The current study design requires a detailed assessment instead of an overall screening, and the control group will be used to compensate for the lack of standardized norms. The required response mode is adjusted to the PD-immanent motor impairments inasmuch as cards with single-digit numbers will be used for transcoding to avoid handwritten responses. Additionally, responses in the reaction time tasks are given with easier-to-handle TAP keys (magnitude comparisons) and the mouse (number line estimation). There will be no time restrictions in any of the tasks, but testing will be stopped if a participant cannot answer the first five experimental trials of the current task.
The transcoding task consists of transforming auditory number words into written Arabic digits (16 items) as well as written Arabic digits into written number words (16 items) in a paper-andpencil format, with the experimenter presenting items and coding responses manually. The experimental items will be preceded by a practice block of three trials each. The outcome measures will be overall accuracy and number of errors per category (error categories similar to Kalbe, 1999;Zuber et al., 2009 The accuracy of a range-dependent number-to-space mapping will be assessed in a computerized number line estimation task. Participants will have to locate 20 given numbers per number line (range: small 0-100, large 0-1,000,000) with the arrow keys of a standard QWERTZ keyboard. For each range, numbers will be equally distributed over the entire number line and the percent absolute error per range will be calculated as the measure of (in)accuracy.
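The percent absolute error named above can be illustrated with a minimal sketch. It assumes the conventional definition, the absolute deviation of the estimate from the target divided by the length of the number line and multiplied by 100; the function names and the example responses are hypothetical and not taken from the study materials.

```python
def percent_absolute_error(estimate, target, range_min, range_max):
    """PAE = |estimate - target| / (range_max - range_min) * 100."""
    return abs(estimate - target) / (range_max - range_min) * 100.0


def mean_pae(estimates, targets, range_min, range_max):
    """Average PAE over all trials of one number line range."""
    errors = [
        percent_absolute_error(e, t, range_min, range_max)
        for e, t in zip(estimates, targets)
    ]
    return sum(errors) / len(errors)


# Hypothetical responses on the small (0-100) number line.
targets = [10, 25, 50, 75, 90]
estimates = [14, 20, 50, 80, 84]
small_range_pae = mean_pae(estimates, targets, 0, 100)
```

Because the error is scaled by the length of the line, values from the small and the large range are expressed on the same percentage scale.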
The ANS will be assessed with a nonsymbolic number magnitude comparison task. Two groups of dots will be presented simultaneously next to each other for 200 ms. This presentation phase will be followed by the participant's response with a left- or right-hand key press to indicate which group of dots is larger. After an interstimulus interval of 200 ms, the next trial will be presented. Proportions between the two groups of dots will comprise four different ratios (1:2, 3:4, 5:6, and 7:8) with 48 trials each and a minimum group size of four. Furthermore, stimuli will be manipulated with respect to the visual attributes total occupied area, individual dot size, convex hull area, and mean occupancy (i.e., the inverse of density) in a balanced design within each ratio (see Guillaume et al., 2020). Performance will be evaluated with the average reaction time per ratio and with the Weber fraction as a measure of accuracy, with small fractions indicating high accuracy (as in Dietrich et al., 2015).
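How a Weber fraction can be derived from comparison accuracy is sketched below. The text defers to Dietrich et al. (2015) for the exact procedure, so the model used here is an assumption: a commonly used psychophysical formulation in which the probability of a correct response is Phi(|n1 - n2| / (w * sqrt(n1^2 + n2^2))), with w estimated by a simple grid-search maximum likelihood. The function names and the grid are hypothetical.

```python
import math


def p_correct(n1, n2, w):
    """Predicted accuracy for comparing n1 vs. n2 dots given Weber fraction w."""
    z = abs(n1 - n2) / (w * math.sqrt(n1 ** 2 + n2 ** 2))
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_weber_fraction(trials, grid=None):
    """Grid-search maximum-likelihood estimate of w.

    trials: list of (n1, n2, correct) tuples with correct in {0, 1}.
    """
    if grid is None:
        grid = [i / 1000 for i in range(10, 1001)]  # 0.010 ... 1.000
    best_w, best_ll = None, -math.inf
    for w in grid:
        ll = 0.0
        for n1, n2, correct in trials:
            # Clip to keep the log-likelihood finite.
            p = min(max(p_correct(n1, n2, w), 1e-9), 1.0 - 1e-9)
            ll += math.log(p if correct else 1.0 - p)
        if ll > best_ll:
            best_w, best_ll = w, ll
    return best_w
```

Under this model, a participant with higher accuracy at a given ratio receives a smaller estimate of w, which matches the interpretation that small fractions indicate high accuracy.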
A symbolic number magnitude comparison task will be used to [...]
[...] et al., 1982) is used to measure current social function in older adults.
Participants will have to rate their level of performance (ranging from 0 = normal to 3 = dependent) in 10 different daily life activities, resulting in a sum score characterizing social function.
Health-related quality of life will be assessed with the single index score of the 39-item Parkinson's Disease Questionnaire (PDQ-39; Jenkinson et al., 1997). The eight dimensions, such as activities of daily living and social support, will be coded on a scale ranging from 0 (= perfect health) to 100 (= worst health).

| Caregiver assessment
For a broader evaluation of participant quality of life, the FAQ will also be filled out by a caregiver. Furthermore, caregivers will be asked about their relationship with the participant, their demographic data (i.e., age and gender), and the frequency and intensity of contact with the participant. For the cognitive diagnosis, caregivers will be asked whether they have noticed a progressive deterioration of the patient's cognitive state.

| Procedure
In the first step, participants will give written informed consent. In the second step, participant eligibility will be assessed according to the defined inclusion and exclusion criteria. PD patients will be assigned to PD-NC or level I-PD-MCI groups according to their cognitive performance. As the current study belongs to a larger project, the testing will be conducted together with further measures that are not included in the current publication. Participants will attend two sessions of 1.5 to 2 hr each.
In order to handle patient attrition, there will be breaks within each session when patients need them. In the first experimental session, the sociodemographic questionnaire and the numerical tasks (transcoding, number line estimation, and nonsymbolic and symbolic number magnitude comparison) will be carried out in this order. Afterward, participants and caregivers will complete the clinical scales and questionnaires, which may also be filled out at home between the first and second sessions. The second session consists of the MoCA, clinical variables, motor assessment, and neuropsychological test battery, in this order. The two sessions will be scheduled at most 3 weeks apart.

| DATA TREATMENT AND PROPOSED ANALYSIS PIPELINE
Data analyses will be run with R version 3.4.1 (R Core Team, 2014) and JASP version 0.11.1.0 (JASP Team, 2018). Data will be managed using REDCap electronic data capture tools (Harris et al., 2009). Anonymized data and analysis scripts will be freely available on the Open Science Framework (https://osf.io/ap6je/). For all inferential statistics, an α level of 0.05 will be assumed and effect sizes will be estimated.

| Exclusions
Participants with missing data will be excluded in a case-wise manner from the analyses including the respective measure. For forced-choice tasks (nonsymbolic and symbolic number magnitude comparison, TAP tasks, and Benton Line Orientation Test), participants need to achieve an accuracy (ACC) of at least 75% to be included in the analysis of the respective measure. In the number line estimation task, understanding of the task instruction will be checked via monotonicity of responses: participants will be excluded if their responses do not, overall, ascend from left to right (indicated by a minimum positive space-magnitude correlation of 0.3). Understanding of the task instruction in transcoding will be assessed with a minimum accuracy of 50% in single-digit trials. Furthermore, participants will be excluded from the respective analysis if their performance falls more than 3 SD below the group M.
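The exclusion rules above can be expressed as simple checks, sketched here under stated assumptions: the space-magnitude correlation is taken to be a Pearson correlation between target numbers and marked positions, the 3 SD rule uses the sample SD of the group including the candidate outlier, and higher scores mean better performance. All function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def passes_forced_choice(accuracy, minimum=0.75):
    """Forced-choice tasks: at least 75% correct."""
    return accuracy >= minimum


def passes_number_line(targets, positions, minimum_r=0.3):
    """Number line task: positions must ascend with the targets overall."""
    return pearson_r(targets, positions) >= minimum_r


def low_outliers(scores, n_sd=3.0):
    """Indices of scores more than n_sd sample SDs below the group mean."""
    n = len(scores)  # needs at least two scores
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    return [i for i, s in enumerate(scores) if s < mean - n_sd * sd]
```

For measures where lower values indicate better performance (e.g., RT or percent absolute error), the sign of the 3 SD rule would have to be reversed. Note also that with the outlier included in the SD, a score can only fall 3 SD below the mean in groups of about 11 or more participants.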

| Reaction times
Data trimming for reaction times (RTs) will follow Baayen and Milin (2010). After the exclusion of incorrectly solved trials, the RT distribution will be inspected with per-subject quantile-quantile plots. This inspection serves to identify the theoretical distribution for a data transformation that achieves normality. The most appropriate distribution will be chosen based on the best model fit and used to transform the data. The next step comprises two levels of outlier exclusion. First, RTs faster than 200 ms will be considered anticipations (i.e., physically impossible responses) and excluded. Then, outliers will be excluded based on model criticism, using Shapiro tests for normality: data points with absolute standardized residuals exceeding 3 will be removed. The last step consists of corrections for temporal dependencies. First, autocorrelation functions will be computed by participant. Afterward, a regression model including the covariates trial number and preceding RT will be fitted to the log-transformed latencies.
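The two levels of outlier exclusion can be sketched for a single participant as follows. This is a simplification under explicit assumptions: the log transformation is assumed to be the one selected, and an intercept-only model (the participant's mean log RT) stands in for the model used in model criticism, so the "standardized residuals" are plain z-scores. The function name and parameters are hypothetical, and the autocorrelation step is not covered.

```python
import math
import statistics


def trim_rts(trials, min_rt_ms=200.0, max_abs_z=3.0):
    """Return the retained log-RTs of one participant.

    trials: list of (rt_ms, correct) tuples.
    """
    # Step 1: keep correctly solved trials only.
    rts = [rt for rt, correct in trials if correct]
    # Step 2: drop anticipations faster than 200 ms.
    rts = [rt for rt in rts if rt >= min_rt_ms]
    # Step 3: log-transform (assumed transformation).
    log_rts = [math.log(rt) for rt in rts]
    if len(log_rts) < 2:
        return log_rts
    # Step 4: z-scored residuals from an intercept-only model.
    mean = statistics.mean(log_rts)
    sd = statistics.stdev(log_rts)
    if sd == 0:
        return log_rts
    return [x for x in log_rts if abs((x - mean) / sd) <= max_abs_z]
```

In a full analysis, the residuals would come from the fitted regression model rather than from the participant mean, so this sketch only illustrates the order of the exclusion steps.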

| Accuracy
Depending on the best model fit for the empirical distribution of accuracies, these data will be either arcsine- or logit-transformed.
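Both candidate transformations can be written down in a few lines. The sketch below assumes the usual arcsine square-root form and, for the logit, an empirical logit that adds 0.5 to the counts of correct and incorrect trials so that accuracies of 0 or 1 remain finite; this correction is an assumption and is not specified in the text.

```python
import math


def arcsine_transform(p):
    """Arcsine square-root transformation of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def empirical_logit(p, n_trials):
    """Empirical logit of a proportion p observed over n_trials trials."""
    successes = p * n_trials
    return math.log((successes + 0.5) / (n_trials - successes + 0.5))
```

For example, an accuracy of 0.5 maps to an empirical logit of 0, and perfect accuracy over 48 trials maps to log(97) rather than to infinity.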

| Assumption check
After these separate steps of data preprocessing, specific assumptions for statistical hypothesis testing will be checked. In the case of a violation of linear model assumptions for the ANCOVA, a robust ANCOVA will be conducted. Predictors of the logistic regression will be checked for collinearity, with a variance inflation factor below 10 as the criterion.
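The collinearity criterion can be illustrated with a small, dependency-free sketch. It relies on the identity that the variance inflation factor of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix. The function names are hypothetical; in practice an established routine in R would be used.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of predictor columns (lists of equal length)."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [
        row[:] + [1.0 if i == j else 0.0 for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]
```

Predictors with a VIF of 10 or more would fail the criterion stated above. Perfectly collinear predictors make the correlation matrix singular, in which case this sketch raises a division error instead of returning a value.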

| Group characteristics
Sociodemographic, clinical, and cognitive variables will be compared between the three groups in order to identify possible confounding differences between them. The categorical variables gender, motor type, and Hoehn and Yahr staging will be characterized as total count per category and corresponding percentage and compared with chi-squared tests between HC and PD-NC and between PD-NC and level I-PD-MCI groups. Continuous variables will be described with M (SD) and compared between HC and PD-NC groups and between PD-NC and level I-PD-MCI groups with independent samples t tests. The frequentist group comparisons will be supported by the calculation of Bayes factors. These variables are sociodemographic (age and education years) and clinical (disease duration, age at onset, LEDD, and intake of antidepressants) variables. Furthermore, motor (UPDRS-III and IV, motor type) and cognitive function will be compared. The last group comparison addresses sum scores of completed questionnaires (nonmotor symptoms: NMSQuest, depression: BDI-II, ADL self-report and caregiver report: FAQ, health-related quality of life: PDQ-39). At this point, we will report the number of HC, PD-NC, and level I-PD-MCI participants being impaired in any of the cognitive measures.

| Hypothesis testing
The three groups will first be compared regarding sociodemographic and clinical variables. We assume age, gender, education, UPDRS-III (with motor type), and/or Hoehn and Yahr staging, disease duration, and depression could significantly differ between groups. In this case, we will include a maximum number of six clinical covariates in all models. The maximum number of covariates will be set to seven for the analysis of H2, allowing the additional integration of the cognitive variable with the largest association (minimum correlation of 0.3) for each numerical task separately. We expect to find an ordinal trend for the HC group outperforming both PD-NC and level I-PD-MCI groups, and the PD-NC group outperforming the level I-PD-MCI group. For all numerical tasks concerning hypotheses 1 and 2, the ordinal trend of performance decreasing from HC over PD-NC to level I-PD-MCI groups will be tested with pairwise ANCOVAs that include the covariates of the respective analysis.

| Basic numerical deficits in PD (H1)
The transcoding task will be analyzed using an ANCOVA with the factor group (HC, PD-NC, or level I-PD-MCI) and the dependent variable ACC. Error categories (lexical errors, syntactic errors, dementia-specific errors, and education- and disease-dependent errors; no rest category) will be compared between groups (HC vs. [...] the respective covariates for each numerical task. These ANCOVAs will be conducted on RTs for the respective effects. Following this frequentist approach, Bayesian ANCOVAs will be calculated to allow for further conclusions on the presence or absence of effects.

| Discrimination between cognitive statuses of PD patients by the tests of basic numerical cognition (H3)
The last diagnostic research question will be answered using logistic regression. This discriminant analysis with multiple predictors will be conducted on z-standardized performance scores for tasks differing between groups (i.e., transcoding, number line estimation, nonsymbolic, and/or symbolic number magnitude comparison) as well as significant numerical effects (i.e., Weber fraction, distance effect, and/or unit-decade compatibility effect) with the dependent variable of cognitive status (HC, PD-NC, or level I-PD-MCI). Covariates included in the preceding analyses will also be included in the model (i.e., a maximum of six clinical confounders and four cognitive covariates). The probability for members of the three groups to be (correctly or falsely) assigned to each group will be calculated based on the regression as a measure of prognostic value.

| Possible limitations
The study design of the current project cannot control for all confounding effects coming from heterogeneous clinical samples.
Differences in disease duration, medication, and staging of PD severity (Hoehn and Yahr staging) between the groups could limit the scope of the study, which is why variables differing between study groups will be included as covariates in the statistical model.
However, interdependences between clinical covariates such as LEDD, motor symptoms, and disease duration need to be considered, and we will limit the number of clinical covariates to six in order to keep the complexity of the statistical models reasonable. As all PD patients will be tested in the medication "on" state after intake of their regular dopaminergic medication, no major influence on patients' performance is anticipated. Although we do not anticipate effects, we also think that including patients on medication has higher ecological validity, because the clinical reality is that PD patients who are well diagnosed and investigated receive dopaminergic medication if this is deemed helpful. PD patients are also prone to cognitive fluctuations in association with attention. To limit the effect of attention deficits, critical tests will be administered at the beginning of sessions, with breaks in between to address patient attrition.
The heterogeneous natural course of PD suggests analyzing clinical subtypes. First of all, the issue of patient groups differing too much will be addressed by potentially controlling for the UPDRS-III score as a covariate in the current study.
The sample we want to test consists of a patient group as it presents in the clinic, inducing limited generalizability as patients might represent different PD subtypes. As the proportion of the respective subtypes will potentially be very small and we lack specific hypotheses for numerical cognition, a confirmatory analysis in line with open science and replicability principles would be difficult to achieve.
However, following the Editor's advice, we will "look into subtypes or biotypes (…) (even if preliminary and potentially underpowered, which could also be a specific limitation)" in an exploratory subtype analysis, investigating the effects of disease stage based on Hoehn and Yahr staging, motor type (Stebbins et al., 2013), and/or age of onset defining common PD subgroups. These findings on heterogeneous subgroups can provide information to guide future research. Furthermore, we try to limit PD heterogeneity by using the diagnosis of idiopathic PD and an age of 60 years or more as inclusion criteria. Other PD subtypes probably included in our study sample will be analyzed in exploratory analyses (e.g., stratifying by age of disease onset, Hoehn and Yahr staging, and/or motor type), because our sample size affects statistical power for subgroup analyses and consequently hinders confirmatory analyses of small subsamples.
We will also control for covariates differing between the two PD groups by including them in the ANCOVAs.
Furthermore, the generalizability of our study is limited due to cultural specificities of basic number cognition; for example, studies with children show that performance in German samples is affected by the inversion property of the German multi-digit number system when compared to other (European or Western) cultures (Göbel et al., 2014). Generalizability to, as well as comparability with, other neurodegenerative diseases is also limited. However, it is not feasible to include other patient control groups such as AD patients in the current study design. The comparison with other neurodegenerative diseases is an interesting next step to generalize or differentiate number deficits across various neurodegenerative diseases, once we have identified a systematic pattern of number cognition deficits in PD. As evidence on PD is rare, whereas research on numerical deficits in AD already has a tradition, the crucial current task is to systematically investigate PD before trying to compare it with other neurological diseases.
In the current study, we do not correct for multiple comparisons, [...]
Another psychometric issue is the assessment of language skill.
The guideline by the Movement Disorder Society on the PD-MCI diagnosis (Litvan et al., 2012) suggests the Boston Naming test, the WAIS-IV subtest Similarities (Petermann, 2012), and the Graded Naming Test (McKenna & Warrington, 1983). However, all of them show methodological constraints: The Boston Naming test displays a ceiling effect and poor psychometric properties (Harry & Crowe, 2014), the WAIS-IV subtest Similarities is confounded with executive function, and the Graded Naming Test does not exist in a German adaptation and is confounded with linguistic and cultural attributes. Therefore, we opted for the WAIS-IV subtest Similarities, because we consider it most suited to our needs regarding difficulty and cultural standardization. However, we want to critically acknowledge that it assesses not only language skills but also facets of executive function. Consequently, future work on a methodologically sound neuropsychological test of language skill in German PD patients, based on patholinguistic expertise, would be highly desirable for our scientific practice.
If our first study verifies deficits in number cognition in PD, further studies in larger samples might build on our behavioral findings by collecting biomarkers (blood, CSF, imaging) to identify the underlying pathomechanisms. In the current study, we are interested in numerical effects occurring under regular intake of dopaminergic medication, to make inferences for an ecologically valid sample and with relevance for daily living. However, it would be interesting for follow-up studies to investigate the pathomechanisms by addressing the influence of PD medication (i.e., comparison between on-and off-state) on numerical deficits.
Finally, our target group has most likely been diagnosed before the most recent diagnostic criteria from the Movement Disorder Society (MDS; Postuma & Berg, 2017; Postuma et al., 2016, 2018) were put into practice. Therefore, diagnosis will be based on the U.K. Parkinson's Disease Society Brain Bank clinical diagnostic criteria (Hughes et al., 1992), despite their diagnostic inferiority, because the MDS criteria cannot be applied retrospectively. Due to the complex and time-consuming design of the study, we had to opt for the diagnosis "level I-PD-MCI based on cognitive screening" using the MoCA to define patients' global cognitive status. However, comprehensive test batteries allow for a more detailed evaluation of cognitive status, also differentiating single- and multi-domain PD-MCI. When assessing the progression of numerical deficits in association with cognitive decline, future studies should use a comprehensive test battery to explore the association between cognitive status and numerical deficits. Based on basic numerical deficits identified in the current study, future studies should assess the influence of numerical deficits on patients' everyday life.

| Potential unexpected outcomes
Heterogeneous patient profiles might lead to ambiguous results when averaging over participants in our analyses. To address the heterogeneity, we might explore the data with subject-by-subject analyses with individual regression slopes to identify effect sizes per participant. These individual error profiles might show the patterns of co-occurring or distinct errors. Additionally, the taxonomy for transcoding errors was adapted from studies of children and of patients with Alzheimer's disease, so it might be necessary to extend it in case of Parkinson-specific transcoding errors not categorized in the literature so far.
Furthermore, numerical deficits might be associated with specific subgroups of PD patients (e.g., gender effects; Loenneker et al., 2020) which might be addressed in exploratory analyses.

| FURTHER PROCEDURES
Testing is planned for after critical revisions of the preregistered review and following in-principle acceptance. However, experiments can only start when the current COVID-19 situation permits human-to-human testing with elderly participants. Recruitment and testing phases are estimated to take 9 months. Data analysis and preparation of the final manuscript are expected to be finished 4 months after the last experimental session.

ACKNOWLEDGMENTS
HL received scholarships of the Landesgraduiertenfoerderung

CONFLICT OF INTEREST
The authors declare no conflict of interest.

AUTHOR CONTRIBUTIONS
All the authors had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1002/jnr.24907.

DATA AVAILABILITY STATEMENT
Anonymized data, own materials, programmed experiments, and analysis scripts will be freely available on the Open Science Framework (https://osf.io/ap6je/).