Comparison between four published definitions of hyposmia in Parkinson's disease

Abstract
Objectives: Hyposmia is a common feature of Parkinson's disease (PD), yet there is no standard method to define it. A comparison of four published methods was performed to explore and highlight differences.
Materials and methods: Olfactory testing was performed in 2097 cases of early PD in two prospective studies. Olfaction was assessed using various cut-offs, usually corrected by age and/or gender. Control data were simulated based on the age and gender structure of the PD cases and published normal ranges. Association with age, gender, and disease duration was explored by method and study cohort. Prevalence of hyposmia was compared with the age- and gender-matched simulated controls. Between-method agreement was measured using Cohen's kappa and Gwet's AC1.
Results: Hyposmia was present in between 69.1% and 97.9% of cases in Tracking Parkinson's, and between 62.2% and 90.8% of cases in the Parkinson's Progression Marker Initiative, depending on the method. Between-method agreement varied (kappa 0.09–0.80, AC1 0.55–0.86). The absolute difference between PD cases and simulated controls was similar for men and women across methods. Age and male gender were positively associated with hyposmia (p < .001, all methods). Odds of having hyposmia increased with advancing age (OR: 1.06, 95% CI: 1.03, 1.10, p < .001). Longer disease duration had a negative impact on overall olfactory performance.
Conclusions: Different definitions of hyposmia give different results using the same dataset. A standardized definition of hyposmia in PD is required, adjusting for age and gender, to account for the background decline in olfactory performance with ageing, especially in men.

The absence of hyposmia is a red flag (but not an exclusion criterion) in the clinical diagnostic criteria for PD. Olfactory dysfunction occurs in a range of neurodegenerative diseases, including PD at motor presentation (Silveira-Moriyama et al., 2009), prodromal PD (Barber et al., 2017; Campabadal et al., 2019; Lo et al., 2021; Noyce et al., 2014; Siderowf et al., 2012), progressive supranuclear palsy (Silveira-Moriyama et al., 2010), and Alzheimer's disease (Jung et al., 2019), but olfaction is generally normal in multiple system atrophy (Xia & Postuma, 2020) and Parkin-related PD. However, hyposmia also occurs with advancing age in healthy people and is more common in males than females (Doty et al., 1984; Stern et al., 1994), making its diagnostic use in PD challenging. Previously, performance on olfactory testing in PD was considered to be constant and independent of disease duration (Doty et al., 1988), but more recent evidence suggests that it declines with disease duration (Berendse et al., 2011).
Over the years, there has been extensive research aiming to establish a standard method to define hyposmia, including the use of PD probability curves (Picillo et al., 2014; Silveira-Moriyama et al., 2009), analysis of the area under the receiver operating characteristic curve (Baba et al., 2012; Bohnen et al., 2008; Rodriguez-Violante et al., 2014), and categorizing readings below a certain centile level (Ponsen et al., 2004), typically the 15th percentile (Pont-Sunyer et al., 2015; Siderowf et al., 2012; Sierra et al., 2013).
The two most commonly applied tests of olfaction are the University of Pennsylvania Smell Identification Test (UPSIT), which has 40 odors identified from scratch-and-sniff panels, and Sniffin' Sticks (SS), which has 16 odors identified from uncapped pens. Both tests use a forced choice from four options per odor and have published normative data centiles stratified by age and gender, and results can be combined using established equivalence methods.
The reported prevalence of hyposmia in PD varies widely, from 74.3% to 100% (Szewczyk-Krolikowski et al., 2014; White et al., 2016). Some of this heterogeneity is likely to be due to a lack of standardization: several methods have been used to define abnormal olfaction, often involving adjustment for age, gender, or both.
Our aim was to compare four established methods that have been previously applied in PD research (Doty, 2008; Noyce et al., 2014; Siderowf et al., 2012; Silveira-Moriyama et al., 2009) to define the rates of hyposmia in early PD, and to compare these rates with normative data.
We sought to empirically highlight the effect of using those definitions, thereby aiding interpretation of past studies, as well as informing decision making in the interpretation of olfaction testing in future clinical research.

Data sources
Olfaction test results were analyzed from two longitudinal cohorts of recent-onset Parkinson's disease: Tracking Parkinson's, a UK multicenter prospective study of cases diagnosed within the preceding 3.5 years, and the Parkinson's Progression Markers Initiative (PPMI), a United States-led multicountry study of newly diagnosed cases (Marek et al., 2011). All patients had a clinical diagnosis of PD, fulfilling UK Brain Bank criteria (Malek et al., 2015), or supported by abnormal presynaptic dopaminergic imaging (Hughes et al., 1992; Marek et al., 2011).
Olfaction was tested in Tracking Parkinson's at 6 months after recruitment, using either the British 40-item version of UPSIT or the 16-item SS, and in PPMI at study entry using the 40-item US version of UPSIT. UPSIT is a "scratch-and-sniff" test for 40 odors with a forced choice from four options per odor, and gives a maximum score of 40 (Doty, 2008). The Sniffin' Sticks test consists of 16 odors presented from a "smell pen", again with a forced choice from four items per odor, and a maximum score of 16 (Hummel et al., 1997). We converted UPSIT to SS scores using an algorithm developed from item response theory, as previously reported.

Definitions of hyposmia
We used four definitions of hyposmia as follows:
(a) Method 1 (age-corrected) defines patients aged 60 years or older as hyposmic when the UPSIT score is below 24, while patients aged under 60 years are hyposmic when the score is below 29 (Silveira-Moriyama et al., 2009).
(b) Method 2 (gender-corrected, using absolute UPSIT values) creates an ordinal variable that also incorporates severity of hyposmia; this defines males scoring below 19 as anosmic, between 19 and 25 as severely hyposmic, between 26 and 29 as moderately hyposmic, between 30 and 33 as mildly hyposmic, and 34 or above as normosmic. For females, anosmia and severe hyposmia have the same cut-offs, while moderate hyposmia is a score of 26-30, mild hyposmia is 31-34, and scores of 35 or above are normosmic. This method can also be used to create a binary cut-off for hyposmic versus normosmic (Doty, 2008).
(c) Method 3 (age and gender corrected) defines hyposmia as UPSIT scores at or below the 15th centile based on normative data obtained from healthy individuals (Siderowf et al., 2012). We applied an extension of this definition, using "smoothed" cut-points as previously demonstrated.
(d) Method 4 (percentile method) uses a score below the 15th centile of the population studied to define hyposmia (Noyce et al., 2014), as implemented in the PREDICT-PD study, where eligible healthy participants aged 60-80 years, identified in part through the Parkinson's UK membership list, had an olfactory assessment. Participants scoring at or below 27 in UPSIT were classified as hyposmic ("at-risk").
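The fixed cut-offs in Methods 1, 2, and 4 can be written down directly, which makes it easier to see how the same score can be classified differently. The following is a minimal illustrative sketch, not the authors' analysis code: the function names are hypothetical, the boundary scores of 34 (men) and 35 (women) are assumed to count as normosmic in Method 2, and Method 3 is omitted because it depends on normative centile tables that are not reproduced here.

```python
from typing import Literal

Sex = Literal["male", "female"]


def method1_hyposmic(upsit: int, age: float) -> bool:
    """Method 1 (age-corrected): below 24 if aged 60 or older, below 29 otherwise."""
    cutoff = 24 if age >= 60 else 29
    return upsit < cutoff


def method2_category(upsit: int, sex: Sex) -> str:
    """Method 2 (gender-corrected): ordinal category from the absolute UPSIT score."""
    if upsit < 19:
        return "anosmic"
    if upsit <= 25:
        return "severe"
    # Assumption: upper bounds of the moderate and mild bands differ by sex,
    # and any score above the mild band is treated as normosmic.
    moderate_max, mild_max = (29, 33) if sex == "male" else (30, 34)
    if upsit <= moderate_max:
        return "moderate"
    if upsit <= mild_max:
        return "mild"
    return "normosmic"


def method2_hyposmic(upsit: int, sex: Sex) -> bool:
    """Binary version of Method 2: any category other than normosmic."""
    return method2_category(upsit, sex) != "normosmic"


def method4_hyposmic(upsit: int) -> bool:
    """Method 4 (percentile method): at or below 27, the cut-off quoted from PREDICT-PD."""
    return upsit <= 27


if __name__ == "__main__":
    # Hypothetical patient: a 65-year-old man with an UPSIT score of 27.
    score, age, sex = 27, 65, "male"
    print("Method 1:", method1_hyposmic(score, age))
    print("Method 2:", method2_hyposmic(score, sex), method2_category(score, sex))
    print("Method 4:", method4_hyposmic(score))
```

For this hypothetical patient, Method 1 returns normosmic (27 is not below 24), whereas Methods 2 and 4 both return hyposmic, which is the kind of between-method disagreement the comparison is designed to quantify.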

Other variables
The two study cohorts were compared in terms of motor severity, mea[...]
[...] Cohen's kappa (κ), which is widely used to compare observations and methods, and "corrects" for chance agreement. Since kappa can be affected by the prevalence of the index condition (e.g., hyposmia), resulting in paradoxically low or high values (Feinstein & Cicchetti, 1990), we also measured agreement using Gwet's AC1, which also has the advantage of not requiring independence between tests (Gwet, 2008). A general rule of thumb proposed by Landis and Koch (Landis & Koch, 1977) [...]
Six different outcome measures were investigated. A binary outcome was used (hyposmic versus normosmic) for each of the four definitions stated above (Methods 1-4). We also examined olfaction as an ordinal outcome (anosmic, severe, moderate, mild hyposmia, and normosmic) as defined in Method 2 above, and finally as a continuous variable with a range from 0 to 16 on the raw or converted SS scale.
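The two agreement statistics mentioned above, Cohen's kappa and Gwet's AC1, can both be computed from a 2×2 cross-classification of two binary methods. The sketch below uses the standard two-rater formulas; the function names and the example counts are hypothetical and are not taken from the study data. It is meant only to illustrate why kappa can look paradoxically low when nearly everyone is classified as hyposmic.

```python
def _check(a: int, b: int, c: int, d: int) -> int:
    """Return the total count, rejecting an empty table."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("the 2x2 table is empty")
    return n


def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for two binary methods.

    a: both hyposmic, b: only method A hyposmic,
    c: only method B hyposmic, d: both normosmic.
    Undefined (division by zero) if chance agreement equals 1.
    """
    n = _check(a, b, c, d)
    observed = (a + d) / n
    p_a = (a + b) / n  # proportion hyposmic by method A
    p_b = (a + c) / n  # proportion hyposmic by method B
    # Chance agreement: product of the marginals, summed over both categories.
    chance = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - chance) / (1 - chance)


def gwet_ac1(a: int, b: int, c: int, d: int) -> float:
    """Gwet's AC1 for two binary methods (same cell layout as cohen_kappa)."""
    n = _check(a, b, c, d)
    observed = (a + d) / n
    # Average prevalence of "hyposmic" across the two methods.
    pi = ((a + b) / n + (a + c) / n) / 2
    chance = 2 * pi * (1 - pi)
    return (observed - chance) / (1 - chance)


if __name__ == "__main__":
    # Hypothetical counts: 100 patients, most classed as hyposmic by both methods.
    table = (90, 5, 3, 2)
    print(f"kappa = {cohen_kappa(*table):.2f}")
    print(f"AC1   = {gwet_ac1(*table):.2f}")
```

With these hypothetical counts the two methods agree on 92 of 100 patients, yet kappa is about 0.29 while AC1 is about 0.91. The difference arises because kappa's chance-agreement term approaches 1 when the prevalence of hyposmia is very high, whereas the AC1 chance term stays small.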
For each of the four binary outcomes, we used logistic regression and adjusted each model appropriately: for disease duration and gen[...]
We also created a simulated "normative" dataset by applying the data reported for healthy controls by age and gender (Doty, 2008) [...]

RESULTS
Regardless of method, the prevalence of hyposmia was higher in the PD cases than in the normative simulated control population (Figure 2 and Table 2). Unsurprisingly, Method 2 resulted in the highest percentage of PD patients being classified as hyposmic and the same was true for simulated controls, followed by Methods 4, 1, and then 3.
This was seen in both men and women. Both Methods 2 and 3 reduced any gender differences for the PD and simulated control data. The difference in the proportion of hyposmia between cases and controls was largest for women (60.1%) using the 15th centile uncorrected method (Table 2) and was similar for men (59.4% and 59.8%) using the 15th centile uncorrected method and the age-corrected cut-offs. Method 2 showed the smallest differences between PD and control subjects.
Levels of agreement among the four methods varied (Table 3). In Tracking Parkinson's, the continuous olfactory data had normal residuals with a slight negative skewness (left tail). We found that male gender (p < .001), increasing age (p < .001), and longer disease duration (p = .01) had a negative impact on patients' overall olfactory performance when using the continuous score in our linear regression model.

DISCUSSION
The classification of cases as hyposmic, or the diagnosis of hyposmia, based on clinic-based olfaction testing, shows substantial variability according to the method applied. Given the association of patients' age and gender with olfactory performance, and regardless of the choice of analysis outcome (continuous test scores or hyposmic/normosmic binary status), correction for age and gender is strongly advised when assessing the level of olfactory impairment in PD. Using the Tracking Parkinson's dataset to analyze all four methods, our results showed that between 68% and 98% of PD patients are defined as hyposmic. Moreover, hyposmia was present in a substantial proportion of healthy individuals, between 14% and 54% depending on the method applied. As expected from such wide ranges, agreement between test methods was limited according to both the kappa statistic and the alternative Gwet's AC1 measure. It is therefore not surprising that findings on olfactory testing in Parkinson's disease vary further between studies according to the type of olfaction testing method, the age and gender ratios of the patients studied, and the disease duration in the population studied (Gaig et al., 2014; White et al., 2016).
Both kappa and the robust Gwet's AC1 coefficient showed substantial agreement between classifying subjects with scores below the 15th centile as hyposmic (Method 4) and using an age-corrected cut-off, in Tracking Parkinson's (kappa = 0.66 vs AC1 = 0.85) and in PPMI (kappa = 0.74 vs AC1 = 0.79). A moderate to substantial agreement was evident between Method 4 and using an age- and gender-corrected 15th centile (Method 3) to classify PD patients (Table 3).
We found some evidence of a modest association between hyposmia and longer disease duration. There was a significant association between disease duration and olfactory performance in two of the models in Tracking Parkinson's (the linear regression model and the ordered logistic regression model) but not in any of the logistic regression models. This may reflect the loss of power when using a dichotomous rather than an ordinal or continuous outcome (Altman & Royston, 2006).
Neither study had a gold or reference standard test against which to compare the screening criteria, but this is usually the case in large-scale prospective studies of PD. It remains possible that a more detailed olfactory assessment than odor identification (which we employed), involving odor threshold testing and odor discrimination (Hummel et al., 1997), could prove more sensitive and/or specific in differentiating between PD and healthy controls, but these tests are more time-consuming and seldom performed in large clinical studies.
In conclusion, standardized testing and the use of consistent statistical modelling for olfactory impairment can help in identifying hyposmia as a risk marker for PD, thereby refining at-risk cohorts in clinical studies.

ETHICS STANDARDS AND INFORMED CONSENT
All participating sites from both studies received approval from an ethical standards committee on human experimentation before study initiation, in accordance with the Declaration of Helsinki and the Good Clinical Practice (GCP) guidelines. Written informed consent was obtained from all patients included in the study. Data used in the preparation of this article were also obtained from the Parkinson's Progression Markers Initiative (PPMI) database.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.