Evaluation of the NIH Toolbox Odor Identification Test across normal cognition, amnestic mild cognitive impairment, and dementia due to Alzheimer's disease

Abstract
INTRODUCTION Olfactory decline is associated with cognitive decline in aging, amnestic mild cognitive impairment (aMCI), and amnestic dementia associated with Alzheimer's disease neuropathology (ADd). The National Institutes of Health Toolbox Odor Identification Test (NIHTB-OIT) may distinguish between these clinical categories.
METHODS We compared NIHTB-OIT scores across normal cognition (NC), aMCI, and ADd participants (N = 389, ≥65 years) and between participants positive versus negative for AD biomarkers and the APOE ε4 allele.
RESULTS NIHTB-OIT scores decreased with age (p < 0.001) and were lower for aMCI (p < 0.001) and ADd (p < 0.001) compared to NC participants, correcting for age and sex. The NIHTB-OIT detects aMCI (ADd) versus NC participants with 49.4% (56.5%) sensitivity and 88.8% (89.5%) specificity. NIHTB-OIT scores were lower for participants with positive AD biomarkers (p < 0.005), but did not differ based on the APOE ε4 allele (p > 0.05).
DISCUSSION The NIHTB-OIT distinguishes clinically aMCI and ADd participants from NC participants.

Highlights
• National Institutes of Health Toolbox Odor Identification Test (NIHTB-OIT) discriminated normal controls from mild cognitive impairment.
• NIHTB-OIT discriminated normal controls from Alzheimer's disease dementia.
• Rate of olfactory decline with age was similar across all diagnostic categories.
• NIHTB-OIT scores were lower in participants with positive Alzheimer's biomarker tests.
• NIHTB-OIT scores did not differ based on APOE genotype.


BACKGROUND
Alzheimer's disease (AD) is a progressive neurodegenerative disease that affects memory and cognition, leading to reduced independence in activities of daily living and lower life expectancy. 1 The gradual progression from a state of normal cognition (NC) to dementia of the Alzheimer's type (ADd) involves an intermediary stage termed amnestic mild cognitive impairment (aMCI), wherein memory impairments are present, but not severe enough to impact daily living. 2 Identifying individuals in the earliest disease stages and at high risk for developing ADd will help target preventative measures and slow disease progression. Recent attention has focused on functional changes in sensory systems, particularly in olfaction, that may either influence or predict aMCI- and ADd-related cognitive impairment. 3,4 Olfactory decline has been proposed as an early indicator of aMCI and ADd. 3,5,6,[12][13][14] Estimates suggest that between 85% and 90% of individuals with ADd have impaired olfaction, 15,16 and the severity of olfactory decline correlates with the severity of cognitive impairment. 17,18 [22][23] Olfactory deficits correlate with the degree of tau pathology, 24,25 and occur earlier in life for APOE ε4 allele carriers. 26,27 Typically, brain imaging, extensive cognitive testing, and specialist evaluation are required to clinically diagnose aMCI and ADd. 28 Evaluating pre-clinical ADd risk factors using genotyping or testing for known AD biomarkers is time-consuming and expensive. AD biomarker tests are invasive, involving positron emission tomography (PET) scans with injected radioactive tracers or lumbar spinal taps to evaluate cerebrospinal fluid markers. [31] Identifying a low-cost, non-invasive, and easily interpretable measure that can accurately flag individuals at risk of cognitive decline is thus imperative to facilitating earlier diagnosis. Olfactory identification tests are time- and cost-effective and simple to administer, and may serve to identify these at-risk individuals so that they may be referred for more in-depth neuropsychological evaluation and care.
In the present study, we evaluate performance on the National Institutes of Health Toolbox Odor Identification Test (NIHTB-OIT) 32 as part of a multisite study, Advancing Reliable Measurement in Alzheimer's Disease and Cognitive Aging (ARMADA). 33 The ARMADA study is unique in that it collected data from a population-based sample across

METHODS
In the present study, we evaluate performance on the NIHTB-OIT for a general population cohort of adults aged 65 and over, across NC, aMCI, and ADd diagnoses. The dataset used in the present study was obtained through the overarching ARMADA study. 33 The ARMADA study was reviewed and approved by the Northwestern University (lead site) Institutional Review Board (IRB #STU00205290).
In addition, each of the participating sites also submitted ARMADA for review to their own IRBs and received approval. All research was completed in accordance with the Helsinki Declaration.
Special emphasis was placed on diversity, equity, and inclusion in recruiting participants so that the resulting general population dataset is approximately racially representative of the United States population. Additionally, emphasis was placed on recruiting a group of NC participants over the age of 85 in order to evaluate performance on the NIHTB-OIT for this age group. Available data through the ARMADA study include scores on the NIHTB tests, 34,35 and measures of cognitive functioning, health history, and mental health history through the Uniform Data Set (UDS) procedures adopted by ADRCs. 36,37 For a subset of participants, APOE genotype and/or results from a variety of AD biomarker tests (cerebrospinal fluid [CSF] Aβ40/42, total tau, or phospho-tau measures, and/or amyloid-PET imaging) were also available.

NIH Toolbox Odor Identification Test
The NIHTB-OIT 32
amyloid-PET methods and CSF biomarker methods are included in Table 4 of the original ARMADA project methods paper. 33 A detailed demographic description of the ARMADA study's baseline visit cohorts (NC aged 65 to 84, aMCI, and ADd) was published in a previous paper. 42 This previous publication includes detailed information on participant demographics including race and education, recruitment and diagnostic methods, clinical characteristics including CDR scores and the Neuropsychiatric Inventory Questionnaire, 43 and biomarker and APOE data availability across groups. A summary of the demographics, available APOE genotype data, and AD biomarker group assignments of participants included in the present study is provided in Table 1.

Statistical analyses and evaluation of the NIHTB-OIT across diagnostic groups
For the present study, we first calculated NIHTB-OIT score summary statistics across diagnostic categories. These included the mean, standard deviation, range, N at floor, N at ceiling, skewness, and kurtosis of the distribution of odor identification scores for each group.
We then evaluated NIHTB-OIT scores using a multiple linear regression model with the equation: $\text{NIHTB-OIT Score} \sim \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Sex} + \beta_3\,\text{Diagnosis}$ (Table 3, Equation 1). We fit this main-effects model for the entire participant population (N = 389) with the NC aged 65 to 84 and NC aged ≥85 groups collapsed into one NC group in order to characterize performance on the NIHTB-OIT across the entire sampled age span. With this model, we assessed relationships between olfactory performance, age, and diagnosis, while controlling for sex. We also fit a second model including an interaction term between age and diagnosis, and a third main-effects model excluding the NC aged ≥85 from the NC group. For each multiple regression model, age was centered at 77.8 years, the mean age of the entire participant pool.
We then used two logistic regression models to evaluate the NIHTB-OIT's ability to distinguish between diagnostic categories. The first model evaluated whether odor scores, age, and sex can accurately predict whether a participant is in the aMCI versus the NC aged 65 to 84 group with the following equation: ln where
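A binary logistic regression on the three predictors named above has the general form sketched below. The coefficient symbols and the ordering of terms are assumptions made for illustration and are not taken from the authors' notation; the ADd versus NC model would have the same shape with P(ADd) in place of P(aMCI).

```latex
\ln\!\left(\frac{P(\text{aMCI})}{1 - P(\text{aMCI})}\right)
  = \beta_0 + \beta_1\,\text{Odor Score} + \beta_2\,\text{Age} + \beta_3\,\text{Sex}
```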

RESULTS
Table 2 displays the summary statistics for the NIHTB-OIT scores across each clinical cohort. In addition to the baseline scores presented in Table 2,

NIHTB-OIT scores across age and diagnostic categories
We fit a multiple linear regression model to evaluate the decline of NIHTB-OIT performance with age and differences in odor scores across diagnostic categories. The model was significant with adjusted R² = 0.30 (F(4,384) = 40.85, p < 0.001). The resulting model coefficients are displayed in Table 3 (Equation 1). This model suggests that NIHTB-OIT scores decrease by 0.07 points for every 1-year increase in age (1 point every 14.3 years). Females scored on average 0.530 ± 0.186 points higher than males across diagnostic categories (p < 0.005). Participants in the aMCI group scored on average 1.295 ± 0.235 points lower than NC participants (p < 0.001), while those with ADd scored on average 2.675 ± 0.258 points lower than those with NC (p < 0.001).
A second model was evaluated including an interaction term between age and diagnosis. The interaction terms were not significant, so they were not included in the final model (see Table S1). A third model was evaluated excluding the NC aged ≥85 participants, as they were oversampled compared to the N = 21 participants aged 85 and over in the aMCI and ADd groups. The coefficients of this model were very close to those observed in the model with the NC aged ≥85 participants included (see Table S2). While the effect of sex was significant in the full model, with females scoring higher than males across all diagnosis groups, sex differences were not significant within each diagnosis group, evaluated using Welch two-sample t-tests.
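To make the reported coefficients concrete, the following minimal sketch computes the expected NIHTB-OIT score difference relative to the model's reference participant (an NC male at the centering age of 77.8 years). The coefficient values are those quoted in the text; the function and constant names are hypothetical, and the intercept is deliberately left out because it is not stated in the running text.

```python
# Coefficients as quoted in the text (points on the NIHTB-OIT scale).
AGE_SLOPE = -0.07        # change in score per additional year of age
FEMALE_EFFECT = 0.530    # females relative to males
DIAGNOSIS_EFFECT = {"NC": 0.0, "aMCI": -1.295, "ADd": -2.675}
MEAN_AGE = 77.8          # age used for centering


def expected_difference(age, sex, diagnosis):
    """Expected score minus that of the reference (NC male aged 77.8).

    Hypothetical helper: it applies the main-effects model only and
    omits the intercept, so it returns a difference, not a score.
    """
    centered_age = age - MEAN_AGE
    sex_term = FEMALE_EFFECT if sex == "female" else 0.0
    return AGE_SLOPE * centered_age + sex_term + DIAGNOSIS_EFFECT[diagnosis]


# An 87.8-year-old female with aMCI versus the reference participant.
print(round(expected_difference(87.8, "female", "aMCI"), 3))
```

Under these assumptions, ten additional years of age (0.70 points lower), female sex (0.530 higher), and an aMCI diagnosis (1.295 lower) combine to an expected score about 1.465 points below the reference.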

Utility of the NIHTB-OIT for detecting aMCI and ADd
To determine whether the NIHTB-OIT is useful for detecting aMCI and ADd compared to NC aged 65 to 84, we computed ROC curves based on two logistic regression models. The models were fit with male as the baseline group, and centered at the mean age of 77.8 years (see Table 3, Equations 2-3). In the P(aMCI) model, using Equation (2) shown in Section 2.3, every 1-point increase in NIHTB-OIT score was associated with a 25.3% decrease in relative risk of aMCI (p < 0.001). Females had a 60.8% decrease in relative risk of aMCI compared to males for a given NIHTB-OIT score (p < 0.005). Each 1-year increase in age was associated with a 10.8% increase in relative risk for aMCI (p < 0.001). In the P(ADd) model, based on Equation (3), every 1-point increase in odor score was associated with a 51.3% decrease in relative risk of ADd (p < 0.001). In this model, age and sex were not significantly associated with changes in relative risk of ADd.
For a threshold of 0.50, the sensitivity and specificity of the P(aMCI) model were found to be 49.4% and 88.8%, respectively. The positive predictive value of this model was 69.6%, while the negative predictive value was 77.1%. The calculated AUC of the P(aMCI) ROC plot was 0.78 (95% CI 0.72 to 0.85). For a threshold of 0.50, the sensitivity and specificity of the P(ADd) model were found to be 56.5% and 89.5%, respectively. The positive predictive value of this model was 68.7%, while the negative predictive value was 83.4%. The calculated AUC of the P(ADd) ROC plot was 0.86 (95% CI 0.81 to 0.92). The ROC plots are displayed in Figure 2A. The fitted probability values for having aMCI or ADd are plotted against the NIHTB-OIT scores for each participant in
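The percentage changes above can be related back to logistic regression coefficients. The sketch below assumes that each reported percentage was obtained as exp(β) − 1, that is, as a multiplicative change in the odds per unit of the predictor; this mapping is an assumption, since the text describes the quantities as relative risk, and all function names are hypothetical.

```python
import math


def odds_ratio_from_percent_change(percent_change):
    """Multiplicative change implied by a percent change (e.g. -25.3 -> 0.747)."""
    return 1.0 + percent_change / 100.0


def log_odds_coefficient(percent_change):
    """Logistic coefficient implied by a percent change, assuming exp(beta) - 1."""
    return math.log(odds_ratio_from_percent_change(percent_change))


def percent_change_over(points, percent_change_per_point):
    """Compound a per-point percent change over several points."""
    ratio = odds_ratio_from_percent_change(percent_change_per_point) ** points
    return (ratio - 1.0) * 100.0


# The 25.3% decrease per NIHTB-OIT point reported for the P(aMCI) model.
print(round(log_odds_coefficient(-25.3), 4))
print(round(percent_change_over(2, -25.3), 1))
```

Because the effect is multiplicative, a 2-point difference in score does not correspond to a 50.6% decrease; under the stated assumption it compounds to roughly 44%.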
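The four classification measures reported for the 0.50 threshold all derive from one 2×2 confusion matrix. The sketch below shows the arithmetic; the cell counts are hypothetical values chosen to be consistent with the percentages reported for the P(aMCI) model and are not given in the text.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # cases correctly flagged
        "specificity": tn / (tn + fp),  # controls correctly cleared
        "ppv": tp / (tp + fp),          # flagged participants who are cases
        "npv": tn / (tn + fn),          # cleared participants who are controls
    }


# Hypothetical counts (assumption): 79 aMCI and 152 NC participants.
metrics = classification_metrics(tp=39, fn=40, tn=135, fp=17)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Note that the predictive values depend on the case mix of the sample, so they would differ in a population where aMCI is less common than in this targeted cohort.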

NIHTB-OIT scores based on AD biomarker presence
The mean ± standard deviation NIHTB-OIT scores for the biomarker-positive group and biomarker-negative group were 5.60 ± 2.15 and 6.71 ± 1.80, respectively. A three-way ANCOVA was used to evaluate differences in odor scores across biomarker groups while controlling for age and sex. There was a significant main effect of biomarker status (F(1,161) = 7.941, p < 0.005), where participants who had a positive AD biomarker test scored lower on the NIHTB-OIT than participants with a negative AD biomarker test. The main effects of age and sex were also significant (age: F(1,161) = 7.923, p < 0.005; sex: F(1,161) = 5.745, p < 0.05).
The distribution of NIHTB-OIT scores based on AD biomarker status is shown in Figure 3A.

Differences in NIHTB-OIT scores based on APOE ε4 allele status
The mean ± standard deviation NIHTB-OIT scores for the APOE ε4 allele carriers versus non-carriers were 5.72 ± 2.21 and 5.99 ± 2.18, respectively. A three-way ANCOVA was used to evaluate differences in NIHTB-OIT scores between participants with at least one APOE ε4 allele and participants with no APOE ε4 alleles, while controlling for age and sex. The main effect of APOE ε4 allele status was not significant (F(1,271) = 2.852, n.s.), indicating that NIHTB-OIT scores did not significantly differ based on APOE ε4 allele status. The main effects of age and sex were significant (age: F(1,271) = 16.926, p < 0.0001; sex: F(1,271) = 14.611, p < 0.0005). The distribution of NIHTB-OIT scores based on APOE ε4 allele carrier status is shown in Figure 3B.

Differences in NIHTB-OIT scores between NC participants aged 65 to 84 and NC participants aged ≥85
The mean and standard deviation NIHTB-OIT scores for the two NC age groups are listed in Table 2. A two-way ANCOVA was used to evaluate differences in NIHTB-OIT scores between NC participants aged 65 to 84 and NC participants aged ≥85, while controlling for sex. In this model, the main effect of age group was significant (
b The present study targeted recruitment of aMCI and ADd individuals, and does not reflect a random sample of the general population. Thus, the computed relative risk for aMCI and ADd with increasing age in our sample may not accurately reflect the prevalence or relative risk of these disorders in the general population.

DISCUSSION
We found that mean NIHTB-OIT scores decrease with age at a similar rate across all diagnostic categories, but were significantly lower for
Other odor identification tests, including the University of Pennsylvania Smell Identification Test (UPSIT), 44 the Brief (Cross-Cultural) Smell Identification Test (B-SIT), 45 and the San Diego Odor Identification Test (SDOIT), 46 have been previously investigated for their utility in detecting prevalent cases of aMCI and ADd. A recent study identified the 10 most predictive odors on the UPSIT for amnestic disorders, and for a cutoff of 70% correct, these 10 odors performed well at identifying prevalent amnestic disorders (aMCI and ADd; 74% sensitivity, 71% specificity) and prevalent ADd (88% sensitivity, 71% specificity) compared to controls. 16 Another study found that 40% of participants with aMCI and 10% of those with ADd performed at a normosmic threshold (8/12 correct) on the B-SIT, compared to ∼70% of participants with non-amnestic MCI or subjective memory complaints. 47 [14],16 Conversely, less than 4% of participants with normosmic performance on the B-SIT declined to dementia over a 4-year period, 48 and normosmic performance on the SDOIT had a negative predictive value of 97% for new incidence of cognitive impairment over a 5-year period. 13 Future directions of the ARMADA study will include longitudinal analyses to determine whether poor NIHTB-OIT scores are similarly related to later cognitive impairment.
While the NIHTB-OIT performs similarly to other olfactory tests in detecting prevalent cases of aMCI and ADd, the NIHTB-OIT may be more suited for routine clinical use. The SDOIT, while brief, presents odorants inside small containers that are impractical to keep on hand in a clinical setting. 46 The UPSIT is much longer (40 odorants), and recent work has suggested that shorter variations, including the B-SIT, have similar predictive power. 16,49 However, the NIHTB-OIT may be more accessible than the B-SIT, which uses word-based multiple-choice options in a pencil-and-paper format. The NIHTB-OIT, in contrast, provides picture, word, and verbal response options on a touch-screen tablet, and may help to control for test-taker variability in visual, auditory, or lexical ability. One previous study found that a picture-based odor identification test was more reliable than the word-based B-SIT in distinguishing AD participants from controls in a Japanese population. 50
We found a weak effect of sex on NIHTB-OIT scores, with females scoring significantly better than males across (but not within) all diagnostic categories. Females have been known to outperform males on tests of olfaction, although usually to a small degree.
For a subset of participants, AD biomarker and APOE genotype data were available. We found that NIHTB-OIT scores were significantly lower for participants who had a positive AD biomarker test compared to those who had negative AD biomarker tests, collapsed across available CSF Aβ40/42, total tau, or phospho-tau measures and amyloid-PET measures. This is consistent with previous findings that the level of AD biomarkers, particularly tau pathology, correlates with performance on odor identification tests. 24,25,53 Future targeted investigations into the relationships between odor identification scores and amyloid and tau burden will help determine the usefulness of these tests for identifying participants with AD-specific pathologies.
We determined that NIHTB-OIT scores did not differ significantly between participants with no APOE ε4 alleles and participants with at least one APOE ε4 allele. The APOE ε4 allele is associated with an increased lifetime risk of developing aMCI and ADd, 54 more so for populations with European ancestry compared to populations with African ancestry, 55 but when comparing APOE ε4 allele status and olfactory deficits, there have been mixed results. 26,27,56,57 Very large sample sizes may be required to detect the incidence of olfactory deficits for APOE ε3/ε4 heterozygotes. 27 Within our sample (N = 275), only 19 were identified as APOE ε4 homozygotes. Our sample size is likely too small to detect such single-gene effects. [59]
We determined that the NIHTB-OIT has low sensitivity and high specificity for classifying aMCI versus NC, and ADd versus NC. This suggests that a high score on the NIHTB-OIT indicates a low likelihood of having aMCI or ADd, but a low score does not provide enough evidence for a definitive diagnosis of aMCI or ADd. While diminished olfaction is a common symptom in aMCI and ADd, it is not specific to these diseases, and has also been associated with Parkinson's disease, 60,61 Huntington's disease, 62 multiple sclerosis, 63,64 traumatic brain injuries, 39 strokes, 40 and cognitively healthy aging. 32 The lack of data on smoking status 65 is also a limitation of the current study.
Additionally, following the COVID-19 pandemic, more attention must be paid to viral causes of olfactory loss, and potentially associated long-term effects on neurological health. 66,67 While the NIHTB-OIT cannot provide a definitive diagnosis, we suggest that this easily administrable and cost-effective test may be included in senior patients' annual physical exams. If a score below 5 is obtained, and other contributing factors and rhinological conditions can be ruled out, the patient may be referred for in-depth neuropsychological evaluation by a specialist to determine whether other symptoms of cognitive impairment are present.
The generalizability of odor identification tests across different cultural groups must be considered. These tests require familiarity with the presented odors and their correct linguistic descriptors. Many odors are language- or culture-specific. [70] We caution that the NIHTB-OIT may not be accurate in evaluating aMCI risk outside of the United States population. Although the ARMADA study aimed to recruit participants reflecting the racial/ethnic distribution of the U.S. population, 71 minority populations are still underrepresented in the final sample (e.g., Black participants in the ADd group; and Asian, American Native, or participants of other races and Hispanic/Latino ethnic identity). A future aim of the ARMADA study is to evaluate whether the NIHTB-OIT can detect aMCI and ADd in two larger cohorts of African American and Spanish-speaking participants.
In summary, we have provided evidence for the association between NIHTB-OIT scores and diagnoses across the cognitive aging spectrum, and we have demonstrated that this test has high predictive power for classifying NC versus aMCI, and NC versus ADd.

CONCLUSION
Olfactory decline is common in aMCI and ADd, and likely occurs earlier than objective cognitive decline. In the present study, we evaluated performance on the NIHTB-OIT across healthy controls aged 65 and above, participants with aMCI, and participants with ADd. We found that scores on this test decline with age at a similar rate across all diagnostic groups, but that scores are significantly lower in the aMCI and ADd groups compared to healthy controls. We further found that this test can reliably distinguish healthy controls from participants with aMCI or ADd. Based on our results, we suggest that this quick and cost-effective test may be included in annual senior wellness exams, and that (in the absence of other rhinological explanations) those with low odor identification scores should be referred for further neuropsychological testing.
Participants in the ARMADA study were recruited across nine separate study sites, including Northwestern University, University of Michigan, University of Wisconsin-Madison, Mayo Clinic (Jacksonville, Florida), University of Pittsburgh, Emory University, University of California-San Diego, Columbia University, and Massachusetts General Hospital.Participants were recruited from existing research cohorts in or affiliated with this network of ADRCs funded by the National Institute on Aging (NIA) and other NIH-funded longitudinal studies that use similar methods to the ADRC longitudinal studies to clinically classify participants.
1-year follow-up scores on the NIHTB-OIT were obtained for N = 124 participants, which we used to evaluate the test-retest reliability of the NIHTB-OIT. For the entire population (N = 124), the correlation between scores at baseline and 1-year follow-up visits was r = 0.69 (95% confidence interval [CI] 0.59 to 0.78, p < 0.0001). The correlation for the NC aged 65 to 84 group (N = 61) was r = 0.57 (95% CI 0.37 to 0.72, p < 0.0001); and the correlation for the aMCI group (N = 34) was r = 0.71 (95% CI 0.49 to 0.85, p < 0.0001). The NC aged ≥85 group (N = 16) and the ADd group (N = 13) with 1-year follow-up visit NIHTB-OIT scores were too small to compute adequately powered within-group correlations.
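Confidence intervals for a Pearson correlation are commonly obtained with the Fisher z-transformation. The sketch below applies that standard approach to the rounded values quoted above; whether the authors used this exact method is an assumption, and the function name is hypothetical.

```python
import math


def fisher_z_interval(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via Fisher's z.

    r is the sample correlation, n the number of paired observations,
    and z_crit the standard-normal critical value for the interval.
    """
    z = math.atanh(r)              # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Whole-sample test-retest correlation: r = 0.69, N = 124.
low, high = fisher_z_interval(0.69, 124)
print(f"{low:.3f} to {high:.3f}")
```

Starting from the rounded r, this yields an interval close to, though not identical with, the reported 0.59 to 0.78; small differences of this kind are expected when the unrounded correlation or a different interval method is used.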

Figure 1A shows a scatterplot of NIHTB-OIT scores versus age, with separate fitted regression lines (NIHTB-OIT score ∼ age) with 95% CIs for each diagnostic category. While NIHTB-OIT scores decrease significantly with age, and the mean NIHTB-OIT scores are significantly different across diagnostic categories (NC all ages, aMCI, and ADd), there is no significant interaction between age and diagnosis (i.e., slopes do not differ significantly across the three diagnostic categories). Figure 1B shows distribution plots for NIHTB-OIT scores, stratified by sex and diagnostic category (NC aged 65 to 84, NC aged ≥85, aMCI, and ADd).

Figure 2B, color coded by diagnosis, and stratified by sex (indicated by the type of line). For females, an NIHTB-OIT score of 3 or below corresponds to a ≥50% chance of having aMCI, while for males, an NIHTB-OIT score of 5 or below corresponds to a ≥50% chance of having aMCI. Male odor scores below 5 indicate a ≥50% chance of being classified as ADd compared to NC, while for females odor scores below 4 indicate a ≥50% chance of ADd.
F(1,245) = 12.641, p < 0.0001), indicating that participants in the ≥85 years of age group performed significantly worse on the NIHTB-OIT compared to participants in the 65 to 84 years of age group. The main effect of sex was not significant in this model (F(1,245) = 3.601, n.s.). The distribution of NIHTB-OIT scores based on NC age group is shown in Figure 1B. We further broke down the mean and standard deviation odor scores across each NC decade: For participants aged 65 to 74 (N = 103), the mean ± standard deviation NIHTB-OIT scores were 7.23 ± 1.67; for
TABLE 3 Fitted linear and logistic regression models
Multiple linear regression model a: $\text{NIHTB-OIT Score} \sim \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Sex} + \beta_3\,\text{Diagnosis}$

FIGURE 1 NIHTB-OIT scores across age and diagnostic categories. (A) Separate regression lines (NIHTB-OIT score ∼ age) with 95% confidence intervals are fitted for NC (red), aMCI (green), and ADd (blue) diagnostic categories. From the multiple linear regression models, the mean NIHTB-OIT scores (intercepts) for each group were significantly different, but the rates of NIHTB-OIT score decline with age (slopes) were not significantly different across diagnostic groups (all interaction terms, p > 0.05). Points are jittered for visibility. (B) Dot plots displaying the distribution of NIHTB-OIT scores across sex and diagnosis categories. Female = red, and Male = blue. Mean NIHTB-OIT score and standard error of the mean are plotted for each sex and diagnostic group (solid lines, males; dashed, females). While the main effect of sex was significant across all diagnostic groups (linear regression model; p < 0.01), it was not significant within each diagnostic group (pairwise Welch two-sample t-tests, all p > 0.05). Mean NIHTB-OIT scores were significantly different between NC aged 65 to 84 and NC aged ≥85, between NC aged 65 to 84 and aMCI, and between NC aged 65 to 84 and ADd (all p < 0.001). ADd, Alzheimer's disease dementia; aMCI, amnestic Mild Cognitive Impairment; NC, normal cognition; NIHTB-OIT, National Institutes of Health Toolbox Odor Identification Test. ***p < 0.001
FIGURE 2 Utility of the NIHTB-OIT for detecting aMCI and ADd. (A) Receiver operating characteristic (ROC) curves for classifying aMCI and ADd based on NIHTB-OIT scores, age, and sex. The calculated area under the curve (AUC) for detecting aMCI was 0.78 (95% CI 0.72 to 0.85), and for detecting ADd was 0.86 (95% CI 0.81 to 0.92). (B,C) Scatter plots of participants' fitted probability of having aMCI (B) or ADd (C) from the logistic regression models described above, plotted against their scores on the NIHTB-OIT. Loess smoother lines are fitted separately for all participants (solid), female participants (dot-dashed), and male participants (dashed), with 95% CIs. True diagnoses are indicated by point color (NC, red; aMCI, green; ADd, blue). The legend in panel C applies to both panels B and C. ADd, Alzheimer's disease dementia; aMCI, amnestic Mild Cognitive Impairment; CI, confidence interval; NC, normal cognition; NIHTB-OIT, National Institutes of Health Toolbox Odor Identification Test
participants aged 75 to 84 (N = 49), the mean ± standard deviation NIHTB-OIT scores were 6.43 ± 1.70; and for participants aged 85 to 91 (N = 96), the mean ± standard deviation NIHTB-OIT scores were 6.13 ± 1.70.
nine study sites in the United States. Many of the ARMADA sites are housed in national Alzheimer's Disease Research Centers (ADRCs) that recruit and longitudinally follow research participants aged 65 and above. The NIHTB-OIT is a nine-item multiple choice test administered using scratch-and-sniff cards and a tablet. NIHTB-OIT scores can be easily interpreted by health care workers or trained personnel. We evaluated performance on the test across clinical diagnostic categories designated as NC, aMCI, and ADd, while controlling for age and sex. We then evaluated the test's ability to discriminate between aMCI and NC participants, and between ADd and NC participants. We additionally evaluated NIHTB-OIT score associations with APOE ε4 allele carrier status and with general AD biomarker presence in a subset of participants with available data. Finally, as the NIH Toolbox was initially validated up to age 85, we compared NIHTB-OIT scores between NC participants aged 65 to 84 and NC participants aged ≥85.
Participant demographics and available AD biomarker and APOE genotype data
The nine target odors that the participant must identify include lemon, Play-Doh, bubble gum, chocolate, popcorn, coffee, smoke, natural gas, and flower. In the original validation paper, 32 item-level accuracy for participants in the 65 to 85 age group ranged from 65% to 92%, with the exception of the Play-Doh odor (26%). All participants in the present study took the English language version of the NIHTB-OIT. More information, including an example response screen and examiner training information, can be found on the NIH Toolbox
Positive or negative assignments for the amyloid-PET scans were determined based on standardized uptake value ratio or distribution volume ratio readings, with cut points at 1.35 and 1.19, respectively. Study sites collecting CSF biomarker measures used varying methods to designate results for each participant as "Consistent with AD" or "Inconsistent with AD." Details regarding each study site's specific
TABLE 1

TABLE 2 Summary of National Institutes of Health Toolbox Odor Identification Test scores by diagnosis

Coefficient, Estimate, Standard error, z-value, p-value, Interpretation
a For the linear regression model (Table 3, Equation 1): Adjusted R² = 0.30 (F(4,384) = 40.85, p < 0.001). The intercept reflects the mean NIHTB-OIT scores for NC males at mean age 77.8 years. For the NC vs. aMCI logistic regression model (Table 3, Equation 2): The Akaike Information Criterion of the model was 248.3, with residual deviance of 240.30 (df = 227). For the NC vs. ADd logistic regression model (Table 3, Equation 3): The Akaike Information Criterion of the model was 183.69, with residual deviance of 175.69 (df = 210). The baseline group was males at mean age 77.8 in both models.
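The reported fit statistics can be cross-checked against each other. For a logistic regression on ungrouped binary outcomes, the residual deviance equals −2 times the log-likelihood, so AIC = deviance + 2k, where k is the number of estimated coefficients. The sketch below assumes k = 4 (intercept, odor score, age, sex), which is inferred from the model description rather than stated explicitly; the function names are hypothetical.

```python
def aic_from_deviance(residual_deviance, n_parameters):
    """AIC for an ungrouped binary logistic model: deviance + 2k."""
    return residual_deviance + 2 * n_parameters


def n_observations(residual_df, n_parameters):
    """Sample size implied by the residual degrees of freedom."""
    return residual_df + n_parameters


K = 4  # assumed: intercept, odor score, age, sex

print(round(aic_from_deviance(240.30, K), 2), n_observations(227, K))  # NC vs. aMCI
print(round(aic_from_deviance(175.69, K), 2), n_observations(210, K))  # NC vs. ADd
```

Under this assumption both reported AIC values (248.3 and 183.69) are reproduced from the reported deviances, and the residual degrees of freedom imply 231 and 214 participants in the two models, respectively.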
FIGURE 3 NIHTB-OIT scores based on AD biomarker presence and APOE ε4 allele status. (A) Dot plots illustrating the distribution of NIHTB-OIT scores across AD biomarker groups. Participants positive for AD biomarkers had significantly lower NIHTB-OIT scores than participants negative for AD biomarkers (p < 0.005). (B) Dot plots illustrating the distribution of NIHTB-OIT scores across APOE ε4 allele carrier groups. There was no significant difference in NIHTB-OIT scores based on APOE ε4 allele carrier status. Black points show the mean NIHTB-OIT score within each group, and error bars reflect the standard error of the mean. AD, Alzheimer's disease; NIHTB-OIT, National Institutes of Health Toolbox Odor Identification Test. **p < 0.01
aMCI and ADd participants compared to NC participants, suggesting that olfactory decline may begin earlier in life for those who go on to develop aMCI and ADd. Related findings suggest that olfactory decline may precede aMCI or ADd diagnoses by several years. [12][13][14]