The role of visual rating and automated brain volumetry in early detection and differential diagnosis of Alzheimer's disease

Abstract
Background: Medial temporal lobe atrophy (MTA) is a diagnostic marker for mild cognitive impairment (MCI) and Alzheimer's disease (AD), but the accuracy of quantitative MTA (QMTA) in diagnosing early AD is unclear. This study aimed to investigate the accuracy of QMTA and its related components (inferior lateral ventricle [ILV] and hippocampus) with MTA in the early diagnosis of MCI and AD.
Methods: This study included four groups: normal (NC), MCI stable (MCIs), MCI converted to AD (MCIc), and mild AD (M‐AD) groups. Magnetic resonance image analysis software was used to quantify the hippocampus, ILV, and QMTA. MTA was rated by two experienced neurologists. Receiver operating characteristic area under the curve (AUC) analysis was performed to compare their capability in differentiating AD from NC and MCI, and optimal thresholds were determined using the Youden index.
Results: QMTA distinguished M‐AD from NC and MCI with higher diagnostic accuracy than MTA, hippocampus, and ILV (AUCNC = 0.976, AUCMCI = 0.836, AUCMCIs = 0.894, AUCMCIc = 0.730). The diagnostic accuracy of QMTA was superior to that of MTA, the hippocampus, and ILV in differentiating MCI from AD. The diagnostic accuracy of QMTA remained the best across the age, sex, and pathological subgroups analyzed. The sensitivity (92.45%) and specificity (90.64%) were higher in this study when a cutoff value of 0.635 was chosen for QMTA.
Conclusions: QMTA may be a better choice than the MTA scale or the associated quantitative components alone in identifying AD patients and MCI individuals with higher progression risk.


| INTRODUCTION
Alzheimer's disease (AD) is a multifactorial neurodegenerative disease. 1 AD patients initially exhibit mild cognitive impairment; early AD spans mild cognitive impairment (MCI) to mild dementia. 2 With the accumulation of AD pathology, brain neurons are progressively lost. Because nerves are difficult to regenerate in the elderly, early prevention of pathological development may achieve better results. 3 The annual conversion rate of MCI to AD is 10%-15%, with a cumulative conversion rate of more than 50% over 5 years. 4,5 Although confirmatory diagnostic tools such as cerebrospinal fluid (CSF) and positron emission tomography can detect AD-associated β-amyloid (Aβ) and tau pathology, they have disadvantages such as invasiveness, radiation exposure, high cost, poor accessibility, 6 and low predictive accuracy and specificity. 7,8 The early stages are the best time for AD treatment, but the heterogeneity of MCI and AD leads to significant diagnostic challenges. 9 Brain atrophy is a neurodegenerative change closely associated with cognitive changes in AD. 10 Structural magnetic resonance imaging (sMRI) is a non-invasive tool for routine clinical diagnosis because it can visualize the severity of brain atrophy, 11,12 but subtle and overall changes in brain structure are difficult to identify by MRI. With the development of machine learning, image quantification results calculated using sMRI can be derived into comprehensive and clinically feasible diagnostic metrics, such as the AD-RAI in AccuBrain®. 13,14 Hippocampal volume (HV) is one of the most important biomarkers of AD, but its measurement is cumbersome, and its assessment is easily influenced by cranial size. 15
The Medial Temporal Atrophy (MTA) scale is a recognized semi-quantitative tool for visually assessing the grade of hippocampal atrophy, 16,17 which distinguishes AD from normal aging and has 77% accuracy in predicting the conversion of MCI to AD. 18 Automated computational tools for quantifying brain structures allow for efficient and specific acquisition of structural brain volumes. 19 The hippocampal occupancy score is an index used to estimate hippocampal atrophy based on HV occupancy and to predict the risk of early AD progression. 20 Quantitative MTA (QMTA) is based on hippocampal and inferior lateral ventricle (ILV) volumes, and can assess relative hippocampal atrophy to mimic the visual rating logic of the MTA scale. Preliminary studies have found QMTA to be 90% accurate in differentiating AD from controls and to provide a more objective and quantitative assessment of the degree of medial temporal lobe atrophy. 14 However, no studies have yet elucidated its performance compared with MTA in identifying early AD and predicting the risk of MCI progression.
The main objective of this study was to compare the diagnostic efficiency of QMTA and its related components (ILV and hippocampal volume) with MTA visual scores for the early diagnosis of MCI and AD. The secondary goal was to further evaluate the accuracy of QMTA and MTA in predicting conversion to AD in patients with MCI, which will determine the accuracy of the QMTA optimal diagnostic threshold in differentiating early AD.

| Study participants
All data were acquired from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu). 21 The ADNI database was established in 2003 and has been extensively reviewed elsewhere (http://www.adni-info.org). ADRDA) for probable AD. We further used neuropsychological scale assessment criteria to classify the degree of AD dementia, in which those with MMSE scores of 20-26 and CDR of 0.5-1 in the AD group were defined as M-AD.

| CSF biomarkers
Some of the participants underwent baseline CSF tests. The collection and processing methods are described at http://www.adni-info.org/. The Aβ42 classification criteria were as follows: if Aβ1-42 ≤ 880 pg/mL, CSF Aβ1-42 was considered positive. 22 According to the 2018 NIA-AA research framework, A/T/N AD-related biological diagnostic criteria include amyloid (A), pathological tau (T), and neurodegeneration (N). 23 The MCI group was divided into two subgroups: MCI with CSF Aβ42 negative (MCI A-) and MCI with CSF Aβ42 positive (MCI A+).
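The amyloid grouping rule above can be sketched as follows. This is a minimal illustration, assuming CSF Aβ1-42 values are available in pg/mL; the function names and the example values are hypothetical and are not taken from the study data.

```python
# Cutoff stated in the text: CSF Abeta1-42 <= 880 pg/mL is considered positive.
ABETA42_CUTOFF_PG_ML = 880.0


def amyloid_status(abeta42_pg_ml: float) -> str:
    """Return 'A+' when CSF Abeta1-42 is at or below the cutoff, else 'A-'."""
    return "A+" if abeta42_pg_ml <= ABETA42_CUTOFF_PG_ML else "A-"


def split_mci_by_amyloid(mci_abeta42: dict) -> dict:
    """Group MCI subject IDs (hypothetical) into 'MCI A+' and 'MCI A-'."""
    groups = {"MCI A+": [], "MCI A-": []}
    for subject_id, value in mci_abeta42.items():
        groups["MCI " + amyloid_status(value)].append(subject_id)
    return groups


# Hypothetical example values, for illustration only.
example = {"s01": 650.0, "s02": 880.0, "s03": 1200.0}
groups = split_mci_by_amyloid(example)
```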

| MR imaging
The 3D T1W MRI data of 1696 subjects scanned by Philips, Siemens, and GE MR scanners at baseline were included. MRI data were acquired using a standardized ADNI MRI protocol. We downloaded the preprocessed T1W MRI images after prescaling, intensity non-uniformity correction, and distortion correction from the ADNI database for subsequent quantitative and visual rating analysis.

| Visual assessment
Visual rating of MTA was performed on the coronal T1W image parallel to the brainstem axis and through the hippocampus at the level of the anterior pons. Both hemispheres were assessed using the 0-4 scoring system of MTA. 24 The average of the left and right MTA values was taken as the final MTA score (MTA-avg) of the subject. We used the intraclass correlation coefficient (ICC) to measure the inter-rater reliability of visual scores. MR images of 100 subjects were randomly selected for visual rating of MTA, and the assessors were blinded to clinical and demographic information about the subjects. Four neurologists (MYR, CZY, YQ, and XJX) conducted the MTA rating; among them, MYR and CZY had extensive experience in visual scoring. As described previously, 25  When visually assessing all 1696 subjects, MYR and CZY still reached a high ICC of almost 0.9 (see Table S1 for details of the results).

| MR image processing
The 3D T1W MRI of the subjects was processed with AccuBrain®, a brain quantification tool (cloud-based commercial system with FDA approval and CE mark) that performs fully automatic quantification of brain structure and tissue. The absolute HV and ILV were calculated from the corresponding segmentations, and they were normalized by the intracranial volume (ICV) to generate the relative volumes, that is, the hippocampal fraction (HF, % of ICV) and ILV fraction (% of ICV). The QMTA is a quantitative indicator used to reflect MTA. Similar to the logic of MTA assessment (enlargement of the ILV and atrophy of the hippocampus), QMTA was calculated from the volume ratio of the ILV to the hippocampus.
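The derived indices described above can be sketched in a few lines. This is only an illustration of the stated definitions (relative volume as % of ICV, QMTA as the ILV-to-hippocampus volume ratio); it is not the AccuBrain® implementation, and the `Volumes` class, function names, and example volumes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Volumes:
    """Segmented volumes for one subject, in mL (hypothetical container)."""
    hippocampus_ml: float  # total (left + right) hippocampal volume
    ilv_ml: float          # total inferior lateral ventricle volume
    icv_ml: float          # intracranial volume


def relative_volume(volume_ml: float, icv_ml: float) -> float:
    """Express a structure volume as a percentage of ICV."""
    if icv_ml <= 0:
        raise ValueError("ICV must be positive")
    return 100.0 * volume_ml / icv_ml


def qmta(v: Volumes) -> float:
    """QMTA as the ratio of total ILV volume to total hippocampal volume."""
    if v.hippocampus_ml <= 0:
        raise ValueError("hippocampal volume must be positive")
    return v.ilv_ml / v.hippocampus_ml


# Hypothetical subject, for illustration only.
subject = Volumes(hippocampus_ml=6.0, ilv_ml=3.0, icv_ml=1500.0)
hf = relative_volume(subject.hippocampus_ml, subject.icv_ml)      # HF, % of ICV
ilv_fraction = relative_volume(subject.ilv_ml, subject.icv_ml)    # % of ICV
qmta_value = qmta(subject)
```

Because both volumes would be divided by the same ICV, the ratio is the same whether it is computed from absolute or ICV-normalized volumes, which is one reason a ratio index is less sensitive to cranial size than an absolute volume.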

| Statistical analysis
For variables that exhibit normal distribution, we use mean ± SD to describe and use parametric test methods.Otherwise, we use the median and interquartile range (IQR) to describe, and use nonparametric test methods.One-way analysis of variance with post-hoc Bonferroni correction for multiple comparisons was performed to evaluate the differences in subject characteristics, brain volumetric measures, and CSF-based measures.The chi-squared test was used to determine differences in categorical variables between the groups.Correlation analysis was performed using Spearman's correlation coefficient.The ICC two-way random-effects model was used to evaluate the reliability and consistency of the MTA visual rating results between the two visual raters. 26 used the receiver operating characteristic (ROC) area under the curve (AUC) with DeLong's test to assess the diagnostic perfor-

| Demographic characteristics
As shown in Table 1, compared with the other three groups, the QMTA, MTA, hippocampus, and ILV of the AD group were statistically different (p < 0.001). There were also significant differences in brain structure indicators in the MCIc group compared with the MCIs or NC groups (p < 0.001). Demographic information of the A/T/N subgroups is presented in Table S2.

| Diagnostic accuracy of all subjects
The diagnostic accuracies of QMTA, MTA, and a single brain structure in distinguishing M-AD, MCIc, MCIs, and NC are shown in Table 3. The results of the comparison of the AUC values of the different indicators can be found in Table S3.
When distinguishing NC from MCIs or MCIc, the diagnostic accuracy of QMTA was higher than that of MTA-avg, the hippocampus, and ILV. The superiority of QMTA was more pronounced in identifying MCIc from MCIs, with an AUC of 0.725, which was significantly higher than that of MTA (AUC = 0.682, p < 0.001), ILV volume (AUC = 0.675, p < 0.001), and ILV fraction (AUC = 0.686, p < 0.001) but not significantly higher than that of HV and HF (AUC HV = 0.691, AUC HF = 0.709; both p > 0.05).

| Impact of covariates
Although there were significant differences in age, sex, and education between these four groups, the absolute values of the differences were small; thus, we did not adjust for covariates in the primary analyses. To exclude the effects of covariates, we also conducted ROC analysis adjusted for age, sex, and education and found that the pattern of results remained unchanged 27 (see Tables S4-S6).

| Diagnostic accuracy of subgroup analysis
To further analyze the diagnostic accuracy of QMTA, we divided the study subjects into different subgroups for analysis by age, sex, or A/T/N pathology type. 27,28

| Age
We analyzed the accuracy of brain structure diagnosis in age subgroups (see Tables S7 and S8). The results of the ≤75 years and >75 years subgroups were similar to the results of the overall population analysis, and the diagnostic accuracy of QMTA was higher than that of MTA, hippocampus, and ILV. To distinguish M-AD from NC, QMTA had the best diagnostic accuracy (≤75 years AUC = 0.979, >75 years AUC = 0.973). In distinguishing NC from MCIs or MCIc, the diagnostic performance of brain structure indicators in the >75 years group was improved (AUC = 0.700-0.900). In distinguishing MCIs and MCIc, the diagnostic accuracy of QMTA in the >75 years group (AUC = 0.707) was slightly lower than that in other groups.

| Gender
Gender subgroups compared structural brain indicators such as hippocampus, lateral subventricular horn, and QMTA (see Table S9). We  31 Both QMTA and MTA were significantly correlated with cognitive assessment; however, the correlation of QMTA was stronger. These results were consistent with those of a previous study. 32 Both QMTA and MTA were related to CSF biomarkers, indicating the possibility that AD neuropathology begins with changes in the medial temporal lobe. 33 The MTA scale is an important tool for early AD screening and diagnosis, 34 but as a qualitative measure based on 2D MR or CT images, 17 it cannot sensitively identify early or subtle structural changes in AD or MCI. 10,35 Meanwhile, MTA visual scoring relies entirely on professional raters and therefore lacks objectivity and efficiency. 36 Automatic measurement tools based on artificial intelligence have provided standardized and highly reproducible results. 31,37 QMTA, as a quantitative index based on the measurement of 3D images, can capture subtle and systematic changes in brain structure. 382 In the present study, we found that the accuracy of QMTA in distinguishing MCI from NC was 0.7. MTA is also considered a marker for diagnosing the conversion of MCI into AD. 16 To identify patients with MCI who had a higher chance of converting to AD, previous studies combined MRI, CSF, and neurocognitive scales in machine learning, and the accuracy of the classification of MCIs and MCIc was reported to be 0.670. 43 In addition, when MRI, FDG-PET, and CSF data were fused for machine learning together with regression analysis of MMSE and ADAS-Cog scores, the classification accuracy for MCIs and MCIc was 0.739. 44 The accuracy of QMTA in this study in distinguishing between MCIs and MCIc was as high as 0.725, which is similar to the highest reported multi-domain classification accuracy. 45,46 However, as a single indicator relying only on structural MRI, QMTA is more easily accessible, practical, and cost-effective as a clinical screening tool.
The MTA assessment ranges from 0 to 4, and a previous study found that patients with AD or MCI can increase their MTA score by 1 point in an average of 8 years. 35 In comparison, the rate of hippocampal atrophy varies at different ages, resulting in different diagnostic criteria for MTA around the age of 75. 47 Age influences the diagnostic accuracy of MTA. 48 The present study found that the diagnostic accuracy, sensitivity, and specificity of QMTA are similar across sexes and ages; thus, QMTA is less affected by age and sex. Amyloid levels can predict the risk of early AD progression. 7 When the objective was changed to identify AD confirmed by A+ and MCIc confirmed by A+, the quantitative MRI-based indices, including QMTA, showed higher diagnostic accuracy for distinguishing NC from AD or MCIc and the conversion of MCI to AD. The accuracy, sensitivity, and specificity of QMTA were higher than those of MTA. QMTA may be a quantitative continuous index that can be used to sensitively observe changes in the medial temporal lobe.
The strengths of this study were the multicenter-based large sample size, the use of artificial intelligence automated quantitative MRI analysis tools to quantify brain structure and quantitative MTA metrics, and the comparison of quantitative metrics across subgroups to assess accuracy in identifying early AD. There were also some limitations to this study. First, the follow-up time of some MCI patients was too short (only 1 year), and some of them had not converted to AD and were listed as MCIs after 1 year of follow-up, which might have led to estimation error in the sensitivity and specificity of QMTA in distinguishing the conversion of MCI to AD. Another limitation was that there was no clinical symptom grouping of MCI patients, that is, amnestic MCI or non-amnestic MCI. Amnestic MCI is more likely to convert to AD; thus, analyzing these subtypes separately would be preferable.

| CONCLUSION
QMTA can be used as a reliable and effective tool for assisting in the clinical diagnosis of AD. The diagnostic performance of QMTA in the early identification of AD or MCI was better than that of MTA, the hippocampus, and ILV. Compared with MTA, QMTA is reproducible and objective in estimating the risk of MCI conversion and the onset of AD.

(including 573 MCI stable and 311 MCI converters), and 278 were in the AD group. All participants underwent neuropsychological cognitive assessment and 3D T1W MRI scanning at baseline. The NC group had no clinical symptoms of cognitive impairment, with a Mini-Mental State Examination (MMSE) score ≥ 26 and Clinical Dementia Rating (CDR) = 0. MCI patients had normal activities of daily living, with an MMSE score ≥ 24 and CDR < 0.5. The diagnostic information during the follow-up period of the enrolled study subjects was also considered, and MCI patients were subdivided into two groups: MCI stable (MCIs, remaining MCI during the follow-up period) and MCI converters (MCIc, converted to AD during the follow-up period). AD patients met the diagnostic criteria of the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer's Disease and Related Disorders Association (NINCDS/
Two-way random, absolute, single-measure ICCs [ICC (1, 2)] were used to evaluate the reliability of each scale at the level of each rater. Average-measures ICCs [ICC (2, k)] were used to evaluate the improvement in reliability of the scale based on average scores from all raters. For both single-measure and average-measures ICCs, the highest consistency was found for the combined assessment of MYR and CZY [ICC (1, 2) = 0.908 (0.867, 0.937); ICC (2, k) = 0.952 (0.929, 0.968)]. When the rating was performed by four raters, the ICCs were reduced [ICC (1, 2) = 0.821 (0.768, 0.867); ICC (2, k) = 0.948 (0.930, 0.963)]. Therefore, we used the mean scores of the two assessors (MYR and CZY) as the patient's final MTA score.
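The two-way random, absolute-agreement ICCs mentioned above can be illustrated with the standard ANOVA-based formulas (labeled ICC(2,1) and ICC(2,k) in the Shrout and Fleiss convention). The sketch below is an assumption about how such values are commonly computed, not the authors' SPSS procedure; the function name and the example ratings are hypothetical, and confidence intervals are not computed.

```python
def icc_two_way_random(ratings):
    """Single-measure and average-measure ICCs, two-way random, absolute agreement.

    `ratings` is a list of rows, one row per subject, one column per rater.
    Returns (single_measure_icc, average_measure_icc).
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average


# Hypothetical MTA scores (0-4) from two raters for five subjects.
example_ratings = [[1, 2], [2, 2], [3, 4], [4, 4], [0, 1]]
single_icc, average_icc = icc_two_way_random(example_ratings)
```

The average-measure value is never lower than the single-measure value for the same data, which matches the pattern reported above, where ICC (2, k) exceeds the single-measure ICC for both the two-rater and four-rater assessments.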
mance of the brain structural indices.Sub-analyses were conducted by grouping the study cohort by age (>75 years or not), sex, and status of ATN-based pathological diagnosis.Statistical significance was set at p < 0.05 (two-sided).Statistical analyses were performed using SPSS Statistics version 25.0.0,MedCalc, and R version 4.1.2.
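The ROC AUC and Youden-index threshold selection used in this analysis can be sketched in plain Python. This is a minimal illustration under stated assumptions: higher index values indicate disease (as for QMTA), a subject is called positive when the score is at or above the cutoff, and the example scores are hypothetical. DeLong's test for comparing AUCs is not included here.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case scores above a negative case.

    Ties between a positive and a negative score count as 0.5.
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def youden_cutoff(scores_pos, scores_neg):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    Every observed score is tried as a cutoff; on ties in J the lowest cutoff wins.
    """
    best = None
    for cutoff in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= cutoff for s in scores_pos) / len(scores_pos)
        spec = sum(s < cutoff for s in scores_neg) / len(scores_neg)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical index values, for illustration only.
ad_scores = [0.9, 0.8, 0.7, 0.5]
nc_scores = [0.6, 0.4, 0.3, 0.2]
auc = roc_auc(ad_scores, nc_scores)
cutoff, sensitivity, specificity = youden_cutoff(ad_scores, nc_scores)
false_negative_rate = 1.0 - sensitivity
false_positive_rate = 1.0 - specificity
```

The pairwise AUC is quadratic in the number of subjects, which is fine for a sketch; a rank-based computation or an established statistics package would be the usual choice for a cohort of this size.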

Table 1
1696 subjects were included in this study, of which 534 were in the NC group, 884 were in the MCI group

MAI et al.
TABLE 1 Demographic characteristics, brain structure, and cognitive performance across NC, MCI, and AD groups.
ROC curve analyses for differentiating different diagnoses with sMRI indexes. Abbreviations: AUC, area under the curve; CI, confidence interval; ILV, inferior lateral ventricle; MTA scale, visual rating of medial temporal lobe atrophy (average score of left and right hemispheres); vs., versus.
Diagnostic accuracy of different indexes between amyloid-related subgroups. Abbreviations: AUC, area under the curve; CI, confidence interval; ILV, inferior lateral ventricle; MTA scale, visual rating of medial temporal lobe atrophy (average score of left and right hemispheres); QMTA, automatic quantification of medial temporal lobe atrophy based on total ILV volume divided by total hippocampal volume.
14,39 as the hippocampus, to differentiate AD. 14,39 Thus, it might be more suitable for the clinical diagnosis of AD.
41uld greatly reduce the misdiagnosis rate with an FNR of 7.55%, at the same time improving the sensitivity to 90.26% at the cutoff of 0.635. Studies from other databases have also confirmed that the QMTA has greater diagnostic accuracy than a single brain structural index, 41 An autopsy study of hippocampal volume based on MRI found that hippocampal volume was more in line with the neuropathology of Alzheimer's disease than with clinical diagnosis or cognitive measurement.
TABLE 5 Cutoff values of QMTA and MTA in distinguishing AD from NC. Abbreviations: AUC, area under the curve; CI, confidence interval; FNR, false-negative rate; FPR, false-positive rate; NPV, negative predictive value; PPV, positive predictive value; Sen, sensitivity; Spec, specificity; yrs, years.